Microsoft Fabric vs. Azure Synapse Analytics: Architecture, Features, Migration Possibilities, FAQs
Microsoft Fabric is a SaaS offering that aims to be a one-stop shop for all of your data engineering, science, analytics, and BI needs. Meanwhile, Azure Synapse Analytics is a PaaS that supports data warehousing, integration, and analytics use cases.
Fabric is seen as a successor to Azure Synapse. However, there are several gaps and differences in architecture and capabilities.
In this article, we’ll explore these differences between Microsoft Fabric and Azure Synapse Analytics, while addressing the most frequently asked questions about the two solutions.
Table of contents
- Microsoft Fabric vs. Azure Synapse Analytics: What’s the difference?
- Feature comparison
- Migrating Synapse workloads to Microsoft Fabric: Here’s what you need to know
- Concluding thoughts on Microsoft Fabric vs Azure Synapse Analytics
- Frequently Asked Questions (FAQs)
- Related reads
Microsoft Fabric vs. Azure Synapse Analytics: What’s the difference?
Microsoft Fabric is an end-to-end SaaS offering that brings together several data and analytics workloads under one roof.
These workloads include Data Factory, Synapse Data Warehouse, Synapse Data Engineering, Synapse Data Science, Synapse Real-Time Analytics, Power BI, and Data Activator.
According to Microsoft, Fabric offers:
“Full-service capabilities including data movement, data lakes, data engineering, data integration, data science, real-time analytics, and business intelligence—backed by a shared platform for data security, governance, and compliance. So, your organization no longer needs to stitch together individual analytics services from multiple vendors. Instead, use a streamlined solution that’s easy to connect, onboard, and operate.”
Fabric is built on an open, lake-centric design called OneLake.
Meanwhile, Azure Synapse is a PaaS for enterprise data warehousing, integration, and analytics. It was launched as a one-stop shop for all warehousing and analytics workloads. It included several tools bundled together in a platform called Synapse Studio.
Here’s how Microsoft described the platform:
“Azure Synapse brings together the best of SQL technologies used in enterprise data warehousing, Spark technologies used for big data, Data Explorer for log and time series analytics, Pipelines for data integration and ETL/ELT, and deep integration with other Azure services such as Power BI, CosmosDB, and AzureML.”
Synapse is built on top of ADLS (Azure Data Lake Storage) Gen2, which can be seen as a precursor to OneLake.
Is Fabric like Synapse 3.0?
Microsoft Fabric can be seen as an evolution of Azure Synapse Analytics, or Synapse 3.0.
Here’s a tweet from the Corporate VP of Azure Data, Arun Ulagaratchagan, touting Microsoft Fabric as the next version of Azure Synapse.
“Really excited to introduce #MicrosoftFabric today. A single unified analytics platform, built from the ground up for the era of AI! Fabric is the next version of #PowerBI, #AzureSynapse, #DataFactory. We are introducing #OneLake, #DataActivator, and so much more. Read my…”
Arun Ulag (@arunulag), May 23, 2023, pic.twitter.com/RiA51MHAAb
Microsoft Fabric retains several capabilities of Azure Synapse Analytics, while introducing next-generation capabilities as well.
For instance, Microsoft Fabric retains the following Synapse workloads:
- Data engineering
- Data warehouse
- Data science
- Real-time analytics
However, it also moves away from the conventional way of locking data in the proprietary SQL Server format. Instead, data in Microsoft Fabric is stored in the open data standard of Delta-Parquet within OneLake — a central, multi-cloud repository.
Let’s explore the specifics by examining the architecture and capabilities of each Microsoft offering.
Microsoft Fabric vs. Azure Synapse Analytics: Architecture
Microsoft Fabric architecture overview
As mentioned earlier, Microsoft Fabric is built on OneLake, the storage layer for data spread across cloud warehouses and lakes. It is a tenant-wide store for data that serves both professional and citizen developers.
Seven Microsoft Fabric workloads run on OneLake:
- Data Factory: The data integration service
- Synapse Analytics services: This includes Data Warehousing, Data Engineering, Data Science, and Real-Time Analytics
- Power BI: The business intelligence service
- Data Activator: The real-time monitoring service
Read more → Microsoft Fabric 101
Azure Synapse Analytics architecture overview
As mentioned earlier, Azure Synapse brings together the best of SQL technologies in warehousing, analytics, data integration, and ETL/ELT.
The core components of Azure Synapse Analytics include:
- Synapse Studio: A workspace to build, maintain, and secure your solutions. It lets you:
  - Perform key tasks: ingest, explore, prepare, orchestrate, and visualize
  - Monitor resources, usage, and users across SQL, Spark, and Data Explorer
  - Use RBAC (role-based access control) to simplify access to analytics resources
  - Write SQL, Spark, or KQL queries and integrate with enterprise CI/CD processes
- Synapse SQL: A distributed query system for T-SQL that offers both serverless and dedicated resource models
- Apache Spark for Azure Synapse: Integrates Apache Spark, an open-source big data engine for data preparation, data engineering, ETL, and machine learning. To use Spark within Synapse, you can set up Spark notebooks or Spark job definitions
- Azure Synapse Data Explorer: Offers an interactive query experience to get insights from near real-time logs and telemetry data
- Azure Data Lake Storage Gen2: The cloud-based, enterprise data lake solution that acts as the storage layer; it is an evolution of Azure Data Lake Storage Gen1 and is built on Azure Blob Storage
Microsoft Fabric vs. Azure Synapse Analytics: Key differences in architecture
The main difference is in data storage. Unlike Azure Synapse Analytics, Microsoft Fabric doesn’t have a dedicated SQL pool or relational storage. Instead, warehouse data is persisted in Delta Lake format within OneLake.
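To make the storage difference concrete, here is a minimal Python sketch that builds the location of a Delta table in OneLake. It assumes the ABFS-style URI layout that OneLake exposes (`abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<item type>/Tables/<table>`). The function name and the workspace, item, and table names are hypothetical.

```python
def onelake_table_uri(workspace: str, item: str, table: str,
                      item_type: str = "Lakehouse") -> str:
    """Build an ABFS-style URI for a Delta table stored in OneLake.

    Assumption: the path follows the OneLake convention
    <workspace>@onelake.dfs.fabric.microsoft.com/<item>.<item_type>/Tables/<table>.
    """
    host = "onelake.dfs.fabric.microsoft.com"
    return f"abfss://{workspace}@{host}/{item}.{item_type}/Tables/{table}"


# Hypothetical names, for illustration only.
uri = onelake_table_uri("SalesWorkspace", "SalesLakehouse", "orders")
print(uri)
```

The point of the sketch is that a table is simply a folder of Delta files at a predictable path, rather than rows held in a dedicated SQL pool.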
Moreover, since Fabric is SaaS, you don’t have to create and manage Apache Spark pools. All you have to do is specify the Spark environment and load the necessary Python packages. Fabric will then set up a Spark environment for you within seconds.
The other difference is in the workspace. The workspace in Azure Synapse Analytics is Synapse Studio. Meanwhile, Fabric uses a Power BI-based interface and is organized around personas — data science, data engineering, real-time analytics, etc.
Next, let’s compare the features.
Microsoft Fabric vs. Azure Synapse Analytics: Feature comparison
As mentioned before, there is a high degree of overlap between Microsoft Fabric and Azure Synapse Analytics.
However, a few features don’t exist yet, are redundant, or have been replaced by alternatives.
Let’s look at the most prominent feature differences, which include:
- Lack of support for SQL Endpoint functions like OPENROWSET
- The persistence of Warehouse data in OneLake
- Managed Spark pools are redundant because Microsoft Fabric is SaaS: you choose the Spark version you need, and the environment loads within seconds
- New Notebooks features, such as comments, co-editing, and data wrangling, that facilitate collaboration
- Unavailability of Synapse Link, a feature for running near real-time analytics over operational data in Azure SQL Database or SQL Server 2022
- Mapping Data Flows, an interface for data transformations, is replaced by the Power Query experience
- Synapse Studio is being replaced by a Power BI-based workspace
- Fabric provides an MLflow endpoint by default, so you don’t have to create an Azure Machine Learning instance for your machine learning models. You can write code using the MLflow API or set it up using the Power BI UI:
    import mlflow

    # This will create a new experiment with the provided name.
    mlflow.create_experiment("<EXPERIMENT_NAME>")

    # This will set the given experiment as the active experiment.
    # If an experiment with this name does not exist, a new experiment with this name is created.
    mlflow.set_experiment("<EXPERIMENT_NAME>")
- Deeper integration between Microsoft Fabric and Power BI, enabling you to create data assets or models directly via the Fabric UI
Considering all the above changes, how can you go about migrating your Synapse workloads to Fabric? Let’s explore further.
Migrating Synapse workloads to Microsoft Fabric: Here’s what you need to know
Even though Fabric is an evolution of Azure Synapse Analytics, there isn’t a direct or automatic upgrade path. You need to treat it as a manual migration that involves modifying code, such as notebooks and pipelines.
Let’s look at three common limitations in Microsoft Fabric:
- Limited T-SQL support
- Limited data type support
- Limitations on updates to Warehouse tables
1. Limited T-SQL command support in Microsoft Fabric
You must engineer workarounds for SQL scripts that rely on unsupported syntax, such as OPENROWSET.
According to Microsoft, “there’s limited T-SQL functionality, and certain T-SQL commands can cause warehouse corruption.”
Here’s an example.
In Azure Synapse Analytics, users would rely on the OPENROWSET function to read the contents of a data source and display them as a set of rows.
    SELECT *
    FROM OPENROWSET(
        BULK 'http://<storage account>.dfs.core.windows.net/container/folder/*.parquet',
        FORMAT = 'PARQUET'
    ) AS [file]
Microsoft Fabric doesn’t support the OPENROWSET function. So, you’ll have to modify all SQL scripts using this function.
The best way to proceed is to get all of your data into OneLake in Delta format. You can then use Delta Lake tables directly in your SQL views.
You’ll have to come up with similar workarounds or alternatives for other unsupported T-SQL commands.
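Before rewriting anything, it helps to know which scripts are affected. The following is a minimal Python sketch, not a full T-SQL parser, that uses a regular expression to list the BULK paths of OPENROWSET calls in a script. The function name and the sample script are illustrative assumptions.

```python
import re

# Matches "OPENROWSET( BULK '<path>'" and captures the path.
OPENROWSET_BULK = re.compile(
    r"OPENROWSET\s*\(\s*BULK\s+'([^']+)'",
    re.IGNORECASE,
)


def find_openrowset_sources(sql: str) -> list[str]:
    """Return the BULK path of every OPENROWSET call found in a SQL script."""
    return OPENROWSET_BULK.findall(sql)


# Sample script, reusing the query shown earlier.
script = """
SELECT *
FROM OPENROWSET(
    BULK 'http://<storage account>.dfs.core.windows.net/container/folder/*.parquet',
    FORMAT = 'PARQUET'
) AS [file]
"""

for path in find_openrowset_sources(script):
    print("Needs migration:", path)
```

Each path reported this way points to data that would need to land in OneLake as a Delta table before the query can be rewritten against it.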
2. Limitations on data types in Microsoft Fabric
Since Microsoft Fabric is built on a lakehouse architecture, all data is stored in Delta format (i.e., Parquet data files plus a transaction log). So, your Azure Synapse data should be in Delta format to be auto-discovered in the SQL Endpoint.
If you’re not using Delta Lake yet, you’ll have to convert your data to Delta format using Apache Spark jobs or notebooks.
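A minimal PySpark sketch of such a conversion might look like the following. It assumes the source data is in Parquet, that a `SparkSession` named `spark` is already available (as it typically is in a notebook), and that the Delta Lake libraries are on the cluster. The function name and both paths are hypothetical.

```python
def convert_parquet_to_delta(spark, source_path: str, target_path: str) -> None:
    """Read Parquet files and rewrite them as a Delta table.

    `spark` is an existing SparkSession; both paths are placeholders
    chosen for illustration.
    """
    df = spark.read.parquet(source_path)
    (
        df.write
        .format("delta")
        .mode("overwrite")  # replaces any existing data at target_path
        .save(target_path)
    )


# Example call inside a notebook (hypothetical paths):
# convert_parquet_to_delta(spark, "Files/raw/orders", "Tables/orders")
```

For CSV or JSON sources, the same pattern applies with a different reader (for example `spark.read.csv`), usually with an explicit schema.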
Also read → How to convert from CSV to Delta Lake
After scanning your data from the Lakehouse, Fabric makes it available for querying in its Warehouse.
The Warehouse in Microsoft Fabric supports the most commonly used T-SQL data types — numerics, character strings, date and time, etc.
Since Microsoft Fabric is still in preview, it doesn’t support less common data types like image, text, nchar, etc.
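As a rough pre-migration check, you could scan your table definitions for these types. The sketch below is illustrative only: the set of unsupported types contains just the three examples named above and is not exhaustive, and the schema is hypothetical.

```python
# Types named above as unsupported; this is NOT an exhaustive list.
UNSUPPORTED_TYPES = {"image", "text", "nchar"}


def find_unsupported_columns(columns: dict[str, str]) -> dict[str, str]:
    """Return the columns whose T-SQL type is in UNSUPPORTED_TYPES.

    `columns` maps a column name to its declared type, e.g. "nchar(10)".
    """
    flagged = {}
    for name, declared in columns.items():
        # Strip any length/precision suffix: "nchar(10)" -> "nchar".
        base_type = declared.split("(")[0].strip().lower()
        if base_type in UNSUPPORTED_TYPES:
            flagged[name] = declared
    return flagged


# Hypothetical table definition.
schema = {
    "id": "int",
    "code": "nchar(10)",
    "notes": "text",
    "created_at": "datetime2(6)",
    "name": "varchar(100)",
}
print(find_unsupported_columns(schema))
```

Columns flagged this way would need to be cast to a supported type as part of the migration.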
3. Limitations on updates to Data Warehouse tables
If you’ve used Data Warehouse tables to create Lakehouse shortcuts, the Delta tables referencing those shortcuts don’t update when you run an ‘update’ or ‘delete’ operation on the Data Warehouse table.
According to Microsoft, you must set up the following workaround to ensure that the data in the Delta tables references the shortcut:
- Create Table as Select (CTAS)
- Drop the old table
- CTAS again to the original table name
- Drop the existing shortcut
- Re-create the shortcut to Lakehouse
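The first three steps can be sketched as generated T-SQL. This is one reading of the workaround rather than an official script: it assumes "CTAS" means copying the table with `CREATE TABLE ... AS SELECT`, and the function name, the `_copy` suffix, and the table names are hypothetical. Dropping and re-creating the shortcut (steps 4 and 5) is not expressed as T-SQL here.

```python
def ctas_refresh_statements(schema: str, table: str,
                            suffix: str = "_copy") -> list[str]:
    """Build T-SQL for steps 1-3 of the workaround (assumed interpretation)."""
    original = f"[{schema}].[{table}]"
    copy = f"[{schema}].[{table}{suffix}]"
    return [
        # 1. CTAS into a copy of the table
        f"CREATE TABLE {copy} AS SELECT * FROM {original};",
        # 2. Drop the old table
        f"DROP TABLE {original};",
        # 3. CTAS again, back to the original table name
        f"CREATE TABLE {original} AS SELECT * FROM {copy};",
    ]


for statement in ctas_refresh_statements("dbo", "orders"):
    print(statement)
```

Review the generated statements before running them, since step 2 is destructive.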
There are other issues that may affect your migration from Azure Synapse to Microsoft Fabric. You can check out the Microsoft Fabric Known Issues page for real-time updates and fixes.
Concluding thoughts on Microsoft Fabric vs Azure Synapse Analytics
Both Microsoft Fabric and Azure Synapse Analytics are solutions for data and analytics use cases. If you’re looking for a solution that neatly packages several experiences for data engineering, data science, analytics, etc. in one application, then Microsoft Fabric is a good choice.
However, if your primary use case is large-scale data analytics, then Azure Synapse Analytics is sufficient. It can also be more cost-effective as you only pay for the resources you use.
Now, if you aren’t an Azure Synapse Analytics customer yet (or are at an early stage), you could opt for Microsoft Fabric which offers a suite of workloads to meet your data and analytics requirements.
On the other hand, if you’re already a Synapse Analytics customer, you should weigh the following factors before migrating:
- The engineering effort involved in migrating existing Synapse Analytics workloads to Microsoft Fabric; for instance, OPENROWSET() and Synapse Link aren’t supported
- The costs, since Microsoft Fabric charges for capacity, i.e., the resources you provision, as opposed to Synapse’s pay-per-query pricing; this might not be prudent for smaller companies, as you pay for the capacity regardless of whether you use it
- The pros and cons of using a SaaS offering, which gives you less control over the infrastructure
- The new features and concepts introduced by Microsoft Fabric, and their impact on your requirements and use cases
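The cost factor in the list above can be roughed out as a break-even calculation. The sketch below models pay-per-query pricing as a price per TB processed, which is a simplifying assumption, and the figures are placeholders rather than actual Microsoft prices.

```python
def break_even_tb(monthly_capacity_cost: float, price_per_tb: float) -> float:
    """TB processed per month at which pay-per-use equals the capacity cost."""
    if price_per_tb <= 0:
        raise ValueError("price_per_tb must be positive")
    return monthly_capacity_cost / price_per_tb


# Placeholder figures for illustration only; they are NOT real prices.
capacity_cost = 1000.0  # hypothetical fixed monthly capacity cost
per_tb_price = 5.0      # hypothetical pay-per-use price per TB processed
print(break_even_tb(capacity_cost, per_tb_price))
```

If your monthly volume sits well below the break-even point, pay-per-use is likely cheaper. Well above it, fixed capacity starts to pay off.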
Since Microsoft Fabric is still in Preview, you can wait and watch how it evolves in the coming months before making a decision.
Microsoft Fabric vs. Azure Synapse Analytics: Frequently Asked Questions (FAQs)
Let’s look at some of the most common questions people have about the recent Microsoft Fabric announcement and its implications on existing products like Azure Synapse Analytics.
We’ll update this section as we gain clarity on Fabric’s capabilities and pricing.
1. What is the difference between Microsoft Fabric and Azure Synapse?
Microsoft Fabric is a SaaS offering that’s an end-to-end solution for all data and analytics use cases. It includes seven workloads — Data Factory, Synapse Data Warehouse, Synapse Data Engineering, Synapse Data Science, Synapse Real-Time Analytics, Power BI, and Data Activator.
Meanwhile, Azure Synapse is a PaaS offering that supports data warehousing and big data analytics use cases. Its core components include Synapse SQL, integration with Apache Spark, Data Explorer, and ADLS Gen2.
If you want to build a data warehouse and perform analytics on large-scale data, Azure Synapse Analytics is a great choice. On the other hand, if you want a single platform that spans data integration, engineering, science, real-time analytics, and BI, Fabric is the better fit.
2. Does Microsoft Fabric offer all the features in Synapse?
There are several common features between Microsoft Fabric and Azure Synapse Analytics. However, not all Synapse features are available in Microsoft Fabric. Examples include Synapse Link and Mapping Data Flows.
Synapse Link isn’t available in Fabric yet, while Mapping Data Flows has been replaced by a similar dataflows feature built on Power Query.
3. Is Microsoft Fabric replacing Synapse?
Microsoft Fabric is designed to be the natural successor to Azure Synapse Analytics — like Synapse 3.0.
That said, Microsoft announced that Synapse will continue to exist, probably for several months/years.
According to Kevin Feasel, a data and AI specialist, the switch could take years:
“Prior history—like with Azure SQL DW—says that the deprecation timeframe is something we can measure in years rather than months. However, Fabric is intended to replace Synapse one of these days, and so, new customers should start with Fabric.”
For existing customers, there are no automatic upgrade or migration paths for Azure Synapse Analytics workloads yet. Moreover, not all Synapse features are available on Microsoft Fabric. So, you can afford to wait and watch how Fabric evolves over the coming months/years.
4. Should I migrate all my workloads from Synapse to Fabric?
At a Reddit AMA, Microsoft representatives highlighted how the existing Azure services — Data Factory, Synapse Analytics, etc. — aren’t going anywhere.
Depending on your organization’s use case and ease of interoperability with Microsoft Fabric, you can choose to migrate all workloads to Fabric.
However, it isn’t mandatory and you can continue using each Azure service separately.
Also, as mentioned earlier, it’s important to note that there are no direct or automatic upgrade paths available to migrate your Synapse workloads to Fabric yet.
Microsoft Fabric vs Azure Synapse Analytics: Related reads
- Microsoft Fabric 101: A Comprehensive Overview of Microsoft’s New Data Platform
- 7 Microsoft Fabric Use Cases in Data and Analytics
- Microsoft Fabric vs. Snowflake: Features, Architecture, and Use Cases
- What Is a Data Catalog? & Do You Need One?
- AI Data Catalog: Exploring the Possibilities That Artificial Intelligence Brings to Your Metadata Applications & Data Interactions
- 8 Ways AI-Powered Data Catalogs Save Time Spent on Documentation, Tagging, Querying & More
- 15 Essential Data Catalog Features to Look For in 2023
- What is Active Metadata? — Definition, Characteristics, Example & Use Cases
- Data catalog benefits: 5 key reasons why you need one
- Open Source Data Catalog Software: 5 Popular Tools to Consider in 2023
- Data Catalog Platform: The Key To Future-Proofing Your Data Stack
- Top Data Catalog Use Cases Intrinsic to Data-Led Enterprises
- Business Data Catalog: Users, Differentiating Features, Evolution & More