In this post I’m looking at the Azure SQL databases option that is part of the Azure Synapse link configuration.
A bit of background
Azure Synapse Link lets you configure Dataverse exports to Azure. Quite often this results in a file structure in Azure Storage that holds CSV files. These CSV files are organised so that the year (or month) in which a record was created in Dataverse decides where your exported data ends up.
So typically you would get 2022.csv, 2021.csv and 2020.csv for records created in the last three years. As records are updated, the updates are appended to these files.
If you now want a process to use this data, it will have to read the full CSV file, only to find that the last line includes the update you are interested in.
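To make that pain concrete, here is a small sketch. The file name, columns and values are made up for illustration; the point is that every update appends a row, so recovering the current state of a record means scanning the whole file:

```shell
# Hypothetical sample of an exported CSV: each update to a record
# appends a new row, so the current version of a record is the LAST
# row carrying its Id.
cat > 2022.csv <<'EOF'
Id,ModifiedOn,Name
A1,2022-01-05,Contoso
A2,2022-02-10,Fabrikam
A1,2022-03-01,Contoso Ltd
EOF

# To get the current state you have to read the full file and keep
# only the last row seen for each Id.
awk -F',' 'NR > 1 { latest[$1] = $0 } END { for (k in latest) print latest[k] }' 2022.csv
```

A database gives you this "latest version per record" view for free; with CSV files you pay a full scan every time.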
It feels like we need a proper database to do the job here. In this post I'll walk through the steps to set up your Synapse Link.
Create a SQL Database
The next step is to create a SQL database in Azure. This database will be used later on by Synapse Analytics to store its data.
To keep the cost down I'm selecting Development. In production environments you might want to select Production. In my case £4.30 is better than £211.
Once the SQL database has been created we should be ready for the next steps.
Please make sure that you have enabled the Allow Azure services and resources to access this server setting.
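If you prefer scripting this, the same database can be provisioned with the Azure CLI. The resource group, server and database names and the password below are placeholders; note that the 0.0.0.0 firewall rule is exactly what the Allow Azure services setting maps to:

```shell
# Placeholder names and credentials - adjust to your own environment.
az sql server create \
  --name my-dataverse-sql \
  --resource-group my-rg \
  --location uksouth \
  --admin-user sqladmin \
  --admin-password '<strong-password>'

# A small service objective keeps the cost down, as in the post.
az sql db create \
  --resource-group my-rg \
  --server my-dataverse-sql \
  --name dataverse-db \
  --service-objective Basic

# The 0.0.0.0-0.0.0.0 rule is the "Allow Azure services and resources
# to access this server" setting from the portal.
az sql server firewall-rule create \
  --resource-group my-rg \
  --server my-dataverse-sql \
  --name AllowAzureServices \
  --start-ip-address 0.0.0.0 \
  --end-ip-address 0.0.0.0
```

These are provisioning commands and need a signed-in Azure subscription, so treat them as a sketch rather than a copy-paste script.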
Create a Synapse workspace
First I went to the Azure Portal and opened the Synapse workspace overview. Then I hit the Create option, and this is where it all begins: a big form that needs to be completed.
After completing this form, it looks like this. Note that, as always, a lot of the naming within Azure is very restricted. The account name, for example, has to be in lower case, and a lot of the names below have to be unique as well.
So first of all we had to create some resource groups. I prefer to create the managed resource group myself, as this controls the naming a bit better than letting Azure generate it for me, but you can also leave that empty.
Then we have an account name. This is the Data Lake Storage Gen2 account.
Now, before hitting the Review + create button, we will need to set the Security and Networking settings for the workspace that we are about to create.
Once the above settings have been completed the Synapse Workspace can be created.
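The workspace can also be created from the Azure CLI. Again, all names below are placeholders; the storage account and file system refer to the Data Lake Storage Gen2 account mentioned above:

```shell
# Placeholder names - the storage account is the Data Lake Gen2 account
# and the file system is the container the workspace will use.
az synapse workspace create \
  --name my-synapse-ws \
  --resource-group my-rg \
  --storage-account mydatalakeacct \
  --file-system dataverse \
  --sql-admin-login-user sqladmin \
  --sql-admin-login-password '<strong-password>' \
  --location uksouth
```

As with the database, this is a provisioning sketch that needs a signed-in subscription; the portal form sets the same properties.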
Configure the Linked connection
Once the workspace is created you will find the Open Azure Synapse Analytics Studio link. This will open your workspace.
Click on the Link connection for the next step.
Now click on New to create a new linked service. This new linked service will need to be configured as shown below:
Notice that only System Assigned Managed Identity works as the Authentication Type.
Configure access to the resource group
Click on Add role assignment. Then select the Owner role and add the Managed Identity that was created earlier.
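The same role assignment can be done from the CLI. This is a sketch using the placeholder workspace and resource group names from earlier: first look up the workspace's system-assigned managed identity, then grant it the Owner role on the resource group:

```shell
# Look up the object id of the workspace's system-assigned
# managed identity.
principal_id=$(az synapse workspace show \
  --name my-synapse-ws \
  --resource-group my-rg \
  --query identity.principalId -o tsv)

# Grant that identity Owner on the resource group. The subscription
# id below is a placeholder.
az role assignment create \
  --assignee-object-id "$principal_id" \
  --assignee-principal-type ServicePrincipal \
  --role Owner \
  --scope "/subscriptions/<subscription-id>/resourceGroups/my-rg"
```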
Create an Azure Synapse Link in Dataverse
The next step is to create an Azure Synapse Link in Dataverse. Do make sure that you tick the box to Connect to your Azure Synapse Analytics workspace!
This is where we select the tables that we want to sync. In my example I will only sync the accounts table, and I will partition my data on a monthly basis so that I get more, smaller files.
Now you can click on Go to Azure Synapse Analytics workspace, which will open the Azure Synapse Analytics workspace. If this option isn't available, then you didn't select the tick box mentioned earlier.
Now we want to see our data!
In the Synapse Analytics Workspace, I can now find my tables that I’m syncing. Job done!
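Once the sync has run, the data in the Azure SQL database can be queried like any other table. A sketch with sqlcmd, assuming the placeholder server, database and credentials from earlier and the standard Dataverse account columns:

```shell
# Placeholder server, database and credentials from the earlier steps.
# accountid, name and modifiedon are standard Dataverse account columns.
sqlcmd -S my-dataverse-sql.database.windows.net \
       -d dataverse-db \
       -U sqladmin -P '<strong-password>' \
       -Q "SELECT TOP (5) accountid, name, modifiedon FROM dbo.account ORDER BY modifiedon DESC;"
```

Compare this with the CSV situation at the start of the post: the latest version of a record is now one indexed query away instead of a full file scan.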
11 thoughts on “Configure Dataverse exports using Azure Synapse Links for Azure SQL Databases”
I am following this Microsoft documentation, https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-pipelines?tabs=synapse-analytics, but on execution of the pipeline I am getting the following errors:
Inner activity name: CreateTable, Error: There is already an object named ‘contact’ in the database.
Column names in each table must be unique. Column name ‘Id’ in table ‘contact’ is specified more than once.
Before running the pipeline there was no such table in the database.
Hi Wajih, that shouldn’t happen. I haven’t seen that at all. Is there anything out of the ordinary with the table in Dataverse?
Those are simple tables without any duplicate columns
Pieter, I am not able to see the purpose and use of SQL Server in the above scenario. Can you please let me know where and how it is used?
It’s useful if SQL is your data warehouse and you want to include your Dataverse data.
How will the data move from Dataverse to SQL Server? My understanding is that Synapse Link will export data into the data lake and create serverless views in Synapse Analytics.
After that, how is the data moving to SQL Server? I am not able to figure that out.
Within the Azure Synapse workspace you can configure a SQL-to-SQL export.
I was able to make it work. The issue was related to the SQL user’s default schema. However, I have quite a lot of data in some tables and syncing it to Azure SQL takes a long time. The storage trigger fires every hour, and after a few hours many pipelines are running simultaneously, even when nothing has changed in CRM during that time. Can we control the trigger?
Do you have a link on how to do this please?
Hi Kon, please let me know which steps you are missing in the post