Export Lakehouse Notebook Results to SQL Server DB

Export data from Lakehouse to SQL Server DB.

Category: Export Data from Lakehouse | Tags: Export

How it works

Run Databricks Notebook '<<NotebookPath>>' and copy results to SQL Server database table '<<TargetSchemaName>>.<<TargetTableName>>'

To use this activity within the API, use an ActivityCode of DELTA-SQLSERVER.

Available Connections

SourceConnection:

ADLS Gen2 (Managed Identity)

TargetConnection:

SQL Server (Windows Authentication)

Example JSON

An example of what the Task Config would look like for a task using this activity. Some of these variables would be set at the group level to avoid duplication between tasks.

{
  "SourceConnection": "MY-SOURCE-CONN",
  "NotebookPath": "/Users/fred.nurks@example.com/MyRepo/My Notebook",
  "TargetConnection": "MY-TARGET-CONN",
  "TargetSchemaName": "example_schema",
  "TargetTableName": "my_table",
  "NotebookParameters": { "Param1": "Value1", "Param2": "Value2" }
}

Variable Reference

The following variables are supported:

AutoCreateTable (Optional) - Should the target table be created automatically?
DatabricksClusterId (Optional) - The Databricks Cluster to use for this task.
DIUsToUseForCopyActivity (Optional) - Specifies the powerfulness of the copy executor. Value can be between 2 and 256. When left at default, the Data Factory dynamically applies the optimal DIU setting based on the source-sink pair and data pattern.
ExtractControlQuery (Optional) - For incremental loads only, a SQL query to get a 'high-water' mark for extract control. For instance, this could be the maximum value of an modified_date or an identity column. NOTE: The column returned must be aliased as ExtractControl e.g. select max(modified_date) as ExtractControl from some_table.
Show more details
**Extract Control Query **
For incremental data extraction, the SQL statement that will be executed on the source database to extract the high-water mark for this data prior to extracting. This could be the maximum value of a modified date or an identity column and will be used in the next data extraction.
NOTE: The column returned must be aliased as ExtractControl. For example:
select max(modified_date) as ExtractControl from some_table;
ExtractControlVariableName (Optional) - For incremental loads only, the name to assign the Extract Control variable in State Config for the ExtractControl value derived from the Extract Control Query above.
ExtractControlVariableSeedValue (Optional) - The initial value to set for the Extract Control variable in State Config - this will have no impact beyond the original seeding of the Extract Control variable in State Config.
IsFederated (Optional) - Makes task available to other Insight Factories within this organisation.
Links (Optional) - NULL
MaximumNumberOfAttemptsAllowed (Optional) - The total number of times the running of this Task can be attempted.
MinutesToWaitBeforeNextAttempt (Optional) - If a Task run fails, the number of minutes to wait before re-attempting the Task.
NotebookParameters (Optional) - Parameters for use in the Databricks Notebook. This is JSON format e.g. { "Param1": "Value1", "Param2": "Value2" }.
NotebookPath (Required) - The relative path to the Databricks Notebook.
PreCopyScript (Optional) - Script to execute on target database prior to Copy.
SaveExportToHistory (Optional) - Save the exported data to history in the Lakehouse?
SourceConnection (Required) - Source connection to use.
TargetConnection (Required) - Target connection to use.
TargetSchemaName (Required) - Name of the schema within target Azure SQL Database (e.g. dbo).
TargetTableName (Required) - Name of the table within target Azure SQL Database (e.g. Climate).
UpsertKeyColumns (Optional) - A comma-separated list of column names comprising the unique key of the target table.
WriteBehaviour (Optional) - The method of writing data to the target table.

Available Connections​

Example JSON​

Variable Reference​

Available Connections

Example JSON

Variable Reference