Skip to main content

Ingest SQL Server Database as Delimited File

Copy data from a SQL Server database (via a Self-Hosted Integration Runtime) to delimited format in Azure Data Lake Storage Gen2.

Category: Ingest to Lakehouse | Tags: Ingestion SQL Server

How it works

Via Integration Runtime, execute query '<<ExtractQuery>>' against SQL Server database '<<DataLakeSystemFolder>>' and ingest as Delimited File into Data Lake location 'raw/<<DataLakeSystemFolder>>/<<DataLakeDatasetFolder>>'

To use this activity within the API, use an ActivityCode of SQLSERVER-ADLS-DELIM.

Available Connections

SourceConnection:

TargetConnection:

Example JSON

An example of what the Task Config would look like for a task using this activity. Some of these variables would be set at the group level to avoid duplication between tasks.

{
"SourceConnection": "MY-SOURCE-CONN",
"ExtractQuery": "",
"DataLakeSystemFolder": "my_folder",
"DataLakeDatasetFolder": "data",
"TargetConnection": "MY-TARGET-CONN"
}

Variable Reference

The following variables are supported:

  • CopyBehaviour (Optional) - Defines behaviour when copying files from one file system to another. Options are None (default) and MergeFiles (mulitple source files are merged into a single file at the destination).

  • DataLakeDatasetFolder (Required) - Name of the folder in the Data Lake containing the dataset.

  • DataLakeSystemFolder (Required) - Name of the parent (System) folder in the Data Lake containing the dataset.

  • DestinationColumnDelimiter (Optional) - Column delimiter to use for the destination file.

  • DestinationRowDelimiter (Optional) - Row delimiter to use for destination file.

  • DIUsToUseForCopyActivity (Optional) - Specifies the powerfulness of the copy executor. Value can be between 2 and 256. When left at default, the Data Factory dynamically applies the optimal DIU setting based on the source-sink pair and data pattern.

  • Encoding (Optional) - The encoding type used to read/write text files.

  • EscapeCharacter (Optional) - The single character to escape quotes inside a quoted value. When EscapeCharacter is defined as empty string, QuoteCharacter must be set as empty string as well (in which case make sure all column values don't contain delimiters).

  • ExtractControlQuery (Optional) - For incremental loads only, a SQL query to get a 'high-water' mark for extract control. For instance, this could be the maximum value of an modified_date or an identity column. NOTE: The column returned must be aliased as ExtractControl e.g. select max(modified_date) as ExtractControl from some_table.

    Show more details

    **Extract Control Query **

    For incremental data extraction, the SQL statement that will be executed on the source database to extract the high-water mark for this data prior to extracting. This could be the maximum value of a modified date or an identity column and will be used in the next data extraction.

    NOTE: The column returned must be aliased as ExtractControl. For example:

    select max(modified_date) as ExtractControl
    from some_table;
  • ExtractControlVariableName (Optional) - For incremental loads only, the name to assign the Extract Control variable in State Config for the ExtractControl value derived from the Extract Control Query above.

  • ExtractControlVariableSeedValue (Optional) - The initial value to set for the Extract Control variable in State Config - this will have no impact beyond the original seeding of the Extract Control variable in State Config.

  • ExtractQuery (Required) - SQL query to extract data from the source database.

  • FirstRowAsHeader (Optional) - Should the first data row should be used as the header.

  • IsFederated (Optional) - Makes task available to other Insight Factories within this organisation.

  • Links (Optional) - NULL

  • MaximumNumberOfAttemptsAllowed (Optional) - The total number of times the running of this Task can be attempted.

  • MinutesToWaitBeforeNextAttempt (Optional) - If a Task run fails, the number of minutes to wait before re-attempting the Task.

  • NullValue (Optional) - The string representation of a null value (leave blank for empty string).

  • QuoteCharacter (Optional) - The single character to use to quote a column value if it contains the column delimiter. When QuoteCharacter is defined as empty string, it means there is no quote character and column value will not be quoted, and EscapeCharacter is used to escape the column delimiter and itself.

  • RetainHistory (Optional) - Should the raw files be saved to the History Container to preserve them?

    Show more details

    **Retain History? ** By default, this flag is set to the value assigned in the Configuration item SaveRawFilesToHistory (signalled by the double triangle brackets around the Configuration item name e.g. &lt;&lt;SaveRawFilesToHistory&gt;&gt;). This default behaviour can be overridden here.

  • SourceConnection (Required) - Source connection to use.

  • TargetConnection (Optional) - Target connection to use.