Ingest Azure Data Lake Delimited File (via Acct Key) with URL as Secret

Copy file(s) from Azure Data Lake Storage Gen2 (accessed via an Account Key) to Azure Data Lake Storage Gen2.

Category: Ingest to Lakehouse | Tags: Ingestion

How it works

Ingest delimited file from '<<SourceAdlsUrlSecretName>>' Data Lake (accessed via Account Key) into Data Lake location 'raw/<<DataLakeSystemFolder>>/<<DataLakeDatasetFolder>>'

Deprecated

This activity is deprecated and should not be used for new tasks.

To use this activity within the API, use an ActivityCode of ADLS-AK-FILE-ADLS.

Available Connections

SourceConnection:

TargetConnection:

Example JSON

The following is an example of the Task Config for a task using this activity. Some of these variables would typically be set at the group level to avoid duplication between tasks.

{
"SourceConnection": "MY-SOURCE-CONN",
"SourceFileName": "",
"SourceFilePath": "/path/to",
"DataLakeSystemFolder": "my_folder",
"DataLakeDatasetFolder": "data",
"TargetConnection": "MY-TARGET-CONN"
}
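As a rough illustration, a Task Config like the one above could be checked against the variables marked Required in the Variable Reference before it is submitted. This is a minimal sketch only: `find_missing_required` and `REQUIRED_VARIABLES` are hypothetical names introduced here, not part of the product, and a value reported as missing may legitimately be supplied at the group level instead.

```python
import json

# Variables marked "(Required)" in the Variable Reference for this activity.
REQUIRED_VARIABLES = (
    "DataLakeDatasetFolder",
    "DataLakeSystemFolder",
    "SourceConnection",
    "SourceFileName",
    "SourceFilePath",
)


def find_missing_required(task_config: dict) -> list:
    """Return required variable names that are absent or blank.

    Hypothetical helper for illustration; the platform's own validation
    rules are not described in this document.
    """
    return [
        name
        for name in REQUIRED_VARIABLES
        if not str(task_config.get(name) or "").strip()
    ]


# The example Task Config from this page, parsed from JSON.
example = json.loads("""
{
  "SourceConnection": "MY-SOURCE-CONN",
  "SourceFileName": "",
  "SourceFilePath": "/path/to",
  "DataLakeSystemFolder": "my_folder",
  "DataLakeDatasetFolder": "data",
  "TargetConnection": "MY-TARGET-CONN"
}
""")

print(find_missing_required(example))  # ['SourceFileName']
```

In the example, SourceFileName is blank, so it is the only variable reported; presumably it would be filled in per task or at the group level.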

Variable Reference

The following variables are supported:

  • CopyBehaviour (Optional) - Defines behaviour when copying files from one file system to another. Options are None (default) and MergeFiles (multiple source files are merged into a single file at the destination).

  • DataLakeDatasetFolder (Required) - Name of the folder in the Data Lake containing the dataset.

  • DataLakeSystemFolder (Required) - Name of the parent (System) folder in the Data Lake containing the dataset.

  • DeleteFileFromSourceAfterCopying (Optional) - Should the source file be deleted once it has been successfully copied to its destination?

  • DestinationColumnDelimiter (Optional) - Column delimiter to use for the destination file.

  • DestinationRowDelimiter (Optional) - Row delimiter to use for the destination file.

  • DIUsToUseForCopyActivity (Optional) - Specifies the number of Data Integration Units (DIUs), i.e. the compute power, used by the copy executor. Value can be between 2 and 256. When left at default, the Data Factory dynamically applies the optimal DIU setting based on the source-sink pair and data pattern.

  • Encoding (Optional) - The encoding type used to read/write text files.

  • EscapeCharacter (Optional) - The single character used to escape quotes inside a quoted value. When EscapeCharacter is set to an empty string, QuoteCharacter must be set to an empty string as well (in which case make sure no column value contains a delimiter).

  • ExtractControlVariableName (Optional) - For incremental loads only, the name to assign to the Extract Control variable in State Config for the ExtractControl value derived from the Extract Control Query above.

  • ExtractControlVariableSeedValue (Optional) - The initial value to set for the Extract Control variable in State Config - this will have no impact beyond the original seeding of the Extract Control variable in State Config.

  • FailIfFileNotExists (Optional) - Should the Task fail if the file isn't found? If set to true, the Task will retry until the file arrives (or the Task reaches the maximum retry threshold).

  • FirstRowAsHeader (Optional) - Should the first data row be used as the header?

  • IsFederated (Optional) - Makes task available to other Insight Factories within this organisation.

  • Links (Optional) - NULL

  • MaximumNumberOfAttemptsAllowed (Optional) - The total number of times the running of this Task can be attempted.

  • MinutesToWaitBeforeNextAttempt (Optional) - If a Task run fails, the number of minutes to wait before re-attempting the Task.

  • NullValue (Optional) - The string representation of a null value (leave blank for empty string).

  • QuoteCharacter (Optional) - The single character used to quote a column value if it contains the column delimiter. When QuoteCharacter is set to an empty string, there is no quote character and column values are not quoted; EscapeCharacter is then used to escape the column delimiter and itself.

  • RetainHistory (Optional) - Should the raw files be saved to the History Container to preserve them?

    Retain History? By default, this flag is set to the value assigned in the Configuration item SaveRawFilesToHistory (signalled by the double angle brackets around the Configuration item name, e.g. <<SaveRawFilesToHistory>>). This default behaviour can be overridden here.

  • SkipLineCount (Optional) - The number of non-empty rows to skip when reading data from source files. If both SkipLineCount and FirstRowAsHeader are specified, the lines are skipped first and then the header information is read from the input file.

  • SourceColumnDelimiter (Optional) - Column delimiter of the source file (leave empty for comma).

  • SourceCompressionType (Optional) - Compression type of the source file (e.g. zipped).

  • SourceConnection (Required) - Source connection to use.

  • SourceFileName (Required) - Source file name. Allowed wildcards are * (matching zero or more characters) and ? (matching zero or a single character). You can use ^ to escape your file name if it contains a wildcard character.

  • SourceFilePath (Required) - Path to the source file starting at the storage account Container.

  • TargetConnection (Optional) - Target connection to use.
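
The wildcard rules described for SourceFileName (* for zero or more characters, ? for zero or a single character, ^ to escape a wildcard) can be approximated locally to preview which file names a pattern would select. The following is a sketch under the assumption that those three rules are the whole story; `wildcard_to_regex` and `matches` are hypothetical helpers, and the actual matching is performed by the copy service, which may differ in edge cases.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a SourceFileName-style wildcard pattern into a regex string.

    Assumed rules (taken from the variable description above):
      *   matches zero or more characters
      ?   matches zero or a single character
      ^x  matches the character x literally (escape)
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "^" and i + 1 < len(pattern):
            # Escaped character: take the next character literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".?")
        else:
            parts.append(re.escape(ch))
        i += 1
    return "".join(parts)


def matches(pattern: str, file_name: str) -> bool:
    """Return True if the whole file name matches the wildcard pattern."""
    return re.fullmatch(wildcard_to_regex(pattern), file_name) is not None


print(matches("sales_*.csv", "sales_2024.csv"))  # True
print(matches("file?.csv", "file12.csv"))        # False
print(matches("what^?.csv", "what?.csv"))        # True
```

A quick check like this can help when choosing a pattern, but the result of a real Task run remains the authority on which files are picked up.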