
Ingest Azure Data Lake Delimited File (via Service Principal)

Copy file(s) from Azure Data Lake Storage Gen2 (accessed via a Service Principal) to Azure Data Lake Storage Gen2.

To use this activity within the API, use an ActivityCode of ADLS-SP-DELIMITED-FILE-ADLS.

Example JSON

An example of what the Task Config would look like for a task using this activity. Some of these variables would be set at the group level to avoid duplication between tasks.

NULL
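
No example is rendered for this activity. As a rough sketch only, the Python below builds a Task Config and prints it as JSON, assuming the Task Config is a flat JSON object keyed by the variable names listed under Variable Reference. Every value (connection name, file pattern, folder names) is a hypothetical placeholder, and the missing_required helper is illustrative rather than part of the product.

```python
import json

# Variables marked (Required) in the Variable Reference on this page.
REQUIRED = (
    "SourceConnection",
    "SourceFileName",
    "SourceFilePath",
    "DataLakeSystemFolder",
    "DataLakeDatasetFolder",
)

# Hypothetical placeholder values; substitute your own connections and paths.
task_config = {
    "SourceConnection": "my-adls-service-principal-connection",
    "SourceFileName": "sales_*.csv",
    "SourceFilePath": "landing/sales",
    "DataLakeSystemFolder": "SalesSystem",
    "DataLakeDatasetFolder": "DailySales",
    "SourceColumnDelimiter": "|",
}


def missing_required(config):
    """Return the required variable names that are absent or empty in config."""
    return [name for name in REQUIRED if not config.get(name)]


if __name__ == "__main__":
    # Print the config as the JSON document it is assumed to be.
    print(json.dumps(task_config, indent=2))
    print("Missing required variables:", missing_required(task_config))
```

Checking for the required variables before submitting a task is a cheap way to catch a value that was expected at the group level but never set.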

Variable Reference

The following variables are supported:

  • SourceConnection - (Required) Source connection to use.

  • SourceFileName - (Required) Source file name. Allowed wildcards are * (matching zero or more characters) and ? (matching zero or a single character). You can use ^ to escape your file name if it contains a wildcard character.

  • SourceFilePath - (Required) Path to the source file starting at the storage account Container.

  • FailIfFileNotExists - (Optional) Should the Task fail if the file isn't found? If set to true, the Task will retry until the file arrives (or until the Task reaches the maximum retry threshold).

  • DeleteFileFromSourceAfterCopying - (Optional) Should the source file be deleted once it has been successfully copied to its destination?

  • TargetConnection - (Optional) Target connection to use.

  • DataLakeSystemFolder - (Required) Name of the folder in the Data Lake that acts as the parent folder for all datasets belonging to this System.

  • DataLakeDatasetFolder - (Required) Name of the folder in the Data Lake that the dataset will be stored under. Used with 'Data Lake System Folder' to form the fully qualified path to the dataset within the data Container in the Data Lake.

  • ExtractControlVariableName - (Optional) For incremental loads only, the name to assign the Extract Control variable in State Config for the ExtractControl value derived from the Extract Control Query.

  • ExtractControlVariableSeedValue - (Optional) The initial value to set for the Extract Control variable in State Config - this will have no impact beyond the original seeding of the Extract Control variable in State Config.

  • DestinationColumnDelimiter - (Optional) Column delimiter to use for the destination file.

  • DestinationRowDelimiter - (Optional) Row delimiter to use for the destination file.

  • CopyBehaviour - (Optional) Defines behaviour when copying files from one file system to another. Options are None (default) and MergeFiles (multiple source files are merged into a single file at the destination).

  • SourceColumnDelimiter - (Optional) Column delimiter of the source file (leave empty for comma).

  • SourceCompressionType - (Optional) Compression type of the source file (e.g. zipped).

  • Encoding - (Optional) The encoding type used to read/write text files.

  • EscapeCharacter - (Optional) The single character used to escape quotes inside a quoted value. When EscapeCharacter is set to an empty string, QuoteCharacter must be set to an empty string as well (in which case make sure no column value contains a delimiter).

  • FirstRowAsHeader - (Optional) Should the first data row be used as the header?

  • NullValue - (Optional) The string representation of a null value (leave blank for empty string).

  • QuoteCharacter - (Optional) The single character used to quote a column value if it contains the column delimiter. When QuoteCharacter is set to an empty string, there is no quote character, column values are not quoted, and EscapeCharacter is used to escape the column delimiter and itself.

  • SkipLineCount - (Optional) The number of non-empty rows to skip when reading data from source files. If both SkipLineCount and FirstRowAsHeader are specified, the lines are skipped first and then the header information is read from the input file.

  • DIUsToUseForCopyActivity - (Optional) Specifies the power of the copy executor in Data Integration Units (DIUs). Value can be between 2 and 256. When left at default, the Data Factory dynamically applies the optimal DIU setting based on the source-sink pair and data pattern.

  • MaximumNumberOfAttemptsAllowed - (Optional) The total number of times the running of this Task can be attempted.

  • MinutesToWaitBeforeNextAttempt - (Optional) If a Task run fails, the number of minutes to wait before re-attempting the Task.

  • RetainHistory - (Optional) Should the raw files be saved to the History Container to preserve them?

  • IsFederated - (Optional) Makes the Task available to other Insight Factories within this organisation.

  • Links - (Optional) NULL
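
Two of the behaviours described in the list above are easier to see in code: the SourceFileName wildcard rules, and the order in which SkipLineCount and FirstRowAsHeader are applied. The Python below is an illustrative sketch of those rules as this page words them, not the service's actual implementation. The function names are hypothetical, and the treatment of empty lines (dropped before counting) is an assumption.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a SourceFileName pattern into a compiled regular expression.

    * matches zero or more characters, ? matches zero or one character,
    and ^ escapes the character that follows it.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "^" and i + 1 < len(pattern):
            # Escaped character: match it literally, even if it is * or ?.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".?")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern, file_name):
    """Return True if the whole file name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(file_name) is not None


def apply_skip_and_header(lines, skip_line_count=0, first_row_as_header=False):
    """Skip non-empty rows first, then optionally read the header row.

    Assumption: empty lines are discarded before counting skipped rows.
    Returns (header_or_None, data_rows).
    """
    rows = [line for line in lines if line.strip()]
    rows = rows[skip_line_count:]
    if first_row_as_header and rows:
        return rows[0], rows[1:]
    return None, rows


if __name__ == "__main__":
    print(matches("sales_*.csv", "sales_2024.csv"))
    print(matches("what^?.csv", "what?.csv"))
    print(apply_skip_and_header(["# exported", "", "id,name", "1,a"], 1, True))
```

Trying a pattern against a handful of expected file names in this way can help confirm that a ^ escape is placed where intended before the Task is run.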