Ingest SFTP Binary File

Copy file(s) from SFTP to Azure Data Lake Storage Gen2.

Category: Ingest to Lakehouse | Tags: Ingestion

How it works

Ingests a binary file from the SFTP endpoint '<<SftpHost>>' into the Data Lake location 'raw/<<DataLakeSystemFolder>>/<<DataLakeDatasetFolder>>'.

To use this activity within the API, use an ActivityCode of SFTP-ADLS.
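The target location above is written with <<Name>> placeholders that are filled from the task's variables. The following is a minimal Python sketch of that substitution, intended only to illustrate the notation; the function name `resolve_placeholders` is hypothetical and the product's own resolution logic may differ.

```python
import re

# Matches placeholders written as <<Name>>, the notation used in the description above.
PLACEHOLDER = re.compile(r"<<(\w+)>>")


def resolve_placeholders(template: str, variables: dict) -> str:
    """Replace each <<Name>> with variables[Name]; raise KeyError if a name is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"No value supplied for <<{name}>>")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


# Values taken from the Example JSON on this page.
task_config = {"DataLakeSystemFolder": "my_folder", "DataLakeDatasetFolder": "data"}
target = resolve_placeholders(
    "raw/<<DataLakeSystemFolder>>/<<DataLakeDatasetFolder>>", task_config
)
print(target)  # raw/my_folder/data
```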

Available Connections

SourceConnection:

TargetConnection:

Example JSON

An example of what the Task Config would look like for a task using this activity. Some of these variables would be set at the group level to avoid duplication between tasks.

{
"SourceConnection": "MY-SOURCE-CONN",
"SourceFileName": "",
"SourceFilePath": "/path/to",
"DataLakeSystemFolder": "my_folder",
"DataLakeDatasetFolder": "data",
"TargetConnection": "MY-TARGET-CONN"
}
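
As a rough illustration of how group-level and task-level variables might combine, the Python sketch below merges two dictionaries and checks that every variable marked (Required) in the Variable Reference is present. The split between group and task shown here is hypothetical, and the rule that task-level values override group-level values is an assumption, not documented behaviour.

```python
import json

# Variables marked (Required) in the Variable Reference on this page.
REQUIRED = [
    "DataLakeDatasetFolder",
    "DataLakeSystemFolder",
    "SourceConnection",
    "SourceFileName",
    "SourceFilePath",
]


def effective_config(group_config: dict, task_config: dict) -> dict:
    """Combine both levels. Assumption: task-level values win on conflict."""
    return {**group_config, **task_config}


def missing_required(config: dict) -> list:
    """Return the required variable names that are absent from config."""
    return [name for name in REQUIRED if name not in config]


# Hypothetical split: connections and the system folder are shared by the group.
group_config = json.loads("""{
    "SourceConnection": "MY-SOURCE-CONN",
    "TargetConnection": "MY-TARGET-CONN",
    "DataLakeSystemFolder": "my_folder"
}""")
task_config = json.loads("""{
    "SourceFileName": "",
    "SourceFilePath": "/path/to",
    "DataLakeDatasetFolder": "data"
}""")

config = effective_config(group_config, task_config)
print(missing_required(config))  # []
```

The check only tests for the presence of each key; whether an empty string is an acceptable value for a required variable is not stated here, so the sketch does not reject it.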

Variable Reference

The following variables are supported:

  • DataLakeDatasetFolder (Required) - Name of the folder in the Data Lake containing the dataset.

  • DataLakeSystemFolder (Required) - Name of the parent (System) folder in the Data Lake containing the dataset.

  • DeleteFileFromSourceAfterCopying (Optional) - Should the source file be deleted once it has been successfully copied to its destination?

  • DIUsToUseForCopyActivity (Optional) - Number of Data Integration Units (DIUs) allocated to the copy activity, which determines how much compute power the copy executor has. Value can be between 2 and 256. When left at default, the Data Factory dynamically applies the optimal DIU setting based on the source-sink pair and data pattern.

  • ExtractControlVariableName (Optional) - For incremental loads only, the name to assign the Extract Control variable in State Config for the ExtractControl value derived from the Extract Control Query above.

  • ExtractControlVariableSeedValue (Optional) - The initial value to set for the Extract Control variable in State Config - this will have no impact beyond the original seeding of the Extract Control variable in State Config.

  • FailIfFileNotExists (Optional) - Should the Task fail if the file isn't found? If set to true, the Task will retry until the file arrives (or the Task reaches the maximum retry threshold).

  • IsFederated (Optional) - Makes task available to other Insight Factories within this organisation.

  • Links (Optional) - NULL

  • MaximumNumberOfAttemptsAllowed (Optional) - The total number of times the running of this Task can be attempted.

  • MinutesToWaitBeforeNextAttempt (Optional) - If a Task run fails, the number of minutes to wait before re-attempting the Task.

  • RetainHistory (Optional) - Should the raw files be saved to the History Container to preserve them?

    **Retain History?** By default, this flag is set to the value assigned in the Configuration item SaveRawFilesToHistory (signalled by the double triangle brackets around the Configuration item name, e.g. <<SaveRawFilesToHistory>>). This default behaviour can be overridden here.

  • SourceArchiveSubFolderName (Optional) - To move the Source file to a source sub-folder after a successful copy, provide the name of the 'archive' sub-folder. If no name is provided, the Source file will not be archived.

  • SourceConnection (Required) - Source connection to use.

  • SourceFileName (Required) - Source file name. Allowed wildcards are * (matching zero or more characters) and ? (matching zero or a single character). Use ^ to escape a wildcard character that is literally part of the file name.

  • SourceFilePath (Required) - Relative path to the source file.

  • TargetConnection (Optional) - Target connection to use.
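
The wildcard rules described for SourceFileName can be made concrete with a small Python sketch. It translates a pattern into a regular expression following the stated semantics (* for zero or more characters, ? for zero or one character, ^ as the escape character). The function names are hypothetical and this is only an illustration of the rules as documented; the actual matching is performed by the service and may differ in edge cases.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a SourceFileName-style wildcard pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "^" and i + 1 < len(pattern):
            # ^ escapes the next character, so it is matched literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".?")   # zero or a single character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


def matches(pattern: str, file_name: str) -> bool:
    """Return True when the whole file name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(file_name) is not None


print(matches("*.bin", "export.bin"))         # True
print(matches("file?.bin", "file.bin"))       # True
print(matches("file?.bin", "file12.bin"))     # False
print(matches("what^?.bin", "what?.bin"))     # True
```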