Ingest PostgreSQL as Parquet

Copy data from PostgreSQL to Parquet format in Azure Data Lake Storage Gen2.

To use this activity within the API, use an ActivityCode of POSTGRESQLV2-ADLS.

Example JSON

An example of the Task Config for a task that uses this activity. Some of these variables would typically be set at the group level to avoid duplication between tasks.

NULL

Variable Reference

The following variables are supported:

  • SourceConnection - (Required) Source connection to use.

  • ExtractQuery - (Required) SQL query to extract data from the source database.

  • TargetConnection - (Optional) Target connection to use.

  • DataLakeSystemFolder - (Required) Name of the folder in the Data Lake that acts as the parent folder for all datasets belonging to this System.

  • DataLakeDatasetFolder - (Required) Name of the folder in the Data Lake that the dataset will be stored under. Used with DataLakeSystemFolder to form the fully qualified path to the dataset within the data Container in the Data Lake.

  • ElevateToDelta - (Optional) Ingest directly to a Lakehouse Table.

  • DeltaSchemaName - (Optional) The name of the Schema this transformation lives in. Required if Copy to Lakehouse Table is enabled.

  • DeltaTableName - (Optional) The name of the Lakehouse Table representing this transformation. Required if Copy to Lakehouse Table is enabled.

  • ExtractControlQuery - (Optional) For incremental loads only, a SQL query to get a 'high-water' mark for extract control. For instance, this could be the maximum value of a modified_date or an identity column. NOTE: The column returned must be aliased as ExtractControl, e.g. select max(modified_date) as ExtractControl from some_table.

  • ExtractControlVariableName - (Optional) For incremental loads only, the name of the State Config variable that stores the ExtractControl value returned by the Extract Control Query above.

  • ExtractControlVariableSeedValue - (Optional) The initial value to set for the Extract Control variable in State Config. It has no effect beyond the original seeding of that variable.

  • DIUsToUseForCopyActivity - (Optional) Specifies the power of the copy executor in Data Integration Units (DIUs). Value can be between 2 and 256. When left at default, the Data Factory dynamically applies the optimal DIU setting based on the source-sink pair and data pattern.

  • MaximumNumberOfAttemptsAllowed - (Optional) The maximum number of times a run of this Task can be attempted.

  • MinutesToWaitBeforeNextAttempt - (Optional) If a Task run fails, the number of minutes to wait before re-attempting the Task.

  • RetainHistory - (Optional) Whether the raw files are saved to the History Container to preserve them.

  • IsFederated - (Optional) Makes task available to other Insight Factories within this organisation.
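
The rules in the variable reference above can be sketched as a small pre-flight check. This is only an illustration, not part of the product: the helper name check_task_config, the connection and folder names, and the value types (for example, ElevateToDelta as a boolean and DIUsToUseForCopyActivity as an integer) are all assumptions. It also assumes that "Copy to Lakehouse Table" in the descriptions refers to ElevateToDelta.

```python
import json

# Variable names come from the reference above; the values, value types
# and the helper itself are hypothetical.
REQUIRED = ("SourceConnection", "ExtractQuery",
            "DataLakeSystemFolder", "DataLakeDatasetFolder")


def check_task_config(config: dict) -> list[str]:
    """Return a list of problems found in a Task Config dict (empty if none)."""
    problems = [f"{name} is required" for name in REQUIRED if not config.get(name)]

    # Assumption: DeltaSchemaName and DeltaTableName become required
    # once ElevateToDelta is switched on (treated here as a boolean).
    if config.get("ElevateToDelta"):
        for name in ("DeltaSchemaName", "DeltaTableName"):
            if not config.get(name):
                problems.append(f"{name} is required when ElevateToDelta is enabled")

    # The control query must return a column aliased as ExtractControl.
    control_query = config.get("ExtractControlQuery")
    if control_query:
        normalised = " ".join(control_query.lower().split())
        if "as extractcontrol" not in normalised:
            problems.append("ExtractControlQuery must alias its column as ExtractControl")

    # DIUs, when set explicitly, must fall between 2 and 256.
    dius = config.get("DIUsToUseForCopyActivity")
    if dius is not None and not 2 <= int(dius) <= 256:
        problems.append("DIUsToUseForCopyActivity must be between 2 and 256")

    return problems


# A hypothetical incremental-load Task Config; every value is a placeholder.
example = {
    "SourceConnection": "my_postgres_connection",
    "ExtractQuery": "select * from some_table",
    "DataLakeSystemFolder": "my_system",
    "DataLakeDatasetFolder": "some_table",
    "ElevateToDelta": True,
    "DeltaSchemaName": "my_schema",
    "DeltaTableName": "some_table",
    "ExtractControlQuery": "select max(modified_date) as ExtractControl from some_table",
    "ExtractControlVariableName": "some_table_modified_date",
    "ExtractControlVariableSeedValue": "1900-01-01",
    "DIUsToUseForCopyActivity": 4,
}

print(check_task_config(example))      # prints []
print(json.dumps(example, indent=2))   # the config serialised as JSON
```

The check returns a list of messages instead of raising on the first problem, so every issue in a config can be reported in one pass. How the real service validates or types these variables is not stated in this page, so treat the output only as a rough guide.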