Create/Update a Feature Store Table

Runs an Enrichment Notebook that creates or updates a Lakehouse Table.

To use this activity within the API, use an ActivityCode of ML-FEATURE-STORE-TABLE.

Example JSON

An example of what the Task Config would look like for a task using this activity. Some of these variables would be set at the group level to avoid duplication between tasks.

NULL
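As a rough illustration only, the sketch below builds a Task Config as a Python dictionary and serialises it to JSON. Only the variable names come from the Variable Reference on this page. Every value is hypothetical, the value types (numbers versus strings) and the nesting of NotebookParameters as a JSON object are assumptions, and DeltaTableUpdateType is left as a placeholder because its permitted values are not listed here.

```python
import json

# Hypothetical Task Config: variable names are from this page,
# all values are made up for illustration.
task_config = {
    "NotebookPath": "Enrichment/customer_features",  # assumed relative path
    "NotebookParameters": {"Param1": "Value1", "Param2": "Value2"},  # assumed nested object
    "DeltaSchemaName": "feature_store",
    "DeltaTableName": "customer_features",
    "DeltaTableUpdateType": "<update type>",  # placeholder: valid values are not listed on this page
    "DeltaTablePrimaryKeyColumnList": "CustomerId,SnapshotDate",  # column names are case-sensitive
    "DeltaTablePartitionColumnList": "SnapshotYear,SnapshotMonth",
    "PartitionDepthToReplace": 1,  # must not exceed the number of partition columns
    "MaximumNumberOfAttemptsAllowed": 3,
    "MinutesToWaitBeforeNextAttempt": 5,
}

# Serialise to the JSON text that would be supplied as the Task Config.
task_config_json = json.dumps(task_config, indent=2)
print(task_config_json)
```

Variables shared by several tasks, such as DeltaSchemaName, are the sort of setting that could be moved to the group level instead of being repeated in each task.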

Variable Reference

The following variables are supported:

  • NotebookPath - (Required) The relative path to the Databricks Notebook that will prepare the Enrichment data.

  • NotebookParameters - (Optional) Parameters for use in the Databricks Notebook, supplied in JSON format, e.g. { "Param1": "Value1", "Param2": "Value2" }.

  • AdditionalNotebooks - (Optional) The Path to other Notebooks referenced by the main Notebook.

  • DeltaSchemaName - (Required) The name of the Schema this Feature Table lives in.

  • DeltaTableName - (Required) The name of the Lakehouse Table representing the output of this Enrichment.

  • DeltaTableComments - (Optional) Comments to add to the Lakehouse Table.

  • DeltaTableUpdateType - (Required) Indicates what type of update (if any) is to be performed on the Lakehouse Table.

  • DeltaTablePrimaryKeyColumnList - (Required) Comma-separated list of Primary Key columns in the Lakehouse Table. NOTE: Column names are case-sensitive.

  • DeltaTablePartitionColumnList - (Optional) Comma-separated ordered list of columns forming the Partitioning strategy of the Lakehouse Table.

  • PartitionDepthToReplace - (Optional) The number of columns in the 'Lakehouse Table Partition Column List' (DeltaTablePartitionColumnList), counting from the first column in order, to use in a Partition Replacement. NOTE: This cannot be greater than the number of columns defined in that list. Defaults to 1 if only one column has been specified in that list.

  • DatabricksClusterId - (Optional) The Id of the Databricks Cluster to use to run the Notebook.

  • ExtractControlVariableName - (Optional) For incremental loads only, the name to give the Extract Control variable in State Config, which holds the ExtractControl value derived from the Extract Control Query.

  • ExtractControlVariableSeedValue - (Optional) The initial value to set for the Extract Control variable in State Config - this will have no impact beyond the original seeding of the Extract Control variable in State Config.

  • SkipCreateVolumeAndSchema - (Optional) If the Schema and/or Volume have already been created, you can opt to skip the check for them, which improves performance.

  • MaximumNumberOfAttemptsAllowed - (Optional) The total number of times the running of this Task can be attempted.

  • MinutesToWaitBeforeNextAttempt - (Optional) If a Task run fails, the number of minutes to wait before re-attempting the Task.

  • IsFederated - (Optional) Makes this task available to other Insight Factories within this organisation.

  • Links - (Optional) NULL
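The rules stated in the list above (which variables are marked Required, and the limit on PartitionDepthToReplace) can be checked before a Task Config is submitted. The following is a minimal sketch of such a check, not part of the product: the helper names `split_column_list` and `check_task_config` are hypothetical, and it assumes the config is available as a Python dictionary keyed by variable name.

```python
# Variables marked (Required) in the Variable Reference above.
REQUIRED_VARIABLES = (
    "NotebookPath",
    "DeltaSchemaName",
    "DeltaTableName",
    "DeltaTableUpdateType",
    "DeltaTablePrimaryKeyColumnList",
)


def split_column_list(value):
    """Split a comma-separated column list, trimming spaces and preserving case."""
    return [col.strip() for col in (value or "").split(",") if col.strip()]


def check_task_config(config):
    """Return a list of problems found in the config (empty if none were found)."""
    problems = []

    # Every required variable must be present and non-empty.
    for name in REQUIRED_VARIABLES:
        if not config.get(name):
            problems.append(f"Missing required variable: {name}")

    # PartitionDepthToReplace cannot exceed the number of partition columns.
    partition_columns = split_column_list(config.get("DeltaTablePartitionColumnList"))
    depth = config.get("PartitionDepthToReplace")
    if depth is not None and int(depth) > len(partition_columns):
        problems.append(
            f"PartitionDepthToReplace ({depth}) exceeds the number of "
            f"partition columns ({len(partition_columns)})"
        )

    return problems
```

Calling `check_task_config` on a dictionary that names two partition columns but sets PartitionDepthToReplace to 3 would report the depth as a problem, while a dictionary containing all required variables and a depth within range would produce an empty list. The check deliberately does not validate DeltaTableUpdateType values, since they are not listed on this page.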