Analyse Model Drift
Runs the Model Drift Notebook, which creates or updates a Lakehouse Table containing the model drift analysis.
To use this activity within the API, use an ActivityCode of MODEL-DRIFT.
Example JSON
An example of what the Task Config would look like for a task using this activity. Some of these variables would be set at the group level to avoid duplication between tasks.
{
  "DeltaQuery": "select * from bom.climate",
  "TargetSchemaName": "dbo",
  "TargetTableName": "climate",
  "PreCopyScript": "if object_id('bom.climate') is not null truncate table bom.climate"
}
Variable Reference
The following variables are supported:
- NotebookPath - (Required) The relative path to the Databricks Notebook that will prepare the Enrichment data.
- NotebookParameters - (Optional) Parameters for use in the Databricks Notebook, in JSON format, e.g. { "Param1": "Value1", "Param2": "Value2" }.
- AdditionalNotebooks - (Optional) The path to other Notebooks referenced by the main Notebook.
- DeltaSchemaName - (Required) The name of the Schema this Enrichment lives in.
- DeltaTableName - (Required) The name of the Lakehouse Table representing the output of this Enrichment.
- DeltaTableComments - (Optional) Comments to add to the Lakehouse Table.
- DeltaTableUpdateType - (Required) Indicates what type of update (if any) is to be performed on the Lakehouse Table.
- DeltaTableBusinessKeyColumnList - (Optional) Comma-separated list of Business Key columns in the Lakehouse Table. This is required if 'Lakehouse Table Update Type' is 'Dimension' or 'Merge'. If a value is specified, a uniqueness test is performed against this (composite) key for both the result of the Enrichment and the Lakehouse Table.
- DeltaTablePartitionColumnList - (Optional) Comma-separated ordered list of columns forming the Partitioning strategy of the Lakehouse Table.
- PartitionDepthToReplace - (Optional) The number of columns in 'Lakehouse Table Partition Column List' (counting from the first column in order) to use in a Partition Replacement. NOTE: This cannot be greater than the number of columns defined in the 'Lakehouse Table Partition Column List'. Defaults to 1 if only one column has been specified in 'Lakehouse Table Partition Column List'.
- DatabricksClusterId - (Optional) The Id of the Databricks Cluster to use to run the Notebook.
- ExtractControlVariableName - (Optional) For incremental loads only, the name to assign the Extract Control variable in State Config for the ExtractControl value derived from the Extract Control Query above.
- ExtractControlVariableSeedValue - (Optional) The initial value to set for the Extract Control variable in State Config. This has no impact beyond the original seeding of the Extract Control variable in State Config.
- FormatAddedColumnsAsSnakeCase - (Optional) When 'Lakehouse Table Update Type' is 'Dimension', additional columns are added to the Enrichment. A value of true will format the names of these extra columns as snake case (i.e. all lowercase with underscores between words).
- SkipCreateVolumeAndSchema - (Optional) If a Schema and/or Volume has already been created, you can opt to skip this check, which improves performance.
- MaximumNumberOfAttemptsAllowed - (Optional) The total number of times the running of this Task can be attempted.
- MinutesToWaitBeforeNextAttempt - (Optional) If a Task run fails, the number of minutes to wait before re-attempting the Task.
- IsFederated - (Optional) Makes this task available to other Insight Factories within this organisation.
- Links - (Optional) NULL
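As a rough illustration of how the variables above might fit together, the following is a minimal sketch of a Task Config for a merge-style update. The keys are taken from the Variable Reference. Every value (the notebook path, schema, table and column names, and the retry settings) is hypothetical and chosen only for illustration. Writing every value as a string is also an assumption. 'Merge' is used for DeltaTableUpdateType because it is one of the update types named in the reference, which is why DeltaTableBusinessKeyColumnList is included here.

```json
{
  "NotebookPath": "drift/analyse_model_drift",
  "DeltaSchemaName": "ml_monitoring",
  "DeltaTableName": "model_drift",
  "DeltaTableUpdateType": "Merge",
  "DeltaTableBusinessKeyColumnList": "model_id,evaluation_date",
  "MaximumNumberOfAttemptsAllowed": "3",
  "MinutesToWaitBeforeNextAttempt": "5"
}
```

Variables shared by several tasks, such as DeltaSchemaName, could instead be set at the group level to avoid repeating them in each task.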