Ingesting Data from a Database

Learn how to ingest data from relational databases such as SQL Server or Azure SQL Database into your Lakehouse.

Overview

Database ingestion is one of the most common data pipeline patterns. In this guide, you'll learn how to:

  • Create a database connection
  • Configure a task with an ingest activity
  • Run the ingestion and verify data landed correctly

Prerequisites

  • An existing Production Line (see Production Lines)
  • Database credentials or connection details
  • Network connectivity to your source database
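The network prerequisite can be checked before you create the connection. The following is a minimal sketch that tests whether a TCP port is reachable. It assumes the database listens on port 1433 (the SQL Server default), and the function name can_reach is illustrative only. It tests reachability from the machine where the script runs, which may sit on a different network from the platform itself.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 1433 is the SQL Server default; adjust it if your server differs.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example call (the host name below is a placeholder, not a real server):
# can_reach("myserver.database.windows.net")
```

A True result shows only that the port is open. It does not validate credentials or database permissions.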

Step-by-Step Guide

1. Create a database connection

Before you can ingest data, you need to create a connection to your source database:

  1. Navigate to Build > Connections
  2. Click New Connection
  3. Select the appropriate connection type (e.g., Azure SQL Database, SQL Server)
  4. Enter your connection details:
    • Server name
    • Database name
    • Authentication method
    • Credentials
  5. Test the connection to ensure it works
  6. Save the connection
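If you want to try the same details outside the platform first, the fields above map onto a standard ODBC connection string. The sketch below assembles one for SQL authentication only. It assumes the Microsoft "ODBC Driver 18 for SQL Server" driver, and the function and parameter names are hypothetical. It shows the general ODBC format, not how Insight Factory stores connections internally.

```python
def build_odbc_connection_string(
    server: str,
    database: str,
    username: str,
    password: str,
    driver: str = "ODBC Driver 18 for SQL Server",  # assumed driver name
    port: int = 1433,
) -> str:
    """Assemble an ODBC connection string for SQL authentication."""
    # ODBC values containing special characters are wrapped in braces;
    # a literal closing brace inside the value is escaped by doubling it.
    escaped_password = "{" + password.replace("}", "}}") + "}"
    parts = [
        ("Driver", "{" + driver + "}"),
        ("Server", f"tcp:{server},{port}"),
        ("Database", database),
        ("Uid", username),
        ("Pwd", escaped_password),
        ("Encrypt", "yes"),
        ("TrustServerCertificate", "no"),
    ]
    return ";".join(f"{key}={value}" for key, value in parts) + ";"
```

The resulting string can be passed to an ODBC client library to confirm that the server name, database name, and credentials are correct before you enter them in the connection form.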

2. Create an ingestion task

  1. Open your production line and navigate to the Graph view
  2. Add a new task using one of these methods:
    • Click the + button in the graph side menu
    • Right-click on an existing node and select Add Task from the context menu
  3. Enter a unique Code and Name for your task
  4. Select an ingestion activity from the Activity dropdown (e.g., "Ingest from Azure SQL Database to Lakehouse")
  5. Configure the task properties:
    • Select your connection
    • Choose the source table or query
    • Configure the destination schema and table name
    • Set any column mappings if needed
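To illustrate what a column mapping does conceptually, here is a minimal sketch that renames source columns to destination columns for a single row. The dictionary-based mapping format and the function name are assumptions made for illustration. The guide does not specify how the platform represents or applies mappings.

```python
def apply_column_mapping(row: dict, mapping: dict) -> dict:
    """Rename the keys of `row` according to `mapping`.

    Columns that do not appear in `mapping` keep their original name.
    """
    renamed = {}
    for column, value in row.items():
        target = mapping.get(column, column)
        if target in renamed:
            # Two source columns must not land in the same destination column.
            raise ValueError(f"duplicate destination column: {target!r}")
        renamed[target] = value
    return renamed


# Hypothetical source row and mapping.
source_row = {"CustID": 42, "CustName": "Contoso", "Region": "EU"}
mapping = {"CustID": "customer_id", "CustName": "customer_name"}
mapped_row = apply_column_mapping(source_row, mapping)
```

Here mapped_row holds the keys customer_id, customer_name, and Region, since Region has no entry in the mapping.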

3. Run the ingestion

  1. Save your task configuration
  2. Click Run to execute the ingestion
  3. Monitor the progress in the task details

4. Verify your data

After the task completes:

  1. Check that the task status shows success
  2. Navigate to your Lakehouse to verify the data landed correctly
  3. Compare row counts against the source and review sample data
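The row count check can be scripted. The sketch below compares the number of rows in a source table and a destination table. It uses in-memory SQLite databases purely as stand-ins so the example runs anywhere. In practice the source would be your SQL Server or Azure SQL database and the destination your Lakehouse table, queried through whatever SQL interface you have available. All function and table names are hypothetical.

```python
import sqlite3


def row_count(conn: sqlite3.Connection, table: str) -> int:
    """Return the number of rows in `table`."""
    # Table names cannot be bound as SQL parameters, so only allow
    # simple identifiers before interpolating the name into the query.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def compare_counts(source, source_table, dest, dest_table):
    """Return a tuple (matches, source_count, dest_count)."""
    src = row_count(source, source_table)
    dst = row_count(dest, dest_table)
    return src == dst, src, dst


def make_demo_db(table: str, rows: list) -> sqlite3.Connection:
    """Build an in-memory database with one (id, name) table (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)
    conn.commit()
    return conn
```

Matching counts are a necessary check but not a sufficient one. A spot check of sample rows is still worthwhile, because truncated or mis-mapped values do not change the row count.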

Key Concepts

Term          Definition
Connection    A saved configuration for connecting to an external data source
Ingestion     The process of extracting data from a source and loading it into the Lakehouse
Lakehouse     The Delta Lake-based data storage in Insight Factory