Version: devel

Configuration Reference

This page is a reference for most configuration options and objects available in dlt.
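
Every option listed below can be supplied in code, in config.toml / secrets.toml, or through environment variables, using the section and field names shown on this page. As a minimal sketch (field names taken from PostgresClientConfiguration and PostgresCredentials below; values are placeholders), the environment-variable form upper-cases sections and fields and joins them with double underscores:

  import os
  import dlt

  # Placeholder values; the section path destination.postgres.credentials
  # becomes DESTINATION__POSTGRES__CREDENTIALS as an environment variable.
  os.environ["DESTINATION__POSTGRES__CREATE_INDEXES"] = "false"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__HOST"] = "localhost"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__USERNAME"] = "loader"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__PASSWORD"] = "***"

  pipeline = dlt.pipeline("config_demo", destination="postgres", dataset_name="demo")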

Destination Configurations

AthenaClientConfiguration

Configuration for the Athena destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - AwsCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • query_result_bucket - str
  • athena_work_group - str
  • aws_data_catalog - str
  • connection_params - typing.Dict[str, typing.Any]
  • force_iceberg - bool
  • table_location_layout - str
  • table_properties - typing.Dict[str, str]
  • db_location - str
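
For example, the Athena-specific fields above could be set through environment variables (a sketch with placeholder values; the same keys work in config.toml / secrets.toml under the destination.athena section):

  import os

  # Placeholder values for the Athena fields listed above.
  os.environ["DESTINATION__ATHENA__QUERY_RESULT_BUCKET"] = "s3://my-athena-query-results"
  os.environ["DESTINATION__ATHENA__ATHENA_WORK_GROUP"] = "primary"
  os.environ["DESTINATION__ATHENA__AWS_DATA_CATALOG"] = "awsdatacatalog"
  os.environ["DESTINATION__ATHENA__CREDENTIALS__AWS_ACCESS_KEY_ID"] = "AKIA..."
  os.environ["DESTINATION__ATHENA__CREDENTIALS__AWS_SECRET_ACCESS_KEY"] = "***"
  os.environ["DESTINATION__ATHENA__CREDENTIALS__REGION_NAME"] = "eu-central-1"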

BigQueryClientConfiguration

Configuration for the BigQuery destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - GcpServiceAccountCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • location - str
  • project_id - str
    Note, that this is BigQuery project_id which could be different from credentials.project_id
  • has_case_sensitive_identifiers - bool
    If True then dlt expects to load data into case sensitive dataset
  • should_set_case_sensitivity_on_new_dataset - bool
    If True, dlt will set case sensitivity flag on created datasets that corresponds to naming convention
  • http_timeout - float
    connection timeout for http request to BigQuery api
  • file_upload_timeout - float
    a timeout for file upload when loading local files
  • retry_deadline - float
How long to retry the operation in case of error; the backoff deadline defaults to 60 s.
  • batch_size - int
    Number of rows in streaming insert batch
  • autodetect_schema - bool
    Allow BigQuery to autodetect schemas and create data tables
  • ignore_unknown_values - bool
    Ignore unknown values in the data

ClickHouseClientConfiguration

Configuration for the ClickHouse destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - ClickHouseCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • dataset_table_separator - str
    Separator for dataset table names, defaults to '___', i.e. 'database.dataset___table'.
  • table_engine_type - merge_tree | shared_merge_tree | replicated_merge_tree
    The default table engine to use. Defaults to merge_tree. Other implemented options are shared_merge_tree and replicated_merge_tree.
  • dataset_sentinel_table_name - str
    Special table to mark dataset as existing
  • staging_use_https - bool
    Connect to the staging buckets via https

CustomDestinationClientConfiguration

Configuration for a custom (user-defined) destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - CredentialsConfiguration
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • destination_callable - str | typing.Callable[[typing.Union[typing.Any, typing.List[typing.Any], str], dlt.common.schema.typing.TTableSchema], NoneType]
  • loader_file_format - jsonl | typed-jsonl | insert_values | parquet | csv | reference | model
  • batch_size - int
  • skip_dlt_columns_and_tables - bool
  • max_table_nesting - int
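
These fields correspond to the arguments of the @dlt.destination decorator; a minimal sketch of a custom destination (the printing sink below is illustrative, not part of dlt):

  import dlt
  from dlt.common.typing import TDataItems
  from dlt.common.schema.typing import TTableSchema

  # batch_size, loader_file_format and skip_dlt_columns_and_tables map to the
  # configuration fields listed above; the decorated function is the destination_callable.
  @dlt.destination(batch_size=10, loader_file_format="jsonl", skip_dlt_columns_and_tables=True)
  def print_sink(items: TDataItems, table: TTableSchema) -> None:
      print(f"{table['name']}: received {len(items)} items")

  pipeline = dlt.pipeline("custom_destination_demo", destination=print_sink)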

DatabricksClientConfiguration

Configuration for the Databricks destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - DatabricksCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • staging_credentials_name - str
  • is_staging_external_location - bool
    If true, the temporary credentials are not propagated to the COPY command
  • staging_volume_name - str
    Name of the Databricks managed volume for temporary storage, e.g., catalog_name.database_name.volume_name. Defaults to '_dlt_temp_load_volume' if not set.
  • keep_staged_files - bool
    Tells if to keep the files in internal (volume) stage

DestinationClientConfiguration

Base configuration shared by all destinations

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - CredentialsConfiguration
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod

DestinationClientDwhConfiguration

Configuration of a destination that supports datasets/schemas

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - CredentialsConfiguration
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.

DestinationClientDwhWithStagingConfiguration

Configuration of a destination that can take data from staging destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - CredentialsConfiguration
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.

DestinationClientStagingConfiguration

Configuration of a staging destination, able to store files with desired layout at bucket_url.

Also supports datasets and can act as standalone destination.

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - CredentialsConfiguration
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • as_staging_destination - bool
  • bucket_url - str
  • layout - str

DremioClientConfiguration

Configuration for the Dremio destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - DremioCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • staging_data_source - str
    The name of the staging data source

DuckDbClientConfiguration

Configuration for the DuckDB destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - DuckDbCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • local_dir - str
  • pipeline_name - str
  • pipeline_working_dir - str
  • legacy_db_path - str
  • create_indexes - bool

DummyClientConfiguration

Configuration for the dummy destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - DummyClientCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • loader_file_format - jsonl | typed-jsonl | insert_values | parquet | csv | reference | model
  • fail_schema_update - bool
  • fail_prob - float
    probability of a terminal failure
  • retry_prob - float
    probability of job retry
  • completed_prob - float
    probability of successful job completion
  • exception_prob - float
    probability of a transient exception when running a job
  • timeout - float
    timeout time
  • fail_terminally_in_init - bool
    raise terminal exception in job init
  • fail_transiently_in_init - bool
    raise transient exception in job init
  • truncate_tables_on_staging_destination_before_load - bool
    truncate tables on staging destination
  • create_followup_jobs - bool
    create followup job for individual jobs
  • fail_followup_job_creation - bool
    Raise a generic exception during followup job creation
  • fail_table_chain_followup_job_creation - bool
    Raise a generic exception during table chain followup job creation
  • create_followup_table_chain_sql_jobs - bool
    create a table chain merge job which is guaranteed to fail
  • create_followup_table_chain_reference_jobs - bool
    create table chain jobs which succeed

FilesystemConfigurationWithLocalFiles

Filesystem configuration extended with local directory settings

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - AwsCredentials | GcpServiceAccountCredentials | AzureCredentialsWithoutDefaults | AzureServicePrincipalCredentialsWithoutDefaults | AzureCredentials | AzureServicePrincipalCredentials | GcpOAuthCredentials | SFTPCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • local_dir - str
  • pipeline_name - str
  • pipeline_working_dir - str
  • legacy_db_path - str
  • bucket_url - str
  • read_only - bool
    Indicates read only filesystem access. Will enable caching
  • kwargs - typing.Dict[str, typing.Any]
    Additional arguments passed to the fsspec constructor, e.g. dict(use_ssl=True) for s3fs
  • client_kwargs - typing.Dict[str, typing.Any]
    Additional arguments passed to the underlying fsspec native client, e.g. dict(verify="public.crt") for botocore
  • deltalake_storage_options - typing.Dict[str, typing.Any]
  • deltalake_configuration - typing.Dict[str, typing.Optional[str]]

FilesystemDestinationClientConfiguration

Configuration for the filesystem destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - AwsCredentials | GcpServiceAccountCredentials | AzureCredentialsWithoutDefaults | AzureServicePrincipalCredentialsWithoutDefaults | AzureCredentials | AzureServicePrincipalCredentials | GcpOAuthCredentials | SFTPCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • as_staging_destination - bool
  • bucket_url - str
  • layout - str
  • local_dir - str
  • pipeline_name - str
  • pipeline_working_dir - str
  • legacy_db_path - str
  • read_only - bool
    Indicates read only filesystem access. Will enable caching
  • kwargs - typing.Dict[str, typing.Any]
    Additional arguments passed to the fsspec constructor, e.g. dict(use_ssl=True) for s3fs
  • client_kwargs - typing.Dict[str, typing.Any]
    Additional arguments passed to the underlying fsspec native client, e.g. dict(verify="public.crt") for botocore
  • deltalake_storage_options - typing.Dict[str, typing.Any]
  • deltalake_configuration - typing.Dict[str, typing.Optional[str]]
  • current_datetime - pendulum.datetime.DateTime | typing.Callable[[], pendulum.datetime.DateTime]
  • extra_placeholders - typing.Dict[str, typing.Union[str, int, pendulum.datetime.DateTime, typing.Callable[[str, str, str, str, str], str]]]
  • max_state_files - int
    Maximum number of pipeline state files to keep; 0 or negative value disables cleanup.
  • always_refresh_views - bool
    Always refresh table scanner views by setting the newest table metadata or globbing table files
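
The layout and placeholder fields above control file naming on the bucket. A sketch passing them directly to the filesystem factory (assuming, as with other factories, that configuration fields are accepted as keyword arguments; they can equally be set in config.toml or via environment variables):

  import dlt

  # "{table_name}/{load_id}.{file_id}.{ext}" uses built-in placeholders;
  # "owner" is an illustrative custom placeholder supplied via extra_placeholders.
  fs = dlt.destinations.filesystem(
      bucket_url="s3://my-bucket/raw",
      layout="{owner}/{table_name}/{load_id}.{file_id}.{ext}",
      extra_placeholders={"owner": "analytics"},
  )

  pipeline = dlt.pipeline("filesystem_demo", destination=fs, dataset_name="raw")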

LanceDBClientConfiguration

Configuration for the LanceDB destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - LanceDBCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • local_dir - str
  • pipeline_name - str
  • pipeline_working_dir - str
  • legacy_db_path - str
  • lance_uri - str
    LanceDB database URI. Defaults to local, on-disk instance.
  • dataset_separator - str
    Character for the dataset separator.
  • options - LanceDBClientOptions
    LanceDB client options.
  • embedding_model_provider - gemini-text | bedrock-text | cohere | gte-text | imagebind | instructor | open-clip | openai | sentence-transformers | huggingface | colbert | ollama
    Embedding provider used for generating embeddings. Default is "cohere". You can find the full list of supported providers in the LanceDB documentation.
  • embedding_model_provider_host - str
    Full host URL with protocol and port (e.g. 'http://localhost:11434'). Uses LanceDB's default if not specified, assuming the provider accepts this parameter.
  • embedding_model - str
    The model used by the embedding provider for generating embeddings.
  • embedding_model_dimensions - int
    The dimensions of the generated embeddings. In most cases this is inferred automatically by LanceDB.
  • vector_field_name - str
    Name of the special field to store the vector embeddings.
  • sentinel_table_name - str
    Name of the sentinel table that encapsulates datasets. Since LanceDB has no concept of datasets, this table is used to mark a dataset as existing.

MotherDuckClientConfiguration

Configuration for the MotherDuck destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - MotherDuckCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • create_indexes - bool

MsSqlClientConfiguration

Configuration for the Microsoft SQL Server destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - MsSqlCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • create_indexes - bool
  • has_case_sensitive_identifiers - bool

PostgresClientConfiguration

Configuration for the Postgres destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - PostgresCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • create_indexes - bool
  • csv_format - CsvFormatConfiguration
    Optional csv format configuration

QdrantClientConfiguration

Configuration for the Qdrant destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - QdrantCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • local_dir - str
  • pipeline_name - str
  • pipeline_working_dir - str
  • legacy_db_path - str
  • qd_location - str
  • qd_path - str
    Persistence path for QdrantLocal. Default: None
  • dataset_separator - str
  • embedding_batch_size - int
  • embedding_parallelism - int
  • upload_batch_size - int
  • upload_parallelism - int
  • upload_max_retries - int
  • options - QdrantClientOptions
  • model - str

RedshiftClientConfiguration

Configuration for the Amazon Redshift destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - RedshiftCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • create_indexes - bool
  • csv_format - CsvFormatConfiguration
    Optional csv format configuration
  • staging_iam_role - str
  • has_case_sensitive_identifiers - bool

SnowflakeClientConfiguration

Configuration for the Snowflake destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - SnowflakeCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • stage_name - str
    Use an existing named stage instead of the default. Default uses the implicit table stage per table
  • keep_staged_files - bool
    Whether to keep or delete the staged files after COPY INTO succeeds
  • csv_format - CsvFormatConfiguration
    Optional csv format configuration
  • query_tag - str
    A tag with placeholders to tag sessions executing jobs
  • create_indexes - bool
    Whether UNIQUE or PRIMARY KEY constrains should be created
  • use_vectorized_scanner - bool
    Whether to use or not use the vectorized scanner in COPY INTO

SqlalchemyClientConfiguration

Configuration for the SQLAlchemy destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - SqlalchemyCredentials
    SQLAlchemy connection string
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • create_unique_indexes - bool
    Whether UNIQUE constrains should be created
  • create_primary_keys - bool
    Whether PRIMARY KEY constrains should be created
  • engine_args - typing.Dict[str, typing.Any]
    Additional arguments passed to sqlalchemy.create_engine

SynapseClientConfiguration

Configuration for the Azure Synapse destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - SynapseCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • staging_config - DestinationClientStagingConfiguration
    configuration of the staging, if present, injected at runtime
  • truncate_tables_on_staging_destination_before_load - bool
    If dlt should truncate the tables on staging destination before loading data.
  • create_indexes - bool
    Whether primary_key and unique column hints are applied.
  • has_case_sensitive_identifiers - bool
  • default_table_index_type - heap | clustered_columnstore_index
  • staging_use_msi - bool
    Whether the managed identity of the Synapse workspace is used to authorize access to the staging Storage Account.

WeaviateClientConfiguration

Configuration for the Weaviate destination

  • destination_type - str
    Type of this destination, e.g. postgres or duckdb
  • credentials - WeaviateCredentials
    Credentials for this destination
  • destination_name - str
    Name of the destination, e.g. my_postgres or my_duckdb, will be the same as destination_type if not set
  • environment - str
    Environment of the destination, e.g. dev or prod
  • dataset_name - str
    dataset name in the destination to load data to, for schemas that are not default schema, it is used as dataset prefix
  • default_schema_name - str
    name of default schema to be used to name effective dataset to load data to
  • replace_strategy - truncate-and-insert | insert-from-staging | staging-optimized
    How to handle replace disposition for this destination, uses first strategy from caps if not declared
  • staging_dataset_name_layout - str
    Layout for staging dataset, where %s is replaced with dataset name. placeholder is optional
  • enable_dataset_name_normalization - bool
    Whether to normalize the dataset name. Affects staging dataset as well.
  • info_tables_query_threshold - int
    Threshold for information schema tables query, if exceeded tables will be filtered in code.
  • batch_size - int
  • batch_workers - int
  • batch_consistency - ONE | QUORUM | ALL
  • batch_retries - int
  • conn_timeout - float
  • read_timeout - float
  • startup_period - int
  • dataset_separator - str
  • vectorizer - str
  • module_config - typing.Dict[str, typing.Dict[str, str]]

Credential Configurations

AwsCredentials

AWS credentials that can also be resolved from the default AWS credential chain

  • aws_access_key_id - str
  • aws_secret_access_key - str
  • aws_session_token - str
  • profile_name - str
  • region_name - str
  • endpoint_url - str
  • s3_url_style - str
    Only needed for duckdb sql_client S3 access; for MinIO, for example, this needs to be set to 'path'.

AwsCredentialsWithoutDefaults

AWS credentials resolved only from explicitly provided values, without falling back to machine defaults

  • aws_access_key_id - str
  • aws_secret_access_key - str
  • aws_session_token - str
  • profile_name - str
  • region_name - str
  • endpoint_url - str
  • s3_url_style - str
    Only needed for duckdb sql_client S3 access; for MinIO, for example, this needs to be set to 'path'.

AzureCredentials

Credentials for Azure Blob Storage that can also be resolved from default Azure credentials

  • azure_storage_account_name - str
  • azure_account_host - str
    Alternative host when accessing the blob storage endpoint, e.g. my_account.dfs.core.windows.net
  • azure_storage_account_key - str
  • azure_storage_sas_token - str
  • azure_sas_token_permissions - str
    Permissions to use when generating a SAS token. Ignored when sas token is provided directly

AzureCredentialsBase

Base class for Azure Blob Storage credentials

  • azure_storage_account_name - str
  • azure_account_host - str
    Alternative host when accessing the blob storage endpoint, e.g. my_account.dfs.core.windows.net

AzureCredentialsWithoutDefaults

Credentials for Azure Blob Storage, compatible with adlfs

  • azure_storage_account_name - str
  • azure_account_host - str
    Alternative host when accessing the blob storage endpoint, e.g. my_account.dfs.core.windows.net
  • azure_storage_account_key - str
  • azure_storage_sas_token - str
  • azure_sas_token_permissions - str
    Permissions to use when generating a SAS token. Ignored when sas token is provided directly

AzureServicePrincipalCredentials

Azure service principal credentials (tenant ID, client ID and client secret)

  • azure_storage_account_name - str
  • azure_account_host - str
    Alternative host when accessing the blob storage endpoint, e.g. my_account.dfs.core.windows.net
  • azure_tenant_id - str
  • azure_client_id - str
  • azure_client_secret - str

AzureServicePrincipalCredentialsWithoutDefaults

Azure service principal credentials resolved only from explicitly provided values

  • azure_storage_account_name - str
  • azure_account_host - str
    Alternative host when accessing the blob storage endpoint, e.g. my_account.dfs.core.windows.net
  • azure_tenant_id - str
  • azure_client_id - str
  • azure_client_secret - str

ClickHouseCredentials

Credentials for ClickHouse

  • drivername - str
  • database - str
    Database to connect to. Defaults to 'default'.
  • password - str
  • username - str
    Database user. Defaults to 'default'.
  • host - str
    Host with running ClickHouse server.
  • port - int
    Native port ClickHouse server is bound to. Defaults to 9440.
  • query - typing.Dict[str, typing.Any]
  • http_port - int
    HTTP Port to connect to ClickHouse server's HTTP interface.
  • secure - 0 | 1
    Enables TLS encryption when connecting to ClickHouse Server. 0 means no encryption, 1 means encrypted.
  • connect_timeout - int
    Timeout for establishing connection. Defaults to 10 seconds.
  • send_receive_timeout - int
    Timeout for sending and receiving data. Defaults to 300 seconds.

ConnectionStringCredentials

Credentials based on a database connection string (drivername, username, password, host, port, database)

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
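
Connection-string based credentials (this class and subclasses such as PostgresCredentials or SqlalchemyCredentials) can typically be supplied either as one native connection string or field by field; a sketch with placeholder values:

  import os

  # One native value for the whole credentials object ...
  os.environ["DESTINATION__SQLALCHEMY__CREDENTIALS"] = (
      "mysql+pymysql://loader:***@db.example.com:3306/dlt_data"
  )

  # ... or the individual fields listed above.
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__DRIVERNAME"] = "postgresql"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__HOST"] = "localhost"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__PORT"] = "5432"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__DATABASE"] = "dlt_data"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__USERNAME"] = "loader"
  os.environ["DESTINATION__POSTGRES__CREDENTIALS__PASSWORD"] = "***"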

CredentialsConfiguration

Base class for all credentials. Credentials are configurations that may be stored only by providers supporting secrets.

DatabricksCredentials

Credentials for Databricks

  • catalog - str
  • server_hostname - str
  • http_path - str
  • access_token - str
  • client_id - str
  • client_secret - str
  • http_headers - typing.Dict[str, str]
  • session_configuration - typing.Dict[str, typing.Any]
    Dict of session parameters that will be passed to databricks.sql.connect
  • connection_parameters - typing.Dict[str, typing.Any]
    Additional keyword arguments that are passed to databricks.sql.connect
  • socket_timeout - int
  • user_agent_entry - str

DremioCredentials

Credentials for Dremio

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]

DuckDbBaseCredentials

Base credentials for DuckDB-based destinations

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • read_only - bool

DuckDbCredentials

Credentials for DuckDB

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • read_only - bool

DummyClientCredentials

Credentials for the dummy destination

GcpCredentials

Base class for Google Cloud credentials

  • token_uri - str
  • auth_uri - str
  • project_id - str

GcpDefaultCredentials

Google Cloud credentials resolved from the default application credentials

  • token_uri - str
  • auth_uri - str
  • project_id - str

GcpOAuthCredentials

Google Cloud OAuth 2.0 credentials

  • client_id - str
  • client_secret - str
  • refresh_token - str
  • scopes - typing.List[str]
  • token - str
    Access token
  • token_uri - str
  • auth_uri - str
  • project_id - str
  • client_type - str

GcpOAuthCredentialsWithoutDefaults

Google Cloud OAuth 2.0 credentials resolved only from explicitly provided values

  • client_id - str
  • client_secret - str
  • refresh_token - str
  • scopes - typing.List[str]
  • token - str
    Access token
  • token_uri - str
  • auth_uri - str
  • project_id - str
  • client_type - str

GcpServiceAccountCredentials

Google Cloud service account credentials

  • token_uri - str
  • auth_uri - str
  • project_id - str
  • private_key - str
  • private_key_id - str
  • client_email - str
  • type - str

GcpServiceAccountCredentialsWithoutDefaults

Google Cloud service account credentials resolved only from explicitly provided values

  • token_uri - str
  • auth_uri - str
  • project_id - str
  • private_key - str
  • private_key_id - str
  • client_email - str
  • type - str

LanceDBCredentials

Credentials for LanceDB

  • uri - str
  • api_key - str
    API key for the remote connections (LanceDB cloud).
  • embedding_model_provider_api_key - str
    API key for the embedding model provider.

MotherDuckCredentials

Credentials for MotherDuck

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • read_only - bool
  • custom_user_agent - str

MsSqlCredentials

Credentials for Microsoft SQL Server

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • connect_timeout - int
  • driver - str

OAuth2Credentials

Generic OAuth 2.0 credentials

  • client_id - str
  • client_secret - str
  • refresh_token - str
  • scopes - typing.List[str]
  • token - str
    Access token

PostgresCredentials

Credentials for Postgres

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • connect_timeout - int
  • client_encoding - str

QdrantCredentials

Credentials for Qdrant

  • location - str
  • api_key - str
    API key for authentication in Qdrant Cloud. Default: None
  • path - str

RedshiftCredentials

Credentials for Amazon Redshift

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • connect_timeout - int
  • client_encoding - str

SFTPCredentials

Credentials for SFTP filesystem, compatible with fsspec SFTP protocol.

Authentication is attempted in the following order of priority:

  • key_filename may contain OpenSSH public certificate paths as well as regular private-key paths; when files ending in -cert.pub are found, they are assumed to match a private key, and both components will be loaded.

  • Any key found through an SSH agent: any “id_rsa”, “id_dsa”, or “id_ecdsa” key discoverable in ~/.ssh/.

  • Plain username/password authentication, if a password was provided.

  • If a private key requires a password to unlock it, and a password is provided, that password will be used to attempt to unlock the key.

For more information about parameters: https://docs.paramiko.org/en/3.3/api/client.html#paramiko.client.SSHClient.connect

  • sftp_port - int
  • sftp_username - str
  • sftp_password - str
  • sftp_key_filename - str
  • sftp_key_passphrase - str
  • sftp_timeout - float
  • sftp_banner_timeout - float
  • sftp_auth_timeout - float
  • sftp_channel_timeout - float
  • sftp_allow_agent - bool
  • sftp_look_for_keys - bool
  • sftp_compress - bool
  • sftp_gss_auth - bool
  • sftp_gss_kex - bool
  • sftp_gss_deleg_creds - bool
  • sftp_gss_host - str
  • sftp_gss_trust_dns - bool
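
These fields are used together with an sftp:// bucket_url on the filesystem destination; a sketch with placeholder values, using key-based authentication:

  import os

  os.environ["DESTINATION__FILESYSTEM__BUCKET_URL"] = "sftp://sftp.example.com/data"
  os.environ["DESTINATION__FILESYSTEM__CREDENTIALS__SFTP_USERNAME"] = "loader"
  os.environ["DESTINATION__FILESYSTEM__CREDENTIALS__SFTP_KEY_FILENAME"] = "/home/loader/.ssh/id_rsa"
  os.environ["DESTINATION__FILESYSTEM__CREDENTIALS__SFTP_KEY_PASSPHRASE"] = "***"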

SnowflakeCredentials

Credentials for Snowflake

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • warehouse - str
  • role - str
  • authenticator - str
  • token - str
  • private_key - str
  • private_key_path - str
  • private_key_passphrase - str
  • application - str

SqlalchemyCredentials

Credentials for the SQLAlchemy destination

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • engine_args - typing.Dict[str, typing.Any]
    Additional arguments passed to sqlalchemy.create_engine

SynapseCredentials

Credentials for Azure Synapse

  • drivername - str
  • database - str
  • password - str
  • username - str
  • host - str
  • port - int
  • query - typing.Dict[str, typing.Any]
  • connect_timeout - int
  • driver - str

WeaviateCredentials

Credentials for Weaviate

  • url - str
  • api_key - str
  • additional_headers - typing.Dict[str, str]

All other Configurations

BaseConfiguration

Base class for all configurations

ConfigProvidersConfiguration

Configuration of the configuration providers

ConfigSectionContext

Injectable context that defines the configuration sections used to resolve values

  • in_container - bool
    Current container, if None then not injected
  • extras_added - bool
    Tells if extras were already added to this context
  • pipeline_name - str
  • sections - typing.Tuple[str, ...]
  • merge_style - typing.Callable[[dlt.common.configuration.specs.config_section_context.ConfigSectionContext, dlt.common.configuration.specs.config_section_context.ConfigSectionContext], NoneType]
  • source_state_key - str

ContainerInjectableContext

Base class for all configurations that may be injected from a Container. Injectable configuration is called a context

  • in_container - bool
    Current container, if None then not injected
  • extras_added - bool
    Tells if extras were already added to this context

CsvFormatConfiguration

Configuration of the csv file format

  • delimiter - str
  • include_header - bool
  • quoting - quote_all | quote_needed
  • on_error_continue - bool
  • encoding - str
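
Destinations that expose a csv_format field (postgres, redshift and snowflake above) take this object as nested configuration, so each field becomes one more double-underscore segment; a sketch with illustrative values:

  import os

  os.environ["DESTINATION__POSTGRES__CSV_FORMAT__DELIMITER"] = "|"
  os.environ["DESTINATION__POSTGRES__CSV_FORMAT__INCLUDE_HEADER"] = "false"
  os.environ["DESTINATION__POSTGRES__CSV_FORMAT__ENCODING"] = "utf-8"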

DBTRunnerConfiguration

Configuration of the dbt runner

  • package_location - str
  • package_repository_branch - str
  • package_repository_ssh_key - str
  • package_profiles_dir - str
  • package_profile_name - str
  • auto_full_refresh_when_out_of_sync - bool
  • package_additional_vars - typing.Mapping[str, typing.Any]
  • runtime - RuntimeConfiguration

DestinationCapabilitiesContext

Injectable destination capabilities required by many pipeline stages, e.g. normalize

  • in_container - bool
    Current container, if None then not injected
  • extras_added - bool
    Tells if extras were already added to this context
  • preferred_loader_file_format - jsonl | typed-jsonl | insert_values | parquet | csv | reference | model
  • supported_loader_file_formats - typing.Sequence[typing.Literal['jsonl', 'typed-jsonl', 'insert_values', 'parquet', 'csv', 'reference', 'model']]
  • loader_file_format_selector - dlt.common.destination.capabilities.LoaderFileFormatSelector
    Callable that adapts preferred_loader_file_format and supported_loader_file_formats at runtime.
  • preferred_table_format - iceberg | delta | hive | native
  • supported_table_formats - typing.Sequence[typing.Literal['iceberg', 'delta', 'hive', 'native']]
  • type_mapper - typing.Type[dlt.common.destination.capabilities.DataTypeMapper]
  • recommended_file_size - int
    Recommended file size in bytes when writing extract/load files
  • preferred_staging_file_format - jsonl | typed-jsonl | insert_values | parquet | csv | reference | model
  • supported_staging_file_formats - typing.Sequence[typing.Literal['jsonl', 'typed-jsonl', 'insert_values', 'parquet', 'csv', 'reference', 'model']]
  • format_datetime_literal - typing.Callable[..., str]
  • escape_identifier - typing.Callable[[str], str]
  • escape_literal - typing.Callable[[typing.Any], typing.Any]
  • casefold_identifier - typing.Callable[[str], str]
    Casing function applied by destination to represent case insensitive identifiers.
  • has_case_sensitive_identifiers - bool
    Tells if destination supports case sensitive identifiers
  • decimal_precision - typing.Tuple[int, int]
  • wei_precision - typing.Tuple[int, int]
  • max_identifier_length - int
  • max_column_identifier_length - int
  • max_query_length - int
  • is_max_query_length_in_bytes - bool
  • max_text_data_type_length - int
  • is_max_text_data_type_length_in_bytes - bool
  • supports_transactions - bool
  • supports_ddl_transactions - bool
  • naming_convention - str | typing.Type[dlt.common.normalizers.naming.naming.NamingConvention] | module
  • alter_add_multi_column - bool
  • supports_create_table_if_not_exists - bool
  • supports_truncate_command - bool
  • schema_supports_numeric_precision - bool
  • timestamp_precision - int
  • max_rows_per_insert - int
  • insert_values_writer_type - str
  • supports_multiple_statements - bool
  • supports_clone_table - bool
    Destination supports CREATE TABLE ... CLONE ... statements
  • max_table_nesting - int
    Allows a destination to overwrite max_table_nesting from source
  • supported_merge_strategies - typing.Sequence[typing.Literal['delete-insert', 'scd2', 'upsert']]
  • merge_strategies_selector - dlt.common.destination.capabilities.MergeStrategySelector
  • supported_replace_strategies - typing.Sequence[typing.Literal['truncate-and-insert', 'insert-from-staging', 'staging-optimized']]
  • replace_strategies_selector - dlt.common.destination.capabilities.ReplaceStrategySelector
  • max_parallel_load_jobs - int
    The destination can set the maximum amount of parallel load jobs being executed
  • loader_parallelism_strategy - parallel | table-sequential | sequential
    The destination can override the parallelism strategy
  • max_query_parameters - int
    The maximum number of parameters that can be supplied in a single parametrized query
  • supports_native_boolean - bool
    The destination supports a native boolean type, otherwise bool columns are usually stored as integers
  • supports_nested_types - bool
    Tells if destination can write nested types, currently only destinations storing parquet are supported
  • enforces_nulls_on_alter - bool
    Tells if destination enforces null constraints when adding NOT NULL columns to existing tables
  • sqlglot_dialect - str
    The SQL dialect used by sqlglot to transpile a query to match the destination syntax.
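
Capabilities are injected by dlt itself, but they can also be inspected from a destination factory; a sketch, assuming the capabilities() accessor on destination factories:

  import dlt

  caps = dlt.destinations.duckdb().capabilities()
  print(caps.preferred_loader_file_format)
  print(caps.supported_loader_file_formats)
  print(caps.max_identifier_length)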

FilesystemConfiguration

A configuration defining filesystem location and access credentials.

When the configuration is resolved, bucket_url is used to extract the protocol and request the corresponding credentials class.

Incremental

Adds incremental extraction for a resource by storing a cursor value in persistent state.

The cursor could, for example, be a timestamp of when the record was created; you can use it to load only records created since the last run of the pipeline.

To use this, the resource function should have an argument that is either type-annotated with Incremental or has a default Incremental instance. For example:

  @dlt.resource(primary_key='id')
  def some_data(created_at=dlt.sources.incremental('created_at', '2023-01-01T00:00:00Z')):
      yield from request_data(created_after=created_at.last_value)

When the resource has a primary_key specified this is used to deduplicate overlapping items with the same cursor value.

Alternatively, you can use this class as a transform step and add it to any resource. For example:

  @dlt.resource
  def some_data():
      last_value = dlt.sources.incremental.from_existing_state("some_data", "item.ts")
      ...

  r = some_data().add_step(dlt.sources.incremental("item.ts", initial_value=now, primary_key="delta"))
  info = p.run(r, destination="duckdb")

Args:

  • cursor_path: The name or a JSON path to a cursor field. Uses the same field names as in your JSON document, before they are normalized for storage in the database.
  • initial_value: Optional value used for last_value when no state is available, e.g. on the first run of the pipeline. If not provided, last_value will be None on the first run.
  • last_value_func: Callable used to determine which cursor value to save in state. It is called with a list of the stored state value and all cursor values from the currently processed items. Default is max.
  • primary_key: Optional primary key used to deduplicate data. If not provided, the primary key defined by the resource will be used. Pass a tuple to define a compound key. Pass an empty tuple to disable unique checks.
  • end_value: Optional value used to load a limited range of records between initial_value and end_value. Use in conjunction with initial_value, e.g. to load records from a given month: incremental(initial_value="2022-01-01T00:00:00Z", end_value="2022-02-01T00:00:00Z"). Note that when this is set, incremental filtering is stateless and initial_value always supersedes any previous incremental value in state.
  • row_order: Declares that the data source returns rows in descending (desc) or ascending (asc) order as defined by last_value_func. If the row order is known, the Incremental class is able to stop requesting new rows by closing the pipe generator, which prevents getting more data from the source. Defaults to None, which means that the row order is not known.
  • allow_external_schedulers: If set to True, allows dlt to look for external schedulers from which it will take initial_value and end_value, resulting in loading only the specified range of data. Currently the Airflow scheduler is detected: data_interval_start and data_interval_end are taken from the context and passed to the Incremental class. Values passed explicitly to Incremental will be ignored. Note that if a logical end date is present, end_value will also be set, which means that resource state is not used and exactly this range of dates will be loaded.
  • on_cursor_value_missing: Specify what happens when cursor_path does not exist in a record or a record has None at cursor_path: raise, include, exclude.
  • lag: Optional value used to define a lag or attribution window. For datetime cursors, this is interpreted as seconds. For other types, it uses the + or - operator depending on last_value_func.
  • range_start: Decide whether the incremental filtering range is open or closed on the start value side. Default is closed. Setting this to open means that items with the same cursor value as the last value from the previous run (or initial_value) are excluded from the result. The open range disables deduplication logic, so it can serve as an optimization when you know cursors don't overlap between pipeline runs.
  • range_end: Decide whether the incremental filtering range is open or closed on the end value side. Default is open (the exact end_value is excluded). Setting this to closed means that items with exactly the same cursor value as end_value are included in the result.

  • cursor_path - str
  • initial_value - typing.Any
  • end_value - typing.Any
  • row_order - asc | desc
  • allow_external_schedulers - bool
  • on_cursor_value_missing - raise | include | exclude
  • lag - float
  • range_start - open | closed
  • range_end - open | closed
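
For orientation, here is a minimal sketch of the decorator-style usage; the cursor field name (updated_at), the bounded backfill range, and the helper _fetch_rows are illustrative assumptions rather than part of this reference:

    import dlt

    @dlt.resource(primary_key="id")
    def some_data(
        # dlt injects the stored cursor state into this default argument on each run
        updated_at=dlt.sources.incremental(
            "updated_at",
            initial_value="2022-01-01T00:00:00Z",
            end_value="2022-02-01T00:00:00Z",  # stateless, bounded backfill range
        )
    ):
        # _fetch_rows is a hypothetical stand-in for your own API or database call
        yield from _fetch_rows(since=updated_at.last_value)

    pipeline = dlt.pipeline(pipeline_name="demo", destination="duckdb")
    info = pipeline.run(some_data())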

ItemsNormalizerConfiguration

None

  • add_dlt_id - bool
    When true, items to be normalized will have _dlt_id column added with a unique ID for each row.
  • add_dlt_load_id - bool
    When true, items to be normalized will have _dlt_load_id column added with the current load ID.
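
As a hedged illustration only: dlt configuration fields typically resolve from SECTION__OPTION environment variables (or the matching config.toml section). The section name used below for the items normalizer is an assumption and may differ between dlt versions:

    import os

    # Assumed section "normalize.parquet_normalizer"; verify against your dlt version.
    os.environ["NORMALIZE__PARQUET_NORMALIZER__ADD_DLT_LOAD_ID"] = "true"
    os.environ["NORMALIZE__PARQUET_NORMALIZER__ADD_DLT_ID"] = "true"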

LanceDBClientOptions

None

  • max_retries - int
    Maximum number of retries for embedding calls; LanceDB's EmbeddingFunction class wraps the calls for source and query embedding

LoadPackageStateInjectableContext

None

  • in_container - bool
    Tells if this context is currently injected in the container
  • extras_added - bool
    Tells if extras were already added to this context
  • storage - dlt.common.storages.load_package.PackageStorage
  • load_id - str

LoadStorageConfiguration

None

  • load_volume_path - str
  • delete_completed_jobs - bool

LoaderConfiguration

None

  • pool_type - process | thread | none
    type of pool to run, must be set in derived configs
  • start_method - str
    start method for the pool (typically process). None is system default
  • workers - int
    how many parallel loads can be executed
  • run_sleep - float
    how long to sleep between runs with workload, seconds
  • parallelism_strategy - parallel | table-sequential | sequential
    Which parallelism strategy to use at load time
  • raise_on_failed_jobs - bool
    when True, raises on terminally failed jobs immediately
  • raise_on_max_retries - int
    When greater than 0, raises when a job reaches this number of retries
  • truncate_staging_dataset - bool
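
A minimal sketch of tuning these loader options, assuming they live under the load config section (so they resolve from LOAD__* environment variables or a [load] block in config.toml):

    import os

    # Option names mirror the fields above; the "load" section name is an assumption.
    os.environ["LOAD__WORKERS"] = "4"
    os.environ["LOAD__PARALLELISM_STRATEGY"] = "table-sequential"
    os.environ["LOAD__RAISE_ON_FAILED_JOBS"] = "true"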

NormalizeConfiguration

None

NormalizeStorageConfiguration

None

  • normalize_volume_path - str

ParquetFormatConfiguration

None

  • flavor - str
  • version - str
  • data_page_size - int
  • timestamp_timezone - str
  • row_group_size - int
  • coerce_timestamps - s | ms | us | ns
  • allow_truncated_timestamps - bool
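
A hedged sketch of adjusting the parquet writer options above, assuming they resolve under a data-writer section of the normalize stage (the section name is an assumption to verify for your dlt version):

    import os

    # Assumed section "normalize.data_writer" (NORMALIZE__DATA_WRITER__* env vars).
    os.environ["NORMALIZE__DATA_WRITER__VERSION"] = "2.6"
    os.environ["NORMALIZE__DATA_WRITER__ROW_GROUP_SIZE"] = "100000"
    os.environ["NORMALIZE__DATA_WRITER__TIMESTAMP_TIMEZONE"] = "UTC"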

PipelineContext

None

  • in_container - bool
    Tells if this context is currently injected in the container
  • extras_added - bool
    Tells if extras were already added to this context

PoolRunnerConfiguration

None

  • pool_type - process | thread | none
    type of pool to run, must be set in derived configs
  • start_method - str
    start method for the pool (typically process). None is system default
  • workers - int
    how many threads/processes in the pool
  • run_sleep - float
    how long to sleep between runs with workload, seconds

QdrantClientOptions

None

  • port - int
  • grpc_port - int
  • prefer_grpc - bool
  • https - bool
  • prefix - str
  • timeout - int
  • host - str

RuntimeConfiguration

None

  • pipeline_name - str
  • sentry_dsn - str
  • slack_incoming_hook - str
  • dlthub_telemetry - bool
  • dlthub_telemetry_endpoint - str
  • dlthub_telemetry_segment_write_key - str
  • log_format - str
  • log_level - str
  • request_timeout - float
    Timeout for http requests
  • request_max_attempts - int
    Max retry attempts for http clients
  • request_backoff_factor - float
    Multiplier applied to exponential retry delay for http requests
  • request_max_retry_delay - float
    Maximum delay between http request retries
  • config_files_storage_path - str
  • dlthub_dsn - str
    DSN for the dltHub platform connection
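
A minimal sketch of overriding the logging and HTTP retry options above, assuming the standard runtime section (RUNTIME__* environment variables or a [runtime] block in config.toml):

    import os

    # Option names mirror the fields above; values are illustrative.
    os.environ["RUNTIME__LOG_LEVEL"] = "INFO"
    os.environ["RUNTIME__REQUEST_TIMEOUT"] = "120"
    os.environ["RUNTIME__REQUEST_MAX_ATTEMPTS"] = "10"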

SchemaConfiguration

None

  • naming - str | typing.Type[dlt.common.normalizers.naming.naming.NamingConvention] | module
  • json_normalizer - typing.Dict[str, typing.Any]
  • allow_identifier_change_on_table_with_data - bool
  • use_break_path_on_normalize - bool
    Post-1.4.0 option that allows table and column names containing table separators

SchemaStorageConfiguration

None

  • schema_volume_path - str
  • import_schema_path - str
  • export_schema_path - str
  • external_schema_format - json | yaml
  • external_schema_format_remove_defaults - bool
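
For context, the import/export schema paths above can also be passed directly when creating a pipeline; a minimal sketch with illustrative names and paths:

    import dlt

    # import_schema_path / export_schema_path correspond to the fields above.
    pipeline = dlt.pipeline(
        pipeline_name="demo",
        destination="duckdb",
        import_schema_path="schemas/import",
        export_schema_path="schemas/export",
    )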

SourceInjectableContext

A context containing the source, present when a dlt.resource decorated function is executed

  • in_container - bool
    Tells if this context is currently injected in the container
  • extras_added - bool
    Tells if extras were already added to this context
  • source - dlt.extract.source.DltSource

SourceSchemaInjectableContext

A context containing the source schema, present when a dlt.source/resource decorated function is executed

  • in_container - bool
    Tells if this context is currently injected in the container
  • extras_added - bool
    Tells if extras were already added to this context
  • schema - dlt.common.schema.schema.Schema

StateInjectableContext

None

  • in_container - bool
    Tells if this context is currently injected in the container
  • extras_added - bool
    Tells if extras were already added to this context
  • state - dlt.common.pipeline.TPipelineState

TransformationConfiguration

Configuration for a transformation

  • buffer_max_items - int

VaultProviderConfiguration

None

  • only_secrets - bool
  • only_toml_fragments - bool
  • list_secrets - bool
