scd2

Here are 33 public repositories matching this topic...

edwinweber / dbt_duckdb_demo_public

Open-source data engineering demo project using dbt, DuckDB, dlt, Dagster, and Metabase. Two storage modes are supported for the Delta tables: local and Microsoft Fabric OneLake.

python open-data data-engineering dbt dlt scd2 delta-lake dagster duckdb medallion-architecture microsoft-fabric motherduck

Updated Jun 2, 2026
Python

KaterynaD / dbt_scd2_plus

Slowly Changing Dimension Type 2 (scd2) custom materialization

dbt scd2

Updated Apr 6, 2026
Shell

spatil6 / ETL-SCD2

SCD2 implementation using PySpark

pyspark datawarehouse scd2

Updated Mar 18, 2018
Jupyter Notebook

akshayush / SCD2-Implementation--using-pyspark

SCD2 implementation using PySpark

pyspark scd2 multiday multiday-scd2

Updated Mar 10, 2019
Jupyter Notebook

ai-tech-karthik / banking-data-pipeline

A modern banking data pipeline built with Dagster and dbt.

python data-engineering dbt databricks data-quality data-lineage scd2 dagster incremental-processing duckdb modendatastack

Updated Jan 31, 2026
Python

AlexMajiA / ecommerce-analytics-platform

ELT pipeline with dbt & Snowflake on Olist dataset. Medallion architecture, dimensional modeling, SCD2, RFM and Cortex AI sentiment analysis.

aws-s3 snowflake dbt elt scd2 medallion-architecture cortexai

Updated Jun 8, 2026
PLpgSQL

abdullah-mahmoud-de / Automated-Data-Pipelines-Spark-dbt-Airflow

End-to-end data pipeline system built as part of the Coursera open-source Data Engineering program. It unifies diverse data sources, implements SCD2 historical tracking, and orchestrates workflows using industry-standard tools.

spark apache-spark python3 dbt data-pipeline apache-airflow scd2 dbt-core data-pipeline-automation

Updated May 25, 2026
Python

emudamah0906 / polaris-claims-lakehouse

P&C insurance claims lakehouse: Azure ADLS + Databricks (PySpark/Delta) + Snowflake + dbt, real-time FNOL fraud signals via Kafka, Airflow-orchestrated, Terraform-provisioned, OIDC-secured, with data contracts, lineage, and ADRs throughout.

Updated Jun 3, 2026
Python

Mairondc21 / pipeline_delta_s3

Pipeline 100% Open Source

docker airflow s3 pyspark cicd boto3 ruff datahub scd2 delta-lake great-expectations sqlfluff

Updated Mar 19, 2026
Python

shivaranjanka / snowflake-healthcare-pipeline

Advanced Healthcare Claims Pipeline using Snowflake, Snowpipe, Streams, Tasks, SCD Type 2, and AWS S3. Automates ingestion, CDC, dimensional modeling, and data quality checks for healthcare patient and claims data.

aws cloud sql analytics tasks snowflake streams data-engineering healthcare cdc data-pipeline scd2 snowpipe

Updated Nov 10, 2025

Mohameddfxxcxx / global-horizon-bank-dwh-project

Fortune-500-grade banking analytics platform: OLTP -> medallion lakehouse -> Kimball star schema -> semantic layer -> 9-tab executive dashboard + 5 ML models (churn, fraud, segmentation, forecasting). Production-ready, governed, fully tested.

Updated Apr 30, 2026
Python

shukla2015 / Travel_Booking_SCD2_Project

Production-grade parameterized ETL pipeline implementing SCD Type 2 for travel booking data using Databricks, Delta Lake, and ADLS — includes data quality checks, incremental fact table build, Z-Order optimization, and SQL reporting.

etl pyspark databricks scd2 delta-lake azure-data-engineering pydeequ

Updated Apr 6, 2026
Jupyter Notebook

DustinPineau / cms_portfolio

End-to-end Medicare data engineering pipeline: API ingestion, PostgreSQL 17, dbt, dimensional modeling (Kimball/SCD2), Apache Airflow orchestration, and Evidence.dev dashboard. Built on a QEMU/KVM Rocky Linux VM.

python cms portfolio sql etl postgresql data-engineering dbt data-pipeline medicare evidence apache-airflow kimball scd2 dimensional-modeling

Updated Apr 28, 2026
PLpgSQL

sushmakl95 / aws-glue-cdc-framework

Production-grade CDC pipeline: MySQL → Debezium → Kinesis → S3 → AWS Glue (PySpark) → Redshift + Postgres + OpenSearch. Multi-sink fanout with SCD2, idempotency tracking, and 13 modular Terraform modules.

Updated Apr 23, 2026
Python

OsamaMustafa32 / Enterprise_Retail_Data_Lakehouse

Batch retail data lakehouse on Databricks: Delta Live Tables (bronze → silver → gold), Unity Catalog, synthetic data generator, and an executive analytics dashboard.

python sql pyspark databricks data-quality-checks etl-pipeline scd2 delta-lake data-lakehouse delta-live-tables unity-catalog medallion-architecture

Updated Apr 2, 2026
Python

Cindy-txr / Employee-data-platform

Production-style Data Warehouse project using Airflow + PostgreSQL with CDC event layer, SCD2 modeling, checkpoint-based incremental loading, and idempotent pipelines.

python docker postgres airflow sql kafka analytics data-warehouse data-engineering cdc tel scd2

Updated May 21, 2026
Python

Aayushi-Anand / SCD2_Implementation

Implementation of SCD2 for employee relocation data

etl-pipeline scd2

Updated Feb 28, 2022

sushmakl95 / dbt-bigquery-analytics-platform

Modern data stack reference: dbt + BigQuery + Airflow (Cloud Composer) with medallion layering, SCD2 snapshots, exposures, freshness SLAs, and 45× cost reduction via partition + cluster + incremental tuning.

Updated Apr 23, 2026
Python

ZuhairBhati / travel_bookings_pipeline

This is a data engineering pipeline built on Databricks + Delta Lake + PySpark that ingests travel booking and customer master data, applies SCD Type 2 logic, and delivers analytics-ready tables. It includes data quality enforcement, dimension versioning, fact aggregation, and performance tuning.

python analytics travel pyspark data-engineering hospitality notebooks databricks bookings etl-pipeline scd2

Updated Oct 8, 2025
Jupyter Notebook

moniburnejko / snowflake-ingestion-patterns

Reference Snowflake ingestion patterns: streams and tasks, and dynamic tables with SCD2 and deduplication. Provisioned with Terraform, plus a dbt sandbox.

terraform snowflake data-engineering dbt dynamic-tables elt scd2 streams-and-tasks

Updated Jun 8, 2026
HCL
