Publication Number

2504004

 

Page Numbers

1-3

 

Paper Details

Building a Scalable ETL Pipeline with Apache Spark, Airflow, and Snowflake

Authors

Ujjawal Nayak

Abstract

Extract, Transform, and Load (ETL) pipelines are critical in modern data engineering, enabling efficient data integration and analytics. This paper presents a scalable ETL pipeline leveraging Apache Spark for distributed data processing, Apache Airflow for workflow orchestration, and Snowflake as a cloud-based data warehouse. The proposed architecture ensures fault tolerance, cost efficiency, and high scalability, making it suitable for handling large-scale enterprise data workloads.
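
To make the described architecture concrete, the following is a minimal sketch of how such a pipeline can be wired together, assuming Airflow 2.4+ with the apache-spark and snowflake provider packages installed. The DAG id, connection ids, job script path, stage name, and table name are hypothetical placeholders, not values from the paper; a real deployment would substitute its own.

from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="spark_snowflake_etl",      # hypothetical DAG name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Extract + transform: submit a PySpark job that reads the raw sources
    # and writes cleansed Parquet to a cloud stage Snowflake can read
    # (the job script and stage are assumed to exist).
    transform = SparkSubmitOperator(
        task_id="spark_transform",
        application="/opt/jobs/transform.py",  # hypothetical job script
        conn_id="spark_default",
    )

    # Load: bulk-copy the staged Parquet files into a Snowflake table.
    load = SnowflakeOperator(
        task_id="snowflake_load",
        snowflake_conn_id="snowflake_default",
        sql=(
            "COPY INTO analytics.events FROM @etl_stage "
            "FILE_FORMAT = (TYPE = PARQUET) "
            "MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;"
        ),
    )

    # Airflow dependency: the load runs only after the Spark job succeeds,
    # which is where the orchestration layer provides fault tolerance (retries,
    # alerting, and backfills are configured on the DAG or tasks).
    transform >> load
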

Keywords

ETL, Apache Spark, Airflow, Snowflake, Data Engineering, Scalable Architecture

 

. . .

Citation

Ujjawal Nayak. 2025. Building a Scalable ETL Pipeline with Apache Spark, Airflow, and Snowflake. IJIRCT, Volume 11, Issue 2, Pages 1-3. https://www.ijirct.org/viewPaper.php?paperId=2504004
