Paper Details
Building a Scalable ETL Pipeline with Apache Spark, Airflow, and Snowflake
Authors
Ujjawal Nayak
Abstract
Extract, Transform, and Load (ETL) pipelines are critical in modern data engineering, enabling efficient data integration and analytics. This paper presents a scalable ETL pipeline leveraging Apache Spark for distributed data processing, Apache Airflow for workflow orchestration, and Snowflake as a cloud-based data warehouse. The proposed architecture ensures fault tolerance, cost efficiency, and high scalability, making it suitable for handling large-scale enterprise data workloads.
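Although the paper's implementation is not reproduced on this page, a minimal sketch of the orchestration layer it describes might look like the following Airflow DAG, which submits a Spark transform job and then bulk-loads the results into Snowflake. The DAG id, connection IDs, script path, stage, and target table below are hypothetical, and the example assumes Airflow 2.4+ with the apache-spark and snowflake provider packages installed.

```python
# Illustrative sketch only: names, paths, and connection IDs are hypothetical,
# not taken from the paper.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="etl_spark_snowflake",      # hypothetical DAG name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Extract + transform: submit a PySpark job to the cluster.
    transform = SparkSubmitOperator(
        task_id="spark_transform",
        application="/opt/jobs/transform.py",  # hypothetical job script
        conn_id="spark_default",
    )

    # Load: copy the transformed files from a stage into Snowflake.
    load = SnowflakeOperator(
        task_id="snowflake_load",
        snowflake_conn_id="snowflake_default",
        sql="COPY INTO analytics.events FROM @etl_stage/events/;",  # hypothetical table and stage
    )

    transform >> load
```

In an arrangement like this, Airflow contributes scheduling and task retries (supporting the fault tolerance the abstract claims), Spark performs the distributed transform, and Snowflake's COPY INTO handles the warehouse load.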
Keywords
ETL, Apache Spark, Airflow, Snowflake, Data Engineering, Scalable Architecture
Citation
Building a Scalable ETL Pipeline with Apache Spark, Airflow, and Snowflake. Ujjawal Nayak. 2025. IJIRCT, Volume 11, Issue 2. Pages 1-3. https://www.ijirct.org/viewPaper.php?paperId=2504004