Zillow Data Analytics - End-to-End ETL Pipeline

  • Tech Stack: Python, Apache Airflow, AWS (S3, EC2, Lambda, Redshift, QuickSight)
  • GitHub: Project Link

→ Built an ETL pipeline in Python, using Apache Airflow to orchestrate and manage the workflow.

→ Extracted Zillow housing data through RapidAPI, stored the raw data in an AWS S3 bucket, and used AWS Lambda functions to copy the data, transform it, and load the processed output into a second S3 bucket.
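
A minimal sketch of the kind of transform a Lambda function in this step might run, assuming the raw payload is JSON with listings under a top-level "results" key. The function name, the key, and the column names are hypothetical and are not taken from the project itself; the S3 read and write calls are left out so the logic stays self-contained.

```python
import csv
import io
import json

# Hypothetical column selection; the real Zillow response fields may differ.
COLUMNS = ["zpid", "city", "zipcode", "bedrooms", "bathrooms", "price"]


def transform_listings(raw_json: str, columns=COLUMNS) -> str:
    """Convert a JSON payload of listings into CSV text with only the chosen columns."""
    payload = json.loads(raw_json)
    # Assumption: listings sit under a top-level "results" key.
    listings = payload.get("results", [])

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for listing in listings:
        # Keep only the selected columns; missing fields become empty cells.
        writer.writerow({col: listing.get(col, "") for col in columns})
    return buffer.getvalue()
```

In a Lambda handler, the returned CSV text would be written to the processed-data bucket; that wiring depends on the bucket names and event shape, which are not given here.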

→ Used an AWS Redshift cluster to load the processed data from the S3 bucket into a database table, and AWS QuickSight to analyze it.
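
One common way to load S3 data into Redshift is the COPY command. The sketch below only builds such a statement as a string, assuming the processed file is a CSV with a header row. The helper name and every argument value are hypothetical placeholders, not details of the project.

```python
def build_copy_statement(table: str, bucket: str, key: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement for a CSV file with one header row."""
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# Hypothetical values, for illustration only.
statement = build_copy_statement(
    table="zillow_listings",
    bucket="example-processed-bucket",
    key="listings.csv",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
)
print(statement)
```

The statement would then be executed through whatever Redshift connection the pipeline uses; inputs should come from trusted configuration, since the values are interpolated directly into SQL.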