Building scalable data pipelines, transforming raw data into actionable insights.
Recent Computer Science graduate with a deep focus on data infrastructure and pipeline engineering. I thrive at the intersection of software engineering and data — turning raw, messy data into reliable, scalable systems.
Currently seeking entry-level Data Engineering roles where I can contribute immediately and grow alongside a team obsessed with data quality and reliability.
Architected a production-ready data warehouse using the Medallion Architecture (Bronze → Silver → Gold), ingesting raw CRM and ERP data through a full-load batch ETL pipeline into Microsoft SQL Server. Designed normalized data models across the Bronze (raw), Silver (cleaned/standardized), and Gold (business-ready) layers, implementing stored procedures (load_bronze, load_silver, load_gold) for automated layer loading. Enforced data governance best practices including naming conventions, deduplication, missing-value handling, and schema validation across all pipeline stages.
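A minimal sketch of the Bronze → Silver → Gold flow described above, with SQLite standing in for SQL Server; the table names and sample rows are illustrative, not taken from the actual project:

```python
import sqlite3

# Hypothetical in-memory warehouse standing in for SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Bronze: raw CRM rows loaded as-is (duplicates and messy values included).
cur.execute("CREATE TABLE bronze_customers (id INTEGER, name TEXT, country TEXT)")
cur.executemany(
    "INSERT INTO bronze_customers VALUES (?, ?, ?)",
    [(1, "  Alice ", "US"), (1, "  Alice ", "US"), (2, "Bob", None)],
)

# Silver: cleaned and standardized -- trim names, default missing countries,
# and deduplicate on the business key.
cur.execute("""
    CREATE TABLE silver_customers AS
    SELECT id,
           TRIM(name) AS name,
           COALESCE(country, 'n/a') AS country
    FROM bronze_customers
    GROUP BY id
""")

# Gold: business-ready aggregate for reporting.
cur.execute("""
    CREATE TABLE gold_customer_counts AS
    SELECT country, COUNT(*) AS customers
    FROM silver_customers
    GROUP BY country
""")

rows = cur.execute("SELECT * FROM silver_customers ORDER BY id").fetchall()
print(rows)  # deduplicated, trimmed, null-handled rows
```

In the real project each layer would be loaded by its own stored procedure (load_bronze, load_silver, load_gold) rather than inline CREATE TABLE AS statements; the cleaning steps are the same idea.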
Engineered an end-to-end real-time and historical data pipeline for cryptocurrency prices, supporting both streaming (Kafka) and batch analytics workflows. Designed a Kafka producer/consumer architecture with coin-keyed partitioning; ingested 30 days of historical OHLCV data as structured CSVs; containerized the full infrastructure with Docker Compose for reproducible deployment.
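The point of coin-keyed partitioning is that every message for the same coin symbol lands on the same partition, so per-coin ordering is preserved. A toy illustration of that routing (Kafka's default partitioner actually uses murmur2; the MD5 hash and partition count here are stand-ins):

```python
import hashlib

# Illustrative partition count -- not the project's real topic configuration.
NUM_PARTITIONS = 6

def partition_for(coin: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Route a message key (coin symbol) to a partition deterministically."""
    digest = hashlib.md5(coin.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every BTC tick routes to the same partition, keeping BTC events in order.
btc_partition = partition_for("BTC")
print({coin: partition_for(coin) for coin in ["BTC", "ETH", "SOL"]})
```

Because the mapping depends only on the key, a consumer reading one partition sees a given coin's ticks in the order they were produced.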
Designed and executed SQL queries on PostgreSQL and SQL Server for data extraction, transformation, and reporting, contributing directly to ETL reporting workflows under developer mentorship. Identified and resolved data inconsistencies in application datasets, enforcing data quality and governance standards to improve the reliability of downstream analytical outputs. Collaborated with the development team through Git-based version control, contributing to structured software delivery and code-review processes.
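Data-quality checks like the ones described above typically boil down to short diagnostic queries. A sketch of two such checks (duplicate keys, missing values) using SQLite and invented sample data:

```python
import sqlite3

# Hypothetical dataset with the kinds of inconsistencies described above:
# duplicate keys and missing values that would skew downstream reports.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(100, 9.99), (100, 9.99), (101, None), (102, 25.0)],
)

# Duplicate-key check: any order_id appearing more than once.
dupes = cur.execute("""
    SELECT order_id, COUNT(*) AS n
    FROM orders
    GROUP BY order_id
    HAVING COUNT(*) > 1
""").fetchall()

# Completeness check: rows with a missing amount.
missing = cur.execute(
    "SELECT COUNT(*) FROM orders WHERE amount IS NULL"
).fetchone()[0]

print(dupes, missing)  # [(100, 2)] 1
```

The same HAVING and IS NULL patterns carry over directly to PostgreSQL and SQL Server.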
Open to Data Engineering roles, internships, and collaborations. Let's talk about data infrastructure.