Data Engineer with 7+ years of experience designing, building, and optimizing scalable data pipelines and cloud-native analytics solutions. Started as an ETL Developer, building a strong foundation in data integration, transformation, and warehousing, then transitioned into modern data engineering with hands-on expertise in Python, PySpark, and SQL. Skilled in developing batch and real-time data pipelines using Spark, Kafka, and cloud services across AWS, Azure, and GCP. Proficient in orchestrating workflows with Apache Airflow and Azure Data Factory, managing data lakes and warehouses (Snowflake, Synapse, BigQuery), and delivering clean, reliable, analytics-ready data for BI and machine learning teams. Adept at data modeling, performance tuning, and implementing robust data quality and validation frameworks. Proven ability to collaborate across teams to drive data initiatives that support strategic business goals.
Client: Techem GmbH / Consumers Energy
Client: ValueLabs
Python
PySpark
SQL
Scala
Pandas
Django
Flask
AWS: S3, EMR, Kinesis, Glue, Redshift
Azure: Data Lake, Synapse, ADF, Databricks, SQL, DevOps
GCP: BigQuery, Dataflow, Cloud Storage
Apache Spark
Kafka
Airflow