Senior Data Analytics Engineer
Zalando
Summary
Leading data analytics initiatives at Zalando, Europe's leading online fashion platform. Responsible for building scalable data pipelines, analysing user behaviour and product performance metrics, and providing actionable insights that drive business decisions. Spearheaded the creation of self-service reporting tools, designed data quality pipelines, and engineered high-performance ETL processes handling terabytes of data daily.
- Analysed user behaviour and product performance data using a combination of SQL, Python, Redshift, QuickSight and Looker to identify trends and opportunities for optimisation, resulting in a 15% conversion rate for key product categories.
- Partnered with diverse stakeholders to define and prioritise product features, driving a 20% improvement in user engagement and transaction metrics.
- Spearheaded the creation of a self-service reporting Customer Insights Portal, empowering stakeholders to access real-time insights and reducing ad-hoc reporting requests by 20%.
- Coordinated with product teams to apply advanced statistical techniques such as regression analysis and A/B testing, providing actionable insights and recommendations for product optimisation.
- Designed and implemented data quality pipelines with dbt, Airflow, BigQuery and Databricks, reducing data discrepancies by 25% and ensuring data integrity; presented findings to guide fixes for those discrepancies.
- Engineered SQL and Spark/Python-based data pipelines, reducing data processing and analysis time by 50% and enabling the handling of over 2 TB of daily Google Analytics export data.
- Implemented time-series forecasting and outlier detection models and visualised insights in Looker and MicroStrategy, improving data quality by 80% and expediting incident detection and resolution.
- Orchestrated ETL pipelines using dbt, Databricks, AWS S3 and Airflow, improving workflow scheduling and monitoring and delivering a 35% improvement in pipeline reliability and efficiency.
- Developed data quality pipelines using dbt, Databricks and Airflow, achieving a 25% reduction in data discrepancies and ensuring data integrity across systems.
- Enhanced product development using advanced statistical techniques in collaboration with cross-functional teams, gaining approval from 17 stakeholders and improving model accuracy by 7%.
Skills Used
Python
SQL
Prompt Engineering
Data Engineering
Data Analytics
Data Science
ETL/ELT
Data Pipelines
Data Modeling
Data Warehousing
Pandas
NumPy
PostgreSQL
CI/CD
AWS
Google Cloud Platform
BigQuery
Redshift
Databricks
dbt
Airflow
Tableau
Looker
MicroStrategy
OpenAI API
MLflow
Jupyter Notebooks
GitHub
VS Code
Google Analytics
Excel
Confluence