Senior Data Analytics Engineer

Zalando


March 2022 - Present

Germany


Summary
Leading data analytics initiatives at Zalando, Europe's leading online fashion platform. Responsible for building scalable data pipelines, analysing user behaviour and product performance metrics, and delivering actionable insights that drive business decisions. Spearheaded self-service reporting tools, data quality pipelines, and high-performance ETL processes handling terabytes of data daily.
  • Analysed user behaviour and product performance data using SQL, Python, Redshift, QuickSight and Looker to identify trends and optimisation opportunities, driving a 15% increase in conversion rate for key product categories.
  • Partnered with diverse stakeholders to define and prioritise product features, driving a 20% improvement in user engagement and transaction metrics.
  • Spearheaded the creation of a self-service Customer Insights Portal, empowering stakeholders to access real-time insights and reducing ad-hoc reporting requests by 20%.
  • Coordinated with product teams to apply advanced statistical techniques such as regression analysis and A/B testing, providing actionable insights and recommendations for product optimisation.
  • Designed and implemented data quality pipelines with dbt, Airflow, BigQuery and Databricks, decreasing data discrepancies by 25% and ensuring data integrity; presented findings that drove fixes for remaining discrepancies.
  • Engineered SQL and Spark/Python data pipelines, cutting data processing and analysis time by 50% and enabling processing of over 2 TB of daily Google Analytics export data.
  • Implemented time-series forecasting and outlier-detection models and visualised insights in Looker and MicroStrategy, improving data quality by 80% and speeding incident detection and resolution.
  • Orchestrated ETL pipelines using dbt, Databricks, AWS S3 and Airflow, improving workflow scheduling and monitoring and boosting pipeline reliability and efficiency by 35%.
  • Enhanced product development using advanced statistical techniques in collaboration with cross-functional teams, securing approval from 17 stakeholders and improving model accuracy by 7%.
Skills Used
Python, SQL, Prompt Engineering, Data Engineering, Data Analytics, Data Science, ETL/ELT, Data Pipelines, Data Modeling, Data Warehousing, Pandas, NumPy, PostgreSQL, CI/CD, AWS, Google Cloud Platform, BigQuery, Redshift, Databricks, dbt, Airflow, Tableau, Looker, MicroStrategy, OpenAI API, MLflow, Jupyter Notebooks, GitHub, VS Code, Google Analytics, Excel, Confluence