A Data Engineer with a passion for making teams work faster and better by building efficient data pipelines and creating scalable data solutions. I enjoy analyzing and wrangling data to drive actionable insights, while constantly improving processes and tools.
With extensive experience across various technologies such as Google BigQuery, DBT, Apache Beam, AWS, GCP, Docker, Airflow, and more, I've developed streaming pipelines from scratch, optimized data warehouses, and led engineering teams to success.
- Data Engineering: ELT/ETL pipelines, real-time streaming, data warehousing, data modeling
- Cloud Platforms: Google Cloud, AWS, Alibaba, Azure Machine Learning
- Tools & Technologies: DBT, Apache Beam, Airflow, Docker, Datastream, Pub/Sub
- Languages: Python, SQL, R, Java, C++, PHP, Assembly
- Machine Learning: Feature engineering, model deployment, predictive modeling
- Leadership: Team management, career development, hiring
- Developed a real-time streaming pipeline from scratch using Google Datastream, Pub/Sub, and Dataflow, enabling seamless data ingestion from application databases to BigQuery.
- Implemented an end-to-end ELT pipeline with DBT, including testing and query dependencies, reducing BigQuery costs by 20% through optimized partitioning and clustering.
- Led the Data Engineering team by creating job descriptions, career frameworks, entry tests, and interview processes, successfully hiring a new team from zero.
- Created a credit scoring proof-of-concept (POC) for Flip's lending product using Docker and FastAPI.
- Provisioned Redash and Looker Studio for data analytics and visualization, empowering stakeholders with actionable insights.
- Maintained a 24/7 Airflow-based data pipeline, resolving issues promptly to keep it running.
- Designed and implemented a PostgreSQL partitioning strategy for large datasets, significantly improving query performance and scalability.
- Automated daily Facebook, Google, and Appsflyer API ingestion pipelines, increasing operational efficiency.
- Modeled product prices using SARIMA and linear regression models in Azure Machine Learning.
- Enhanced location data with Google Maps Geocoding and visualized event data for a marathon event.
- Developed a news crawler to track and analyze media trends using R.
- KUACI: Developed an open-source KYC solution for Indonesian KTP data, translating KTP numbers into location, gender, and DOB.
- Contributed to GitHub Arctic Code Vault as part of the KUACI project.
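To illustrate the kind of translation KUACI performs, here is a minimal Python sketch that decodes a 16-digit KTP number into region codes, gender, and date of birth. It is not KUACI's actual code: the names `KtpInfo`, `parse_ktp_number`, and `pivot_year` are hypothetical, the sample number is made up, and mapping region codes to place names (which needs a lookup table) is left out. It assumes the commonly documented layout: six digits of region codes, then the birth date as DDMMYY with 40 added to the day for women, then a four-digit serial.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class KtpInfo:
    """Fields decoded from a KTP number (hypothetical structure)."""
    province_code: str
    regency_code: str
    district_code: str
    gender: str
    birth_date: date


def parse_ktp_number(nik: str, pivot_year: int = 2025) -> KtpInfo:
    """Decode a 16-digit KTP number under the assumed layout.

    pivot_year is an assumption used to resolve the two-digit year:
    a year that would fall after pivot_year is treated as 19xx.
    """
    if len(nik) != 16 or not (nik.isascii() and nik.isdigit()):
        raise ValueError("a KTP number must be exactly 16 ASCII digits")

    day = int(nik[6:8])
    month = int(nik[8:10])
    two_digit_year = int(nik[10:12])

    # Assumed convention: women have 40 added to the day of birth.
    gender = "female" if day > 40 else "male"
    if day > 40:
        day -= 40

    year = 2000 + two_digit_year
    if year > pivot_year:
        year -= 100

    # date() raises ValueError for impossible dates such as month 13.
    return KtpInfo(
        province_code=nik[0:2],
        regency_code=nik[2:4],
        district_code=nik[4:6],
        gender=gender,
        birth_date=date(year, month, day),
    )


# Made-up sample number, used only to show the call.
info = parse_ktp_number("3171014107900001")
print(info.gender, info.birth_date.isoformat())
```

The two-digit year is inherently ambiguous, so a real implementation would need its own rule for choosing the century.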
- Bachelor of Computer Science from Multimedia Nusantara University, GPA: 3.78/4.0
- Thesis: Speech Recognition Analysis using CNN for Indonesian Language
- Certifications: TOEIC (950/990), Japanese JLPT N3
- Academic Scholarship: Awarded to the top 5% of students at Multimedia Nusantara University.
I'm always open to new challenges and collaboration opportunities. Let's build something amazing together!
📧 andrewtirtokusumo@gmail.com | 🔗 LinkedIn | 👨‍💻 GitHub