Work Experience
Beeline/Qazcode
Big Data Developer
- Migrated DBSS storage from Parquet to Iceberg format and refactored the codebase to improve scalability and data reliability.
- Optimized Spark jobs used in analytical models, reducing resource consumption by 66%.
- Upgraded data infrastructure to support newer versions of Apache Spark and Python.
- Designed and automated data showcase pipelines using Apache Airflow with trigger-based execution.
- Designed and maintained Grafana-based monitoring for 800+ Airflow DAGs, providing visibility into resource consumption and execution performance.
Oris Lab
Backend Developer Internship
- Built and tested a Node.js service for Tron-based blockchain transactions.
- Set up PostgreSQL as the backend database for managing transaction data.
- Collaborated with the team using Git for version control and code reviews.
Projects
House Prices (Kaggle)
Collaborated with Data Scientists and Analysts in the Astana Hub program to build and monitor machine learning solutions. Contributed to model training, data preprocessing, and system monitoring in a production-like environment. Tools Used: Python, PostgreSQL, CatBoost, XGBoost, SQLAlchemy, Evidently AI
Real Data Streaming
Built a real-time data pipeline that pulls information from open sources (the randomuser.me API) and processes it as it arrives. The full environment is containerized with Docker. Apache Airflow handles orchestration and process management, with PostgreSQL for data storage. Data is streamed through Apache Kafka and synchronized with Zookeeper, processed at scale with Apache Spark, and the processed data is stored in Apache Cassandra. Tools Used: Python, API, SQLAlchemy, Spark, Kafka, Docker, Zookeeper, Cassandra, Airflow
Education
Astana Hub
2024-2025 Big Data Engineer
Courses
Haliç Üniversitesi
2023-2024 Software Engineer
International Information Technology University
2022 Software Engineer