Aset Nur

Almaty, Kazakhstan

1 yr 8 mos of experience, 16 skills

Work Experience

Beeline/Qazcode

02.2025 – present, 1 yr 6 mos

Big Data Developer

Office, Almaty

Migrated DBSS storage from Parquet to Iceberg format and refactored the codebase to improve scalability and data reliability.
Optimized Spark jobs used in analytical models, reducing resource consumption by 66%.
Upgraded data infrastructure to support newer versions of Apache Spark and Python.
Designed and automated data mart pipelines using Apache Airflow with trigger-based execution.
Designed and maintained Grafana-based monitoring for 800+ Airflow DAGs, providing visibility into resource consumption and execution performance.

Oris Lab

06.2024 – 08.2024, 2 mos

Backend Developer Internship

Intern, Office, Almaty

Built and tested a Node.js service for Tron-based blockchain transactions.
Set up PostgreSQL as the backend database for managing transaction data.
Collaborated with the team using Git for version control and code reviews.

Projects

House Prices (Kaggle)

Collaborated with Data Scientists and Analysts in the Astana Hub program to build and monitor machine learning solutions. Contributed to model training, data preprocessing, and system monitoring in a production-like environment. Tools Used: Python, PostgreSQL, CatBoost, XGBoost, SQLAlchemy, Evidently AI

Real Data Streaming

Built a data pipeline that retrieves information from open sources (the randomuser.me API) and processes it in real time. Apache Airflow handles orchestration and process management, with data stored in PostgreSQL. Data is streamed through Apache Kafka, with coordination handled by ZooKeeper, processed at scale with Apache Spark, and the results are stored in Apache Cassandra. The full environment is containerized with Docker. Tools Used: Python, API, SQLAlchemy, Spark, Kafka, Docker, ZooKeeper, Cassandra, Airflow

Education

Astana Hub

2024 — 2025

Big Data Engineer

Courses

Haliç Üniversitesi

2023 — 2024

Software Engineer

International Information Technology University

2022

Software Engineer

Skills

Python, Java, SQL, Bash, Hadoop, HDFS, Hive, ClickHouse, Apache Spark, Apache Cassandra, Apache Kafka, Apache Airflow, Git, FastAPI, Docker, CI/CD