Maksat Mukan

Data Scientist

Middle Almaty, Kazakhstan
2 yr 2 mo of experience 29 skills

Work experience

TTC Transtelecom

04.2024 - 08.2024 4 mo

Data Engineer (Internship)

Intern Hybrid

Built and maintained data pipelines for large-scale processing using Python, Pandas, and NumPy. Wrote SQL queries and integrated PostgreSQL via Psycopg2 for data storage and retrieval. Built dashboards in PowerBI and Tableau to support operational reporting. Wrote unit tests and followed CI/CD practices to maintain code quality.

Kazakhtelecom JSC

08.2024 - present 1 yr 10 mo

Data Scientist

Hybrid

Built NLP pipelines for text classification and complaint analysis using FastText, Sentence-BERT, and BERTopic on real call center and CRM data. Developed an NPTB (Next Product to Buy) model using XGBoost and RFM features; marketing adopted its results for cross-sell campaigns. Applied topic modeling (LDA, BERTopic) to cluster customer complaints and surface recurring root causes across network, billing, and service categories. Queried and processed large internal datasets using SQL and Psycopg2; collaborated with business teams to translate findings into action.

  • Developed an NPTB model adopted by marketing for cross-sell campaigns

Projects

Service Request Categorization — Kazakhtelecom

Labeled and preprocessed historical service requests (call transcripts, chats, tickets); applied FastText embeddings for text representation. Trained LightGBM and SVM classifiers to route requests into categories (technical, billing, upgrades); integrated predictions into internal dashboards for real-time triage.

Complaint Analysis & Root Cause Detection — Kazakhtelecom

Built an end-to-end NLP pipeline: preprocessing → embedding (Sentence-BERT) → clustering (BERTopic) → trend visualization. Identified recurring complaint patterns (network failures, billing errors) and delivered insights to technical teams to reduce complaint volume.

NPTB (Next Product to Buy) Prediction — Kazakhtelecom

Engineered RFM-based features from subscription history and usage logs; trained an XGBoost classifier, evaluated by precision@k and top-k accuracy. Delivered recommendations to marketing and sales teams for targeted upsell campaigns.

Customer Retention Prediction

Trained Random Forest and Logistic Regression models on transaction history and demographics to predict churn risk (ROC-AUC as primary metric).

Education

Kazakh-British Technical University

2022 — 2026

Computer Science

Bachelor's degree

Awards

International Physics Olympiad Winner

2nd Place

2020

Courses

KnewIT Django course

Expected salary

1 200 000 KZT

Skills

Python SQL C++ Pandas NumPy Scikit-learn Psycopg2 Matplotlib PySpark PyTorch TensorFlow Transformers Docker PowerBI Tableau Jupyter Git Airflow QlikSense FastText Sentence-BERT BERTopic XGBoost LightGBM NLTK SVM Random Forest Logistic Regression RFM