Senior Data Engineer (Data Usage)

Indrive

Kazakhstan Data Platform

We are actively building a Data Warehouse, a key part of the product. We work with cutting-edge technologies (GCP, AWS, Airflow, Kafka, K8s) and make infrastructure and architectural decisions based on data. We are building a large-scale data infrastructure for analytics, machine learning, and real-time recommendations.

Our tech stack
Languages: Python, SQL
Frameworks: Spark, Apache Beam
Storage and analytics: BigQuery, GCS, S3, Trio, other GCP and AWS stack components
Integration: Apache Kafka, Google Pub/Sub, Debezium
ETL: Airflow 2
Infrastructure: Kubernetes, Terraform
Development: GitHub, GitHub Actions, Jira

Key Responsibilities

• Foster a culture of working with data across the organization, ensuring data-driven decision-making

• Create and maintain a unified system for processing, storing, and validating data, ensuring data integrity and accessibility

• Design and build data processing and enrichment workflows, participating in every stage of the data pipeline from data capture to presentation to consumers

• Develop and maintain infrastructure for big data storage and processing using tools like Kubernetes (K8S) and Terraform

• Create and optimize APIs (REST, gRPC) for high-load data access services, enabling efficient data retrieval

• Write integration and unit tests, and develop automation tools for data validation and alerting

• Contribute to system design and architecture with the development team

Skills, Knowledge and Expertise

• Advanced proficiency in Python 3.7+ with strong experience in developing ETL processes using PySpark

• Proven experience in developing data flows using Airflow 2

• High level of expertise in SQL, including complex queries and optimization

• Extensive knowledge of and production experience with Kubernetes (K8S)

• Strong understanding of data processing algorithms and principles, with experience in Spark/Flink

• Solid understanding of general programming concepts, including design patterns, OOP, modularity, and clean architecture

• Demonstrated ability to take ownership of technologies or services and proactively contribute ideas to the team

Conditions

• Stable salary, official employment;

• Health insurance;

• Hybrid work mode and flexible schedule;

• Relocation package offered for candidates from other regions;

• Access to professional counseling services including psychological, financial, and legal support;

• Discount club membership;

• Diverse internal training programs;

• Partially or fully paid additional training courses;

• All necessary work equipment.

Skills

Python, SQL, Airflow, Kubernetes, ETL processes, Data processing algorithms, Spark, APIs (REST, gRPC), Data integrity, Team collaboration