L1 Data Engineer - Remote

Deepsource

We are looking for a motivated and technically solid L1 Data Engineer to join our growing Data & Analytics team. In this role, you will design, build, and maintain the data architecture and infrastructure that support our organization's data strategy. You will work hands-on to develop, test, and deploy reliable data solutions, ensuring that pipelines are scalable, efficient, and aligned with business requirements.

This is an ideal opportunity for a data professional who is eager to deepen their expertise in cloud-native data platforms, particularly within the Microsoft Azure and Databricks ecosystem, and who thrives in a collaborative, fast-paced environment.

KEY RESPONSIBILITIES

• Design, develop, and maintain scalable data pipelines and ETL/ELT workflows to support business intelligence and analytics use cases.

• Build and optimize data ingestion processes using Azure Data Factory and Databricks, ensuring data quality and consistency across all layers of the data platform.

• Transform and process large datasets using PySpark and Python, applying best practices for performance and maintainability.

• Write and optimize complex SQL queries to support analytical reporting and data validation requirements.

• Collaborate with data architects and senior engineers to implement and maintain data models aligned with organizational standards.

• Monitor, troubleshoot, and resolve pipeline failures and data quality issues, applying root-cause analysis to prevent recurrence.

• Contribute to documentation of data pipelines, data dictionaries, and engineering standards.

• Support the team in exploring and evaluating new tools and approaches to continuously improve the data infrastructure.

REQUIREMENTS

• 2+ years of professional experience in a Data Engineering or closely related role.

• Strong proficiency in Python for data processing, transformation, and automation tasks.

• Hands-on experience with Pandas for data manipulation and PySpark for distributed data processing.

• Practical experience with Databricks, including notebook development, clusters, and job orchestration.

• Experience building and managing data pipelines with Azure Data Factory.

• Working knowledge of Azure Synapse Analytics, particularly Spark pool integration.

• Solid SQL skills, including query writing, optimization, and performance tuning.

• Familiarity with data engineering principles including incremental loading, data lake architecture, and Delta Lake.

• Understanding of data governance and security concepts within a cloud data platform.

NICE TO HAVE

• Experience with SQL Server migration projects, including schema conversion and data movement.

• Exposure to Terraform for Azure infrastructure provisioning and management.

• Familiarity with CI/CD practices applied to data engineering workflows.

• Experience with Delta Sharing or Lakehouse Federation concepts.

CERTIFICATION REQUIREMENT

Candidates are expected to hold or be actively working toward the Databricks Certified Data Engineer Associate certification. This certification validates foundational knowledge across the following domains:

• Databricks Lakehouse Platform architecture and capabilities

• ETL and ELT workflows using Spark SQL and PySpark

• Incremental data processing and structured streaming

• Production pipeline development and orchestration

• Data governance and security within the Databricks environment

Skills

Python, PySpark, Azure Data Factory, Databricks, SQL, Data Pipeline Development, Data Governance, Collaboration, Problem Solving, Documentation