3 days ago

Middle Data Engineer (Python/PySpark/SQL)

Work format

remote (Global)

Employment type

full-time

Grade

middle

English

Vacancy from Telegram channel -

Job description

#lookfor #outsource #outstaff #remote #DataEngineer #Python #SQL #PySpark #ETL #DataWarehouse #BigData #AWS #Azure

We are looking for a Middle Data Engineer to join our data team on a full-time remote basis.

The specialist will design, build, and maintain scalable data pipelines and ETL/ELT processes, working with large-scale datasets, data warehouses, and cloud platforms.

Key responsibilities:
• Design, develop, and maintain ETL/ELT pipelines for data ingestion, transformation, and loading.
• Build and optimize data processing workflows using Python and PySpark.
• Develop data warehouse architectures and data lake solutions.
• Create and maintain SQL queries, stored procedures, and data models for reporting.
• Integrate data from APIs, databases, streaming services, and third-party platforms.
• Implement data quality checks, validation rules, and monitoring mechanisms.
• Orchestrate workflows using Apache Airflow, Dagster, or similar tools.
• Collaborate with data scientists and analysts to deliver clean, reliable datasets.
• Document data architectures, pipeline logic, and data dictionaries.

Requirements:
• 3+ years of commercial experience in data engineering or related roles.
• Strong proficiency in Python for data processing and automation.
• Expert-level SQL skills including complex queries, window functions, and query optimization.
• Hands-on experience with Apache Spark and PySpark for distributed data processing.
• Solid understanding of ETL/ELT concepts and data pipeline design patterns.
• Experience with relational databases, data warehouses (Snowflake, BigQuery, Redshift), and data lakes (Delta Lake, Iceberg).
• Experience with cloud platforms (AWS, Azure, or GCP) and their data services.
• Understanding of data modeling: star schema, snowflake schema, normalization, denormalization.
• Experience with workflow orchestration tools (Airflow, Prefect, Dagster).
• Familiarity with streaming data processing (Kafka, Kinesis, Spark Streaming).
• Knowledge of Git, data governance, security, and compliance principles.
• Strong analytical and problem-solving skills with attention to data accuracy.
• English: B2 or higher (written and spoken).

Nice to have:
• Experience with infrastructure-as-code (Terraform), dbt, and CI/CD for data pipelines.
• Familiarity with NoSQL databases, BI tools, or real-time analytics frameworks.
• Understanding of MLOps concepts and machine learning pipelines.
• Contributions to open-source data engineering projects.

Location: Remote, worldwide
Restrictions: Candidates from Egypt, India, Pakistan, and Afghanistan are not considered
English: B2+
Format: Full-time, outsource, outstaff
Contact:

Vacancy text reproduced without changes

Source -
