About the Role
Who are we looking for?
Must have:
7+ years of experience in Data Engineering (including at least 5 years with Databricks)
Strong hands-on experience with the Databricks platform for distributed data processing
Excellent programming skills in Python and SQL
Strong understanding of data modeling, data lakehouse architecture, and ELT/ETL patterns
Experience designing scalable cloud-based data platforms (AWS / Azure)
Knowledge of data governance, security, and access control best practices (Unity Catalog, dbt)
Experience leading or mentoring engineers is a strong advantage
Strong analytical thinking and problem-solving skills
Excellent communication and collaboration skills
Fluency in English (at least B2)
Nice to have:
Data Streaming: Kafka
Databases: MS SQL (SSIS, SSAS), PostgreSQL, MySQL
BI tools: Power BI
This is a hybrid role, which means we'd like you to work in the office occasionally, especially during client visits or other important company meetings. We'd also like you to be willing to take occasional short business trips to Warsaw (approximately four times a year).
What will you be working on?
Responsibilities
Design and implement scalable data platforms and pipelines using Apache Spark on Databricks
Lead the development of distributed data processing pipelines using PySpark and SparkSQL
Build and manage Databricks Workflows for orchestration, scheduling, monitoring, and error handling
Optimize Spark workloads by applying join strategies, shuffle optimization, caching, and partitioning techniques
Design and maintain Delta Lake architectures, including schema evolution, ACID transactions, and performance tuning
Implement data governance and access control using Unity Catalog, including permissions, lineage, and secure data sharing
Collaborate with architects and engineering teams to design cloud-native data platforms
Ensure data quality, observability, and reliability across pipelines and data products
Lead performance optimization of large-scale data processing workloads
Mentor and support other Data Engineers, contributing to engineering standards and best practices
Participate in architecture discussions and contribute to the evolution of the company’s data engineering practices
Tech Stack
Databricks, Python, SQL, Apache Spark, PySpark, SparkSQL, Delta Lake, Unity Catalog, dbt, AWS, Azure, Kafka