About the Role
US Work Authorization Requirement:
Candidates must be legally authorized to work in the United States without employer sponsorship. This includes, but is not limited to, U.S. Citizens, Permanent Residents, and other individuals with valid U.S. work authorization.
Job Description:
We are seeking a Senior Data Engineer to join a highly collaborative, onsite data engineering team supporting large-scale cloud migration and data platform modernization initiatives. This is a hands-on, end-to-end role requiring deep expertise in Databricks, PySpark, SQL, and AWS-based ETL pipelines.
The ideal candidate is a senior practitioner who can not only build and deliver solutions independently but also evaluate existing architectures, identify gaps, and propose modern, scalable approaches aligned with Databricks and cloud data engineering best practices. You will work in an Agile environment, partnering closely with engineering peers, architects, and business stakeholders.
Key Responsibilities
Design, develop, and maintain scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark)
Lead and support migration of data pipelines and applications from on-premises environments to AWS
Build robust data ingestion frameworks for structured, semi-structured, and unstructured data sources
Write, optimize, and maintain complex SQL queries across RDBMS, data lakes, and federated data environments
Review existing data solutions and recommend architectural improvements for performance, scalability, and maintainability
Develop reusable frameworks and components to standardize data processing patterns
Tune and optimize Spark jobs for performance and cost efficiency in cloud environments
Build, deploy, and support solutions end-to-end, from development through production
Implement CI/CD pipelines and follow version control best practices
Enforce data quality, validation, security, and governance standards
Collaborate with data architects, analysts, and business stakeholders to translate requirements into technical solutions
Participate in Agile ceremonies, including sprint planning, estimation, and retrospectives
Troubleshoot, debug, and resolve production pipeline issues
Take full ownership of solutions from design through production support
Required Qualifications
12+ years of overall IT experience with strong focus on data engineering
Hands-on expertise with the Databricks platform (SaaS), Python, and PySpark
Ability to independently design and build ETL pipelines
Expert-level SQL skills, including writing and optimizing complex queries
Strong experience with AWS-based ETL services, including:
AWS Glue
EC2
EMR
Amazon S3
Solid understanding of Data Lake and Lakehouse architectures
Experience working with RDBMS and large-scale analytical datasets
Proven experience deploying production code using Git and CI/CD pipelines
Comfortable working onsite full-time in a collaborative team environment
Strong communication skills with a solution-oriented mindset
Ability to own solutions end to end, from architecture through operations
Education
Bachelor’s degree in Computer Science or equivalent professional experience
Tech Stack
Python, SQL, AWS, Git, ETL, Agile