Experience: 5 to 10 years
Location: Chennai/ Coimbatore/ Remote
Job Type: Full Time – Permanent
Required Qualifications
- 5+ years of hands-on data engineering experience, with at least 2 years working extensively in Databricks on enterprise-scale workloads
- Expert-level PySpark proficiency — must be able to write, review, and optimize complex transformations, understand the Catalyst optimizer, and diagnose runtime behavior
- Deep experience with Databricks Declarative Pipelines (Delta Live Tables) — including expectations, pipeline modes, table types (streaming vs. materialized), and event-driven triggering
- Proven understanding of Auto Loader: configuration, schema inference and evolution, checkpointing, and operational best practices
- Ability to troubleshoot long-running MERGE operations — including understanding write amplification, file compaction, and transaction log behavior in Delta Lake
- Demonstrated ability to diagnose pipeline performance issues and distinguish between cluster sizing problems, code inefficiencies, and data volume/skew issues
- Strong knowledge of common PySpark and pipeline performance bottlenecks: data skew, excessive shuffles, poor partitioning strategy, broadcast join misuse, and UDF overhead
- Strong SQL proficiency — complex transformations, window functions, query optimization
- Experience building within a medallion architecture (Bronze / Silver / Gold) in a production environment
- Familiarity with Azure cloud services relevant to the Databricks ecosystem (Azure Data Lake Storage, Azure Key Vault, Azure Monitor, etc.)
Preferred Qualifications
- Databricks Certified Data Engineer certification (Professional preferred; Associate at minimum)
- Experience with Databricks serverless compute and an understanding of its trade-offs vs. classic clusters
- Familiarity with Databricks streaming workloads using Structured Streaming
- SQL Server experience — understanding of source system structures common in enterprise retail environments
- Familiarity with Power BI data models and downstream semantic layer considerations
- Background in retail, fuel/convenience, or CPG data engineering