Data Engineer (Python, Spark, PySpark, AWS)

Job Category: Data Engineering, PySpark
Job Type: Full Time
Job Locations: Coimbatore, India; Work From Home

About Us

Wavicle Data Solutions designs and delivers data and analytics solutions that reduce the time, cost, and risk of companies’ data projects, improving the quality of their analytics and decisions now and into the future. As a privately held consulting services organization with popular, name-brand clients across multiple industries, Wavicle offers exciting opportunities for data scientists, solutions architects, developers, and consultants to jump right in and contribute to meaningful, innovative solutions.

Our 250+ local, nearshore and offshore consultants, data architects, cloud engineers, and developers build cost-effective, right-fit solutions leveraging our team’s deep business acumen and knowledge of cutting-edge data and analytics technology and frameworks.

At Wavicle, you’ll find a challenging and rewarding work environment where we enjoy working as a team to exceed client expectations. Employees appreciate being part of something meaningful at Wavicle. Wavicle has been recognized by industry leaders as follows:

  • Chicago Tribune’s Top Workplaces
  • Inc 500 Fastest Growing Private Companies in the US
  • Crain’s Fast 50 fastest growing companies in the Chicago area
  • Talend Expert Partner recognition
  • Microsoft Gold Data Platform competency

About The Role

We are looking for a Data Engineer with strong hands-on experience building data pipelines using emerging technologies.


Responsibilities

  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of sources, using technologies such as Hadoop, Spark, and AWS Lambda.
  • Apply experience with AWS cloud data integration using Apache Spark, EMR, Glue, Kafka, Kinesis, and Lambda across the S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
  • Apply strong hands-on Python development experience, especially with PySpark in an AWS cloud environment.
  • Design, develop, test, deploy, maintain, and improve data integration pipelines.
  • Develop pipeline objects using Apache Spark with PySpark (Python) or Scala.
  • Design and develop data pipeline architectures using Hadoop, Spark, and related AWS services.
  • Load-test and performance-test data pipelines built with the technologies above.


Requirements

  • At least 5 years of hands-on professional experience with AWS and Python programming is required; experience with Python frameworks (e.g., Django, Flask, Bottle) is also required.
  • Expert-level knowledge of SQL, with the ability to write complex, highly optimized queries across large volumes of data, is required.
  • Working experience implementing ETL pipelines with PySpark and AWS services such as Glue, Lambda, EMR, Athena, S3, SNS, Kinesis, and Data Pipeline is required.
  • Hands-on experience with a programming language such as Scala, Python, or R is required.
  • Hands-on professional experience with emerging technologies (Snowflake, Matillion, Talend, ThoughtSpot, and/or Databricks) is highly desirable.
  • Knowledge of, or experience with, architectural best practices for building data lakes.
  • Strong problem-solving and troubleshooting skills, with the ability to exercise mature judgment.
  • Bachelor’s or Master’s degree in Computer Science or a related field is required.

Apply for this position

Allowed Type(s): .pdf, .doc