ETL

• Deep hands-on expertise in Apache Spark (Python, Java, or Kotlin); a minimal PySpark ETL sketch follows this list
• 5+ years of experience in the design and implementation of big data technologies (Apache Spark, the Hadoop ecosystem, Snowflake, Elastic, Apache Kafka, NoSQL databases) and familiarity with data architecture patterns (data warehouses, data lakes, and data streaming)
• Familiarity with Kubernetes as the resource manager for Spark jobs is a plus
• Minimum of 2 years of programming experience with Python in a similar, relevant data engineering role
• Experience with query tuning, performance tuning, troubleshooting, and debugging Spark and other big data solutions
• Experience working with libraries such as pandas, NumPy, and SciPy, and comfort with ETL processes
• Experience with the creation, orchestration, and monitoring of job pipelines using a framework such as Apache Airflow or Pinball (see the DAG sketch after this list)
• Familiarity with RESTful APIs is a plus
• Experience with data processing architectures (both streaming and batch) is highly preferred
• Experience with SQL and relational databases such as SQL Server, MySQL, PostgreSQL, or Oracle
• Experience in writing complex SQL queries (see the window-function sketch after this list)
• Experience with the Matillion ETL tool
• Experience with data modeling, data warehousing, and data integration on one of Snowflake, Databricks, or Spark
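
For illustration, a minimal PySpark ETL sketch of the kind this role covers: extract, transform, and load with a partitioned write. The paths, schema, and aggregation are hypothetical placeholders, not part of the posting:

```python
# A minimal PySpark ETL sketch. Input path, column names, and output
# location are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw events (hypothetical schema: user_id, amount, ts).
events = spark.read.parquet("s3://example-bucket/raw/events/")

# Transform: drop bad rows and aggregate daily spend per user.
daily_spend = (
    events
    .where(F.col("amount") > 0)
    .withColumn("day", F.to_date("ts"))
    .groupBy("user_id", "day")
    .agg(F.sum("amount").alias("total_spend"))
)

# Load: a partitioned write keeps file sizes manageable and helps
# downstream partition pruning.
daily_spend.write.mode("overwrite").partitionBy("day").parquet(
    "s3://example-bucket/curated/daily_spend/"
)

spark.stop()
```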
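A minimal Apache Airflow DAG sketch for the pipeline-orchestration requirement, assuming Airflow 2.4+; the DAG id, schedule, and task callables are hypothetical:

```python
# A minimal Airflow DAG sketch: two tasks in sequence, scheduled daily.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw data from a source system.
    print("extracting...")


def load():
    # Placeholder: load transformed data into the warehouse.
    print("loading...")


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Monitoring concerns (retries, SLAs, alerting) attach to these
    # tasks in practice.
    extract_task >> load_task
```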
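A sketch of a more complex SQL query (a window function) run through Spark SQL, tying the SQL requirements to the Spark stack above; the table and column names are hypothetical:

```python
# Window-function sketch: rank each user's days by spend and keep the
# top day per user. Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-sketch").getOrCreate()

spark.createDataFrame(
    [("a", "2024-01-01", 10.0),
     ("a", "2024-01-02", 5.0),
     ("b", "2024-01-01", 7.0)],
    ["user_id", "day", "total_spend"],
).createOrReplaceTempView("daily_spend")

top_days = spark.sql("""
    SELECT user_id, day, total_spend
    FROM (
        SELECT user_id, day, total_spend,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY total_spend DESC
               ) AS rn
        FROM daily_spend
    ) ranked
    WHERE rn = 1
""")
top_days.show()
```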