Python Developer - Cloudera/Hadoop, CI/CD, Databricks
Our client, a banking company, is looking for a Python Developer (Cloudera/Hadoop, CI/CD, Databricks) for their Jersey City, NJ (hybrid) location.
Requirements:
- Must have strong design fundamentals along with hands-on experience.
Required Qualifications:
- The Python Engineer designs and implements data ingestion and transformation jobs using Python and PySpark.
- They build reusable frameworks for data quality checks and integrate pipelines with CI/CD processes.
- The role requires performance tuning, secure coding practices, and strong debugging skills. Close collaboration with architects and Data Ops teams ensures robust and compliant solutions.
Desired Qualifications:
- Strong Python (packaging, virtual environments), PySpark/Spark performance tuning.
- Experience with data ingestion (batch/stream), schema management, and error handling/retry logic.
- Test discipline: unit/integration tests, data quality assertions, reproducible pipelines.
- CI/CD (Azure DevOps/Jenkins), Git workflows, artifact versioning, release readiness.
- Experience with Cloudera/Hadoop (HDFS, Spark, Hive/Impala) and Databricks (clusters, jobs, notebooks, Delta).
- Observability: structured logging, metrics, tracing; debuggability in distributed contexts.
- Secure coding and handling of secrets, PII protection, and compliance considerations.
- Strong communication and collaborative work style; documentation of frameworks and usage patterns.
- Performance optimization: partitioning, caching, broadcast joins, memory tuning.
- Comfort with Agile ceremonies and iterative delivery.
- Minimum 5 years of experience.
- Technology Specialist Skillset: Cloudera Impala, Python, Spark
Why Should You Apply?
- Health Benefits
- Referral Program
- Excellent growth and advancement opportunities
