Data Engineer Python
MALVERN, PA - HYBRID
Senior Data Engineer
Location: Malvern, PA (Hybrid)
Experience Level: Level 4 (8+ years)
Role Summary
We are seeking a highly motivated Senior Data Engineer to join the Cost Basis Accounting and Method team. This role is primarily focused on a critical, multi-year Batch Modernization effort, moving legacy mainframe batch processes to a modernized AWS cloud-based architecture. The ideal candidate will be an independent contributor and a rockstar developer who is passionate about building scalable data pipelines.
Key Responsibilities
- Design, develop, and maintain high-volume data transformation logic primarily using AWS Glue jobs written in Python.
- Develop custom code, and AWS Lambda functions where needed, to handle complex logic within the batch processes.
- Utilize PySpark and SQL for data querying, filtering, and manipulation against various data stores, including modernized data sources and initial DB2 tables.
- Collaborate with internal mainframe experts to understand legacy system logic and implement requirements for the modernized batch processes.
- Engage with the build and deployment pipeline, demonstrating a strong understanding of DevOps concepts and proficiency with Git/GitHub.
- Handle data ingestion from multiple sources, including various vendors, flat files, CSVs, and APIs.
- Work closely with a dedicated Tech Lead, but be prepared to operate with a high degree of independence.
Qualifications
- Experience with AWS databases such as Aurora or Redshift.
- Proven experience as a Data Engineer with a strong focus on data pipelines and ETL/ELT processes.
- Expertise in Python (estimated 80-90% of development work).
- Experience with AWS services, particularly Glue and Lambda.
- Proficiency in PySpark and SQL for data handling and querying.
- Familiarity with DevOps practices and the Git/GitHub development workflow.
- Some experience with Java batch processes (estimated 10-20% of development work) is a plus.
- Experience in dealing with varied data formats and sources (vendors, files, APIs).
- Prior experience with, or an understanding of, mainframe concepts is nice to have but not required.
