Machine Learning & Data Scientist
Location: New York, NY (Hybrid)
Duration: 12+ Months
Senior Machine Learning Engineer (Cloud & Data Platform)
Role Overview
We are seeking a highly capable Senior Machine Learning Engineer to support the modernization of enterprise analytics and modeling platforms. This role focuses on migrating and transforming legacy data and machine learning workflows into scalable, cloud-native architectures while improving performance, reliability, and engineering standards.
The ideal candidate combines strong ML engineering expertise with deep experience in distributed data processing and cloud data platforms.
Key Responsibilities
Machine Learning Engineering
Design, develop, and deploy scalable machine learning models using modern frameworks (e.g., PyTorch)
Re-engineer and optimize legacy models into efficient, production-grade implementations
Improve model performance, scalability, and reproducibility
Support model validation, benchmarking, and certification processes
Ensure full traceability and documentation of model logic and outputs
Data Platform & Pipeline Engineering
Design and optimize distributed data pipelines using Spark-based platforms (e.g., Databricks)
Build and refactor ETL/ELT workflows for performance and scalability
Implement data models within modern cloud data warehouses (e.g., Snowflake)
Apply best practices for cloud-native data architecture
Standardize reusable utilities and frameworks for analytics workflows
Cloud Migration & Modernization
Participate in the migration of on-premises or legacy analytics platforms to cloud ecosystems
Refactor existing codebases to align with modern engineering and DevOps standards
Leverage cloud compute capabilities (including GPU acceleration where applicable)
Support scheduling and orchestration of data and ML workflows
Testing, Validation & Governance
Conduct rigorous testing and validation to ensure data and model accuracy
Perform parallel runs and benchmarking when modernizing systems
Collaborate with governance, risk, and compliance stakeholders
Maintain high standards of documentation and reproducibility
Required Qualifications
Technical Skills
Strong programming skills in Python
Hands-on experience with PyTorch (or similar deep learning frameworks)
Expertise in Spark-based data processing (Databricks preferred)
Strong SQL skills
Experience working with cloud data warehouses such as Snowflake
Experience building and optimizing ETL/ELT pipelines
Familiarity with distributed computing and performance tuning
Cloud & DevOps
Experience working in cloud environments (AWS, Azure, or GCP)
Understanding of workflow orchestration tools (e.g., Airflow, native platform schedulers)
Experience with version control and CI/CD practices for ML pipelines
Exposure to containerization and scalable deployment patterns
Preferred Qualifications
Experience modernizing legacy codebases (C++, R, or similar)
Experience in regulated industries (e.g., financial services, banking, insurance)
GPU optimization experience
Knowledge of model risk management or model validation frameworks
Experience supporting large-scale data transformation initiatives
