Machine Learning & Data Scientist

New York, NY
Permanent
Role Name: Machine Learning & Data Scientist

Location: New York, NY (Hybrid)

Duration: 12+ Months

Senior Machine Learning Engineer (Cloud & Data Platform)

Role Overview

We are seeking a highly capable Senior Machine Learning Engineer to support the modernization of enterprise analytics and modeling platforms. This role focuses on migrating and transforming legacy data and machine learning workflows into scalable, cloud-native architectures while improving performance, reliability, and engineering standards.

The ideal candidate combines strong ML engineering expertise with deep experience in distributed data processing and cloud data platforms.

Key Responsibilities

Machine Learning Engineering

Design, develop, and deploy scalable machine learning models using modern frameworks (e.g., PyTorch)

Re-engineer and optimize legacy models into efficient, production-grade implementations

Improve model performance, scalability, and reproducibility

Support model validation, benchmarking, and certification processes

Ensure full traceability and documentation of model logic and outputs

Data Platform & Pipeline Engineering

Design and optimize distributed data pipelines using Spark-based platforms (e.g., Databricks)

Build and refactor ETL/ELT workflows for performance and scalability

Implement data models within modern cloud data warehouses (e.g., Snowflake)

Apply best practices for cloud-native data architecture

Standardize reusable utilities and frameworks for analytics workflows

Cloud Migration & Modernization

Participate in the migration of on-premises or legacy analytics platforms to cloud ecosystems

Refactor existing codebases to align with modern engineering and DevOps standards

Leverage cloud compute capabilities (including GPU acceleration where applicable)

Support scheduling and orchestration of data and ML workflows

Testing, Validation & Governance

Conduct rigorous testing and validation to ensure data and model accuracy

Perform parallel runs and benchmarking when modernizing systems

Collaborate with governance, risk, and compliance stakeholders

Maintain high standards of documentation and reproducibility

Required Qualifications

Technical Skills

Strong programming skills in Python

Hands-on experience with PyTorch (or similar deep learning frameworks)

Expertise in Spark-based data processing (Databricks preferred)

Strong SQL skills

Experience working with cloud data warehouses such as Snowflake

Experience building and optimizing ETL/ELT pipelines

Familiarity with distributed computing and performance tuning

Cloud & DevOps

Experience working in cloud environments (AWS, Azure, or GCP)

Understanding of workflow orchestration tools (e.g., Airflow, native platform schedulers)

Experience with version control and CI/CD practices for ML pipelines

Exposure to containerization and scalable deployment patterns

Preferred Qualifications

Experience modernizing legacy codebases (C++, R, or similar)

Experience in regulated industries (Financial Services, Banking, Insurance, etc.)

GPU optimization experience

Knowledge of model risk management or model validation frameworks

Experience supporting large-scale data transformation initiatives

Job Type: Permanent

Job ID: 253528396