Data Engineer

Painted Post, NY
Permanent
Our client, a Business Manufacturing and Supply company, is looking for a Data Engineer for their Painted Post, NY location.

Responsibilities:
  • Embed within a cross-functional Agile team, participating in sprint planning, stand-ups, backlog refinement, and technical discussions.
  • Design, build, troubleshoot, and maintain ETL/ELT workflows that support application functionality, analytics, reporting, and scientific workflows.
  • Develop and manage data pipelines using Apache Airflow, ensuring reliable orchestration, scheduling, monitoring, and recovery of data processes.
  • Work with stakeholders including software developers, scientists, and engineers to understand data sources, workflow requirements, and downstream data needs.
  • Extract, transform, validate, and load data across systems, including relational databases such as PostgreSQL and Oracle.
  • Write, optimize, and maintain complex SQL queries, scripts, and transformation logic to support operational and analytical use cases.
  • Troubleshoot data quality issues, ETL failures, pipeline bottlenecks, and schema inconsistencies; identify root causes and implement durable solutions.
  • Support database exploration, data validation, and troubleshooting using tools such as DBeaver and related database utilities.
  • Evaluate and help adopt new data tools and technologies, including lightweight analytics and transformation solutions (e.g., DuckDB), where appropriate.
  • Collaborate with engineering teams to support reliable integration between data pipelines, applications, APIs, and downstream consumers.
  • Assist with schema evolution, data modeling, migration planning, and data consistency across systems.
  • Document pipeline logic, data dependencies, transformation rules, and operational procedures to support maintainability and team knowledge sharing.
  • Help improve data engineering standards, observability, testing practices, and operational reliability across the team.
  • Regularly interact with scientists and engineers to understand research and technical workflows; experience in scientific or research environments is a strong plus.

Requirements:
  • Experience designing, building, and troubleshooting ETL/ELT pipelines
  • Hands-on experience with workflow orchestration tools, preferably Apache Airflow
  • Strong experience writing and optimizing SQL
  • Experience working with relational databases, especially PostgreSQL and Oracle
  • Ability to develop and maintain data transformations, validation steps, and pipeline logic across multiple systems
  • Experience with database tools such as DBeaver or similar for query development, exploration, and troubleshooting
  • Familiarity with modern data processing and analytical tools such as DuckDB, or an interest in evaluating emerging data technologies
  • Understanding of data modeling, schema design, data integrity, and performance tuning
  • Experience troubleshooting pipeline failures, performance issues, and inconsistent or incomplete datasets
  • Familiarity with scripting or programming for pipeline development and automation; Python experience is strongly preferred
  • Understanding of version control and collaborative development workflows
  • Experience supporting production data systems with an emphasis on reliability, maintainability, and clear documentation
  • Confident collaborating with developers, scientists, analysts, and product stakeholders
  • Able to gather and clarify technical and data requirements and translate them into scalable data solutions
  • Strong communication skills around pipeline status, data quality issues, dependencies, and tradeoffs
  • Comfortable handling ambiguity, improving incomplete processes, and helping define best practices
  • Proactive in identifying opportunities to improve data workflows, tooling, performance, and operational stability
  • Strong analytical and problem-solving skills
  • High attention to detail and commitment to data quality, consistency, and reliability
  • Demonstrated initiative in troubleshooting issues and improving pipeline robustness
  • Curiosity and willingness to evaluate and adopt new tools, technologies, and approaches
  • Ability to balance immediate operational needs with long-term maintainability and scalability
  • Comfortable proposing improvements, collaborating across teams, and building trust through reliable execution

Why Should You Apply?
  • Health Benefits
  • Referral Program
  • Excellent growth and advancement opportunities

Job Type: Permanent

Job ID: 254621405