Data Engineer

Posted by Integrated Resources, Inc

Painted Post, NY

Permanent

Duration: 12 months
Pay: $40 hourly

Job Description: Education and Experience

This position focuses on data pipelines and workflows.

Bachelor's degree in computer science, information systems, data engineering, or related field, or equivalent practical experience. An Associate's degree may be considered if the candidate has an additional 3-5 years of relevant experience beyond the stated requirement.
2+ years of professional experience in data engineering, ETL development, or related work, or equivalent hands-on experience.
Experience or interest in scientific software, materials science, research environments, or technically complex domains is a plus.

Work Schedule

Typically 40 hours per week.
May require working weekends, holidays, or longer days to support projects.

Product of Position

Embed within a cross-functional Agile team, participating in sprint planning, stand-ups, backlog refinement, and technical discussions.
Design, build, troubleshoot, and maintain ETL/ELT workflows supporting application functionality, analytics, reporting, and scientific workflows.
Develop and manage data pipelines using Apache Airflow, ensuring reliable orchestration, scheduling, monitoring, and recovery of data processes.
Collaborate with software developers, scientists, and engineers to understand data sources, workflow requirements, and downstream data needs.
Extract, transform, validate, and load data across systems, including relational databases such as PostgreSQL and Oracle.
Write, optimize, and maintain complex SQL queries, scripts, and transformation logic for operational and analytical use cases.
Troubleshoot data quality issues, ETL failures, pipeline bottlenecks, and schema inconsistencies; identify root causes and implement durable solutions.
Support database exploration, data validation, and troubleshooting using DBeaver or similar database tools.
Evaluate and help adopt new data tools and technologies, including lightweight analytics and transformation solutions such as DuckDB.
Collaborate with engineering teams to support reliable integration between data pipelines, applications, APIs, and downstream consumers.
Assist with schema evolution, data modeling, migration planning, and data consistency across systems.
Document pipeline logic, data dependencies, transformation rules, and operational procedures to support maintainability and knowledge sharing.
Improve data engineering standards, observability, testing practices, and operational reliability across the team.
Regularly interact with scientists and engineers to understand research and technical workflows; experience in scientific or research environments is a plus.

Technical Skills: 2+ Years (or Equivalent Experience)

Experience designing, building, and troubleshooting ETL/ELT pipelines.
Hands-on experience with workflow orchestration tools, preferably Apache Airflow.
Strong SQL development and optimization skills.
Experience working with relational databases, especially PostgreSQL and Oracle.
Ability to develop and maintain data transformations, validation steps, and pipeline logic across multiple systems.
Experience with database tools such as DBeaver for query development, exploration, and troubleshooting.
Familiarity with modern data processing and analytical tools such as DuckDB, or interest in evaluating emerging data technologies.
Understanding of data modeling, schema design, data integrity, and performance tuning.
Experience troubleshooting pipeline failures, performance issues, and inconsistent or incomplete datasets.
Familiarity with scripting or programming for pipeline development and automation; Python experience strongly preferred.
Understanding of version control and collaborative development workflows.
Experience supporting production data systems with an emphasis on reliability, maintainability, and clear documentation.

Team Skills

Confident collaborating with developers, scientists, analysts, and product stakeholders.
Ability to gather and clarify technical and data requirements and translate them into scalable data solutions.
Strong communication skills regarding pipeline status, data quality issues, dependencies, and tradeoffs.
Comfortable handling ambiguity, improving incomplete processes, and helping define best practices.
Proactive in identifying opportunities to improve data workflows, tooling, performance, and operational stability.

Soft Skills

Strong analytical and problem-solving skills.
High attention to detail and commitment to data quality, consistency, and reliability.
Demonstrated initiative in troubleshooting issues and improving pipeline robustness.
Curiosity and willingness to evaluate and adopt new tools, technologies, and approaches.
Ability to balance immediate operational needs with long-term maintainability and scalability.
Comfortable proposing improvements, collaborating across teams, and building trust through reliable execution.