Data Engineer
Pay: $40 hourly
Job Description: Education and Experience
This position focuses on Data Pipelines & Workflows
-
Bachelor's degree in computer science, information systems, data engineering, or related field, or equivalent practical experience. An Associate's degree may be considered if the candidate has an additional 3 to 5 years of relevant experience beyond the stated requirement.
-
2+ years of professional experience in data engineering, ETL development, or related work, or equivalent hands-on experience.
-
Experience or interest in scientific software, materials science, research environments, or technically complex domains is a plus.
-
Typically 40 hours per week.
-
May require working weekends, holidays, or longer days to support projects.
-
Embed within a cross-functional Agile team, participating in sprint planning, stand-ups, backlog refinement, and technical discussions.
-
Design, build, troubleshoot, and maintain ETL/ELT workflows supporting application functionality, analytics, reporting, and scientific workflows.
-
Develop and manage data pipelines using Apache Airflow, ensuring reliable orchestration, scheduling, monitoring, and recovery of data processes.
-
Collaborate with software developers, scientists, and engineers to understand data sources, workflow requirements, and downstream data needs.
-
Extract, transform, validate, and load data across systems, including relational databases such as PostgreSQL and Oracle.
-
Write, optimize, and maintain complex SQL queries, scripts, and transformation logic for operational and analytical use cases.
-
Troubleshoot data quality issues, ETL failures, pipeline bottlenecks, and schema inconsistencies; identify root causes and implement durable solutions.
-
Support database exploration, data validation, and troubleshooting using DBeaver or similar database tools.
-
Evaluate and help adopt new data tools and technologies, including lightweight analytics and transformation solutions such as DuckDB.
-
Collaborate with engineering teams to support reliable integration between data pipelines, applications, APIs, and downstream consumers.
-
Assist with schema evolution, data modeling, migration planning, and data consistency across systems.
-
Document pipeline logic, data dependencies, transformation rules, and operational procedures to support maintainability and knowledge sharing.
-
Improve data engineering standards, observability, testing practices, and operational reliability across the team.
-
Regularly interact with scientists and engineers to understand research and technical workflows; experience in scientific or research environments is a plus.
-
Experience designing, building, and troubleshooting ETL/ELT pipelines.
-
Hands-on experience with workflow orchestration tools, preferably Apache Airflow.
-
Strong SQL development and optimization skills.
-
Experience working with relational databases, especially PostgreSQL and Oracle.
-
Ability to develop and maintain data transformations, validation steps, and pipeline logic across multiple systems.
-
Experience with database tools such as DBeaver for query development, exploration, and troubleshooting.
-
Familiarity with modern data processing and analytical tools such as DuckDB, or interest in evaluating emerging data technologies.
-
Understanding of data modeling, schema design, data integrity, and performance tuning.
-
Experience troubleshooting pipeline failures, performance issues, and inconsistent or incomplete datasets.
-
Familiarity with scripting or programming for pipeline development and automation; Python experience strongly preferred.
-
Understanding of version control and collaborative development workflows.
-
Experience supporting production data systems with an emphasis on reliability, maintainability, and clear documentation.
-
Confident collaborating with developers, scientists, analysts, and product stakeholders.
-
Ability to gather and clarify technical and data requirements and translate them into scalable data solutions.
-
Strong communication skills regarding pipeline status, data quality issues, dependencies, and tradeoffs.
-
Comfortable handling ambiguity, improving incomplete processes, and helping define best practices.
-
Proactive in identifying opportunities to improve data workflows, tooling, performance, and operational stability.
-
Strong analytical and problem-solving skills.
-
High attention to detail and commitment to data quality, consistency, and reliability.
-
Demonstrated initiative in troubleshooting issues and improving pipeline robustness.
-
Curiosity and willingness to evaluate and adopt new tools, technologies, and approaches.
-
Ability to balance immediate operational needs with long-term maintainability and scalability.
-
Comfortable proposing improvements, collaborating across teams, and building trust through reliable execution.
