Senior Data Engineer with Databricks PySpark

New York, NY
Permanent
Location: NJ, hybrid (3 days a week). The final-round interview is in person.
RESPONSIBILITIES
" Build large-scale batch and real-time data pipelines with data processing frameworks in Azure cloud platform.
" Designing and implementing highly performant data ingestion pipelines from multiple sources using Azure Databricks.
" Direct experience of building data pipelines using Azure Data Factory and Databricks.
" Developing scalable and re-usable frameworks for ingesting of datasets
" Lead design of ETL, data integration and data migration.
" Partner with architects, engineers, information analysts, business, and technology stakeholders for developing and deploying enterprise grade platforms that enable data-driven solutions.
" Integrating the end to end data pipeline - to take data from source systems to target data repositories ensuring the quality and consistency of data is maintained at all times
" Working with event based / streaming technologies to ingest and process data
" Working with other members of the project team to support delivery of additional project components (API interfaces, Search)
" Evaluating the performance and applicability of multiple tools against customer requirements.
" Utilize version control systems such as GitHub for managing code, collaboration, and maintaining repository integrity.
" Implement and maintain materialized views, streaming pipelines, and API endpoints for data access and integration.

REQUIREMENTS
" Experience on Python scripting, Spark SQL PySpark is a must
" Experience on ADLS, Azure Databricks, Azure SQL DB, EventHub, Kafka, and Datawarehouse
" Strong working experience in Implementation of Azure cloud components using Azure Data Factory , Azure Data Analytics, Azure Data Lake, Azure Data Catalogue, LogicApps and FunctionApps
" Have knowledge in Azure Storage services (ADLS, Storage Accounts)
" Expertise in designing and deploying data applications on cloud solutions on Azure
" Hands on experience in performance tuning and optimizing code running in Databricks environment
" Good understanding of SQL, T-SQL and/or PL/SQL
" Should have experience working in Agile projects with knowledge in Jira
" Good to have handled Data Ingestion projects in Azure environment
" Demonstrated analytical and problem-solving skills particularly those that apply to a big data environment
" Strong expertise in applying narrow and wide transformations to build efficient and scalable data pipelines in Databricks.

Job Type: Permanent

Job ID: 254323724