Technology and Data Engineer 5 Contingent

Irving, TX
Permanent

Principal Engineer (SRE/DevOps, in a leading and hands-on capacity)
Location: Irving, TX/Charlotte, NC/Minneapolis, MN


Key Responsibilities
Lead production support efforts across a portfolio of 20+ applications, ensuring stability, performance, and rapid issue resolution
Design and build advanced monitoring, alerting, and observability dashboards using tools such as Splunk, Grafana, AppDynamics, and Prometheus
Proactively identify risks through gap analysis, anomaly detection, and predictive alerting, preventing production incidents before they occur
Troubleshoot complex production issues across distributed microservices environments, reducing MTTR through deep technical expertise
Drive adoption of modern SRE practices, including automation, AIOps, and intelligent monitoring solutions
Support applications running on OpenShift and cloud-native platforms, with a focus on reliability and scalability
Collaborate closely with development teams during release cycles, providing production-readiness guidance and operational support
Participate in 24x7 on-call rotation, demonstrating urgency and ownership during incidents
Mentor and guide engineers, helping elevate team capabilities in SRE, DevOps, and platform engineering practices
Act as a trusted technical leader, able to quickly switch priorities and manage competing demands in a high-pressure environment

What We're Looking For
A genuine, hands-on engineer who can operate across multiple roles (SRE, DevOps, Production Support)
Strong ability to shift priorities quickly and respond with urgency in critical situations
Deep understanding of application support in cloud environments, especially OpenShift
Experience in the financial services industry strongly preferred
Prior development experience is a plus, particularly in Java-based ecosystems

Required Qualifications
10+ years of platform and production support experience
5 years of Red Hat Linux, OpenShift, Kubernetes, Java, microservices, Spring Boot, Python experience
5 years of observability dashboard creation experience - Grafana, Splunk, SPLOC, AppDynamics
5 years of observability alerting and incident handling experience - AIOps, ServiceNow, BigPanda, etc.
4 years of React.js, Apache, Kafka, relational databases experience
4 years of distributed systems, microservices architectures, and cloud native platforms experience

Job Type: Permanent

Job ID: 254861002