Technology and Data Software Engineer 4 Contingent

Irving, TX
Permanent
Location: Charlotte, NC (CIC)
Irving, TX
Chandler, AZ

- Contract: 18 months, possibility to extend or convert
- RTO schedule: 3 days in office MANDATORY; part of 24x7 on-call

Interview process: 2 step
90 minute technical virtual panel
30 minute virtual w/ RTM

INTAKE

Top Skills
Troubleshooting
Architecture
Automation
Observability
Scripting skills
Containerization, could be any cloud

Does platform automation for Consumer Lending Operations (CLO). About 210 apps in CLO, may be at 350 by end of the year. This person may support about 50; capacity is non-negotiable. CLO is building out new platforms and migrating their data, so this person should have Lead-level knowledge to be able to understand architecture. Will support both old and new apps. Will be assigned a portfolio that rotates. Adding to the team to come up to capacity on available workload. Will be part of the on-call coverage of the platform; could be called on at any time but will otherwise have a standard business-hours schedule.

Needs to be a hands-on, technical person. Any tech mentioned should be the starting point and not the ending point. Needs to understand capacity, where the bottleneck is, and how to solve for it.

Observability and automation are the key skills this person must be proficient in. Does the person have good learning skills and the ability to adapt? Skills are more important than years of experience.

If someone has only worked on one technology, like Splunk, it won't be worth sending along. Must be able to troubleshoot and understand architecture.

Splunk, PowerShell, Bash, and Python are all in the environment, with Elastic as a legacy technology. Can be strong in some of them, with the ability to learn the others. Needs to understand the fundamentals of production availability.

Has seen "90% of candidates" be able to "fool" technical recruiters in the past on experience they haven't had. Sees people use AI to pump up their resume and then not be able to speak to hands-on experience in the interview. Assurances of in-depth technical screening of the candidates, perhaps using assessments, will be helpful.

Seeking a senior engineer for L2/L3 application + middleware production support with an SRE mindset (shift from reactive to proactive reliability) across VM and container-adjacent/OpenShift (OCP) environments. Role owns incident response, problem management, and runbook-driven ops, and drives observability, automation/IaC, compliance guardrails, and CI/CD-integrated operational automation to reduce toil and improve stability/MTTR.

Core responsibilities: L2/L3 escalation + recovery; reliability signals & alert quality; blameless post-incident learning; logs/metrics/traces/dashboards + actionable alerting; IaC/config-as-code; standardized automation (status/start/stop/restart); intelligent automation/AI-assisted ops with guardrails; drift/compliance checks + remediation; CI/CD integration; runbooks & operational documentation.

About this role
Wells Fargo is seeking a Senior Systems Operations Engineer in technology as part of Consumer Lending Operations Technology. This role is focused on application and middleware production support with a Site Reliability Engineering (SRE) mindset, shifting from reactive operations to proactive reliability engineering through strong observability, automation, and continuous improvement. The position supports mission-critical platforms across VM-based and container-adjacent environments, including OpenShift (OCP), and partners closely with application, middleware, infrastructure, network, and security teams to improve stability, reduce toil, and strengthen operational readiness. This includes hands-on ownership of incident response, problem management, and runbook-driven operations, while building automation and standardized patterns that make platform operations repeatable, auditable, and resilient.

In this role, you will:
- Provide senior-level application and middleware support for complex, high-availability services; act as an escalation point for L2/L3 incidents; lead disciplined troubleshooting, recovery, and stabilization.
- Embed SRE practices into day-to-day operations: define reliability signals, improve alert quality, drive blameless post-incident learning, and prioritize systemic fixes and toil reduction.
- Implement and continuously improve observability across applications and middleware (logs, metrics, traces, dashboards, and actionable alerting) to improve detection, diagnosis, and MTTR.
- Design, develop, and maintain infrastructure-as-code and configuration-as-code capabilities supporting VM-based and container-adjacent workloads, including OpenShift (OCP) enablement.
- Build and support automation for operational actions across middleware components (standardized status checks, start/stop/restart patterns) to enable safer self-service and reduce dependency bottlenecks.
- Design and implement intelligent automation for platform and middleware operations, including integrating AI/agent-based approaches into workflows where appropriate (triage assistance, predictive signals, and automated remediation guardrails).
- Monitor configuration drift; support automated compliance checks; implement remediation patterns aligned to enterprise change management, security, and risk controls.
- Integrate infrastructure and operational automation with CI/CD pipelines to enable repeatable, auditable deployments and safer rollouts.
- Support core platform components that enable applications and container platforms, including ingress patterns, load balancing integration, and shared supporting services.
- Develop and maintain runbooks, operational documentation, and validation/testing approaches for automation and platform procedures to ensure operational readiness and consistent execution.

Required Qualifications
- 4+ years of Systems Engineering or Technology Infrastructure/Operations Engineering experience, or equivalent demonstrated through work experience, training, military experience, or education.

Desired Qualifications
- 4+ years of application and/or middleware production support in complex, high-availability environments, including incident response and problem management with strong root cause discipline.
- 4+ years of hands-on automation and configuration management experience (Ansible preferred or similar), plus strong scripting skills (Python, Bash, PowerShell, or similar).
- 4+ years of Linux administration (RHEL preferred) and/or Windows Server administration supporting enterprise production workloads.
- 4+ years of Git-based version control practices, including pull requests and peer review, with a focus on repeatability and code quality.
- Working experience with infrastructure-as-code concepts, including modular design and environment consistency.
- Experience supporting hybrid/private cloud platforms and container-adjacent hosting models; familiarity with OpenShift (OCP) or Kubernetes-based platforms.
- Experience implementing SRE operating practices (reliability metrics, reduction of manual toil, continuous improvement via post-incident learnings).
- Experience supporting common middleware platforms and shared services; ability to build automation patterns that standardize operational actions and reduce manual intervention.
- Familiarity with enterprise observability and operational support practices (service health dashboards, alert engineering, actionable telemetry).
- Exposure to responsible AI usage in operations (security, validation, accuracy, and appropriate guardrails for automation/agents).
- Strong cross-functional communication skills; experience operating in regulated environments.

Job Expectations
- Deliver assigned operational engineering and automation outcomes with a strong focus on stability, resiliency, and measurable toil reduction.
- Participate in on-call rotations and operational support coverage as required.
- Follow enterprise change management, risk, and compliance processes.
- Continuously improve platform reliability and automation maturity through standardization, documentation, and repeatable delivery.
- This position offers a hybrid work schedule.
- This position is not eligible for Visa sponsorship.
- Relocation assistance is not available for this position.
- Flexibility to work in a 24/7 environment, including weekends and holidays.
- Flexibility to frequently be on call beyond normal working hours.

EEO:

Mindlance is an Equal Opportunity Employer and does not discriminate in employment on the basis of Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans.

Job Type: Permanent

Job ID: 253862313