SRE Ops Engineer
Pay rate: $40-$45
Note: Only local candidates who can attend a face-to-face (F2F) interview will be considered.
Job Description:
• Support and enhance observability (monitoring, logging, alerting) across production systems
• Help maintain SLIs/SLOs for key services
• Participate in evaluating services for production readiness
• Collaborate with development teams to identify reliability risks and improve system architecture
• Contribute to automation of operations, including CI/CD pipelines, incident response, and infrastructure provisioning
• Participate in incident response and on-call rotations for critical services
• Contribute to post-incident analysis and drive reliability improvements
• Partner with security, infrastructure, and product teams to support performance, compliance, and operational excellence
Must-Haves
• Willingness to work onsite and participate in a 24/7 on-call rotation as needed
• 5+ years of experience managing and supporting high-traffic digital platforms
• Strong experience with CI/CD pipelines and deployment automation
• Experience with cloud platforms such as AWS and/or GCP
• Solid scripting skills (e.g., Python, Bash, Groovy)
• Hands-on experience with observability and monitoring tools such as Datadog, New Relic, AppDynamics, or similar
• Understanding of web, mobile, and OTT architectures
• Experience supporting large-scale websites, mobile and OTT applications, microservices, APIs, and distributed systems
• Experience with infrastructure-as-code tools such as Ansible, Terraform, or Chef
• Familiarity with performance testing tools such as JMeter or k6
• Hands-on experience with debugging tools such as Charles Proxy or Fiddler
Preferred Qualifications
• Experience working with CDNs (e.g., Akamai) and reverse proxies (e.g., NGINX, Varnish)
• Exposure to video streaming platforms
• Familiarity with application/infrastructure security controls and best practices
• Certifications in SRE, DevOps, or Performance Engineering are a plus
