Senior Site Reliability Engineer (Automation & Observability)
About the role
As a Senior Site Reliability Engineer, you will make an impact by driving automation, improving system reliability, and enabling intelligent, self-healing operations across critical services. You will be a valued member of the Engineering / Production Support team and work collaboratively with DevOps, platform engineering, and application teams to enhance performance and operational excellence.
In this role, you will:
- Design and implement automation solutions to eliminate repetitive manual support and operational tasks
- Define and manage Service Level Objectives (SLOs) and apply error budget principles to guide reliability and release decisions
- Build and enhance observability frameworks, including dashboards, monitoring, and alerting systems
- Develop runbooks and convert them into automated remediation workflows and self-service capabilities
- Implement self-healing solutions and optimize alerting to reduce noise and improve incident response efficiency
Work model
We strive to provide flexibility wherever possible. Based on this role’s business requirements, this is a hybrid position requiring 3 days per week in a client or Cognizant office in Pittsburgh, PA. Regardless of your working arrangement, we are here to support a healthy work-life balance through our various wellbeing programs. The working arrangements for this role are accurate as of the date of posting. They may change based on the project you’re engaged in, as well as business and client requirements. Rest assured, we will always be clear about role expectations.
*You must be legally authorized to work in the USA without the need for employer sponsorship, now or at any time in the future*
What you need to have to be considered:
- Proven experience in Production Support, Site Reliability Engineering (SRE), or DevOps environments
- Strong programming or scripting skills (Python, Java, or similar)
- Hands-on experience with automation, monitoring, and observability tools
- Solid understanding of SLOs, SLIs, error budgets, and reliability engineering principles
- Demonstrated ability to troubleshoot complex systems and drive root cause analysis and resolution
These will help you stand out:
- Experience implementing self-healing systems and intelligent automation (AIOps)
- Familiarity with alerting and event management tools such as Moogsoft or similar platforms
- Experience improving batch processing reliability and recovery patterns
- Track record of reducing incident volume through automation and permanent fixes
- Exposure to large-scale distributed systems and cloud-native environments
Salary and Other Compensation:
Applications will be accepted until July 30th, 2026.
The annual salary for this position is between $63,000 and $115,000, depending on the experience and other qualifications of the successful candidate.
This position is also eligible for Cognizant’s discretionary annual incentive program, based on performance and subject to the terms of Cognizant’s applicable plans.
Benefits: Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:
- Medical/Dental/Vision/Life Insurance
- Paid holidays plus Paid Time Off
- 401(k) plan and contributions
- Long-term/Short-term Disability
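About Cognizant
Cognizant (NASDAQ: CTSH) is an AI builder and technology services provider that bridges AI investment and enterprise value by building full-stack AI solutions for its clients. Drawing on our deep industry, business process, and engineering expertise, we build an organization's unique business context into its technology systems. This amplifies human potential, delivers tangible results, and helps global enterprises stay a step ahead in a fast-changing world. To learn more, visit cognizant.ai.
Additional employment information
The compensation information in this posting is accurate as of the date of posting. Cognizant reserves the right to modify this information at any time, subject to applicable law.
Applicants may be required to attend interviews in person or by video conference. In addition, applicants may be required to present a current state or government-issued ID at each interview.
Cognizant is an equal opportunity employer. In application and selection, we do not discriminate on the basis of race, color, sex, religion, creed, sexual orientation, gender identity, national origin, disability, genetic information, pregnancy, veteran status, or any other characteristic protected by federal, state, or local law.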







