What makes Cognizant a unique place to work? The combination of rapid growth and an international and innovative environment! This is creating many opportunities for people like YOU — people with an entrepreneurial spirit who want to make a difference in this world.
At Cognizant, together with your colleagues from all around the world, you will collaborate on creating solutions for the world's leading companies and help them become more flexible, more innovative, and successful. Moreover, this is your chance to be part of the success story.
Position Summary:
1. Hands-on experience in RCA, observability, and metrics
2. Expertise and hands-on experience in AWS, Datadog, and building CloudWatch dashboards is a plus
3. Ability to collaborate with various engineering teams, identify improvement opportunities, and support building reliable systems
4. Act as a quality gate for changes from an SRE perspective
5. Implement reliable monitors and refine existing monitors
6. Implement automation and reduce manual tasks
7. Reduce alert fatigue and help improve the current system
8. Onboard applications and define business dashboards
9. Automotive industry exposure is a plus
10. Support active monitoring of apps during North America business hours
11. AWS Solutions Architect and K8s Administrator certifications are a plus
Position Summary:
•Technical Lead (SRE) responsible for improving the reliability, availability, and operational stability of business-critical applications hosted in cloud environments. The role will drive observability, incident prevention, root cause analysis, automation, and service readiness by partnering closely with engineering, platform, and support teams. The engineer will also act as a quality gate for production changes, refine monitoring coverage, reduce alert fatigue, and support active application monitoring aligned to North America business hours.
Mandatory Skills:
•SRE practitioner, AWS certified practitioner, Datadog.
Duties and Responsibilities:
•Monitor application and platform health using observability tools and implement proactive alerting to identify potential incidents before business impact occurs.
•Perform detailed root cause analysis for production incidents and drive permanent corrective and preventive actions.
•Design, build, and refine dashboards, metrics, traces, and logs using Datadog, AWS CloudWatch, and related monitoring platforms.
•Act as an SRE quality gate for application and infrastructure changes by validating reliability, resilience, monitoring readiness, and rollback preparedness.
•Collaborate with engineering, DevOps, and support teams to identify service improvement opportunities and implement reliability best practices.
•Develop and maintain automation to eliminate repetitive operational tasks, improve response efficiency, and reduce manual intervention.
•Reduce alert noise and alert fatigue by tuning thresholds, improving monitor logic, and prioritising actionable notifications.
•Support onboarding of applications into the SRE operating model and define business-focused dashboards and service health views.
•Provide active monitoring and production support coverage for applications during North America business hours and participate in incident response as required.
Qualifications & Certifications (Optional):
•Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related discipline.
•Proven experience in Site Reliability Engineering, Production Support, DevOps, or Cloud Operations roles supporting critical enterprise applications.
•Hands-on experience with AWS services, Datadog, CloudWatch dashboards, observability, incident management, and monitoring strategy.
•Strong understanding of metrics, logs, traces, alerting models, RCA practices, and service reliability principles.
•Working knowledge of automation and scripting using tools or languages such as Python, Shell scripting, or similar technologies.
•Experience working with Kubernetes or container platforms and CI/CD environments is highly desirable.
•Strong collaboration, communication, and stakeholder management skills, with the ability to work effectively across distributed global teams.
•Preferred certifications include AWS Certified Solutions Architect, AWS Certified SysOps Administrator, or Certified Kubernetes Administrator (CKA).
Salary Range: >$100,000
Date of Posting: 12-May-26
Next Steps: If you feel this opportunity suits you, or Cognizant is the type of organization you would like to join, we want to have a conversation with you! Please apply directly with us.
For a complete list of open opportunities with Cognizant, visit http://www.cognizant.com/careers. Cognizant is committed to providing Equal Employment Opportunities. Successful candidates will be required to undergo a background check.
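The Cognizant Community:
We are a group of talented people who respect and support one another. We value an energetic, collaborative, and inclusive workplace where every employee can grow and thrive.
- Cognizant is a global community with more than 300,000 associates around the world.
- We don't just dream of a better way; we make it happen.
- We fulfill our responsibilities to our people, clients, company, communities, and the environment by always doing what is right.
- We provide an innovative environment where you can build the career path that is right for you.
About Us:
Cognizant (NASDAQ: CTSH) is an AI builder and technology services provider that delivers full-stack AI solutions connecting AI investment to enterprise value. Drawing on deep industry, business process, and engineering expertise, we embed each company's unique context into its technology systems, bring out the full potential of its people, and help global enterprises deliver tangible outcomes and stay competitive in a rapidly changing world. For details, visit our website at www.cognizant.com.
Cognizant is an equal opportunity employer. Applicants and candidates will not be discriminated against on the basis of race, color, sex, religion, creed, sexual orientation, gender identity, national origin, disability, genetic information, pregnancy, veteran status, or any other characteristic protected by federal, state, or local law.
Disclaimer:
Applicants may be required to attend interviews in person or by video conference. In addition, applicants may be required to present their current address or a government-issued ID at each interview.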