Darktrace is a global leader in AI for cybersecurity that keeps organizations ahead of the changing threat landscape every day. Founded in 2013, Darktrace provides the essential cybersecurity platform protecting nearly 10,000 organizations from unknown threats using its proprietary AI.
The Darktrace Active AI Security Platform™ delivers a proactive approach to cyber resilience to secure the business across the entire digital estate – from network to cloud to email. Breakthrough innovations from our R&D teams have resulted in over 200 patent applications filed. Darktrace’s platform and services are supported by over 2,400 employees around the world. To learn more, visit http://www.darktrace.com.
Job Description:
About the Role
We’re looking for a Site Reliability Engineer (SRE) to bring deep expertise in a key reliability domain and help shape the future of our platform reliability strategy.
SRE sits at the heart of our operational trifecta alongside Platform Engineering and DevSecOps. In this role, you’ll act as the go-to authority in your area of specialism, working across teams to embed best practices, solve complex reliability challenges, and improve system resilience at scale.
Unlike a generalist SRE, this role focuses on a core domain of expertise (such as observability, performance engineering, data infrastructure reliability, security-focused SRE, or network reliability) while influencing reliability standards across the wider engineering organisation.
Key Responsibilities
Domain Expertise & Strategy
- Act as the subject matter expert in your chosen reliability domain
- Define and implement standards, frameworks, and best practices across SRE, Platform Engineering, and DevSecOps
- Stay current with industry trends and bring innovative ideas into the organisation
Engineering & Delivery
- Design and implement solutions to complex, cross-cutting reliability challenges
- Build tooling, automation, and frameworks to improve system resilience and scalability
- Lead deep-dive investigations into systemic issues and drive long-term fixes
Collaboration & Platform Integration
- Partner with Platform Engineering to ensure your domain is embedded within the internal developer platform
- Collaborate with DevSecOps to integrate security, compliance, and resilience practices
- Contribute to cross-team initiatives that improve reliability across the stack
Incident & Operational Excellence
- Play a key role in incident response, particularly within your specialism
- Contribute to on-call rotations and continuous improvement of operational processes
- Develop runbooks, documentation, and training materials to support teams
What You’ll Bring
Essential
- Proven experience in Site Reliability Engineering, DevOps, or infrastructure engineering
- Deep expertise in at least one of the following areas:
  - Observability & monitoring (metrics, logging, distributed tracing)
  - Performance engineering & capacity planning
  - Data infrastructure reliability (databases, streaming, pipelines)
  - Security-focused SRE (hardening, compliance automation, secrets management)
  - Network reliability & traffic management
- Strong programming skills (e.g. Go, Python, or similar)
- Experience with cloud platforms (AWS, GCP, Azure) and Kubernetes
- Strong communication skills, with the ability to explain complex technical concepts clearly
- Self-driven, with the ability to identify and prioritise high-impact work independently
Desirable
- Experience building internal developer platforms or tooling
- Contributions to open-source, technical blogs, or public speaking
- Experience working in regulated environments
- Familiarity with SLO frameworks and error budget management
- Relevant certifications in your specialist domain
Success Measures
- Improved reliability and performance within your domain of specialism
- Adoption of best practices across SRE, Platform Engineering, and DevSecOps
- Reduction in incidents and faster resolution times
- Scalable, well-integrated solutions within the internal platform
- Strong collaboration across teams and measurable improvements in operational maturity
Why Join Us?
- Shape reliability strategy in a modern, cloud-native engineering environment
- Work on complex, high-impact systems at scale
- Collaborate with expert teams across Platform Engineering and DevSecOps
- Take ownership of a domain and drive meaningful, organisation-wide impact
Benefits:
- 23 days’ holiday + all public holidays, rising to 25 days after 2 years of service
- Additional day off for your birthday
- Private medical insurance which covers you, your cohabiting partner and children
- Life insurance of 4 times your base salary
- Salary sacrifice pension scheme
- Enhanced family leave
- Confidential Employee Assistance Program
- Cycle to work scheme
Darktrace is an Equal Opportunity Employer. We consider all qualified applicants for employment without regard to race, color, religion, sex (including pregnancy, childbirth, and related medical conditions), sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, veteran or military status, or any other characteristic protected by applicable federal, state, or local law.
Darktrace is committed to providing reasonable accommodations to qualified individuals with disabilities in accordance with applicable laws. If you require a reasonable accommodation to participate in the application or interview process, please contact your Talent Partner.