Sr. SRO Engineer
Warner Bros. Discovery
Welcome to Warner Bros. Discovery… the stuff dreams are made of.
Who We Are…
When we say, “the stuff dreams are made of,” we’re not just referring to the world of wizards, dragons and superheroes, or even to the wonders of Planet Earth. Behind WBD’s vast portfolio of iconic content and beloved brands are the storytellers bringing our characters to life, the creators bringing them to your living rooms and the dreamers creating what’s next…
From brilliant creatives to technology trailblazers, across the globe, WBD offers career-defining opportunities, thoughtfully curated benefits, and the tools to explore and grow into your best selves. Here you are supported, here you are celebrated, here you can thrive.
Senior Site Reliability Operations Specialist
Job Summary
Site Reliability Operations (SRO) is a global team within the wider Site Reliability Engineering (SRE) department, which is responsible for global technology reliability and operations across Warner Bros. Discovery’s Streaming platforms.
SRO is the first point of contact for technical and operational issues within Warner Bros. Discovery’s Streaming division. The SRO teams are responsible for initiating the major incident process, triaging technical problems, conducting content QC validations and actioning mitigations where applicable, to provide our streaming customers with a best-in-class service.
The Senior Site Reliability Operations Specialist role supports an extensive suite of Live, VOD, OTT/Digital Playout via client applications. You will report to the Operations Manager and be part of a 24/7 team of other Specialists.
You will focus on the overall stability of our streaming systems, including both VOD and Live content on the TNT Sports, Eurosport, Discovery Plus, HBO Max, and CNN brands. Think Olympics, Premier League, Champions League, Tennis Grand Slams and Tour de France as a start. You will also learn about our Cloud infrastructure and help support our major VOD releases (The White Lotus, Euphoria, House of the Dragon) and market launches.
The position requires you to be on-site and is a shift-based role, requiring flexibility with working hours, as weekends and public holidays are some of our busiest times. For major events/launches, you may be required to work overnight. You will be working as part of a unified team, which consists of members from multiple disciplines. You'll be monitoring quality of service, quality of experience and non-linear output, as well as supporting digital OTT cloud infrastructure.
You will be expected to use your operational analytics and incident management skills to provide support for all issue areas, working with support teams/engineers for a swift resolution to any technical issues. You will also be expected to support and educate your team members about new processes, while independently keeping your technical know-how up to date.
A versatile personality and technical skillset, with the ability to stay focused and continually develop, are key to being successful in this role. Successful candidates need prior experience of leading in difficult situations, staying calm, reacting quickly under pressure, and executing the key responsibilities noted below.
Key Responsibilities
- Provide support during major product and content launch events.
- Recreate customer-facing issues across a variety of devices.
- Respond to automated alarms/alerts and carry out defined analysis and triage, and/or escalate to the relevant engineering teams within agreed SLAs.
- Analyse all client issues escalated by customer service teams, as well as internal production/content operation teams, being the primary point of contact.
- Ensure all issues are logged and tracked through to resolution, leveraging the ITIL principles of Incident, Problem, and Change management.
- Ensure detailed shift handover among colleagues.
- Support the Staff Specialists and Operations Managers in managing high-priority outages as needed, providing stakeholder communications at agreed intervals until the issue is resolved and the root cause is determined.
- Coordinate with vendors and 3rd party suppliers on relevant issue reporting/resolution, post-incident reports, onboardings, service reviews, and system migrations/deployments.
- Engage in regular internal stakeholder meetings (customer services, production/content operations, QA team, product owners and engineering teams).
- Perform sanity checks on back-end systems/front-end platforms after deployments and releases and provide feedback to engineering teams regularly.
- Update the department's support documentation promptly to reflect all changes (Knowledgebase, Run Books, etc.).
Qualifications
- Previous experience working in a digital technical operations or media operations environment.
- Experience in monitoring and supporting OTT platforms.
- Working experience in handling HD and SD video feed input.
- Working experience with CMS and EPG systems.
- Education – Bachelor's degree preferred in Digital Media, Information Systems, Computer Science, Business Administration, or a related field, or equivalent experience.
- Working knowledge of Confluence, MS Suite or similar computer software systems.
- Working experience in cloud environments, particularly AWS or Azure.
- Must be able to work independently and prioritise workload to complete tasks promptly.
- Able to work without supervision, combining initiative with discretion.
- Excellent written and verbal communication skills and a friendly disposition.
- Able to communicate technical matters to technical and non-technical audiences.
- Excellent interpersonal skills.
- PagerDuty, ServiceNow, Jira or similar Incident Management application experience.
- Understand and be able to work with monitoring systems and related technologies.
- Must be able to independently research, troubleshoot, and resolve trouble tickets within established Service Level Agreements.
How We Get Things Done…
This last bit is probably the most important! Here at WBD, our guiding principles are the core values by which we operate and are central to how we get things done. You can find them at www.wbd.com/guiding-principles/ along with some insights from the team on what they mean and how they show up in their day to day. We hope they resonate with you and look forward to discussing them during your interview.
Championing Inclusion at WBD
Warner Bros. Discovery embraces the opportunity to build a workforce that reflects a wide array of perspectives, backgrounds and experiences. Being an equal opportunity employer means that we take seriously our responsibility to consider qualified candidates on the basis of merit, regardless of sex, gender identity, ethnicity, age, sexual orientation, religion or belief, marital status, pregnancy, parenthood, disability or any other category protected by law. If you’re a qualified candidate with a disability and you require adjustments or accommodations during the job application and/or recruitment process, please visit our accessibility page for instructions to submit your request.