Netflix is reinventing how the world watches television. Over 125 million members around the globe enjoy our award-winning movies and TV shows. For the Encoding Technologies team, job #1 is ensuring that those moments of joy are never disrupted by audio or video artifacts. We obsess over producing the highest possible quality audio and video encodes for our customers so that they always have a great experience on Netflix, whether they are at home watching on a new TV or using a mobile device in an area with low bandwidth. We don't just produce the highest quality audio and video in the world; we do it at very high scale and at very low cost.
As a Senior DevOps Specialist on the Encoding Technologies team, you will help drive operational excellence for our complex encoding system by re-imagining how we automate and build tools to lower operational barriers, improve clarity on problematic areas, and improve the reliability of the platform.
Specifically, you will:
- Develop effective tooling, dashboards, alerts, and response to identify and address reliability risks.
- Build tools and automation to reduce operational tasks, improve automatic issue identification and routing, and predict platform performance in accordance with SLAs based on overall platform health and progress.
- Participate in the on-call rotation to manage incidents and handle unknown or new issues.
- Drive issue resolution and root cause identification with the various data infrastructure teams.
- Evangelize best practices around collaboration and reliability to all encoding teams.
What we are looking for:
- Working experience with audio and video encoding, preferably using ffmpeg.
- High-level understanding of audio and video codecs, and related technologies, sufficient to support complex encoding workflows.
- Effective at root cause identification, triage, and mitigation.
- Software development experience in automation, tools, or dashboards.
- Experience in a high-level programming language such as Java or Python.
- Understanding of large-scale complex systems from a reliability perspective.
- Strong communication skills and the ability to engage partner teams effectively to drive issues to resolution.
- Strong automation mindset and a passion for identifying strategies to mitigate issues going forward.
- Experience with cloud computing platforms (particularly AWS) is a plus.