Talkdesk is hiring a

Site Reliability Engineer

Together at Talkdesk, we’re building a future of brilliant customer interactions. We have the cleverness, curiosity, and grit to make us a team of customer heroes: thinkers, achievers, and dreamers who believe that our world-class SaaS platform can influence a new kind of customer interaction.

Talkdesk started from a hackathon win in 2011 and has since become one of the fastest-growing companies in the world and a Gartner Magic Quadrant Visionary, enabling billions of customer interactions on our platform. With $124.5 million in backing from DFJ, Salesforce Ventures, Storm Ventures, and Viking Global Investors, and supported by the successes of our 1,800+ customers, including IBM, Acxiom, Discovery Education, and Peloton, Talkdesk is disrupting a $40+ billion stagnant market.

We’re now looking for the new members of the Talkdesk family - those ambitious, driven, and collaborative individuals who thrive in a fast-paced environment and will push us to do even greater things together. If you would like to help us shape the future of Talkdesk, come along with us on our journey - your dream job is waiting!

 

We are looking for Site Reliability Engineers (SREs) who can help us design, build, and maintain high-performance, scalable, and reliable services. Because Talkdesk provides a contact center service, we play a critical role in our customers’ business operations and therefore need to provide a highly available and fault-tolerant service.

As an SRE at Talkdesk, you will build, run, and maintain components that serve as the infrastructure foundation for the rest of Talkdesk, with the goal of minimizing manual intervention while ensuring high availability and reliability of those components. You will also partner with other product engineering teams to help make their services more performant, scalable, observable, and reliable. We believe in a DevOps philosophy in which every engineering team at Talkdesk is responsible for the software it builds and deploys. SREs play a critical role in ensuring that teams have the tools, practices, and expertise to make that happen in a blame-free culture.

 
Responsibilities include:

  • Design, build, harden, and maintain the core infrastructure used by all of Talkdesk’s engineering teams
  • Automate every aspect of our infrastructure to remove as much human intervention as possible
  • Help keep existing base infrastructure running smoothly
  • Develop effective tooling, alerting, and response processes to identify and address reliability risks
  • Drive and promote protocols on production readiness and operational excellence
  • Participate in on-call rotation alongside other engineering teams (opt-in)
  • Partner with product engineering teams to debug production outages and carry out action items to improve reliability of those systems
  • Participate in design reviews and production reviews for new features, products, or pieces of infrastructure
  • Plan for growth of Talkdesk’s infrastructure


Skills and Qualifications: 

  • Understand the importance of observability and have good intuitions about what to measure and how
  • Know your way around a Linux/Unix system
  • Experience with Terraform and Packer
  • Ability to identify time-consuming and error-prone manual tasks and then build tooling to automate them
  • Understand large-scale complex systems from a reliability perspective
  • Ability to identify root causes of instability in a large-scale distributed system, across stacks
  • Hold yourself and others around you to high standards when working with production
  • Bring a developer mindset and apply it to infrastructure
  • You value simplicity


Nice to haves / Pluses:

  • Experience with cloud-based solutions such as Amazon Web Services (AWS), Google Cloud, or Microsoft Azure
  • Experience with technologies such as Docker, Consul, Vault, Jenkins, Concourse, Prometheus, Nexus
  • Experience with PaaS-like solutions such as Heroku, Kubernetes, Docker Swarm, Mesos, or OpenStack
  • Experience with messaging systems such as RabbitMQ or Kafka
  • Operational knowledge of various data stores such as MongoDB, Postgres, Redis, Cassandra, Elasticsearch
  • Experience with configuration management software such as Ansible or Chef
  • Experience with a programming language such as Ruby, Elixir/Erlang, Go, or any JVM-based language
  • Experience with designing and operating IP networks 

The Talkdesk story hinges on empathy and acceptance. It is the shared goal among all Talkdeskers to empower a new kind of customer hero through our innovative software solution, and we firmly believe that the best path to success for our mission is inclusivity, diversity, and genuine acceptance. To that end, we will hire, promote, work alongside, cheer for, bond with, and warmly welcome into the Talkdesk family all persons without regard to ethnic and racial identity, indigenous heritage, national origin, religion, gender, gender identity, gender expression, sexual orientation, age, disability, marital status, veteran status, genetic information, or any other legally protected status.

 
