Slack is looking for Staff Software Engineers in Infrastructure to build, design and operate distributed systems, and improve the performance and reliability of those systems as we rapidly scale our product and organization.
We're looking for people who are passionate about building the foundational components that power the core functionality of Slack so that product feature teams can invest more time and effort in our users and product features. These frameworks and services enable Slack to achieve best-in-class reliability and scalability while making the delivery of new product features simple and efficient.
Infrastructure at Slack
We operate at tremendous scale with systems that process millions of events per second. Our team maintains and builds the lower levels of our stack, including:
- Edge services
- Data stores and caches
- Real-time messaging
- Asynchronous background job processing
We know we’ve done our job correctly when none of our users think about us. We don’t typically ship new user-facing features, but rather ensure our systems are incredibly performant, highly available, reliable, and scalable. In other words, Slack just works seamlessly.
We are a small team making a large impact. We rapidly iterate and work closely with other teams in engineering to ensure we build resilient systems that can scale. We have a strong commitment to quality and understand that simplicity and reliability should be primary aspects of the systems that we build. We are ambitious, independent, and pragmatic.
A taste of our scale and reach:
- Users spend over 10 hours connected and 2+ hours active in Slack every work day.
- 10M+ Daily Active Users in more than 150 countries.
- 1.5 billion messages are sent per month, half of those outside the United States.
- Every day we see over 8 million simultaneously connected users, over 3.5 billion web requests, and over 42 billion database queries, and our caching tier handles over 1 million queries per second.
This is a full-time engineering position based in San Francisco, California.
What you will be doing
- You’ll design, build, ship and maintain the core infrastructure used by all of Slack’s engineering teams.
- You’ll collaborate with peers across Engineering to triage bugs and troubleshoot complex production issues across the stack, especially with respect to performance, reliability, and scale.
- You’ll actively own features or systems and define their long-term health, while also improving the health of surrounding systems.
- You’ll work on projects such as Flannel, Scaling Job Queue, Reducing Slack’s memory footprint, and International Data Residency, as well as scaling the Vitess data tier.
- You’ll write, review, and provide feedback on technical design proposals.
- You’ll define SLAs/SLOs for your services, manage code deployments, fixes, and software updates, and automate our operational processes as needed.
- You'll participate in the team’s on-call rotation and assist with triaging and addressing production issues.
- You’ll review code and get your code reviewed; mentor and be mentored by other engineers. Teamwork is what makes the dream work.
- You'll do the best work of your life, enjoy collaborating with your coworkers, and go home on time.
You should have
- Strong Computer Science fundamentals: data structures, algorithms, programming languages, operating systems, distributed systems, and information retrieval.
- A Bachelor's degree in Computer Science, Engineering or related field, or equivalent training, fellowship, or work experience.
- 7+ years of professional experience building large-scale systems, with work you can point to.
- Experience building reliable and safe distributed systems, and an understanding of the trade-offs made when engineering a feature.
- Experience with functional or imperative programming languages -- e.g., PHP, Python, Ruby, Go, C, or Java.
- Experience leading technical architecture discussions and a passion for driving technical decisions within your team.
- An ability to write code that can be easily understood by others with an eye towards clarity and maintainability.
- A strong dedication to code quality, automation and operational excellence: unit/integration tests, scripts, workflows.
- Strong communication skills. You’re excited to explain complex technical concepts and share your knowledge with different audiences.
- Curiosity about how things work, and the eagerness and ability to help fix things when they break.