At Luna, our mission is to make quality, affordable vision care accessible for all.
Our diverse team is working to advance the vision industry with an ever-growing collection of innovative solutions designed to address the toughest challenges facing eyewear businesses and doctors. We offer suites of products and services that improve and increase vision prescription access, streamline online shopping with virtual try-on and facial measurements, and support fulfillment and delivery.
We partner with forward-looking eyewear retailers and brands, as well as eye doctors and optical shops, to accelerate their digital transformation with end-to-end solutions that revolutionize vision experiences online, offline, and everywhere in between. Our solutions are currently used by over 10 million users each month, all around the world.
Luna is looking for a Senior Site Reliability Engineer to work in our operations team to drive technical roadmaps across our entire infrastructure. You will work with various technologies and systems to build an automated, monitored, fault-tolerant, maintainable, and extensible infrastructure. In addition, you will collaborate with and mentor backend and front-end engineering teams to ensure that product solutions are scalable, efficient, and reliable.
What you’ll do
- Design, implement, scale, and maintain technology infrastructure to meet our rapidly increasing demand and support our global 24/7 products
- Manage our infrastructure with shell scripts, Python, Jenkins, Ansible, and Terraform/Terragrunt
- Support multiple development teams with an agile, responsive CI/CD platform to deliver high-quality builds with measurable performance and quality
- Participate in an on-call team rotation to mitigate disruption for any production systems and conduct root cause analysis
- Leverage your knowledge of systems and infrastructure to mentor our engineers
- Expand our existing monitoring and alerting systems to improve reliability and efficiency
- Help engineering teams identify Service Level Indicators (SLIs) that will help them meet objectives related to availability, reliability, and performance.
What you’ll need
- 11-12 years of relevant engineering experience
- 4-6 years in DevOps or SRE roles with a focus on automation, tooling, and infrastructure on a major cloud provider, preferably AWS
- Authoring and maintaining IaC with Terraform/Terragrunt
- Experience building Docker images and deploying containers in Kubernetes clusters
- General networking knowledge (DNS, firewall/security groups, load balancing, VPN, proxies, subnets, CIDR, DHCP)
- Experience navigating Linux and the shell, especially in Ubuntu and CentOS, including configuration, package management, startup and troubleshooting
- Solid experience with Bash/Shell
- Prior development experience with at least one of the following languages: Python, C++, Java, or Ruby
- Knowledge of automation within AWS (Amazon Web Services) infrastructure using the CLI and SDKs, including provisioning and configuring cloud resources through the CLI/API
- Experience building, configuring and monitoring CI/CD pipelines
- Blue-Green, Canary and A/B testing deployment strategies
- Systematic, methodical problem-solving approach coupled with outstanding communication skills and a strong sense of ownership and drive.
- Strong skills around observability, troubleshooting and performance solutioning
- Knowledge of setup and provisioning for metrics monitoring tools like Prometheus and Grafana, including alerting and silences
- Systems-level design and troubleshooting: edge cases, failure modes, behaviors, specific implementations
- Strong and effective collaborator and communicator with an ability to perform asynchronously
- Excellent documentation skills: an urge to document everything in accurate detail so that anyone else on the team can use it independently
- Knowledge of best practices when it comes to the observability and monitoring required to run Platform services at scale
- Understanding of Incident Management and ITIL service operations, including root cause analysis and subsequent documentation
Nice to have:
- Self-sufficient and comfortable working by yourself when needed, but love working with the team
- Shows ownership of a major part of the infrastructure
- Strong understanding of monitoring implementations and administration
- Experience working in an ISO 27001 or HIPAA certified environment preferred
- Knowledge of best practices related to security, performance, and disaster recovery
- Highly skilled in identifying performance bottlenecks, identifying anomalous system behavior, and determining the root cause of incidents
- An urge to deliver quickly and effectively, and to iterate fast
- Bachelor’s or higher degree, or the equivalent, in Computer Science or a related field. (Luna recognizes that knowledge and skills equivalent to those earned in a degree program can also be achieved via nontraditional paths and welcomes applicants with nontraditional training.)
We are bright minds creating innovative solutions to real problems in the vision space. We are passionate people who bring big energy to big problems and aren’t afraid to do things differently or take a new approach to old challenges. We inspire and support each other in our mission to make quality, affordable vision care accessible for all.
Our diverse team includes engineers, eye doctors, entrepreneurs, creatives, and researchers who come together from all around the world to revolutionize the vision industry with smart, well-designed solutions. Our distributed workforce philosophy allows us to find the best fit for our roles based on talent, not on location.
Perks & company culture:
We offer all the perks you should expect from a great tech company: competitive salary, meaningful equity upside, medical and dental insurance, unlimited PTO, and a 401(k) with a company match. We care a lot about doing the little things well and work hard to build a strong team culture. We organize online team events and activities to bring our remote teams together and find creative ways to build our culture with team members around the world.
Inclusion and listening are at the heart of our brand culture. We celebrate different backgrounds, experiences, and perspectives and encourage everyone to be their authentic self at work. We welcome everyone regardless of race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, disability, veteran status, or genetics. We dedicate ourselves to providing an inclusive, open, and diverse work environment. For these reasons and more, Luna is a proud equal employment opportunity employer.
*For remote applicants to be considered, you must live in one of the following approved states: Arizona, California, Colorado, Florida, Georgia, Idaho, Montana, Nevada, New York, North Carolina, Ohio, Oregon, Texas, Utah, Virginia, Washington, and Wisconsin.
It is anticipated this position will have a salary range of $130,000 to $180,000, depending on the candidate’s qualifications for the role. The successful candidate may also be eligible to enroll in several benefits, including medical, dental, vision, 401(k), and others, provided the work schedule meets the minimum Company requirements.