Site Reliability Engineer
TLDR
Designs and maintains highly available, secure cloud platforms with automation, observability, and deployment reliability across distributed systems.
- Design, implement, and maintain observability solutions to ensure high availability, performance, and reliability across cloud-based systems
- Participate in on-call rotations, incident response, and postmortem analysis to drive continuous operational improvements
- Collaborate with product and engineering teams to design and deploy scalable, resilient, and secure infrastructure solutions
- Develop and enforce cloud architecture standards, reliability practices, and automation strategies for large-scale systems
- Build and maintain infrastructure automation using Infrastructure-as-Code tools such as Terraform, ARM, Bicep, or CloudFormation
- Implement CI/CD and deployment automation workflows using modern DevOps toolchains and source control systems
- Integrate and automate monitoring and operational tools such as Dynatrace, Datadog, App Insights, and similar observability platforms
- Develop scripting and automation solutions using Python, PowerShell, Bash, or REST APIs to improve operational efficiency
- Maintain technical documentation, operational runbooks, and knowledge base content to support engineering and support teams
- Collaborate on security and compliance requirements including SOC, FedRAMP, and cloud security best practices
Requirements:
- 6+ years of experience in Site Reliability Engineering, cloud infrastructure, or software engineering roles
- Strong hands-on experience with Kubernetes-based environments such as AKS, EKS, GKE, or OpenShift
- Deep knowledge of cloud platforms including Microsoft Azure, AWS, or Google Cloud Platform
- Proven experience implementing Infrastructure-as-Code using tools such as Terraform, ARM templates, Bicep, or CloudFormation
- Strong expertise in observability and monitoring tools such as Dynatrace, Datadog, New Relic, Prometheus, Grafana, or Log Analytics
- Solid scripting and automation skills using Python, PowerShell, Bash, or similar languages
- Strong understanding of CI/CD pipelines, Git-based workflows, and DevOps practices
- Experience with configuration management tools such as Ansible, Chef, Puppet, or similar
- Familiarity with distributed systems, containerized applications, and cloud-native architectures
- Ability to work independently in ambiguous environments while managing multiple priorities effectively
- Strong communication skills with the ability to collaborate across engineering, product, and operations teams
- Experience working in Agile environments using Jira or Azure DevOps Boards
- Knowledge of compliance frameworks such as SOC or FedRAMP is a strong advantage
Benefits:
- Competitive base salary ranging from USD 114,000 to 148,000 depending on experience and location
- Comprehensive health coverage including medical, dental, vision, and life insurance
- Retirement savings plan (401K) with employer support
- Short-term and long-term disability coverage
- Paid vacation time and paid holidays
- Professional development and training opportunities
- Remote work flexibility within the United States
- Exposure to large-scale cloud environments and modern DevOps practices
- Opportunity to work on high-impact production systems with strong engineering ownership
Jobgether runs the largest remote job platform, effectively linking job seekers with over 200,000 flexible and remote opportunities that match their unique skills and preferences. Our focus is on enhancing the hiring process, ensuring efficiency while prioritizing the candidate experience, particularly in the growing health and wellness sector.
- Founded: 2020
- Employees: 11-50
- Industry: Professional Services