Jobgether

Senior Site Reliability Engineer (SRE)

TLDR

Maintain and improve the reliability of large-scale AI and cloud-native services, while advancing CI/CD, observability, and automation across cross-functional teams.

This position is listed on behalf of a partner company, who manages all applications and next steps. Our partner is looking for a Senior Site Reliability Engineer (SRE) based in Romania.

This role sits at the heart of large-scale cloud infrastructure, ensuring that highly distributed systems remain reliable, scalable, and performant under demanding production workloads. You will be responsible for maintaining and improving the stability of critical services that support modern AI and cloud-native platforms. The environment is fast-paced and engineering-driven, where automation, resilience, and operational excellence are core priorities.

You will work closely with software, infrastructure, and platform teams to design systems that can withstand high traffic and complex distributed workloads. The role combines hands-on engineering with strategic improvements to CI/CD pipelines, observability, and system reliability. You will contribute to shaping infrastructure that enables seamless deployment and operation of advanced cloud services at global scale.

Accountabilities:
  • Maintain high system availability by ensuring fault tolerance, monitoring, and rapid incident response across production services.
  • Design, implement, and optimize scalable infrastructure solutions using modern cloud-native technologies.
  • Improve and evolve CI/CD pipelines to enable safe, efficient, and automated software delivery.
  • Collaborate with engineering teams to troubleshoot complex system issues across compute, networking, and storage layers.
  • Apply infrastructure-as-code practices using tools such as Terraform, Ansible, or similar to manage and standardize environments.
  • Support containerized environments and orchestration platforms such as Docker, Kubernetes, and Helm.
  • Contribute to operational best practices, including observability, alerting, and performance tuning.
Requirements:

    • Strong programming skills in languages such as Go, Python, or C++, with a solid foundation in algorithms and data structures.
    • Deep understanding of Unix/Linux systems, networking fundamentals, and distributed system behavior.
    • Hands-on experience with containerization and orchestration tools such as Docker and Kubernetes.
    • Practical experience with infrastructure-as-code and configuration management tools (Terraform, Ansible, Salt, or similar).
    • Familiarity with CI/CD systems and modern DevOps practices.
    • Experience working with or supporting high-load distributed systems in production environments.
    • Strong problem-solving mindset with the ability to diagnose and resolve complex technical issues.
    • Excellent communication and collaboration skills in cross-functional engineering teams.
Benefits:

      • Competitive compensation package
      • Career growth and continuous learning opportunities
      • High degree of autonomy, flexibility, and ownership
      • Collaborative and innovation-focused engineering culture
      • Opportunity to work on large-scale, impactful cloud and AI infrastructure
      • International environment with highly skilled engineering teams
How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
 
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
 
 

Benefits

  • Flexible Work Hours: High degree of autonomy, flexibility, and ownership
  • Learning Budget: Career growth and continuous learning opportunities
  • International environment with highly skilled engineering teams
  • Remote-Friendly: High degree of autonomy, flexibility, and ownership

Jobgether runs the largest remote job platform, effectively linking job seekers with over 200,000 flexible and remote opportunities that match their unique skills and preferences. Our focus is on enhancing the hiring process, ensuring efficiency while prioritizing the candidate experience, particularly in the growing health and wellness sector.

Founded: 2020
Employees: 11-50
Industry: Professional Services