Jobgether

DevOps/Observability Engineer

Canada

Full-Time

Remote

TLDR

Design and scale a next-generation observability platform for complex, distributed systems, unifying metrics, logs, and traces with OpenTelemetry, Prometheus, Grafana, and Splunk.

Accountabilities:

Design and implement end-to-end observability architectures using OpenTelemetry, Prometheus, Grafana, and related tools across cloud environments.
Build and maintain centralized observability pipelines across multi-account AWS environments, including CloudWatch, CloudTrail, and VPC Flow Logs.
Develop scalable log aggregation and routing strategies, including filtering, noise reduction, and integration with systems such as Splunk HEC.
Create advanced alerting frameworks and high-quality dashboards using Alertmanager, CloudWatch Alarms, and Grafana with PromQL.
Deploy and manage observability infrastructure using Infrastructure as Code tools such as Terraform.
Support Kubernetes and container-based observability across EKS and ECS environments.
Optimize observability systems for performance, cost efficiency, and scalability in large-scale production environments.
Collaborate with engineering teams to improve system reliability, monitoring standards, and incident response capabilities.

Requirements:

8+ years of experience in DevOps, Site Reliability Engineering, or Observability Engineering roles.
Strong hands-on experience designing unified observability pipelines using OpenTelemetry, Prometheus, and Grafana.
Deep expertise in AWS observability services including CloudWatch, CloudTrail, and cross-account telemetry strategies.
Proven ability to build and manage large-scale log aggregation systems and optimize high-volume data pipelines.
Strong experience with Kubernetes (EKS) or containerized environments (ECS) in production settings.
Advanced proficiency with Terraform or other Infrastructure as Code tools for infrastructure and observability deployments.
Experience building alerting systems, dashboards, and monitoring frameworks for distributed systems.
Strong understanding of cost optimization strategies for observability platforms (log filtering, metric reduction, storage tiering).
Excellent problem-solving, debugging, and collaboration skills in complex cloud-native environments.

Benefits:

Competitive compensation aligned with experience and market benchmarks.
Remote work flexibility within Canada.
Opportunity to work on large-scale, AI-driven, cloud-native infrastructure systems.
Exposure to enterprise clients and high-impact digital transformation projects.
Hands-on experience with leading observability and cloud technologies in production environments.
Strong learning and upskilling culture in AI, cloud, and platform engineering.
Collaborative, high-performance engineering environment focused on innovation and reliability.
Opportunity to shape next-generation observability practices at scale.

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

Jobgether

Jobgether runs the largest remote job platform, effectively linking job seekers with over 200,000 flexible and remote opportunities that match their unique skills and preferences. Our focus is on enhancing the hiring process, ensuring efficiency while prioritizing the candidate experience, particularly in the growing health and wellness sector.

Founded: 2020
Employees: 11-50
Industry: Professional Services
