Software Engineer, Infrastructure
TLDR
Design and optimize scalable inference infrastructure powering enterprise AI models, aligning latency, throughput, and deployment from research to production.
Scaled Cognition is the world’s only model lab dedicated exclusively to customer experience, pioneering agentic models purpose-built for reliable, action-taking enterprise applications. The company is backed by Khosla Ventures, and its flagship Agentic Pretrained Transformer (APT) eliminates hallucinations, enforces enterprise policies, and increases reliability in real-world CX workflows. Founded by serial AI entrepreneurs Dan Roth, former Microsoft Corporate Vice President of Conversational AI, and Dan Klein, UC Berkeley AI Professor, and built by a team of world-class PhD researchers and engineers, Scaled Cognition advances the science of agentic AI to deliver safe, policy-aligned automation that enterprises can trust.
As a Software Engineer, Infrastructure at Scaled Cognition you will:
Design and improve inference infrastructure that powers our AI models.
Benchmark, profile, monitor, and analyze latency and throughput, and drive improvements throughout the stack.
Collaborate with research scientists and product engineers to streamline the end-to-end process from model development to production deployment.
You might be the right person for the job if you:
Are a continuous learner and are eager to explore new tools and technologies.
Thrive in a dynamic environment and are comfortable adapting to changing priorities while maintaining a focus on delivering high-quality solutions.
Have successfully navigated projects with significant product and technical ambiguity, and excel at the intersection of complex technical challenges and user-focused solutions.
Key Qualifications:
Experience deploying systems on major cloud platforms (AWS, GCP, Azure).
Prior experience designing and implementing GPU infrastructure / tooling.
A strong sense for scalability and for developing secure, highly reliable environments.
Scaled Cognition is an innovative model lab focused solely on enhancing customer experience through agentic AI models that support reliable enterprise actions. Its flagship product, the Agentic Pretrained Transformer (APT), reduces errors and aligns with enterprise policies, ensuring dependable performance in real-world applications.
Industry: Internet Software & Services