Software Architect
TLDR
Lead architecture and development of a scalable, high-performance distributed storage platform for AI and HPC workloads, shaping metadata services, datapaths, and cloud-native infrastructure.
- Define and drive architectural direction for core VDURA storage technologies, including distributed metadata services, parallel file systems, object storage integration, cloud-native infrastructure, and high-performance datapaths.
- Lead the design of scalable, highly available, and performant distributed storage subsystems for AI and HPC workloads.
- Partner with engineering leadership to develop long-term platform strategy and technology roadmaps.
Distributed Storage Development
- Lead the implementation and evolution of major software subsystems within the VDURA Data Platform.
- Drive development efforts in high-performance C/C++ codebases with a strong emphasis on reliability, maintainability, scalability, and performance.
- Contribute hands-on development to critical architectural components and performance-sensitive paths.
- Help architect next-generation storage capabilities optimized for AI training, inference, vector databases, GPU acceleration, and large-scale data pipelines.
- Collaborate with internal teams and strategic partners to evaluate emerging technologies including RDMA, NVMe, GPUDirect, SPDK, intelligent tiering, and cloud-based AI infrastructure.
Performance and Scalability
- Analyze system behavior at scale and identify opportunities to improve throughput, latency, resiliency, and operational efficiency.
- Lead performance tuning initiatives across storage nodes, metadata services, networking layers, and client interfaces.
- Guide large-scale testing and validation strategies for enterprise and hyperscale deployments.
Technical Leadership and Mentorship
- Provide technical leadership across multiple engineering teams and projects.
- Conduct architecture and code reviews while promoting engineering excellence and disciplined software development practices.
- Mentor senior and junior engineers in distributed systems architecture, debugging, and performance optimization.
- Work closely with Product Management, QA, DevOps, Support, and Customer Engineering teams to ensure successful product delivery and customer outcomes.
- Participate in technical discussions with customers, technology partners, and open-source communities where appropriate.
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical discipline.
- 10+ years of experience developing enterprise-class distributed software systems.
- Deep expertise in C and C++ development within Linux and/or BSD environments.
- Strong understanding of distributed systems architecture, concurrency, networking, and storage technologies.
- Proven experience leading major software subsystems or platform-level architectural initiatives.
- Experience with distributed file systems, parallel file systems, object storage systems, or large-scale storage infrastructure.
- Strong debugging, performance analysis, and systems optimization skills.
- Experience with high-performance networking technologies such as RDMA, InfiniBand, RoCE, or GPUDirect is highly desirable.
- Familiarity with NVMe, SPDK, cloud-native architectures, Kubernetes, containers, or public cloud platforms is a plus.
- Experience using AI-based software development tools such as Claude, Cursor, GitHub Copilot, ChatGPT, Gemini, or similar technologies to improve engineering productivity and software quality.
- Excellent written and verbal communication skills.
- Ability to collaborate effectively across geographically distributed engineering teams.
- AI infrastructure and large-scale AI workload optimization.
- HPC storage environments and high-throughput data pipelines.
- Linux or BSD kernel-level development.
- Open-source community engagement and upstream contribution experience.
- Experience building highly available enterprise storage solutions.
- Cloud deployment and hybrid cloud storage architectures.
Benefits
Remote-Friendly
Hybrid work arrangements are preferred to support close collaboration with engineering and lab teams across VDURA’s global organization.
VDURA specializes in creating high-performance parallel file systems and distributed storage solutions tailored for the AI and high-performance computing markets. Our innovative software-defined storage platform addresses the extreme demands of data-intensive workloads, ensuring exceptional scalability, performance, and reliability for modern compute environments.