Jobgether

Lead Data Engineer with AI experience

India

Full-Time

Remote

TLDR

Hands-on role building scalable AI-enabled data pipelines, RAG and retrieval systems, and semantic layers powering enterprise AI applications.

Accountabilities:

Data Pipeline Engineering: Build, optimize, and maintain robust batch and streaming data pipelines using modern cloud-native tools such as Snowflake, PySpark, Delta Lake, and Kafka, ensuring reliability, scalability, and performance.
RAG & Retrieval Infrastructure: Design and implement end-to-end retrieval systems including embedding pipelines, vector databases, hybrid search, chunking strategies, and ranking mechanisms to optimize AI context relevance.
Semantic & Knowledge Layer Development: Develop ontologies, entity mappings, and knowledge graphs while maintaining semantic contracts, metadata systems, and lineage tracking for AI and ML use cases.
ML/LLMOps Enablement: Support ML and LLM lifecycle workflows including dataset curation, feature engineering, model evaluation, experiment tracking, and production monitoring.
Agentic Data Systems: Build APIs, context stores, and tool interfaces that enable autonomous agents, including observability for reasoning traces, tool calls, and contextual outputs.
Governance & Data Quality: Implement robust data governance frameworks including RBAC, PII handling, schema validation, data quality monitoring, and compliance-ready audit logging systems.

Requirements

This role requires a highly experienced data engineering professional with strong cloud, distributed systems, and AI infrastructure expertise. The ideal candidate combines deep technical execution with architectural thinking and hands-on experience building production-grade AI-enabled data systems.

7+ years of experience in data engineering with strong exposure to cloud-based data platforms.
2+ years of experience building production AI/ML or LLM-related data infrastructure at scale.
Strong expertise in Python, SQL, PySpark, Snowflake, Delta Lake, Kafka, and Spark Structured Streaming.
Hands-on experience with vector databases, embedding pipelines, and retrieval systems in production RAG environments.
Solid understanding of MLOps practices including MLflow, CI/CD for ML systems, and automated evaluation frameworks.
Strong knowledge of data governance, security, compliance, and data quality frameworks.
Experience working with cloud ecosystems such as AWS or Azure and containerized environments (Docker, Kubernetes).
Familiarity with AI/LLM tooling such as LangChain, LlamaIndex, OpenAI/Claude/Bedrock APIs, and FastAPI is a plus.
Strong problem-solving mindset with the ability to design scalable systems and operate in fast-moving AI environments.

Benefits

Competitive compensation package aligned with experience and market standards
Remote-friendly or hybrid work flexibility depending on team structure
Opportunity to work on cutting-edge AI, LLM, and agentic systems
Exposure to global engineering teams and enterprise-scale AI transformation projects
Health, insurance, and wellness benefits (as per policy and location)
Learning and development support for advanced AI and data engineering skills
Access to modern cloud-native and AI-first technology stacks
Collaborative, engineering-driven culture focused on innovation and impact

How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

Jobgether

Jobgether runs the largest remote job platform, effectively linking job seekers with over 200,000 flexible and remote opportunities that match their unique skills and preferences. Our focus is on enhancing the hiring process, ensuring efficiency while prioritizing the candidate experience, particularly in the growing health and wellness sector.

Founded: 2020
Employees: 11-50
Industry: Professional Services

This job is no longer available