Overview
Elastic, the Search AI Company, enables real-time answers across data at scale. The Elastic Search AI Platform combines search precision with AI to help organizations deliver on the promise of AI.
What Is the Role
The Search Conversational Experiences team builds Elastic’s conversational platform that lets customers chat with their own data in Elasticsearch. We own the quality layer for RAG, agents and tools, retrieval/citations, streaming, memory, and evaluation signals that turn open-ended questions into grounded, reliable answers. As a Senior Data Scientist, you’ll be part of a cross-functional team (backend, DS, PM, UX) driving chat quality end-to-end: designing and running evaluation pipelines, improving prompts and tool behaviors, and turning measurements into product decisions that customers can feel.
You’ll tackle frontier problems: folding RAG and vector search into an agent’s knowledge base, dynamically enriching model context to boost groundedness, shaping agent routing and tool selection policies, enabling agent-driven visualizations on top of Elasticsearch data, and exploring multimodality and reasoning strategies where they truly move the needle. This is an applied role: you will prototype, evaluate, and partner with engineers to ship.
What You Will Be Doing
- Design and maintain offline/online evaluation pipelines for conversational search: golden sets, rubric/LLM-as-judge calibration, groundedness/citation checks, and A/B tests.
- Build and compare retrieval & re-ranking baselines (sparse + dense), query understanding, and semantic rewrites; implement improvements with clear metrics.
- Use results to drive product decisions: model selection, efficient agent routing, tool gating, and agent customization for Elastic use cases in search and beyond.
- Instrument dashboards and telemetry so helpfulness, faithfulness, latency, and cost trade-offs are visible and trustworthy; guard against regressions in CI.
- Collaborate with backend engineers on contracts (ES|QL, citations, telemetry), and with PM/UX to translate findings into shipped features.
- Share outcomes clearly (docs, notebooks, PRs) and mentor peers in experiment design and evaluation craft.
What You Will Bring
- 5–8 years in applied DS/ML with strong IR/NLP experience (RAG, dense/sparse retrieval, re-ranking, vector search).
- Proficiency in Python, PyTorch/Transformers, Pandas; reproducible experiments (e.g., MLflow), versioned datasets, and clean, reviewable code.
- Hands-on evaluation expertise: offline metrics (nDCG/MRR/Recall@k), LLM-as-judge calibration, groundedness/citation scoring, and online A/B testing.
- Experience turning experimental results into clear product calls (models, routing, tools) and communicating them crisply to cross-functional partners.
- Practical Elasticsearch experience (or similar); ES|QL familiarity is a plus.
- Comfort working in a distributed, async-first environment; strong written communication; low-ego collaboration.
Compensation and Benefits
Compensation is base salary with a typical starting range listed below. This role is eligible for Elastic’s stock program and a comprehensive benefits package focused on well-being. The salary ranges may be modified over time and depend on factors including location, experience, and business needs.
Salary: CAD $128,300 to $203,000
Note: The above ranges represent starting ranges and may be higher or lower based on qualifications and location.
Additional Information:
- Diversity and inclusion are core to Elastic. We offer competitive pay, health coverage in many locations, flexible work arrangements, generous vacation, matching donation program, volunteer hours, and parental leave.
Elastic is an equal opportunity employer. We do not discriminate based on race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, marital status, disability, or protected status. For accommodations during the application process, email . We respond within 24 business hours.
For more information, please see our Privacy Statement.