Machine Learning Engineer
We are seeking a highly skilled Machine Learning Engineer to design and build a low-latency query understanding and intelligent routing system that operates without reliance on large language models. The role focuses on extracting intent, entities, application context, routing decisions, and supporting evidence from user queries in real time.
This is a full lifecycle role spanning data modeling, ML development, optimization, local deployment, and MLOps. The ideal candidate will have strong experience in applied NLP, lightweight model architectures, and production-grade ML systems, with a focus on sub-second inference, CPU-based execution, and scalable domain evolution.
Responsibilities
- Design and implement a query understanding pipeline to extract intent, routing decisions, entities, application mapping, and historical evidence from user queries and conversations.
- Define and build the training data model and annotation schema for structured outputs (intent, routing, entities, applications, evidence).
- Lead data collection, synthesis, analysis, and cleaning to develop high-quality datasets for model training and evaluation.
- Develop and evaluate baseline and advanced non-LLM models for:
  - Intent classification
  - Query routing
  - Entity extraction
  - Application detection
  - Evidence retrieval
- Build and maintain train, test, and evaluation pipelines with a strong focus on:
  - Accuracy and F1 score
  - Confidence scoring and calibration
  - Latency and throughput
- Optimize models to meet strict constraints:
  - Sub-second inference latency
  - CPU-only execution
  - Compact model size (
