Software Developer Engineer

Philadelphia, NY
Permanent
Core Experience
  • Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments (25%)
  • Strong proficiency in Python for LLM inference, prompt engineering, and integration (25%)
  • Experience with CPU-based inference, model quantization, and performance tuning (25%)
Vector Databases & RAG
  • Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector (25%)
  • Proven implementation of Retrieval-Augmented Generation (RAG) pipelines (25%)
  • Experience generating and managing embeddings, and applying metadata filtering at query time (25%)
Security & Governance
  • Understanding of data privacy, air-gapped deployments, and enterprise security requirements (25%)
  • Experience implementing access controls and audit logging (25%)
Nice to Have
  • Experience with LangChain or LlamaIndex
  • Exposure to Rust, Go, or C for high-performance services
  • Familiarity with Docker and Kubernetes for on-prem deployments
  • Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
  • Prior work in regulated or enterprise environments
Deliverables
  • Reference architecture and deployment guidance
  • Working prototype (LLM + vector DB + RAG)
  • Documentation and knowledge transfer to internal teams

For immediate consideration, please click APPLY.

Job ID: 253525878