LLMOps Engineer

US-VA-McLean

External

Req #: 7033
Type: Full-Time

Steampunk

Overview:

We are looking for an experienced Senior LLMOps Engineer to design, implement, and maintain production-grade large-language-model (LLM) pipelines, deployment architectures, and monitoring systems across enterprise environments. The Senior LLMOps Engineer will play a critical role in operationalizing generative AI capabilities, ensuring that LLM-based applications are scalable, secure, reliable, and compliant with emerging AI risk and governance frameworks. This role spans the spectrum of model deployment, orchestration, evaluation, and optimization. 

Responsibilities:

* Architect and maintain scalable LLM and RAG pipelines, including model hosting, inference optimization, retrieval layers, and context management frameworks. 

* Lead the design and implementation of secure GenAI infrastructure across cloud environments, ensuring reliability, performance, and cost efficiency. 

* Build and manage automated evaluation systems that assess LLM output quality, safety, latency, and adherence to AI governance requirements. 

* Develop CI/CD workflows tailored for LLM- and GenAI-based applications, including dataset versioning, model lineage, and automated testing of prompt and model behaviors. 

* Collaborate with AI Product Engineers and Data Scientists to productionize LLM-based prototypes into enterprise-grade, maintainable systems. 

* Integrate vector databases, model gateways, content filters, and guardrail frameworks into end-to-end LLM solutions. 

* Implement observability and monitoring solutions that track performance metrics, hallucination rates, cost profiles, and user interaction patterns. 

* Lead troubleshooting and root-cause analysis for issues related to LLM deployment, inference performance, or pipeline reliability. 

* Stay current with emerging LLM architectures, inference optimizations, fine-tuning techniques, and relevant MLSecOps patterns. 

* Ensure compliance with data privacy, ethical AI, and AI-governance frameworks throughout pipeline design and operations. 

* Mentor junior engineers and contribute to Steampunk's AI engineering best practices, tooling, and reusable infrastructure patterns. 

* Contribute to the growth of our AI & Data Exploitation Practice.

Qualifications:

* Ability to hold a position of public trust with the U.S. government. 

* Bachelor's degree and 8 years of experience.

* 5+ years of experience in software engineering, data engineering, MLOps, or cloud engineering, with 2+ years focusing specifically on LLM or GenAI operations. 

* Strong experience deploying models using frameworks such as Hugging Face Transformers, vLLM, TensorRT-LLM, or similar. 

* Proficiency in Python and operational tooling such as FastAPI, PyTorch, LangChain, LlamaIndex, and vector databases (FAISS, Milvus, Pinecone, or similar). 

* Advanced knowledge of cloud platforms (AWS, Azure, GCP) including model hosting, distributed compute, and secure networking patterns. 

* Hands-on experience building CI/CD pipelines, automated testing frameworks, and environment provisioning for AI/ML workloads. 

* Experience with Docker, Kubernetes, and infrastructure-as-code (Terraform, CloudFormation). 

* Familiarity with MLSecOps, AI governance, model hardening, prompt injection defenses, and content safety monitoring. 

* Strong understanding of logging, observability, and performance profiling for high-throughput LLM inference systems. 

* Excellent written and verbal communication skills, with the ability to explain trade-offs and architectural decisions to technical and non-technical stakeholders. 

* Demonstrated ability to balance long-term platform thinking with hands-on operations and rapid problem solving. 

* Experience working in agile teams and using modern project management tools.
			