Overview:
We are looking for a seasoned Data Scientist to work with our existing team of Data Scientists and Engineers to use Generative AI technology in supporting Federal use cases. We are looking for more than just a "Data Scientist": a technologist with excellent communication and customer service skills and a passion for data and problem solving.
Responsibilities:
* Design and train advanced machine learning models, especially generative models like GANs, VAEs, or transformer-based models.
* Work closely with ML Engineers and Software Developers to transition models from a research and development stage to production.
* Stay updated with the latest research and trends in AI to implement cutting-edge solutions.
* Evaluate the performance of foundation models (e.g., GPT, LLaMA, Claude) on domain-specific tasks and fine-tune them using supervised fine-tuning, reinforcement learning, or instruction tuning to align outputs with user needs and business goals.
* Design and optimize prompts, few-shot examples, and system instructions to improve LLM behavior in constrained environments (e.g., RAG pipelines, multi-agent workflows, decision support systems).
* Identify, clean, label, and synthesize high-quality datasets for model training, fine-tuning, or retrieval-augmented generation (RAG).
* Design experiments to evaluate generative model behavior (e.g., hallucination rates, factuality, coherence, safety), define appropriate benchmarks, and use metrics like BLEU, ROUGE, perplexity, and human evaluations.
* Leverage and integrate various data management tools at scale; cloud experience preferred
* Support an Agile software development lifecycle
* Contribute to the growth of our AI & Data Exploitation Practice
Qualifications:
* Ability to hold a position of public trust with the US government.
* Bachelor's degree in computer science, data science/statistics, information systems, engineering, business, or a scientific or technical discipline
* 2-4 years of industry experience developing ML/AI solutions and a passion for solving complex problems.
* Must have hands-on experience in building and deploying generative models, with a portfolio of relevant projects
* Must be proficient in Python, with strong coding practices for scalability and reproducibility.
* Demonstrated experience in building, training, and deploying generative models such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), or autoregressive Transformer-based architectures.
* Demonstrated experience manipulating structured and unstructured data for analysis
* Demonstrated experience using AI frameworks like TensorFlow, PyTorch, Keras, or JAX.
* Ability to implement and modify complex neural network architectures.
* Skilled in using data manipulation and analysis libraries such as Pandas, NumPy, SciPy, and Scikit-learn.
* Experience with model training, fine-tuning, and evaluation using frameworks like PyTorch or TensorFlow.
* Deep knowledge of natural language processing techniques, including tokenization, embeddings, attention mechanisms, and prompt engineering.
* Hands-on experience with large language models (e.g., GPT, LLaMA, Claude, Mistral) and associated libraries (e.g., Hugging Face Transformers).
* Familiar with retrieval-augmented generation (RAG), data labeling, synthetic data generation, and data governance best practices.
* Proficiency in Python and ML tooling (e.g., Jupyter, Git, Docker, APIs). Able to work in cross-functional teams to deliver AI capabilities into production environments, and write modular/reusable code.
* Experience in Cloud analytics (AWS, Azure, or Google Cloud Platform - GCP) with tools such as AWS SageMaker, AWS Bedrock, Azure OpenAI, etc.
* Experience with DevSecOps, as it applies to data science and MLOps
* Data visualization skills in Tableau, Power BI, D3, ArcGIS, or similar are a plus
* Experience with Elasticsearch, AWS Kendra, Azure Cognitive Search, or a similar tool is a plus