Founded by passionate advocates of learning and innovation, Learni set out to make professional training accessible to everyone, everywhere in the world. Our team works in major cities such as Paris, Lyon, and Marseille, as well as internationally, to support talent and organizations in their skills development.
Which format do you prefer?
30 free minutes with a training advisor — no commitment.
The LLM-as-Judge - Accurately Evaluate LLMs in Production training is delivered in person or remotely (blended learning, e-learning, virtual classroom, remote classroom). At Learni, a Qualiopi-certified training organization, each program is designed to maximize skills acquisition, regardless of the training mode chosen.
The trainer alternates between demonstrative, interrogative, and active methods (practical exercises and/or real-world scenarios). This pedagogical approach ensures concrete learning that is directly applicable in the workplace.
To ensure the quality of the LLM-as-Judge - Accurately Evaluate LLMs in Production training, Learni provides the following teaching resources:
For in-house training held at a location external to Learni, the client commits to providing all teaching materials (IT equipment, internet connection...) necessary for the proper conduct of the training, in accordance with the prerequisites indicated in the communicated training program.
Skills acquired during the LLM-as-Judge - Accurately Evaluate LLMs in Production training are assessed through:
Learni is committed to the accessibility of its professional training programs. All our training programs are accessible to people with disabilities. Our teams are available to adapt teaching methods to your specific needs. Do not hesitate to contact us for any accommodation request.
Learni training programs are available for inter-company and intra-company settings, both in-person and remote. Registration is possible up to 48 business hours before the start of training. Our programs are eligible for OPCO, Pôle emploi, and FNE-Formation funding. Contact us to discuss your training project and funding possibilities.
Immersion in LLM-as-Judge principles through practical exercises on the MT-Bench and Arena-Hard datasets; hands-on work with pairwise comparison prompts using tools such as LangChain and Hugging Face Evaluate; building a first custom judge for your business use cases; manual vs. automated comparative tests; generation of initial reports highlighting 5x gains in evaluation speed; collective code review to refine approaches.
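The pairwise comparison idea above can be sketched as a prompt-building and verdict-parsing pair. This is a minimal illustration, not the course's actual code; the prompt wording and the helper names are assumptions, and the call to an actual LLM is left out.

```python
def build_pairwise_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Assemble a judge prompt asking an LLM to compare two candidate answers."""
    return (
        "You are an impartial judge. Compare the two answers below to the "
        "user question and reply with exactly 'A', 'B', or 'TIE'.\n\n"
        f"[Question]\n{question}\n\n"
        f"[Answer A]\n{answer_a}\n\n"
        f"[Answer B]\n{answer_b}\n\n"
        "Verdict:"
    )

def parse_verdict(raw: str) -> str:
    """Normalize a judge's free-text reply to one of 'A', 'B', 'TIE'."""
    token = raw.strip().upper()
    # Anything unparseable is treated conservatively as a tie.
    return token if token in {"A", "B", "TIE"} else "TIE"
```

In practice the prompt string would be sent to a judge model and the reply fed through `parse_verdict`; constraining the output format up front is what makes automated parsing reliable.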
Design of end-to-end pipelines to evaluate 1,000+ responses per hour; integration of vLLM for fast inference and Ray for distribution; exercises on fine-tuning specialized judges for code review or RAG; implementation of multi-step judgment chains reducing errors by 30%; real business cases with live A/B testing; production of interactive Streamlit dashboards to visualize Spearman correlations.
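The throughput goal above boils down to judging many responses concurrently. Here is a minimal sketch using the standard library only; the scoring function is a placeholder (a real pipeline would call a served judge model, e.g. behind vLLM), and `judge_one` / `judge_batch` are illustrative names, not course APIs.

```python
from concurrent.futures import ThreadPoolExecutor

def judge_one(response: str) -> int:
    # Placeholder scoring: a real judge would call an LLM endpoint here.
    return 1 if len(response) > 10 else 0

def judge_batch(responses, max_workers=8):
    """Score many responses concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(judge_one, responses))

scores = judge_batch(["short", "a sufficiently detailed answer"])
```

Because judge calls are I/O-bound network requests, a thread pool (or an async client) is usually enough to saturate the serving backend without heavier distribution machinery.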
In-depth analysis of positional and length biases using deliberately biased datasets; development of debiasing techniques such as self-consistency and majority voting; implementation with tools from EleutherAI and the OpenAI Moderation API; practical exercises on 50 real hallucination scenarios; calibration of judges to reach 90%+ alignment with human raters; creation of custom deliverables, including audit plans for secure production deployments.
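One standard counter to positional bias is the order-swap check: judge (A, B) and (B, A), and accept the verdict only when both orderings agree. A minimal sketch of that idea, with illustrative names:

```python
def debiased_verdict(judge, answer_a, answer_b):
    """Accept a pairwise verdict only if it survives swapping the positions."""
    v1 = judge(answer_a, answer_b)   # returns 'A', 'B', or 'TIE'
    v2 = judge(answer_b, answer_a)   # same pair, positions swapped
    # Map the swapped verdict back into the original frame of reference.
    flipped = {"A": "B", "B": "A", "TIE": "TIE"}[v2]
    return v1 if v1 == flipped else "TIE"
```

A judge that always prefers whichever answer comes first will contradict itself under the swap and be forced to a tie, while a judge responding to actual answer quality keeps its verdict.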
Exploration of composite metrics such as G-Eval and JudgeLM for granular evaluations; integration with frameworks such as DeepEval and RAGAS; collaborative workshops on custom benchmarks for QA and code generation; optimization toward correlations above 0.85 with human experts; tests on your internal models with automated feedback loops; generation of executive reports demonstrating ROI through a 70% reduction in evaluation time.
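The correlation target above is typically measured with Spearman's rank correlation between judge scores and human ratings. A dependency-free sketch (in practice one would use `scipy.stats.spearmanr`; this version omits tie handling for brevity):

```python
def spearman(xs, ys):
    """Spearman rank correlation of two equal-length score lists (no ties)."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    # Classic formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Rank correlation is preferred over Pearson here because judge scores and human ratings often sit on different scales; only the ordering of responses needs to agree.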
Full-stack deployment on Kubernetes with Prometheus/Grafana monitoring for drift detection; CI/CD configuration via GitHub Actions for continuously updated judges; exercises on scaling to 10k evaluations per day; case studies from leading AI companies; finalization of the capstone project with an exposed API and complete documentation; Q&A session on a 2026 strategy integrating multimodality; skills certification via a simulated production deployment.
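Drift detection for a judge in production can start very simply: compare the mean score over a recent window against a baseline and alert past a threshold. This is a deliberately minimal sketch (real setups would feed such a metric into Prometheus alerting); the function name and the 0.15 threshold are illustrative assumptions.

```python
def drift_alert(baseline_scores, recent_scores, threshold=0.15):
    """Return True when the recent mean score drifts from the baseline mean."""
    base = sum(baseline_scores) / len(baseline_scores)
    recent = sum(recent_scores) / len(recent_scores)
    return abs(recent - base) > threshold
```

A shift in mean judge score can signal either a regression in the evaluated model or a change in the judge itself, so an alert is a trigger for investigation, not an automatic rollback.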
Target audience
Data scientists, AI engineers, and ML researchers seeking professional skill enhancement
Prerequisites
Proficiency in Python; familiarity with LLMs such as GPT or Llama, fine-tuning, and the OpenAI/Hugging Face APIs