What Is Reinforcement Fine-Tuning (RFT)? - Supervised vs. RL LLM Retraining
Reinforcement Learning from Human Feedback (RLHF) Explained
Supervised Fine-Tuning vs. Reinforcement Learning in Foundation Models
RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models
Lesson 04/10 – Post-Training: Supervised Fine-Tuning (SFT) & Reinforcement Learning (RL)
Build Hour: Reinforcement Fine-Tuning
RAG vs. Fine Tuning
Reinforcement Learning with Human Feedback (RLHF), Clearly Explained!!!
Soc(AI)ety Seminars, Part 8: The Truth of the Matter in the Age of Generative AI
LLM Training & Reinforcement Learning from Google Engineer | SFT + RLHF | PPO vs GRPO vs DPO
Learn Reinforcement Learning from Human Feedback (RLHF) in 4 Minutes
Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) (Jul 2025)
Maciej and Bartek - Fine-tuning Reinforcement Learning Models is a Forgetting Mitigation Problem
[Full Workshop] Reinforcement Learning, Kernels, Reasoning, Quantization & Agents — Daniel Han
🧠 Reinforcement Fine-Tuning vs. Supervised Learning – Which Wins? 🚀
How AI Becomes Human [pre-training, supervised fine-tuning, reinforcement learning, and more]
SFT vs RL-FT: How Fine-Tuning Shapes LLMs
Fine-tuning LLMs on Human Feedback (RLHF + DPO)
Reinforcement Learning for LLMs in 2025