leveraging reinforcement learning and large language models for code optimization (sorted by relevance)

15:34

The Unexpected Real-World Revelation of LLMs

bycloud

139,883 views - 4 months ago

0:36

What is Retrieval Augmented Generation (RAG) ? Simplified Explanation

GetDevOpsReady

361,951 views - 9 months ago

30:30

🔵 Want better RAG results? Optimize your Data

SAP Developers

311 views - Streamed 5 days ago

59:31

Early stages of the reinforcement learning era of language models

Nathan Lambert

4,973 views - 7 months ago

1:00:50

Understanding LLMs for Code Generation

DataCamp

4,348 views - Streamed 1 year ago

51:06

How to Fine-Tune a Small LM to Think for Itself and Solve Puzzles (GRPO & RL!)

Neural Breakdown with AVB

17,626 views - 3 months ago

6:29

How to fine-tune LLMs with Tunix

Google for Developers

41,198 views - 4 weeks ago

53:51

How language model post-training is done today

Interconnects AI

11,310 views - 9 months ago

1:06:05

Reinforcement Learning with Large Datasets: Robotics, Image Generation, and LLMs

RAIL

6,049 views - 1 year ago

3:12

AI Practitioner Exam Bites #35: Fine-Tuning Methods for Optimized AI

Matthew Purcell

348 views - 1 year ago

21:36

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Xiaol.x

536 views - 7 months ago

14:18

Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning

LLMの説明 | LLMとは

HRPO: RL for Hybrid Latent Reasoning

AI Research Roundup

33 views - 4 months ago

37:16

Hands-on 10: Large Language Model Alignment with Direct Preference Optimization

BrainOmega

3,680 views - 3 months ago

1:44:31

Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

Stanford Online

1,624,985 views - 1 year ago

38:35

Can Wikipedia Help Offline Reinforcement Learning? (Paper Explained)

Yannic Kilcher

11,938 views - 3 years ago

32:24

[UCLA RL-LLM] Chapter 0: Course outline and prologue

Ernest Ryu

6,737 views - 3 months ago

6:42

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills (ICML 2024)

KolbyRL

28 views - 1 year ago

5:47

ReVisual-R1: Staged MLLM Reasoning

AI Research Roundup

53 views - 4 months ago

Results: leveraging reinforcement learning and large language models for code optimization