At QCon AI NYC 2025, Will Hang from OpenAI presented Agent RFT, a reinforcement fine-tuning approach for tool-using agents. Hang advised optimizing prompts and tasks before adjusting the model, showed strategies for improving agents' decision-making and efficiency, and emphasized a balanced grading system. The session pointed to a future in which smarter agents reduce latency and improve outcomes.

infoq.com

Andrew Hoblitzell

1 day ago

QCon Software Development Conference ai reinforcement-learning model-serving mlops model-tuning

Top posts from tech subreddits

How do you keep agents aligned when tasks get messy?

reddit.com

The_Default_Guyxxo

13 days ago

r/AI_Agents ai multi-agent-systems reinforcement-learning alignment agent-based-models mlops system-design agent-based-systems microservices distributed-systems

You can now do FP8 reinforcement learning locally! (<5GB VRAM)

i.redd.it

472

danielhanchen

23 days ago

r/LocalLLaMA ai reinforcement-learning deep-learning mlops

[P] Training RL agent to reach #1 in Teamfight Tactics through 100M simulated games

reddit.com

aardbei123

28 days ago

r/MachineLearning ai reinforcement-learning game-engines

Thinking of creating an agent, need ideas

reddit.com

Ami_The_Inkling

29 days ago

r/AI_Agents ai agent llm reinforcement-learning nlp generative-ai

[P] SDLArch-RL is now compatible with Citra!!!! And we'll be training Street Fighter 6!!!

i.redd.it

AgeOfEmpires4AOE4

about 1 month ago

r/MachineLearning emulation gaming ai ai-ethics reinforcement-learning deep-learning

[P] RLHF (SFT, RM, PPO) with GPT-2 in Notebooks

reddit.com

ashz8888

about 1 month ago

r/MachineLearning gpt-2 reinforcement-learning transformers gpt2 nlp

[R] My RL agent taught itself a complete skill progression using only a “boredom” signal (no rewards)

reddit.com

knigre

about 1 month ago

r/MachineLearning ai ai-ethics reinforcement-learning ai-research self-hosted self-supervised

[Deep Dive] How We Solved Poker: From Academic Bots to Superhuman AI (1998-2025)

gist.github.com

Significant_Dog9466

about 1 month ago

r/programming ai reinforcement-learning game-engines deep-learning

Bosses said I had to learn agentic coding, so I made an open source zombie survival game that uses reinforcement learning

github.com

Ok-Use1747

about 2 months ago

r/programming reinforcement-learning game-engines open-source

217

Hugging Face Trending

Popular models from Hugging Face

RL ReSearch

DR-Tulu-8B

724

reinforcement-learning ai-research deep-learning

Xiaomi MiMo

MiMo-7B-RL

245

5,976

ai reinforcement-learning hardware

GitHub Trending

Popular repositories from GitHub

DLR-RM

stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Python

12,326

2,016

reinforcement-learning pytorch python deep-learning ai mlops

vwxyzjn

cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python

8,485

922

reinforcement-learning python deep-learning ai

facebookresearch

habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Python

2,660

596

python nlp ai deep-learning robotics reinforcement-learning

microsoft

qlib

Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate R&D process.

Python

33,321

5,132

ai data-science reinforcement-learning mlops quantitative-finance quantitative-investment quantitative-investing deep-learning quantitative-research model-serving