An open-source, Ray-based framework for scalable Reinforcement Learning from Human Feedback (RLHF).
Volcano Engine Reinforcement Learning library for efficient LLM post-training; the open-source version of HybridFlow.