RRHF

1 article

LLM Aug 14, 2023

RRHF: Rank Responses to Align Language Models with Human Feedback without tears

Reinforcement Learning from Human Feedback (RLHF) facilitates the alignment of

zoomg