23.08.13 (Sun)
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals.
Large Language Models Are Reasoning Teachers
Recent works have shown that chain-of-thought (CoT) prompting can elicit language models to solve complex reasoning tasks step by step.
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models (Kaiyu Yang et al.)
An LLM-based theorem-proving assistant.
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language
Toss Design Principle: Value First, Cost Later
Among Toss's product design principles is one called "Value first, cost later": state the value before talking about cost.
PromptBase | Prompt Marketplace: DALL·E, Midjourney, ChatGPT, Stable Diffusion & GPT
A marketplace for quality DALL·E, Midjourney, ChatGPT, and Stable Diffusion prompts.