From Persona to Personalization: A Survey on Role-Playing Language Agents

📢
This summary was generated by GPT and reviewed by a human.
It is organized to help readers quickly see what each section covers and to find the papers introduced in that section.

1. Introduction

With the rapid progress of large language models (LLMs), **role-playing language agents (RPLAs)** have been attracting attention. Given a persona, these agents vividly reproduce historical figures, fictional characters, or the traits of an individual, and are used in applications such as emotional companions, game characters, and personal assistants. This post walks through the paper's overall structure and its details: the persona taxonomy, construction methods, evaluation framework, potential risks, and future research directions.

Background and trends: Explains recent LLM research trends and how they have driven the development of RPLAs.

Concept and definition of RPLAs: Presents the basic concept of role-playing language agents and the core elements that make them up.

Persona taxonomy:

  • Demographic Persona: Reflects a group's statistical characteristics and fixed social roles (e.g., occupation, gender, personality type).
  • Character Persona: Reproduces the specific traits of historical figures or well-known characters from novels and films.
  • Individualized Persona: A customized profile that is continuously updated from the user's personal data.

Construction methods:

  • Parametric Training: Injects persona knowledge into the model's parameters through large-scale pre-training, supervised fine-tuning, and reinforcement learning.
  • Nonparametric Prompting: Uses prompt-based in-context learning to realize a persona immediately, without any retraining.

Evaluation framework: Verifies role-playing capability (conversational immersion, fluency, etc.) and persona fidelity (linguistic style, knowledge, personality reproduction, etc.) with a range of automatic and human evaluation methods.

Risks and limitations: The negative aspects of RPLA development, such as bias, toxicity, and hallucination, and research directions for mitigating them.

Future research directions: Upcoming challenges and opportunities, including safe and ethical AI companions, the continual evolution of personalization, and multimodal data integration.

2. Preliminary

2.1 The Roadmap of Large Language Models

Recent LLMs exhibit a variety of human-like abilities, such as in-context learning, instruction following, and step-by-step reasoning, and as a result they can reproduce complex social interactions such as role-playing.

  • Emerged Abilities in LLMs: Describes in detail the key abilities that have newly emerged in LLMs. It stresses that in-context learning, instruction following, step-by-step reasoning, and social intelligence are the foundation that lets LLMs carry out complex role-playing.
  • Anthropomorphic Cognition in LLMs: Discusses how LLMs have gradually begun to show human-like cognitive and emotional traits. Early discussion focused on the emergence of consciousness; the current emphasis is on the ability to imitate human elements such as self-awareness, values, emotion recognition, and psychological traits. The paper notes, however, that this is a consequence of the models' role-playing nature rather than evidence of actual consciousness.
  • Retrieval-augmented Generation of LLMs: Introduces retrieval-augmented generation (RAG), which integrates external information retrieval. RAG consults external data at generation time to reduce factual errors and to handle long contexts, which makes it useful in role-playing scenarios.

2.2 LLM-powered Language Agents

This section notes the limitations of traditional symbolic agents and reinforcement-learning-based agents, and introduces the LLM-based language agents that have recently emerged on the strength of human-level intelligence and interaction abilities.

  • Planning Module: Explains how agents form long-term plans to solve complex tasks in real-world settings. LLM agents use strategies such as Chain-of-Thought and ReAct to break tasks down and to adjust plans dynamically in response to environmental feedback.
  • Tool-usage Module: Explains that, to compensate for knowledge gaps and hallucination in specialized domains, LLMs can call external tools such as APIs and knowledge bases to produce more accurate and contextually appropriate responses.
  • Memory Mechanism: Covers the importance of memory mechanisms that store user and environment information so the agent can maintain conversational context over time. It distinguishes short-term memory (information within the transformer's context limit) from long-term memory (external storage) and explains how both enable personalized responses and continuous interaction.
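
The short-term versus long-term split described for the memory mechanism can be sketched as follows. This is a minimal illustration, not the survey's implementation: the class name `AgentMemory`, the window size, and the word-overlap retrieval are all assumptions chosen for brevity, and a real agent would more likely use embedding-based retrieval.

```python
from collections import deque


class AgentMemory:
    """Toy two-tier memory: a bounded short-term window plus a long-term archive."""

    def __init__(self, window_size: int = 4) -> None:
        # Short-term memory stands in for what fits in the model's context window.
        self.short_term = deque(maxlen=window_size)
        # Long-term memory stands in for an external store (database, index, ...).
        self.long_term = []

    def add(self, utterance: str) -> None:
        # When the window is full, archive the oldest turn before deque evicts it.
        if len(self.short_term) == self.short_term.maxlen:
            self.long_term.append(self.short_term[0])
        self.short_term.append(utterance)

    def recall(self, query: str, k: int = 1) -> list:
        # Rank archived turns by word overlap with the query (a crude retrieval proxy).
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(t.lower().split())), t) for t in self.long_term]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]
```

Turns that fall out of the window stay recoverable through `recall`, which is the property the survey attributes to long-term memory.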

3. Overview of RPLAs

3.1 RPLA Definition

This section presents the overall framework that divides the personas underlying RPLAs into three types: the Demographic Persona, which reflects broad group characteristics; the Character Persona, which represents a well-established person or character; and the Individualized Persona, which is continuously updated to reflect an individual user's behavior and preferences.

(1) Demographic Persona

A Demographic Persona reflects statistical or group-level characteristics such as occupation, gender, race, and personality. Because LLMs draw on statistical patterns embedded in their pre-training data, such a persona is easily activated with a simple prompt (e.g., "You are a mathematician") and is effective for simulating a group's typical behavior and language patterns.

(2) Character Persona

A Character Persona focuses on reproducing the distinctive traits of well-known people or characters from history, novels, films, and similar sources. Data is collected from materials such as biographies, novels, and film scripts so that the persona faithfully reflects the subject's background, personality, linguistic style, and narrative. It is used mainly in applications aimed at entertainment or emotional immersion.

(3) Individualized Persona

An Individualized Persona is built from data extracted from an individual user's conversations, behavior, and preferences. It changes through the user's ongoing interactions and is used to provide personalized services (e.g., personal assistants, digital clones). As the data is continuously updated, the agent's responses change dynamically as well.

3.2 RPLA Construction

This section outlines how RPLAs are built from diverse data (descriptive statements, dialogues, historical behavior, literature, etc.) in order to simulate complex personas. In short, an RPLA composes the agent's role and behavior from persona data obtained from a variety of sources.

Parametric Training Approach

Parametric Training is one of the main ways to build RPLAs and comprises pre-training, supervised fine-tuning (SFT), and reinforcement learning (RL).

  • Pre-training: The model internalizes broad persona-related knowledge from large-scale raw text (e.g., literary works, encyclopedias).
  • Supervised fine-tuning: Role-playing datasets are used to fine-tune the model so that it reflects a specific persona's traits in finer detail.
  • Reinforcement learning: Based on user feedback or preference data, the model is further optimized to produce ethical and socially appropriate responses when interacting with general users.

Nonparametric Prompting Approach

Nonparametric Prompting supplies persona information inside the prompt so that the model can take on the role immediately, without retraining.

  • Prompt components: The prompt passed to the model contains a description of the persona (profile) together with example dialogues that fit the role (demonstrations).
  • Data creation methods: Persona data is generated and refined through online resources (e.g., Wikipedia, Fandom), automatic extraction (using an LLM to extract from books or scripts), dialogue synthesis (generating dialogue data with an LLM that plays the role), and human annotation (people writing persona descriptions and example dialogues directly).
  • Other complementary techniques: To overcome context limits, a memory module is introduced so that the model can dynamically load large amounts of persona information from external storage when needed.
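
As a rough sketch of how a profile and demonstrations might be packed into a single prompt: the function name `build_roleplay_prompt` and the prompt wording below are illustrative assumptions, not a format prescribed by the paper.

```python
def build_roleplay_prompt(profile: str, demonstrations: list, user_message: str) -> str:
    """Assemble a role-play prompt from a persona profile and example exchanges.

    `demonstrations` is a list of (user_turn, persona_turn) string pairs.
    """
    lines = [
        "You are role-playing the following persona. Stay in character.",
        "",
        "Profile:",
        profile,
        "",
    ]
    if demonstrations:
        lines.append("Example dialogue:")
        for user_turn, persona_turn in demonstrations:
            lines.append(f"User: {user_turn}")
            lines.append(f"Persona: {persona_turn}")
        lines.append("")
    # The prompt ends with an open "Persona:" turn for the model to complete.
    lines.append(f"User: {user_message}")
    lines.append("Persona:")
    return "\n".join(lines)
```

The profile carries the static description, while the demonstrations show style by example; both can be swapped without touching model weights, which is the point of the nonparametric approach.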

3.3 RPLA Evaluation

RPLA evaluation criteria fall along two broad axes. One is overall role-playing capability (e.g., conversational immersion, fluency, social interaction); the other is how faithfully a specific persona is reproduced (linguistic style, background knowledge, personality, thinking process, etc.).

Role-Playing Capability Evaluation

An agent's role-playing capability is assessed mainly in terms of the underlying model's abilities and the framework built around it. The detailed measures include the LLM's anthropomorphic capabilities, conversational engagement, immersion, emotional understanding, and problem-solving ability. This evaluation measures how well the RPLA delivers the "human-like" conversation that users expect.

Persona Fidelity Evaluation

Persona fidelity measures how accurately each RPLA reproduces the intended character's traits (knowledge, speech habits, personality, beliefs, decision-making process, etc.). The focus is on whether the key information and manner of expression the model is supposed to deliver are reflected correctly.

Evaluation Methodology

Four main methods are used for evaluation.

  1. Automatic evaluation (with ground truth): Computes an objective performance score by measuring similarity to a reference answer.
  2. Automatic evaluation (without ground truth): Uses an LLM as the judge, or classifies the model's responses against fixed criteria.
  3. Multiple-choice evaluation: Checks whether the model selects the correct response from predefined options.
  4. Human evaluation: Experts or evaluators from the relevant domain directly review response quality and persona fidelity.

RPLAs are steadily improving but still fall short of fully human-level role reproduction, and persona fidelity in particular calls for finer-grained evaluation methods.

4. Demographic Persona

4.1 Definition

The Demographic Persona assigned to an RPLA is designed to reflect the typical traits of a particular group, for example occupation (mathematician), hobby (baseball fan), or personality (ENFJ). This section stresses that such a persona is reproduced by integrating the group's linguistic style, domain knowledge, and behavioral patterns.

4.2 Analysis of Demographics

RPLAs reflect human-like inherent traits (personality, political beliefs, ethical considerations, etc.). They can change their behavior according to the assigned persona, but this also carries the risk of eliciting inappropriate or toxic responses.

Inherent Demographics

This part explains that, thanks to patterns embedded in the pre-training data, RPLAs can naturally reflect particular demographic characteristics. It also points out that human biases and behavioral tendencies influence the text output in the process, so that the traits of certain groups can be overexpressed.

Demographic Role-Playing

Demographic Role-Playing is an approach that steers the model into a particular demographic role through a prompt that explicitly specifies the persona. For example, a prompt such as "You are a lively and sociable person" leads the agent to imitate the linguistic style and behavior that fit that role.

4.3 Application of Demographics

This section explains that assigning a specific demographic persona improves LLM performance on downstream tasks, in both single-agent and multi-agent systems. This is especially helpful for tasks that require domain expertise and for collaborative settings.

Improving Task Solving in Single-Agent Systems

Assigning a specific Demographic Persona to a single agent strengthens the relevant domain expertise and improves the depth and quality of its responses. In particular, the agent can give more insightful answers in settings such as complex zero-shot problem solving, without additional training.

Improving Task Solving in Multi-Agent Systems

Applying diverse demographic personas in a multi-agent environment lets each agent take on a different role, which can improve efficiency on complex tasks such as collaborative problem solving and software development. Systems such as ChatDev and MetaGPT are introduced as concrete examples.

Simulating Collective Social Behaviors in Multi-Agent Systems

RPLAs can simulate complex, human-like interactions in strategy games, social deduction games, and similar settings. In these environments, agents may act fairly or selfishly with respect to the collective interest, or perform strongly in diplomacy and war simulations, which shows that a wide range of social behavior patterns can be reproduced.

5. Character Persona

5.1 Definition

Introducing the Character Persona

A Character Persona aims to reproduce the traits of specific, established figures who are widely known to the public, such as historical figures and characters from novels and films. It covers not only existing well-known characters but also original characters created by individual users. These personas have risen rapidly through products such as Character.ai and have become an important research topic in LLM role-playing applications.

Prerequisites for Effective Role-Playing

To reproduce a character effectively, an LLM must be able to understand that character's traits. Early work explored this through two lenses, "Character Prediction" and "Personality Understanding", asking whether a model can recognize a character's identity, relationships, and personality traits from text and predict the character's future behavior.

Recent Work

Ongoing research addresses the reproduction of a character's manner of speaking, knowledge, personality, and decision making.

5.2 Data for Character RPLAs

Building a character RPLA requires rich and accurate data about the character. This data gives the model basic information such as the character's identity, background, and relationships, and helps it learn the character's distinctive traits. Ultimately, this information is the basis on which the model can correctly recall and reproduce the character on request.

Character data is divided into two main types.

  • Description data: Information that directly states a character's static attributes, such as name, affiliation, identity, and background. It helps the model remember and reproduce the character's basic traits.
  • Demonstration data: Information that shows a character's dynamic attributes, such as linguistic style and cognitive and behavioral patterns, through dialogues or enacted situations.

The two data types complement each other and help the model generate a vivid and consistent portrayal of the character.

Limitations and Sources of Character Data

  • Available character data is currently limited and is mostly confined to a small number of well-known characters.
  • Description data is collected mainly from reliable encyclopedias or the original works and is processed either by hand or with recent LLMs.

Methods for Creating Demonstration Data

Experience Extraction:
  • The character's dialogues and scenes are extracted directly from the original scripts or screenplays.
  • The extracted data captures the character's traits faithfully, but it may lack background context, which can limit its usefulness for actual RPLA training.
Dialogue Synthesis:
  • Recent LLMs are used to generate and augment character dialogues.
  • Dialogues are synthesized on diverse topics such as literature, general task instructions, and personality tests, and are produced via in-context learning or direct role-playing.
  • However, because of the limitations of the "teacher" LLM, the generated dialogues may require additional filtering.
Human Annotation:
  • Human annotators role-play the character directly to collect dialogue data.
  • This ensures high data quality but is costly and time-consuming.
  • It can yield data not only for existing characters but also for new original characters.

Interaction Data and Time-Specific Role-Playing

  • Interactions between an RPLA and its users continuously generate additional dialogue data, which supplements the existing character data.
  • This interaction data helps the character persona gradually adapt to each user's preferences.
  • Applications that require role-playing at a specific point in time (e.g., Harry Potter as a child) pose the additional challenge of restricting the character's knowledge to that point in the timeline.
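
One simple way to picture the time-specific restriction is to tag each piece of character knowledge with the point in the story at which the character learns it, then filter before building the prompt. The sketch below assumes a hypothetical `CharacterFact` record and an integer story timeline; the survey does not specify such a schema, and the example facts used with it would be placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterFact:
    text: str
    known_from_year: int  # story-time year in which the character learns this fact


def facts_known_at(facts: list, story_year: int) -> list:
    """Keep only the facts the character could already know at `story_year`."""
    return [fact.text for fact in facts if fact.known_from_year <= story_year]
```

Filtering at data-selection time keeps later-timeline knowledge out of the context entirely, rather than relying on the model to ignore it.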

5.3 Construction of Character RPLAs

This section describes how character RPLAs are built by injecting character data into an LLM. Because LLMs already have instruction-following and character-understanding abilities, they can play a specific character on the basis of the data provided.

Parametric Training Approach

The Parametric Training approach uses pre-training and supervised fine-tuning (SFT) to teach the model character-related knowledge from large bodies of literature, encyclopedias, and similar sources. This prepares the LLM to play existing characters such as Hermione Granger or Socrates naturally.

Nonparametric Prompting Approach and Memory Modules

Nonparametric Prompting places character data directly in the prompt so that the LLM switches into the character immediately. However, because character data is voluminous and interaction data keeps accumulating, a complementary approach is needed: an external memory module that overcomes the context limit.
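
To make the context-limit problem concrete, the following sketch selects only the persona snippets most relevant to the current query that fit within a fixed budget. The function name `select_persona_snippets`, the character-count budget, and the word-overlap scoring are assumptions made for illustration; production systems typically score with embeddings and budget in tokens.

```python
def select_persona_snippets(snippets: list, query: str, budget_chars: int) -> list:
    """Pick the most query-relevant persona snippets that fit a character budget."""
    query_words = set(query.lower().split())

    def overlap(snippet: str) -> int:
        return len(query_words & set(snippet.lower().split()))

    chosen = []
    used = 0
    # Visit snippets from most to least relevant; sorted() is stable, so ties keep input order.
    for snippet in sorted(snippets, key=overlap, reverse=True):
        if overlap(snippet) == 0:
            break  # the remaining snippets share no words with the query
        if used + len(snippet) <= budget_chars:
            chosen.append(snippet)
            used += len(snippet)
    return chosen
```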

5.4 Evaluation of Character RPLAs

Character RPLA evaluation is split along two axes: the model's role-playing capability (character-independent capabilities) and fidelity to the specific character (linguistic style, knowledge, personality, thinking process).

Character-independent Capabilities

This part evaluates how well the model performs the role-playing task itself. The criteria include conversational engagement, immersion, fluent language generation, emotional understanding, and problem-solving ability, and they range across interaction levels from basic role-playing ability up to near-human "anthropomorphic" capabilities.

Role-playing Engagement

This criterion evaluates how actively the RPLA participates in, and stays immersed in, the role-playing situation. The agent must produce conversational responses and maintain a consistent personality and role throughout the dialogue. The paper stresses that it should avoid out-of-character statements (e.g., "I am an AI model") and blend naturally into the flow of the conversation.
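
A crude engagement check along these lines is to scan responses for phrases that step outside the role. The pattern list below is a small illustrative assumption, not a validated lexicon, and the helper names are hypothetical.

```python
import re

# Illustrative phrases only; a real checker would need a broader, validated list.
OUT_OF_CHARACTER_PATTERNS = [
    re.compile(r"\bas an ai\b", re.IGNORECASE),
    re.compile(r"\bi am an ai\b", re.IGNORECASE),
    re.compile(r"\blanguage model\b", re.IGNORECASE),
]


def breaks_character(response: str) -> bool:
    """Return True if the response contains a phrase that steps outside the role."""
    return any(pattern.search(response) for pattern in OUT_OF_CHARACTER_PATTERNS)


def in_character_rate(responses: list) -> float:
    """Fraction of responses that never break character (1.0 for an empty list)."""
    if not responses:
        return 1.0
    kept = sum(not breaks_character(r) for r in responses)
    return kept / len(responses)
```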

High-quality Conversations

This criterion evaluates the RPLA's ability to produce natural, fluent conversation. The main factors cited are completeness of the dialogue, clarity of the information conveyed, and fluent sentence construction. The paper also stresses that adherence to ethical standards should prevent the generation of inappropriate or harmful content.

Anthropomorphic Capabilities

This criterion evaluates how effectively the RPLA imitates human-like cognition, emotion, and social intelligence. Specifically, it should reflect human qualities along several dimensions: conversational appeal, the ability to understand others' mental states (Theory of Mind), empathy, emotional intelligence, and goal-directed social skills.

Character Fidelity

This axis evaluates how accurately the RPLA reflects a specific character's linguistic style, background knowledge, personality, and thinking process. Character hallucination, where the model generates information that falls outside the character's scope, is also taken into account.

Linguistic Style

This criterion evaluates how faithfully the RPLA imitates the character's distinctive linguistic style and tone. The character's manner of expression, vocabulary, and writing style should be reproduced through in-context learning, and the evaluation checks whether the character's identity comes across naturally as a result.

Knowledge

This criterion centers on the model's ability to accurately remember and reproduce the background knowledge and identity information the character should have. The model should correctly reflect key information such as the character's name, affiliation, relationships, and experiences, and it is important to prevent "character hallucination", in which the model generates information beyond the character's scope.

Personality and Thinking Process

This criterion evaluates how well the RPLA imitates the character's inner personality and thinking process. By reproducing the character's motivations, decision-making process, and psychological traits, the agent should go beyond surface linguistic style and express the character's true inner life. The paper notes that psychological assessment tools can be used to analyze this precisely.

Evaluation methods

The evaluation methods used are automatic evaluation (with and without ground truth), multiple-choice questions, and human evaluation by experts or other human raters. Combining these methods gives a comprehensive check of a character RPLA's role-playing and fidelity.

Automatic Evaluation with Ground Truth

When ground-truth data is available, automatic evaluation measures the similarity between the RPLA's response and a reference answer. Early work used traditional similarity metrics such as Rouge-L; more recently, the mainstream approach is to use an advanced LLM such as GPT-4 as the judge, scoring responses or selecting the better answer against a given reference (usually an answer generated by a strong LLM).
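
For reference, Rouge-L scores a candidate against a reference using their longest common subsequence (LCS). With precision $P = \mathrm{LCS}/|\text{candidate}|$ and recall $R = \mathrm{LCS}/|\text{reference}|$, the balanced F-measure is $F_1 = 2PR/(P+R)$. The sketch below is a minimal implementation under simplifying assumptions (whitespace tokenization, lowercasing, equal weighting of precision and recall); published Rouge tooling applies its own tokenization and weighting.

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for token_a in a:
        curr = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Balanced Rouge-L F-measure over whitespace tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```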

Automatic Evaluation without Ground Truth

When ground-truth data is scarce, a judge LLM evaluates RPLA responses by consulting information such as the character profile. This works well for character-independent capabilities and linguistic style, but it is limited when assessing character-specific knowledge and thinking processes, and the paper points out that the judge may make inaccurate calls for characters it is unfamiliar with.
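
A reference-free judge can be sketched as a prompt template plus a score parser. Everything here is an assumption for illustration: the 1 to 5 scale, the prompt wording, and the `ask_llm` callable, which stands in for whatever model client is actually used.

```python
import re
from typing import Callable, Optional


def judge_response(profile: str, response: str, ask_llm: Callable[[str], str]) -> Optional[int]:
    """Ask a judge model to rate profile consistency; return the 1-5 score or None."""
    prompt = (
        "Rate from 1 to 5 how well the response matches the character profile.\n"
        f"Profile: {profile}\n"
        f"Response: {response}\n"
        "Answer with a single digit."
    )
    reply = ask_llm(prompt)
    # Take the first digit in range; unparseable replies yield None instead of a guess.
    match = re.search(r"[1-5]", reply)
    return int(match.group()) if match else None
```

Returning `None` for unparseable replies makes judge failures visible, which matters given the survey's caveat that judges can be unreliable for unfamiliar characters.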

Multi-choice Questions

This part introduces evaluation with multiple-choice questions. Having the RPLA choose from predefined options narrows the output space and simplifies evaluation. The paper highlights the advantage that, for questions about a character's thinking process or predicted behavior, a reasonable response can still be assessed even when it differs somewhat from the reference answer.
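
Because the output space is reduced to option labels, scoring collapses to exact-match accuracy. The helper below is a minimal sketch with an assumed name and a case-insensitive label comparison.

```python
def multiple_choice_accuracy(predictions: list, answer_key: list) -> float:
    """Share of questions where the predicted option label matches the key."""
    if len(predictions) != len(answer_key):
        raise ValueError("predictions and answer_key must have the same length")
    if not answer_key:
        return 0.0
    correct = sum(
        pred.strip().upper() == gold.strip().upper()
        for pred, gold in zip(predictions, answer_key)
    )
    return correct / len(answer_key)
```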

Human Evaluation

This part describes direct assessment by human evaluators. Human evaluators can analyze RPLA responses in depth, but the process is slow, costly, and hard to reproduce. The paper also notes that evaluators with a deep understanding of a given character are hard to find, and it introduces attempts in some studies to combine automatic and human evaluation by fine-tuning a judge LLM.

6. Individualized Persona(lization)

6.1 Definition

An individualized persona refers to tailoring an LLM-based agent by reflecting a user's unique traits, experiences, and preferences. This makes it possible to offer services optimized for each user's needs, such as digital clones and personal assistants.

Applications of personalized RPLAs fall into three main areas.

  • Conversation: Supporting interaction that matches the user's style and tastes
  • Recommendation: Providing tailored recommendations that reflect individual preferences
  • Task solving: Carrying out complex tasks autonomously

Building an individualized persona involves two main stages.

  • Persona data collection: Gathering data in various forms, such as user profiles, conversation histories, and domain knowledge.
  • Persona modeling: Designing a model that overcomes noise and sparsity in the large volume of collected data and effectively internalizes the user's unique traits.

6.2 Data Collection of Individualized Persona

The data to be collected for an individualized persona consists of three main types.

  • Profile data: Basic information such as the user's age, gender, and occupation.
  • Interaction data: The user's conversation history and behavioral patterns.
  • Domain knowledge: Specialized information related to the user's interests or a particular field.

Because the collected data can be large, sparse, and noisy, effective preprocessing and integration are essential.

6.3 Modeling Individualized Persona

This section introduces the goals of, and the need for, individualized persona modeling, and gives an overview of how two main learning strategies, offline learning (batch training in advance) and online learning (real-time updates), are used to complement each other.

Offline Learning

Offline learning trains an initial model on static data collected in the past, such as user profiles, conversation histories, and domain knowledge. Data preprocessing and integration are used to overcome noise and sparsity and to build a model that stably internalizes the user's unique traits. This lays the foundation of the initial persona representation, on which later real-time updates build.

Online Learning

Online learning updates the model continuously through interaction with real users. By incorporating user feedback and the latest interaction data, the individualized persona captures in real time how the user's traits change. This makes the model more flexible, so that its tailored responses are based on up-to-date user information.

The paper stresses the need to balance the initial persona built through offline learning against the fresh information incorporated through online learning.
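
The balance between the offline baseline and online updates can be illustrated with a small numeric sketch. It assumes, purely for illustration, that a persona is a vector of preference scores, that the offline step is a plain average, and that the online step is an exponential moving average; none of these choices comes from the survey. The hypothetical parameter `alpha` controls the trade-off: values near 0 favor the offline profile, values near 1 favor the newest observation.

```python
def offline_profile(history: list) -> list:
    """Offline step: average historical preference vectors into an initial profile."""
    if not history:
        raise ValueError("history must contain at least one vector")
    n = len(history)
    return [sum(column) / n for column in zip(*history)]


def online_update(profile: list, observation: list, alpha: float = 0.2) -> list:
    """Online step: blend a new observation into the profile (exponential moving average)."""
    return [(1 - alpha) * p + alpha * o for p, o in zip(profile, observation)]
```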

6.4 Evaluation for LLMs and Individualized Persona

The evaluation criteria focus on how effectively the LLM reflects an individual user's traits and whether it provides tailored responses across different application settings.

The evaluation items are grouped by use case (conversation, recommendation, and task solving), and each category has its own detailed criteria.

Conversation

The main evaluation factors are the immersiveness and degree of personalization of the conversation, including how well the agent matches the user's conversational and writing style and generates responses that fit the situation.

Recommendation

The focus is on whether the agent produces relevant recommendations that reflect user preferences and past interactions, and whether the recommendation process unfolds naturally within a multi-turn conversation.

Task Solving

This examines whether the agent can make effective use of individual user data to handle higher-level tasks autonomously, such as domain-specific problem solving and planning, and how practical the results are.

7. Risks Beneath RPLA Applications

7.1 Toxicity

Inherent Toxicity in LLMs

As LLMs learn from large-scale text data, they also pick up the negative language patterns, social biases, and stereotypes embedded in that data. As a result, the models can by default generate toxic language, which may lead unintentionally to harmful or offensive output.

The RPLAs Conundrum

Role-playing language agents (RPLAs) deliberately take on a variety of personality traits in order to simulate specific personas. This creates a conundrum: realizing the persona can make toxic expressions or inappropriate behavior more prominent. In other words, the more one tries to increase the realism and immersion of the role, the greater the risk of exposing the toxicity that was already latent in the model.

Strategies for Balancing Safety and Performance

Various strategies are discussed for mitigating toxicity while preserving the agent's role-playing ability. They include reinforcement learning (e.g., reinforcement learning from human feedback, RLHF), more refined prompt design, and safety filters and post-processing techniques, all aimed at producing output that is both low in toxicity and high in quality.

7.2 Bias

Bias Manifestation in Role-Playing Scenarios

In role-playing settings, LLMs often reproduce the social and cultural biases present in their training data as they are. Stereotyped descriptions or inaccurate portrayals of particular people or groups can appear, which harms the user experience.

Causes of Bias in RPLAs

The main causes of bias are imbalanced training data, limitations in model design, and prejudices present across society. Large-scale web-crawled data in particular contains many kinds of bias, and an RPLA trained on it as is can produce biased personas.

Strategies for Mitigating Bias

Several strategies are proposed for mitigating bias, including diversifying data and building debiased datasets, algorithmic corrections, and improved prompt design. A post-processing step that detects and corrects bias during evaluation also plays an important role, and continuous monitoring and updating are needed.

Persona Construction Bias

The traits that are selected, the way they are described, and the data sources used while constructing a persona can themselves reinforce bias. Persona design therefore also needs to reflect diverse perspectives and balanced information, so that distorted portrayals of particular groups or individuals are kept to a minimum.

7.3 Hallucination

Hallucination in RPLAs

LLMs sometimes exhibit "hallucination", generating information that is not grounded in their training data. In role-playing in particular, a model may overgeneralize a character's background or traits, or add inaccurate details.

Mitigating Hallucinations in RPLAs

To reduce hallucination, retrieval-augmented generation (external knowledge retrieval), data augmentation, and fine-tuning are applied. Automatic and human evaluation procedures that verify the factuality of generated text are also emphasized as a way to lower hallucination frequency and produce more reliable output.
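To make the retrieval-augmented idea concrete, the sketch below picks the character facts that overlap most with the user's question and places only those in the prompt. It assumes a tiny in-memory fact list and plain word-overlap scoring; the names `retrieve` and `build_prompt` and the prompt wording are illustrative assumptions, and a real system would typically use embedding-based search instead.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, facts: list[str], k: int = 2) -> list[str]:
    """Return the k facts sharing the most tokens with the query."""
    q = tokenize(query)
    ranked = sorted(facts, key=lambda f: len(q & tokenize(f)), reverse=True)
    return ranked[:k]


def build_prompt(persona: str, query: str, facts: list[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the persona in the retrieved facts."""
    context = "\n".join(f"- {fact}" for fact in retrieve(query, facts, k))
    return (
        f"You are {persona}. Answer using only these facts:\n{context}\n"
        "If the facts do not cover the question, say you do not know.\n"
        f"User: {query}"
    )
```

For example, with facts about Sherlock Holmes, asking "Which street does Holmes live on?" with `k=1` puts only the Baker Street fact into the prompt, so the model has less room to invent details about the character.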

7.4 Privacy Violations

Privacy Challenges in LLMs

While training on large-scale data, LLMs may inadvertently learn sensitive or personal information. As a result, generated text can contain unwanted personal information, and data-leakage risks can arise.

Hidden Danger of Privacy Violations in RPLAs

Because RPLAs use individual users' data to provide personalized services, personal information can be exposed in even less visible ways. The authors warn that privacy violations can occur if users' conversation histories or behavioral patterns are handled improperly.

Strategies for Enhancing Privacy

Data anonymization, secure storage and access control, and differential privacy are proposed for protecting personal information. Pre-filtering and post-processing steps that keep the model from learning sensitive information are also important, and introducing a real-time monitoring system is considered as well.
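The pre-filtering step can be pictured as a small anonymization pass over conversation logs before they are stored or used for training. The sketch below is only an assumption-laden example: the two regular expressions cover simple email addresses and hyphenated phone numbers, and the placeholder tags `[EMAIL]` and `[PHONE]` are made up for illustration. Real PII detection needs far broader coverage.

```python
import re

# Simplified patterns for illustration; they will miss many real-world formats.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b")


def redact(text: str) -> str:
    """Replace email addresses and hyphenated phone numbers with placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

Running `redact` on each user turn before it reaches a persona memory or a fine-tuning set keeps the conversational content while dropping the identifiers.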

7.5 Technical Challenges in Real-world Deployment

Lack of Social Intelligence and Theory of Mind

In real social interaction, LLMs lack human-like social intelligence and the ability to predict what other people are thinking. This makes it hard for them to accurately grasp complex social context, subtle emotional expression, and the nuances of interaction.

Long-context Challenges

LLMs are limited in how well they can maintain context over long conversations or complex scenarios. Because the context window is fixed, important information may be dropped or consistency may degrade during extended interactions.
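One common workaround for a fixed context window is to always keep the persona description and fill the remaining budget with the most recent turns. The sketch below shows that idea under simplifying assumptions: tokens are approximated by whitespace-separated words, and `fit_to_window` is a hypothetical helper name rather than anything from the survey.

```python
def count_tokens(text: str) -> int:
    """Approximate the token count by whitespace-separated words."""
    return len(text.split())


def fit_to_window(persona: str, turns: list[str], budget: int) -> list[str]:
    """Keep the persona plus as many of the most recent turns as fit in the budget."""
    remaining = budget - count_tokens(persona)
    kept: list[str] = []
    # Walk backwards from the newest turn and stop at the first one that does not fit.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    # Restore chronological order before returning.
    return [persona] + kept[::-1]
```

This also makes the failure mode visible: whatever falls outside the budget, such as an early turn where the user stated a preference, is simply gone, which is why long interactions lose consistency unless older turns are summarized or stored in an external memory.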

Knowledge Gaps

LLMs can have gaps in up-to-date information or domain-specific expertise, which creates a risk of giving inaccurate or incomplete information during role-play. The difficulty of reflecting constantly changing real-time information and recent trends is emphasized in particular.

7.6 Anthropomorphism

Social Isolation

If highly human-like agents come to replace interaction with real people, they may trigger social isolation. In particular, if personalized RPLAs are seen as substitutes for human relationships, the quality of human interaction may decline.

Manipulation of Public Opinion

Because RPLAs look and act human, they carry a potential risk of influencing public opinion and social issues. The authors caution in particular that, in political or social contexts, artificially designed personas could be used to spread misinformation or manipulate public opinion.

8. Closing Remarks

The authors emphasize that RPLAs can deliver user-tailored interaction through diverse persona implementations and personalization techniques, while pointing out that problems such as safety, bias, and hallucination remain to be solved. The chapter also sets out directions for follow-up research and discusses future potential and application areas.

Future Directions on RPLA Systems

This part lays out the research directions and challenges ahead for RPLA systems. The authors explain that improvements are needed on many fronts to overcome current limitations and build more sophisticated and flexible role-playing agents. They highlight decision support, personalized services, and autonomous social simulation as areas where new research opportunities will open up, and propose concrete research ideas for each direction.

Causal Data Analysis for Decision-making:

This direction addresses the importance of analyzing causal relationships in the data that arises from an RPLA's interactions with users. Rather than stopping at simple correlation, identifying which factors directly affect outcomes is presented as a way to help agents make more accurate and reliable decisions. Such causal analysis is expected to play an important role in improving decision-making by letting the model filter out unnecessary noise and focus on the key variables.

Improved Decision-making:

This direction describes strategies for further strengthening RPLAs' decision-making ability. It discusses introducing advanced reasoning algorithms and real-time feedback mechanisms so that agents can make logical, consistent decisions even in complex situations. The goal is to go beyond merely imitating human behavior and build a systematic decision-making process that accounts for many variables and for uncertainty.

RPLA as Personal Assistants for Personal Decision-making:

๊ฐœ์ธํ™”๋œ RPLA๊ฐ€ ์‚ฌ์šฉ์ž์˜ ์ผ์ƒ์ ์ธ ์˜์‚ฌ๊ฒฐ์ •์„ ์ง€์›ํ•˜๋Š” ๊ฐœ์ธ ๋น„์„œ๋กœ ๋ฐœ์ „ํ•  ๊ฐ€๋Šฅ์„ฑ์„ ์ œ์‹œํ•ฉ๋‹ˆ๋‹ค. ์—์ด์ „ํŠธ๊ฐ€ ์‚ฌ์šฉ์ž์˜ ํ”„๋กœํ•„, ๋Œ€ํ™” ๊ธฐ๋ก, ๊ทธ๋ฆฌ๊ณ  ์„ ํ˜ธ ๋ฐ์ดํ„ฐ๋ฅผ ์‹ค์‹œ๊ฐ„์œผ๋กœ ๋ฐ˜์˜ํ•จ์œผ๋กœ์จ, ๊ฐœ์ธ๋ณ„๋กœ ๋งž์ถคํ˜• ์กฐ์–ธ๊ณผ ๊ฒฐ์ •์„ ์ œ๊ณตํ•  ์ˆ˜ ์žˆ๋Š” ๋ฐฉํ–ฅ์„ ๋ชจ์ƒ‰ํ•ฉ๋‹ˆ๋‹ค. ์ด๋กœ ์ธํ•ด, ์‚ฌ์šฉ์ž๋Š” ๋ณด๋‹ค ํšจ์œจ์ ์ด๊ณ , ๊ฐœ์ธํ™”๋œ ๋ฐฉ์‹์œผ๋กœ ์ผ์ƒ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, RPLA๋Š” ๋‹จ์ˆœํ•œ ์—ญํ•  ๋†€์ด๋ฅผ ๋„˜์–ด์„œ ์‹ค์งˆ์ ์ธ ๊ฐœ์ธ ๋น„์„œ๋กœ์„œ์˜ ์—ญํ• ์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

Social Simulation through Autonomous Role-Playing:

This direction discusses the possibility of RPLAs developing into social simulation systems in which many agents interact autonomously and mimic real social situations. This could grow into a research area that reproduces complex group decision-making, social norms, and social dynamics while also offering insight into the varied behavioral patterns of human society. Such autonomous social simulation suggests that RPLAs can be useful not only in interactions with individual users but also in collaborative or competitive group settings.

