Aug 15, 2023

23.08.15 (Tue)

zoomg

Database administrators (DBAs) play a crucial role in managing, maintaining and optimizing a database system to ensure data availability, performance, and reliability. However, it is hard and tedious for DBAs to manage a large number of database instances (e.g., millions of instances on the cloud da…

arXiv.org · Xuanhe Zhou

AgentBench: Evaluating LLMs as Agents

Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks. As a result, there has been an urgent need to evaluate LLMs as agents on challenging tasks in interactive environments. We present AgentBench, a multi-di…

arXiv.org · Xiao Liu

Studying Large Language Model Generalization with Influence Functions

When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the m…

arXiv.org · Roger Grosse

Pre-Trained Large Language Models for Industrial Control

For industrial control, developing high-performance controllers with few samples and low technical debt is appealing. Foundation models, possessing rich prior knowledge obtained from pre-training with Internet-scale corpus, have the potential to be a good controller with proper prompts. In this pape…

arXiv.org · Lei Song

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models’ Alignment

Ensuring alignment, which refers to making models behave in accordance with human intentions [1,2], has become a critical task before deploying large language models (LLMs) in real-world applications. For instance, OpenAI devoted six months to iteratively aligning GPT-4 before its release [3]. Howev…

arXiv.org · Yang Liu

LLM Agent Evaluation