Papers

ICLR2024

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

NeurIPS2023

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

The Rise and Potential of Large Language Model Based Agents: A Survey

NeurIPS2023

Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change)

NeurIPS2023

AdaPlanner: Adaptive Planning from Feedback with Language Models

NeurIPS2023

Self-Refine: Iterative Refinement with Self-Feedback

NeurIPS2023

Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning

Augmented Language Models: a Survey

NeurIPS2023

Toolformer: Language Models Can Teach Themselves to Use Tools

EMNLP2023

ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games

Making Large Language Models into World Models with Precondition and Effect Knowledge

TaskBench: Benchmarking Large Language Models for Task Automation

MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use

ICLR2023

ReAct: Synergizing Reasoning and Acting in Language Models

EMNLP2023

API-Bank: A Benchmark for Tool-Augmented LLMs

Learning From Mistakes Makes LLM Better Reasoner

Learning From Correctness Without Prompting Makes LLM Efficient Reasoner

NeurIPS2023

PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

NeurIPS2023

On the Planning Abilities of Large Language Models - A Critical Investigation
