Papers

12094 papers

ICLR2025

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

preference optimization; the principle of optimism/pessimism; RLHF theory

ICLR2025

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

LLM; Agent; LLM-based Agent

ICLR2025

MaestroMotif: Skill Design from Artificial Intelligence Feedback

Hierarchical RL; Reinforcement Learning; LLMs

ICLR2025

Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing

Large Language Model Pruning; Probe Pruning

ICLR2025

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Benchmark; Evaluation; Large Language Model

ICLR2025

Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors

machine-generated text detection; evade detection; fine-tuning

ICLR2025

Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling

Learning-Guided Optimization; Rolling Horizon Optimization; Flexible Job Shop Scheduling

ICLR2025

ImProver: Agent-Based Automated Proof Optimization

Automated Proof Optimization; Neural Theorem Proving; Formal Mathematics

ICLR2025

Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models

Large Language Models; Mathematical Problem-Solving; Verifiers

ICLR2025

Reward Learning from Multiple Feedback Types

Reinforcement Learning; RLHF; Machine Learning

ICLR2025

Directional Gradient Projection for Robust Fine-Tuning of Foundation Models

Fine-tuning; transfer learning; foundation models

ICLR2025

When narrower is better: the narrow width limit of Bayesian parallel branching neural networks

Bayesian Networks; Gaussian Process; Kernel Renormalization

ICLR2025

Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration

explore-exploit; stochastic Hopfield network; Thompson sampling

ICLR2025

Discovering Influential Neuron Path in Vision Transformers

Explainability; Vision Transformer; Neuron

ICLR2025

Learning-Augmented Frequent Directions

learning-augmented algorithms; algorithms with predictions; data streams

ICLR2025

Can Watermarks be Used to Detect LLM IP Infringement For Free?

large language models; watermark; model copyright

ICLR2025

More Experts Than Galaxies: Conditionally-Overlapping Experts with Biologically-Inspired Fixed Routing

Deep learning; Mixture of Experts; Modularity

ICLR2025

In-context Time Series Predictor

Time Series Forecasting; In-context Learning; Transformer

ICLR2025

Uncertainty Herding: One Active Learning Method for All Label Budgets

Active learning

ICLR2025

Efficient Biological Data Acquisition through Inference Set Design

Active Learning; Data Acquisition; ML for Drug Discovery
