Papers

12094 papers

ICLR2025

ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization

Offline Reinforcement Learning, Multi-Agent Reinforcement Learning, Stationary Distribution Correction Estimation
ICLR2025

Beyond Random Masking: When Dropout meets Graph Convolutional Networks

Graph neural networks, Dropout
ICLR2025

Self-supervised contrastive learning performs non-linear system identification

system identification, dynamics learning, identifiability
ICLR2025

DarkBench: Benchmarking Dark Patterns in Large Language Models

Dark Patterns, AI Deception, Large Language Models
ICLR2025

Sparse autoencoders reveal selective remapping of visual concepts during adaptation

interpretability, vision-language models, sparse autoencoder
ICLR2025

Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery

LLM Evaluation, Medical Evaluation, Large Language Model
ICLR2025

How new data permeates LLM knowledge and how to dilute it

fine-tuning, hallucinations, knowledge injection
ICLR2025

SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement

agents, LLM, SWE-agents
ICLR2025

Language Models are Advanced Anonymizers

privacy, anonymization, large language models
ICLR2025

ADAM: An Embodied Causal Agent in Open-World Environments

embodied agent, causality, large language model
ICLR2025

Clique Number Estimation via Differentiable Functions of Adjacency Matrix Permutations

Graph neural network, distant supervision
ICLR2025

Expected Return Symmetries

multi-agent reinforcement learning, zero-shot coordination
ICLR2025

Herald: A Natural Language Annotated Lean 4 Dataset

Lean 4, Autoformalizing, LLM
ICLR2025

Beware of Calibration Data for Pruning Large Language Models

calibration data, post-training pruning, large language models
UAI2024

DistriBlock: Identifying adversarial audio samples by leveraging characteristics of the output distribution

Audio adversarial examples, ASR, Machine Learning
UAI2024

Consistency Regularization for Domain Generalization with Logit Attribution Matching

domain generalization, consistency regularization, causality
UAI2024

Decentralized Online Learning in General-Sum Stackelberg Games

Stackelberg games, Bandits, Online learning
UAI2024

Graph Feedback Bandits with Similar Arms

online learning, bandit
UAI2024

Early-Exit Neural Networks with Nested Prediction Sets

early-exit neural networks, uncertainty quantification, anytime-valid confidence sequence
UAI2024

Low-rank Matrix Bandits with Heavy-tailed Rewards

contextual bandit