NeurIPS Spotlight Session
ERBench: An Entity-Relationship-Based Automatically Verifiable Hallucination Benchmark for Large Language Models
Jio Oh, Soyoung Kim, Junseok Seo, Jindong Wang, Ruochen Xu, Xing Xie, Steven Euijong Whang
To analyze LLMs thoroughly, the authors propose ERBench, which automatically converts any relational database into a benchmark based on the entity-relationship (ER) model.
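For a concrete picture of the idea, here is a minimal, hypothetical sketch (not ERBench's actual pipeline; the schema, question template, and string-match check are all assumptions) of how records in a relational database can be turned into questions whose answers are verifiable against the database itself:

```python
# Hypothetical sketch: turn relational records into automatically verifiable QA pairs.
# The movie schema, the functional dependency title -> director, and the simple
# string-containment check below are illustrative assumptions, not ERBench's design.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (title TEXT PRIMARY KEY, year INTEGER, director TEXT)")
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [("Inception", 2010, "Christopher Nolan"), ("Parasite", 2019, "Bong Joon-ho")],
)

def make_qa_pairs():
    """Each question's gold answer comes straight from the database, so it can be checked automatically."""
    for title, year, director in conn.execute("SELECT title, year, director FROM movie"):
        yield f"Who directed the movie '{title}' ({year})?", director

for question, gold in make_qa_pairs():
    llm_answer = "..."  # placeholder: the LLM's response would go here
    hallucinated = gold.lower() not in llm_answer.lower()
    print(question, "| gold:", gold, "| hallucinated:", hallucinated)
```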
A study of plasticity loss in on-policy deep reinforcement learning
Arthur Juliani, Jordan Ash
The authors conduct extensive experiments on plasticity loss in on-policy deep reinforcement learning and evaluate a variety of mitigation methods.
Advancing spiking neural networks for sequential modeling with central pattern generators
Changze Lv, Dongqi Han, Yansen Wang, Xiaoqing Zheng, Xuanjing Huang, Dongsheng Li
CPG-PE is a novel positional encoding (PE) technique for spiking neural networks inspired by the human brain’s central pattern generator.
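As a loose illustration of the idea (a simplified stand-in, not the paper's CPG-PE formulation), a bank of phase-shifted oscillators can supply rhythmic position signals that are concatenated to the spike features at every time step:

```python
# Illustrative sketch of a CPG-style periodic positional encoding for a spike train.
# The oscillator frequencies, phases, and the concatenation scheme are assumptions,
# not the exact CPG-PE construction from the paper.
import torch

def cpg_positional_encoding(T: int, n_oscillators: int) -> torch.Tensor:
    t = torch.arange(T, dtype=torch.float32).unsqueeze(1)                        # (T, 1)
    phase = torch.arange(n_oscillators, dtype=torch.float32) * (2 * torch.pi / n_oscillators)
    freq = 1.0 / (2.0 ** torch.arange(n_oscillators, dtype=torch.float32))       # assumed frequencies
    return torch.sin(freq * t + phase)                                           # (T, n_oscillators)

T, batch, feat = 100, 8, 32
spikes = (torch.rand(T, batch, feat) < 0.1).float()                              # toy binary spike train
pe = cpg_positional_encoding(T, n_oscillators=8)                                 # (T, 8)
encoded = torch.cat([spikes, pe.unsqueeze(1).expand(T, batch, -1)], dim=-1)      # (T, batch, feat + 8)
print(encoded.shape)
```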
Assouad, Fano, and Le Cam with interaction: A unifying lower bound framework and characterization for bandit learnability
Fan Chen, Dylan J. Foster, Yanjun Han, Jian Qian, Alexander Rakhlin, Yunbei Xu
The authors develop a unified framework for lower bound methods in statistical estimation and interactive decision making, offering a single view of these previously distinct methodologies.
BPQP: A differentiable convex optimization framework for efficient end-to-end learning
Xiao Yang, Xu Yang, Weiqing Liu, Lewen Wang, Jiang Bian
To increase efficiency, the authors took advantage of the structural properties of the Karush-Kuhn-Tucker (KKT) matrix to reformulate the backward pass as a simplified and decoupled quadratic programming problem.
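To see why the KKT structure matters, consider the simplest case of an equality-constrained QP: differentiating through its solution needs only one more linear solve with the same KKT matrix. The sketch below covers only that reduced case and is not the BPQP algorithm itself, which reformulates the backward pass for general convex programs as a simplified, decoupled QP.

```python
# Simplified sketch of differentiating through an equality-constrained QP via its KKT system:
#   min 0.5 x^T Q x + c^T x   s.t.   A x = b
# The backward pass reduces to one extra linear solve with the same KKT matrix.
# This is an illustration of the general idea, not the BPQP method.
import numpy as np

def qp_forward(Q, c, A, b):
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])   # KKT matrix
    z = np.linalg.solve(K, np.concatenate([-c, b]))
    return z[:n], z[n:], K                            # x*, multipliers, KKT matrix

def qp_backward(K, grad_x, n):
    # Solve K^T d = [grad_x; 0]; gradients w.r.t. c and b fall out of d directly.
    d = np.linalg.solve(K.T, np.concatenate([grad_x, np.zeros(K.shape[0] - n)]))
    return -d[:n], d[n:]                              # dL/dc, dL/db

Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x_star, lam, K = qp_forward(Q, c, A, b)
grad_c, grad_b = qp_backward(K, grad_x=np.array([1.0, 0.0]), n=2)
print(x_star, grad_c, grad_b)
```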
Compositional generalization across distribution shifts with sparse tree operations
Paul Smolensky, Jianfeng Gao, Roland Fernandez
In this study, we investigate a unified neurosymbolic system in which transformations in the network can be interpreted simultaneously as symbolic and as neural computation, extending a unified neurosymbolic architecture.
Datasets and Lessons Learned from the 2024 SaTML LLM Capture the Flag Competition
Edoardo Debenedetti, Javier Rando, Daniel Paleka, Fineas Silaghi, Dragos Albastroiu, Niv Cohen, Yuval Lemberg, Reshmi Ghosh, Ahmed Salem, Rui Wen, Giovanni Cherubin, Santiago Zanella-Béguelin, Robin Schmid, Victor Klemm, Takahiro Miki, Chenghao Li, Stefan Kraft, Mario Fritz, Florian Tramèr, Sahar Abdelnabi, Lea Schönherr
This report summarizes insights from the Capture the Flag competition held at IEEE SaTML 2024, highlighting the challenges of defending large language model systems against prompt injection attacks.
Diffusion for world modeling: Visual details matter in Atari
Eloi Alonso, Adam Jelley, Vincent Micheli, Anssi Kanervisto, Amos Storkey, Tim Pearce, François Fleuret
In this work, we present DIAMOND (DIffusion As a Model Of eNvironment Dreams), an open-source reinforcement learning agent trained inside a diffusion world model.
DISCOVERYWORLD: A virtual environment for developing and evaluating automated scientific discovery agents
Peter Alexander Jansen, Marc-Alexandre Côté, Tushar Khot, Erin Bransom, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Øyvind Tafjord, Peter Clark
DISCOVERYWORLD is an open-source virtual environment for developing and benchmarking an agent’s ability to perform a complete scientific discovery cycle, including 120 different tasks across a variety of topics.
Efficient adversarial training in LLMs with continuous attacks
Sophie Xhonneux, Alessandro Sordoni, Stephan Günnemann, Gauthier Gidel, Leo Schwinn
In this work, we introduce an efficient adversarial training approach that computes adversarial attacks in the continuous embedding space of LLMs.
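In spirit (though not the paper's exact algorithm), a continuous attack perturbs the token embeddings directly with a few gradient steps instead of searching over discrete tokens. The sketch below assumes a HuggingFace-style causal LM that accepts `inputs_embeds`; the model name, loss target, and step sizes are illustrative.

```python
# Rough sketch of a continuous (embedding-space) adversarial perturbation.
# Assumes a HuggingFace-style causal LM; "gpt2", the objective, and the
# hyperparameters are illustrative placeholders, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The safe answer to this request is", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids).detach()
delta = torch.zeros_like(embeds, requires_grad=True)   # continuous perturbation
eps, step = 0.05, 0.01

for _ in range(10):
    out = model(inputs_embeds=embeds + delta, labels=ids)
    loss = -out.loss                                   # maximize the LM loss on the original tokens
    loss.backward()
    with torch.no_grad():
        delta -= step * delta.grad.sign()              # signed gradient step on the perturbation
        delta.clamp_(-eps, eps)                        # keep the attack in a small L_inf ball
        delta.grad.zero_()
```

During adversarial training, such perturbed embeddings would be fed back into the training loss; because no discrete token search is needed, each attack costs only a few forward/backward passes.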
Generalized linear bandits with limited adaptivity
Ayush Sawarni, Nirjhar Das, Siddharth Barman, Gaurav Sinha
In this paper, we study the generalized linear contextual bandit problem under limited adaptivity and introduce two algorithms, B-GLinCB and RS-GLinCB, that address two common settings.
Human-aware vision-and-language navigation: Bridging simulation to reality with dynamic human interactions
Minghan Li, Heng Li, Zhi-Qi Cheng, Yifei Dong, Yuxuan Zhou, Jun-Yan He, Qi Dai, Teruko Mitamura, Alexander G. Hauptmann
In this study, we introduce Human-Aware Vision-and-Language Navigation (HA-VLN), which extends traditional VLN by incorporating dynamic human activities and relaxing key assumptions.
Identifying equivalent training dynamics
William T. Redman, Juan M. Bello-Rivas, Maria Fonoberova, Ryan Mohr, Ioannis G. Kevrekidis, Igor Mezić
The authors exploit advances in Koopman operator theory to develop a framework for identifying conjugate and nonconjugate training dynamics.
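One concrete Koopman-flavored tool is dynamic mode decomposition (DMD) applied to checkpoints of the flattened weights; comparing the resulting spectra across training runs gives a rough, simplified probe of whether their dynamics look conjugate. This is only an illustration, not the authors' full framework.

```python
# Minimal DMD sketch over a trajectory of flattened network weights recorded during training.
# The random-walk "trajectory" stands in for real checkpoints; rank and setup are assumptions.
import numpy as np

def dmd_eigenvalues(weight_trajectory: np.ndarray, rank: int = 10) -> np.ndarray:
    """weight_trajectory: (n_params, n_checkpoints) matrix of flattened weights over training."""
    X, Y = weight_trajectory[:, :-1], weight_trajectory[:, 1:]
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = min(rank, len(s))
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    A_tilde = U.T @ Y @ Vt.T @ np.diag(1.0 / s)        # reduced linear operator fit to the dynamics
    return np.linalg.eigvals(A_tilde)

rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=(500, 50)), axis=1)   # stand-in for 50 recorded checkpoints
print(np.sort_complex(dmd_eigenvalues(traj)))          # compare these spectra across runs
```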
Procgen’s implicit curriculum becomes explicit
Kaixin Wang, Xinchao Wang
This study investigates the learning process under multi-level training in Procgen, revealing a gradual shift from easy to hard contexts and suggesting that multi-level training induces an implicit curriculum.
Is behavior cloning all you need? Understanding horizon in imitation learning
Dylan J. Foster, Adam Block, Dipendra Misra
The authors show that horizon-independent sample complexity is achievable in offline imitation learning when the range of the cumulative payoffs and an appropriate notion of supervised learning complexity for the policy class are controlled.
MInference: Accelerating prefilling of long-context LLMs with dynamic sparse attention
Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu
MInference is a sparse computation method designed to accelerate the prefilling stage of long-sequence processing; it identifies three distinct patterns in long-context attention matrices that can be exploited for efficient sparse computation on GPUs.
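As a toy illustration of the kind of sparsity being exploited, the sketch below computes attention where each query block attends only to a hand-picked subset of key/value blocks; the block size and selection rule here are placeholders, whereas MInference determines its sparse patterns (such as A-shape, vertical-slash, and block-sparse) dynamically per attention head.

```python
# Toy block-sparse attention: each query block attends only to selected key/value blocks.
# Block size and the `keep` selection below are illustrative assumptions.
import torch

def block_sparse_attention(q, k, v, block, keep):
    """q, k, v: (seq, dim). keep[i] lists the key-block indices that query block i attends to."""
    seq, dim = q.shape
    out = torch.zeros_like(q)
    for i in range(seq // block):
        q_blk = q[i * block:(i + 1) * block]                                  # (block, dim)
        cols = torch.cat([torch.arange(j * block, (j + 1) * block) for j in keep[i]])
        scores = q_blk @ k[cols].T / dim ** 0.5                               # score only the kept blocks
        out[i * block:(i + 1) * block] = torch.softmax(scores, dim=-1) @ v[cols]
    return out

seq, dim, block = 64, 32, 16
q, k, v = (torch.randn(seq, dim) for _ in range(3))
keep = {0: [0], 1: [0, 1], 2: [0, 2], 3: [0, 3]}          # e.g. keep the first block plus the local block
print(block_sparse_attention(q, k, v, block, keep).shape)
```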
The power of resets in online reinforcement learning
Zakaria Mhammedi, Dylan J. Foster, Alexander Rakhlin
In this study, we explore the power of simulators through reinforcement learning with local simulator access, an RL protocol that allows the agent to reset to previously observed states and follow their dynamics during training.
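The protocol is easy to picture in pseudocode: alongside ordinary resets, the agent may occasionally reset the simulator to a state it has already visited. The `get_state`/`set_state` hooks below are hypothetical (standard Gym environments do not expose them), so this is a sketch of the protocol rather than of the paper's algorithms.

```python
# Sketch of RL with local simulator access: the agent can reset the simulator to
# previously observed states. `env` and `policy` are hypothetical objects with the
# methods used below; none of this corresponds to a specific library API.
import random

def train_with_local_resets(env, policy, episodes=100, revisit_prob=0.3):
    visited = []                                             # simulator states observed so far
    for _ in range(episodes):
        if visited and random.random() < revisit_prob:
            obs = env.set_state(random.choice(visited))      # reset to a previously observed state
        else:
            obs = env.reset()                                # ordinary reset to the initial distribution
        done = False
        while not done:
            action = policy.act(obs)
            obs, reward, done = env.step(action)
            visited.append(env.get_state())                  # remember this state for future resets
            policy.update(obs, reward, done)
```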
VideoGUI: Benchmarking GUI automation from instructional videos
Kevin Qinghong Lin, Linjie Li, Difei Gao, Qinchen Wu, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou
In this study, we introduce VideoGUI, a new multimodal benchmark designed to evaluate GUI assistants on visually-centric GUI tasks through a hierarchical process, allowing us to identify specific levels at which they may fail.
Voila-A: Aligning vision-language models with user's gaze attention
Kun Yan, Lei Ji, Zeyu Wang, Yuntao Wang, Nan Duan, Shuai Ma
The authors introduce Voila-A, a novel approach to gaze alignment that incorporates gaze information, which can be collected by AR or VR devices, to improve the interpretability and effectiveness of these models in real-world applications.
Explore 100+ accepted papers from Microsoft
Microsoft at ML4H 2024
Co-located with NeurIPS is the AHLI Machine Learning for Health (ML4H) Symposium, an event that unites machine learning researchers, clinicians, and health data experts to advance AI applications in healthcare. Microsoft’s contribution of four papers to this symposium underscores our commitment to improving medical imaging and clinical workflows through AI, with a focus on accuracy, efficiency, and interpretability.