Scientific Inference
Scientific inference, the process of drawing conclusions from data, is a core challenge across scientific fields. Current research focuses on making inference both faster and more accurate. Work in this area develops new algorithms and architectures, including Bayesian networks, diffusion transformers, and autoregressive models, and applies them in settings such as large language models and image processing. These advances matter for accelerating scientific discovery and for applications such as personalized medicine, legal technology, and industrial automation, where inference must be efficient and reliable. Two concerns recur: reducing computational bottlenecks and keeping inferences reliable when data are limited or models are complex.
Papers
KV Prediction for Improved Time to First Token
Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi Jin, Sachin Mehta, Mohammad Rastegari, Moin Nabi
Cost-aware Simulation-based Inference
Ayush Bharti, Daolang Huang, Samuel Kaski, François-Xavier Briol
Hybrid Summary Statistics
T. Lucas Makinen, Ce Sui, Benjamin D. Wandelt, Natalia Porqueres, Alan Heavens
Distributed Inference on Mobile Edge and Cloud: An Early Exit based Clustering Approach
Divya Jyoti Bajpai, Manjesh Kumar Hanawal
Inference Scaling for Long-Context Retrieval Augmented Generation
Zhenrui Yue, Honglei Zhuang, Aijun Bai, Kai Hui, Rolf Jagerman, Hansi Zeng, Zhen Qin, Dong Wang, Xuanhui Wang, Michael Bendersky
HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration
Yushi Huang, Zining Wang, Ruihao Gong, Jing Liu, Xinjie Zhang, Jinyang Guo, Xianglong Liu, Jun Zhang
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
Suyu Ge, Xihui Lin, Yunan Zhang, Jiawei Han, Hao Peng
Revisiting Hierarchical Text Classification: Inference and Metrics
Roman Plaud, Matthieu Labeau, Antoine Saillenfest, Thomas Bonald
Amortized Bayesian Workflow (Extended Abstract)
Marvin Schmitt, Chengkun Li, Aki Vehtari, Luigi Acerbi, Paul-Christian Bürkner, Stefan T. Radev
Introducing ELLIPS: An Ethics-Centered Approach to Research on LLM-Based Inference of Psychiatric Conditions
Roberta Rocca, Giada Pistilli, Kritika Maheshwari, Riccardo Fusaroli
Half-VAE: An Encoder-Free VAE to Bypass Explicit Inverse Mapping
Yuan-Hao Wei, Yan-Jie Sun, Chen Zhang