Full Model
"Full Model" research covers the development and improvement of large-scale machine learning models across diverse applications, with the aim of enhancing performance, efficiency, and robustness. Current research focuses on addressing model vulnerabilities (e.g., adversarial attacks, hallucinations), improving efficiency for resource-constrained devices, and developing specialized models for specific domains (e.g., finance, astronomy, medical imaging). This work matters both for advancing AI capabilities in these domains and for mitigating the risks of deploying complex models in real-world settings.
Papers
Multilingual Text Style Transfer: Datasets & Models for Indian Languages
Sourabrata Mukherjee, Atul Kr. Ojha, Akanksha Bansal, Deepak Alok, John P. McCrae, Ondřej Dušek
Skeleton-OOD: An End-to-End Skeleton-Based Model for Robust Out-of-Distribution Human Action Detection
Jing Xu, Anqi Zhu, Jingyu Lin, Qiuhong Ke, Cunjian Chen
Multi-Modal Generative Embedding Model
Feipeng Ma, Hongwei Xue, Guangting Wang, Yizhou Zhou, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun
Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Bernd Frauenknecht, Artur Eisele, Devdutt Subhasish, Friedrich Solowjow, Sebastian Trimpe
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee, Rajarshi Roy, Mengyao Xu, Jonathan Raiman, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping
Towards One Model for Classical Dimensionality Reduction: A Probabilistic Perspective on UMAP and t-SNE
Aditya Ravuri, Neil D. Lawrence