Text Modality
Text modality research studies how textual information can be integrated effectively with other data modalities (e.g., images, audio, video) to improve the performance and capabilities of AI models. Current work centers on multimodal transformer architectures and diffusion models, often incorporating techniques such as prompt tuning and meta-learning to improve controllability and generalization. This research matters because it enables AI systems that can understand and generate complex information across data types, with applications ranging from improved medical diagnosis to more realistic virtual environments.
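To make the prompt-tuning idea mentioned above concrete, the sketch below shows one common pattern: learnable "soft prompt" vectors are prepended to the token embeddings of a frozen text encoder, and only the prompts plus a small classification head are trained. This is a minimal, generic illustration, not the method of any paper listed here; the class names, the 512-dimensional embedding size, and the stand-in transformer backbone are all assumptions chosen for the example.

```python
# Minimal soft prompt tuning sketch (illustrative; all names and sizes are
# placeholders, not a specific model's API).
import torch
import torch.nn as nn


class SoftPromptClassifier(nn.Module):
    """Prepends learnable prompt vectors to token embeddings; only the prompts
    and the classification head are trained, the backbone stays frozen."""

    def __init__(self, backbone: nn.Module, embed_dim: int = 512,
                 n_prompts: int = 8, n_classes: int = 10):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():   # freeze the pretrained encoder
            p.requires_grad = False
        # Learnable "soft prompt" vectors, one row per virtual token.
        self.prompts = nn.Parameter(torch.randn(n_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, n_classes)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim) text-token embeddings.
        batch = token_embeds.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        x = torch.cat([prompts, token_embeds], dim=1)  # prepend prompt tokens
        feats = self.backbone(x)                       # (batch, seq, dim)
        return self.head(feats.mean(dim=1))            # pool + classify


# Toy usage with a stand-in transformer encoder as the frozen backbone.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=2,
)
model = SoftPromptClassifier(encoder)
dummy_text = torch.randn(4, 16, 512)   # (batch, tokens, dim)
logits = model(dummy_text)             # shape: (4, 10)
```

Because the backbone is frozen, only the prompt vectors and the linear head receive gradient updates, which is what makes prompt tuning a lightweight way to adapt large pretrained models to new tasks or modalities.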
Papers
Meta learning with language models: Challenges and opportunities in the classification of imbalanced text
Apostol Vassilev, Honglan Jin, Munawar Hasan
DetectGPT-SC: Improving Detection of Text Generated by Large Language Models through Self-Consistency with Masked Predictions
Rongsheng Wang, Qi Li, Sihong Xie
Toward Joint Language Modeling for Speech Units and Text
Ju-Chieh Chou, Chung-Ming Chien, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli
Fast Word Error Rate Estimation Using Self-Supervised Representations For Speech And Text
Chanho Park, Chengsong Lu, Mingjie Chen, Thomas Hain
Mitigating stereotypical biases in text to image generative systems
Piero Esposito, Parmida Atighehchian, Anastasis Germanidis, Deepti Ghadiyaram
Text Embeddings Reveal (Almost) As Much As Text
John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Alexander M. Rush
Humans and language models diverge when predicting repeating text
Aditya R. Vaidya, Javier Turek, Alexander G. Huth