K TOKEN
"K Token" research broadly explores the representation and utilization of information units ("tokens") in large language and multimodal models, aiming to improve efficiency, accuracy, and context understanding. Current research focuses on novel tokenization methods for diverse data types (text, images, video, audio), developing model architectures (like transformers) that effectively process these tokens, and evaluating their performance on various tasks including question answering, generation, and semantic understanding. This work is significant for advancing the capabilities of large models, enabling more efficient and accurate processing of complex information, and impacting applications ranging from natural language processing to computer vision.
Papers
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens
Jaehyeon Kim, Taehong Moon, Keon Lee, Jaewoong Cho
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens
Zhuqiang Lu, Zhenfei Yin, Mengwei He, Zhihui Wang, Zicheng Liu, Zhiyong Wang, Kun Hu
Byte Latent Transformer: Patches Scale Better Than Tokens
Artidoro Pagnoni, Ram Pasunuru, Pedro Rodriguez, John Nguyen, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srinivasan Iyer
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning
Zhu Xu, Zhiqiang Zhao, Zihan Zhang, Yuchi Liu, Quanwei Shen, Fei Liu, Yu Kuang, Jian He, Conglin Liu
Signs as Tokens: An Autoregressive Multilingual Sign Language Generator
Ronglai Zuo, Rolandos Alexandros Potamias, Evangelos Ververas, Jiankang Deng, Stefanos Zafeiriou
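Several of the papers above turn continuous signals into discrete tokens via vector quantization. As a rough sketch of residual vector quantization (RVQ), the general technique named in the first paper's title: each quantizer stage encodes the residual error left by the previous stage, so every input vector becomes a short sequence of codebook indices (one "token" per stage). This is a generic, hypothetical illustration with untrained random codebooks, not that paper's implementation; the names (rvq_encode, rvq_decode, num_stages) are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
num_stages, codebook_size, dim = 4, 64, 8
# In practice codebooks are learned; random ones suffice to show the mechanics.
codebooks = rng.normal(size=(num_stages, codebook_size, dim))

def rvq_encode(x: np.ndarray) -> list[int]:
    """Return one codebook index per stage for vector x."""
    indices, residual = [], x.copy()
    for stage in range(num_stages):
        # Pick the code nearest to the current residual, then subtract it
        # so the next stage quantizes what remains.
        dists = np.linalg.norm(codebooks[stage] - residual, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - codebooks[stage][idx]
    return indices

def rvq_decode(indices: list[int]) -> np.ndarray:
    """Sum the selected codes across stages to reconstruct the vector."""
    return sum(codebooks[s][i] for s, i in enumerate(indices))

x = rng.normal(size=dim)
codes = rvq_encode(x)
x_hat = rvq_decode(codes)
print(codes)                      # one discrete token per stage
print(np.linalg.norm(x - x_hat))  # error shrinks as stages are added
```

Because each stage only has to cover the residual of the last, a stack of small codebooks can approximate a vector far more precisely than a single codebook of the same total size, which is what makes RVQ tokens attractive for generative modeling.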