Positional Encoding
Positional encoding methods aim to incorporate information about the order and relative positions of elements within data sequences into neural network architectures, particularly transformers, which are inherently order-agnostic. Current research focuses on developing more effective positional encodings for various data types, including sequences, graphs, and even higher-dimensional structures like cell complexes, often tailoring encoding schemes to specific tasks (e.g., arithmetic, visual grounding, or time series forecasting) and model architectures (e.g., graph transformers, diffusion models). These advancements are crucial for improving the performance and generalization capabilities of deep learning models across numerous applications, ranging from natural language processing and computer vision to scientific simulations and process monitoring.
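For context, the baseline that most of these methods extend or replace is the fixed sinusoidal encoding from the original Transformer, which is simply added to the token embeddings before the first attention layer. The sketch below is a minimal, illustrative NumPy implementation of that baseline; the function and variable names are our own, not drawn from any of the papers listed here.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of fixed positional encodings.

    Even dimensions use sine, odd dimensions use cosine, with wavelengths
    forming a geometric progression from 2*pi up to 10000*2*pi.
    """
    positions = np.arange(seq_len)[:, np.newaxis]        # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]             # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                      # (seq_len, d_model)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])   # sine on even feature indices
    pe[:, 1::2] = np.cos(angles[:, 1::2])   # cosine on odd feature indices
    return pe

# Usage: add the encoding to token embeddings so an otherwise
# order-agnostic transformer can distinguish positions.
embeddings = np.random.randn(128, 512)                  # (seq_len, d_model)
embeddings_with_pos = embeddings + sinusoidal_positional_encoding(128, 512)
```

Because each position maps to a unique, smoothly varying pattern of phases, the model can learn to attend to relative offsets; the papers below explore alternatives such as layer-specific scaling of such encodings or dropping explicit positional encoding altogether.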
Papers
Layer-Specific Scaling of Positional Encodings for Superior Long-Context Modeling
Zhenghua Wang, Yiran Ding, Changze Lv, Zhibo Xu, Tianlong Li, Tianyuan Shi, Xiaoqing Zheng, Xuanjing Huang
Fudan University ● Westlake University ● Shanghai Key Laboratory of Intelligent Information Processing

LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang, Yaning Tan, Siyuan Liang, Linze Li, Ge Wu, Yuhao Chen, Shuheng Li, Zhenyu Zhao, Caihua Chen, Jiajun Liang, Yao Tang
JIIOV Technology ● Nanjing University ● Nankai University