Positional Encoding
Positional encoding methods inject information about the order and relative positions of elements in a sequence into neural network architectures, particularly transformers, which are otherwise order-agnostic. Current research focuses on designing more effective encodings for diverse data types, including sequences, graphs, and higher-dimensional structures such as cell complexes, often tailoring the scheme to a specific task (e.g., arithmetic, visual grounding, or time series forecasting) and model architecture (e.g., graph transformers, diffusion models). These advances are crucial for improving the performance and generalization of deep learning models across applications ranging from natural language processing and computer vision to scientific simulations and process monitoring.
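For concreteness, below is a minimal sketch of the simplest fixed scheme, the sinusoidal encoding introduced with the original Transformer, written in NumPy. The function name and the toy shapes are illustrative assumptions, not drawn from any of the papers listed below; the adaptive and topological encodings those papers propose replace or extend this kind of fixed table.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of fixed sinusoidal position codes."""
    positions = np.arange(seq_len)[:, None]      # (seq_len, 1)
    dims = np.arange(d_model)[None, :]           # (1, d_model)
    # Each pair of dimensions shares a frequency: 1 / 10000^(2i / d_model).
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates
    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions: sine
    encoding[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions: cosine
    return encoding

# The encoding is typically added to the token embeddings before the first
# attention layer, restoring order information to an otherwise order-agnostic model.
embeddings = np.random.randn(128, 512)           # toy (seq_len, d_model) embeddings
inputs = embeddings + sinusoidal_positional_encoding(128, 512)
```

Because each position maps to a deterministic pattern of sines and cosines at geometrically spaced frequencies, the model can attend to relative offsets without learning a separate embedding per position; the papers below study when such fixed choices fail (e.g., length extrapolation) and how to adapt them to other data structures.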
Papers
DAPE: Data-Adaptive Positional Encoding for Length Extrapolation
Chuanyang Zheng, Yihang Gao, Han Shi, Minbin Huang, Jingyao Li, Jing Xiong, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li
Attending to Topological Spaces: The Cellular Transformer
Rubén Ballester, Pablo Hernández-García, Mathilde Papillon, Claudio Battiloro, Nina Miolane, Tolga Birdal, Carles Casacuberta, Sergio Escalera, Mustafa Hajij