Relative Positional Encoding
Relative positional encoding (RPE) aims to improve the performance and generalization of transformer-based models by explicitly encoding the relative positions of tokens within a sequence, addressing a key limitation of absolute positional encodings, which generalize poorly to sequence lengths not seen during training. Current research focuses on novel RPE methods, including approaches based on orthogonal polynomials, hyperbolic functions, and multiple kernel learning, that aim to enhance length extrapolation, improve efficiency, and reduce positional bias in applications such as natural language processing, computer vision, and time series analysis. These advances matter because they enable more robust and efficient processing of longer sequences and improve model performance across diverse tasks, with impact ranging from machine translation to medical image analysis.
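Since RPE methods differ mainly in how relative offsets enter the attention computation, a minimal sketch may help make the idea concrete. The example below adds a learned per-head bias, indexed by the clipped relative offset j − i, to the attention logits, in the spirit of Shaw-style/T5-style relative attention; the class and parameter names (e.g., RelativeBiasAttention, max_distance) are illustrative and not taken from the papers listed below.

```python
# Minimal sketch of relative positional encoding as a learned bias on
# attention scores. Illustrative only; not the method of any specific
# paper cited on this page.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelativeBiasAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int, max_distance: int = 128):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.max_distance = max_distance
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # One learned scalar per head for each clipped relative offset
        # in [-max_distance, +max_distance].
        self.rel_bias = nn.Embedding(2 * max_distance + 1, num_heads)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, head_dim).
        q = q.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.num_heads, self.head_dim).transpose(1, 2)

        scores = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5  # (b, h, n, n)

        # Relative offsets j - i, clipped to the supported range and shifted
        # to valid embedding indices. The bias depends only on relative,
        # never absolute, position.
        pos = torch.arange(n, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_distance, self.max_distance)
        bias = self.rel_bias(rel + self.max_distance)          # (n, n, heads)
        scores = scores + bias.permute(2, 0, 1).unsqueeze(0)   # broadcast over batch

        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.out(out)


if __name__ == "__main__":
    layer = RelativeBiasAttention(dim=64, num_heads=4)
    x = torch.randn(2, 10, 64)
    print(layer(x).shape)  # torch.Size([2, 10, 64])
```

Because the bias is indexed only by the clipped offset j − i, the same learned table is reused at every position, which is one reason relative schemes tend to extrapolate better than absolute encodings to longer sequences (up to the clipping distance).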
Papers
Randomized Positional Encodings Boost Length Generalization of Transformers
Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness
Improving Position Encoding of Transformers for Multivariate Time Series Classification
Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Mahsa Salehi