Bounded Recall

Bounded recall, in machine learning and related fields, refers to algorithms and systems that may use only a limited amount of past information when making decisions or predictions. Current research focuses on efficient algorithms that achieve low regret despite this constraint, and explores techniques such as structured pruning, two-phase recall-and-select frameworks, and multi-candidate cross-encoding to improve speed while keeping accuracy at an acceptable level. These advances matter for resource-constrained applications such as robotics and edge computing, and for accelerating model selection and making large language models more efficient on knowledge-intensive tasks.
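As a rough illustration of the constraint itself, the sketch below shows an online learner that can only consult the last m rounds of feedback. The class name `BoundedRecallLeader`, the parameter `m`, and the follow-the-leader rule are assumptions chosen for illustration; they are not taken from any specific paper, and no regret guarantee is claimed for this naive rule.

```python
from collections import deque


class BoundedRecallLeader:
    """Hypothetical learner that recalls only the last `m` rounds of losses."""

    def __init__(self, n_actions, m):
        self.n_actions = n_actions
        # deque(maxlen=m) silently discards the oldest round once m are stored,
        # which is what enforces the bounded-recall constraint here.
        self.window = deque(maxlen=m)

    def choose(self):
        """Return the action with the lowest total loss inside the window."""
        if not self.window:
            return 0  # arbitrary default before any feedback arrives
        totals = [
            sum(round_losses[a] for round_losses in self.window)
            for a in range(self.n_actions)
        ]
        # Ties are broken in favor of the lowest action index.
        return min(range(self.n_actions), key=totals.__getitem__)

    def observe(self, losses):
        """Record one round of per-action losses."""
        self.window.append(tuple(losses))


learner = BoundedRecallLeader(n_actions=2, m=2)
learner.observe([0.0, 1.0])   # action 0 looks better
first_choice = learner.choose()
learner.observe([1.0, 0.0])
learner.observe([1.0, 0.0])   # the first round has now been forgotten
second_choice = learner.choose()
```

With m = 2, the first round drops out of memory after the third observation, so the learner's second choice depends only on the two most recent rounds. An unbounded learner would still have the first round available when choosing.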

Papers