First Stage Retrieval
First-stage retrieval selects a small candidate set of potentially relevant documents from a large corpus, which later, more expensive ranking stages then reorder. Because those stages can only rank what the first stage returns, a relevant document missed here cannot be recovered downstream. Current research aims to improve both retrieval accuracy and efficiency through dense retrieval models (often based on transformer architectures), novel indexing techniques such as composite codes, and frameworks that combine generative and retrieval models (e.g., "generate-then-read" pipelines). These advances improve the performance and scalability of information retrieval systems in applications such as question answering and machine translation.
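
As a rough illustration of the candidate-selection step, the following minimal sketch encodes documents offline, scores each against the query, and keeps only the top k for a later ranking stage. The names encode, score, and first_stage_retrieve are hypothetical, and the encoder is a toy normalised term-count vector standing in for a learned transformer-based encoder; a real system would also replace the exhaustive scan with an index.

```python
import heapq
import math
from collections import Counter


def encode(text):
    """Hypothetical stand-in for a learned encoder: L2-normalised term counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


def score(query_vec, doc_vec):
    """Dot product of two normalised vectors, i.e. cosine similarity."""
    return sum(w * doc_vec.get(term, 0.0) for term, w in query_vec.items())


def first_stage_retrieve(query, doc_vecs, k=2):
    """Return the ids of the k best-scoring documents as re-ranking candidates."""
    query_vec = encode(query)
    return heapq.nlargest(
        k, doc_vecs, key=lambda doc_id: score(query_vec, doc_vecs[doc_id])
    )


# Toy corpus (illustrative data only).
corpus = {
    "d1": "dense retrieval with transformer encoders",
    "d2": "recipes for sourdough bread",
    "d3": "indexing techniques for dense retrieval",
    "d4": "history of the printing press",
}

# Document vectors are computed once, ahead of query time.
doc_vecs = {doc_id: encode(text) for doc_id, text in corpus.items()}

candidates = first_stage_retrieve("dense retrieval indexing", doc_vecs, k=2)
print(candidates)  # ['d3', 'd1']
```

Only the returned candidates would be passed to a subsequent ranking stage, so the choice of k trades first-stage recall against the cost of that later stage.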