Document Similarity
Document similarity research quantifies how closely textual documents resemble one another, a capability central to plagiarism detection, information retrieval, and recommendation systems. Current work emphasizes efficiency: it seeks to avoid the quadratic cost in sequence length of transformer self-attention by exploring sparse graph representations and specialized embeddings tailored to specific aspects of a document. These advances aim to improve both accuracy and scalability, particularly for large corpora and morphologically rich languages, with applications ranging from biomedical literature analysis to financial auditing.
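To make "quantifying resemblance" concrete, the sketch below scores document pairs with cosine similarity over TF-IDF weighted bag-of-words vectors. This is a classical baseline chosen only for illustration, not one of the sparse-graph or specialized-embedding methods mentioned above. The function names, the smoothed IDF formula, and the sample sentences are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch): terms that occur in
        # every document still receive a small positive weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, invented for this illustration.
docs = [
    "The cat sat on the mat.",
    "A cat sat on a mat.",
    "Stock markets fell sharply today.",
]
vecs = tfidf_vectors(docs)
sim_close = cosine_similarity(vecs[0], vecs[1])  # overlapping vocabulary
sim_far = cosine_similarity(vecs[0], vecs[2])    # no shared terms
print(f"close pair: {sim_close:.3f}, distant pair: {sim_far:.3f}")
```

Because the vectors are sparse dictionaries, each comparison costs time proportional to the number of distinct terms in the first document rather than to the vocabulary size. Comparing every pair in a corpus is still quadratic in the number of documents, which is one reason large corpora motivate more scalable representations.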