Train-Test Discrepancies

Train-test discrepancies, in which differences between training and testing data degrade model performance, are a significant challenge across machine learning domains. Current research mitigates these discrepancies through techniques such as scale-invariant model designs, data preprocessing that reduces class overlap, and improved generalization, particularly for long-sequence processing (e.g., in LLMs) and for weakly supervised learning. Addressing these issues is crucial for building reliable, robust models in real-world applications, from information retrieval and image recognition to natural language processing.
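The core failure mode can be made concrete with a minimal sketch: a model that fits its training distribution well can still perform poorly on test inputs drawn from a different distribution. The quadratic target function, the linear model, and the specific input ranges below are illustrative assumptions, not taken from any particular paper.

```python
import numpy as np

# Illustrative setup: the true relation is quadratic, but we fit a
# straight line y ≈ w * x (no intercept) -- a deliberately misspecified model.
x_train = np.linspace(0.0, 1.0, 100)
y_train = x_train ** 2

# Closed-form least-squares solution for a single weight.
w = float(x_train @ y_train / (x_train @ x_train))

def mse(x):
    """Mean squared error of the linear fit against the true quadratic."""
    return float(np.mean((w * x - x ** 2) ** 2))

# In-distribution test set: inputs drawn from the same range as training.
err_iid = mse(np.linspace(0.0, 1.0, 100))

# Shifted test set: inputs well outside the training range.
err_shift = mse(np.linspace(3.0, 4.0, 100))

print(f"in-distribution MSE: {err_iid:.4f}")
print(f"shifted-test MSE:    {err_shift:.4f}")
```

The shifted-test error is orders of magnitude larger than the in-distribution error, even though the fitted model is unchanged: the discrepancy lies entirely in the mismatch between the training and testing input distributions.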

Papers