Large-Scale Benchmark
Large-scale benchmarks are datasets designed to rigorously evaluate machine learning models across diverse and challenging tasks, pushing the boundaries of model capabilities. Current research focuses on developing benchmarks for domains such as fluid dynamics, log parsing, image manipulation detection, and many aspects of video and image analysis, often evaluating deep learning architectures like transformers and convolutional neural networks. These benchmarks advance the field by providing standardized evaluation metrics and by facilitating the development of more robust and generalizable models, with significant implications for applications ranging from medical imaging to autonomous systems.
Papers
UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation
Guoqing Yang, Fuyou Xue, Qi Zhang, Ke Xie, Chi-Wing Fu, Hui Huang
ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos
Zhou Yu, Lixiang Zheng, Zhou Zhao, Fei Wu, Jianping Fan, Kui Ren, Jun Yu