Hyperparameter

Hyperparameters are the settings of a machine learning model that are not learned from data but set before training; they significantly affect both model performance and resource consumption. Current research focuses on optimizing hyperparameter selection across model architectures, including deep neural networks, large language models, and Gaussian processes, often using techniques such as Bayesian optimization, evolutionary algorithms, and novel mathematical frameworks to improve efficiency and generalization. Effective hyperparameter tuning is crucial for achieving strong model performance, reducing computational costs (including energy consumption), and improving the reliability and reproducibility of machine learning results across diverse applications.
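To make the idea concrete, here is a minimal sketch of hyperparameter search in plain Python. It uses random search (a simple baseline for the Bayesian and evolutionary methods mentioned above); `validation_loss` is a hypothetical stand-in for training and evaluating a real model, and the parameter names and ranges are illustrative assumptions.

```python
import random

def validation_loss(lr, batch_size):
    # Hypothetical stand-in for training a model and measuring
    # validation loss; a smooth surrogate minimized near
    # lr = 0.01, batch_size = 64.
    return (lr - 0.01) ** 2 * 1e4 + (batch_size - 64) ** 2 * 1e-3

def random_search(n_trials=200, seed=0):
    """Sample hyperparameter configurations at random and keep the best."""
    rng = random.Random(seed)
    best_loss, best_params = None, None
    for _ in range(n_trials):
        # Learning rates are sampled log-uniformly, since their useful
        # values typically span several orders of magnitude.
        lr = 10 ** rng.uniform(-4, -1)
        batch_size = rng.choice([16, 32, 64, 128, 256])
        loss = validation_loss(lr, batch_size)
        if best_loss is None or loss < best_loss:
            best_loss = loss
            best_params = {"lr": lr, "batch_size": batch_size}
    return best_loss, best_params

if __name__ == "__main__":
    loss, params = random_search()
    print(f"best loss={loss:.4f} with {params}")
```

More sophisticated tuners (e.g. Bayesian optimization) replace the uniform sampling step with a model of the loss surface that proposes promising configurations, but the outer loop of evaluate-and-keep-best is the same.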

Papers