Normalization Layer

Normalization layers are core components of deep neural networks: by rescaling intermediate activations, they stabilize training and improve generalization, mitigating problems such as internal covariate shift and exploding gradients. Current research focuses on tailoring normalization to specific architectures (e.g., transformers, graph neural networks, residual networks) and on challenges in continual and federated learning settings, including mitigating recency bias and handling non-IID data. These advances matter because better normalization techniques yield more robust and efficient training of deep learning models across diverse applications, from image classification and object detection to financial prediction and medical image analysis.
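
To make concrete what such a layer computes, here is a minimal sketch of layer normalization (the variant standard in transformers) in PyTorch. The class name, parameter names, and the `eps` default are illustrative choices, not drawn from any specific paper in this collection; other normalization variants (batch, group, instance) differ mainly in which axes the statistics are taken over.

```python
import torch
import torch.nn as nn

class LayerNorm(nn.Module):
    """Minimal layer normalization: normalize each sample's features to
    zero mean and unit variance, then apply a learned affine transform."""

    def __init__(self, dim: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps                                # avoids division by zero
        self.gamma = nn.Parameter(torch.ones(dim))    # learned scale
        self.beta = nn.Parameter(torch.zeros(dim))    # learned shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Statistics are computed over the feature (last) dimension,
        # independently for every sample and position.
        mean = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, keepdim=True, unbiased=False)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta
```

For example, `LayerNorm(512)` applied to a tensor of shape `(batch, seq, 512)` normalizes each token's 512 features independently, which is why the technique is insensitive to batch size, a key reason it displaced batch normalization in transformer architectures.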

Papers