Effective Depth Up-Scaling
Effective depth up-scaling aims to improve deep learning models by strategically increasing their depth while addressing challenges such as vanishing gradients and rank collapse in transformer architectures. Current research explores methods such as depthwise scaling followed by continued pre-training, typically for large language models or for implicit surface learning in 3D reconstruction, with the goal of scaling efficiently without complex architectural modifications; the core layer-duplication idea is sketched below. These advances matter because they enable more powerful and efficient models across applications ranging from natural language processing to computer vision and 3D scene understanding.
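As a rough illustration of the depthwise-scaling idea, the following minimal PyTorch sketch expands a shallow transformer stack by copying its existing layers into a deeper stack, which would then be refined by continued pre-training. The function name, layer counts, and overlap scheme are illustrative assumptions, not taken from any specific paper.

    # Minimal, hypothetical sketch of depthwise up-scaling:
    # duplicate the bottom and top `keep` layers of a pretrained stack
    # and concatenate them into a deeper stack initialized from the
    # original weights. Continued pre-training would follow.
    import copy
    import torch
    import torch.nn as nn

    def depth_up_scale(layers: nn.ModuleList, keep: int) -> nn.ModuleList:
        """Return a deeper stack built from copies of the original layers."""
        bottom = [copy.deepcopy(layer) for layer in layers[:keep]]
        top = [copy.deepcopy(layer) for layer in layers[-keep:]]
        return nn.ModuleList(bottom + top)

    # Example: scale an 8-layer encoder up to 12 layers (keep 6 from each end).
    base_layers = nn.ModuleList(
        nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
        for _ in range(8)
    )
    scaled_layers = depth_up_scale(base_layers, keep=6)
    print(len(base_layers), "->", len(scaled_layers))  # 8 -> 12

    # Forward pass through the deeper stack; in practice the duplicated
    # weights are then trained further on the original objective.
    x = torch.randn(2, 16, 64)
    for layer in scaled_layers:
        x = layer(x)
    print(x.shape)  # torch.Size([2, 16, 64])

Because the new layers start from already-trained weights, the deeper model typically needs far less additional training than one initialized from scratch, which is what makes this form of scaling attractive.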
Papers
October 15, 2024
December 23, 2023
October 2, 2023
March 14, 2023
June 7, 2022