Deep Learning Acceleration via Virtual Minibatch Size Scaling

A study by Takuro Kutsuna has been published in the SIAM Journal on Mathematics of Data Science.

In recent years, deep learning models used in applications such as image recognition have continued to grow in scale, significantly increasing the computational cost of training. In general, increasing the minibatch size (the number of data samples used in each training step) can accelerate training. In practice, however, hardware limitations such as GPU memory constraints often prevent the use of sufficiently large minibatch sizes.

In this study, we propose a method that estimates the importance of each data sample during training and automatically adjusts its sampling frequency accordingly. By applying importance sampling with appropriate weighting, the method avoids introducing bias into the training process. Our theoretical analysis shows that, when importance is properly estimated, importance sampling effectively acts as a virtual expansion of the minibatch size. We further formalize this concept as the “effective minibatch size” and propose a low-overhead method to estimate it during training. Experimental results on multiple image datasets demonstrate that the proposed method achieves higher accuracy than conventional approaches while requiring comparable computational time.
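The idea of unbiased importance sampling described above can be illustrated with a small numerical sketch. This is not the paper's algorithm: toy positive scalars stand in for per-sample gradients, the sampling distribution is an assumed mixture of "proportional to the value" and uniform, and all function and variable names are hypothetical. The "variance-matched batch size" computed at the end is only one simple way to picture a virtual expansion of the minibatch and should not be read as the paper's definition of the effective minibatch size.

```python
import numpy as np


def single_draw_variance(g, p):
    """Exact variance of the one-sample estimator g_i / (N * p_i), with i ~ p."""
    n = g.size
    second_moment = np.sum(p * (g / (n * p)) ** 2)
    return second_moment - g.mean() ** 2


def weighted_minibatch_estimate(g, p, batch_size, rng):
    """Draw a minibatch according to p and reweight each sample by 1 / (N * p_i).

    The reweighting makes the expectation equal to the plain mean of g,
    so non-uniform sampling introduces no bias.
    """
    n = g.size
    idx = rng.choice(n, size=batch_size, replace=True, p=p)
    weights = 1.0 / (n * p[idx])
    return np.mean(weights * g[idx])


rng = np.random.default_rng(0)
n, batch_size = 1000, 32

# Toy stand-in for per-sample gradient magnitudes (assumption: heavy-tailed).
g = rng.lognormal(mean=0.0, sigma=1.0, size=n)

uniform = np.full(n, 1.0 / n)
# Assumed importance distribution: half proportional to g, half uniform,
# so that no sample ever gets a vanishing probability.
importance = 0.5 * g / g.sum() + 0.5 * uniform

var_uniform = single_draw_variance(g, uniform)
var_importance = single_draw_variance(g, importance)

# Uniform batch size that would give the same estimator variance as
# importance sampling with `batch_size` samples (illustrative quantity only).
equivalent_batch = batch_size * var_uniform / var_importance

# Empirical check that the weighted estimator is centered on the true mean.
estimates = [
    weighted_minibatch_estimate(g, importance, batch_size, rng)
    for _ in range(2000)
]

print("true mean:", g.mean())
print("average of weighted estimates:", np.mean(estimates))
print("single-draw variance (uniform):", var_uniform)
print("single-draw variance (importance):", var_importance)
print("variance-matched uniform batch size:", equivalent_batch)
```

In this toy setting the weighted estimator keeps the same expectation as uniform sampling while its variance drops, which is the sense in which a fixed physical minibatch can behave like a larger one. In real training the per-sample importance is not known in advance and has to be estimated on the fly, which is the part the study addresses.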

This technology is expected to contribute to more efficient large-scale AI training and to improve the efficiency of future distributed learning systems.

Title: Exploring Variance Reduction in Importance Sampling for Efficient DNN Training

Authors: Kutsuna, T.

Journal Name: SIAM Journal on Mathematics of Data Science

Published: November 18, 2025

https://doi.org/10.1137/25M1726339
