Gian Paolo Leonardi: Geometric Post-Training Quantization of Deep Neural Networks

Analysis and Applied Mathematics Seminar
Speaker
Gian Paolo Leonardi, University of Trento
12:30pm - 1:45pm
Room 3-E4-SR03

Abstract: Quantization is a key technique for reducing the memory footprint and computational cost of deep learning models. However, traditional quantization methods lack general theoretical guarantees, and they typically overlook the geometry induced on the parameter space by the structure of the model and by the training dynamics.
Our main theoretical finding is the identification of an appropriate metric to be used when projecting weights onto a quantization grid after training. More precisely, we consider suitably scaled, over-parametrized deep neural networks with L layers, whose parameters are initialized as i.i.d. normal variables with zero mean and unit variance and subsequently trained with gradient descent. We then rigorously prove that, whenever the final point of the training dynamics satisfies suitable sparsity assumptions, the natural quantization metric is the one defined by the Gauss-Newton matrix. The naturality of this metric is quantified in probabilistic terms, i.e., it holds with high probability over the initialization as the dimension of the parameter space becomes large.
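In symbols, the projection described above can be sketched as a quadratic minimization; the notation below (trained weights \bar\theta, quantization grid Q, Jacobian J) is an illustrative paraphrase of the abstract rather than the speakers' exact formulation, and G = J^\top J is the standard Gauss-Newton matrix for a squared loss:

\[
  \hat{\theta} \;\in\; \operatorname*{arg\,min}_{q \in Q}\;
  (\bar{\theta} - q)^{\top}\, G(\bar{\theta})\, (\bar{\theta} - q),
  \qquad
  G(\bar{\theta}) = J(\bar{\theta})^{\top} J(\bar{\theta}),
\]

so the grid point is chosen to be nearest in the Gauss-Newton metric rather than in the Euclidean one.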
Based on this theoretical result, we propose a novel post-training quantization algorithm, GeoPTQ, which outperforms classical quantization schemes in preliminary experiments.
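As a rough illustration of why the choice of metric matters, the following toy sketch compares plain nearest-grid rounding with a projection minimizing the Gauss-Newton quadratic form. The tiny dimensions, random Jacobian, uniform grid, and brute-force search over roundings are assumptions made purely for illustration; this is not the GeoPTQ algorithm.

import numpy as np

# Toy sketch of metric-aware post-training quantization: project
# "trained" weights onto a uniform grid by minimizing the Gauss-Newton
# quadratic form instead of the Euclidean distance.
# All specifics (dimension, random Jacobian, brute-force search over
# floor/ceil roundings) are illustrative assumptions, not GeoPTQ.

rng = np.random.default_rng(0)

n, m, step = 4, 16, 0.25          # weights, data points, grid spacing
w = rng.normal(size=n)            # stand-in for trained weights
J = rng.normal(size=(m, n))       # Jacobian of model outputs w.r.t. weights
G = J.T @ J                       # Gauss-Newton matrix (squared loss)

# Candidate grid points: floor or ceil of each coordinate (2**n candidates).
lo = np.floor(w / step) * step
hi = lo + step
candidates = np.array([[(hi if b else lo)[i] for i, b in enumerate(bits)]
                       for bits in np.ndindex(*(2,) * n)])

def gn_dist(q):
    d = w - q
    return d @ G @ d              # squared distance in the Gauss-Newton metric

q_naive = np.round(w / step) * step     # plain (Euclidean) nearest rounding
q_geo = min(candidates, key=gn_dist)    # metric-aware projection

print("naive rounding:", q_naive, " GN error:", gn_dist(q_naive))
print("metric-aware  :", q_geo,   " GN error:", gn_dist(q_geo))

Since per-coordinate nearest rounding is itself among the floor/ceil candidates, the metric-aware projection can only match or reduce the Gauss-Newton error, which is the intuition the talk's result makes rigorous.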
This is joint work with Massimiliano Datres (LMU Munich) and Andrea Agazzi (University of Bern).

For further information, please contact elisur.magrini@unibocconi.it