Data-Free Parameter Pruning and Quantization

For many applications, the optimal network architecture is not known in advance, whether transfer learning is used to retrain an image classification network for a new task or a new network is trained from scratch. This can result in an overparameterized network with redundant connections. Pruning aims to identify these redundant, unnecessary connections that can be removed without affecting the final network accuracy.

This demonstration shows how to implement unstructured pruning in MATLAB®. Magnitude pruning is an intuitive first step to pruning: the absolute value of each parameter is used as a measure of its relative importance to the network. The classification network is pruned iteratively using these magnitude scores until a target network sparsity is reached. After pruning, the best solution is selected based on the accuracy of the pruned network as a function of sparsity.

Note that unstructured pruning alone does not lead to any specific memory or inference speedup when it is not combined with sparse matrix optimizations.
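The iterative magnitude-pruning loop described above can be sketched in a few lines of MATLAB. The following is a minimal illustration only, assuming a trained dlnetwork stored in the variable net; the variable names, the global threshold computation, and the sparsity schedule are assumptions for illustration, not the exact implementation used in this example.

% Minimal sketch of iterative global magnitude pruning (assumptions: a trained
% dlnetwork in the variable net; target sparsity and schedule chosen for
% illustration).
targetSparsity = 0.90;   % fraction of weights to zero out overall (assumed)
numIterations  = 10;     % prune gradually instead of all at once (assumed)

for iter = 1:numIterations
    sparsity = targetSparsity*iter/numIterations;

    % Collect the magnitudes of all weight parameters to compute a global
    % threshold for the current sparsity level.
    allWeights = [];
    for i = 1:height(net.Learnables)
        if net.Learnables.Parameter(i) == "Weights"
            w = extractdata(net.Learnables.Value{i});
            allWeights = [allWeights; abs(w(:))]; %#ok<AGROW>
        end
    end
    sortedWeights = sort(allWeights);
    k = max(1, round(sparsity*numel(sortedWeights)));
    threshold = sortedWeights(k);

    % Zero out every weight whose magnitude falls below the threshold.
    for i = 1:height(net.Learnables)
        if net.Learnables.Parameter(i) == "Weights"
            w = net.Learnables.Value{i};
            net.Learnables.Value{i} = w.*(abs(w) >= threshold);
        end
    end

    % Optionally fine-tune the network here before the next iteration and
    % record accuracy at each sparsity level to pick the best pruned network.
end

In practice, the pruned network is typically evaluated (and often fine-tuned) at each sparsity level so that accuracy as a function of sparsity can be examined and the best trade-off selected afterwards.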