METHODS AND ALGORITHMS OF REDUCING COMPUTATION FOR DEEP NEURAL NETWORKS VIA PRUNING
Abstract
A method is disclosed for reducing the computational load of a deep neural network. The number of multiply-accumulate (MAC) operations is determined for each layer of the network, and a pruning error allowance per weight is determined based on the computational load of each layer. For each layer, a threshold estimator is initialized and the layer's weights are pruned based on the standard deviation of all weights within the layer. A pruning error per weight is then determined for the layer. If the pruning error per weight exceeds a predetermined threshold, the threshold estimator is updated for the layer, the weights of the layer are repruned using the updated threshold estimator, and the pruning error per weight is re-determined; this repeats until the pruning error per weight is less than the threshold. The deep neural network is then retrained.
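The per-layer loop described in the abstract can be sketched as follows. This is only an illustrative reading, not the patented method: the abstract gives no formulas, so the sketch assumes that the pruning threshold is the estimator times the layer's standard deviation, that the pruning error per weight is the mean absolute change in the weights, that the estimator is updated by lowering it by a fixed step, that the predetermined threshold is the layer's per-weight allowance, and that the allowance scales with each layer's share of total MACs. The names `error_allowances`, `prune_layer`, `base_allowance`, `estimator`, and `step` are hypothetical.

```python
import random
import statistics


def error_allowances(mac_counts, base_allowance=0.01):
    """Per-weight error allowance for each layer.

    Assumption: the allowance scales with the layer's share of total MACs.
    The abstract only says it is "based on" the computational load.
    """
    total = sum(mac_counts)
    return [base_allowance * len(mac_counts) * m / total for m in mac_counts]


def prune_layer(weights, allowance, estimator=2.0, step=0.1):
    """Prune one layer, lowering the estimator until the error fits.

    Assumptions: threshold = estimator * std of the layer's weights, and
    error per weight = mean absolute difference before vs. after pruning.
    """
    sigma = statistics.pstdev(weights)
    while True:
        threshold = estimator * sigma
        # Zero out every weight whose magnitude falls below the threshold.
        pruned = [0.0 if abs(w) < threshold else w for w in weights]
        error = sum(abs(w - p) for w, p in zip(weights, pruned)) / len(weights)
        # Stop once the error is within the allowance, or nothing is pruned.
        if error < allowance or estimator <= 0.0:
            return pruned, estimator, error
        # Assumed update rule: reduce the estimator by a fixed step.
        estimator = max(0.0, estimator - step)


# Toy example: two layers of random weights with made-up MAC counts.
random.seed(0)
layers = [[random.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(2)]
macs = [3_000_000, 1_000_000]
for weights, allowance in zip(layers, error_allowances(macs)):
    pruned, estimator, error = prune_layer(weights, allowance)
    sparsity = pruned.count(0.0) / len(pruned)
    print(f"estimator={estimator:.1f} error={error:.4f} sparsity={sparsity:.2%}")
```

The final retraining step is omitted here because it depends on the training setup. In practice the pruned positions would be held at zero while the remaining weights are fine-tuned.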