Paper Title
PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
Authors
Abstract
Mobile models with 4-bit and lower precision are required because of the ever-increasing demand for better energy efficiency in mobile devices. In this work, we report that activation instability induced by weight quantization (AIWQ) is the key obstacle to sub-4-bit quantization of mobile networks. To alleviate the AIWQ problem, we propose a novel training method called PROgressive-Freezing Iterative Training (PROFIT), which attempts to freeze the layers whose weights are affected by the instability problem more strongly than the other layers. We also propose a differentiable and unified quantization method (DuQ) and a negative padding idea to support asymmetric activation functions such as h-swish. We evaluate the proposed methods by quantizing MobileNet-v1, v2, and v3 on ImageNet and report that 4-bit quantization offers accuracy comparable to the full-precision baseline (within 1.48% top-1 accuracy). In the ablation study of 3-bit quantization of MobileNet-v3, our proposed method outperforms the state-of-the-art method by a large margin of 12.86% top-1 accuracy.
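The progressive-freezing idea in the abstract can be illustrated with a small scheduling sketch. This is not the authors' implementation: the abstract does not say how instability is measured or how many layers are frozen per stage, so the per-layer `scores`, the even split across `num_stages`, the layer names, and the `train_stage` callback are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set


def progressive_freeze_schedule(scores: Dict[str, float],
                                num_stages: int) -> List[Set[str]]:
    """Return, for each stage, the cumulative set of frozen layer names.

    `scores` maps a layer name to a hypothetical instability score
    (higher means more affected). Stage 0 freezes nothing; each later
    stage additionally freezes the next most unstable layers.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Rank layers from most to least unstable.
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    schedule: List[Set[str]] = []
    for stage in range(num_stages):
        # Assumed policy: freeze an even share of layers per stage.
        n_frozen = (len(ranked) * stage) // num_stages
        schedule.append(set(ranked[:n_frozen]))
    return schedule


def run_progressive_freezing(scores: Dict[str, float],
                             num_stages: int,
                             train_stage: Callable[[List[str]], None]) -> None:
    """Call `train_stage` once per stage with the still-trainable layers."""
    all_layers = set(scores)
    for frozen in progressive_freeze_schedule(scores, num_stages):
        train_stage(sorted(all_layers - frozen))


# Hypothetical scores for four layers of a small network.
example_scores = {"conv1": 0.1, "dw2": 0.9, "pw2": 0.3, "dw3": 0.7}
trained: List[List[str]] = []
run_progressive_freezing(example_scores, 3, trained.append)
```

In a real training loop, `train_stage` would disable gradient updates for the frozen layers and fine-tune the rest. The scores would come from whatever instability metric is in use, which this sketch leaves open.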
