Paper Title
ARMA Nets: Expanding Receptive Field for Dense Prediction
Paper Authors
Paper Abstract
Global information is essential for dense prediction problems, whose goal is to compute a discrete or continuous label for each pixel in an image. Traditional convolutional layers in neural networks, initially designed for image classification, are restrictive in these problems because the filter size limits their receptive fields. In this work, we propose to replace any traditional convolutional layer with an autoregressive moving-average (ARMA) layer, a novel module with an adjustable receptive field controlled by learnable autoregressive coefficients. Compared with a traditional convolutional layer, the ARMA layer enables explicit interconnections among the output neurons and learns its receptive field by adapting the autoregressive coefficients of those interconnections. The ARMA layer adapts to different types of tasks: for tasks where global information is crucial, it can learn relatively large autoregressive coefficients so that an output neuron's receptive field covers the entire input; for tasks where only local information is required, it can learn small or near-zero autoregressive coefficients and automatically reduce to a traditional convolutional layer. We show both theoretically and empirically that the effective receptive field of networks with ARMA layers (named ARMA networks) expands with larger autoregressive coefficients. We also provably solve the instability problem of learning and prediction in the ARMA layer through a re-parameterization mechanism. Additionally, we demonstrate that ARMA networks substantially improve on their baselines on challenging dense prediction tasks, including video prediction and semantic segmentation.
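The link between autoregressive coefficients and receptive field can be illustrated with a deliberately simplified one-dimensional sketch. This is not the paper's layer or implementation: it assumes a single scalar autoregressive coefficient `a`, a causal recursion with zero initial conditions, and the hypothetical helper names `arma_1d` and `effective_width`. It is only meant to show that `a = 0` gives back an ordinary convolution, while a larger `a` lets a single input position influence more output positions.

```python
import numpy as np


def arma_1d(x, w, a):
    """Toy 1D ARMA filter (illustrative assumption, not the paper's layer).

    Computes y[t] = sum_k w[k] * x[t - k] + a * y[t - 1] with y[-1] = 0.
    """
    x = np.asarray(x, dtype=float)
    # Moving-average part: a causal convolution, truncated to the input length.
    ma = np.convolve(x, w)[: len(x)]
    y = np.zeros_like(ma)
    prev = 0.0
    for t in range(len(ma)):
        # Autoregressive part: each output also depends on the previous output.
        prev = ma[t] + a * prev
        y[t] = prev
    return y


def effective_width(a, n=64, tol=1e-3):
    """Count output positions where a unit impulse's response exceeds tol."""
    impulse = np.zeros(n)
    impulse[0] = 1.0
    response = arma_1d(impulse, w=[1.0], a=a)
    return int(np.sum(np.abs(response) > tol))


if __name__ == "__main__":
    for a in (0.0, 0.5, 0.9):
        print(f"a = {a}: impulse reaches {effective_width(a)} of 64 positions")
```

In this toy setting the impulse response decays as `a**t`, so it stays bounded only when `|a| < 1`; the abstract's re-parameterization mechanism addresses the corresponding stability question for the actual layer, which this sketch does not attempt to reproduce.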
