Deep Learning (10) - Convolutional Neural Network

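At last we reach the CNN, the convolutional neural network. As the name suggests, a neural network that contains convolutional layers is a CNN.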

Definition of convolution

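The mathematical definition of convolution can be found on its wiki page (https://zh.wikipedia.org/wiki/%E5%8D%B7%E7%A7%AF). Roughly:

$$h(x) = (f \ast g)(x) = \int_{-\infty}^{\infty} f(\tau)\, g(x-\tau)\, d\tau$$

The convolution used in machine learning is slightly different:

[Figure: a 3x3 kernel sliding over an input matrix]

The difference is that machine learning does not flip the function g(x); the kernel is slid over the source data directly, multiplying element-wise and accumulating. The point of the flip in the mathematical definition is that it makes the operation associative: $(A \ast B) \ast C = A \ast (B \ast C)$.

Note: what machine learning calls convolution is actually cross-correlation.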
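To make the distinction concrete, here is a minimal NumPy sketch. The function names `cross_correlate2d` and `convolve2d` are illustrative choices, not from the post, and the sketch assumes no padding and a stride of 1. It treats mathematical convolution as cross-correlation with the kernel flipped on both axes.

```python
import numpy as np

def cross_correlate2d(x, k):
    """Slide k over x without flipping it (no padding, stride 1)."""
    f_h, f_w = k.shape
    out_h = x.shape[0] - f_h + 1
    out_w = x.shape[1] - f_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the kernel, then sum.
            out[i, j] = np.sum(x[i:i + f_h, j:j + f_w] * k)
    return out

def convolve2d(x, k):
    """Mathematical convolution: flip the kernel on both axes, then slide."""
    return cross_correlate2d(x, k[::-1, ::-1])

x = np.arange(16, dtype=float).reshape(4, 4)
k = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(cross_correlate2d(x, k))  # what a CNN layer computes
print(convolve2d(x, k))         # what the integral definition corresponds to
```

For a kernel that is symmetric under this flip the two results coincide. Because a CNN learns its kernel values, skipping the flip costs nothing in practice.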

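Convolution parameters

Kernel

The 3x3 matrix used in the convolution above is the kernel, also called a filter; the terms refer to the same thing.
In image processing, convolution is often used for edge detection. For example, the 3x3 filter in the figure above detects vertical edges, and it can be changed as follows to detect horizontal edges:

[Figure: a 3x3 filter for horizontal edges]

So what values should the kernel take? In machine learning the kernel values are learnable parameters: like W and b in a neural network, they are obtained by training and are model parameters. The hyperparameter we have to choose is how many kernels to use.
For example:

[Figure: a 6x6x3 input convolved with two 3x3x3 kernels]

The input is a 6x6x3 matrix representing the RGB channels of an image. The convolutional layer uses two 3x3x3 kernels and produces a 4x4x2 output:

$$n_H \times n_W \times n_C \rightarrow \left(\frac{n_H-f}{s}+1\right) \times \left(\frac{n_W-f}{s}+1\right) \times n_C'$$

  • $n_C$: number of channels
  • $n_C'$: number of channels fed to the next layer, which equals the number of filters in this layer
  • f is the kernel size and s is the stride, covered below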
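As a small illustration of edge detection with a fixed kernel, the sketch below slides a 3x3 filter over a 6x6 single-channel image whose left half is bright. The kernel values are an assumption (a common textbook choice), not taken from the post's figures, and `slide` is a hypothetical helper name.

```python
import numpy as np

def slide(x, k):
    """Cross-correlate a 2D image with a 2D kernel (no padding, stride 1)."""
    f = k.shape[0]
    n_out = x.shape[0] - f + 1
    m_out = x.shape[1] - f + 1
    return np.array([[np.sum(x[i:i + f, j:j + f] * k) for j in range(m_out)]
                     for i in range(n_out)])

# 6x6 image: left three columns bright (10), right three columns dark (0).
image = np.zeros((6, 6))
image[:, :3] = 10.0

# Assumed kernels: positive on one side, negative on the other.
vertical = np.array([[1.0, 0.0, -1.0],
                     [1.0, 0.0, -1.0],
                     [1.0, 0.0, -1.0]])
horizontal = vertical.T

v_out = slide(image, vertical)    # responds along the vertical edge
h_out = slide(image, horizontal)  # no horizontal edge in this image
print(v_out)
print(h_out)
```

The 6x6 input gives a 4x4 output, matching (6 - 3)/1 + 1 = 4. The vertical kernel responds only in the columns that straddle the bright-to-dark boundary, while the horizontal kernel finds nothing.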

Padding

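Padding solves the following problem: without padding the output matrix keeps shrinking, because when $f > 1$, $\frac{n-f}{s}+1$ is always smaller than n. With padding p this becomes $\frac{n-f+2p}{s}+1$, so the padding size can be chosen to control the size of the output matrix.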
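As a worked example of the padding formula, assume stride s = 1 and an odd kernel size f. Requiring the output to be the same size as the input (often called "same" padding, a term the post itself does not use) and solving for p gives:

```latex
\begin{aligned}
\frac{n - f + 2p}{1} + 1 &= n \\
2p &= f - 1 \\
p &= \frac{f - 1}{2}
\end{aligned}
```

So a 3x3 kernel needs p = 1 and a 5x5 kernel needs p = 2 to preserve the input size.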

Stride

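Stride is the step size. The examples in the figures above use a stride of 1, but any other positive integer is possible. With stride s and padding p, the output size is:

$$\left\lfloor \frac{n-f+2p}{s} + 1 \right\rfloor \times \left\lfloor \frac{n-f+2p}{s} + 1 \right\rfloor \times n_C'$$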
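The output-size formula is easy to check numerically. This is a minimal sketch; `conv_output_size` is a hypothetical helper name, and the floor is implemented with integer division.

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size for input size n, kernel size f, padding p, stride s."""
    if n + 2 * p < f:
        raise ValueError("kernel is larger than the padded input")
    # floor((n - f + 2p) / s + 1) equals floor((n - f + 2p) / s) + 1
    return (n - f + 2 * p) // s + 1

print(conv_output_size(6, 3))            # 6x6 input, 3x3 kernel
print(conv_output_size(6, 3, p=1))       # padding of 1 keeps the size
print(conv_output_size(7, 3, s=2))       # stride 2
print(conv_output_size(8, 3, s=2))       # the floor drops the last partial window
```

The last call shows why the floor matters: with n = 8, f = 3, s = 2 the quotient (8 - 3)/2 is not an integer, and the window that would hang over the edge is simply skipped.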

Convolutional neural network

A complete convolutional neural network usually consists of three kinds of layers:

  • Convolution (CONV)
  • Pooling (POOL)
  • Fully connected (FC)

The fully connected part is the classic fully connected neural network.

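Convolutional layer

One convolutional layer looks roughly like this:

[Figure: one convolutional layer]

  • The input is a 6x6x3 matrix
  • Two kernels followed by a non-linear activation produce a 4x4x2 output
  • b is the bias, and the activation is ReLU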

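The parameters of such a layer are:

  • $f^{[l]}$: filter size
  • $p^{[l]}$: padding size
  • $s^{[l]}$: stride
  • $n_C^{[l]}$: number of filters in layer $l$
  • Input: $n_H^{[l-1]} \times n_W^{[l-1]} \times n_C^{[l-1]}$
  • Output: $n_H^{[l]} \times n_W^{[l]} \times n_C^{[l]}$
  • Each filter has shape: $f^{[l]} \times f^{[l]} \times n_C^{[l-1]}$
  • After activations: $a^{[l]}$ has shape $n_H^{[l]} \times n_W^{[l]} \times n_C^{[l]}$; with a mini-batch of m examples, $A^{[l]}$ has shape $m \times n_H^{[l]} \times n_W^{[l]} \times n_C^{[l]}$
  • Weights: $f^{[l]} \times f^{[l]} \times n_C^{[l-1]} \times n_C^{[l]}$
  • Bias: $n_C^{[l]}$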
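The shapes listed above can be traced through a naive forward pass. The following is a sketch for a single example (no mini-batch dimension), written with explicit loops for clarity rather than speed. The name `conv_layer_forward` and the random inputs are assumptions made for illustration.

```python
import numpy as np

def conv_layer_forward(a_prev, W, b, stride=1, pad=0):
    """One convolutional layer with ReLU.

    a_prev: (n_H_prev, n_W_prev, n_C_prev)  input activations
    W:      (f, f, n_C_prev, n_C)           one f x f x n_C_prev filter per output channel
    b:      (n_C,)                          one bias per filter
    """
    n_h, n_w, _ = a_prev.shape
    f, _, _, n_c = W.shape
    # Zero-pad height and width only, never the channel axis.
    a_pad = np.pad(a_prev, ((pad, pad), (pad, pad), (0, 0)))
    out_h = (n_h - f + 2 * pad) // stride + 1
    out_w = (n_w - f + 2 * pad) // stride + 1
    z = np.zeros((out_h, out_w, n_c))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = a_pad[top:top + f, left:left + f, :]
            for c in range(n_c):
                z[i, j, c] = np.sum(window * W[:, :, :, c]) + b[c]
    return np.maximum(z, 0.0)  # ReLU

rng = np.random.default_rng(0)
a_prev = rng.standard_normal((6, 6, 3))   # 6x6x3 input
W = rng.standard_normal((3, 3, 3, 2))     # two 3x3x3 filters
b = rng.standard_normal(2)
a = conv_layer_forward(a_prev, W, b)
print(a.shape)
```

With a 6x6x3 input and two 3x3x3 filters, the output shape is 4x4x2, as in the example described in the text.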

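How many model parameters are needed if a 64x64 RGB input goes through ten 3x3 kernels?
The answer is (3x3x3+1)*10 = 280: each kernel spans all 3 input channels, giving 27 weights plus 1 bias. The count does not depend on the size of the input image, which is why convolutional layers trained on small images can also be applied to larger ones.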
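The parameter count follows directly from the weight and bias shapes of a convolutional layer. A minimal sketch, where `conv_param_count` is a hypothetical helper name:

```python
def conv_param_count(f, n_c_prev, n_c):
    """Learnable parameters of a conv layer: f*f*n_c_prev weights plus 1 bias per filter."""
    return (f * f * n_c_prev + 1) * n_c

# Ten 3x3 filters over a 3-channel (RGB) input.
print(conv_param_count(f=3, n_c_prev=3, n_c=10))
# The same ten filters over a single-channel input.
print(conv_param_count(f=3, n_c_prev=1, n_c=10))
```

The input height and width never appear in the function, which is the point made in the text: the count depends only on the kernel size, the input channels, and the number of filters.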

Pooling

This is the pooling layer. Two pooling strategies are common:

  • Max pooling
  • Average pooling

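The former is used more often.
The two figures show how each works:
Max pooling

[Figure: max pooling with f=2, s=2]

Average pooling

[Figure: average pooling with f=2, s=2]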

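Here f=2 means pooling over 2x2 windows, and s=2 means moving 2 cells to reach the next window. Both f and s are hyperparameters, so a pooling layer has no learnable parameters, only hyperparameters.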
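Both pooling strategies can be sketched in a few lines of NumPy. The 4x4 input values are made up for illustration, and `pool2d` is a hypothetical helper name.

```python
import numpy as np

def pool2d(x, f=2, s=2, mode="max"):
    """Pool a 2D array over f x f windows with stride s. No learnable parameters."""
    reduce = {"max": np.max, "average": np.mean}[mode]
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = reduce(x[i * s:i * s + f, j * s:j * s + f])
    return out

x = np.array([[1.0, 3.0, 2.0, 1.0],
              [2.0, 9.0, 1.0, 1.0],
              [1.0, 3.0, 2.0, 3.0],
              [5.0, 6.0, 1.0, 2.0]])
print(pool2d(x, mode="max"))      # [[9. 2.] [6. 3.]]
print(pool2d(x, mode="average"))  # [[3.75 1.25] [3.75 2.  ]]
```

On a multi-channel input, pooling is applied to each channel independently, so the channel count is unchanged.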

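Why have a pooling layer at all? There is plenty of discussion online; here is one excerpt:

In essence, it is a technique for shrinking the feature map while preserving as much spatial and feature information as possible. The aim is to compress the feature map so that the input passed to later hidden layers is smaller and computation is more efficient. The invariance of a CNN is, at root, created by convolution.

My understanding is that there are a few reasons (possibly wrong; corrections welcome):

  • As the kernel slides and accumulates, neighboring windows overlap, so the output is redundant and can be condensed
  • Pooling reduces the effect of small translations: max pooling keeps only the maximum of a small region, so the output is insensitive to a slight shift
  • It reduces dimensionality, which cuts later computation and lowers the risk of overfitting, at the cost of a higher risk of underfitting.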
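The claim about small translations can be illustrated with a toy feature map that has a single active cell. This is only a sketch: `max_pool_2x2` is a hypothetical helper that assumes f=2, s=2 and even side lengths. It shows that the robustness holds only while the shift stays inside one pooling window.

```python
import numpy as np

def max_pool_2x2(x):
    """Max pooling with f=2, s=2 on a 2D array whose sides are even."""
    h, w = x.shape
    # Split each axis into (window index, offset inside window), then reduce the offsets.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def one_hot_map(row, col, size=4):
    """Feature map that is zero everywhere except one activated cell."""
    m = np.zeros((size, size))
    m[row, col] = 1.0
    return m

original = max_pool_2x2(one_hot_map(0, 0))
shifted_inside = max_pool_2x2(one_hot_map(0, 1))   # moved by 1, same 2x2 window
shifted_across = max_pool_2x2(one_hot_map(0, 2))   # moved by 2, next window

print(np.array_equal(original, shifted_inside))   # True
print(np.array_equal(original, shifted_across))   # False
```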

A complete convolutional neural network

A complete convolutional neural network looks roughly like this:

CONV-POOL-CONV-POOL-FC-FC-Softmax

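  • Each convolutional layer is followed by a pooling layer, giving two convolutional layers and two pooling layers
  • After the last pooling layer, the output is flattened into a column vector that feeds the layers that follow
  • FC3 and FC4 are two standard fully connected layers
  • The last layer is a Softmax output layer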
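To see how activation shapes and parameter counts evolve through a CONV-POOL-CONV-POOL-FC-FC-Softmax stack, the sketch below traces one hypothetical configuration. Every size in it (32x32x3 input, 5x5 kernels, 6 and 16 filters, 120 and 84 hidden units, 10 classes) is an assumption in the style of LeNet-5, not a value taken from the post.

```python
# All layer sizes here are hypothetical, chosen only for illustration.
def out_size(n, f, p=0, s=1):
    """Spatial output size of a convolution or pooling window."""
    return (n - f + 2 * p) // s + 1

def trace():
    rows = []            # (layer name, activation shape, learnable parameters)
    n, c = 32, 3         # assumed 32x32 RGB input
    rows.append(("input", (n, n, c), 0))
    for name, f, filters in (("CONV1", 5, 6), ("CONV2", 5, 16)):
        params = (f * f * c + 1) * filters      # weights plus one bias per filter
        n, c = out_size(n, f), filters
        rows.append((name, (n, n, c), params))
        n = out_size(n, f=2, s=2)               # 2x2 pooling, stride 2, no parameters
        rows.append(("POOL" + name[-1], (n, n, c), 0))
    units = n * n * c                           # flatten into a vector
    rows.append(("flatten", (units,), 0))
    for name, width in (("FC3", 120), ("FC4", 84), ("Softmax", 10)):
        rows.append((name, (width,), units * width + width))
        units = width
    return rows

for name, shape, params in trace():
    print(f"{name:8s} {str(shape):14s} {params}")
```

Under these assumptions the activation volume shrinks steadily through the network, while most of the parameters sit in the first fully connected layer rather than in the convolutional layers.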

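All the parameters of such a network are listed in the table below:

[Table: shapes and parameter counts of each layer]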
