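CNN Summary

Use cases:

Feature extraction from images (2D matrix data)

In a nutshell

A convolution kernel slides, and slides, and slides across the input.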

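Core parameters

kernel size (e.g. 3x3): no matter how large the image is, the number of parameters depends only on the kernel size, the number of input channels, and the number of filters
padding (fill size): needed because 1. every convolution shrinks the image, so after only a few convolutions it becomes very small; 2. edge pixels are used fewer times than interior pixels
stride (sliding step, e.g. 1)
max/avg pooling: further shrinks the feature map and accentuates the dominant features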
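To make the effect of these parameters concrete, here is a minimal sketch of the standard output-size formula, floor((n + 2p - k) / s) + 1, applied along one spatial axis. The helper name `conv_output_size` is made up for this note; it is not a PyTorch function.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# Without padding, a 3x3 kernel trims one pixel from each side.
no_pad = conv_output_size(128, kernel=3)
# padding=1 keeps the size unchanged for a 3x3 kernel with stride 1.
same_pad = conv_output_size(128, kernel=3, padding=1)
# 2x2 pooling with stride 2 halves the size.
pooled = conv_output_size(128, kernel=2, stride=2)

print(no_pad, same_pad, pooled)  # 126 128 64
```

This is why the example network in these notes uses padding 1 with its 3x3 kernels: only the pooling layers change the spatial size.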

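Highlights

Parameter sharing: a feature detector that is useful in one part of the image (for example, a vertical edge detector) is likely to be useful in another part. Compared with a fully connected layer, this greatly reduces the number of model parameters. (This is usually described as translation invariance; strictly speaking, convolution is translation-equivariant, and pooling adds approximate invariance.)
Sparse connectivity: in each layer, every output value depends on only a small number of inputs. With a 3x3 kernel, for example, each output depends only on the corresponding 9 input values per input channel. Fewer parameters need to be stored, which reduces the model's storage footprint and improves its statistical efficiency. It also means less computation is needed to produce the output.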
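A rough back-of-the-envelope comparison shows how much these two properties save. The sketch below counts parameters for a 3x3 convolution (3 to 64 channels) and for a hypothetical fully connected layer that maps the same [3, 128, 128] input to the same [64, 128, 128] output. Both helper functions are illustrative names, not library calls.

```python
def conv_params(in_channels: int, out_channels: int, kernel: int, bias: bool = True) -> int:
    """Parameters of a 2D convolution: one kernel x kernel filter per (in, out) channel pair."""
    weights = out_channels * in_channels * kernel * kernel
    return weights + (out_channels if bias else 0)


def dense_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameters of a fully connected layer: one weight per (input, output) pair."""
    return in_features * out_features + (out_features if bias else 0)


conv = conv_params(3, 64, 3)
dense = dense_params(3 * 128 * 128, 64 * 128 * 128)

print(conv)   # 1792
print(dense)  # 51540656128
# The convolution count does not change if the image grows to 256x256;
# the fully connected count would grow with the number of pixels.
```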

An example

import torch
import torch.nn as nn


class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        # The arguments for commonly used modules:
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)

        # input image size: [3, 128, 128]
        self.cnn_layers = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(64, 128, 3, 1, 1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(128, 256, 3, 1, 1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(4, 4, 0),
        )
        self.fc_layers = nn.Sequential(
            nn.Linear(256 * 8 * 8, 256),
            nn.ReLU(),
            nn.Linear(256, 256),
            nn.ReLU(),
            nn.Linear(256, 11)
        )

    def forward(self, x):
        # input (x): [batch_size, 3, 128, 128]
        # output: [batch_size, 11]

        # Extract features by convolutional layers.
        x = self.cnn_layers(x)

        # The extracted feature map must be flattened before going to the fully-connected layers.
        x = x.flatten(1)

        # The features are transformed by fully-connected layers to obtain the final logits.
        x = self.fc_layers(x)
        return x


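# Optional: print a per-layer summary (requires the torchinfo package).
# from torchinfo import summary
# batch_size = 64  # illustrative value, not from the original notes
# device = "cuda" if torch.cuda.is_available() else "cpu"
# model = Classifier().to(device)
# summary(model, input_size=(batch_size, 3, 128, 128))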
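The `256 * 8 * 8` in the first `nn.Linear` is not arbitrary: it is the flattened size of the last feature map. A small framework-free sketch that traces the shape through `cnn_layers` is given below. The layer list is transcribed by hand from the model above, and `BatchNorm2d` and `ReLU` are left out because they do not change the shape.

```python
def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial size after a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, out_channels or None if unchanged, kernel, stride, padding)
layers = [
    ("conv1", 64, 3, 1, 1), ("pool1", None, 2, 2, 0),
    ("conv2", 128, 3, 1, 1), ("pool2", None, 2, 2, 0),
    ("conv3", 256, 3, 1, 1), ("pool3", None, 4, 4, 0),
]

channels, size = 3, 128  # input image: [3, 128, 128]
for name, out_channels, kernel, stride, padding in layers:
    if out_channels is not None:
        channels = out_channels
    size = out_size(size, kernel, stride, padding)
    print(f"{name}: [{channels}, {size}, {size}]")

flat_features = channels * size * size
print(flat_features)  # 16384, i.e. 256 * 8 * 8
```

If the input resolution or any pooling layer changes, the first `nn.Linear` has to be updated to match.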

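What does a 1x1 convolution do?

Adds another nonlinear mapping (when followed by an activation), which increases network depth and the network's nonlinear capacity
Raises or lowers the channel dimension; lowering it before a larger convolution reduces that convolution's parameter count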
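A 1x1 convolution can be read as the same small linear layer applied to the channel vector at every pixel. The sketch below spells that out in plain Python on nested lists (no bias, no activation), then adds the parameter arithmetic for a bottleneck. The function `conv1x1`, the toy weights, and the 256/64 channel counts are assumptions chosen for illustration.

```python
def conv1x1(feature_map, weights):
    """feature_map: [C_in][H][W], weights: [C_out][C_in]. Returns [C_out][H][W]."""
    c_in = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [sum(w_row[c] * feature_map[c][y][x] for c in range(c_in)) for x in range(width)]
            for y in range(height)
        ]
        for w_row in weights
    ]


x = [  # 3 channels, each 2x2
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
]

reduced = conv1x1(x, [[1, 1, 1]])  # 3 channels -> 1 channel
expanded = conv1x1(x, [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])  # 3 -> 4
print(reduced)        # [[[111, 222], [333, 444]]]
print(len(expanded))  # 4

# Weight counts (bias ignored): direct 3x3 conv 256 -> 256 versus a bottleneck
# 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256).
direct = 256 * 256 * 3 * 3
bottleneck = 256 * 64 + 64 * 64 * 3 * 3 + 64 * 256
print(direct, bottleneck)  # 589824 69632
```

The spatial size is untouched; only the channel count changes.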

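Classic examples (backbones)

ResNet
- Addresses the "degradation" problem of deep neural networks: after more layers are stacked onto a network, its performance drops rapidly instead of improving
- Two general remedies: change the optimization method (better initialization, better gradient-descent algorithms, and so on), or change the model structure so that it is easier to optimize (changing the structure effectively changes the shape of the error surface)
- Skip connection (shortcut)
- Residual block
- Error surface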
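The idea behind the skip connection can be sketched without any framework: a residual block outputs F(x) + x, so if the learned branch F contributes nothing, the block simply passes x through. The toy code below uses plain lists in place of tensors; `residual_block`, `zero_branch`, and `small_branch` are made-up names, and a real ResNet block uses convolutions, batch normalization, and ReLU for F.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn plays the role of the learned branch F."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_branch(x):
    """A branch that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_branch(x):
    """A branch that applies a small correction: F(x) = 0.1 * x."""
    return [0.1 * v for v in x]


x = [1.0, -2.0, 3.0]

# Stacking many blocks whose branch is zero leaves the input unchanged,
# so extra depth does not have to make the output worse.
out = x
for _ in range(50):
    out = residual_block(out, zero_branch)
print(out == x)  # True

# A non-zero branch only has to learn the correction on top of the identity.
corrected = residual_block(x, small_branch)
```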
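Darknet53
FPN
- Low-level features carry little semantic information but locate objects precisely; high-level features carry rich semantic information but locate objects only coarsely.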
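FPN combines the two kinds of features through a top-down pathway: the coarse, semantically rich map is upsampled and added element-wise to the finer map from a lower level. Below is a heavily simplified sketch on single-channel 2D lists. It assumes nearest-neighbour 2x upsampling and omits the 1x1 lateral convolution and the 3x3 smoothing convolution that a full FPN applies; `upsample2x` and `merge` are illustrative names.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))  # repeat the row vertically
    return out


def merge(coarse, fine):
    """Upsample the coarse (high-level) map and add it to the fine (low-level) map."""
    up = upsample2x(coarse)
    return [[u + f for u, f in zip(up_row, fine_row)] for up_row, fine_row in zip(up, fine)]


coarse = [[10]]          # 1x1 high-level map: strong semantics, coarse position
fine = [[1, 2], [3, 4]]  # 2x2 low-level map: weak semantics, precise position

merged = merge(coarse, fine)
print(merged)  # [[11, 12], [13, 14]]
```

Each position in the merged map keeps the fine map's spatial detail while inheriting the high-level response that covers it.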

Recommended videos

Andrew Ng's deep learning course (CNN section)

lhcstation

LHC, 2023-02-23
