Issues with AdaptiveAvgPool2d when converting PyTorch models to ONNX

1、torchvision.models.vgg11_bn

from torchsummary import summary
import torch
from torchvision import models


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = models.vgg11_bn(num_classes=2).to(device)

# Print the model structure
backbone1 = summary(model, (3, 128, 128))
backbone2 = summary(model, (3, 224, 224))
  • With an input image size of (3, 224, 224), the model summary is as follows:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
       BatchNorm2d-2         [-1, 64, 224, 224]             128
              ReLU-3         [-1, 64, 224, 224]               0
         MaxPool2d-4         [-1, 64, 112, 112]               0
            Conv2d-5        [-1, 128, 112, 112]          73,856
       BatchNorm2d-6        [-1, 128, 112, 112]             256
              ReLU-7        [-1, 128, 112, 112]               0
         MaxPool2d-8          [-1, 128, 56, 56]               0
            Conv2d-9          [-1, 256, 56, 56]         295,168
      BatchNorm2d-10          [-1, 256, 56, 56]             512
             ReLU-11          [-1, 256, 56, 56]               0
           Conv2d-12          [-1, 256, 56, 56]         590,080
      BatchNorm2d-13          [-1, 256, 56, 56]             512
             ReLU-14          [-1, 256, 56, 56]               0
        MaxPool2d-15          [-1, 256, 28, 28]               0
           Conv2d-16          [-1, 512, 28, 28]       1,180,160
      BatchNorm2d-17          [-1, 512, 28, 28]           1,024
             ReLU-18          [-1, 512, 28, 28]               0
           Conv2d-19          [-1, 512, 28, 28]       2,359,808
      BatchNorm2d-20          [-1, 512, 28, 28]           1,024
             ReLU-21          [-1, 512, 28, 28]               0
        MaxPool2d-22          [-1, 512, 14, 14]               0
           Conv2d-23          [-1, 512, 14, 14]       2,359,808
      BatchNorm2d-24          [-1, 512, 14, 14]           1,024
             ReLU-25          [-1, 512, 14, 14]               0
           Conv2d-26          [-1, 512, 14, 14]       2,359,808
      BatchNorm2d-27          [-1, 512, 14, 14]           1,024
             ReLU-28          [-1, 512, 14, 14]               0
        MaxPool2d-29            [-1, 512, 7, 7]               0
AdaptiveAvgPool2d-30            [-1, 512, 7, 7]               0
           Linear-31                 [-1, 4096]     102,764,544
             ReLU-32                 [-1, 4096]               0
          Dropout-33                 [-1, 4096]               0
           Linear-34                 [-1, 4096]      16,781,312
             ReLU-35                 [-1, 4096]               0
          Dropout-36                 [-1, 4096]               0
           Linear-37                    [-1, 2]           8,194
================================================================
Total params: 128,780,034
Trainable params: 128,780,034
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 182.02
Params size (MB): 491.26
Estimated Total Size (MB): 673.85
----------------------------------------------------------------
  • With an input image size of (3, 128, 128), the model summary is as follows:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 128, 128]           1,792
       BatchNorm2d-2         [-1, 64, 128, 128]             128
              ReLU-3         [-1, 64, 128, 128]               0
         MaxPool2d-4           [-1, 64, 64, 64]               0
            Conv2d-5          [-1, 128, 64, 64]          73,856
       BatchNorm2d-6          [-1, 128, 64, 64]             256
              ReLU-7          [-1, 128, 64, 64]               0
         MaxPool2d-8          [-1, 128, 32, 32]               0
            Conv2d-9          [-1, 256, 32, 32]         295,168
      BatchNorm2d-10          [-1, 256, 32, 32]             512
             ReLU-11          [-1, 256, 32, 32]               0
           Conv2d-12          [-1, 256, 32, 32]         590,080
      BatchNorm2d-13          [-1, 256, 32, 32]             512
             ReLU-14          [-1, 256, 32, 32]               0
        MaxPool2d-15          [-1, 256, 16, 16]               0
           Conv2d-16          [-1, 512, 16, 16]       1,180,160
      BatchNorm2d-17          [-1, 512, 16, 16]           1,024
             ReLU-18          [-1, 512, 16, 16]               0
           Conv2d-19          [-1, 512, 16, 16]       2,359,808
      BatchNorm2d-20          [-1, 512, 16, 16]           1,024
             ReLU-21          [-1, 512, 16, 16]               0
        MaxPool2d-22            [-1, 512, 8, 8]               0
           Conv2d-23            [-1, 512, 8, 8]       2,359,808
      BatchNorm2d-24            [-1, 512, 8, 8]           1,024
             ReLU-25            [-1, 512, 8, 8]               0
           Conv2d-26            [-1, 512, 8, 8]       2,359,808
      BatchNorm2d-27            [-1, 512, 8, 8]           1,024
             ReLU-28            [-1, 512, 8, 8]               0
        MaxPool2d-29            [-1, 512, 4, 4]               0
AdaptiveAvgPool2d-30            [-1, 512, 7, 7]               0
           Linear-31                 [-1, 4096]     102,764,544
             ReLU-32                 [-1, 4096]               0
          Dropout-33                 [-1, 4096]               0
           Linear-34                 [-1, 4096]      16,781,312
             ReLU-35                 [-1, 4096]               0
           Dropout-36                 [-1, 4096]               0
           Linear-37                    [-1, 2]           8,194
================================================================
Total params: 128,780,034
Trainable params: 128,780,034
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 59.69
Params size (MB): 491.26
Estimated Total Size (MB): 551.14
----------------------------------------------------------------

2、Observations

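  • Both input sizes train without problems. 224 is the size torchvision officially uses: the models are trained on ImageNet at that size, and pretrained weights are provided. With a 128 input the pretrained weights can still be used, but note that the output size changes between MaxPool2d and AdaptiveAvgPool2d (from 4x4 to 7x7). This is because AdaptiveAvgPool2d derives its pooling window size and stride from the input size so that the output always has the requested size, which lets the model accept different input sizes.
  • During training these layers behave like any other layer: they produce outputs in the forward pass, gradients flow through them in the backward pass, and they have no learnable parameters. The limitation lies only in the ONNX exporter, so even if exporting to ONNX runs into restrictions or errors, the model can still be trained with gradient descent.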
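
How AdaptiveAvgPool2d adapts to the input can be made concrete with a small pure-Python sketch. It assumes the index rule PyTorch's adaptive pooling is generally described as using (start = floor(i * in / out), end = ceil((i + 1) * in / out)); the helper name adaptive_pool_windows is made up for illustration and is not part of any library.

```python
def adaptive_pool_windows(in_size, out_size):
    """Return the [start, end) input range averaged for each output index.

    Assumed rule: start = floor(i * in / out), end = ceil((i + 1) * in / out).
    """
    windows = []
    for i in range(out_size):
        start = (i * in_size) // out_size
        # Integer ceiling division, avoids floating-point rounding.
        end = ((i + 1) * in_size + out_size - 1) // out_size
        windows.append((start, end))
    return windows


# 224x224 input: the last MaxPool2d yields 7x7, and 7 -> 7 is one cell per output.
print(adaptive_pool_windows(7, 7))
# 128x128 input: the last MaxPool2d yields 4x4, and 4 -> 7 needs overlapping
# windows of alternating width 1 and 2.
print(adaptive_pool_windows(4, 7))
# 8 -> 4 divides evenly: every window has width 2 and the windows do not overlap.
print(adaptive_pool_windows(8, 4))
```

In the 4 -> 7 case the windows have different widths, so no single fixed kernel size and stride reproduces them; in the 8 -> 4 case a plain average pool with kernel 2 and stride 2 gives the same result.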

3、Conclusions

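  • When exporting to ONNX, if the input size of AdaptiveAvgPool2d is not an integer multiple of its output size, the export fails with the following error: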
raise errors.SymbolicValueError(
torch.onnx.errors.SymbolicValueError: Unsupported: ONNX export of operator adaptive_avg_pool2d, output size that are not factor of input size. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues  [Caused by the value '100 defined in (%100 : Long(2, strides=[1], device=cpu) = onnx::Constant[value= 7  7 [ CPULongType{2} ]]()
)' (type 'Tensor') in the TorchScript graph. The containing node has kind 'onnx::Constant'.] 
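  • So AdaptiveAvgPool2d has to be modified for the model to export correctly.
  • Many dynamic layers are not supported during ONNX export and have to be changed to produce a fixed output.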
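  • A note on torchsummary: the batch_size argument of its summary function defaults to -1, which is why the first dimension of every Output Shape above is shown as -1. Internally, summary runs one forward pass on a random input with a batch size of 2; a comment in the torchsummary source says this is done for the sake of BatchNorm layers.

So when you call summary(model, (3, 128, 128)), torchsummary feeds a random tensor of shape (2, 3, 128, 128) through the model to record each layer's output shape and parameter count.

If you want the summary to report a different batch size, pass it explicitly when calling summary, for example:

backbone1 = summary(model, input_size=(3, 128, 128), batch_size=4)

The batch_size argument changes the batch dimension shown in the Output Shape column and the size estimates derived from it. It does not change the parameter count, which is independent of batch size.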
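
The size figures at the bottom of each summary can be reproduced by hand. The sketch below assumes torchsummary estimates sizes as element count times batch size times 4 bytes (float32), divided by 1024², and takes the absolute value so that the default batch_size=-1 yields a per-sample figure. This is based on a reading of the torchsummary source and should be checked against the installed version. The helper name size_mb is made up for illustration.

```python
def size_mb(num_elements, batch_size=-1):
    """Estimated size in MB of float32 data (assumed torchsummary-style formula)."""
    return abs(num_elements * batch_size * 4.0 / (1024 ** 2))


# Input size for one (3, 128, 128) and one (3, 224, 224) image.
print(f"{size_mb(3 * 128 * 128):.2f}")  # 0.19
print(f"{size_mb(3 * 224 * 224):.2f}")  # 0.57

# With batch_size=4 the input estimate scales accordingly.
print(f"{size_mb(3 * 128 * 128, batch_size=4):.2f}")  # 0.75

# Parameter size depends only on the number of parameters, not on the batch size.
print(f"{size_mb(128_780_034, batch_size=1):.2f}")  # 491.26
```

The first, second, and last values match the "Input size (MB)" and "Params size (MB)" lines in the summaries above.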

4、Model modification

from torchsummary import summary
import torch
from torchvision import models
from torch import nn


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = models.vgg11_bn(num_classes=2)
model.avgpool = nn.AdaptiveAvgPool2d((4, 4))
model.classifier = nn.Sequential(
            nn.Linear(512 * 4 * 4, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 2),
        )
model.to(device)   # Move the model to the GPU only after modifying it; otherwise an error is raised

# Print the model structure
backbone1 = summary(model, (3, 128, 128))
backbone2 = summary(model, (3, 224, 224))
  • With an input image size of (3, 224, 224), the model summary is as follows:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 224, 224]           1,792
       BatchNorm2d-2         [-1, 64, 224, 224]             128
              ReLU-3         [-1, 64, 224, 224]               0
         MaxPool2d-4         [-1, 64, 112, 112]               0
            Conv2d-5        [-1, 128, 112, 112]          73,856
       BatchNorm2d-6        [-1, 128, 112, 112]             256
              ReLU-7        [-1, 128, 112, 112]               0
         MaxPool2d-8          [-1, 128, 56, 56]               0
            Conv2d-9          [-1, 256, 56, 56]         295,168
      BatchNorm2d-10          [-1, 256, 56, 56]             512
             ReLU-11          [-1, 256, 56, 56]               0
           Conv2d-12          [-1, 256, 56, 56]         590,080
      BatchNorm2d-13          [-1, 256, 56, 56]             512
             ReLU-14          [-1, 256, 56, 56]               0
        MaxPool2d-15          [-1, 256, 28, 28]               0
           Conv2d-16          [-1, 512, 28, 28]       1,180,160
      BatchNorm2d-17          [-1, 512, 28, 28]           1,024
             ReLU-18          [-1, 512, 28, 28]               0
           Conv2d-19          [-1, 512, 28, 28]       2,359,808
      BatchNorm2d-20          [-1, 512, 28, 28]           1,024
             ReLU-21          [-1, 512, 28, 28]               0
        MaxPool2d-22          [-1, 512, 14, 14]               0
           Conv2d-23          [-1, 512, 14, 14]       2,359,808
      BatchNorm2d-24          [-1, 512, 14, 14]           1,024
             ReLU-25          [-1, 512, 14, 14]               0
           Conv2d-26          [-1, 512, 14, 14]       2,359,808
      BatchNorm2d-27          [-1, 512, 14, 14]           1,024
             ReLU-28          [-1, 512, 14, 14]               0
        MaxPool2d-29            [-1, 512, 7, 7]               0
AdaptiveAvgPool2d-30            [-1, 512, 4, 4]               0
           Linear-31                 [-1, 4096]      33,558,528
             ReLU-32                 [-1, 4096]               0
          Dropout-33                 [-1, 4096]               0
           Linear-34                 [-1, 4096]      16,781,312
             ReLU-35                 [-1, 4096]               0
          Dropout-36                 [-1, 4096]               0
           Linear-37                    [-1, 2]           8,194
================================================================
Total params: 59,574,018
Trainable params: 59,574,018
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 181.89
Params size (MB): 227.26
Estimated Total Size (MB): 409.73
----------------------------------------------------------------
  • With an input image size of (3, 128, 128), the model summary is as follows:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 128, 128]           1,792
       BatchNorm2d-2         [-1, 64, 128, 128]             128
              ReLU-3         [-1, 64, 128, 128]               0
         MaxPool2d-4           [-1, 64, 64, 64]               0
            Conv2d-5          [-1, 128, 64, 64]          73,856
       BatchNorm2d-6          [-1, 128, 64, 64]             256
              ReLU-7          [-1, 128, 64, 64]               0
         MaxPool2d-8          [-1, 128, 32, 32]               0
            Conv2d-9          [-1, 256, 32, 32]         295,168
      BatchNorm2d-10          [-1, 256, 32, 32]             512
             ReLU-11          [-1, 256, 32, 32]               0
           Conv2d-12          [-1, 256, 32, 32]         590,080
      BatchNorm2d-13          [-1, 256, 32, 32]             512
             ReLU-14          [-1, 256, 32, 32]               0
        MaxPool2d-15          [-1, 256, 16, 16]               0
           Conv2d-16          [-1, 512, 16, 16]       1,180,160
      BatchNorm2d-17          [-1, 512, 16, 16]           1,024
             ReLU-18          [-1, 512, 16, 16]               0
           Conv2d-19          [-1, 512, 16, 16]       2,359,808
      BatchNorm2d-20          [-1, 512, 16, 16]           1,024
             ReLU-21          [-1, 512, 16, 16]               0
        MaxPool2d-22            [-1, 512, 8, 8]               0
           Conv2d-23            [-1, 512, 8, 8]       2,359,808
      BatchNorm2d-24            [-1, 512, 8, 8]           1,024
             ReLU-25            [-1, 512, 8, 8]               0
           Conv2d-26            [-1, 512, 8, 8]       2,359,808
      BatchNorm2d-27            [-1, 512, 8, 8]           1,024
             ReLU-28            [-1, 512, 8, 8]               0
        MaxPool2d-29            [-1, 512, 4, 4]               0
AdaptiveAvgPool2d-30            [-1, 512, 4, 4]               0
           Linear-31                 [-1, 4096]      33,558,528
             ReLU-32                 [-1, 4096]               0
          Dropout-33                 [-1, 4096]               0
           Linear-34                 [-1, 4096]      16,781,312
             ReLU-35                 [-1, 4096]               0
          Dropout-36                 [-1, 4096]               0
           Linear-37                    [-1, 2]           8,194
================================================================
Total params: 59,574,018
Trainable params: 59,574,018
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.19
Forward/backward pass size (MB): 59.56
Params size (MB): 227.26
Estimated Total Size (MB): 287.01
----------------------------------------------------------------
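
Whether a given input size and avgpool setting will pass the exporter's check can be estimated before exporting. The sketch below assumes the rule implied by the error message in section 3: the export works only when the pooled feature map size is an integer multiple of the AdaptiveAvgPool2d output size, in which case the layer is equivalent to a fixed average pool with kernel = stride = input // output. It also assumes the feature map side is the image side divided by 32 (five 2x2 max-pool layers), which matches the MaxPool2d-29 rows in the summaries. The function names are hypothetical.

```python
def equivalent_avgpool_kernel(in_size, out_size):
    """Return the fixed kernel (= stride) matching the adaptive pool, or None.

    None means in_size is not a multiple of out_size, which is the case the
    ONNX exporter rejects according to the error message shown earlier.
    """
    if in_size % out_size != 0:
        return None
    return in_size // out_size


def check_vgg11_bn_export(image_side, avgpool_out):
    """Apply the check to the feature map reaching avgpool in vgg11_bn."""
    feature_side = image_side // 32  # assumption: five stride-2 max pools
    return equivalent_avgpool_kernel(feature_side, avgpool_out)


# Original head, AdaptiveAvgPool2d((7, 7)).
print(check_vgg11_bn_export(224, 7))  # 1    -> 7x7 to 7x7, exportable
print(check_vgg11_bn_export(128, 7))  # None -> 4x4 to 7x7, rejected

# Modified head, AdaptiveAvgPool2d((4, 4)).
print(check_vgg11_bn_export(128, 4))  # 1    -> 4x4 to 4x4, exportable
print(check_vgg11_bn_export(224, 4))  # None -> 7x7 to 4x4, rejected
```

Under these assumptions the modified model satisfies the exporter's condition for 128x128 inputs, but not for 224x224 inputs, because 7 is not a multiple of 4. The avgpool output size therefore has to be chosen to match the input size that will actually be used for export.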
