
1 Title

        DiffusionDet: Diffusion Model for Object Detection (Shoufa Chen, Peize Sun, Yibing Song, Ping Luo) [ICCV 2023]

2 Conclusion        

        This study proposes DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During the training stage, object boxes diffuse from ground-truth boxes to a random distribution, and the model learns to reverse this noising process. At inference, the model progressively refines a set of randomly generated boxes into the output results.

3 Good Sentences

        1. This noise-to-box approach requires neither heuristic object priors nor learnable queries, further simplifying the object candidates and pushing the development of the detection pipeline forward. (The advantage of applying a diffusion model to object detection)
        2. However, despite significant interest in this idea, there are no previous solutions that successfully adapt generative diffusion models for object detection, the progress of which remarkably lags behind that of segmentation. We argue that this may be because segmentation tasks are processed in an image-to-image style, which is more conceptually similar to the image generation tasks, while object detection is a set prediction problem which requires assigning object candidates to ground truth objects. (Why this study focuses on object detection)
        3. As shown in Figure 3a, the performance of DiffusionDet increases steadily with the number of boxes used for evaluation. (A characteristic of DiffusionDet)

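This paper proposes DiffusionDet, a new framework that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. During training, object boxes are diffused from the ground-truth boxes toward a random distribution, and the model learns to reverse this noising process. At inference, the model progressively refines a set of randomly generated boxes into the final output.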

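Figure: (a) the standard DDPM diffusion process; (b) the denoising process; (c) illustration of the denoising process for object detection.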
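
The forward (noising) direction, in which ground-truth boxes are gradually turned into noise during training, can be sketched as below. This is a minimal illustration of the standard DDPM forward step applied to box coordinates, not the authors' code: the cosine schedule, the `(cx, cy, w, h)` box layout with normalized coordinates, and the function names `cosine_alpha_bar` and `q_sample` are all assumptions made for the example.

```python
import math
import random

def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level alpha_bar_t under a cosine schedule (assumed here)."""
    f = lambda u: math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)

def q_sample(boxes, t, num_steps, rng):
    """Draw z_t ~ N(sqrt(alpha_bar_t) * z_0, (1 - alpha_bar_t) * I) for every box.

    boxes: list of [cx, cy, w, h] ground-truth boxes (assumed normalized).
    """
    a_bar = cosine_alpha_bar(t, num_steps)
    signal = math.sqrt(a_bar)        # how much of the clean box survives
    noise = math.sqrt(1.0 - a_bar)   # how much Gaussian noise is mixed in
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in box] for box in boxes]

rng = random.Random(0)
gt_boxes = [[0.5, 0.5, 0.2, 0.3]]
slightly_noisy = q_sample(gt_boxes, 100, 1000, rng)   # still close to the ground truth
almost_random = q_sample(gt_boxes, 1000, 1000, rng)   # essentially pure Gaussian noise
```

At `t = 0` the boxes are returned unchanged, and at `t = num_steps` almost none of the ground-truth signal remains, which matches the "ground-truth boxes to random distribution" description; the model is then trained to map `z_t` back to the clean boxes.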

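The DiffusionDet framework is shown in the figure. Because diffusion models generate data samples iteratively, the model has to be run multiple times at inference. However, applying the whole model directly to the raw image at every iteration step is computationally intractable. The paper therefore splits the model into two parts, an image encoder and a detection decoder. The encoder runs only once to extract a deep feature representation from the raw input image x. The decoder is conditioned on these deep features rather than on the raw image, and progressively refines the box predictions starting from the noisy boxes z_t. (This idea is similar to latent diffusion, although latent diffusion uses a VAE to extract the features.)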
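
The encoder-once, decoder-many structure can be sketched as a small control-flow example. Everything here is a toy under stated assumptions: `encode_image` and `decode_boxes` are hypothetical stand-ins for the two networks (the "decoder" simply returns a fixed target box), and the update rule is a deterministic DDIM-style step chosen for illustration. Details of the real pipeline, such as post-processing of the predicted boxes, are omitted.

```python
import math
import random

def alpha_bar(t, num_steps, s=0.008):
    """Cosine-schedule cumulative signal level (an assumption for this sketch)."""
    f = lambda u: math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)

def encode_image(image):
    """Hypothetical image encoder: called once per image."""
    encode_image.calls += 1
    return {"target": image["target"]}  # toy "features"
encode_image.calls = 0

def decode_boxes(features, noisy_boxes, t):
    """Hypothetical detection decoder: predicts clean boxes from features and z_t."""
    decode_boxes.calls += 1
    return [list(features["target"]) for _ in noisy_boxes]
decode_boxes.calls = 0

def detect(image, num_boxes=4, num_steps=1000, sample_steps=4, seed=0):
    rng = random.Random(seed)
    features = encode_image(image)  # expensive part, run exactly once
    # Start from purely random boxes z_T.
    z = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(num_boxes)]
    # Evenly spaced timesteps from num_steps down to 0.
    times = [round(num_steps * (1 - i / sample_steps)) for i in range(sample_steps + 1)]
    for t, s in zip(times[:-1], times[1:]):
        z0 = decode_boxes(features, z, t)  # cheap part, run at every step
        a_t, a_s = alpha_bar(t, num_steps), alpha_bar(s, num_steps)
        # Deterministic DDIM-style step: recover the implied noise, re-noise to level s.
        z = [[math.sqrt(a_s) * p + math.sqrt(1 - a_s) * (q - math.sqrt(a_t) * p) / math.sqrt(1 - a_t)
              for p, q in zip(pred, cur)] for pred, cur in zip(z0, z)]
    return z

boxes = detect({"target": [0.5, 0.5, 0.2, 0.3]}, num_boxes=3, sample_steps=4)
```

With `sample_steps=4` the encoder is called once and the decoder four times, so the cost of extra refinement steps grows only with the lightweight decoder.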


