Introduction to Deep Learning (7) - Video Understanding

2024-04-28 21:12:03

Videos

Raw videos are too large to fit in memory, so we have to down-sample them.

Raw video: long, high fps

Training: short clips at low fps

Testing: run the model on several clips from the video and average the predictions
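The train/test recipe above can be sketched in a few lines. This is only an illustration, assuming a video is addressed by frame index and that the model returns one probability vector per clip; the function names and the `clip_len`, `stride`, and `num_clips` parameters are hypothetical and not from the original notes.

```python
def sample_clip_indices(num_frames, clip_len, stride, start):
    """Frame indices of one short, low-fps clip: clip_len frames, every stride-th frame."""
    last = start + (clip_len - 1) * stride
    if start < 0 or last >= num_frames:
        raise ValueError("clip does not fit inside the video")
    return [start + i * stride for i in range(clip_len)]


def evenly_spaced_clips(num_frames, clip_len, stride, num_clips):
    """Test-time sampling: num_clips clips whose start frames are spread over the video."""
    span = (clip_len - 1) * stride + 1          # raw frames covered by one clip
    max_start = num_frames - span
    if max_start < 0:
        raise ValueError("video is shorter than one clip")
    if num_clips == 1:
        starts = [max_start // 2]
    else:
        starts = [i * max_start // (num_clips - 1) for i in range(num_clips)]
    return [sample_clip_indices(num_frames, clip_len, stride, s) for s in starts]


def average_predictions(per_clip_probs):
    """Average the per-clip class probabilities into one video-level prediction."""
    num_clips = len(per_clip_probs)
    return [sum(column) / num_clips for column in zip(*per_clip_probs)]


clips = evenly_spaced_clips(num_frames=100, clip_len=4, stride=8, num_clips=3)
# clips[0] is [0, 8, 16, 24]; the last clip ends at frame 99
```

During training, a single clip with a random `start` would be drawn instead of the evenly spaced set.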

Single-frame CNN

Train a normal 2D CNN to classify frames independently.

Simple, but a very strong baseline.

Late fusion

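Take the time axis into account at the end of the network: run a 2D CNN on every frame independently.

Combine the per-frame CNN outputs (flatten and concatenate them, or average-pool them) and feed the result to an MLP to get class scores.

Problem: it is hard to compare low-level motion between frames, because the frames are only combined after each one has been reduced to high-level features.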
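A small sketch of the average-pooling variant makes the problem concrete. It assumes the per-frame CNN features are already computed and, for brevity, uses a single linear layer in place of the MLP; `late_fusion_scores` and the toy numbers are hypothetical.

```python
def late_fusion_scores(frame_features, weights, bias):
    """Average per-frame features over time, then apply one linear layer.

    frame_features: T feature vectors of length D (one per frame)
    weights: C rows of length D, bias: C values (C = number of classes)
    """
    num_frames = len(frame_features)
    pooled = [sum(values) / num_frames for values in zip(*frame_features)]
    return [sum(w * x for w, x in zip(row, pooled)) + b
            for row, b in zip(weights, bias)]


features = [[1.0, 0.0], [2.0, 4.0], [3.0, 2.0]]   # T=3 frames, D=2 features
weights = [[1.0, -1.0], [0.5, 0.5]]               # C=2 classes
bias = [0.0, 1.0]

forward = late_fusion_scores(features, weights, bias)
backward = late_fusion_scores(features[::-1], weights, bias)
# forward == backward: pooling over time discards the order of the frames
```

Playing the clip backwards gives the same scores, which is one way to see why this variant struggles with motion. The flatten-and-concatenate variant does keep frame order, but it still only compares frames after the CNN.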

Early Fusion

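Compare frames in the very first conv layer: stack the frames along the channel dimension (T x 3 x H x W becomes 3T x H x W), so that the first 2D conv sees all frames at once.

After that, the rest of the network is a normal 2D CNN that produces the class scores.

Problem: a single layer of temporal processing may not be enough.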
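The input reshaping behind early fusion can be shown with plain nested lists. This is a sketch under the assumption that a clip is stored as T x C x H x W; `stack_frames_on_channels` and `shape` are hypothetical helper names.

```python
def stack_frames_on_channels(clip):
    """Turn a clip of shape T x C x H x W into one 'image' of shape (T*C) x H x W."""
    return [channel for frame in clip for channel in frame]


def shape(nested):
    """Shape of a regular nested list, e.g. (T, C, H, W)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


T, C, H, W = 4, 3, 2, 2
# Toy clip: every pixel of channel c in frame t holds the value 10*t + c.
clip = [[[[10 * t + c] * W for _ in range(H)] for c in range(C)] for t in range(T)]

stacked = stack_frames_on_channels(clip)
# shape(clip) is (4, 3, 2, 2) and shape(stacked) is (12, 2, 2)
```

The first conv layer then needs T*C input channels, and its output no longer has a time axis, which is why all temporal processing happens in that one layer.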

3D CNN (slow fusion network)

Use 3D conv and 3D pooling operations throughout the network, so that temporal information is fused gradually, layer by layer.

C3D: The VGG of 3D CNNs
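To see how a 3D CNN shrinks the time axis gradually, it helps to track tensor shapes through a few layers. The helper below uses the standard output-size formula floor((n + 2p - k) / s) + 1 on each of T, H, and W; the function names and the layer sizes are illustrative assumptions, not a faithful copy of C3D.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d_output_shape(input_shape, out_channels, kernel,
                        stride=(1, 1, 1), padding=(0, 0, 0)):
    """input_shape is (C, T, H, W); returns (out_channels, T', H', W')."""
    _, t, h, w = input_shape
    dims = [conv_output_size(n, k, s, p)
            for n, k, s, p in zip((t, h, w), kernel, stride, padding)]
    return (out_channels, *dims)


def pool3d_output_shape(input_shape, kernel, stride):
    """3D pooling keeps the channel count and shrinks T, H, W."""
    channels = input_shape[0]
    return conv3d_output_shape(input_shape, channels, kernel, stride)


x = (3, 16, 112, 112)                                      # RGB clip, 16 frames
x = conv3d_output_shape(x, 64, (3, 3, 3), padding=(1, 1, 1))
x = pool3d_output_shape(x, (1, 2, 2), (1, 2, 2))           # pool space only
x = pool3d_output_shape(x, (2, 2, 2), (2, 2, 2))           # pool time and space
# x is now (64, 8, 28, 28): the time axis is still there, just shorter
```

Unlike early fusion, the time axis survives the first layer, so later layers can keep combining information across frames.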

Measuring Motion: Optical Flow

Optical Flow gives a displacement field F between image t and image t+1.
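As a toy illustration of what the displacement field means: if F(x, y) = (dx, dy), the pixel at (x, y) in image t is expected at (x + dx, y + dy) in image t+1. The sketch below assumes integer displacements and images stored as lists of rows; `warp_with_flow` is a hypothetical helper, and real flow fields are sub-pixel and have to be estimated from the two images.

```python
def warp_with_flow(frame, flow, fill=0):
    """Move every pixel of image t by its flow vector to predict image t+1.

    frame: H x W list of rows
    flow:  H x W list of (dx, dy) integer displacements, indexed as flow[y][x]
    Pixels that leave the image are dropped; uncovered pixels get `fill`.
    """
    height, width = len(frame), len(frame[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                out[ny][nx] = frame[y][x]
    return out


image_t = [[1, 2, 0],
           [3, 4, 0],
           [0, 0, 0]]
flow = [[(1, 1)] * 3 for _ in range(3)]   # everything moves one pixel right and down

image_t1 = warp_with_flow(image_t, flow)
# image_t1 is [[0, 0, 0], [0, 1, 2], [0, 3, 4]]
```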

Two-Stream Networks

Use two ConvNets: a spatial stream that sees appearance (RGB frames) and a temporal stream that sees motion (a stack of optical flow fields). Each stream produces class scores.

Train the two ConvNets separately and average their predictions at test time.
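Test-time fusion of the two streams can be sketched as averaging their softmax outputs. The raw scores below are made-up numbers standing in for the outputs of the two separately trained networks, and `fuse_two_streams` is a hypothetical name.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(spatial_scores, temporal_scores):
    """Average the class probabilities of the spatial and temporal streams."""
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    return [(a + b) / 2 for a, b in zip(p_spatial, p_temporal)]


spatial = [2.0, 1.0, 0.0]     # appearance slightly favours class 0
temporal = [0.0, 1.0, 3.0]    # motion strongly favours class 2

fused = fuse_two_streams(spatial, temporal)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Here the confident temporal stream outweighs the less certain spatial stream, so the fused prediction is class 2.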

Modeling long-term temporal structure

Extract features with a CNN (2D or 3D), then run an LSTM over them along the time axis.

Trick: to save memory, we sometimes do not backprop into the CNN and use it as a fixed feature extractor.

Recurrent Convolutional Network

Spatio-Temporal Self-Attention (Nonlocal Block)

Inflating 2D Networks to 3D (I3D)

Take a 2D CNN architecture and replace each 2D conv/pool layer with a 3D version.

We can inflate the trained weights as well: copy each 2D kernel T times along the time axis and divide by T.

Pre-train the model on images -> inflate it to 3D -> fine-tune on videos
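Weight inflation is easy to check on a single kernel. In I3D the 2D kernel is copied T times along time and divided by T, so that a video made of identical frames produces the same response as the 2D kernel on one frame. The sketch below assumes a single-channel K x K kernel stored as nested lists; the helper names are hypothetical.

```python
def inflate_kernel(kernel_2d, time_size):
    """Copy a K x K kernel time_size times along time and divide by time_size."""
    return [[[w / time_size for w in row] for row in kernel_2d]
            for _ in range(time_size)]


def response_2d(patch, kernel_2d):
    """Response of a 2D kernel at one position (sum of elementwise products)."""
    return sum(p * w
               for patch_row, kernel_row in zip(patch, kernel_2d)
               for p, w in zip(patch_row, kernel_row))


def response_3d(clip_patch, kernel_3d):
    """Response of a 3D kernel on a T x K x K patch of the video."""
    return sum(response_2d(patch, kernel_slice)
               for patch, kernel_slice in zip(clip_patch, kernel_3d))


kernel = [[1.0, 2.0], [3.0, 4.0]]
patch = [[1.0, 0.0], [2.0, 1.0]]
static_clip = [patch] * 4                     # the same frame repeated 4 times

inflated = inflate_kernel(kernel, time_size=4)
# response_3d(static_clip, inflated) equals response_2d(patch, kernel)
```

This equality is what lets the inflated network start from the behaviour of the pre-trained image model before it is fine-tuned on videos.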

The same visualization tricks used for image CNNs also work on video models.

SlowFast Networks

Other tasks: Temporal Action Localization, Spatio-Temporal Detection

Original article: https://blog.csdn.net/andyc_03/article/details/138243992
