Classic Introductory Papers for Video Research

  • Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images

​ intro: CVPR 2016
​ intro: Lead–Exceed Neural Network (LENN), LSTM
​ paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/CVPR16_webly_final.pdf

  • Video Fill in the Blank with Merging LSTMs

​ intro: for Large Scale Movie Description and Understanding Challenge (LSMDC) 2016, "Movie fill-in-the-blank" Challenge, UCF_CRCV
​ intro: Video-Fill-in-the-Blank (ViFitB)
​ arxiv: https://arxiv.org/abs/1610.04062

  • Video Pixel Networks

​ intro: Google DeepMind
​ arxiv: https://arxiv.org/abs/1610.00527

  • Robust Video Synchronization using Unsupervised Deep Learning

​ arxiv: https://arxiv.org/abs/1610.05985

  • Video Propagation Networks

​ intro: CVPR 2017. Max Planck Institute for Intelligent Systems & Bernstein Center for Computational Neuroscience
​ project page: https://varunjampani.github.io/vpn/
​ arxiv: https://arxiv.org/abs/1612.05478
​ github(Caffe): https://github.com/varunjampani/video_prop_networks

  • Video Frame Synthesis using Deep Voxel Flow

​ project page: https://liuziwei7.github.io/projects/VoxelFlow.html
​ arxiv: https://arxiv.org/abs/1702.02463

  • Optimizing Deep CNN-Based Queries over Video Streams at Scale

 intro: Stanford InfoLab
 keywords: NoScope, difference detectors, specialized models
​ arxiv: https://arxiv.org/abs/1703.02529
​ github: https://github.com/stanford-futuredata/noscope
​ github: https://github.com/stanford-futuredata/tensorflow-noscope

  • NoScope: 1000x Faster Deep Learning Queries over Video

http://dawn.cs.stanford.edu/2017/06/22/noscope/

  • Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos

​ intro: CVPR 2017. Stanford University & University of Southern California
​ arxiv: https://arxiv.org/abs/1703.02521

  • ProcNets: Learning to Segment Procedures in Untrimmed and Unconstrained Videos

https://arxiv.org/abs/1703.09788

  • Unsupervised Learning Layers for Video Analysis

​ intro: Baidu Research
​ intro: "The experiments demonstrated the potential applications of UL layers and online learning algorithm to head orientation estimation and moving object localization"
​ arxiv: https://arxiv.org/abs/1705.08918

  • Look, Listen and Learn

​ intro: DeepMind
​ intro: "Audio-Visual Correspondence" learning
​ arxiv: https://arxiv.org/abs/1705.08168

  • Video Imagination from a Single Image with Transformation Generation

​ intro: Peking University
​ arxiv: https://arxiv.org/abs/1706.04124
​ github: https://github.com/gitpub327/VideoImagination

  • Learning to Learn from Noisy Web Videos

​ intro: CVPR 2017. Stanford University & CMU & Simon Fraser University
​ arxiv: https://arxiv.org/abs/1706.02884

  • Convolutional Long Short-Term Memory Networks for Recognizing First Person Interactions

 intro: Accepted at the Second International Workshop on Egocentric Perception, Interaction and Computing (EPIC) at the International Conference on Computer Vision (ICCV 2017)
​ arxiv: https://arxiv.org/abs/1709.06495

  • Learning Binary Residual Representations for Domain-specific Video Streaming

​ intro: AAAI 2018
​ project page: http://research.nvidia.com/publication/2018-02_Learning-Binary-Residual
​ arxiv: https://arxiv.org/abs/1712.05087

  • Video Representation Learning Using Discriminative Pooling

​ intro: CVPR 2018
​ arxiv: https://arxiv.org/abs/1803.10628

  • Rethinking the Faster R-CNN Architecture for Temporal Action Localization

​ intro: CVPR 2018
​ arxiv: https://arxiv.org/abs/1804.07667

  • Deep Keyframe Detection in Human Action Videos

​ intro: two-stream ConvNet
​ arxiv: https://arxiv.org/abs/1804.10021

  • FFNet: Video Fast-Forwarding via Reinforcement Learning

​ intro: CVPR 2018
​ arxiv: https://arxiv.org/abs/1805.02792

  • Fast forwarding Egocentric Videos by Listening and Watching

https://arxiv.org/abs/1806.04620

  • Scanner: Efficient Video Analysis at Scale

​ intro: CMU
​ arxiv: https://arxiv.org/abs/1805.07339

  • Massively Parallel Video Networks

​ intro: DeepMind & University of Oxford
​ arxiv: https://arxiv.org/abs/1806.03863

  • Object Level Visual Reasoning in Videos

​ intro: LIRIS & Facebook AI Research
​ arxiv: https://arxiv.org/abs/1806.06157

  • Video Time: Properties, Encoders and Evaluation

​ intro: BMVC 2018
​ arxiv: https://arxiv.org/abs/1807.06980

Video Classification

  • Large-scale Video Classification with Convolutional Neural Networks

 intro: CVPR 2014
 project page: http://cs.stanford.edu/people/karpathy/deepvideo/
 paper: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Karpathy_Large-scale_Video_Classification_2014_CVPR_paper.pdf

  • Exploiting Image-trained CNN Architectures for Unconstrained Video Classification

 intro: Video-level event detection: extracting deep features for each frame, then averaging the frame-level deep features
​ arxiv: http://arxiv.org/abs/1503.04144

  • Beyond Short Snippets: Deep Networks for Video Classification

​ intro: CNN + LSTM
​ arxiv: http://arxiv.org/abs/1503.08909
​ demo: http://pan.baidu.com/s/1eQ9zLZk

  • Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification

​ intro: ACM Multimedia, 2015
​ arxiv: http://arxiv.org/abs/1504.01561

  • Video Content Recognition with Deep Learning

​ author: Zuxuan Wu, Fudan University
​ slides: http://vision.ouc.edu.cn/valse/slides/20160420/Zuxuan Wu - Video Content Recognition with Deep Learning-Zuxuan Wu.pdf

  • Video Content Recognition with Deep Learning

​ author: Yu-Gang Jiang, Lab for Big Video Data Analytics (BigVid), Fudan University
​ slides: http://www.yugangjiang.info/slides/DeepVideoTalk-2015.pdf

  • Efficient Large Scale Video Classification

​ intro: Google
​ arxiv: http://arxiv.org/abs/1505.06250

  • Fusing Multi-Stream Deep Networks for Video Classification

​ arxiv: http://arxiv.org/abs/1509.06086

  • Learning End-to-end Video Classification with Rank-Pooling

​ paper: http://jmlr.org/proceedings/papers/v48/fernando16.html
​ paper: http://jmlr.csail.mit.edu/proceedings/papers/v48/fernando16.pdf
​ summary(by Hugo Larochelle): http://www.shortscience.org/paper?bibtexKey=conf/icml/FernandoG16#hlarochelle

  • Deep Learning for Video Classification and Captioning

​ arxiv: http://arxiv.org/abs/1609.06782

  • Fast Video Classification via Adaptive Cascading of Deep Models

​ arxiv: https://arxiv.org/abs/1611.06453

  • Deep Feature Flow for Video Recognition

​ intro: CVPR 2017
​ intro: It provides a simple, fast, accurate, and end-to-end framework for video recognition (e.g., object detection and semantic segmentation in videos)
​ arxiv: https://arxiv.org/abs/1611.07715
​ github(official, MXNet): https://github.com/msracver/Deep-Feature-Flow
​ youtube: https://www.youtube.com/watch?v=J0rMHE6ehGw

  • Large-Scale YouTube-8M Video Understanding with Deep Neural Networks

https://arxiv.org/abs/1706.04488

  • Deep Learning Methods for Efficient Large Scale Video Labeling

 intro: Solution to the Kaggle competition "Google Cloud & YouTube-8M Video Understanding Challenge"
​ arxiv: https://arxiv.org/abs/1706.04572
​ github: https://github.com/mpekalski/Y8M

  • Learnable pooling with Context Gating for video classification

​ intro: CVPR17 Youtube 8M workshop. Kaggle 1st place
​ arxiv: https://arxiv.org/abs/1706.06905
​ github: https://github.com/antoine77340/LOUPE

  • Aggregating Frame-level Features for Large-Scale Video Classification

​ intro: Youtube-8M Challenge, 4th place
​ arxiv: https://arxiv.org/abs/1707.00803

  • Tensor-Train Recurrent Neural Networks for Video Classification

https://arxiv.org/abs/1707.01786

  • Hierarchical Deep Recurrent Architecture for Video Understanding

​ intro: Classification Challenge Track paper in CVPR 2017 Workshop on YouTube-8M Large-Scale Video Understanding
​ arxiv: https://arxiv.org/abs/1707.03296

  • Large-scale Video Classification guided by Batch Normalized LSTM Translator

​ intro: CVPR2017 Workshop on Youtube-8M Large-scale Video Understanding
​ arxiv: https://arxiv.org/abs/1707.04045

  • UTS submission to Google YouTube-8M Challenge 2017

​ intro: CVPR'17 Workshop on YouTube-8M
​ arxiv: https://arxiv.org/abs/1707.04143
​ github: https://github.com/ffmpbgrnn/yt8m

  • A spatiotemporal model with visual attention for video classification

https://arxiv.org/abs/1707.02069

  • Cultivating DNN Diversity for Large Scale Video Labelling

​ intro: CVPR 2017 Youtube-8M Workshop
​ arxiv: https://arxiv.org/abs/1707.04272

  • Attention Transfer from Web Images for Video Recognition

​ intro: ACM Multimedia, 2017
​ arxiv: https://arxiv.org/abs/1708.00973

  • Non-local Neural Networks

​ intro: CVPR 2018. CMU & Facebook AI Research
​ arxiv: https://arxiv.org/abs/1711.07971
​ github(Caffe2): https://github.com/facebookresearch/video-nonlocal-net

  • Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

https://arxiv.org/abs/1711.08200

  • Appearance-and-Relation Networks for Video Classification

​ arxiv: https://arxiv.org/abs/1711.09125
​ github: https://github.com/wanglimin/ARTNet

  • Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

​ intro: ECCV 2018. Google Research & University of California San Diego
​ arxiv: https://arxiv.org/abs/1712.04851

  • Long Activity Video Understanding using Functional Object-Oriented Network

https://arxiv.org/abs/1807.00983

  • Deep Architectures and Ensembles for Semantic Video Classification

https://arxiv.org/abs/1807.01026

  • Deep Discriminative Model for Video Classification

​ intro: ECCV 2018
​ arxiv: https://arxiv.org/abs/1807.08259

  • Deep Video Color Propagation

 intro: BMVC 2018
 arxiv: https://arxiv.org/abs/1808.03232

  • Non-local NetVLAD Encoding for Video Classification

​ intro: ECCV 2018 workshop on YouTube-8M Large-Scale Video Understanding
​ intro: Tencent AI Lab & Fudan University
​ arxiv: https://arxiv.org/abs/1810.00207

  • Learnable Pooling Methods for Video Classification

​ intro: Youtube 8M ECCV18 Workshop
​ arxiv: https://arxiv.org/abs/1810.00530
​ github: https://github.com/pomonam/LearnablePoolingMethods

  • NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification

​ intro: ECCV 2018 workshop
​ arxiv: https://arxiv.org/abs/1811.05014
​ github: https://github.com/linrongc/youtube-8m

Video Action Recognition / Action Detection

  • 3D Convolutional Neural Networks for Human Action Recognition

​ paper: http://www.cs.odu.edu/~sji/papers/pdf/Ji_ICML10.pdf

  • Sequential Deep Learning for Human Action Recognition

​ paper: http://liris.cnrs.fr/Documents/Liris-5228.pdf

  • Two-Stream Convolutional Networks for Action Recognition in Videos

​ arxiv: http://arxiv.org/abs/1406.2199

  • Finding Action Tubes

 intro: "built action models from shape and motion cues. They start from the image proposals and select the motion salient subset of them and extract spatio-temporal features to represent the video using the CNNs."
​ arxiv: http://arxiv.org/abs/1411.6031

  • Hierarchical Recurrent Neural Network for Skeleton Based Action Recognition

​ paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Du_Hierarchical_Recurrent_Neural_2015_CVPR_paper.pdf

  • Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors

 intro: CVPR 2015. TDD
 paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wang_Action_Recognition_With_2015_CVPR_paper.pdf
​ ext: http://www.cv-foundation.org/openaccess/content_cvpr_2015/app/2B_105_ext.pdf
​ poster: https://wanglimin.github.io/papers/WangQT_CVPR15_Poster.pdf
​ github: https://github.com/wanglimin/TDD

  • Action Recognition by Hierarchical Mid-level Action Elements

​ paper: http://cvgl.stanford.edu/papers/tian2015.pdf

  • Contextual Action Recognition with R*CNN

​ arxiv: http://arxiv.org/abs/1505.01197
​ github: https://github.com/gkioxari/RstarCNN

  • Towards Good Practices for Very Deep Two-Stream ConvNets

​ arxiv: http://arxiv.org/abs/1507.02159
​ github: https://github.com/yjxiong/caffe

  • Action Recognition using Visual Attention

​ intro: LSTM / RNN
​ arxiv: http://arxiv.org/abs/1511.04119
​ project page: http://shikharsharma.com/projects/action-recognition-attention/
​ github(Python/Theano): https://github.com/kracwarlock/action-recognition-visual-attention

  • End-to-end Learning of Action Detection from Frame Glimpses in Videos

​ intro: CVPR 2016
​ project page: http://ai.stanford.edu/~syyeung/frameglimpses.html
​ arxiv: http://arxiv.org/abs/1511.06984
​ paper: http://vision.stanford.edu/pdf/yeung2016cvpr.pdf

  • Multi-velocity neural networks for gesture recognition in videos

​ arxiv: http://arxiv.org/abs/1603.06829

  • Active Learning for Online Recognition of Human Activities from Streaming Videos

​ arxiv: http://arxiv.org/abs/1604.02855

  • Convolutional Two-Stream Network Fusion for Video Action Recognition

​ arxiv: http://arxiv.org/abs/1604.06573
​ github: https://github.com/feichtenhofer/twostreamfusion

  • Deep, Convolutional, and Recurrent Models for Human Activity Recognition using Wearables

​ arxiv: http://arxiv.org/abs/1604.08880

  • Unsupervised Semantic Action Discovery from Video Collections

​ arxiv: http://arxiv.org/abs/1605.03324

  • Anticipating Visual Representations from Unlabeled Video

​ paper: http://web.mit.edu/vondrick/prediction.pdf

  • VideoLSTM Convolves, Attends and Flows for Action Recognition

​ arxiv: http://arxiv.org/abs/1607.01794

  • Hierarchical Attention Network for Action Recognition in Videos (HAN)

​ arxiv: http://arxiv.org/abs/1607.06416

  • Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

​ arxiv: http://arxiv.org/abs/1607.07043

  • Connectionist Temporal Modeling for Weakly Supervised Action Labeling

​ arxiv: http://arxiv.org/abs/1607.08584

  • CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016

​ intro: won the 1st place in the untrimmed video classification task of ActivityNet Challenge 2016. TSN
​ arxiv: http://arxiv.org/abs/1608.00797
​ github: https://github.com/yjxiong/anet2016-cuhk

  • Actionness Estimation Using Hybrid FCNs

​ intro: CVPR 2016. H-FCN
​ project page: http://wanglimin.github.io/actionness_hfcn/index.html
​ paper: http://wanglimin.github.io/papers/WangQTV_CVPR16.pdf
​ github: https://github.com/wanglimin/actionness-estimation/

  • Real-time Action Recognition with Enhanced Motion Vector CNNs

​ intro: CVPR 2016
​ project page: http://zbwglory.github.io/MV-CNN/index.html
​ paper: http://wanglimin.github.io/papers/ZhangWWQW_CVPR16.pdf
​ github: https://github.com/zbwglory/MV-release

  • Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

​ intro: ECCV 2016. HMDB51: 69.4%, UCF101: 94.2%
​ arxiv: http://arxiv.org/abs/1608.00859
​ paper: http://wanglimin.github.io/papers/WangXWQLTV_ECCV16.pdf
​ github: https://github.com/yjxiong/temporal-segment-networks

  • Temporal Segment Networks for Action Recognition in Videos

​ intro: An extension of submission http://arxiv.org/abs/1608.00859
​ arxiv: https://arxiv.org/abs/1705.02953

  • DeepCAMP: Deep Convolutional Action & Attribute Mid-Level Patterns

​ intro: CVPR 2016
​ arxiv: http://arxiv.org/abs/1608.03217

  • Depth2Action: Exploring Embedded Depth for Large-Scale Action Recognition

​ arxiv: http://arxiv.org/abs/1608.04339

  • Dynamic Image Networks for Action Recognition

 intro: CVPR 2016
 paper: http://users.cecs.anu.edu.au/~sgould/papers/cvpr16-dynamic_images.pdf
​ github: https://github.com/hbilen/dynamic-image-nets

  • Human Action Recognition without Human

​ arxiv: http://arxiv.org/abs/1608.07876

  • Temporal Convolutional Networks: A Unified Approach to Action Segmentation

​ arxiv: http://arxiv.org/abs/1608.08242
​ ECCV 2016 workshop: http://bravenewmotion.github.io/

  • Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks

​ intro: Bachelor Thesis Report at ETSETB TelecomBCN
​ project page: https://imatge-upc.github.io/activitynet-2016-cvprw/
​ arxiv: http://arxiv.org/abs/1608.08128
​ github: https://github.com/imatge-upc/activitynet-2016-cvprw

  • Sequential Deep Trajectory Descriptor for Action Recognition with Three-stream CNN

​ arxiv: http://arxiv.org/abs/1609.03056

  • Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions

​ arxiv: https://arxiv.org/abs/1610.03898

  • Spatiotemporal Residual Networks for Video Action Recognition

​ intro: NIPS 2016
​ arxiv: https://arxiv.org/abs/1611.02155

  • Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks

​ arxiv: https://arxiv.org/abs/1611.02447

  • Deep Recurrent Neural Network for Mobile Human Activity Recognition with High Throughput

​ arxiv: https://arxiv.org/abs/1611.03607

  • Joint Network based Attention for Action Recognition

​ arxiv: https://arxiv.org/abs/1611.05215

  • Temporal Convolutional Networks for Action Segmentation and Detection

​ arxiv: https://arxiv.org/abs/1611.05267

  • AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos

​ arxiv: https://arxiv.org/abs/1611.08240

  • ActionFlowNet: Learning Motion Representation for Action Recognition

​ arxiv: https://arxiv.org/abs/1612.03052

  • Higher-order Pooling of CNN Features via Kernel Linearization for Action Recognition

​ intro: Australian Center for Robotic Vision & Data61/CSIRO
​ arxiv: https://arxiv.org/abs/1701.05432

  • Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos

https://arxiv.org/abs/1703.10664

  • Temporal Action Detection with Structured Segment Networks

​ project page: http://yjxiong.me/others/ssn/
​ arxiv: https://arxiv.org/abs/1704.06228
​ github: https://github.com/yjxiong/action-detection

  • Recurrent Residual Learning for Action Recognition

https://arxiv.org/abs/1706.08807

  • Hierarchical Multi-scale Attention Networks for Action Recognition

https://arxiv.org/abs/1708.07590

  • Two-stream Flow-guided Convolutional Attention Networks for Action Recognition

 intro: International Conference on Computer Vision Workshops (ICCVW), 2017
​ arxiv: https://arxiv.org/abs/1708.09268

  • Action Classification and Highlighting in Videos

https://arxiv.org/abs/1708.09522

  • Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN

https://arxiv.org/abs/1710.03383

  • End-to-end Video-level Representation Learning for Action Recognition

​ keywords: Deep networks with Temporal Pyramid Pooling (DTPP)
​ arxiv: https://arxiv.org/abs/1711.04161

  • Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition

​ intro: WACV 2018
​ arxiv: https://arxiv.org/abs/1801.03983

  • DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks

https://arxiv.org/abs/1801.07230

  • A Fusion of Appearance based CNNs and Temporal evolution of Skeleton with LSTM for Daily Living Action Recognition

https://arxiv.org/abs/1802.00421

  • Real-Time End-to-End Action Detection with Two-Stream Networks

https://arxiv.org/abs/1802.08362

  • A Closer Look at Spatiotemporal Convolutions for Action Recognition

​ intro: CVPR 2018. Facebook Research
​ intro: R(2+1)D and Mixed-Convolutions for Action Recognition.
​ project page: https://dutran.github.io/R2Plus1D/
​ arxiv: https://arxiv.org/abs/1711.11248
​ github: https://github.com/facebookresearch/R2Plus1D

  • VideoCapsuleNet: A Simplified Network for Action Detection

https://arxiv.org/abs/1805.08162

  • Where and When to Look? Spatio-temporal Attention for Action Recognition in Videos

https://arxiv.org/abs/1810.04511

Projects

  • A Torch Library for Action Recognition and Detection Using CNNs and LSTMs

​ intro: CS231n student project report
​ paper: http://cs231n.stanford.edu/reports2016/221_Report.pdf
​ github: https://github.com/garythung/torch-lrcn

  • 2016 ActivityNet action recognition challenge. CNN + LSTM approach. Multi-threaded loading.

​ github: https://github.com/jrbtaylor/ActivityNet

  • LSTM for Human Activity Recognition

​ github: https://github.com/guillaume-chevalier/LSTM-Human-Activity-Recognition/
​ github(MXNet): https://github.com/Ldpe2G/DeepLearningForFun/tree/master/Mxnet-Scala/HumanActivityRecognition

  • Scanner: Efficient Video Analysis at Scale

​ intro: Locate and recognize faces in a video, Detect shots in a film, Search videos by image
​ github: https://github.com/scanner-research/scanner

  • Charades Starter Code for Activity Classification and Localization

​ intro: Activity Recognition Algorithms for the Charades Dataset
​ github: https://github.com/gsig/charades-algorithms

  • NonLocalNetwork and Squeeze-Excitation Network

​ intro: MXNet implementation of Non-Local and Squeeze-Excitation network
​ github: https://github.com/WillSuen/NonLocalandSEnet

Event Recognition

  • TagBook: A Semantic Video Representation without Supervision for Event Detection

​ arxiv: http://arxiv.org/abs/1510.02899

  • AENet: Learning Deep Audio Features for Video Analysis

​ arxiv: https://arxiv.org/abs/1701.00599
​ github: https://github.com/znaoya/aenet

Event Detection

  • DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting

​ paper: http://120.52.72.47/winsty.net/c3pr90ntcsf0/papers/devnet.pdf
​ paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Gan_DevNet_A_Deep_2015_CVPR_paper.pdf

  • Detecting Events and Key Actors in Multi-Person Videos

 intro: CVPR 2016
 arxiv: http://arxiv.org/abs/1511.02917
 paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Ramanathan_Detecting_Events_and_CVPR_2016_paper.pdf
​ paper: http://vision.stanford.edu/pdf/johnson2016cvpr.pdf
​ blog: http://www.leiphone.com/news/201606/l1TKIRFLO3DUFNNu.html

  • Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection

​ intro: INTERSPEECH 2016
​ arxiv: https://arxiv.org/abs/1604.07160

  • Efficient Action Detection in Untrimmed Videos via Multi-Task Learning

​ arxiv: https://arxiv.org/abs/1612.07403

  • Joint Event Detection and Description in Continuous Video Streams

​ intro: Joint Event Detection and Description Network (JEDDi-Net)
​ arxiv: https://arxiv.org/abs/1802.10250

Reposted from: https://blog.csdn.net/WJ_MeiMei/article/details/84344836

Posted 2019-11-25 17:03 by Geoffreygau