Classic Introductory Papers in Video Research
- 
Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images
 
    intro: CVPR 2016
    intro: Lead–Exceed Neural Network (LENN), LSTM
    paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/CVPR16_webly_final.pdf
- 
Video Fill in the Blank with Merging LSTMs
 
    intro: for Large Scale Movie Description and Understanding Challenge (LSMDC) 2016, "Movie fill-in-the-blank" Challenge, UCF_CRCV
    intro: Video-Fill-in-the-Blank (ViFitB)
    arxiv: https://arxiv.org/abs/1610.04062
- 
Video Pixel Networks
 
    intro: Google DeepMind
    arxiv: https://arxiv.org/abs/1610.00527
- 
Robust Video Synchronization using Unsupervised Deep Learning
 
 arxiv: https://arxiv.org/abs/1610.05985
- 
Video Propagation Networks
 
    intro: CVPR 2017. Max Planck Institute for Intelligent Systems & Bernstein Center for Computational Neuroscience
    project page: https://varunjampani.github.io/vpn/
    arxiv: https://arxiv.org/abs/1612.05478
    github(Caffe): https://github.com/varunjampani/video_prop_networks
- 
Video Frame Synthesis using Deep Voxel Flow
 
    project page: https://liuziwei7.github.io/projects/VoxelFlow.html
    arxiv: https://arxiv.org/abs/1702.02463
- 
Optimizing Deep CNN-Based Queries over Video Streams at Scale
 
    intro: Stanford InfoLab
    keywords: NoScope, difference detectors, specialized models
    arxiv: https://arxiv.org/abs/1703.02529
    github: https://github.com/stanford-futuredata/noscope
    github: https://github.com/stanford-futuredata/tensorflow-noscope
- 
NoScope: 1000x Faster Deep Learning Queries over Video
 
    blog: http://dawn.cs.stanford.edu/2017/06/22/noscope/
- 
Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos
 
    intro: CVPR 2017. Stanford University & University of Southern California
    arxiv: https://arxiv.org/abs/1703.02521
- 
ProcNets: Learning to Segment Procedures in Untrimmed and Unconstrained Videos
 
    arxiv: https://arxiv.org/abs/1703.09788
- 
Unsupervised Learning Layers for Video Analysis
 
    intro: Baidu Research
    intro: "The experiments demonstrated the potential applications of UL layers and online learning algorithm to head orientation estimation and moving object localization"
    arxiv: https://arxiv.org/abs/1705.08918
- 
Look, Listen and Learn
 
    intro: DeepMind
    intro: "Audio-Visual Correspondence" learning
    arxiv: https://arxiv.org/abs/1705.08168
- 
Video Imagination from a Single Image with Transformation Generation
 
    intro: Peking University
    arxiv: https://arxiv.org/abs/1706.04124
    github: https://github.com/gitpub327/VideoImagination
- 
Learning to Learn from Noisy Web Videos
 
    intro: CVPR 2017. Stanford University & CMU & Simon Fraser University
    arxiv: https://arxiv.org/abs/1706.02884
- 
Convolutional Long Short-Term Memory Networks for Recognizing First Person Interactions
 
    intro: Accepted at the Second International Workshop on Egocentric Perception, Interaction and Computing (EPIC) at the International Conference on Computer Vision (ICCV 2017)
    arxiv: https://arxiv.org/abs/1709.06495
- 
Learning Binary Residual Representations for Domain-specific Video Streaming
 
    intro: AAAI 2018
    project page: http://research.nvidia.com/publication/2018-02_Learning-Binary-Residual
    arxiv: https://arxiv.org/abs/1712.05087
- 
Video Representation Learning Using Discriminative Pooling
 
    intro: CVPR 2018
    arxiv: https://arxiv.org/abs/1803.10628
- 
Rethinking the Faster R-CNN Architecture for Temporal Action Localization
 
    intro: CVPR 2018
    arxiv: https://arxiv.org/abs/1804.07667
- 
Deep Keyframe Detection in Human Action Videos
 
    intro: two-stream ConvNet
    arxiv: https://arxiv.org/abs/1804.10021
- 
FFNet: Video Fast-Forwarding via Reinforcement Learning
 
    intro: CVPR 2018
    arxiv: https://arxiv.org/abs/1805.02792
- 
Fast forwarding Egocentric Videos by Listening and Watching
 
    arxiv: https://arxiv.org/abs/1806.04620
- 
Scanner: Efficient Video Analysis at Scale
 
    intro: CMU
    arxiv: https://arxiv.org/abs/1805.07339
- 
Massively Parallel Video Networks
 
    intro: DeepMind & University of Oxford
    arxiv: https://arxiv.org/abs/1806.03863
- 
Object Level Visual Reasoning in Videos
 
    intro: LIRIS & Facebook AI Research
    arxiv: https://arxiv.org/abs/1806.06157
- 
Video Time: Properties, Encoders and Evaluation
 
    intro: BMVC 2018
    arxiv: https://arxiv.org/abs/1807.06980
Video Classification
- 
Large-scale Video Classification with Convolutional Neural Networks
 
    intro: CVPR 2014
    project page: http://cs.stanford.edu/people/karpathy/deepvideo/
    paper: http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Karpathy_Large-scale_Video_Classification_2014_CVPR_paper.pdf
- 
Exploiting Image-trained CNN Architectures for Unconstrained Video Classification
 
    intro: Video-level event detection. Extracts deep features for each frame, then averages the frame-level features
    arxiv: http://arxiv.org/abs/1503.04144
- 
Beyond Short Snippets: Deep Networks for Video Classification
 
    intro: CNN + LSTM
    arxiv: http://arxiv.org/abs/1503.08909
    demo: http://pan.baidu.com/s/1eQ9zLZk
- 
Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification
 
    intro: ACM Multimedia, 2015
    arxiv: http://arxiv.org/abs/1504.01561
- 
Video Content Recognition with Deep Learning
 
    author: Zuxuan Wu, Fudan University
    slides: http://vision.ouc.edu.cn/valse/slides/20160420/Zuxuan Wu - Video Content Recognition with Deep Learning-Zuxuan Wu.pdf
- 
Video Content Recognition with Deep Learning
 
    author: Yu-Gang Jiang, Lab for Big Video Data Analytics (BigVid), Fudan University
    slides: http://www.yugangjiang.info/slides/DeepVideoTalk-2015.pdf
- 
Efficient Large Scale Video Classification
 
    intro: Google
    arxiv: http://arxiv.org/abs/1505.06250
- 
Fusing Multi-Stream Deep Networks for Video Classification
 
 arxiv: http://arxiv.org/abs/1509.06086
- 
Learning End-to-end Video Classification with Rank-Pooling
 
    paper: http://jmlr.org/proceedings/papers/v48/fernando16.html
    paper: http://jmlr.csail.mit.edu/proceedings/papers/v48/fernando16.pdf
    summary(by Hugo Larochelle): http://www.shortscience.org/paper?bibtexKey=conf/icml/FernandoG16#hlarochelle
- 
Deep Learning for Video Classification and Captioning
 
 arxiv: http://arxiv.org/abs/1609.06782
- 
Fast Video Classification via Adaptive Cascading of Deep Models
 
 arxiv: https://arxiv.org/abs/1611.06453
- 
Deep Feature Flow for Video Recognition
 
    intro: CVPR 2017
    intro: It provides a simple, fast, accurate, and end-to-end framework for video recognition (e.g., object detection and semantic segmentation in videos)
    arxiv: https://arxiv.org/abs/1611.07715
    github(official, MXNet): https://github.com/msracver/Deep-Feature-Flow
    youtube: https://www.youtube.com/watch?v=J0rMHE6ehGw
- 
Large-Scale YouTube-8M Video Understanding with Deep Neural Networks
 
    arxiv: https://arxiv.org/abs/1706.04488
- 
Deep Learning Methods for Efficient Large Scale Video Labeling
 
    intro: Solution to the Kaggle competition "Google Cloud & YouTube-8M Video Understanding Challenge"
    arxiv: https://arxiv.org/abs/1706.04572
    github: https://github.com/mpekalski/Y8M
- 
Learnable pooling with Context Gating for video classification
 
    intro: CVPR 2017 YouTube-8M Workshop. Kaggle 1st place
    arxiv: https://arxiv.org/abs/1706.06905
    github: https://github.com/antoine77340/LOUPE
- 
Aggregating Frame-level Features for Large-Scale Video Classification
 
    intro: YouTube-8M Challenge, 4th place
    arxiv: https://arxiv.org/abs/1707.00803
- 
Tensor-Train Recurrent Neural Networks for Video Classification
 
    arxiv: https://arxiv.org/abs/1707.01786
- 
Hierarchical Deep Recurrent Architecture for Video Understanding
 
    intro: Classification Challenge Track paper in CVPR 2017 Workshop on YouTube-8M Large-Scale Video Understanding
    arxiv: https://arxiv.org/abs/1707.03296
- 
Large-scale Video Classification guided by Batch Normalized LSTM Translator
 
    intro: CVPR 2017 Workshop on YouTube-8M Large-Scale Video Understanding
    arxiv: https://arxiv.org/abs/1707.04045
- 
UTS submission to Google YouTube-8M Challenge 2017
 
    intro: CVPR'17 Workshop on YouTube-8M
    arxiv: https://arxiv.org/abs/1707.04143
    github: https://github.com/ffmpbgrnn/yt8m
- 
A spatiotemporal model with visual attention for video classification
 
    arxiv: https://arxiv.org/abs/1707.02069
- 
Cultivating DNN Diversity for Large Scale Video Labelling
 
    intro: CVPR 2017 YouTube-8M Workshop
    arxiv: https://arxiv.org/abs/1707.04272
- 
Attention Transfer from Web Images for Video Recognition
 
    intro: ACM Multimedia, 2017
    arxiv: https://arxiv.org/abs/1708.00973
- 
Non-local Neural Networks
 
    intro: CVPR 2018. CMU & Facebook AI Research
    arxiv: https://arxiv.org/abs/1711.07971
    github(Caffe2): https://github.com/facebookresearch/video-nonlocal-net
- 
Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification
 
    arxiv: https://arxiv.org/abs/1711.08200
- 
Appearance-and-Relation Networks for Video Classification
 
    arxiv: https://arxiv.org/abs/1711.09125
    github: https://github.com/wanglimin/ARTNet
- 
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
 
    intro: ECCV 2018. Google Research & University of California San Diego
    arxiv: https://arxiv.org/abs/1712.04851
- 
Long Activity Video Understanding using Functional Object-Oriented Network
 
    arxiv: https://arxiv.org/abs/1807.00983
- 
Deep Architectures and Ensembles for Semantic Video Classification
 
    arxiv: https://arxiv.org/abs/1807.01026
- 
Deep Discriminative Model for Video Classification
 
    intro: ECCV 2018
    arxiv: https://arxiv.org/abs/1807.08259
- 
Deep Video Color Propagation
 
    intro: BMVC 2018
    arxiv: https://arxiv.org/abs/1808.03232
- 
Non-local NetVLAD Encoding for Video Classification
 
    intro: ECCV 2018 workshop on YouTube-8M Large-Scale Video Understanding
    intro: Tencent AI Lab & Fudan University
    arxiv: https://arxiv.org/abs/1810.00207
- 
Learnable Pooling Methods for Video Classification
 
    intro: YouTube-8M ECCV 2018 Workshop
    arxiv: https://arxiv.org/abs/1810.00530
    github: https://github.com/pomonam/LearnablePoolingMethods
- 
NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification
 
    intro: ECCV 2018 workshop
    arxiv: https://arxiv.org/abs/1811.05014
    github: https://github.com/linrongc/youtube-8m
Video Action Recognition / Action Detection
- 
3D Convolutional Neural Networks for Human Action Recognition
 
 paper: http://www.cs.odu.edu/~sji/papers/pdf/Ji_ICML10.pdf
- 
Sequential Deep Learning for Human Action Recognition
 
 paper: http://liris.cnrs.fr/Documents/Liris-5228.pdf
- 
Two-Stream Convolutional Networks for Action Recognition in Videos
 
 arxiv: http://arxiv.org/abs/1406.2199
- 
Finding Action Tubes
 
    intro: "built action models from shape and motion cues. They start from the image proposals and select the motion salient subset of them and extract spatio-temporal features to represent the video using the CNNs."
    arxiv: http://arxiv.org/abs/1411.6031
- 
Hierarchical Recurrent Neural Network for Skeleton Based Action Recognition
 
- 
Action Recognition with Trajectory-Pooled Deep-Convolutional Descriptors
 
    intro: CVPR 2015. TDD
    paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wang_Action_Recognition_With_2015_CVPR_paper.pdf
    ext: http://www.cv-foundation.org/openaccess/content_cvpr_2015/app/2B_105_ext.pdf
    poster: https://wanglimin.github.io/papers/WangQT_CVPR15_Poster.pdf
    github: https://github.com/wanglimin/TDD
- 
Action Recognition by Hierarchical Mid-level Action Elements
 
 paper: http://cvgl.stanford.edu/papers/tian2015.pdf
- 
Contextual Action Recognition with R*CNN
 
    arxiv: http://arxiv.org/abs/1505.01197
    github: https://github.com/gkioxari/RstarCNN
- 
Towards Good Practices for Very Deep Two-Stream ConvNets
 
    arxiv: http://arxiv.org/abs/1507.02159
    github: https://github.com/yjxiong/caffe
- 
Action Recognition using Visual Attention
 
    intro: LSTM / RNN
    arxiv: http://arxiv.org/abs/1511.04119
    project page: http://shikharsharma.com/projects/action-recognition-attention/
    github(Python/Theano): https://github.com/kracwarlock/action-recognition-visual-attention
- 
End-to-end Learning of Action Detection from Frame Glimpses in Videos
 
    intro: CVPR 2016
    project page: http://ai.stanford.edu/~syyeung/frameglimpses.html
    arxiv: http://arxiv.org/abs/1511.06984
    paper: http://vision.stanford.edu/pdf/yeung2016cvpr.pdf
- 
Multi-velocity neural networks for gesture recognition in videos
 
 arxiv: http://arxiv.org/abs/1603.06829
- 
Active Learning for Online Recognition of Human Activities from Streaming Videos
 
    arxiv: http://arxiv.org/abs/1604.02855
- 
Convolutional Two-Stream Network Fusion for Video Action Recognition
 
    arxiv: http://arxiv.org/abs/1604.06573
    github: https://github.com/feichtenhofer/twostreamfusion
- 
Deep, Convolutional, and Recurrent Models for Human Activity Recognition using Wearables
 
 arxiv: http://arxiv.org/abs/1604.08880
- 
Unsupervised Semantic Action Discovery from Video Collections
 
 arxiv: http://arxiv.org/abs/1605.03324
- 
Anticipating Visual Representations from Unlabeled Video
 
 paper: http://web.mit.edu/vondrick/prediction.pdf
- 
VideoLSTM Convolves, Attends and Flows for Action Recognition
 
    arxiv: http://arxiv.org/abs/1607.01794
- 
Hierarchical Attention Network for Action Recognition in Videos (HAN)
 
 arxiv: http://arxiv.org/abs/1607.06416
- 
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition
 
 arxiv: http://arxiv.org/abs/1607.07043
- 
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
 
 arxiv: http://arxiv.org/abs/1607.08584
- 
CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016
 
    intro: Won 1st place in the untrimmed video classification task of ActivityNet Challenge 2016. TSN
    arxiv: http://arxiv.org/abs/1608.00797
    github: https://github.com/yjxiong/anet2016-cuhk
- 
Actionness Estimation Using Hybrid FCNs
 
    intro: CVPR 2016. H-FCN
    project page: http://wanglimin.github.io/actionness_hfcn/index.html
    paper: http://wanglimin.github.io/papers/WangQTV_CVPR16.pdf
    github: https://github.com/wanglimin/actionness-estimation/
- 
Real-time Action Recognition with Enhanced Motion Vector CNNs
 
    intro: CVPR 2016
    project page: http://zbwglory.github.io/MV-CNN/index.html
    paper: http://wanglimin.github.io/papers/ZhangWWQW_CVPR16.pdf
    github: https://github.com/zbwglory/MV-release
- 
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
 
    intro: ECCV 2016. HMDB51: 69.4%, UCF101: 94.2%
    arxiv: http://arxiv.org/abs/1608.00859
    paper: http://wanglimin.github.io/papers/WangXWQLTV_ECCV16.pdf
    github: https://github.com/yjxiong/temporal-segment-networks
- 
Temporal Segment Networks for Action Recognition in Videos
 
    intro: An extension of the submission at http://arxiv.org/abs/1608.00859
    arxiv: https://arxiv.org/abs/1705.02953
- 
DeepCAMP: Deep Convolutional Action & Attribute Mid-Level Patterns
 
    intro: CVPR 2016
    arxiv: http://arxiv.org/abs/1608.03217
- 
Depth2Action: Exploring Embedded Depth for Large-Scale Action Recognition
 
 arxiv: http://arxiv.org/abs/1608.04339
- 
Dynamic Image Networks for Action Recognition
 
    intro: CVPR 2016
    paper: http://users.cecs.anu.edu.au/~sgould/papers/cvpr16-dynamic_images.pdf
    github: https://github.com/hbilen/dynamic-image-nets
- 
Human Action Recognition without Human
 
 arxiv: http://arxiv.org/abs/1608.07876
- 
Temporal Convolutional Networks: A Unified Approach to Action Segmentation
 
    arxiv: http://arxiv.org/abs/1608.08242
    ECCV 2016 workshop: http://bravenewmotion.github.io/
- 
Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks
 
    intro: Bachelor Thesis Report at ETSETB TelecomBCN
    project page: https://imatge-upc.github.io/activitynet-2016-cvprw/
    arxiv: http://arxiv.org/abs/1608.08128
    github: https://github.com/imatge-upc/activitynet-2016-cvprw
- 
Sequential Deep Trajectory Descriptor for Action Recognition with Three-stream CNN
 
 arxiv: http://arxiv.org/abs/1609.03056
- 
Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions
 
 arxiv: https://arxiv.org/abs/1610.03898
- 
Spatiotemporal Residual Networks for Video Action Recognition
 
    intro: NIPS 2016
    arxiv: https://arxiv.org/abs/1611.02155
- 
Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks
 
 arxiv: https://arxiv.org/abs/1611.02447
- 
Deep Recurrent Neural Network for Mobile Human Activity Recognition with High Throughput
 
 arxiv: https://arxiv.org/abs/1611.03607
- 
Joint Network based Attention for Action Recognition
 
 arxiv: https://arxiv.org/abs/1611.05215
- 
Temporal Convolutional Networks for Action Segmentation and Detection
 
 arxiv: https://arxiv.org/abs/1611.05267
- 
AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos
 
 arxiv: https://arxiv.org/abs/1611.08240
- 
ActionFlowNet: Learning Motion Representation for Action Recognition
 
 arxiv: https://arxiv.org/abs/1612.03052
- 
Higher-order Pooling of CNN Features via Kernel Linearization for Action Recognition
 
    intro: Australian Center for Robotic Vision & Data61/CSIRO
    arxiv: https://arxiv.org/abs/1701.05432
- 
Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos
 
    arxiv: https://arxiv.org/abs/1703.10664
- 
Temporal Action Detection with Structured Segment Networks
 
    project page: http://yjxiong.me/others/ssn/
    arxiv: https://arxiv.org/abs/1704.06228
    github: https://github.com/yjxiong/action-detection
- 
Recurrent Residual Learning for Action Recognition
 
    arxiv: https://arxiv.org/abs/1706.08807
- 
Hierarchical Multi-scale Attention Networks for Action Recognition
 
    arxiv: https://arxiv.org/abs/1708.07590
- 
Two-stream Flow-guided Convolutional Attention Networks for Action Recognition
 
    intro: International Conference on Computer Vision Workshops (ICCVW), 2017
    arxiv: https://arxiv.org/abs/1708.09268
- 
Action Classification and Highlighting in Videos
 
    arxiv: https://arxiv.org/abs/1708.09522
- 
Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN
 
    arxiv: https://arxiv.org/abs/1710.03383
- 
End-to-end Video-level Representation Learning for Action Recognition
 
    keywords: Deep networks with Temporal Pyramid Pooling (DTPP)
    arxiv: https://arxiv.org/abs/1711.04161
- 
Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition
 
    intro: WACV 2018
    arxiv: https://arxiv.org/abs/1801.03983
- 
DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks
 
    arxiv: https://arxiv.org/abs/1801.07230
- 
A Fusion of Appearance based CNNs and Temporal evolution of Skeleton with LSTM for Daily Living Action Recognition
 
    arxiv: https://arxiv.org/abs/1802.00421
- 
Real-Time End-to-End Action Detection with Two-Stream Networks
 
    arxiv: https://arxiv.org/abs/1802.08362
- 
A Closer Look at Spatiotemporal Convolutions for Action Recognition
 
    intro: CVPR 2018. Facebook Research
    intro: R(2+1)D and Mixed-Convolutions for Action Recognition.
    project page: https://dutran.github.io/R2Plus1D/
    arxiv: https://arxiv.org/abs/1711.11248
    github: https://github.com/facebookresearch/R2Plus1D
- 
VideoCapsuleNet: A Simplified Network for Action Detection
 
    arxiv: https://arxiv.org/abs/1805.08162
- 
Where and When to Look? Spatio-temporal Attention for Action Recognition in Videos
 
    arxiv: https://arxiv.org/abs/1810.04511
Projects
- 
A Torch Library for Action Recognition and Detection Using CNNs and LSTMs
 
    intro: CS231n student project report
    paper: http://cs231n.stanford.edu/reports2016/221_Report.pdf
    github: https://github.com/garythung/torch-lrcn
- 
2016 ActivityNet action recognition challenge. CNN + LSTM approach. Multi-threaded loading.
 
 github: https://github.com/jrbtaylor/ActivityNet
- 
LSTM for Human Activity Recognition
 
    github: https://github.com/guillaume-chevalier/LSTM-Human-Activity-Recognition/
    github(MXNet): https://github.com/Ldpe2G/DeepLearningForFun/tree/master/Mxnet-Scala/HumanActivityRecognition
- 
Scanner: Efficient Video Analysis at Scale
 
    intro: Locate and recognize faces in a video, Detect shots in a film, Search videos by image
    github: https://github.com/scanner-research/scanner
- 
Charades Starter Code for Activity Classification and Localization
 
    intro: Activity Recognition Algorithms for the Charades Dataset
    github: https://github.com/gsig/charades-algorithms
- 
Non-Local Network and Squeeze-Excitation Network
 
    intro: MXNet implementation of Non-Local and Squeeze-Excitation network
    github: https://github.com/WillSuen/NonLocalandSEnet
Event Recognition
- 
TagBook: A Semantic Video Representation without Supervision for Event Detection
 
 arxiv: http://arxiv.org/abs/1510.02899
- 
AENet: Learning Deep Audio Features for Video Analysis
 
    arxiv: https://arxiv.org/abs/1701.00599
    github: https://github.com/znaoya/aenet
Event Detection
- 
DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting
 
    paper: http://120.52.72.47/winsty.net/c3pr90ntcsf0/papers/devnet.pdf
    paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Gan_DevNet_A_Deep_2015_CVPR_paper.pdf
- 
Detecting events and key actors in multi-person videos
 
    intro: CVPR 2016
    arxiv: http://arxiv.org/abs/1511.02917
    paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Ramanathan_Detecting_Events_and_CVPR_2016_paper.pdf
    paper: http://vision.stanford.edu/pdf/johnson2016cvpr.pdf
    blog: http://www.leiphone.com/news/201606/l1TKIRFLO3DUFNNu.html
- 
Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection
 
    intro: INTERSPEECH 2016
    arxiv: https://arxiv.org/abs/1604.07160
- 
Efficient Action Detection in Untrimmed Videos via Multi-Task Learning
 
 arxiv: https://arxiv.org/abs/1612.07403
- 
Joint Event Detection and Description in Continuous Video Streams
 
    intro: Joint Event Detection and Description Network (JEDDi-Net)
    arxiv: https://arxiv.org/abs/1802.10250
Reposted from: https://blog.csdn.net/WJ_MeiMei/article/details/84344836

                
            
        