Query-aware sparse coding for web multi-video summarization
Information Sciences 478 (2019) 152-166
Contents lists available at ScienceDirect

Zhong Ji (a), Yaru Ma (a), Yanwei Pang (a), Xuelong Li (b)
(a) School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
(b) Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, China

Article history: Received 28 March 2018; Revised 20 September 2018; Accepted 23 September 2018; Available online 8 November 2018.

Keywords: Video summarization; Sparse coding; Query-aware; Multi-video

Abstract

Given the explosive growth of online videos, it is becoming increasingly important to relieve the tedious work of browsing and managing the video content of interest. Video summarization aims to provide such a technique by transforming one or multiple videos into a compact one. However, conventional multi-video summarization methods often fail to produce satisfying results because they ignore the user's search intent. To this end, this paper proposes a novel query-aware approach that formulates multi-video summarization in a sparse coding framework, where the web images searched by a query are taken as important preference information to reveal the query intent. To provide a user-friendly summarization, this paper also develops an event-keyframe presentation structure to present keyframes in groups of specific events related to the query, using an unsupervised multi-graph fusion method. Moreover, we release a new public dataset named MVS1K, which contains about 1000 videos from 10 queries and their video tags, manual annotations, and associated web images. Extensive experiments on the MVS1K and TVSum datasets demonstrate that our approaches produce competitive objective and subjective results.

© 2018 Published by Elsevier Inc.

1. Introduction

The rapid growth of video data has steadily occupied the vast majority of network traffic. For example, YouTube, one of the primary online video sharing websites, serves over 300 h of video uploads per minute as of April 2018. This massive amount of video has increased the demand for efficient ways to browse and manage desired video content [17,24,29,30,37]. However, given an event query, search engines usually return thousands or even more videos, which
are quite noisy, redundant, and even irrelevant. This makes it difficult for users to grasp the focus of the whole event, forcing them to spend considerable time and effort exploring the main content of the returned videos. Multi-Video Summarization (MVS) is an effective way to tackle this problem: it extracts the essential information of multi-video frames as keyframes to produce a condensed and informative version. In other words, its goal is to generate a single summary that describes a large number of retrieved videos, empowering users to quickly browse and comprehend
a large amount of video content.

One key challenge of MVS is to accurately capture the user's search intent, that is, to generate query-aware summarization. Consequently, a surge of effort has been devoted to this thread. These efforts can be divided into three categories: searching-based approaches [1,13,45], learning-based approaches [16,29,30,37], and fusion-based approaches [14,20,34].

∗ Corresponding author. E-mail addresses: (Z. Ji), (Y. Ma), (Y. Pang), xuelong_ (X. Li).
https://doi.org/10.1016/j.ins.2018.09.050
0020-0255/© 2018 Published by Elsevier Inc.

Fig. 1. The MVS pipeline of the proposed QUASC and MGF approaches.

Specifically, the searching-based approach prefers to select, as the keyframes of the summarization, those video frames with high similarity to the searched web images [1,13,45]. The idea behind it is that the web images returned by a search engine generally reflect the search intent for a specific query, so the generated MVS is query-aware. However, this type of approach tends to produce redundant keyframes in a summarization, since there are always some frames having
high similarity in multiple videos. The learning-based approach selects the keyframes by building a learning model [16,29,30,37]. For example, Besiris et al. [2] apply a multiple-instance learning model to localize tags into video shots and select the query-aware keyframes in accordance with those tags. It achieves satisfactory performance on a query-video dataset. However, such N-way discrete classifiers are hard to scale beyond a limited number of discrete query categories [20]. Recently, there has been considerable interest in fusing the ideas of the above two types of approaches to overcome
their respective drawbacks. Some pioneering fusion-based approaches formulate the MVS problem with a graph model [14], a concept learning model [34], and a multi-task learning model [20], respectively.

On the other hand, the sparse coding technique is effective and widely used in Single Video Summarization (SVS) [6,21]. It formulates the keyframe selection problem as a coefficient selection one, which guarantees the general properties of SVS, such as conciseness and representativeness. However, directly applying sparse coding to MVS is inappropriate, because multiple videos contain plenty of content that is irrelevant or only weakly relevant to the query; the resulting summarization would then contain noisy or unimportant keyframes, which weakens its conciseness and representativeness. A natural idea is to take advantage of the searched web images to emphasize the important content in the sparse coding
framework. However, this is still an unsolved and challenging problem.

To deal with this challenge, we present a QUery-Aware Sparse Coding (QUASC) method that generates the query-dependent MVS by fusing the ideas of the sparse coding technique and the searching-based MVS approach. Moreover, to present the summarization in a friendly manner, we also develop a novel Event-Keyframe Presentation (EKP) structure with a novel Multi-Graph Fusion (MGF) approach, presenting keyframes in groups of specific events related to the query. The MVS framework of the proposed QUASC and MGF approaches is illustrated in Fig. 1. It is
worthwhile to highlight several aspects of the proposed methods:

(1) A novel QUery-Aware Sparse Coding (QUASC) method for multi-video summarization is proposed. It formulates multi-video summarization in a sparse coding framework, where the web images searched by the query are taken as important preference information to reveal the query intent.

(2) A user-friendly summarization representation structure is developed, which presents the keyframes in groups of specific events related to the query.

(3) A new public dataset named MVS1K is released. It contains about 1000 videos from 10 queries, together with their video tags, manual annotations, and associated web images. To the best of our knowledge, it is the largest public multi-video summarization dataset. Both our data and code will be made available.
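The sparse coding view above treats keyframe selection as coefficient selection: frames whose coefficient rows carry energy in reconstructing all frames become keyframes, and query awareness can be injected by penalizing each frame's coefficients in inverse proportion to its similarity with the searched web images. The sketch below is not the paper's QUASC objective; it is a minimal toy illustration (the function name, feature vectors, and parameter values are all made up) that solves a per-frame weighted lasso with plain iterative soft-thresholding (ISTA), dividing each frame's l1 penalty by its web-image similarity so that query-relevant frames are cheaper to keep.

```python
# Toy sketch of query-aware sparse coding for keyframe selection.
# NOT the paper's exact QUASC formulation: a simplified weighted lasso
# solved with ISTA. All features and parameters are made-up toy data.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def soft(x, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def query_aware_keyframes(frames, web_images, lam=0.1, eta=0.05, steps=500):
    """Reconstruct every frame as a sparse combination of the frames
    themselves; the l1 penalty on frame i's coefficients is divided by
    its similarity to the searched web images, so query-relevant frames
    survive the shrinkage. Frames whose coefficient rows retain most of
    the energy are returned as keyframes."""
    n, d = len(frames), len(frames[0])
    # Query relevance of each frame: best match against any web image.
    w = [max(dot(f, q) for q in web_images) + 1e-3 for f in frames]
    # C[i][j] = weight of frame i in the reconstruction of frame j.
    C = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        for j in range(n):  # one ISTA step per target frame j
            recon = [sum(C[i][j] * frames[i][k] for i in range(n))
                     for k in range(d)]
            resid = [recon[k] - frames[j][k] for k in range(d)]
            for i in range(n):  # gradient step + query-weighted shrinkage
                g = dot(frames[i], resid)
                C[i][j] = soft(C[i][j] - eta * g, eta * lam / w[i])
    score = [sum(abs(c) for c in row) for row in C]
    return [i for i, s in enumerate(score) if s > 0.5 * max(score)]

# Toy data: frames 0 and 1 match the (single) searched web image,
# frames 2 and 3 are off-query content and noise.
frames = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
web_images = [[1, 0, 0]]
print(query_aware_keyframes(frames, web_images))  # → [0, 1]
```

A real system would use high-dimensional deep features and a jointly regularized objective over all videos; the point here is only how the per-frame weight w turns plain sparse coding into a query-aware selector.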