Similar Documents (20 results)
1.
2.
Pornographic video detection based on multimodal fusion is an effective approach to filtering pornography. However, existing methods lack an accurate representation of audio semantics and pay little attention to the characteristics of pornographic audio. In this paper, we propose a novel framework that fuses an audio vocabulary with visual features for pornographic video detection. The novelty of our approach lies in three aspects: an audio semantics representation method based on energy envelope units (EEUs) and bag-of-words (BoW), a periodicity-based audio segmentation algorithm, and a periodicity-based video decision algorithm. The first, the EEU+BoW representation method, describes audio semantics via an audio vocabulary constructed by k-means clustering of EEUs. The latter two complement each other to make full use of the periodicities in pornographic audio. Using the periodicity-based audio segmentation algorithm, audio streams are divided into EEU sequences; after these EEUs are classified, videos are judged pornographic or not by the periodicity-based video decision algorithm. Before fusion, two support vector machines are applied to the audio-vocabulary-based and visual-features-based methods, respectively. To fuse their results, a keyframe is selected from each EEU according to its beginning and ending positions, and an integrated weighting scheme together with the periodicity-based video decision algorithm yields the final detection results. Experimental results show that our approach outperforms the traditional approach based only on visual features and achieves satisfactory performance: the true positive rate reaches 94.44% at a false positive rate of 9.76%.
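To make the vocabulary construction concrete, here is a minimal sketch of the EEU bag-of-words step, assuming each EEU has already been segmented and described by a fixed-length feature vector (the paper's exact EEU features are not reproduced; `build_audio_vocabulary` and `bow_histogram` are illustrative names):

```python
import numpy as np
from sklearn.cluster import KMeans

def build_audio_vocabulary(eeu_features, vocab_size=64, seed=0):
    """Cluster EEU feature vectors (n_eeus, n_dims) into an audio
    vocabulary of codewords via k-means, as the abstract describes."""
    return KMeans(n_clusters=vocab_size, n_init=10, random_state=seed).fit(eeu_features)

def bow_histogram(vocab, eeu_features):
    """Quantize one video's EEUs against the vocabulary and return a
    normalized bag-of-words histogram, usable as SVM input."""
    words = vocab.predict(eeu_features)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```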

3.
4.
5.
Existing audio-visual eye-fixation detection algorithms use a two-stream structure to extract audio and visual features separately and then fuse them into the final prediction map. However, the audio and visual information in a dataset may be unrelated, so directly fusing audio and visual features when they are inconsistent lets the audio information negatively affect the visual features. To address this problem, this paper proposes an eye-fixation detection network based on audio-visual consistency (Audio-visual Consistency Network, AVCN). To verify its reliability, AVCN is added on top of an existing audio-visual fixation detection model: it makes a binary judgment on the consistency of the extracted audio and video features, outputting the fused audio-visual features as the final prediction when the two agree, and the visually dominated features otherwise. Experiments on six open datasets show that adding AVCN improves the overall metrics.
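A minimal sketch of the consistency-gated decision, assuming precomputed audio and visual features projected to a common dimension; the cosine-similarity threshold below is a simple stand-in for AVCN's learned binary judgment:

```python
import torch
import torch.nn.functional as F

def consistency_gated_output(audio_feat, visual_feat, fused_feat, threshold=0.5):
    """When audio and visual features agree (cosine similarity above the
    threshold), output the fused features; otherwise fall back to the
    visually dominated features, mirroring the gating described above.
    fused_feat and visual_feat are assumed to share the same shape."""
    a = F.normalize(audio_feat.flatten(1), dim=1)
    v = F.normalize(visual_feat.flatten(1), dim=1)
    consistent = (a * v).sum(dim=1) > threshold          # per-sample judgment
    mask = consistent.view(-1, *([1] * (fused_feat.dim() - 1))).float()
    return mask * fused_feat + (1.0 - mask) * visual_feat
```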

6.
Audio-Visual Event Recognition in Surveillance Video Sequences
In the automated surveillance field, automatic scene analysis and understanding systems typically consider only visual information, while other modalities, such as audio, are disregarded. This paper presents a new method that integrates audio and visual information for scene analysis in a typical surveillance scenario, using only one camera and one monaural microphone. Visual information is analyzed by a standard visual background/foreground (BG/FG) modelling module, enhanced with a novelty detection stage and coupled with an audio BG/FG modelling scheme. These processes detect separate audio and visual patterns representing unusual unimodal events in a scene. Audio and visual data are then integrated by exploiting the synchrony between such events. The audio-visual (AV) association is carried out online, without the need for training sequences, and is based on the computation of a characteristic feature called the audio-video concurrence matrix, which allows AV events to be detected, segmented, and discriminated between. Experimental tests involving classification and clustering of events demonstrate the potential of the proposed approach, also in comparison with results obtained from the single modalities and without considering synchrony.
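The concurrence idea can be sketched as a simple synchrony score between unimodal event tracks; binary per-frame activity matrices are assumed here, and the paper's actual matrix construction is more involved:

```python
import numpy as np

def concurrence_matrix(audio_events, video_events):
    """audio_events: (Na, T) and video_events: (Nv, T) binary activity
    matrices. Entry (i, j) is the fraction of co-active frames over the
    union of active frames, a Jaccard-style synchrony score in [0, 1]."""
    A = np.asarray(audio_events, dtype=float)
    V = np.asarray(video_events, dtype=float)
    inter = A @ V.T                                   # co-active frame counts
    union = A.sum(1, keepdims=True) + V.sum(1) - inter
    return inter / np.maximum(union, 1.0)
```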

7.
Multimedia event detection (MED) is a challenging problem because of the heterogeneous content and variable quality found in large collections of Internet videos. To study the value of multimedia features and fusion for representing and learning events from a set of example video clips, we created SESAME, a system for video SEarch with Speed and Accuracy for Multimedia Events. SESAME includes multiple bag-of-words event classifiers based on single data types: low-level visual, motion, and audio features; high-level semantic visual concepts; and automatic speech recognition. Event detection performance was evaluated for each event classifier. The performance of low-level visual and motion features was improved by the use of difference coding. The accuracy of the visual concepts was nearly as strong as that of the low-level visual features. Experiments with a number of fusion methods for combining the event detection scores from these classifiers revealed that simple fusion methods, such as arithmetic mean, perform as well as or better than other, more complex fusion methods. SESAME’s performance in the 2012 TRECVID MED evaluation was one of the best reported.
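The fusion finding is easy to illustrate: arithmetic-mean late fusion over per-classifier scores, with a per-classifier z-scoring step added here on the assumption that scores must first be brought to a common range:

```python
import numpy as np

def mean_fusion(score_matrix):
    """score_matrix: (n_classifiers, n_videos) raw detection scores.
    Z-score each classifier's outputs, then average across classifiers."""
    s = np.asarray(score_matrix, dtype=float)
    s = (s - s.mean(axis=1, keepdims=True)) / (s.std(axis=1, keepdims=True) + 1e-8)
    return s.mean(axis=0)
```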

8.
Toward semantic indexing and retrieval using hierarchical audio models
Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models the statistical characteristics of audio events over a time series to accomplish semantic context detection. Two stages, audio event and semantic context modeling, are devised to bridge the semantic gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events in action movies: gunshot, explosion, engine, and car braking. At the semantic-context level, Gaussian mixture models (GMMs) and ergodic HMMs are investigated to fuse the characteristics of and correlations between the audio events, providing cues for detecting gunplay and car-chase scenes, the two semantic contexts we focus on. Promising experimental results demonstrate the effectiveness of the proposed approach and show that the framework provides a foundation for semantic indexing and retrieval. Moreover, the two fusion schemes are compared, and the relations between audio events and semantic contexts are studied.
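A hedged sketch of the two-stage pipeline, using hmmlearn as a stand-in for the paper's models: one HMM per audio event scores feature sequences, and the resulting event-likelihood vectors would feed a context-level GMM or ergodic HMM (whose training is omitted here):

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

EVENTS = ["gunshot", "explosion", "engine", "car_braking"]

def train_event_hmms(train_seqs, n_states=3):
    """train_seqs: {event_name: list of (T_i, D) feature sequences}.
    Fits one GaussianHMM per representative audio event."""
    hmms = {}
    for name in EVENTS:
        X = np.vstack(train_seqs[name])
        lengths = [len(seq) for seq in train_seqs[name]]
        hmms[name] = GaussianHMM(n_components=n_states).fit(X, lengths)
    return hmms

def event_likelihoods(hmms, seq):
    """Per-event log-likelihoods for one audio segment; a context-level
    model over these vectors detects e.g. gunplay or car-chase scenes."""
    return np.array([hmms[name].score(seq) for name in EVENTS])
```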

9.
Multi-modal emotion recognition lacks an explicit mapping between emotion states and audio/image features, so extracting effective emotion information from audio-visual data remains a challenging issue. In addition, noise and data redundancy are not modeled well, so emotion recognition models often suffer from low efficiency. Deep neural networks (DNNs) excel at feature extraction and highly non-linear feature fusion, and cross-modal noise modeling has great potential for addressing data pollution and data redundancy. Inspired by this, our paper proposes a deep weighted fusion method for audio-visual emotion recognition. First, we perform cross-modal noise modeling on the audio and video data, which eliminates most of the data pollution in the audio channel and the data redundancy in the visual channel: the noise modeling is implemented by voice activity detection (VAD), and the visual redundancy is removed by aligning the speech regions in the audio and visual data. We then extract audio emotion features and visual expression features with two feature extractors. The audio extractor, audio-net, is a 2D CNN that accepts Mel-spectrogram images as input; the facial expression extractor, visual-net, is a 3D CNN fed with facial expression image sequences. To train the two convolutional neural networks efficiently on a small dataset, we adopt transfer learning. Next, we employ a deep belief network (DBN) for highly non-linear fusion of the multi-modal emotion features, training the feature extractors and the fusion network synchronously. Finally, emotion classification is performed by a support vector machine on the output of the fusion network. By jointly considering cross-modal feature fusion, denoising, and redundancy removal, our fusion method shows excellent performance on the selected dataset.
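The VAD-driven denoising step can be sketched with a simple frame-energy threshold; this is an illustrative stand-in, not the paper's actual VAD, and the returned spans would be used to keep only the speech regions in both the audio and the matching video frames:

```python
import numpy as np

def energy_vad(signal, sr, frame_ms=25, hop_ms=10, thresh_db=-35.0):
    """Return (start_sample, end_sample) spans where frame energy exceeds
    the threshold, i.e. a crude voice-activity detector."""
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    spans, active, start = [], False, 0
    for i in range(0, len(signal) - frame, hop):
        e = 10 * np.log10(np.mean(signal[i:i + frame] ** 2) + 1e-12)
        if e > thresh_db and not active:
            active, start = True, i
        elif e <= thresh_db and active:
            active = False
            spans.append((start, i + frame))
    if active:
        spans.append((start, len(signal)))
    return spans
```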

10.
This paper proposes a new two-phase approach to robust text detection that integrates visual appearance with geometric reasoning rules. In the first phase, geometric rules are used to achieve a high recall rate. Specifically, a robust stroke width transform (RSWT) feature is proposed to better recover the stroke width by additionally considering the crossing of two strokes and the continuity of letter borders. In the second phase, a classification scheme based on visual appearance features rejects false alarms while maintaining the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning (DS-MKL) is proposed. DS-MKL is motivated by a novel kernel-margin perspective on multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach over state-of-the-art approaches, with a performance gain of up to 4.4% in terms of F-measure.
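The kernel-combination idea behind MKL methods can be sketched as a weighted sum of precomputed Gram matrices fed to a kernel SVM; the simplex-normalized weights below are a generic stand-in, not the DS-MKL optimization itself, and `K_color`, `K_texture`, `K_hog` are hypothetical base kernels:

```python
import numpy as np
from sklearn.svm import SVC

def combine_kernels(kernels, weights):
    """Weighted sum of precomputed Gram matrices, one per feature type,
    with weights normalized onto the simplex as in standard MKL."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))

# Usage sketch with a precomputed combined kernel:
# K = combine_kernels([K_color, K_texture, K_hog], [0.2, 0.3, 0.5])
# clf = SVC(kernel="precomputed").fit(K, labels)
```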

11.
Objective: Visual saliency plays an important role in many vision-driven applications, and these application areas are shifting from 2D to 3D vision, so saliency models based on RGB-D data have attracted wide attention. Unlike saliency in 2D images, RGB-D saliency involves cues from multiple modalities. Complementary and competing relationships exist among these cues, and how to exploit and fuse them effectively remains a challenge; traditional fusion models struggle to take full advantage of them. This work therefore studies multimodal cue fusion in the formation of RGB-D saliency. Method: We propose an RGB-D saliency detection model based on a superpixel-level conditional random field (CRF). Saliency cues of different modalities are extracted, including planar, depth, and motion cues. A CRF is built over superpixels, and a global energy function combining the influence of the multimodal cues with a smoothness constraint on neighbouring saliency values is designed as the optimization objective, characterizing the interaction among the cues. The weighting factors of the cues in the energy function are learned by a convolutional neural network. Results: Experiments on two public RGB-D video saliency datasets compare the model with six saliency detection methods; the proposed model outperforms the state of the art on all datasets and metrics. Relative to the second-best results, its AUC (area under curve), sAUC (shuffled AUC), SIM (similarity), PCC (Pearson correlation coefficient), and NSS (normalized scanpath saliency) scores improve by 2.3%, 2.3%, 18.9%, 21.6%, and 56.2% on the IRCCyN dataset, and by 2.0%, 1.4%, 29.1%, 10.6%, and 23.3% on the DML-iTrack-3D dataset. An internal comparison further verifies that the proposed fusion method outperforms traditional fusion methods. Conclusion: The CRF and CNN in the proposed model exploit the strengths of the different modal cues and fuse them effectively, improving saliency detection performance and benefiting vision-driven applications.
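A toy version of the global energy function described above, with unary terms tying each superpixel's saliency to weighted multimodal cues and a pairwise term smoothing neighbouring superpixels; the cue weights, learned by a CNN in the paper, are plain constants here:

```python
import numpy as np

def crf_energy(s, cues, weights, neighbor_pairs, lam=0.5):
    """s: (N,) superpixel saliency values; cues: list of (N,) cue maps
    (planar, depth, motion, ...); neighbor_pairs: (i, j) adjacent pairs.
    Lower energy means better agreement with cues plus local smoothness."""
    unary = sum(w * np.sum((s - c) ** 2) for w, c in zip(weights, cues))
    pairwise = sum((s[i] - s[j]) ** 2 for i, j in neighbor_pairs)
    return unary + lam * pairwise
```

Minimizing this quadratic energy, for instance by solving the corresponding linear system, yields the fused saliency map.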

12.
Personality recognition and analysis is an important topic in personality computing, with significant applications in human behaviour analysis, artificial intelligence, human-computer interaction, and personalized recommendation; in recent years it has become a multidisciplinary research focus across psychology, cognitive science, and computer science. This paper introduces the personality-type representation theories and personality recognition databases relevant to the task, reviews audio-visual personality feature extraction techniques such as hand-crafted and deep features, classifies and summarizes in detail the multimodal fusion methods for audio-visual personality recognition, and concludes with an outlook on the development trends of multimodal personality recognition from audio-visual information.

13.
We present a system for multimedia event detection. The developed system characterizes complex multimedia events based on a large array of multimodal features and classifies unseen videos by effectively fusing diverse responses. We present three major technical innovations. First, we explore novel visual and audio features across multiple semantic granularities, including building, often in an unsupervised manner, mid-level and high-level features upon low-level features to enable semantic understanding. Second, we show a novel latent SVM model that learns and localizes discriminative high-level concepts in cluttered video sequences. In addition to improving detection accuracy over existing approaches, it enables a unique summary for every retrieval through its use of high-level concepts and temporal evidence localization; the resulting summary provides some transparency into why the system classified the video as it did. Finally, we present novel fusion learning algorithms and a methodology to improve fusion learning under limited training data. Thorough evaluation on the large TRECVID MED 2011 dataset showcases the benefits of the presented system.
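The temporal-evidence-localization idea can be sketched as a latent sliding-window score: the clip score is the best window response of a linear model, and the argmax window is the evidence surfaced in the summary. This is a simplification of the latent SVM inference, with mean-pooled window features as an assumption:

```python
import numpy as np

def latent_window_score(w, frame_feats, win=30, stride=10):
    """frame_feats: (T, D) per-frame features. Returns the best window
    score under weight vector w and the (start, end) span achieving it,
    usable as localized evidence for the detection."""
    best, span = -np.inf, (0, min(win, len(frame_feats)))
    for t in range(0, max(1, len(frame_feats) - win + 1), stride):
        pooled = frame_feats[t:t + win].mean(axis=0)
        score = float(w @ pooled)
        if score > best:
            best, span = score, (t, t + win)
    return best, span
```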

14.
This paper first introduces common wearable sensors, smart wearable devices, and their key application areas. Since multi-sensor systems are defined by the presence of more than one modality or channel (e.g., visual, audio, environmental, and physiological signals), fusion methods for multi-modality and multi-location sensors are surveyed. Although several works have reviewed the state of the art in information fusion or deep learning, each tackles only one aspect of sensor fusion applications, which leaves a gap in comprehensive understanding. We therefore take a more holistic approach in order to provide a suitable starting point from which to develop a full understanding of the fusion methods used with wearable sensors. Specifically, this review surveys the most important aspects of multi-sensor applications for human activity recognition, including recent additions to the field such as unsupervised learning and transfer learning. Finally, the open research issues that need further investigation and improvement are identified and discussed.

15.
This paper proposes a novel representation space for multimodal information, enabling fast and efficient retrieval of video data. We suggest describing documents not directly by selected multimodal features (audio, visual, or text), but by their cross-document similarities with respect to their multimodal characteristics. This idea leads us to propose a particular form of dissimilarity space adapted to the asymmetric classification problem, and in turn to the query-by-example and relevance-feedback paradigm widely used in information retrieval. Based on the proposed dissimilarity space, we define various strategies for fusing modalities through a kernel-based learning approach. The problem of automatically setting the kernel to adapt the learning process to the query is also discussed. The properties of our strategies are studied and validated on artificial data. In a second phase, a large annotated video corpus (i.e., TRECVID-05) indexed by visual, audio, and text features is used to evaluate the overall performance of the dissimilarity space and fusion strategies. The results confirm the validity of the proposed approach for representing and retrieving multimodal information in a real-time framework.
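A minimal sketch of the dissimilarity-space construction: each video is described by its distances to a set of reference documents, one block per modality, rather than by its raw features. The prototype sets and the Euclidean metric are assumptions; the paper's space is further adapted to the asymmetric setting:

```python
import numpy as np
from scipy.spatial.distance import cdist

def dissimilarity_representation(feats_by_mod, protos_by_mod):
    """feats_by_mod / protos_by_mod: {modality: (N, D_m) / (P_m, D_m)}.
    Returns (N, sum_m P_m): each column is a cross-document dissimilarity
    to one prototype, concatenated across modalities."""
    blocks = [cdist(feats_by_mod[m], protos_by_mod[m], metric="euclidean")
              for m in sorted(feats_by_mod)]
    return np.hstack(blocks)
```

Modalities can then be fused by learning a kernel (e.g. an RBF) over selected blocks of this representation.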

16.
Learning modality-fused representations and processing unaligned multimodal sequences are meaningful and challenging tasks in multimodal emotion recognition. Existing approaches use directional pairwise attention or a message hub to fuse the language, visual, and audio modalities. However, these fusion methods are often quadratic in complexity with respect to the modal sequence length, introduce redundant information, and are inefficient. In this paper, we propose an efficient neural network that learns modality-fused representations with a CB-Transformer (LMR-CBT) for multimodal emotion recognition from unaligned multimodal sequences. Specifically, we first perform feature extraction for the three modalities separately to obtain the local structure of each sequence. Then, we design an innovative asymmetric transformer with cross-modal blocks (CB-Transformer) that enables complementary learning across modalities, mainly divided into local temporal learning, cross-modal feature fusion, and global self-attention representation. In addition, we concatenate the fused features with the original features to classify the emotions of the sequences. Finally, we conduct word-aligned and unaligned experiments on three challenging datasets: IEMOCAP, CMU-MOSI, and CMU-MOSEI. The experimental results show the superiority and efficiency of our proposed method in both settings; compared with mainstream methods, our approach reaches the state of the art with a minimal number of parameters.
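A hedged PyTorch sketch of a cross-modal block in this spirit: one modality's sequence queries another via multi-head attention, so the sequences need not be word-aligned. The actual asymmetric CB-Transformer design differs, and the dimensions are placeholders:

```python
import torch
import torch.nn as nn

class CrossModalBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                                nn.Linear(4 * dim, dim))

    def forward(self, target, source):
        """target (B, Tt, dim) supplies queries; source (B, Ts, dim)
        supplies keys/values, allowing unaligned sequence lengths."""
        h, _ = self.attn(target, source, source)
        x = self.norm1(target + h)
        return self.norm2(x + self.ff(x))
```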

17.
Audio-visual saliency detection methods adopt a two-stream network structure; when the audio and visual signals are inconsistent, the audio information in the two-stream network negatively affects the video information and weakens the visual features of objects. In addition, traditional fusion schemes ignore the relative importance of feature attributes. Addressing these problems of the two-stream network, this paper proposes a multi-stream audio-visual saliency algorithm based on visual information compensation (MSAVIC). First, a separate video encoding branch is added to the two-stream network, preserving the complete object appearance and motion information in the video signal. Second, a feature fusion strategy combines the video encoding features with the audio-visual saliency features, enhancing the expression of visual information and compensating for it when audio and vision disagree. Theoretical analysis and experimental results show that MSAVIC exceeds other methods by about 2% on four datasets, demonstrating strong saliency detection performance.
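A minimal sketch of the compensation step, assuming the audio-visual features and the video-only branch features are spatial maps of matching resolution (channel sizes are placeholders): the two are concatenated and re-projected so visual content survives even when the audio disagrees:

```python
import torch
import torch.nn as nn

class VisualCompensation(nn.Module):
    """Fuse audio-visual saliency features with a video-only branch via a
    1x1 convolution, echoing the feature fusion strategy described above."""
    def __init__(self, c_av=256, c_vid=256):
        super().__init__()
        self.fuse = nn.Conv2d(c_av + c_vid, c_av, kernel_size=1)

    def forward(self, av_feat, video_feat):
        # av_feat: (B, c_av, H, W); video_feat: (B, c_vid, H, W)
        return torch.relu(self.fuse(torch.cat([av_feat, video_feat], dim=1)))
```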

18.
Saliency detection has been shown to be an effective and reliable approach for extracting regions of interest (ROIs) in remote sensing images. However, most existing saliency detection methods employing multiple saliency cues ignore the intrinsic relationships between different cues and do not distinguish their diverse contributions to the final saliency map. In this paper, we propose a novel self-adaptive multiple-feature fusion model for saliency detection in remote sensing images that exploits these relationships to improve the accuracy of ROI extraction. First, we consider multiple feature channels, namely colour, intensity, texture, and global contrast, to produce primary feature maps; in particular, we design a novel method based on the dual-tree complex wavelet transform to generate texture feature pyramids for remote sensing images. Then, we introduce a novel self-adaptive multiple-feature fusion method based on low-rank matrix recovery, in which the significance of each feature map is ranked by low-rank constraint recovery and the contributions of the features are allocated adaptively to produce the final saliency map. Experimental results demonstrate that our proposal outperforms state-of-the-art methods.
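A simplified stand-in for the low-rank weighting idea: stack the vectorized cue maps as columns, split off a low-rank component capturing their shared structure, and weight each cue inversely to its residual. The paper's actual low-rank matrix recovery formulation is more principled than this plain SVD sketch:

```python
import numpy as np

def lowrank_cue_weights(cue_maps, rank=1):
    """cue_maps: (K, H, W) primary feature maps. Returns normalized
    per-cue fusion weights favouring cues close to the shared low-rank
    structure."""
    X = np.stack([m.ravel() for m in cue_maps], axis=1)   # (H*W, K)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    L = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]       # shared structure
    residual = np.linalg.norm(X - L, axis=0)              # per-cue deviation
    w = 1.0 / (residual + 1e-8)
    return w / w.sum()

def fuse_saliency(cue_maps):
    """Weighted combination of the cue maps into one saliency map."""
    return np.tensordot(lowrank_cue_weights(cue_maps), np.asarray(cue_maps), axes=1)
```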

19.
In this paper, we present a probabilistic multi-task learning approach for visual saliency estimation in video. In our approach, the problem of visual saliency estimation is modeled by simultaneously considering the stimulus-driven and task-related factors in a probabilistic framework. In this framework, a stimulus-driven component simulates the low-level processes in the human vision system using multi-scale wavelet decomposition and unbiased feature competition, while a task-related component simulates the high-level processes to bias the competition of the input features. Different from existing approaches, we propose a multi-task learning algorithm to learn the task-related “stimulus-saliency” mapping functions for each scene. The algorithm also learns various fusion strategies, which are used to integrate the stimulus-driven and task-related components to obtain the visual saliency. Extensive experiments were carried out on two public eye-fixation datasets and one regional saliency dataset. Experimental results show that our approach outperforms eight state-of-the-art approaches remarkably.
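The stimulus-driven front end can be sketched with an off-the-shelf multi-scale wavelet decomposition, using per-scale detail energy as unbiased low-level features. PyWavelets is an assumption here; the paper's feature competition and learned mapping functions are not reproduced:

```python
import numpy as np
import pywt

def wavelet_energy_features(gray_frame, wavelet="db2", levels=3):
    """Decompose a grayscale frame and return one normalized contrast-
    energy map per scale (summed over the H/V/D detail orientations)."""
    coeffs = pywt.wavedec2(gray_frame, wavelet, level=levels)
    feats = []
    for detail in coeffs[1:]:                  # (cH, cV, cD) per level
        energy = sum(np.abs(c) for c in detail)
        feats.append(energy / (energy.max() + 1e-8))
    return feats
```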

20.
In the Semantic Web vision of the World Wide Web, content will not only be accessible to humans but will also be available in machine-interpretable form as ontological knowledge bases. Ontological knowledge bases enable formal querying and reasoning and, consequently, a main research focus has been the investigation of how deductive reasoning can be utilized in ontological representations to enable more advanced applications. However, purely logical methods have not yet proven very effective, for several reasons. First, there is still the unsolved problem of scaling reasoning to Web scale. Second, logical reasoning has problems with uncertain information, which is abundant in Semantic Web data due to its distributed and heterogeneous nature. Third, the construction of ontological knowledge bases suitable for advanced reasoning techniques is complex, which ultimately results in a lack of expressive real-world datasets with large amounts of instance data. From another perspective, the more expressive structured representations open up new opportunities for data mining, knowledge extraction, and machine learning techniques. If one accepts that part of the knowledge already lies in the data, inductive methods appear promising, in particular since they can inherently handle noisy, inconsistent, uncertain, and missing data. While there has been broad coverage of inducing concept structures from less structured sources (text, Web pages), as in ontology learning, given the problems mentioned above we focus on new methods for dealing with Semantic Web knowledge bases, relying on statistical inference over their standard representations. We argue that machine learning research has a wide variety of methods to offer, applicable to different expressivity levels of Semantic Web knowledge bases: ranging from weakly expressive but widely available knowledge bases in RDF to highly expressive first-order knowledge bases, this paper surveys statistical approaches to mining the Semantic Web. We specifically cover similarity- and distance-based methods, kernel machines, multivariate prediction models, relational graphical models, and first-order probabilistic learning approaches, and discuss their applicability to Semantic Web representations. Finally, we present selected experiments conducted on Semantic Web mining tasks for some of the algorithms presented before. This is intended to show the breadth and general potential of this exciting new research and application area for data mining.
