1.

Apertium: a free/open-source platform for rule-based machine translation (cited by: 1; self-citations: 1; citations by others: 0)

Mikel L. Forcada Mireia Ginestí-Rosell Jacob Nordfalk Jim O’Regan Sergio Ortiz-Rojas Juan Antonio Pérez-Ortiz Felipe Sánchez-Martínez Gema Ramírez-Sánchez Francis M. Tyers 《Machine Translation》2011,25(2):127-144

Apertium is a free/open-source platform for rule-based machine translation. It is being widely used to build machine translation systems for a variety of language pairs, especially in those cases (mainly with related-language pairs) where shallow transfer suffices to produce good quality translations, although it has also proven useful in assimilation scenarios with more distant pairs involved. This article summarises the Apertium platform: the translation engine, the encoding of linguistic data, and the tools developed around the platform. The present limitations of the platform and the challenges posed for the coming years are also discussed. Finally, evaluation results for some of the most active language pairs are presented. An appendix describes Apertium as a free/open-source project.

2.

Polysemy Deciphering Network for Robust Human–Object Interaction Detection

Zhong Xubin Ding Changxing Qu Xian Tao Dacheng 《International Journal of Computer Vision》2021,129(6):1910-1929

Human–Object Interaction (HOI) detection is important to human-centric scene understanding tasks. Existing works tend to assume that the same verb has similar visual characteristics in different HOI categories, an approach that ignores the diverse semantic meanings of the verb. To address this issue, in this paper, we propose a novel Polysemy Deciphering Network (PD-Net) that decodes the visual polysemy of verbs for HOI detection in three distinct ways. First, we refine features for HOI detection to be polysemy-aware through the use of two novel modules: namely, Language Prior-guided Channel Attention (LPCA) and Language Prior-based Feature Augmentation (LPFA). LPCA highlights important elements in human and object appearance features for each HOI category to be identified; moreover, LPFA augments human pose and spatial features for HOI detection using language priors, enabling the verb classifiers to receive language hints that reduce intra-class variation for the same verb. Second, we introduce a novel Polysemy-Aware Modal Fusion module, which guides PD-Net to make decisions based on feature types deemed more important according to the language priors. Third, we propose to relieve the verb polysemy problem through sharing verb classifiers for semantically similar HOI categories. Furthermore, to expedite research on the verb polysemy problem, we build a new benchmark dataset named HOI-VerbPolysemy (HOI-VP), which includes common verbs (predicates) that have diverse semantic meanings in the real world. Finally, through deciphering the visual polysemy of verbs, our approach is demonstrated to outperform state-of-the-art methods by significant margins on the HICO-DET, V-COCO, and HOI-VP databases. Code and data in this paper are available at https://github.com/MuchHair/PD-Net.

3.

Matxin, an open-source rule-based machine translation system for Basque (cited by: 1; self-citations: 1; citations by others: 0)

Aingeru Mayor Iñaki Alegria Arantza Díaz de Ilarraza Gorka Labaka Mikel Lersundi Kepa Sarasola 《Machine Translation》2011,25(1):53-82

We present the first publicly available machine translation (MT) system for Basque. The fact that Basque is both a morphologically rich and less-resourced language makes the use of statistical approaches difficult, and raises the need to develop a rule-based architecture which can be combined in the future with statistical techniques. The MT architecture proposed reuses several open-source tools and is based on a unique XML format to facilitate the flow between the different modules, which eases the interaction among different developers of tools and resources. The result is the rule-based Matxin MT system, an open-source toolkit, whose first implementation translates from Spanish to Basque. We have performed innovative work on the following tasks: construction of a dependency analyser for Spanish, use of rich linguistic information to translate prepositions and syntactic functions (such as subject and object markers), construction of an efficient module for verbal chunk transfer, and design and implementation of modules for ordering words and phrases, independently of the source language.

4.

The Train Benchmark: cross-technology performance evaluation of continuous model queries

Gábor Szárnyas Benedek Izsó István Ráth Dániel Varró 《Software and Systems Modeling》2018,17(4):1365-1393

In model-driven development of safety-critical systems (like automotive, avionics or railways), well-formedness of models is repeatedly validated in order to detect design flaws as early as possible. In many industrial tools, validation rules are still often implemented by a large amount of imperative model traversal code, which makes those rule implementations complicated and hard to maintain. Additionally, as models are rapidly increasing in size and complexity, efficient execution of validation rules is challenging for the currently available tools. Checking well-formedness constraints can be captured by declarative queries over graph models, while model update operations can be specified as model transformations. This paper presents a benchmark for systematically assessing the scalability of validating and revalidating well-formedness constraints over large graph models. The benchmark defines well-formedness validation scenarios in the railway domain: a metamodel, an instance model generator and a set of well-formedness constraints captured by queries, fault injection and repair operations (imitating the work of systems engineers by model transformations). The benchmark focuses on the performance of query evaluation, i.e. its execution time and memory consumption, with a particular emphasis on reevaluation. We demonstrate that the benchmark can be adapted to various technologies and query engines, including modeling tools; relational, graph and semantic databases. The Train Benchmark is available as an open-source project with continuous builds from https://github.com/FTSRG/trainbenchmark.

5.

Improving learning and generalization capabilities of the C-Mantec constructive neural network algorithm

Gómez Iván Mesa Héctor Ortega-Zamorano Francisco Jerez-Aragonés José M. Franco Leonardo 《Neural computing & applications》2020,32(13):8955-8963

The C-Mantec neural network constructive algorithm (Ortega, C-Mantec neural network algorithm implementation on MATLAB, https://github.com/IvanGGomez/CmantecPaco, 2015) creates very compact architectures with generalization capabilities similar to feed-forward networks trained by the well-known back-propagation algorithm. Nevertheless, constructive algorithms suffer much from the problem of overfitting, and thus, in this work the learning procedure is first analyzed for networks created by this algorithm with the aim of understanding the training dynamics that will permit optimization possibilities. Secondly, several optimization strategies are analyzed for the position of class separating hyperplanes, and the results are analyzed on a set of public domain benchmark data sets. The results indicate that with these modifications a small increase in prediction accuracy of C-Mantec can be obtained, but in general this was not better when compared to a standard support vector machine, except in some cases when a mixed strategy is used.

6.

Understanding contents of filled-in Bangla form images

Bhattacharya Rajdeep Malakar Samir Ghosh Soulib Bhowmik Showmik Sarkar Ram 《Multimedia Tools and Applications》2021,80(3):3529-3570

7.

Engineering fast multilevel support vector machines

Sadrfaridpour Ehsan Razzaghi Talayeh Safro Ilya 《Machine Learning》2019,108(11):1879-1917

The computational complexity of solving a nonlinear support vector machine (SVM) is prohibitive on large-scale data. In particular, this issue becomes very sensitive when the data presents additional difficulties such as highly imbalanced class sizes. Typically, nonlinear kernels produce significantly higher classification quality than linear kernels but introduce extra kernel and model parameters, which require computationally expensive fitting. This increases the quality but also reduces the performance dramatically. We introduce a generalized fast multilevel framework for regular and weighted SVM and discuss several versions of its algorithmic components that lead to a good trade-off between quality and time. Our framework is implemented using PETSc, which allows an easy integration with scientific computing tasks. The experimental results demonstrate significant speed up compared to the state-of-the-art nonlinear SVM libraries. Reproducibility: our source code, documentation and parameters are available at https://github.com/esadr/mlsvm.

8.

SiamET: a Siamese based visual tracking network with enhanced templates

Zhou Yuxin Zhang Yi 《Applied Intelligence》2022,52(9):9782-9794

The discriminative correlation filter (DCF) played a dominant role in visual tracking tasks in early years. However, with the recent development of deep learning, Siamese-based networks have begun to prevail. Unlike DCF, most Siamese network based tracking methods take the first frame as the reference, while ignoring the information from the subsequent frames. As a result, these methods may fail under unforeseeable situations (e.g. target scale/size changes, varying illumination, occlusions etc.). Meanwhile, other deep learning based tracking methods learn discriminative filters online, where the training samples are extracted from a few fixed frames with predictable labels. However, these methods have the same limitations as Siamese-based trackers. The training samples are prone to cumulative errors, which ultimately lead to tracking loss. To address this, we propose SiamET, a Siamese-based network using ResNet-50 as its backbone with an enhanced template module. Different from existing methods, our templates are acquired based on all historical frames. Extensive experiments have been carried out on popular datasets to verify the effectiveness of our method. It turns out that our tracker achieves superior performance to the state-of-the-art methods on 4 challenging benchmarks, including OTB100, VOT2018, VOT2019 and LaSOT. Specifically, we achieve an EAO score of 0.480 on VOT2018 with 31 FPS. Code is available at https://github.com/yu-1238/SiamET

9.

Fusion that matters: convolutional fusion networks for visual recognition

Yu Liu Yanming Guo Theodoros Georgiou Michael S. Lew 《Multimedia Tools and Applications》2018,77(22):29407-29434

In recent years, deep learning has been successfully applied to diverse multimedia research areas, with the aim of learning powerful and informative representations for a variety of visual recognition tasks. In this work, we propose convolutional fusion networks (CFN) to integrate multi-level deep features and fuse a richer visual representation. Despite recent advances in deep fusion networks, they still have limitations due to expensive parameters and weak fusion modules. Instead, CFN uses 1 × 1 convolutional layers and global average pooling to generate side branches with few parameters, and employs a locally-connected fusion module, which can learn adaptive weights for different side branches and form a better fused feature. Specifically, we introduce three key components of the proposed CFN, and discuss its differences from other deep models. Moreover, we propose fully convolutional fusion networks (FCFN) that are an extension of CFN for pixel-level classification applied to several tasks, such as semantic segmentation and edge detection. Our experiments demonstrate that CFN (and FCFN) can achieve promising performance by consistent improvements for both image-level and pixel-level classification tasks, compared to a plain CNN. We release our code at https://github.com/yuLiu24/CFN. Also, we make a live demo (goliath.liacs.nl) using a CFN model trained on the ImageNet dataset.
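The fusion step described in this abstract can be illustrated with a minimal pure-Python sketch. It assumes each side branch has already been reduced by a 1 × 1 convolution to the same number of channels. The function names `global_average_pool` and `locally_connected_fusion`, the toy shapes and the hand-set weights are illustrative assumptions and are not taken from the released CFN code.

```python
def global_average_pool(feature_map):
    """Collapse each channel (an H x W grid) to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def locally_connected_fusion(branch_vectors, weights):
    """Fuse S branch vectors of length D with one weight per (branch, dimension).

    No weight is shared across dimensions, which is what makes the layer
    locally connected rather than convolutional. In a trained network these
    weights would be learned; here they are fixed by hand (assumption).
    """
    dims = len(branch_vectors[0])
    return [
        sum(w[d] * v[d] for v, w in zip(branch_vectors, weights))
        for d in range(dims)
    ]


# Two hypothetical side branches, each with 2 channels of size 2 x 2.
branch_a = [[[1.0, 3.0], [5.0, 7.0]], [[2.0, 2.0], [2.0, 2.0]]]
branch_b = [[[0.0, 0.0], [4.0, 4.0]], [[1.0, 1.0], [3.0, 3.0]]]

pooled = [global_average_pool(branch_a), global_average_pool(branch_b)]
fusion_weights = [[0.5, 1.0], [0.5, 0.0]]  # hand-set for illustration only
fused = locally_connected_fusion(pooled, fusion_weights)
```

Because every (branch, dimension) pair has its own weight, the fused vector can favour a shallow branch for one feature dimension and a deep branch for another, which is one way to read the "adaptive weights for different side branches" in the abstract.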

10.

A multimodal fusion method for sarcasm detection based on late fusion

Ding Ning Tian Sheng-wei Yu Long 《Multimedia Tools and Applications》2022,81(6):8597-8616

Information on social media is multi-modal, and most of it carries sarcastic meaning. In recent years, many people have studied the problem of sarcasm detection. Many traditional methods have been proposed in this field, but the study of deep learning methods to detect sarcasm is still insufficient. It is necessary to comprehensively consider the information of the text, the changes of the tone of the audio signal, the facial expressions and the body posture in the image to detect sarcasm. This paper proposes a multi-level late-fusion learning framework with residual connections, a more reasonable experimental dataset split and two model variants based on different experimental settings. Extensive experiments on the MUStARD dataset show that our methods are better than other fusion models. In our speaker-independent experimental split, the multi-modality has a 4.85% improvement over the single-modality, and the error rate reduction improves by 11.8%. The latest code will be updated to this URL later: https://github.com/DingNing123/m_fusion

11.

Chinese medical question answer selection via hybrid models based on CNN and GRU

Zhang Yuteng Lu Wenpeng Ou Weihua Zhang Guoqiang Zhang Xu Cheng Jinyong Zhang Weiyu 《Multimedia Tools and Applications》2020,79(21-22):14751-14776

Question answer selection in the Chinese medical field is very challenging since it requires effective text representations to capture the complex semantic relationships between Chinese questions and answers. Recent approaches based on deep learning, e.g., CNN and RNN, have shown their potential in improving the selection quality. However, these existing methods can only capture a part or one side of the semantic relationships while ignoring the other rich and sophisticated ones, leading to limited performance improvement. In this paper, a series of neural network models are proposed to address the Chinese medical question answer selection issue. In order to model the complex relationships between questions and answers, we develop both single and hybrid models with CNN and GRU to combine the merits of different neural network architectures. This is different from existing works that can only capture partial relationships by utilizing a single network structure. Extensive experimental results on the cMedQA dataset demonstrate that the proposed hybrid models, especially BiGRU-CNN, significantly outperform the state-of-the-art methods. The source code of our models is available on GitHub (https://github.com/zhangyuteng/MedicalQA-CNN-BiGRU).

12.

XKin: an open source framework for hand pose and gesture recognition using kinect

Fabrizio Pedersoli Sergio Benini Nicola Adami Riccardo Leonardi 《The Visual computer》2014,30(10):1107-1122

13.

Analyzing the potential of active learning for document image classification

Saifullah Saifullah Agne Stefan Dengel Andreas Ahmed Sheraz 《International Journal on Document Analysis and Recognition》2023,26(3):187-209

Deep learning has been extensively researched in the field of document analysis and has shown excellent performance across a wide range of document-related tasks. As a result, a great deal of emphasis is now being placed on its practical deployment and integration into modern industrial document processing pipelines. It is well known, however, that deep learning models are data-hungry and often require huge volumes of annotated data in order to achieve competitive performances. And since data annotation is a costly and labor-intensive process, it remains one of the major hurdles to their practical deployment. This study investigates the possibility of using active learning to reduce the costs of data annotation in the context of document image classification, which is one of the core components of modern document processing pipelines. The results of this study demonstrate that by utilizing active learning (AL), deep document classification models can achieve competitive performances to the models trained on fully annotated datasets and, in some cases, even surpass them by annotating only 15–40% of the total training dataset. Furthermore, this study demonstrates that modern AL strategies significantly outperform random querying, and in many cases achieve comparable performance to the models trained on fully annotated datasets even in the presence of practical deployment issues such as data imbalance, and annotation noise, and thus, offer tremendous benefits in real-world deployment of deep document classification models. The code to reproduce our experiments is publicly available at https://github.com/saifullah3396/doc_al.

14.

Increasing adaptability of a speech into sign language translation system

Verónica López-Ludeña Rubén San-Segundo Carlos González Morcillo Juan Carlos López José M. Pardo Muñoz 《Expert systems with applications》2013,40(4):1312-1322

This paper describes a new version of a speech into sign language translation system with new tools and characteristics for increasing its adaptability to a new task or a new semantic domain. This system is made up of a speech recognizer (for decoding the spoken utterance into a word sequence), a natural language translator (for converting a word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (for playing back the signs). In order to increase the system adaptability, this paper presents new improvements in all three main modules for automatically generating the task-dependent information from a parallel corpus: automatic generation of Spanish variants when generating the vocabulary and language model for the speech recogniser, an acoustic adaptation module for the speech recogniser, data-oriented language and translation models for the machine translator and a list of signs to design. The avatar animation module includes a new editor for rapid design of the required signs. These developments have been necessary to reduce the effort when adapting a Spanish into Spanish sign language (LSE: Lengua de Signos Española) translation system to a new domain. The whole translation presents a SER (Sign Error Rate) lower than 10% and a BLEU higher than 90%, while the effort for adapting the system to a new domain has been reduced by more than 50%.
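The Sign Error Rate quoted in this abstract is not defined in the listing. A minimal sketch, assuming SER is computed like word error rate (edit distance between the reference and hypothesis sign sequences, divided by the reference length), is given below. The function names and the example glosses are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences of sign glosses."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_sign in enumerate(reference, start=1):
        current = [i]
        for j, hyp_sign in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_sign == hyp_sign else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def sign_error_rate(reference, hypothesis):
    """Edit distance normalised by the reference length (WER-style assumption)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical glosses: one substitution and one deletion over four signs.
ser = sign_error_rate(["YOU", "NAME", "WHAT", "PLEASE"], ["YOU", "AGE", "WHAT"])
```

Under this definition, an SER below 10% would mean fewer than one sign-level edit per ten reference signs.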

15.

Long-term Visual Tracking: Review and Experimental Comparison

Liu Chang Chen Xiao-Fan Bo Chun-Juan Wang Dong 《International Journal of Automation and Computing》2022,19(6):512-530

16.

Temporal pattern attention for multivariate time series forecasting

Shih Shun-Yao Sun Fan-Keng Lee Hung-yi 《Machine Learning》2019,108(8-9):1421-1441

Forecasting of multivariate time series data, for instance the prediction of electricity consumption, solar power production, and polyphonic piano pieces, has numerous valuable applications. However, complex and non-linear interdependencies between time steps and series complicate this task. To obtain accurate prediction, it is crucial to model long-term dependency in time series data, which can be achieved by recurrent neural networks (RNNs) with an attention mechanism. The typical attention mechanism reviews the information at each previous time step and selects relevant information to help generate the outputs; however, it fails to capture temporal patterns across multiple time steps. In this paper, we propose using a set of filters to extract time-invariant temporal patterns, similar to transforming time series data into its “frequency domain”. Then we propose a novel attention mechanism to select relevant time series, and use its frequency domain information for multivariate forecasting. We apply the proposed model to several real-world tasks and achieve state-of-the-art performance in almost all cases. Our source code is available at https://github.com/gantheory/TPA-LSTM.
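The two steps named in this abstract (filters applied along time, then attention over series rather than time steps) can be sketched in pure Python as follows. This is an illustration under stated assumptions: each row of `hidden_rows` stands in for one series over a window of time steps, each filter spans the whole window, and the bilinear score with sigmoid weighting is one plausible choice. The names `temporal_pattern_attention`, `w_a` and `h_t` are hypothetical and may differ from the released TPA-LSTM code.

```python
import math


def temporal_pattern_attention(hidden_rows, filters, w_a, h_t):
    """Attend over rows (series) using filter responses computed along time.

    hidden_rows: m rows, each a list of w values over the time window
    filters:     k filters, each a list of w weights
    w_a:         k x len(h_t) scoring matrix (assumed bilinear scoring)
    h_t:         current hidden state used as the attention query
    """
    # Step 1: each filter summarises a row's temporal pattern -> m x k matrix.
    responses = [
        [sum(x * c for x, c in zip(row, filt)) for filt in filters]
        for row in hidden_rows
    ]
    # Step 2: project the query into filter space (length k).
    query = [sum(w * h for w, h in zip(w_row, h_t)) for w_row in w_a]
    # Step 3: one sigmoid weight per row, so several rows can be selected at once.
    alphas = [
        1.0 / (1.0 + math.exp(-sum(r * q for r, q in zip(resp, query))))
        for resp in responses
    ]
    # Step 4: context vector = weighted sum of the rows' filter responses.
    context = [
        sum(a * resp[j] for a, resp in zip(alphas, responses))
        for j in range(len(filters))
    ]
    return alphas, context


alphas, context = temporal_pattern_attention(
    hidden_rows=[[1.0, 0.0], [0.0, 1.0]],
    filters=[[1.0, 1.0]],
    w_a=[[0.0, 0.0]],
    h_t=[1.0, 1.0],
)
```

With an all-zero scoring matrix every row receives weight 0.5, so the toy call simply averages the filter responses; learned weights would let the model emphasise the series whose temporal patterns matter for the forecast.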

17.

Improved pedestrian detection using motion segmentation and silhouette orientation

Suman Kumar Choudhury Pankaj Kumar Sa Ram Prasad Padhy Saurav Sharma Sambit Bakshi 《Multimedia Tools and Applications》2018,77(11):13075-13114

18.

Learning Degradation-Invariant Representation for Robust Real-World Person Re-Identification

Huang Yukun Fu Xueyang Li Liang Zha Zheng-Jun 《International Journal of Computer Vision》2022,130(11):2770-2796

Person re-identification (Re-ID) in real-world scenarios suffers from various degradations, e.g., low resolution, weak lighting, and bad weather. These degradations hinder identity feature learning and significantly degrade Re-ID performance. To address these issues, in this paper, we propose a degradation invariance learning framework for robust person Re-ID. Concretely, we first design a content-degradation feature disentanglement strategy to capture and isolate task-irrelevant features contained in the degraded image. Then, to avoid the catastrophic forgetting problem, we introduce a memory replay algorithm to further consolidate invariance knowledge learned from the previous pre-training to improve subsequent identity feature learning. In this way, our framework is able to continuously maintain degradation-invariant priors from one or more datasets to improve the robustness of identity features, achieving state-of-the-art Re-ID performance on several challenging real-world benchmarks with a unified model. Furthermore, the proposed framework can be extended to low-level image processing, e.g., low-light image enhancement, demonstrating the potential of our method as a general framework for various vision tasks. Code and trained models will be available at: https://github.com/hyk1996/Degradation-Invariant-Re-ID-pytorch.

19.

Three-dimensional Entity Resolution with JedAI

《Information Systems》2020

Entity Resolution (ER) is the task of detecting different entity profiles that describe the same real-world objects. To facilitate its execution, we have developed JedAI, an open-source system that puts together a series of state-of-the-art ER techniques that have been proposed and examined independently, targeting parts of the ER end-to-end pipeline. This is a unique approach, as no other ER tool brings together so many established techniques. Instead, most ER tools merely convey a few techniques, those primarily developed by their creators. In addition to democratizing ER techniques, JedAI goes beyond the other ER tools by offering a series of unique characteristics: (i) It allows for building and benchmarking millions of ER pipelines. (ii) It is the only ER system that applies seamlessly to any combination of structured and/or semi-structured data. (iii) It constitutes the only ER system that runs seamlessly both on stand-alone computers and clusters of computers, through the parallel implementation of all algorithms in Apache Spark. (iv) It supports two different end-to-end workflows for carrying out batch ER (i.e., budget-agnostic), a schema-agnostic one based on blocks, and a schema-based one relying on similarity joins. (v) It adapts both end-to-end workflows to budget-aware (i.e., progressive) ER. We present in detail all features of JedAI, stressing the core characteristics that enhance its usability, and boost its versatility and effectiveness. We also compare it to the state-of-the-art in the field, qualitatively and quantitatively, demonstrating its state-of-the-art performance over a variety of large-scale datasets from different domains. The central repository of JedAI's code base is here: https://github.com/scify/JedAIToolkit. A video demonstrating JedAI's Web application is available here: https://www.youtube.com/watch?v=OJY1DUrUAe8.
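To make the "schema-agnostic workflow based on blocks" in this abstract concrete, here is a minimal sketch of token blocking, a common schema-agnostic blocking technique. It is an illustration only and does not use the JedAI API. The function names, the tokenisation rule and the example profiles are assumptions.

```python
import re
from collections import defaultdict


def token_blocking(profiles):
    """Group profile ids by every token found in any attribute value.

    Attribute names are ignored entirely, which is what makes the method
    schema-agnostic. Blocks with a single profile yield no comparisons
    and are dropped.
    """
    blocks = defaultdict(set)
    for profile_id, attributes in profiles.items():
        for value in attributes.values():
            for token in re.findall(r"\w+", str(value).lower()):
                blocks[token].add(profile_id)
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Return the distinct profile pairs that share at least one block."""
    pairs = set()
    for ids in blocks.values():
        ordered = sorted(ids)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                pairs.add((ordered[i], ordered[j]))
    return pairs


# Hypothetical profiles with different schemas describing overlapping people.
profiles = {
    "a": {"name": "John Smith"},
    "b": {"fullname": "Smith, J."},
    "c": {"title": "Jane Doe"},
}
blocks = token_blocking(profiles)
pairs = candidate_pairs(blocks)
```

Only the pairs produced by blocking are passed on to the more expensive matching step, so profiles "a" and "b" are compared here while "c" is never compared with either.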

20.

Bottom-up broadcast neural network for music genre classification

Liu Caifeng Feng Lin Liu Guochao Wang Huibing Liu Shenglan 《Multimedia Tools and Applications》2021,80(5):7313-7331

Music genre classification based on visual representation has been successfully explored over the last years. Recently, there has been increasing interest in applying convolutional neural networks (CNNs) to the task. However, most of the existing methods employ the mature CNN structures proposed in image recognition without any modification, which results in learned features that are not adequate for music genre classification. To address this issue, we fully exploit the low-level information from spectrograms of audio and develop a novel CNN architecture in this paper. The proposed CNN architecture takes the multi-scale time-frequency information into consideration, which transfers more suitable semantic features to the decision-making layer to discriminate the genre of the unknown music clip. The experiments are evaluated on the benchmark datasets including GTZAN, Ballroom, and Extended Ballroom. The experimental results show that the proposed method can achieve 93.9%, 96.7%, 97.2% classification accuracies respectively, which, to the best of our knowledge, are the best results on these public datasets so far. It is notable that the model trained by our proposed network has a tiny size, only 0.18M, so it can be applied in mobile phones or other devices with limited computational resources. Code and model will be available at https://github.com/CaifengLiu/music-genre-classification.
