Similar Literature
 20 similar documents retrieved (search time: 31 ms)
1.
Increasing awareness of how incomplete data affects learning and classification accuracy has led to a growing number of missing data techniques. This article investigates the robustness and accuracy of seven popular techniques for tolerating incomplete training and test data under different proportions, patterns, and mechanisms of missing data, and their effect on the resulting tree-based models. The seven missing data techniques were compared by artificially simulating different proportions, patterns, and mechanisms of missing data using 21 complete datasets (i.e., with no missing values) obtained from the University of California, Irvine repository of machine-learning databases (Blake and Merz 1998; http://www.ics.uci.edu/mlearn/MLRepository.html). A four-way repeated measures design was employed to analyze the data. The simulation results suggest important differences. All methods have their strengths and weaknesses. However, listwise deletion is substantially inferior to the other six techniques, while multiple imputation, which utilizes the expectation-maximization algorithm, represents a superior approach to handling incomplete data. Decision tree single imputation and surrogate variable splitting are more severely affected when missing values are distributed among all attributes than when they are confined to a single attribute. Otherwise, both the imputation-based and model-based procedures gave reasonably good results, although some discrepancies remained. Different techniques for addressing missing values when using decision trees can give substantially diverse results and must be carefully considered to protect against biases and spurious findings. Multiple imputation should always be used, especially if the data contain many missing values. If few values are missing, any of the missing data techniques might be considered. The choice of technique should be guided by the proportion, pattern, and mechanism of missing data, especially the latter two. However, the use of older techniques such as listwise deletion and mean or mode single imputation is no longer justifiable given the accessibility and ease of use of more advanced techniques, such as multiple imputation and supervised learning imputation.
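A minimal sketch of the kind of comparison described above, using scikit-learn on the Iris data (an assumption; the article itself uses 21 UCI datasets and a four-way repeated measures design): missingness is injected completely at random, and a decision tree is scored after listwise deletion, mean single imputation, and iterative regression-based imputation standing in for multiple imputation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (must precede IterativeImputer)
from sklearn.impute import SimpleImputer, IterativeImputer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)

# Inject 20% missing values completely at random (MCAR) across all attributes.
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.20] = np.nan
tree = DecisionTreeClassifier(random_state=0)

# 1) Listwise deletion: drop every case with at least one missing value.
complete = ~np.isnan(X_missing).any(axis=1)
acc_listwise = cross_val_score(tree, X_missing[complete], y[complete], cv=5).mean()

# 2) Mean single imputation.
acc_mean = cross_val_score(tree, SimpleImputer(strategy="mean").fit_transform(X_missing), y, cv=5).mean()

# 3) Iterative, regression-based imputation (an EM-flavoured stand-in for multiple imputation).
acc_iter = cross_val_score(tree, IterativeImputer(random_state=0).fit_transform(X_missing), y, cv=5).mean()

print(f"listwise deletion:    {acc_listwise:.3f}")
print(f"mean imputation:      {acc_mean:.3f}")
print(f"iterative imputation: {acc_iter:.3f}")
```

On a run like this, listwise deletion typically trails the imputation-based approaches simply because it discards a large share of the training cases.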

2.
This essay begins with discussion of four relatively recent works which are representative of major themes and preoccupations in Artificial Life Art: ‘Propagaciones’ by Leo Nuñez; ‘Sniff’ by Karolina Sobecka and Jim George; ‘Universal Whistling Machine’ by Marc Boehlen; and ‘Performative Ecologies’ by Ruari Glynn. This essay is an attempt to contextualise these works by providing an overview of the history and forms of Artificial Life Art as it has developed over two decades, along with some background in the ideas of the Artificial Life movement of the late 1980s and 1990s. A more extensive study of the theoretical history of Artificial Life can be found in my paper ‘Artificial Life Art—A Primer’, in the Proceedings of DAC09 and also at http://www.ace.uci.edu/Penny. Excerpts from that essay are included here.

3.
This paper presents some tentative experiments in using a special case of rewriting rules in Mizar (Mizar homepage: http://www.mizar.org/): rewriting a term as its subterm. A similar technique, but based on another Mizar mechanism called functor identification (Korniłowicz 2009), was used by Caminati in his paper on basic first-order model theory in Mizar (Caminati, J Form Reason 3(1):49–77, 2010; Form Math 19(3):157–169, 2011). However, for this purpose he was obliged to introduce some artificial functors. The mechanism presented in the present paper looks promising and fits the Mizar paradigm.

4.
In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but, as with other open-source software projects, contributions are being made elsewhere in the US and abroad. The main goals of this article are: (1) to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach; (2) to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers; and (3) to describe the parallel implementation and data structures with supporting discussion of domain decomposition, message passing, and details related to dynamic repartitioning for parallel AMR/C. Other aspects related to C++ programming paradigms, reusability for diverse applications, adaptive modeling, physics-independent error indicators, and similar concepts are briefly discussed. Finally, results from some applications using the library are presented and areas of future research are discussed.

5.
ABSTRACT

As open source software has gained in popularity throughout the last decades, free operating systems (OSs) such as Linux (Torvalds) and BSD derivatives (i.e., FreeBSD, NetBSD, OpenBSD) have become more common, not only in datacenters but also on desktop and laptop computers. It is not rare to find computer labs or company offices composed of personal computers that boot more than one operating system. By being able to choose among available OSs, a company's or organization's information technology manager has the freedom to select the right OS for the company's needs, and the decision can be based on technical or financial criteria. This freedom of choice, however, comes with a cost. The administrative complexity of heterogeneous networks is much higher compared to single-OS networks, and if the network is large enough that protocols such as LDAP (Zeilenga, 2006) or Kerberos (Kohl & Neuman, 1993) need to be adopted, the administration burden may become unbearable. Even though some tools exist that make user management of heterogeneous networks more feasible (Tournier, 2006; Chu & Symas Corp., 2005), it is not uncommon to use more than one back end for storing user credentials due to OS incompatibilities. In such configurations, the hardest problem to address is credential and account expiration synchronization among the different back ends. This paper demonstrates a platform that tries to mitigate the problem of synchronization by adding an additional, modular, easy-to-expand layer which is responsible for synchronizing any number of underlying back ends in a secure fashion.

6.
The Brainstorm feature introduced in Adobe After Effects CS3 (2007) allows users to automate parts of the process of generating design variations for the purposes of comparison and selection. The paper begins with a brief discussion of current discursive formations around software and software-based practice among digital design practitioners and educators. Next, the paper draws upon critical concepts from multimodal discourse analysis, media theory and sociology to analyse Brainstorm in terms of the interplay of software structure and design agency. The key concepts used are modality, articulation and interpretation (Kress and van Leeuwen 1996, 2001), the database as cultural form and the logic of selection (Manovich 2001), habitus and practical logic (Bourdieu 1977) and the radius of creativity (Toynbee 2000). Throughout, the paper addresses specific structural features of the software, thus developing an overview of the affordances and constraints of Brainstorm as a creative tool.

7.
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a system that succeeds at VQA typically needs a more detailed understanding of the image and more complex reasoning than a system producing generic image captions. Moreover, VQA is amenable to automatic evaluation, since many open-ended answers contain only a few words or a closed set of answers that can be provided in a multiple-choice format. We provide a dataset containing approximately 0.25 million images, 0.76 million questions, and 10 million answers (www.visualqa.org), and discuss the information it provides. Numerous baselines and methods for VQA are provided and compared with human performance. Our VQA demo is available on CloudCV (http://cloudcv.org/vqa).
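The open-ended evaluation mentioned above is usually quoted as: a predicted answer scores min(number of human annotators who gave that answer / 3, 1). A rough Python sketch of that scoring rule (the official evaluation additionally normalizes answer strings and averages over subsets of the ten annotators, so this is only an approximation):

```python
from collections import Counter

def vqa_accuracy(predicted: str, human_answers: list) -> float:
    """Open-ended VQA accuracy: an answer counts as fully correct
    if at least three of the ten human annotators gave it."""
    counts = Counter(a.strip().lower() for a in human_answers)
    return min(counts[predicted.strip().lower()] / 3.0, 1.0)

# Hypothetical crowd answers for "What color is the umbrella?"
humans = ["red"] * 4 + ["maroon"] * 3 + ["dark red"] * 3
print(vqa_accuracy("red", humans))     # 1.0
print(vqa_accuracy("maroon", humans))  # 1.0
print(vqa_accuracy("blue", humans))    # 0.0
```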

8.
Game appropriation is currently not well conceptualized. What literature does exist (Griffiths & Light, 2008; Lowood, 2005; Postigo, 2008; Stalker, 2005) uses the term primarily to denote gamers' practices beyond the designers' original intentions, for instance, game content modifications. This article frames game appropriation in a different manner; unlike existing appropriation models, game appropriation is conceptualized as a motivational process underpinned by three primary factors: game design characteristics, social interaction, and the psychological characteristics of the gamer. The main contribution of this article is the development of the first model of game appropriation, the game appropriation model (GAM). GAM explains the process of digital games' incorporation into gamers' daily practices as well as the nature of their gameplay. Game appropriation recognizes the online–offline continuity; it contributes to understanding gameplay as a long-term, dynamic activity, directly interrelated with a gamer's everyday life rather than a set of defined moments of participation.

9.
The notions of a grammar form and its interpretations were introduced to describe families of structurally related grammars. Basically, a grammar form G is a (context-free) grammar serving as a master grammar, and the interpretation operator defines a family of grammars, each structurally related to G. In this paper, a new operator yielding a family of grammars is introduced as a variant of the interpretation operator. There are two major results. The first is that the two operators commute. The second is that, for each grammar form G, the collection of all families of grammars generated from grammar forms G′ in the family of G is finite. Expressed otherwise, the second result is that for each grammar form G there is only a bounded number of grammar forms in the family of G, no two of which are strongly equivalent.

10.
Twitter (http://twitter.com) is one of the most popular social networking platforms. Twitter users can easily broadcast disaster-specific information, which, if effectively mined, can assist in relief operations. However, the brevity and informal nature of tweets pose a challenge to Information Retrieval (IR) researchers. In this paper, we successfully use word embedding techniques to improve ranking for ad hoc queries on microblog data. Our experiments with the ‘Social Media for Emergency Relief and Preparedness’ (SMERP) dataset provided at an ECIR 2017 workshop show that these techniques outperform conventional term-matching based IR models. In addition, we show that, for the SMERP task, our word embedding based method is more effective if the embeddings are generated from the disaster-specific SMERP data than when they are trained on the large social media collection provided for the TREC (http://trec.nist.gov/) 2011 Microblog track dataset.
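A minimal sketch of embedding-based ad hoc ranking in the spirit described above (toy tweets and gensim Word2Vec are assumptions, not the paper's pipeline): queries and tweets are represented as mean word vectors, ranked by cosine similarity, with embeddings trained on the in-domain collection itself.

```python
import numpy as np
from gensim.models import Word2Vec

# Toy in-domain corpus standing in for the SMERP disaster tweets (hypothetical data).
tweets = [
    "bridge collapsed near station volunteers needed",
    "medical camp set up at central school",
    "water supply disrupted in north district",
    "volunteers distributing food near collapsed bridge",
]
tokenized = [t.split() for t in tweets]

# Train embeddings on the disaster-specific collection itself, echoing the paper's
# finding that in-domain embeddings beat those trained on a generic collection.
model = Word2Vec(tokenized, vector_size=50, window=5, min_count=1, seed=1)

def embed(tokens):
    vecs = [model.wv[w] for w in tokens if w in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

def rank(query, docs):
    q = embed(query.split())
    scored = []
    for d in docs:
        v = embed(d.split())
        denom = np.linalg.norm(q) * np.linalg.norm(v)
        scored.append((float(q @ v / denom) if denom else 0.0, d))
    return sorted(scored, reverse=True)

for score, doc in rank("bridge damage help", tweets):
    print(f"{score:.3f}  {doc}")
```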

11.
In the era of big data, with massive sets of digital information of unprecedented volume being collected and/or produced in several application domains, it becomes more and more difficult to manage and query large data repositories. In the framework of the PetaSky project (http://com.isima.fr/Petasky), we focus on the problem of managing scientific data in the field of cosmology. The data we consider are those of the LSST project (http://www.lsst.org/). The overall size of the database that will be produced is expected to exceed 60 PB (LSST Data Challenge Handbook, 2012). In order to evaluate the performance of existing SQL-on-MapReduce data management systems, we conducted extensive experiments using data and queries from the area of cosmology. The goal of this work is to report on the ability of such systems to support large-scale declarative queries. We mainly investigated the impact of data partitioning, indexing, and compression on query execution performance.

12.
Abstract

This paper describes the importance of the XTS-AES encryption mode of operation and concludes with a new proof of the security of ciphertext stealing as used by XTS-AES. The XTS-AES mode is designed for encrypting data stored on hard disks where there is no additional space for an integrity field. Given this lack of space for an integrity field, XTS-AES builds on the security of AES by protecting the storage device from many dictionary and copy/paste attacks. The operation of the XTS mode of AES is defined in the IEEE 1619-2007 standard [3], and has been adopted by the U.S. National Institute of Standards and Technology (NIST) as an approved mode of operation under FIPS 140-2 [2]. XTS-AES builds on the XEX (Xor-Encrypt-Xor) mode originally proposed by Rogaway [8].
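For orientation, a short sketch of XTS-AES encrypting a single 512-byte disk sector with the Python cryptography package (an assumption; the tweak here is taken to be the logical sector number, and this aligned example does not exercise the ciphertext-stealing path analyzed in the paper, which only arises when the sector length is not a multiple of the AES block size):

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# XTS uses a double-length key: one half encrypts the data, the other half the tweak.
key = os.urandom(64)                  # 512-bit key => AES-256 in XTS mode
tweak = (42).to_bytes(16, "little")   # e.g. the logical sector number of the disk block

def xts_encrypt(plaintext: bytes) -> bytes:
    enc = Cipher(algorithms.AES(key), modes.XTS(tweak)).encryptor()
    return enc.update(plaintext) + enc.finalize()

def xts_decrypt(ciphertext: bytes) -> bytes:
    dec = Cipher(algorithms.AES(key), modes.XTS(tweak)).decryptor()
    return dec.update(ciphertext) + dec.finalize()

sector = os.urandom(512)  # one disk sector
assert xts_decrypt(xts_encrypt(sector)) == sector
print("round trip ok")
```

Because the tweak is derived from the sector position, identical plaintext stored in two different sectors encrypts to different ciphertext, which is what blocks the copy/paste attacks mentioned above.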

13.
This article is a slightly abridged, edited version of a final report detailing the background and implementation of a project that introduced electronic book (e-book) collections to Essex Public Libraries during 2004. The research considered e-book collections available for borrowing on a PDA (HP iPAQ) and collections downloadable onto the borrower's PDA or PC (OverDrive, ebrary). The project was sponsored by The Laser (London and South-Eastern Library Region) Foundation, a grant-making body founded in 2002 that has made a number of grants to public and academic libraries since 2003 (http://www.bl.uk/concord/laser-about.html). It was a partnership between the Department of Information Science at Loughborough University (http://www.lboro.ac.uk/dis/), Essex Public Libraries (Essex County Libraries operate 73 public libraries and mobile library services in south-eastern England), and Co-East, which manages the acquisition and provision of electronic resources for public libraries in eastern England, including the popular ‘Ask a Librarian’ service (http://www.co-east.net/). In addition to a discussion of the findings of the research, guidelines are offered to other public library authorities considering the adoption of e-book collections and mobile technology. Two articles based on this research have been published elsewhere, considering the evaluation of the iPAQ trials (Dearnley et al.) and the provision and uptake of the OverDrive and ebrary collections (Dearnley et al.).

14.
Abstract

GOST-R 34.11-94 is a Russian standard cryptographic hash function that was introduced in 1994 by the Russian Federal Agency for the purposes of information processing, information security, and digital signatures. Mendel et al. (2008) and Courtois and Mourouzis (2011) found attacks on the compression function of the GOST-R structure that were basically weaknesses of the GOST-R block cipher (GOST 28147-89). Hence in 2012 it was updated to GOST-R 34.11-2012, which replaced the older one for all its applications from January 2013. GOST-R 34.11-2012 is based on a modified Merkle-Damgård construction. Here we present a modified version of GOST-R 34.11-2012, the Modified GOST-R (MGR) hash. The design of the MGR hash is based on a wide-pipe construction, which is also a modified Merkle-Damgård construction. MGR is much more secure as well as three times faster than GOST-R 34.11-2012. AES-like block ciphers have been used in designing the compression function of MGR because the Advanced Encryption Standard (AES) is one of the most efficient and secure block ciphers and has been evaluated for more than 14 years. A detailed statistical analysis, along with a few other attacks on MGR, is incorporated into this paper.
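A toy illustration of the wide-pipe Merkle-Damgård idea underlying designs such as MGR (not the MGR compression function itself; SHA-512 stands in for the AES-like compression, and the block and state sizes are illustrative): the chaining state is kept wider than the final digest and is truncated only at the very end.

```python
import hashlib

BLOCK = 64   # message block size in bytes
WIDE = 64    # internal chaining state: wider than the 32-byte output ("wide pipe")

def toy_compress(state: bytes, block: bytes) -> bytes:
    # Stand-in compression function; a real design would use a dedicated
    # (here, AES-like) primitive rather than another hash.
    return hashlib.sha512(state + block).digest()[:WIDE]

def wide_pipe_hash(msg: bytes) -> bytes:
    # Merkle-Damgård strengthening: pad with 0x80, zeros, then the message length.
    length = len(msg)
    msg += b"\x80"
    msg += b"\x00" * ((-len(msg) - 8) % BLOCK)
    msg += length.to_bytes(8, "big")

    state = b"\x00" * WIDE  # fixed IV
    for i in range(0, len(msg), BLOCK):
        state = toy_compress(state, msg[i:i + BLOCK])

    # Wide-pipe finalization: truncate the wide internal state to the digest size.
    return state[:32]

print(wide_pipe_hash(b"abc").hex())
```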

15.
International Journal of Computer Mathematics, 2012, 89(18): 2562-2575
In this article, we extend a Milstein finite difference scheme introduced by Giles and Reisinger [8] for a certain linear stochastic partial differential equation (SPDE) to semi-implicit and fully implicit time-stepping as introduced by Szpruch [32] for stochastic differential equations (SDEs). We combine standard finite difference Fourier analysis for partial differential equations with the linear stability analysis of Buckwar and Sickenberger [3] for SDEs to analyse the stability and accuracy. The results show that Crank–Nicolson time-stepping for the principal part of the drift with a partially implicit but negatively weighted double Itô integral gives unconditional stability over all parameter values and converges with the expected order in the mean-square sense. This opens up the possibility of local mesh refinement in the spatial domain, and we show experimentally that this can be beneficial in the presence of reduced regularity at boundaries.
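For reference, the explicit Milstein scheme for a scalar SDE dX_t = a(X_t) dt + b(X_t) dW_t, and a drift-implicit (theta-weighted) variant of the kind discussed above, can be written as follows (a sketch for the SDE case only; the paper treats an SPDE with a partially implicit, negatively weighted double Itô integral):

```latex
X_{n+1} = X_n + a(X_n)\,\Delta t + b(X_n)\,\Delta W_n
        + \tfrac{1}{2}\, b(X_n)\, b'(X_n)\left[(\Delta W_n)^2 - \Delta t\right],

X_{n+1} = X_n + \bigl[\theta\, a(X_{n+1}) + (1-\theta)\, a(X_n)\bigr]\Delta t
        + b(X_n)\,\Delta W_n
        + \tfrac{1}{2}\, b(X_n)\, b'(X_n)\left[(\Delta W_n)^2 - \Delta t\right],
\qquad \theta = \tfrac{1}{2} \text{ (Crank--Nicolson)}.
```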

16.
Abstract

After Section 404 of the Sarbanes-Oxley Act (SOX 404) was released, developing a computer auditing system became more important for management and auditors. In this study, the researchers aim to: (1) explore the crucial control items of the purchasing and expenditure cycle in meeting the conditions of SOX 404; (2) develop a computer auditing system based on the recognized control items and requirements of SOX 404; and (3) validate the applicability of the system in meeting organizational needs by using the ISO/IEC 9126 model (ISO, 2001). Gowin's Vee research strategy, developed by Novak & Gowin (1984), was used in the study. On the theoretical side, the researchers identified eight operational procedures and 34 critical control items for the purchasing and expenditure cycle. The prototype computer auditing system of this study was then developed. On the experimental side, the researchers conducted two case studies based on the ISO/IEC 9126 software assessment criteria, the results of which showed that the system can provide company internal auditing personnel and their external auditors with a simple, continuous, timely, and analytical tool, which may promptly and effectively help in detecting problematic control issues. We believe this study can contribute to the development of a sufficient and manageable computer auditing system, and provide prospective researchers and businesses with future directions in this subject area.

17.
CIPHER EQUIPMENT     
Louis Kruh, Cryptologia, 2013, 37(2): 143-149
Abstract

In this article, a simplified version of the International Data Encryption Algorithm (IDEA) is described. This simplified version, like the simplified versions of DES [8-12] and AES [6, 7] that have appeared in print, is intended to help students understand the algorithm by providing a version that permits examples to be worked by hand. IDEA is a useful teaching tool to help students bridge the gap between DES and AES.
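The three algebraic operations that IDEA mixes, and that a simplified classroom version scales down to smaller word sizes so examples can be worked by hand, are easy to state in code. This sketch shows only the operations on 16-bit words, not the full cipher:

```python
MOD_ADD = 1 << 16          # word size: addition is taken modulo 2**16
MOD_MUL = (1 << 16) + 1    # multiplication is taken modulo the prime 2**16 + 1

def xor16(a, b):
    return a ^ b

def add16(a, b):
    return (a + b) % MOD_ADD

def mul16(a, b):
    # In IDEA's multiplication the all-zero word stands for 2**16,
    # so every word is invertible modulo the prime 2**16 + 1.
    a = a or MOD_ADD
    b = b or MOD_ADD
    r = (a * b) % MOD_MUL
    return 0 if r == MOD_ADD else r

# Hand-checkable examples of each operation:
print(hex(xor16(0x0123, 0x4567)))   # 0x4444
print(hex(add16(0xFFFF, 0x0002)))   # 0x1: wraps around modulo 2**16
print(hex(mul16(0x0000, 0x0002)))   # 0xffff: (2**16 * 2) mod 65537 = 65535
```

Mixing these three incompatible group operations is what gives IDEA its diffusion and confusion, and it is this interplay that the worked-by-hand version is meant to make visible.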

18.
ABSTRACT

Computer system security relies on different aspects of a computer system, such as security policies, security mechanisms, threat analysis, and countermeasures. This paper provides an ontological approach to capturing and utilizing the fundamental attributes of those key components to determine the effects of vulnerabilities on a system's security. Our ontology for vulnerability management (OVM) has been populated with all vulnerabilities in the NVD (see http://nvd.nist.gov/scap.cfm), together with additional inference rules and knowledge discovery mechanisms, so that it may provide a promising pathway to making the security automation program (NIST, 2007) more effective and reliable.

19.
As the volume of data generated each day continues to increase, more and more interest is put into Big Data algorithms and the insight they provide. Since these analyses require a substantial amount of resources, including physical machines, power, and time, reliable execution of the algorithms becomes critical. This paper analyzes the error resilience of a select group of popular Big Data algorithms and shows how they can effectively be made more fault-tolerant. Using KULFI (http://github.com/quadpixels/kulfi) and the LLVM compiler (CGO 2004) for compilation allows injection of artificial soft faults throughout these algorithms, giving a thorough analysis of how faults in different locations can affect the outcome of the program. This information is then used to help guide incorporating fault tolerance mechanisms into the program, making them as impervious as possible to soft faults.

20.
Data classification tasks often concern objects described by tens or even hundreds of features. Classification of such high-dimensional data is a difficult computational problem. Feature selection techniques help reduce the number of computations and improve classification accuracy.

In Michalak and Kwasnicka (2006a, b) we proposed a feature selection strategy that selects features in an individual or pairwise manner based on the assessed level of dependence between features. In the case of numerical features, this level of dependence can be expressed numerically using linear correlation coefficients. In this paper, the feature selection problem is addressed in the case of a mixture of nominal and numerical features. The feature similarity measure used in this case is based on the probabilistic dependence between features. This similarity function is used in an iterative feature selection procedure, which we proposed for selecting features prior to classification. Experiments show that using the probabilistic dependence similarity function along with the presented feature selection procedure can improve computation speed while preserving classification accuracy in the case of mixed nominal and numerical features.
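A minimal sketch of the pairwise, correlation-driven selection idea for purely numerical features (a greedy variant for illustration, not the authors' exact iterative procedure or their probabilistic dependence measure for mixed nominal and numerical data):

```python
import numpy as np

def select_by_correlation(X: np.ndarray, threshold: float = 0.9) -> list:
    """Greedy pairwise selection: whenever two features are strongly
    linearly dependent, keep only the first of the pair."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Hypothetical data: feature 1 is an almost exact linear copy of feature 0.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), rng.normal(size=200)])
print(select_by_correlation(X))   # [0, 2]: the near-duplicate feature is dropped
```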
