Similar literature
 20 similar documents found (search time: 0 ms)
1.
    
In certain bilingual and multi‐lingual societies, translated legal documents are as important as the original legal documents because they have the same legal status as the originals. However, there is little reported work on the retrieval and management of bilingual legal documents. We describe the design and development of a bilingual document retrieval and management prototype, called ELDoS, which is used by court interpreters and judges from the Hong Kong Judiciary. Since the speed of retrieval is a major concern for user acceptance, and therefore for widespread deployment of the system, the architecture of the prototype is designed to balance the workload of the client and server. Extensible Markup Language (XML) is used to mark up the bilingual legal documents for a variety of document retrieval and management tasks. XML enables the use of Extensible Stylesheet Language Transformations (XSLT) to align bilingual data in the client, instead of the server, and improve alignment speed linearly with respect to the size of the document, using a high‐end PC, when the server has no concurrent access. The design of the interface was continually improved after extensive consultation with court interpreters and after the user acceptance tests. In our evaluation, the facilities for highlighting translated terms have a macro‐averaged precision above 90% and a macro‐averaged recall above 80%, which were considered acceptable by our users. We believe that the experience in the design and development of this prototype is applicable to other language pairs as well as to other domains. Copyright © 2002 John Wiley & Sons, Ltd.
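The macro-averaged figures quoted in this abstract can be illustrated with a minimal sketch. It assumes that "macro-averaged" means precision and recall are computed per document and then averaged with equal weight per document; the function name, the document identifiers, and the sample terms are hypothetical and do not come from ELDoS.

```python
from typing import Dict, Set, Tuple


def macro_precision_recall(
    predicted: Dict[str, Set[str]],
    relevant: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Average per-document precision and recall, one equal vote per document."""
    if not relevant:
        return 0.0, 0.0
    precisions, recalls = [], []
    for doc_id, gold in relevant.items():
        # Terms the (hypothetical) highlighter marked in this document.
        found = predicted.get(doc_id, set())
        hits = len(found & gold)
        # An empty prediction or gold set contributes 0.0 instead of dividing by zero.
        precisions.append(hits / len(found) if found else 0.0)
        recalls.append(hits / len(gold) if gold else 0.0)
    n = len(relevant)
    return sum(precisions) / n, sum(recalls) / n


if __name__ == "__main__":
    # Invented sample data: highlighted terms versus correct terms per document.
    predicted = {"doc1": {"court", "judge", "appeal"}, "doc2": {"bail"}}
    relevant = {"doc1": {"court", "judge"}, "doc2": {"bail", "writ"}}
    precision, recall = macro_precision_recall(predicted, relevant)
    print(f"macro precision={precision:.3f}, macro recall={recall:.3f}")
```

Averaging per document keeps a single long document from dominating the score, which is one plausible reason to report macro rather than micro averages for a term-highlighting facility.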

2.
Applying EuroWordNet to Cross-Language Text Retrieval   Cited by: 1 (self-citations: 0; citations by others: 1)
We discuss ways in which EuroWordNet (EWN) can be used in multilingual information retrieval activities, focusing on two approaches to Cross-Language Text Retrieval that use the EWN database as a large-scale multilingual semantic resource. The first approach indexes documents and queries in terms of the EuroWordNet Inter-Lingual-Index, thus turning term weighting and query/document matching into language-independent tasks. The second describes how the information in the EWN database could be integrated with a corpus-based technique, thus allowing retrieval of domain-specific terms that may not be present in our multilingual database. Our objective is to show the potential of EuroWordNet as a promising alternative to existing approaches to Cross-Language Text Retrieval.
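The first approach in this abstract, indexing documents and queries by interlingual concepts rather than by words, can be sketched as follows. This is only an illustration under loud assumptions: the lexicon `TOY_ILI`, its concept identifiers, and the overlap score are invented here, and the sketch ignores word-sense disambiguation and the term weighting the abstract mentions. It does not use the real EuroWordNet database or its Inter-Lingual-Index records.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: (language, word) -> interlingual concept identifier.
TOY_ILI: Dict[Tuple[str, str], str] = {
    ("en", "court"): "ili:court",
    ("es", "tribunal"): "ili:court",
    ("en", "law"): "ili:law",
    ("es", "ley"): "ili:law",
    ("en", "judge"): "ili:judge",
    ("es", "juez"): "ili:judge",
}


def to_concepts(text: str, lang: str) -> Counter:
    """Map a text to a bag of concept identifiers; unknown words are skipped."""
    concepts: Counter = Counter()
    for word in text.lower().split():
        concept = TOY_ILI.get((lang, word))
        if concept is not None:
            concepts[concept] += 1
    return concepts


def overlap_score(query: Counter, doc: Counter) -> int:
    """Count concept occurrences shared by query and document."""
    return sum(min(count, doc[concept]) for concept, count in query.items())


def rank(query: str, query_lang: str,
         docs: List[Tuple[str, str, str]]) -> List[Tuple[str, int]]:
    """Rank (doc_id, text, lang) triples against a query in any covered language."""
    q = to_concepts(query, query_lang)
    scored = [(doc_id, overlap_score(q, to_concepts(text, lang)))
              for doc_id, text, lang in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = [("d1", "el juez aplica la ley", "es"),
            ("d2", "the weather is fine", "en")]
    print(rank("judge law", "en", docs))
```

Once both sides are expressed as concept identifiers, the matching step no longer needs to know which language either text was written in, which is the property the abstract highlights.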

3.
Legal text retrieval traditionally relies upon external knowledge sources such as thesauri and classification schemes, and accurate indexing of the documents is often done manually. As a result, not all legal documents can be effectively retrieved. However, a number of current artificial intelligence techniques are promising for legal text retrieval. They support the acquisition of knowledge and the knowledge-rich processing of the content of document texts and information needs, and of their matching. Currently, techniques for learning information needs, learning concept attributes of texts, information extraction, text classification and clustering, and text summarization need to be studied in legal text retrieval because of their potential for improving retrieval and decreasing the cost of manual indexing. The resulting query and text representations are semantically much richer than a set of key terms. Their use allows for more refined retrieval models in which some reasoning can be applied. This paper gives an overview of the state of the art of these innovative techniques and their potential for legal text retrieval.

4.
Considerable attention has been given to the accessibility of legal documents, such as legislation and case law, in legal information retrieval (query formulation, search algorithms), in legal information dissemination practice (numerous examples of on-line access to formal sources of law), and in legal knowledge-based systems (by translating the contents of those documents to ready-to-use rule and case-based systems). However, within AI & law, there has hardly been any attempt to make the contents of sources of law, and the relations among them, more accessible to those without a legal education. This article presents a theory about translating sources of law into information accessible to persons without a legal education. It illustrates the theory by providing two elaborated examples of such translation ventures. In the first example, formal sources of law in the domain of exchanging police information are translated into rules of thumb useful for policemen. In the second example, the goal of providing non-legal professionals with insight into legislative procedures is translated into a framework for making available sources of law through an integrated legislative calendar. Although the theory itself does not support automating the several stages described, in this article some hints are given as to what such automation would have to look like.
Laurens Mommers

5.
Legal knowledge based systems (KBSs) are, by definition, grounded on law. Very often the relevant law is subject to routine amendment and repeal, such changes occurring at irregular and unpredictable intervals. These systems are thus particularly affected by significant problems of adaptation, a fact which has limited their practical take-up. If they are to be of more practical use, the maintenance issues associated with these systems must be taken seriously. In this paper we discuss the issues associated with the maintenance of legal KBSs and describe a suite of maintenance tools designed to address these issues.

6.
Isomorphism and legal knowledge based systems   Cited by: 1 (self-citations: 1; citations by others: 0)
This paper discusses some engineering considerations that should be taken into account when building a knowledge based system, and recommends isomorphism, the well defined correspondence of the knowledge base to the source texts, as a basic principle of system construction in the legal domain. Isomorphism, as it has been used in the field of legal knowledge based systems, is characterised and the benefits which stem from its use are described. Some objections to and limitations of the approach are discussed. The paper concludes with a case study giving a detailed example of the use of the isomorphic approach in a particular application.

7.
To help design an environment in which professionals without legal training can make effective use of public sector legal information on planning and the environment – for Add-Wijzer, a European e-government project – we evaluated their perceptions of usefulness and usability. In concurrent think-aloud usability tests, lawyers and non-lawyers carried out information retrieval tasks on a range of online legal databases. We found that non-lawyers reported twice as many difficulties as those with legal training (p = 0.001), that the number of difficulties and the choice of database affected successful completion, and that the non-lawyers had surprisingly few problems understanding legal terminology. Instead, they had more problems understanding the syntactical structure of legal documents and collections. The results support the constraint attunement hypothesis (CAH) of the effects of expertise on information retrieval, with implications for the design of systems to support the effective understanding and use of information.

8.
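A Fuzzy Information Retrieval Model Based on Multiple Related Ontologies   Cited by: 1 (self-citations: 0; citations by others: 1)
Yu Yangxin (俞扬信). Computer Engineering (《计算机工程》), 2010, 36(20): 68-70
Based on concepts and the semantics between concepts, this paper proposes a fuzzy information retrieval model built on multiple related ontologies, in which ontology relations are used to represent fuzzy relations. It describes the processing procedure and retrieval mechanism of the ontology-based retrieval model, discusses the retrieval effectiveness and impact of applying different types of ontologies, and evaluates the model with the TREC evaluation methodology. The results show that the model has good overall performance and can improve the retrieval results that users need.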

9.
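With the development of the market economy, the Internet has become the main channel through which people disseminate information. The Internet is a double-edged sword, however: online information can bring people great convenience, but it can also do harm. Online information must therefore be brought under the rule of law, so that it becomes a dissemination channel people can trust. Starting from the problems in building a legal framework for online information, this paper proposes targeted countermeasures.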

10.
This paper presents a four layer model for working with legal knowledge in expert systems. It distinguishes five sources of knowledge. Four contain basic legal knowledge found in published and unpublished sources. The fifth consists of legal metaknowledge. In the model the four basic legal knowledge sources are placed at the lowest level. The metaknowledge is placed at levels above the other four knowledge sources. The assumption is that the knowledge is represented only once. The use of metaknowledge at various levels should make it possible to use the appropriate knowledge for the problem presented to the system. The knowledge has to be represented as closely to the original format as possible for this purpose. Suitable representation formalisms for the various types of knowledge in the five knowledge sources are discussed. It is not possible to indicate a best representation formalism for each knowledge source.

11.
Legal Information Retrieval (IR) research has stressed the fact that legal knowledge systems should be sufficiently capable to interpret and handle the semantics of a database. Modeling (expert-) knowledge by using ontologies enhances the ability to extract and exploit information from documents. This contribution presents theories, ideas and notions regarding the development of dynamic electronic commentaries based on a comprehensive legal ontology. This article is based on a paper presented at the LOAIT Workshop in conjunction with the ICAIL Conference 2005, Bologna.

12.
In this article we describe two core ontologies of law that specify knowledge that is common to all domains of law. The first one, FOLaw, describes and explains dependencies between types of knowledge in legal reasoning; the second one, the LRI-Core ontology, captures the main concepts in legal information processing. Although FOLaw has been shown to be of high practical value in various applied European ICT projects, its reuse is rather limited, as it is concerned with the structure of legal reasoning rather than with legal knowledge itself: like many other “legal core ontologies”, FOLaw is therefore an epistemological framework rather than an ontology. Therefore, we also developed LRI-Core. As we argue here that legal knowledge is based to a large extent on common-sense knowledge, LRI-Core is particularly inspired by research on abstract common-sense concepts. The main categories of LRI-Core are physical, mental and abstract concepts. Roles cover in particular social worlds. Another special category is occurrences: terms that denote events and situations. We illustrate the use of LRI-Core with an ontology for Dutch criminal law, developed in the e-Court European project.

13.
Written laws, records and legal materials form the very foundation of a democratic society. Lawmakers, legal scholars and everyday citizens alike need, and are entitled, to access the current and historic materials that comprise, explain, define, critique and contextualize their laws and legal institutions. The preservation of legal information in all formats is imperative. Thus far, the twenty-first century has witnessed unprecedented mass-scale acceptance and adoption of digital culture, which has resulted in an explosion in digital information. However, digitally born materials, especially those that are published directly and independently to the Web, are presently at an extremely high risk of permanent loss. Our legal heritage is no exception to this phenomenon, and efforts must be put forth to ensure that our current body of digital legal information is not lost. The authors explored the role of the United States law library community in the preservation of digital legal information. Through an online survey of state and academic law library directors, it was determined that those represented in the sample recognize that digitally born legal materials are at high risk for loss, yet their own digital preservation projects have primarily focused upon the preservation of digitized print materials, rather than digitally born materials. Digital preservation activities among surveyed libraries have been largely limited by a lack of funding, staffing and expertise; however, these barriers could be overcome by collaboration with other institutions, as well as participation in a large-scale regional or national digital preservation movement, which would allow for resource-sharing among participants. One such collaborative digital preservation program, the Chesapeake Project, is profiled in the article and explored as a collaborative effort that may be expanded upon or replicated by other institutions and libraries tackling the challenges of digital preservation.  

14.
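This paper proposes SMIPP, a model of a semantics-based multilingual information processing platform oriented toward natural language processing. The model consists mainly of an application/user interface layer, a text input layer and a text output layer, an information processing service layer, a corpus layer, a layer for the multilingual code system SemaCode, and a language Ontology layer. The platform represents the scripts of all languages uniformly in self-describing SemaCode, uses the language Ontology to represent the semantics of words and the links between them across languages, and provides a range of corpus-based text processing functions as services. It is a completely new model for multilingual information processing.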

15.
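Cross-Language Information Retrieval Revisited   Cited by: 7 (self-citations: 1; citations by others: 6)
One major obstacle to the wide sharing of Internet resources around the world is the multilingual problem, and cross-language information retrieval is one effective way of addressing it. Starting from a definition of a cross-language information retrieval system, this paper presents a standard system framework and evaluation method, re-examines the mainstream research approaches, identifies the core problems that cross-language information retrieval must solve, and finally, through an analysis of the current state of research, suggests likely key directions for future work.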

16.
    
Abstract: Vast amounts of medical information reside within text documents, so that the automatic retrieval of such information would certainly be beneficial for clinical activities. The need to overcome the bottleneck caused by the manual construction of ontologies has generated several studies and research on semi-automatic methods for building ontologies. Most techniques for learning domain ontologies from free text have important limitations. In particular, they extract concepts in such a way that generally only taxonomies are produced, although other types of semantic relations are also relevant in knowledge modelling. This paper presents a language-independent approach for extracting knowledge from medical natural language documents. The knowledge is represented by means of ontologies that can have multiple semantic relationships among concepts.

17.
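Ontology-Based Semantic Retrieval of Legal Information   Cited by: 3 (self-citations: 0; citations by others: 3)
The vast amount of legal information on the Web, together with its ambiguity, makes accurate and efficient query and retrieval difficult, which in turn constrains the methods of judicial adjudication and decision-making. To better address the problems in judicial information retrieval, and drawing on research into domain ontology methods and Semantic Web technologies in China and abroad, a case-oriented prototype for semantic retrieval of legal information is built using the concept of ontology, offering a useful reference for knowledge management and information retrieval in the legal domain.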

18.
    
Digitalization has changed the way information is processed, and new techniques of legal data processing are evolving. Text mining helps to analyze and search court cases available as digital text documents in order to extract case reasoning and related data. This sort of case processing helps professionals and researchers to refer to previous cases with more accuracy in less time. The rapid development of judicial ontologies seems to offer interesting solutions for legal knowledge formalization. Mining context information through ontologies from corpora is a challenging and interesting field. This research paper presents a three-tier contextual text mining framework through ontologies for judicial corpora. The framework comprises the judicial corpus, text mining processing resources and ontologies for mining contextual text from corpora, to make text and data mining more reliable and fast. A top-down ontology construction approach has been adopted in this paper. The judicial corpus has been selected with a sufficient dataset to process and evaluate the results. The experimental results and evaluations show significant improvements in comparison with the available techniques.

19.
Kwong, Linus W.; Ng, Yiu-Kai. World Wide Web, 2003, 6(3): 281-303
To retrieve Web documents of interest, most Web users rely on Web search engines. All existing search engines provide a query facility for users to search for the desired documents using search-engine keywords. However, when a search engine retrieves a long list of Web documents, the user might need to browse through each retrieved document in order to determine which document is of interest. We observe that there are two kinds of problems involved in the retrieval of Web documents: (1) an inappropriate selection of keywords specified by the user; and (2) poor precision in the retrieved Web documents. In solving these problems, we propose an automatic binary-categorization method that is applicable for recognizing multiple-record Web documents of interest, which appear often in advertisement Web pages. Our categorization method uses application ontologies and is based on two information retrieval models, the Vector Space Model (VSM) and the Clustering Model (CM). We analyze and cull Web documents to just those applicable to a particular application ontology. The culling analysis (i) uses CM to find a virtual centroid for the records in a Web document, (ii) computes a vector in a multi-dimensional space for this centroid, and (iii) compares the vector with the predefined ontology vector of the same multi-dimensional space using VSM, in which we consider the magnitudes of the vectors as well as the angle between them. Our experimental results show that we have achieved an average of 90% recall and 97% precision in recognizing Web documents belonging to the same category (i.e., domain of interest). Thus our categorization discards very few documents it should have kept and keeps very few it should have discarded.
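Steps (i) to (iii) of the culling analysis can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' method: the centroid is taken as the plain mean of the record vectors, the angle is measured by cosine similarity, the magnitude check is a simple ratio of vector norms, and the thresholds `min_cosine` and `min_magnitude_ratio` are hypothetical values chosen for the example.

```python
import math
from typing import List, Sequence


def centroid(records: Sequence[Sequence[float]]) -> List[float]:
    """Mean vector of the records in one Web document (the 'virtual centroid')."""
    n = len(records)
    return [sum(column) / n for column in zip(*records)]


def norm(vector: Sequence[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def matches_ontology(
    records: Sequence[Sequence[float]],
    ontology_vector: Sequence[float],
    min_cosine: float = 0.9,           # hypothetical angle threshold
    min_magnitude_ratio: float = 0.5,  # hypothetical magnitude threshold
) -> bool:
    """Binary decision: keep the document only if angle and magnitude both fit."""
    if not records or norm(ontology_vector) == 0.0:
        return False
    c = centroid(records)
    magnitude_ratio = norm(c) / norm(ontology_vector)
    return cosine(c, ontology_vector) >= min_cosine and magnitude_ratio >= min_magnitude_ratio


if __name__ == "__main__":
    ontology = [2.0, 1.0, 0.0]  # invented ontology vector over three dimensions
    print(matches_ontology([[2.0, 1.0, 0.0], [2.0, 1.0, 0.0]], ontology))
    print(matches_ontology([[0.0, 0.0, 3.0]], ontology))
    print(matches_ontology([[0.2, 0.1, 0.0]], ontology))
```

Checking magnitude in addition to angle means a document whose records point in the right direction but barely mention the ontology's terms is still rejected, which matches the abstract's remark that both quantities are considered.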

20.
This paper describes our work on developing a language-independent technique for discovery of implicit knowledge from multilingual information sources. Text mining has been gaining popularity in the knowledge discovery field, particularly with the increasing availability of digital documents in various languages from all around the world. However, most current text mining tools focus mainly on processing monolingual documents (particularly English documents): little attention has been paid to applying the techniques to documents in Asian languages, or to extending the mining algorithms to support multilingual information sources. In this work, we attempt to develop a language-neutral method to tackle the linguistic difficulties in the text mining process. Using a variation of automatic clustering techniques that applies a neural net approach, namely Self-Organizing Maps (SOM), we have conducted several experiments to uncover associated documents based on a Chinese corpus, Chinese-English bilingual parallel corpora, and a hybrid Chinese-English corpus. The experiments show some interesting results and a couple of potential paths for future work in the field of multilingual information discovery. In addition, this work is expected to act as a starting point for exploring the impact of linguistic issues on the machine-learning approach to mining sensible linguistic elements from multilingual text collections.
