Similar articles
20 similar articles found
1.
A large number of internet users share their knowledge and opinions in online social networks such as forums and weblogs. This has attracted many researchers from different fields to study online social networks. Persian is one of the dominant languages in the Middle East and an official language of Iran, Afghanistan and Tajikistan, so a large number of Persian speakers are active in online social networks. Despite this, very few studies exist on Persian social networks. In this paper we study the characteristics of Persian bloggers based on a new collection, named irBlogs. The collection contains nearly 5 million posts and the network of more than 560,000 Persian bloggers, which supports the reliability of the results of this study. The analyzed characteristics include the similarities and differences between formal Persian and the language style used by Persian bloggers, the interests of the bloggers, and the impact of other web resources on the Persian blogosphere. Our analysis shows that IT, sports, society, culture and politics are the main interests of Persian bloggers. Analysis of the links shared by Persian bloggers also shows that news agencies, knowledge bases and other social networks have a great impact on Persian bloggers, and that bloggers are interested in sharing multimedia content.

2.
3.
4.
5.
This paper describes a project to write a simulator for the native mode text editor of one computer, which could be run on a different computer. This was done to give a user of both computers a common editing language. The method of design and construction is presented together with brief details of the syntax and semantics of the editing language.

6.
A framework for fast text analysis, developed as part of the Texterra project, is described. Texterra provides a scalable solution for fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. The developed tools, details of the project, use cases, and evaluation results are presented.

7.
Conclusion. While this system does not supply all of the features we might wish, it does greatly facilitate many of the tasks that humanists and language analysts often perform. Its general flexibility should enable the user to design and implement procedures that fit his analytic model more precisely and sensitively than is generally possible. The results, hopefully, will be a greater number of computer-assisted works that are of definite substantive importance within the user's discipline.

8.
CHEF is an interactive text editor for use with both printing and display terminals. Its prime field of application is computer source program editing but it has some word processing capabilities that make it useful for documentation and general text editing work. There is a comprehensive set of whole-line operations, including block moves and the insertion of text from external files, together with substring replacement and line segmentation based on a flexible pattern-matching algorithm. There is a set of one-line buffers for temporary storage of lines or command strings and complex command sequences can be built up by macro substitution. Considerable effort has been made to design a command syntax that is flexible and consistent and, at the same time, minimizes effort during the editing process. CHEF copies the user's file into an internal work-space so that the original is not disturbed until the user is satisfied with the results of the editing session. A virtual memory technique is used to provide a work-space that can be almost any desired size, with random access to any part and efficient editing operations. CHEF is written in BCPL and has already been implemented on four different machines.

9.
J. Wellington. (2000). Teaching and learning secondary science: Contemporary issues and practical approaches. London: Routledge. ISBN 0–4152–1403–3

10.
Millions of handwritten bank cheques are processed manually every day in banks and other financial institutions all over the world. Replacing manual cheque processing with an automatic cheque reader system saves time and processing cost. In recent years, systems such as A2iA have been built to automate the processing of Latin cheques. Normally, these systems are based on standard cheque structures such as Check 21 in the USA or Check 006 in Canada. Traditional (currently used) Persian bank cheques have major problems that lead to low accuracy and high computational cost in automatic processing. To solve these problems, this paper presents a novel structure for Persian handwritten bank cheques. The importance and superiority of this new structure are shown by several experiments on a database of cheques that we created based on the new structure. The database includes 500 handwritten bank cheques. Experimental results verify the usefulness of the new structure for the automatic processing of Persian handwritten bank cheques, for which it provides a standard guideline comparable to Check 21 or Check 006.

11.
While operating system command languages have improved in recent years, the advances have not yet been widely applied to other command interpreters. This paper describes an editor that has been given two features popular in operating system command languages: i/o redirection and programmable command files. The result is suited both to editing and to some repetitive reformatting tasks often solved by one-shot, ad hoc programs. Examples display the utility of the extensions, and implications for still other command interpreters are discussed.

12.
13.
A simple yet flexible method of editing text is described which is applicable to all forms of character based command processing applications. The technique greatly increases the friendliness of a text driven interface but does not interfere with most existing command conventions; it can also be generalised to form the basis of a powerful and easy to use text editor. This paper describes the details and basic philosophy of the editing interface and describes its successful use in two applications (command processor and calculator) which are not normally associated with text editing requirements.

14.
The natural distribution of textual data used in text classification is often imbalanced. Categories with fewer examples are under-represented and their classifiers often perform far below a satisfactory level. We tackle this problem using a simple probability-based term weighting scheme to better distinguish documents in minor categories. This new scheme directly utilizes two critical information ratios, i.e. relevance indicators. Such relevance indicators are well supported by probability estimates which embody the category membership. Our experimental study, using both Support Vector Machines and Naïve Bayes classifiers and an extensive comparison with other classic weighting schemes over two benchmark data sets, including Reuters-21578, shows significant improvement for minor categories, while the performance for major categories is not jeopardized. Our approach suggests a simple and effective way to boost the performance of text classification over skewed data sets.

15.
It is well known that the classification effectiveness of a text categorization system is not simply a matter of learning algorithms. Text representation factors are also at work. This paper considers the ways in which the effectiveness of text classifiers is linked to five text representation factors: “stop words removal”, “word stemming”, “indexing”, “weighting”, and “normalization”. Statistical analyses of experimental results show that performing “normalization” can always significantly improve the effectiveness of text classifiers. The effects of the other factors are not as great as expected. Contrary to common sense, a simple binary indexing method can sometimes be helpful for text categorization.
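
To make the representation factors in this abstract concrete, here is a minimal Python sketch of three of them: stop-word removal, binary versus term-frequency indexing, and L2 normalization. It is an illustration only, not the paper's experimental setup; the function name `represent` and the tiny stop-word list are assumptions made for this example, and word stemming is omitted.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"the", "a", "of", "is", "and"}


def represent(text, binary=False, normalize=True):
    """Turn a text into a sparse term vector (dict of term -> weight)."""
    # Stop-word removal on lowercased, whitespace-separated tokens.
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    counts = Counter(tokens)
    # Binary indexing records presence only; otherwise use raw term frequency.
    vec = {t: (1.0 if binary else float(c)) for t, c in counts.items()}
    if normalize:
        # L2 normalization so that document length does not dominate.
        norm = math.sqrt(sum(v * v for v in vec.values()))
        if norm > 0:
            vec = {t: v / norm for t, v in vec.items()}
    return vec


if __name__ == "__main__":
    print(represent("the cat and the cat", binary=True, normalize=False))
    print(represent("cat cat dog", normalize=False))
```

Toggling `binary` and `normalize` on the same text shows how the same document can yield quite different feature vectors before any learning algorithm is applied.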

16.
The concept of variation is essential in geometric design. It is surprising that very different patterns may be variations of the same model. We define two families of pentagonal patterns with three kinds of variation, and give some suggestions on how to analyse these patterns and create in this style. We then search for self-similar systems in a strict sense. Although they result from a systematic search, the two solutions proposed here can also generate some traditional 2-level patterns. In searching for subdivisions of the tiles into rhombuses, we found two solutions. Both can be compatible with the Binary Tiling (not with the Penrose Tiling). Then, using the concept of X-Tiles defined in a previous paper (Castera et al., http://castera.net/entrelacs/public/articles/Flying_Patterns.pdf, 2011), we find a new relationship between the two families of pentagonal patterns. In the last chapter we show and comment on some examples taken from traditional architecture in Iran, and infer a self-similar system for patterns with interlaces from a 2-level tiling in Isfahan. This paper reflects the point of view of a pattern designer.

17.
18.
James Sneeringer 《Software》1978,8(5):543-557
This paper describes the design of the user interface of a text editor named Occam with the goal of communicating by example the process of user-interface design. An attempt is made to induce principles; where that attempt fails, the raw details are presented. First Occam itself is described, and then aspects of its user interface are used to exemplify (1) power versus ease of learning, (2) the use of prototypes and user feedback, (3) the importance of planning and (4) error detection and handling.

19.
Automatic text summarization is an essential tool in this era of information overload. In this paper we present an automatic extractive Arabic text summarization system where the user can cap the size of the final summary. It is a direct system where no machine learning is involved. We use a two-pass algorithm: in pass one, we produce a primary summary using Rhetorical Structure Theory (RST); in pass two, we assign a score to each of the sentences in the primary summary. These scores help us generate the final summary. For the final output, sentences are selected with the objective of maximizing the overall score of the summary, whose size should not exceed the user-selected limit. We used Rouge to evaluate our system-generated summaries of various lengths against those written by a (human) news editorial professional. Experiments on sample texts show our system to outperform some of the existing Arabic summarization systems, including those that require machine learning.
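
The final selection step described in this abstract (maximize the total score without exceeding a size limit) can be read as a 0/1 knapsack problem. The Python sketch below illustrates that reading; the abstract does not say which selection algorithm the authors use, so the dynamic program, the function name `select_sentences`, the word-count size measure, and the example scores are all assumptions.

```python
def select_sentences(sentences, scores, max_words):
    """Choose sentences maximizing total score within a word budget.

    A 0/1 knapsack sketch: each sentence's "weight" is its word count.
    Returns the chosen sentences in original order and their total score.
    """
    lengths = [len(s.split()) for s in sentences]
    # best[w] = (total_score, chosen_indices) achievable with at most w words.
    best = [(0.0, ())] * (max_words + 1)
    for i, (length, score) in enumerate(zip(lengths, scores)):
        # Walk budgets downward so each sentence is used at most once.
        for w in range(max_words, length - 1, -1):
            candidate = best[w - length][0] + score
            if candidate > best[w][0]:
                best[w] = (candidate, best[w - length][1] + (i,))
    total, chosen = best[max_words]
    return [sentences[i] for i in sorted(chosen)], total


if __name__ == "__main__":
    # Hypothetical sentences and scores, for illustration only.
    sents = ["a b c", "d e", "f g h i", "j"]
    summary, total = select_sentences(sents, [3.0, 2.5, 4.0, 1.0], max_words=5)
    print(summary, total)
```

A greedy pick by score alone would take the four-word sentence first and leave room for little else; the exact search finds the better-scoring pair that fits the same budget.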

20.
In this paper, we present our attempts to design and implement a large-coverage computational grammar for the Persian language based on the Generalized Phrase Structure Grammar (GPSG) model. This grammatical model was developed for continuous speech recognition (CSR) applications, but is suitable for other applications that need syntactic analysis of Persian. In this work, we investigate various syntactic structures relevant to the modern Persian language, and then describe these structures according to a phrase structure model. Noun (N), Verb (V), Adjective (ADJ), Adverb (ADV), and Preposition (P) are considered basic syntactic categories, and X-bar theory is used to define Noun phrases, Verb phrases, Adjective phrases, Adverbial phrases, and Prepositional phrases. However, we have to extend Noun phrase levels in X-bar theory to four levels due to certain complexities in the structure of Noun phrases in the Persian language. A set of 120 grammatical rules for describing different phrase structures of Persian is extracted, and a few instances of the rules are presented in this paper. These rules cover the major syntactic structures of the modern Persian language. For evaluation, the obtained grammatical model is utilized in a bottom-up chart parser for parsing 100 Persian sentences. Our grammatical model can take 89 sentences into account. Incorporating this grammar in a Persian CSR system leads to a 31% reduction in word error rate.
