Similar literature
 Found 20 similar articles; search took 62 ms
1.
Based on a case study of the popular plagiarism detection service Turnitin, particularly its “Legal Document,” this article contends that plagiarism detection services should be viewed as digital archives. Services like Turnitin not only seek to regulate what constitute original texts and appropriate writing practices but also to advance conceptions of the work that archives should do in storing and circulating texts in digital spaces. This article concludes that the services we sometimes use to ensure the integrity of students’ texts can themselves be of questionable integrity, largely through the design of their archives. As increasing numbers of texts take digital form, the problems and promise of digital archives will demand thoughtful responses that do not rush to replace questionable writing and research practices with equally troubling pedagogical and archival ones. These thoughtful responses start with exploring the use of plagiarism detection service archival technology in unadvertised ways.

2.
In this essay, I analyze Turnitin.com, <http://www.turnitin.com>, as a form of anti-plagiarism therapy, demonstrating some of the ways the service maps identity and manages transgression in accordance with traditional values pertaining to authorship and intellectual property. I propose a broad-based approach to Turnitin.com that addresses the many historical, institutional, economic, cultural, and pedagogical factors informing current debates about plagiarism and plagiarism detection. In particular, I argue, first, that Turnitin.com reifies identity categories via plagiarism discourse disguised as educational content. Second, Turnitin.com socializes student writers toward traditional notions of textual normality and docility. And third, Turnitin.com represents a new phase in the bureaucratization of composition instruction consistent with past administrative practices and reflective of emerging corporate management alliances in higher education.

3.
Plagiarism occurs when content is copied without permission or citation. One contributing factor is that many text documents on the internet are easily accessed and copied. This paper introduces a plagiarism detection technique based on Semantic Role Labeling (SRL). The technique analyses and compares text based on the semantic role assigned to each term in a sentence. SRL is well suited to generating the semantic arguments of each sentence. The paper also introduces a weighting for each argument generated by SRL in order to study its behaviour. It was found that not all arguments affect the plagiarism detection process. In addition, experimental results on the PAN-PC-09 data sets showed that the method significantly outperforms modern plagiarism detection methods in terms of Recall, Precision and F-measure.

4.
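To address similarity computation and plagiarism detection between documents written in Uyghur, this paper proposes a content-based Uyghur plagiarism detection (U-PD) method. First, a preprocessing stage segments the Uyghur text into words, removes stop words, extracts stems, and replaces synonyms; stemming is implemented with an N-gram statistical model. Next, the BKDRhash algorithm computes a hash value for each text block, and these values form the hash fingerprint of the whole document. Finally, using the hash fingerprints, the RKR-GST matching algorithm matches each document against the document collection at the document, paragraph, and sentence levels to obtain document similarity, which is the basis for plagiarism detection. Experimental evaluation on Uyghur documents shows that the proposed method detects plagiarized documents accurately and is feasible and effective.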
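The fingerprinting step described in this abstract can be illustrated with a minimal sketch. The BKDR hash below uses the commonly cited multiplier 131; the block size, the 31-bit mask, and the helper names `fingerprint` and `resemblance` are assumptions made for illustration. The paper matches fingerprints with RKR-GST; the sketch swaps in a much simpler Jaccard overlap and is not the authors' implementation.

```python
def bkdr_hash(text: str, seed: int = 131, mask: int = 0x7FFFFFFF) -> int:
    """BKDR string hash: h = h * seed + byte, truncated to 31 bits."""
    h = 0
    for byte in text.encode("utf-8"):
        h = (h * seed + byte) & mask
    return h


def fingerprint(tokens: list[str], block_size: int = 3) -> set[int]:
    """Hash every run of `block_size` consecutive (already preprocessed) tokens."""
    if not tokens:
        return set()
    if len(tokens) < block_size:
        # Document shorter than one block: hash it as a single block.
        return {bkdr_hash(" ".join(tokens))}
    return {
        bkdr_hash(" ".join(tokens[i:i + block_size]))
        for i in range(len(tokens) - block_size + 1)
    }


def resemblance(a: set[int], b: set[int]) -> float:
    """Jaccard overlap of two fingerprints (a stand-in for RKR-GST matching)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical token lists standing in for stemmed, stop-word-free documents.
source = ["student", "submit", "essay", "online", "archive"]
suspect = ["student", "submit", "essay", "late", "archive"]
score = resemblance(fingerprint(source), fingerprint(suspect))
```

Running the same comparison on paragraph-level or sentence-level token lists would give the finer-grained matching levels the abstract mentions.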

5.
Readers of documents on CRT displays report difficulties in remembering whereabouts in a lengthy text they previously read something. Four experiments explore whether subdividing such texts, at appropriate thematic boundaries, into five successive coloured sections can aid readers' retrieval of information. Experiment 1, using texts presented on coloured paper, showed that this use of colour helped readers relocate information. Experiment 2 presented the same texts on a CRT, but variation in the colour of the characters on the screen did not help readers relocate information. Experiment 3 replicated the findings of experiment 2, with texts differing in both content and structure from those used previously. Experiment 4, again using coloured text on a CRT display, showed that giving readers a visible guide to the ordering of the coloured sections was not sufficient to restore the advantage that coloured pages had for texts presented on paper. The implications of these findings for variation in the background and foreground colouring of multi-window displays are discussed, but the main conclusion concerns the caution needed when transferring information design solutions across media.

6.
This study investigates writers enacting rhetorical invention within a non-academic digital environment. The data described and analyzed came from dating site participants who completed surveys about their composing processes and who provided profiles they had previously written for a dating site. In particular, the investigation considers the inventional choices writers make to represent themselves through discourse in this particular environment. Qualitative textual analysis led to the identification of these digital writers’ invention strategies along four dimensions: assessing self, assessing task, planning/composing text, and assessing interaction. The findings complicate our understanding of the relationship between rhetorical invention, audience, and impression management and suggest that rather than engaging in invention within the discovery versus creation binary, digital writers actually employ a spectrum of approaches to invention. Overall, this study suggests that audience is deeply connected to invention throughout the profile composing process: the participants expressed concerns about ethos, pathos, and the process of impression management within their writing, and these concerns all connected to audience in various ways.

7.
This article focuses on the conceptual issues faced by scholarly editors and textual studies specialists. Theoretical debate in this general field is still active as digital texts present special problems and magnify others. Older theory and methodology are hampered by unacknowledged, sometimes inappropriate cultural values and other limitations, and are not always useful in connection with digital texts. Nevertheless, the distinction between the abstract work and its concrete expression is influential both within and outside the field. In this approach, the concept of authenticity relates to the degree of change a work undergoes or the accuracy of the ‘instructions’ for its reconstitution. Whether the digital text is best thought of as immaterial or material is not as crucial as might first appear. The way a digital text is made visible is important, though potentially paradoxical. In order to be workable, the concept of authentication by instructions needs further technical assistance, like that provided by the Just-in-Time Markup System. But, despite its limitations, traditional textual scholarship still has much to offer textual studies in digital environments.

8.
We report two studies investigating readers' ability to allocate limited time adaptively across online texts of varying difficulty. In both studies participants were asked to learn about the human heart and were free to allocate time to 4 separate online texts about the heart but did not have enough time to read them all thoroughly. Of particular interest was whether readers attempted to select the best text for them (by sampling the texts before reading) or to monitor texts while reading them and continue reading any text judged good enough (a satisficing strategy). We argue that both strategies can be considered adaptive, depending on properties of readers, texts, and tasks. Experiment 1 tested readers with a range of background knowledge and allowed them either 7 or 15 min study time. It showed that participants were adaptive in how they allocated their time in that more knowledgeable readers spent more time reading more difficult texts. Satisficing was a much more common strategy than sampling. Experiment 2 showed that providing outline overviews of each text dramatically increased the number of participants using a sampling strategy so that it became the modal strategy. However, this change in strategy had no effect on learning. Outline overviews presumably changed readers' perception of the ease with which relevant dimensions of text quality can be judged.

9.
In this article, I consider the changing nature of publications in relation to technology and tenure, presenting a taxonomy of scholarly publications: online scholarship, scholarship about new media, and new media scholarship. I offer a focused definition of new media texts as ones that juxtapose semiotic modes in new and aesthetically pleasing ways and, in doing so, break away from print traditions so that written text is not the primary rhetorical means. By applying this definition to scholarly online publications, readers can be better prepared to recognize and interpret the meaning-making potential of aesthetic modes used in new media scholarly texts. I conclude by offering an analysis of a scholarly new media text, “Digital Multiliteracies.”

10.
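Research on text classification using clustering and rough sets   Total citations: 4 (self-citations: 0, citations by others: 4)
Text is the main form in which people encounter information, and managing text information efficiently is one of the important topics in text classification research. Building on the vector space model, this paper combines text clustering with attribute reduction from rough set theory and proposes a new text classification method. Experiments show that the method improves the efficiency of text classification.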

11.
People express their opinions about things like products, celebrities and services using social media channels. The analysis of these textual contents for sentiments is a gold mine for marketing experts as well as for research in humanities, thus automatic sentiment analysis is a popular area of applied artificial intelligence. The chief objective of this paper is to investigate automatic sentiment analysis on social media contents over various text sources and languages. The comparative findings of the investigation may give useful insights to artificial intelligence researchers who develop sentiment analyzers for a new textual source. To achieve this, we describe supervised machine learning based systems which perform sentiment analysis and we comparatively evaluate them on seven publicly available English and Hungarian databases, which contain text documents taken from Twitter and product review sites. We discuss the differences among these text genres and languages in terms of document- and target-level sentiment analysis.

12.
Netta Iivari 《AI & Society》2009,23(4):511-528
This paper outlines a critical, textual approach for the analysis of the relationship between different actors in information technology (IT) production, and further concretizes the approach in the analysis of the role of users in the open source software (OSS) development literature. Central concepts of the approach are outlined. The role of users is conceptualized as reader involvement aiming to contribute to the configuration of the reader (to how users and the parameters for their work practices are defined in OSS texts). Afterwards, OSS literature addressing reader involvement is critically reviewed. In the OSS context, the OSS writers as readers configure the reader, and other readers are assumed to be capable of and interested in commenting on the texts. A lack of OSS research on non-technical reader involvement is identified. Furthermore, not only are the OSS readers configured, but so are OSS writers. In the OSS context, while writers may be empowered, this clearly does not apply to the non-technical OSS readers. Implications for research and practice are discussed.

13.
Digital plagiarism is a problem for educators all over the world. There are many software tools on the market for uncovering digital plagiarism. Most of them can work only with text submissions. In this paper, we present a new architecture for a plagiarism detection tool that can work with many different kinds of digital submissions, from plain or formatted texts to audio podcasts. The open architecture is based on converting the digital submission into text form for processing by a plagiarism detection algorithm. To process non-text submissions, the system is extended with the appropriate converter. Such an open architecture makes the anti-plagiarism toolbox universal and easily adaptable for processing virtually any kind of digital submissions. This paper describes a software prototype based on the proposed architecture and presents the results of its implementation on a large archive of student papers.
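The open, converter-based architecture described in this abstract might be sketched as a small plug-in registry. Everything named here (`CONVERTERS`, `register`, `to_text`, the media-type keys, and the crude HTML stripper) is a hypothetical illustration rather than the prototype's actual design; the detection algorithm itself is out of scope.

```python
import re
from typing import Callable, Dict

Converter = Callable[[bytes], str]

# Registry mapping a submission kind to the converter that turns it into text.
CONVERTERS: Dict[str, Converter] = {}


def register(kind: str) -> Callable[[Converter], Converter]:
    """Decorator that plugs a converter into the registry under `kind`."""
    def wrap(fn: Converter) -> Converter:
        CONVERTERS[kind] = fn
        return fn
    return wrap


@register("text/plain")
def plain_text(data: bytes) -> str:
    return data.decode("utf-8", errors="replace")


@register("text/html")
def html_text(data: bytes) -> str:
    # Crude tag removal, good enough for a sketch; a real converter would parse HTML.
    return re.sub(r"<[^>]+>", " ", data.decode("utf-8", errors="replace"))


def to_text(kind: str, data: bytes) -> str:
    """Convert a submission to text so one detection algorithm can process it."""
    try:
        converter = CONVERTERS[kind]
    except KeyError:
        raise ValueError(f"no converter registered for {kind!r}") from None
    return converter(data)
```

Under this design, supporting a new submission type such as audio would mean registering one more converter (for example, a speech-to-text step) without touching the detection code.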

14.
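Jiang Xin (姜欣), Jiang Yi (姜怡), Fang Miao (方淼) 《计算机应用》 (Journal of Computer Applications) 2010,30(7):1938-1940
Algorithms make it possible to quantify detailed, explicit intertextual clues more rigorously, which matters for tracing connections between texts and for understanding and translating them. Taking tea classics as an example, this paper applies and compares four intertextuality measures (the Dice coefficient, the matching coefficient, all-confidence, and cosine) and gives an indexing method for text-assisted translation. The intertextuality degree of a text and the intertextuality matrix reveal the influence and connections between texts. Experimental results and performance analysis show that the cosine measure performs best, and that an intertextuality-based translation index provides a valuable reference for understanding and translating related texts more precisely.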
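For orientation, the four measures named in this abstract have common set-based textbook forms, sketched below. The paper's exact definitions are not given here, so these formulas are an assumption, and the function name `intertext_measures` and the sample term sets are hypothetical.

```python
import math


def intertext_measures(a: set[str], b: set[str]) -> dict[str, float]:
    """Common set-based forms of four association measures for term sets a and b."""
    if not a or not b:
        return {"matching": 0.0, "dice": 0.0, "all_confidence": 0.0, "cosine": 0.0}
    shared = len(a & b)
    return {
        # Matching coefficient: raw count of shared terms.
        "matching": float(shared),
        # Dice coefficient: 2|A ∩ B| / (|A| + |B|).
        "dice": 2 * shared / (len(a) + len(b)),
        # All-confidence: |A ∩ B| / max(|A|, |B|).
        "all_confidence": shared / max(len(a), len(b)),
        # Cosine (binary vectors): |A ∩ B| / sqrt(|A| * |B|).
        "cosine": shared / math.sqrt(len(a) * len(b)),
    }


# Hypothetical term sets extracted from two tea-related passages.
scores = intertext_measures({"tea", "water", "fire", "spring"}, {"tea", "water", "cup"})
```

Computing one such score for every pair of texts fills an intertextuality matrix of the kind the abstract refers to.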

15.
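Code plagiarism is a form of cheating that occurs frequently in programming language courses and seriously disrupts normal teaching. Detecting plagiarized program code and verifying the originality of students' programming assignments is therefore especially important in programming language teaching. Combining the attribute counting and structure metric techniques used in program code similarity detection, this paper proposes a similarity detection method for Python programs that can effectively compute the similarity between students' Python programming assignments.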
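One way to picture the combination of attribute counting and structure metrics is the sketch below, which uses only Python's standard `ast` and `difflib` modules. The choice of AST node types as the counted attributes, the equal weighting, and all function names are assumptions for illustration; the abstract does not specify the paper's actual features or formula.

```python
import ast
from collections import Counter
from difflib import SequenceMatcher


def node_types(source: str) -> list[str]:
    """Names of all AST node types in the program, in ast.walk order."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def attribute_similarity(a: str, b: str) -> float:
    """Attribute counting: overlap between the two node-type count profiles."""
    ca, cb = Counter(node_types(a)), Counter(node_types(b))
    total = sum((ca | cb).values())
    return sum((ca & cb).values()) / total if total else 1.0


def structure_similarity(a: str, b: str) -> float:
    """Structure metric: how well the two node-type sequences align."""
    return SequenceMatcher(None, node_types(a), node_types(b), autojunk=False).ratio()


def combined_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted mix of both scores; the 0.5 default is an arbitrary assumption."""
    return weight * attribute_similarity(a, b) + (1 - weight) * structure_similarity(a, b)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_all(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
unrelated = "print('hi')\n"
```

Because identifiers are ignored, renaming variables leaves both scores unchanged, which is the kind of disguise such a detector is meant to see through.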

16.
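Service description texts in the Internet of Things (IoT) are short and their features sparse, so modeling IoT services directly with a traditional topic model yields poor clustering and, in turn, fails to find the best service. To address this problem, an IoT service discovery method based on BTM is proposed. The method first uses BTM to mine the latent topics of existing IoT services and infers the service document-topic probability distribution from the global topic distribution and the topic-word distribution; it then clusters the services with the K-means algorithm and returns the best match for a service request. Analysis of the experimental results shows that the method effectively improves the clustering of IoT services and thus finds the best matching service. Compared with existing methods such as HDP (Hierarchical Dirichlet Process) and K-means-based latent Dirichlet allocation (LDA-K), the method achieves somewhat higher Precision and normalized discounted cumulative gain (NDCG) in best-service discovery.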
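Assuming BTM here refers to the biterm topic model, the inference step in this abstract can be sketched as follows: a short text is broken into unordered word pairs (biterms), and its topic distribution is obtained as P(z|d) = Σ_b P(z|b) P(b|d), with P(z|b) proportional to θ_z · φ_{z,w_i} · φ_{z,w_j}. The function names and the toy θ and φ values are hypothetical; in practice θ and φ would come from training (for example, Gibbs sampling), which is omitted, as is the K-means step.

```python
from collections import Counter
from itertools import combinations


def biterms(tokens: list[str]) -> list[tuple[str, str]]:
    """All unordered word pairs co-occurring in one short text."""
    return [(a, b) if a <= b else (b, a) for a, b in combinations(tokens, 2)]


def doc_topic(tokens: list[str], theta: list[float],
              phi: list[dict[str, float]]) -> list[float]:
    """P(z|d) from the global topic distribution theta and topic-word distribution phi."""
    counts = Counter(biterms(tokens))
    total = sum(counts.values())
    if total == 0:
        return list(theta)  # Fewer than two words: fall back to the global distribution.
    result = [0.0] * len(theta)
    for (wi, wj), n in counts.items():
        weights = [theta[z] * phi[z].get(wi, 0.0) * phi[z].get(wj, 0.0)
                   for z in range(len(theta))]
        norm = sum(weights)
        if norm == 0.0:
            continue  # Biterm contains a word unseen in every topic.
        for z, w in enumerate(weights):
            result[z] += (w / norm) * (n / total)
    return result


# Toy two-topic model (made-up numbers): topic 0 ~ sensing, topic 1 ~ access control.
theta = [0.5, 0.5]
phi = [
    {"temperature": 0.5, "sensor": 0.4, "door": 0.05, "lock": 0.05},
    {"temperature": 0.05, "sensor": 0.05, "door": 0.5, "lock": 0.4},
]
dist = doc_topic(["temperature", "sensor"], theta, phi)
```

The resulting per-service vectors are what a clustering algorithm such as K-means would then group.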

17.
Both traditional and computerized scholars face problems when they attempt empirical research on women writers and women readers using currently available computational tools. This essay discusses some factors that have inhibited empirical research; it develops its examples from work in progress on 18th century English poetry and on reader responses. A number of large linguistic and text databases are almost useless for research on women writers because works by women are either not included or represented by easily accessible, rather than editorially clean, texts. Traditional and contemporary reader response studies are also insufficiently empirical for reasons of sexual bias or flaws in research design.
Rosanne G. Potter is a Professor at Iowa State University, a teacher of drama and Women Studies, and editor of Literary Computing and Literary Criticism: Theoretical and Practical Essays on Theme and Rhetoric (University of Pennsylvania Press, 1989). She has published essays in CHum, Style, Modern Drama, and in a number of collections on humanities computing. She is currently building a large database containing the texts of, and reader responses to, ten modern plays.

18.
The automatic generation of summaries using cases (GARUCAS) environment was designed as an intelligent system to help one learn to summarize narrative texts by means of examples within a case-based reasoning (CBR) approach. Each example, modeled as a case, contains a conceptual representation of the initial textual state, the different steps of the summarization method, and the representation of the final textual state obtained. The CBR approach allows the environment to summarize new texts in order to produce new text summarization examples with respect to some predefined educational objectives. Within GARUCAS, this approach is used at two levels: an event level (EL) in order to identify essential elements of a story, and a clause level (CL) to make the summary more readable. The purpose of this article is to describe the GARUCAS environment and the model used to build story summarization examples and summarize new texts. This model is based on important psycholinguistic work concerning event and narrative structures and text revision rules. An experiment was conducted with 12 short stories. The GARUCAS environment can classify the stories according to their structural analogy and reuse the summarization method of the most similar text. Such an approach can be reused for any kind of text or summary type. © 2003 Wiley Periodicals, Inc.

19.
A dramatic work may be seen either as an event or as a text; the TEI guidelines make it possible to encode a dramatic work in either way, but do not attempt to solve the difficult problem of doing both at once. The basic element of a dramatic work, when seen as a text, is the speech; the guidelines also provide elements for encoding other familiar parts of dramatic texts (such as stage directions and cast lists), as well as for encoding analytic information on various aspects of texts and performances that is not normally included in printed dramatic texts. There are often other formal structures in dramatic works that intersect with the structure of speeches (metrical structures, for example); we discuss approaches for encoding these structures.
John Lavagnino is a graduate student in English and American Literature at Brandeis University. His fields of interest include Renaissance drama, modern literature, textual scholarship, and electronic textuality. He is Electronics Editor of The Collected Works of Thomas Middleton (forthcoming from Oxford University Press).
Elli Mylonas is a Lead Project Analyst for the Scholarly Technology Group at Brown University. Formerly she was the Managing Editor of the Perseus Project. Her areas of interest are Roman poetry, textual markup and SGML, and hypertext.
The work described in this paper is the outcome of the discussions of the Performance Working Group, whose members are Elli Mylonas (chair), Rosanne G. Potter, John Lavagnino, and Lou Burnard. The authors wish to thank the other two members for their contributions.

20.
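Wang Lijie (王立杰), Li Meng (李萌), Cai Sibo (蔡斯博), Li Ge (李戈), Xie Bing (谢冰), Yang Fuqing (杨芙清) 《软件学报》 (Journal of Software) 2012,23(6):1335-1349
As Web service technology has matured, a large number of public Web services have appeared on the Internet. When developing software systems with Web services, their textual descriptions (for example, introductions and usage instructions) help service consumers identify, understand, and use Web services directly and effectively. Most existing work obtains such information from a Web service's WSDL file for service discovery or retrieval, but a survey found that the WSDL files of most Web services on the Internet contain little or none of it. This paper therefore proposes a method, based on Web information search, that enriches the textual descriptions of Web services from information sources other than the WSDL file. Web pages containing the identifying features of the target Web service are collected from the Internet; from information fragments extracted from those pages, information retrieval techniques compute each fragment's relevance to the target Web service; and the text fragments with higher relevance are selected to enrich the service's textual description. Experiments on real data from the Internet show that relevant Web pages can be obtained for about 51% of the Web services on the Internet and that textual descriptions can be enriched for about 88% of those services. The collected Web services and their textual description data have been publicly released.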
