Similar literature
19 similar documents found (search time: 390 ms)
1.
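This paper analyzes problems in software maintenance and proposes a reverse engineering method for Web systems. The method takes a Web site as input, analyzes the source of its pages, and reverse-engineers the page links and interactions to build deployment and task models of the site, presenting the system information that helps maintainers in an intuitive form. A reverse engineering support tool developed by the authors is also introduced. With this method, maintainers can obtain system information directly without analyzing the source code, which overcomes the lack of design documentation. The method is demonstrated on a reverse engineering example for a Web site.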
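
The link-and-interaction analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' tool: the names `LinkCollector` and `build_site_model`, the sample pages, and the choice to treat `<a href>` as navigation and `<form action>` as interaction are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href targets of <a> tags and action targets of <form> tags."""

    def __init__(self):
        super().__init__()
        self.links = []  # navigation edges (hypothetical interpretation)
        self.forms = []  # interaction edges (hypothetical interpretation)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form" and attrs.get("action"):
            self.forms.append(attrs["action"])


def build_site_model(pages):
    """pages: dict mapping absolute URL -> HTML source.

    Returns a dict mapping each URL to its outgoing links and form targets,
    resolved to absolute URLs and restricted to the page's own host.
    """
    model = {}
    for url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        host = urlparse(url).netloc

        def resolve(targets):
            absolute = (urljoin(url, t) for t in targets)
            return sorted({t for t in absolute if urlparse(t).netloc == host})

        model[url] = {
            "links": resolve(collector.links),
            "forms": resolve(collector.forms),
        }
    return model


# Made-up sample pages, for illustration only.
pages = {
    "http://example.com/index.html":
        '<a href="login.html">Login</a> <a href="http://other.org/">Ext</a>',
    "http://example.com/login.html":
        '<form action="/session" method="post"></form>',
}
site_model = build_site_model(pages)
```

The resulting dictionary is a plain adjacency structure; a deployment or task model as described in the abstract would need further interpretation on top of it.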

2.
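This paper designs and implements a Web information retrieval system for a special user group with specific needs, using an intelligence-gathering strategy based on processing whole Web sites. The system first randomly samples pages from each site to identify the sites that contain pages with sensitive information, then collects the relevant pages from those sensitive sites into a local sensitive-resource repository, and analyzes the text pages in the repository with an improved TFIDF algorithm to answer user queries. The system improves the retrieval precision and the update-detection rate for Web page information, and can perform simple automatic classification of Web sites by topic.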
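
The abstract does not say how its TFIDF variant is improved, so the sketch below shows only the textbook weighting (term frequency times the logarithm of inverse document frequency) as a baseline. The function name `tf_idf` and the sample token lists are assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document.

    Standard TF-IDF, not the improved variant the abstract refers to.
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up token lists standing in for pages in the local repository.
docs = [
    ["web", "page", "retrieval"],
    ["web", "site", "classification"],
    ["web", "page", "sampling"],
]
scores = tf_idf(docs)
```

A term that occurs in every document (here `web`) gets weight zero, while terms unique to one document get the highest weight, which is the property a retrieval system relies on when ranking pages against a query.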

3.
Dai Dongbo (戴东波), Yin Jian (印鉴). Computer Science (《计算机科学》), 2006, 33(4): 126-129
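Existing static Web site structures cannot meet users' needs to find required information accurately and to enjoy personalized services. This paper mines Web log files to find users' frequent access paths and uses them to improve the site structure. It also analyzes the content relevance between the currently visited page and the candidate pages recommended next, producing content-trimmed personalized pages that compress Web page content. Users can thus quickly locate the follow-up pages they visit frequently, and most of the page content is topical information of interest to them. On this basis, an adaptive site model, AdaptiveSite, is proposed; an analysis of recommendation quality shows that the model has good optimization performance.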
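
Mining frequent access paths from Web logs can be sketched as counting contiguous page sequences across user sessions. This is a simplified stand-in, not the AdaptiveSite algorithm: the function `frequent_paths`, its parameters `length` and `min_support`, and the sample sessions are hypothetical, and the logs are assumed to be already split into per-user sessions.

```python
from collections import Counter


def frequent_paths(sessions, length=2, min_support=2):
    """sessions: list of page sequences, one per user session.

    Returns {path: support} for contiguous paths of the given length that
    occur in at least min_support sessions.
    """
    support = Counter()
    for session in sessions:
        # Count each path once per session so that support means
        # "number of sessions that contain this path".
        seen = {
            tuple(session[i:i + length])
            for i in range(len(session) - length + 1)
        }
        support.update(seen)
    return {path: n for path, n in support.items() if n >= min_support}


# Made-up sessions, as they might be reconstructed from a Web server log.
sessions = [
    ["/home", "/news", "/news/1"],
    ["/home", "/news", "/about"],
    ["/home", "/about"],
]
paths = frequent_paths(sessions)
```

Paths that pass the support threshold are candidates for shortcut links or recommended next pages.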

4.
A clustering method based on Web page links and tags   Total citations: 1, self-citations: 0, citations by others: 1
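To address the low efficiency and accuracy of current Web clustering, this paper proposes CWPBLT (clustering web pages based on their links and tags), a clustering method based on the link structure and tag information of Web pages. It compares the similarity between pages by analyzing their link structure and important tag information, and clusters the pages of a Web site accordingly. The clustering process takes into account both the page structure and the content information carried by page tags. Experimental results show that the method effectively improves both the time efficiency and the accuracy of clustering, and improves on earlier methods that cluster only on topical page content or only on page structure.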
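
The abstract does not give the CWPBLT similarity formula. One plausible way to combine link and tag information, shown here purely as an assumption, is a weighted sum of Jaccard similarities over each page's link set and tag set. The names `jaccard`, `page_similarity`, and `link_weight` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def page_similarity(page_a, page_b, link_weight=0.5):
    """Each page is a dict with a 'links' set and a 'tags' set.

    link_weight is an assumed tuning parameter that balances structure
    (links) against tag-derived content information.
    """
    link_sim = jaccard(page_a["links"], page_b["links"])
    tag_sim = jaccard(page_a["tags"], page_b["tags"])
    return link_weight * link_sim + (1 - link_weight) * tag_sim


# Two made-up pages that share a list link and most of their tags.
a = {"links": {"/list", "/detail/1"}, "tags": {"table", "h1", "form"}}
b = {"links": {"/list", "/detail/2"}, "tags": {"table", "h1"}}
similarity = page_similarity(a, b)
```

A pairwise similarity of this kind can feed any standard clustering procedure, for example hierarchical clustering over the resulting similarity matrix.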

5.
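To address the low accuracy of current Web clustering, this paper proposes a clustering algorithm based on the link structure of Web pages and the dominant color of the images they contain. It compares the similarity between pages by analyzing their link structure and the dominant color of the images displayed in them, and clusters the pages of a Web site accordingly. The clustering process takes into account both the page structure and the main color characteristics of the page. Experimental results show that the algorithm effectively improves clustering accuracy.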

6.
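Based on the data characteristics of current Web sites, this paper takes the position at which an information item appears in a page as the extraction path and, using PAT tree techniques, proposes an automatic information extraction model built on multi-agent cooperation. The model automatically analyzes the data features of sample pages, inductively learns the data patterns of the whole site, and generates extraction rules that guide subsequent extraction. Experimental results show that the model extracts structured information from Web pages with high efficiency.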

7.
Personalized Web information gathering based on a meta search engine   Total citations: 4, self-citations: 0, citations by others: 4
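To reduce the network resources consumed by traditional Web gathering systems and to strengthen their support for personalization, this paper combines a user interest vector model with meta search engine technology and applies them to Web information gathering, designing a personalized Web information gathering system based on a meta search engine. The system calls member search engines to discover target Web sites related to the user's interests, and uses crawlers to collect the Web page content on those sites. It is more targeted in discovering sites of interest and can effectively reduce the number of crawlers. The paper focuses on the system architecture and the workflow of personalized Web gathering, and concludes with application scenarios for the system.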

8.
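This paper studies Web application systems that use HTML as the file format, JavaScript as the client-side scripting language, and JSP as the server-side code. Building on an analysis of the shortcomings of existing methods for extracting Web application structure, it statically analyzes the source code of a Web application to obtain the directory structure and document types of the whole application, then extracts the main structural elements within each page and stores the resulting information in XML. By building and traversing the XML syntax tree, it extracts the main components and the relationships between them, and finally produces a system structure diagram of the Web application. This improves the efficiency of maintaining and evolving Web application systems and helps maintainers understand the application as a whole.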

9.
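Web information extraction usually relies on inductive learning, in which extraction rules are induced from specified template pages. Although this approach extracts information accurately, the rules must be obtained again whenever a site's template changes, so such extractors are costly to maintain and adapt poorly. To address this problem, this paper proposes an adaptable, DOM-tree-based method for extracting information from Web pages with multiple information blocks. The method first parses a Web page into a DOM tree with NekoHtml and then identifies the information blocks that contain the key phrases, thereby extracting the Web information. Experiments on a large number of Web sites show that the method works for information extraction across different sites and can extract information from Web pages with multiple information blocks.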

10.
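Domain-specific Web application development often produces multiple similar variant products through code-level reuse. As the number and complexity of these variants grow, analyzing and understanding their overall commonalities and differences becomes a key problem. To address it, this paper proposes a reverse analysis of variability based on the page flow of Web software. The proposed method produces page flow diagrams annotated with variability descriptions, helping developers understand the commonalities and differences in page flow among variant Web software products. The method has been implemented as a support tool with reverse analysis and graphical display functions, and a case study provides preliminary evidence of its effectiveness.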

11.
Chen Changchun (陈长春), Wang Zhaoshun (王昭顺). Computer Engineering and Design (《计算机工程与设计》), 2005, 26(5): 1256-1258, 1276
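Formal techniques provide a rigorous and complete theoretical foundation for software reverse engineering, but they are rarely applied in practice. This paper presents a concrete way of applying formal methods to reverse engineering: an initial practical application of the strongest postcondition technique to reverse-engineering an imperative language. The source program is abstracted in three stages to obtain a software structure specification whose correctness and consistency are rigorously guaranteed, and a concrete implementation method is given.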
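
For readers unfamiliar with the technique, the textbook strongest postcondition rules for a small imperative language are reproduced below. The abstract does not list the rules the authors use, so these standard definitions and the worked example are given only as background, with $P$ a precondition, $B$ a Boolean guard, and $v$ a fresh variable standing for the old value of $x$.

```latex
\begin{align*}
sp(x := e,\; P) &\equiv \exists v.\; \bigl(x = e[v/x]\bigr) \land P[v/x] \\
sp(S_1 ; S_2,\; P) &\equiv sp\bigl(S_2,\; sp(S_1, P)\bigr) \\
sp(\mathbf{if}\ B\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2,\; P)
  &\equiv sp(S_1,\; P \land B) \lor sp(S_2,\; P \land \lnot B) \\[6pt]
sp(x := x + 1,\; x \ge 0)
  &\equiv \exists v.\; x = v + 1 \land v \ge 0 \\
  &\equiv x \ge 1
\end{align*}
```

Applying these rules forward through a program yields, statement by statement, the strongest assertion that holds afterwards, which is what lets a specification be derived from source code rather than written by hand.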

12.
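Correctly extracting Web page titles is of great importance in Web text information extraction. This paper proposes a real-time method for extracting Web page titles. The method first parses directory-type pages in real time, then traverses their hyperlinks and uses the correspondence between titles and publication times to obtain the URLs and anchor texts listed on each directory-type page. If the anchor text obtained is not the title of the page body, the method fetches the HTML source of the topic-type page and builds its DOM tree. It then performs a depth-first traversal of the DOM tree, guided by the visual characteristics of page titles, to extract the body title correctly. Experimental results show that the proposed real-time title extraction method is simple to implement and highly accurate.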
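
The final step, a depth-first traversal of the DOM tree guided by visual cues, can be sketched as follows. The abstract does not specify which visual features are used; ranking heading tags by size (`TAG_SCORE`) is an assumption, and `Node`, `DomBuilder`, and `extract_title` are hypothetical names. The minimal DOM builder assumes reasonably well-nested HTML and ignores text split across inline child tags.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}
# Assumed visual weights: larger headings are more likely to be the title.
TAG_SCORE = {"h1": 3, "h2": 2, "h3": 1}


class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent = tag, parent
        self.children, self.text = [], ""


class DomBuilder(HTMLParser):
    """Builds a minimal DOM tree from HTML source."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def extract_title(html):
    """Returns the text of the highest-scoring heading, or None if there is none."""
    builder = DomBuilder()
    builder.feed(html)
    best_score, best_text = 0, None
    stack = [builder.root]
    while stack:  # iterative depth-first traversal in document order
        node = stack.pop()
        score = TAG_SCORE.get(node.tag, 0)
        if node.text and score > best_score:
            best_score, best_text = score, node.text
        stack.extend(reversed(node.children))
    return best_text


# Made-up topic-type page, for illustration only.
html = (
    "<html><body><div><h2>Site news</h2></div>"
    "<div><h1>Quarterly report released</h1><p>Body text.</p></div>"
    "</body></html>"
)
title = extract_title(html)
```

A fuller implementation would score nodes on more signals, such as font size, position on the page, or similarity to the anchor text found on the directory page.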

13.
There has been an ongoing trend toward collaborative software development using open and shared source code published in large software repositories on the Internet. While traditional source code analysis techniques perform well in single-project contexts, new types of source code analysis techniques are emerging that focus on global source code analysis challenges. In this article, we discuss how the Semantic Web can become an enabling technology that provides standardized, formal, and semantically rich representations for modeling and analyzing large global source code corpora. Furthermore, inference services and other services provided by Semantic Web technologies can be used to support a variety of core source code analysis techniques, such as semantic code search, call graph construction, and clone detection. In this paper, we introduce SeCold, the first publicly available online linked data source code dataset for software engineering researchers and practitioners. Along with its dataset, SeCold also provides some Semantic Web enabled core services to support the analysis of Internet-scale source code repositories. We illustrate through several examples how this linked data, combined with Semantic Web technologies, can be harvested for different source code analysis tasks to support software trustworthiness. For the case studies, we combine both our linked data set and Semantic Web enabled source code analysis services with knowledge extracted from StackOverflow, a crowdsourcing website. Through these case studies, we demonstrate that our approach is not only capable of crawling, processing, and scaling to traditional types of structured data (e.g., source code), but also supports emerging non-structured data sources, such as crowdsourced information (e.g., StackOverflow.com), to support a global source code analysis context.

14.
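This paper uses theorem proving and reverse engineering to verify database interaction behavior in Web applications. The requirements model is described with a Z specification, and the implementation model is derived from the database interaction source code and a set of transformation rules. Properties of the Web application are obtained from the implementation model, and the Z/EVES theorem prover checks whether these properties are satisfied by the Z specification of the requirements model. On this basis, a verification framework for the method is designed and a corresponding prototype system is developed. A library database management system is used as an example to demonstrate the effectiveness of the method.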

15.
Reverse engineering, also called reengineering, is used to modify systems that have functioned for many years, but which can no longer accomplish their intended tasks and, therefore, need to be updated. Reverse engineering can support the modification and extension of the knowledge in an already existing system. However, this can be an intricate task for a large, complex and poorly documented knowledge-based system. The rules in the knowledge base must be gathered, analyzed and understood, and also verified and validated. We introduce an approach that uses reverse engineering for the knowledge in knowledge-based systems. The knowledge is encapsulated in rules, facts and conclusions, and in the relationships between them. Reverse engineering also collects functionality and source code. The outcome of reverse engineering is a model of the knowledge base, the functionality and the source code connected to the rules. These models are presented in diagrams using a graphic representation similar to Unified Modeling Language and employing ontology. Ontology is applied on top of rules, facts and relationships. From the diagrams, test cases are generated during the reverse engineering process and adopted to verify and validate the system.

16.
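The topical information in a Web page is usually concentrated in one area, a property that can be exploited to extract it automatically. Because the HTML tags in page source are often malformed, forward matching has difficulty producing a DOM tree with accurate nesting. This paper proposes a reverse matching method for building a complete DOM tree from the page source. The DOM tree is pruned to delete irrelevant nodes, and the node tags of the remaining information blocks are manually selected and checked for uniqueness to generate an extraction template. The method can extract topical information from the source pages of e-commerce sites. It is a semi-automatic, general-purpose method that can be used for information gathering in information retrieval systems.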

17.
Querying source code is an essential aspect of a variety of software engineering tasks such as program understanding, reverse engineering, program structure analysis and program flow analysis. In this paper, we present and demonstrate the use of an algebraic source code query technique that blends expressive power with query compactness. The query framework of Source Code Algebra (SCA) permits users to express complex source code queries and views as algebraic expressions. Queries are expressed on an extensible, object-oriented database that stores program source code. The SCA algebraic approach offers multiple benefits such as an applicative query language, high expressive power, seamless handling of structural and flow information, clean formalism and potential for query optimization. We present a case study where SCA expressions are used to query a program in terms of program organization, resource flow, control flow, metrics and syntactic structure. Our experience with an SCA-based prototype query processor indicates that an algebraic approach to source code queries combines the benefits of expressive power and compact query formulation.

18.
Software reverse engineering is the process of analyzing a software system to extract the design and implementation details. Reverse engineering provides the source code of an application, insight into its architecture, and its third-party dependencies. From a security perspective, it is mostly used for finding vulnerabilities and attacking or cracking an application. The process is carried out either by obtaining the code in plaintext or by reading it through the binaries or mnemonics. Nowadays, reverse engineering is widely used for mobile applications and is considered a security risk. The Open Web Application Security Project (OWASP), a leading security research forum, has included reverse engineering in its top 10 list of mobile application vulnerabilities. Mobile applications are used in many sectors, e.g., banking, education, and health. In particular, banking applications are critical in terms of security as they are used for financial transactions. A security breach of such applications can result in huge financial losses for the customers as well as the banks. Various tools exist for reverse engineering of mobile applications; however, they have deficiencies, e.g., complex configurations and a lack of detailed analysis reports. In this research work, we perform an analysis of the available tools for reverse engineering of mobile applications. Our dataset consists of the mobile banking applications of the banks providing services in Pakistan. Our results indicate that none of the existing tools can carry out the complete reverse engineering process as a standalone tool. In addition, we observe significant differences in terms of the execution time and the number of files generated by each tool for the same file.

19.
Software engineering data modeling based on OWL   Total citations: 1, self-citations: 0, citations by others: 1
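The Web ontology language (OWL) is an important component of Semantic Web technology and is well suited to describing and modeling complex data semantically. Software development typically produces large amounts of structurally complex and semantically rich data, and a flexible semantic model is the basis for managing the various kinds of software engineering data in a unified way. Starting from the requirements of designing and implementing a management platform for massive software engineering data, this paper proposes an OWL-based description model for software engineering data. The model can describe source code, requirements, tests, versions, and defect data, and can also describe the semantic relationships among these data. The effectiveness of the model is discussed through a case analysis.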
