1.

Establishing structural testing criteria for Java bytecode

A. M. R. Vincenzi M. E. Delamaro J. C. Maldonado W. E. Wong 《Software》2006,36(14):1513-1541

This paper describes intra‐method control‐flow and data‐flow testing criteria for the Java bytecode language. Six testing criteria are considered for the generation of testing requirements: four control‐flow and two data‐flow based. The main reason to work at a lower level is that, even when there is no source code, structural testing requirements can still be derived and used to assess the quality of a given test set. It can be used, for instance, to perform structural testing on third‐party Java components. In addition, the bytecode can be seen as an intermediate language, so the analysis performed at this level can be mapped back to the original high‐level language that generated the bytecode. To support the application of the testing criteria, we have implemented a tool named JaBUTi (Java Bytecode Understanding and Testing). JaBUTi is used to illustrate the application of the ideas developed in this paper. Copyright © 2006 John Wiley & Sons, Ltd.
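As a rough illustration of how control-flow testing requirements can be derived and used to assess a test set, the sketch below computes node and edge coverage over a small control-flow graph. The graph, the trace format, and the function names are hypothetical, and all-nodes/all-edges are used here as generic textbook criteria; nothing in this sketch is taken from JaBUTi itself.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical control-flow graph of one method: block id -> successor block ids.
CFG: Dict[int, List[int]] = {
    0: [1, 2],  # entry block ending in a conditional branch
    1: [3],
    2: [3],
    3: [],      # exit block
}


def requirements(cfg: Dict[int, List[int]]) -> Tuple[Set[int], Set[Tuple[int, int]]]:
    """Derive the testing requirements: every node and every edge of the graph."""
    nodes = set(cfg)
    edges = {(src, dst) for src, succs in cfg.items() for dst in succs}
    return nodes, edges


def coverage(cfg: Dict[int, List[int]], traces: List[List[int]]) -> Tuple[float, float]:
    """Return (node coverage, edge coverage) achieved by the executed block traces."""
    nodes, edges = requirements(cfg)
    seen_nodes: Set[int] = set()
    seen_edges: Set[Tuple[int, int]] = set()
    for trace in traces:
        seen_nodes.update(trace)
        # Consecutive blocks in a trace correspond to a traversed edge.
        seen_edges.update(zip(trace, trace[1:]))
    return len(seen_nodes & nodes) / len(nodes), len(seen_edges & edges) / len(edges)


if __name__ == "__main__":
    print(coverage(CFG, [[0, 1, 3]]))             # one test case
    print(coverage(CFG, [[0, 1, 3], [0, 2, 3]]))  # both branches exercised
```

A single test that takes only one branch leaves some requirements unsatisfied; adding a test for the other branch covers every node and edge of this small graph.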

2.

On the implementation of bytecode compression for interpreted languages

Ekaterina Stefanov Anthony M. Sloane 《Software》2009,39(2):111-135

This paper describes a new method for code space optimization for interpreted languages called LZW‐CC. The method is based on a well‐known and widely used compression algorithm, LZW, which has been adapted to compress executable program code represented as bytecode. Frequently occurring sequences of bytecode instructions are replaced by shorter encodings for newly generated bytecode instructions. The interpreter for the compressed code is modified to recognize and execute those new instructions. When applied to systems where a copy of the interpreter is supplied with each user program, space is saved not only by compressing the program code but also by automatically removing the unused implementation code from the interpreter. The method's implementation within two compiler systems for the programming languages Haskell and Java is described and implementation issues of interest are presented, notably the recalculations of target jumps and the automated tailoring of the interpreter to program code. Applying LZW‐CC to nhc98 Haskell results in bytecode size reduction by up to 15.23% and executable size reduction by up to 11.9%. Java bytecode is reduced by up to 52%. The impact of compression on execution speed is also discussed; the typical speed penalty for Java programs is between 1.8 and 6.6%, while most compressed Haskell executables run faster than the original. Copyright © 2008 John Wiley & Sons, Ltd.
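To make the idea of replacing frequent instruction sequences with shorter codes concrete, here is a minimal sketch of textbook LZW applied to a list of opcode mnemonics. It is not LZW‐CC: jump recalculation, generation of new interpreter instructions, and interpreter tailoring are all omitted, and the function names and the sample instruction stream are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def lzw_compress(ops: Sequence[str], alphabet: Sequence[str]) -> List[int]:
    """Encode an opcode sequence; repeated runs end up sharing one dictionary code."""
    table: Dict[Tuple[str, ...], int] = {(op,): i for i, op in enumerate(alphabet)}
    out: List[int] = []
    current: Tuple[str, ...] = ()
    for op in ops:
        candidate = current + (op,)
        if candidate in table:
            current = candidate            # keep extending the known sequence
        else:
            out.append(table[current])     # emit the longest known prefix
            table[candidate] = len(table)  # register the new, longer sequence
            current = (op,)
    if current:
        out.append(table[current])
    return out


def lzw_decompress(codes: Sequence[int], alphabet: Sequence[str]) -> List[str]:
    """Rebuild the opcode sequence, reconstructing the dictionary on the fly."""
    table: Dict[int, Tuple[str, ...]] = {i: (op,) for i, op in enumerate(alphabet)}
    if not codes:
        return []
    previous = table[codes[0]]
    out = list(previous)
    for code in codes[1:]:
        # A code not yet in the table can only be "previous + its own first symbol".
        entry = table[code] if code in table else previous + (previous[0],)
        out.extend(entry)
        table[len(table)] = previous + (entry[0],)
        previous = entry
    return out


if __name__ == "__main__":
    stream = ["iload", "iadd", "iload", "iadd", "iload", "iadd"]
    alphabet = sorted(set(stream))
    codes = lzw_compress(stream, alphabet)
    print(codes, lzw_decompress(codes, alphabet) == stream)
```

In this sketch the repeated `iload, iadd` pair is assigned its own dictionary code after its first occurrence, which is the same effect the abstract describes for frequently occurring instruction sequences.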

3.

Polymorphic bytecode instrumentation

Walter Binder Philippe Moret Éric Tanter Danilo Ansaloni 《Software》2016,46(10):1351-1380

Bytecode instrumentation is a widely used technique to implement aspect weaving and dynamic analyses in virtual machines such as the Java virtual machine. Aspect weavers and other instrumentations are usually developed independently and combining them often requires significant engineering effort, if at all possible. In this article, we present polymorphic bytecode instrumentation (PBI), a simple but effective technique that allows dynamic dispatch amongst several, possibly independent instrumentations. PBI enables complete bytecode coverage, that is, any method with a bytecode representation can be instrumented. We illustrate further benefits of PBI with three case studies. First, we describe how PBI can be used to implement a comprehensive profiler of inter‐procedural and intra‐procedural control flow. Second, we provide an implementation of execution levels for AspectJ, which avoids infinite regression and unwanted interference between aspects. Third, we present a framework for adaptive dynamic analysis, where the analysis to be performed can be changed at runtime by the user. We assess the overhead introduced by PBI and provide thorough performance evaluations of PBI in all three case studies. We show that pure Java profilers like JP2 can, thanks to PBI, produce accurate execution profiles by covering all code, including the core Java libraries. We then demonstrate that PBI‐based execution levels are much faster than control flow pointcuts to avoid interference between aspects and that their efficient integration in a practical aspect language is possible. Finally, we report that PBI enables adaptive dynamic analysis tools that are more reactive to user inputs than existing tools that rely on dynamic aspect‐oriented programming with runtime weaving. These experiments position PBI as a widely applicable and practical approach for combining bytecode instrumentations. © 2015 The Authors. Software: Practice and Experience Published by John Wiley & Sons Ltd.

4.

An empirical study of Java bytecode programs

Christian Collberg Ginger Myles Michael Stepp 《Software》2007,37(6):581-641

We present a study of the static structure of real Java bytecode programs. A total of 1132 Java jar‐files were collected from the Internet and analyzed. In addition to simple counts (number of methods per class, number of bytecode instructions per method, etc.), structural metrics such as the complexity of control‐flow and inheritance graphs were computed. We believe this study will be valuable in the design of future programming languages and virtual machine instruction sets, as well as in the efficient implementation of compilers and other language processors. Copyright © 2006 John Wiley & Sons, Ltd.

5.

From bytecode to JavaScript: the Js_of_ocaml compiler

Jérôme Vouillon Vincent Balat 《Software》2014,44(8):951-972

We present the design and implementation of a compiler from OCaml bytecode to JavaScript. The compiler first translates the bytecode into a static single‐assignment intermediate representation on which optimizations are performed, before generating JavaScript. We believe that taking bytecode as an input instead of a high‐level language is a sensible choice. Virtual machines provide a very stable API. Such a compiler is thus easy to maintain. It is also convenient to use, and it can just be added to an existing installation of the development tools. Already‐compiled libraries can be used directly, with no need to reinstall anything. Finally, some virtual machines are the target of several languages. A bytecode to JavaScript compiler would make it possible to retarget all these languages to Web browsers at once. Copyright © 2013 John Wiley & Sons, Ltd.

6.

Decompilation of Java bytecode to Prolog by partial evaluation

Miguel Gómez-Zamalloa Elvira Albert Germán Puebla 《Information and Software Technology》2009,51(10):1409-1427

Reasoning about Java bytecode (JBC) is complicated due to its unstructured control-flow, the use of three-address code combined with the use of an operand stack, etc. Therefore, many static analyzers and model checkers for JBC first convert the code into a higher-level representation. In contrast to traditional decompilation, such representation is often not Java source, but rather some intermediate language which is a good input for the subsequent phases of the tool. Interpretive decompilation consists in partially evaluating an interpreter for the compiled language (in this case JBC) written in a high-level language with respect to the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, but there remain important open issues when it comes to decompiling a real language such as JBC. This paper presents, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of a realistic programming language to a high-level representation, namely of JBC to Prolog. We introduce two notions of optimality which together require that decompilation does not generate code more than once for each program point. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear in the size of the input bytecode program. This demonstrates empirically the scalability of modular decompilation of JBC by partial evaluation.

7.

Metamorphic code generation from LLVM bytecode

Teja Tamboli Thomas H. Austin Mark Stamp 《Journal in Computer Virology》2014,10(3):177-187

Metamorphic software changes its internal structure across generations with its functionality remaining unchanged. Metamorphism has been employed by malware writers as a means of evading signature detection and other advanced detection strategies. However, code morphing also has potential security benefits, since it can serve to increase the “genetic diversity” of software. We have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures. It uses a common intermediate representation (IR) bytecode in its optimizer. Consequently, any supported high-level programming language is transformed to this IR bytecode as part of the LLVM compilation process. Our metamorphic generator functions at the IR bytecode level, which provides many advantages over morphing at the assembly or source code level. The morphing techniques that we employ include dead code insertion and transposition, where the dead code is actually executed within the morphed code, making its detection and removal more challenging. We have verified the effectiveness of our code morphing using hidden Markov model analysis.

8.

SeByte: Scalable clone and similarity search for bytecode

《Science of Computer Programming》2014

While source code clone detection is a well-established research area, finding similar code fragments in binary and other intermediate code representations has not yet been widely studied. In this paper, we introduce SeByte, a bytecode clone detection and search model that applies semantic-enabled token matching. It is developed based on the idea of relaxation on the code fingerprints. This approach separates the input content based on the types of tokens into different dimensions, with each dimension representing the input content from a specific point of view. Following this approach, SeByte compares each dimension separately and independently, which we refer to as multi-dimensional comparison in our research. As the similarity search function we use a well-known measure that supports our multi-dimensional comparison heuristic, the Jaccard similarity coefficient. Our preliminary study shows that SeByte can detect clones that are missed by existing approaches due to the differences in the input data and the search algorithm. We then further exploit the model to build a scalable bytecode clone search engine. This extension meets the requirements of a classical search engine including the ranking of result sets. Our evaluation with a large dataset of 500,000 compiled Java classes, which we extracted from the six most recent versions of the Eclipse IDE, showed that our SeByte search is not only scalable but also capable of providing a reliable ranking.
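The multi-dimensional comparison described above can be sketched as one Jaccard coefficient per token dimension. The dimension names (`opcodes`, `calls`), the sample fragments, and the function names below are hypothetical; the abstract does not say which dimensions SeByte uses or how it combines the per-dimension scores.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def multi_dimensional_similarity(
    x: Dict[str, Set[str]], y: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Compare two fragments dimension by dimension, each one independently of the others."""
    dimensions = sorted(set(x) | set(y))
    return {d: jaccard(x.get(d, set()), y.get(d, set())) for d in dimensions}


if __name__ == "__main__":
    # Two hypothetical method bodies, tokenised into separate dimensions.
    frag_a = {
        "opcodes": {"aload", "invokevirtual", "areturn"},
        "calls": {"String.length"},
    }
    frag_b = {
        "opcodes": {"aload", "invokevirtual", "ireturn"},
        "calls": {"String.length"},
    }
    print(multi_dimensional_similarity(frag_a, frag_b))
```

Keeping the dimensions separate means two fragments can match fully on one view (here, the called methods) while differing on another (the instruction set), which a single flattened token set would blur.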

9.

Relational bytecode correlations

《The Journal of Logic and Algebraic Programming》2010,79(7):483-514

We present a calculus for tracking equality relationships between values through pairs of bytecode programs. The calculus may serve as a certification mechanism for non-interference, a well-known program property in the field of language-based security, and code transformations. Contrary to previous type systems for non-interference, no restrictions are imposed on the control flow structure of programs. Objects, static and virtual methods are included, and heap-local reasoning is supported by frame rules. In combination with polyvariance, the latter enable the modular verification of programs over heap-allocated data structures, which we illustrate by verifying and comparing different implementations of list copying. The material is based on a complete formalisation in Isabelle/HOL.

10.

Cost analysis of object-oriented bytecode programs

Elvira Albert Puri Arenas Samir Genaim German Puebla Damiano Zanardini 《Theoretical Computer Science》2012,413(1):142-159

Cost analysis statically approximates the cost of programs in terms of their input data size. This paper presents, to the best of our knowledge, the first approach to the automatic cost analysis of object-oriented bytecode programs. In languages such as Java and C#, analyzing bytecode has a much wider application area than analyzing source code since the latter is often not available. Cost analysis in this context has to consider, among others, dynamic dispatch, jumps, the operand stack, and the heap. Our method takes a bytecode program and a cost model specifying the resource of interest, and generates cost relations which approximate the execution cost of the program with respect to such resource. We report on COSTA, an implementation for Java bytecode which can obtain upper bounds on cost for a large class of programs and complexity classes. Our basic techniques can be directly applied to infer cost relations for other object-oriented imperative languages, not necessarily in bytecode form.

11.

A Type System for the Java Bytecode Language and Verifier

Stephen N. Freund John C. Mitchell 《Journal of Automated Reasoning》2003,30(3-4):271-321

The Java Virtual Machine executes bytecode programs that may have been sent from other, possibly untrusted, locations on the network. Since the transmitted code may be written by a malicious party or corrupted during network transmission, the Java Virtual Machine contains a bytecode verifier to check the code for type errors before it is run. As illustrated by reported attacks on Java run-time systems, the verifier is essential for system security. However, no formal specification of the bytecode verifier exists in the Java Virtual Machine Specification published by Sun. In this paper, we develop such a specification in the form of a type system for a subset of the bytecode language. The subset includes classes, interfaces, constructors, methods, exceptions, and bytecode subroutines. We also present a type checking algorithm and prototype bytecode verifier implementation, and we conclude by discussing other applications of this work. For example, we show how to extend our formal system to check other program properties, such as the correct use of object locks. This revised version was published online in August 2006 with corrections to the Cover Date.

12.

A bytecode-based classification method for Ethereum smart contracts

林丹 林凯欣 吴嘉婧 郑子彬 《网络与信息安全学报》2022,8(5):111-120

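In recent years, blockchain technology has been widely applied and has attracted attention in fields such as finance, healthcare, and government affairs. However, because smart contracts are hard to modify once deployed and run in a special environment, security problems of many kinds occur frequently. On the one hand, there are code security problems introduced by developers when writing contracts; on the other hand, many high-risk smart contracts have appeared on Ethereum, and ordinary users are easily attracted by the high returns these contracts offer while having no way of knowing their risks. Yet research on smart contract security has focused mainly on code security, with comparatively little work on identifying contract functionality. Accurately classifying smart contracts by function would help people better understand contract behavior, protect the smart contract ecosystem, and reduce or recover users' losses. Existing smart contract classification methods usually rely on analyzing the contracts' open source code, but contracts published on Ethereum are only required to deploy bytecode, and only a very small number of contracts publish their source code. Therefore, a bytecode-based classification method for Ethereum smart contracts is proposed. Ethereum smart contract bytecode and the corresponding category labels are collected, and opcode frequency features and control flow graph features are then extracted. Feature importance is analyzed experimentally to obtain a suitable graph vector dimension and the best classification model. Experimental validation is carried out on a multi-class task over five categories of smart contracts (exchange, finance, gambling, game, and high-risk), and the F1 score reaches 0.9138 when the XGBoost classifier is used. The experimental results show that the proposed method performs well on the Ethereum smart contract classification task and can be applied to predicting the categories of real-world smart contracts.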
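The opcode frequency features mentioned in this abstract can be sketched as follows. This is only an assumed illustration of the general idea: the function name is hypothetical, features are keyed by raw opcode byte rather than mnemonic, and the paper's actual feature pipeline, control flow graph embedding, and XGBoost setup are not reproduced. The one EVM-specific detail the sketch relies on is that `PUSH1` through `PUSH32` (bytes `0x60` to `0x7f`) are followed by 1 to 32 bytes of immediate data, which must be skipped so that data is not counted as instructions.

```python
from collections import Counter
from typing import Dict


def opcode_frequencies(bytecode_hex: str) -> Dict[int, float]:
    """Map each opcode byte to its relative frequency in deployed EVM bytecode."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)

    counts: Counter = Counter()
    pc = 0
    while pc < len(code):
        op = code[pc]
        counts[op] += 1
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32: skip the 1..32 immediate data bytes that follow.
            pc += op - 0x5F
        pc += 1

    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()} if total else {}


if __name__ == "__main__":
    # PUSH1 0x80, PUSH1 0x40, MSTORE: a common prologue of compiled Solidity contracts.
    for op, freq in sorted(opcode_frequencies("0x6080604052").items()):
        print(f"0x{op:02x}: {freq:.3f}")
```

A frequency vector of this kind has a fixed length (at most 256 entries), so it can be fed to a tabular classifier without access to the contract's source code.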

13.

Subroutine Inlining and Bytecode Abstraction to Simplify Static and Dynamic Analysis

Cyrille Artho Armin Biere 《Electronic Notes in Theoretical Computer Science》2005,141(1):109

In Java bytecode, intra-method subroutines are employed to represent code in “finally” blocks. The use of such polymorphic subroutines within a method makes bytecode analysis very difficult. Fortunately, such subroutines can be eliminated through recompilation or inlining. Inlining is the obvious choice since it does not require changing compilers or access to the source code. It also allows transformation of legacy bytecode. However, the combination of nested, non-contiguous subroutines with overlapping exception handlers poses a difficult challenge. This paper presents an algorithm that successfully solves all these problems without producing superfluous instructions. Furthermore, inlining can be combined with bytecode simplification, using abstract bytecode. We show how this abstraction is extended to the full set of instructions and how it simplifies static and dynamic analysis.

14.

Provably correct control flow graphs from Java bytecode programs with exceptions

Afshin Amighi Pedro de Carvalho Gomes Dilian Gurov Marieke Huisman 《International Journal on Software Tools for Technology Transfer (STTT)》2016,18(6):653-684

We present an algorithm for extracting control flow graphs from Java bytecode that captures normal as well as exceptional control flow. We prove its correctness, in the sense that the behaviour of the extracted control flow graph is a sound over-approximation of the behaviour of the original program. This makes control flow graphs suitable for performing various static analyses, such as model checking of temporal safety properties. Analysing exceptional control flow for Java bytecode is difficult because of the stack-based nature of the language. We therefore develop the extraction in two stages. In the first, we abstract away from the complications arising from exceptional flows, and relativize the extraction on an oracle that is able to look into the stack and predict the exceptions that can be raised at each instruction. This idealized algorithm provides a specification for concrete extraction algorithms, which have to provide a suitable implementation for the oracle. We prove correctness of the idealized algorithm by means of behavioural simulation. In the second stage, we develop a concrete extraction algorithm that consists of two phases. In the first phase, the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control flow graph is extracted in the second phase. We use this intermediate format because it provides the information needed to implement the oracle, and since it gives rise to more compact graphs. We show that the behaviour of the control flow graph extracted via the intermediate representation is a sound over-approximation of the behaviour of the graph extracted by the direct, idealized algorithm, and thus of the original program. The concrete extraction algorithm is implemented as the ConFlEx tool. A number of test cases are performed to evaluate the efficiency of the algorithm.

15.

Tool-Assisted Specification and Verification of Typed Low-Level Languages

Gilles Barthe Pierre Courtieu Guillaume Dufay Simão Melo de Sousa 《Journal of Automated Reasoning》2005,35(4):295-354

Bytecode verification is one of the key security functions of several architectures for mobile and embedded code, including Java, Java Card, and .NET. Over the past few years, its formal correctness has been studied extensively by academia and industry, using general-purpose theorem provers. The objective of our work is to facilitate such endeavors by providing a dedicated environment for establishing the correctness of bytecode verification within a proof assistant. The environment, called Jakarta, exploits a methodology that casts the correctness of bytecode verification relative to a defensive virtual machine that performs checks at run-time and to an offensive one that does not; it can be summarized as stating that the two machines coincide on programs that pass bytecode verification. Such a methodology has been used successfully to prove the correctness of the Java Card bytecode verifier and may potentially be applied to many similar problems. One definite advantage of the methodology is that it is amenable to automation. Indeed, Jakarta automates the construction of an offensive virtual machine and a bytecode verifier from a defensive machine, and the proofs of correctness of the bytecode verifier. We illustrate the principles of Jakarta on a simple low-level language extended with subroutines and discuss its usefulness to proving the correctness of the Java Card platform.

16.

Magic-sets for localised analysis of Java bytecode

Fausto Spoto Étienne Payet 《Higher-Order and Symbolic Computation》2010,23(1):29-86

Static analyses based on denotational semantics can naturally model functional behaviours of the code in a compositional and completely context and flow sensitive way. But they only model the functional, i.e., input/output, behaviour of a program P, which is not enough if one needs P’s internal behaviours, i.e., from the input to some internal program points. This is, however, a frequent requirement for a useful static analysis. In this paper, we overcome this limitation, for the case of mono-threaded Java bytecode, with a technique used up to now for logic programs only. Namely, we define a program transformation that adds new magic blocks of code to the program P, whose functional behaviours are the internal behaviours of P. We prove the transformation correct w.r.t. an operational semantics and define an equivalent denotational semantics, devised for abstract interpretation, whose denotations for the magic blocks are hence the internal behaviours of P. We implement our transformation and instantiate it with abstract domains modelling sharing of two variables, non-cyclicity of variables, nullness of variables, class initialisation information and size of the values bound to program variables. We get a static analyser for full mono-threaded Java bytecode that is faster and scales better than another operational pair-sharing analyser. It has the same speed but is more precise than a constraint-based nullness analyser. It makes a polyhedral size analysis of Java bytecode scale up to 1300 methods in a couple of minutes and a zone-based size analysis scale to still larger applications.

17.

TinyVM: an energy‐efficient execution infrastructure for sensor networks

Kirak Hong Jiin Park Sungho Kim Taekhoon Kim Hwangho Kim Bernd Burgstaller Bernhard Scholz 《Software》2012,42(10):1193-1209

Energy‐efficient implementation techniques for virtual machines (VMs) have so far received little attention: conventional wisdom claims that VMs have a diametrical effect on energy consumption, and VM‐based applications are therefore short‐lived. In this paper, we argue that bytecode interpretation is affordable if we synthesize VMs specifically for energy efficiency. We present TinyVM, an execution infrastructure that seamlessly integrates with C and nesC/TinyOS‐based programming environments. TinyVM achieves high code density through the use of compressed bytecode as the primary program representation. Compressed bytecode allows rapid application deployment with low communication overhead. TinyVM executes compressed bytecode in place, which eliminates the need for a decompression stage and thereby reduces memory consumption on sensor nodes. Our infrastructure automates the creation of energy‐efficient application‐specific VMs. Applications are partitioned into machine code, bytecode, and VM instruction set extensions. Partitioning is manually controlled and/or fully guided by a discrete optimization problem that produces a partitioning with lowest energy consumption for a given program size limit. We provide experimental results for sensor network benchmarks and for selected applications on various CPU architectures including Atmega128‐based motes and the ARM‐based Intel iMote2. TinyVM has been released under the GNU General Public License. Copyright © 2011 John Wiley & Sons, Ltd.

18.

Bytecode verification on Java smart cards

Xavier Leroy 《Software》2002,32(4):319-340

This article presents a novel approach to the problem of bytecode verification for Java Card applets. By relying on prior off‐card bytecode transformations, we simplify the bytecode verifier and reduce its memory requirements to the point where it can be embedded on a smart card, thus increasing significantly the security of post‐issuance downloading of applets on Java Cards. This article describes the on‐card verification algorithm and the off‐card code transformations, and evaluates experimentally their impact on applet code size. Copyright © 2002 John Wiley & Sons, Ltd.

19.

Timing Aware Information Flow Security for a JavaCard-like Bytecode

Daniel Hedin David Sands 《Electronic Notes in Theoretical Computer Science》2005,141(1):163

Common protection mechanisms fail to provide end-to-end security; programs with legitimate access to secret information are not prevented from leaking this to the world. Information-flow aware analyses track the flow of information through the program to prevent such leakages, but often ignore information flows through covert channels even though they pose a serious threat. A typical covert channel is to use the timing of certain events to carry information. We present a timing-aware information-flow type system for a low-level language similar to a non-trivial subset of a sequential Java bytecode. The type system is parameterized over the time model of the instructions of the language and over the algorithm enforcing low-observational equivalence, used in the prevention of implicit and timing flows.

20.

Formalizing non-interference for a simple bytecode language in Coq

Florian Kammüller 《Formal Aspects of Computing》2008,20(3):259-275

In this paper, we describe the application of the interactive theorem prover Coq to the security analysis of bytecode as used in Java. We provide a generic specification and proof of non-interference for bytecode languages using the Coq module system. We illustrate the use of this formalization by applying it to a small subset of Java bytecode. The emphasis of the paper is on modularity of a language formalization and its analysis in a machine proof.