On the applicability of the longest-match rule in lexical analysis
Affiliation: 1. Computer and Information Science Department, National Chiao-Tung University, HsinChu, Taiwan, ROC; 2. Department of Computer Science and Information Management, Providence University, Taichung County, Taiwan, ROC
Abstract: The lexical analyzer of a compiler usually adopts the longest-match rule to resolve ambiguities when deciding the next token in the input stream. However, that rule is not applicable in all situations. Because the rule is so widely used, language designers and compiler implementors frequently overlook its subtle implications, and the consequence is either a flawed language design or a deficient implementation. We propose a method that automatically checks the applicability of the longest-match rule and identifies precisely the situations in which the rule is not applicable. The method is useful to both language designers and compiler implementors. In particular, it is indispensable to automatic generators of language translation systems: without it, the generated lexical analyzers can only apply the longest-match rule blindly, which results in erroneous behavior. The crux of the method consists of two algorithms. The first computes the regular set of token sequences that a nondeterministic Mealy automaton produces when it processes the elements of an input regular set. The second uses a set of equations to determine whether a regular set and a context-free language have a nontrivial intersection.
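As an illustration of the kind of situation the abstract refers to, the following is a minimal sketch of a longest-match tokenizer. It is not the method proposed in the paper; the token set (`INCR`, `PLUS`, `IDENT`) and the function name `longest_match_tokenize` are assumptions chosen for the example. On the well-known C input `a+++++b`, the rule yields `a ++ ++ + b`, which a C parser rejects, although the alternative split `a ++ + ++ b` could form a valid expression.

```python
import re

# Hypothetical token set for illustration only; not taken from the paper.
TOKEN_PATTERNS = [
    ("INCR", re.compile(r"\+\+")),
    ("PLUS", re.compile(r"\+")),
    ("IDENT", re.compile(r"[A-Za-z_]\w*")),
]


def longest_match_tokenize(text):
    """Split text into (kind, lexeme) pairs, always taking the longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        best = None  # (kind, end position) of the longest match so far
        for kind, pattern in TOKEN_PATTERNS:
            m = pattern.match(text, pos)
            # Strict ">" means the earlier pattern wins when lengths tie.
            if m and (best is None or m.end() > best[1]):
                best = (kind, m.end())
        if best is None:
            raise ValueError(f"no token matches at position {pos}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    return tokens


# Longest match greedily consumes "++" twice, leaving a lone "+".
print(longest_match_tokenize("a+++++b"))
```

The tokenizer never reconsiders an earlier choice, so it has no way to discover the split `a ++ + ++ b`; detecting such cases ahead of time is the problem the abstract describes.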
This article is indexed in ScienceDirect and other databases.