Abstract: | Full-text systems that access text randomly cannot normally determine the format operations in effect for a given target location. The problem can be solved by viewing the format marks as the non-terminals in a format grammar. A formatted text can then be parsed using the grammar to build a data structure that serves both as a parse tree and as a search tree. While processing a retrieved segment, a full-text system can follow the search tree from root to leaf, collecting the format marks encountered at each node to derive the sequence of commands active for that segment. The approach also supports the notion of a ‘well formatted’ document and provides a means for verifying the well-formedness of a given text. To illustrate the approach, a sample set of format marks and a sample grammar are given suitable for formatting and parsing the article as a sample text. |