May 29, 2018
Marco Campion
Indexed Grammars (IG) are a formalism which generate Indexed Languages (IL) and can be recognized by nested stack automata. Indexed grammars are a generalization of context-free grammars in that non-terminals are equipped with lists of index symbols and they are a proper subset of context-sensitive languages. Nested stack automata differ from pushdown automata: the memory stack can be nested, that is, each position in the stack can contain other stacks. The automaton always operates on the innermost stack only. Indexed languages resemble context-free languages in that many of the closure properties and decidability results for both classes of languages are similar. Part of this interest in larger classes of languages stems from the inadequacy of finite state automata and context-free grammars in specifying all of the code structures of all possible metamorphic change of a metamorphic code.
We give a least-fix-point characterization of the languages recognized by indexed grammars. We then study relations between the indexed languages and other formal languages contained and containing them. More precisely, we found that there is no best abstraction, in the sense of abstract interpretation, function from indexed languages to context-free languages and from context-sensitive languages to indexed languages. This means that there will not be a better abstract function that will allow us to map any indexed language to a context-free language with the smallest loss of information possible. This result does not exclude possibility of approximations: limitations to the structures of productions or memory of the indexed grammars are studied. We the propose some widening operators to overcome these limitations.