Résumé |
Weir (1987) showed that every tree-adjoining language can be expressed as a homomorphic image of the intersection of three sets: a regular set R, the Dyck language D_{2n} over the alphabet consisting of 2n pairs of brackets [_1, ]_1,..., [_{2n}, ]_{2n}, and g^{-1}(D_{2n}), where g is the bijection that maps [_{2i+1} to itself and cyclically permutes ]_{2i+1}, [_{2i+2}, ]_{2i+2}, for each i = 0, ..., n-1.
I generalize this theorem to simple context-free tree grammars: If L is the yield image of the language of a simple context-free tree grammar of rank m-1, then L can be expressed as a homomorphic image of the intersection of a regular set R, the Dyck language D_{mn} over the alphabet of mn pairs of brackets, and g^{-1}(D_{mn}), where g is the bijection that maps [_{mi+1} to itself and cyclically permutes ]_{mi+1}, [_{mi+2}, ]_{mi+2}, ..., [_{mi+m}, ]_{mi+m}, for i = 0, ..., n-1.
I obtain this result via a natural generalization of the Chomsky-Schützenberger theorem to the tree languages of simple context-free tree grammars. This intermediate result is stated in terms of "Dyck tree languages". In order to emphasize the analogy between the string case and the tree case, I use the notion of "multi-dimensional tree" introduced by Rogers (2003). An element of a Dyck tree language is regarded as a two-dimensional encoding of a three-dimensional tree, just like an element of an ordinary Dyck language encodes an ordinary, two-dimensional tree. A simple context-free tree language is defined to be the yield image of a local set of three-dimensional trees, where the "yield" is obtained from two-dimensional encoding by erasing the "brackets". I show that many necessary lemmas can be stated as general facts about m-dimensional trees. |