A Mathematical Theory of Grammatical Categories

Abstract:
Every mainstream linguistic theory categorizes expressions and
provides some system for naming the categories (with atomic symbols,
function expressions, or feature complexes of some kind), but the
conceptual and empirical basis for categorization is seldom
explored. In the 1950's the most prominent idea was that categories
are sets of substitution-equivalent strings, but the various rigorous
formulations of this idea are empirically inaccurate and fail to 
correspond to the way linguists actually use categories.  More sophisticated
approaches build on the idea that the categories partition (or cover)
the form-meaning pairs in a language in a way that stands in a certain
relation to the generative (or relational) structure of the language.

After critiquing some of the earlier ideas about categories, this
class explores proposals of the latter kind, with attention to the
possibility that an optimal categorization of expressions may be 
induced by the generative structure of grammar rather than being
simply stipulated. (We find some models for this kind of "optimal
categorization" in formal language theory: e.g. the Nerode equivalence
classes as the categories of canonical regular grammars, and the
elements of the Rabin&Scott, Schutzenberger "syntactic monoid".)

This builds on earlier foundational work by Keenan & Stabler as follows:
K&S treat a grammar G as a lexicon and a set F of generating functions; 
the language L(G) is the closure of the lexicon wrt F.  This definition 
partitions L(G) by equating two expressions iff some automorphism of L(G) 
a permutatation of L(G)that fixes each function in F) maps one expression 
to the other. These equivalence classes are unlike linguists' categories 
because they (typically) distinguish elements of differing derivational 
complexity.  But we expect that having a given category is an automorphism 
invariant property.  And some natural coarsenings of the automorphism 
partition come close to standard notions and suggest what an "optimal" 
categorization might be.  E.g.  consider the coarsest partition of L(G) 
that is a congruence in the sense that (using = for equivalence) for all 
f in F and all tuples s,t in L(G)*, if s is in dom(f) and t is the 
result of replacing the i'th coordinate si in s by ti where si=ti, then 
f(s)=f(t). This does not fit all reasonable proposals about categorizations,
as we show, but it comes close, and variations on the idea will be explored.

We also show how fundamental sub- and super-category relations are induced
by grammar and so do not need to be stipulated.