Statistical models for language: structure and computation

Abstract: Statistical models are now widely used in NLP, speech recognition, and other forms of linguistic computation. Yet few graduate programs give students the opportunity to learn the principles behind these models in depth. This course addresses that gap by presenting key statistical concepts (aggregation, variance, degrees of freedom, parameter estimation, and significance testing) in the context of analyzing and modeling linguistic data. These concepts are illustrated using log-linear models. Hidden Markov models and probabilistic context-free grammars are discussed both in terms of their application to language analysis and in terms of their statistical model structure. Course materials for demonstrations and exercise assignments will be prepared using R, a cross-platform statistical computing environment available under the GNU General Public License (www.r-project.org). The course assumes a general background in linguistics but no prior knowledge of statistics.