Thesis by: Shashank Srivastava
Advisor: Dr. Harish Karnick, IIT Kanpur
While computational modeling has yielded several plausible models of language emergence in a set of uniformly endowed agents, most of these treatments do not address the emergence of syntax. They also ignore population turnover and do not incorporate dynamical and structural aspects of populations. Most earlier simulations of realistic populations have ignored the syntactic and compositional nature of human language and have focused on the evolution of a coherent lexicon. While a coherent vocabulary is a necessity for any language, it is in fact syntax that allows humans to express seemingly infinite meanings using a finite set of phonetic elements.
In this thesis, we extend a well-known inductive model of language learning to large populations, heterogeneous interactions, and realistic social communities. The model induces grammatical rules on the basis of phonetic resemblances between lexical entities and similarities in the meanings they correspond to. We develop a framework in which multiple agents interact in an iterated learning setting, and each agent receives its primary linguistic input from a set of speakers according to distributions specified by the existing social topology. We also attempt to extend the deterministic production model to a probabilistic one, and investigate possible biases that can expedite the emergence of compositional syntax.
In particular, we study the effect of population size and the structure of the social topology on linguistic coherence and language emergence for this model. Our investigation of the extended model on different social graphs leads to several insights, and indicates that social topology can have significant effects on the acquisition and evolution of language.
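As an illustration only, the idea that a learner draws its primary linguistic input from speakers according to a distribution given by the social topology can be sketched in a few lines of Python. This is not the thesis implementation: the graph representation, the edge weights, and the names `sample_speakers` and `ring` are assumptions introduced here for the example, and the sketch covers only the choice of speakers, not grammar induction.

```python
import random

def sample_speakers(graph, learner, n_utterances, rng):
    """Choose one speaker per utterance for `learner`.

    `graph[learner]` maps each neighbouring speaker to a non-negative
    interaction weight (a hypothetical encoding of the social topology).
    Speakers are drawn with probability proportional to that weight.
    """
    neighbours = graph[learner]
    speakers = list(neighbours)
    weights = [neighbours[s] for s in speakers]
    return rng.choices(speakers, weights=weights, k=n_utterances)

# Hypothetical four-agent ring: each agent hears only its two neighbours,
# with equal weight on both.
ring = {
    "a": {"b": 1.0, "d": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0, "a": 1.0},
}

rng = random.Random(0)  # fixed seed so runs are repeatable
speakers_for_a = sample_speakers(ring, "a", 10, rng)
```

Replacing `ring` with a different weighted graph, for example a fully connected or a scale-free one, changes only the input distribution each learner sees, which is the kind of variation the abstract refers to when it compares social graphs.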