On the Logic and Learning of Language

About this ebook

This book presents the author's research on automatic learning procedures for categorial grammars of natural languages. The research program spans a number of intertwined disciplines, including syntax, semantics, learnability theory, logic, and computer science. The theoretical framework employed is an extension of categorial grammar that has come to be called multimodal or type-logical grammar. The first part of the book presents an expository summary of how grammatical sentences of any language can be deduced with a specially designed logical calculus that treats syntactic categories as its formulae. Some such Universal Type Logic is posited to underlie the human language faculty, and all linguistic variation is captured by the different systems of semantic and syntactic categories which are assigned in the lexicons of different languages. The remainder of the book is devoted to the explicit formal development of computer algorithms which can learn the lexicons of type logical grammars from learning samples of annotated sentences. The annotations consist of semantic terms expressed in the lambda calculus, and may also include an unlabeled tree-structuring over the sentence.

The major features of the research include the following:

We show how the assumption of a universal linguistic component---the logic of language---is not incompatible with the conviction that every language needs a different system of syntactic and semantic categories for its proper description.

The supposedly universal linguistic categories descending from antiquity (noun, verb, etc.) are summarily discarded.

Languages are here modeled as consisting primarily of sentence trees labeled with semantic structures; a new mathematical class of such term-labeled tree languages is developed which cross-cuts the well-known Chomsky hierarchy and provides a formal restrictive condition on the nature of human languages.

The human language acquisition mechanism is postulated to be biased, such that it assumes all input language samples are drawn from the above "syntactically homogeneous" class; in this way, the universal features of human languages arise not just from the innate logic of language, but also from the innate biases which govern language learning.

This project represents the first complete explicit attempt to model the acquisition of human language since Steve Pinker's groundbreaking 1984 publication, "Language Learnability and Language Development."


Language: English
Release date: October 14, 2004
ISBN: 9781412222181
Author

Sean A. Fulop

Sean Fulop received his B.Sc. in Physics from the University of Calgary in 1991 with a Linguistics minor, and abandoned Physics in favour of Linguistics. He went on to complete his Ph.D. in Linguistics at UCLA in 1999, gaining expertise in the two specialties of phonetics and mathematical linguistics. His doctoral dissertation bore the same title as the present book, and was an incomplete precursor to the research that is now being reported. In the intervening years he has held temporary lectureships and professorships, most recently appointed as Visiting Assistant Professor of Linguistics at the University of Chicago, with an affiliation to the Computer Science Department. Aside from the mathematics of language and speech, the author's greatest preoccupations are his family (including wife Jacquie and daughters Sandra and Brenna), sports cars, and progressive rock music.


    Book preview

    On the Logic and Learning of Language - Sean A. Fulop

    © Copyright 2004 Sean A. Fulop. All rights reserved.

    No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the author.

    A cataloguing record for this book that includes the U.S. Library of Congress Classification number, the Library of Congress Call number and the Dewey Decimal cataloguing code is available from the National Library of Canada. The complete cataloguing record can be obtained from the National Library’s online database at: www.nlc-bnc.ca/amicus/index-e.html

    ISBN: 978-1-4120-2381-8 (softcover)

    ISBN: 978-1-4122-2218-1 (ebook)


    This book was published on-demand in cooperation with Trafford Publishing. On-demand publishing is a unique process and service of making a book available for retail sale to the public, taking advantage of on-demand manufacturing and Internet marketing. On-demand publishing includes promotions, retail sales, manufacturing, order fulfilment, accounting and collecting royalties on behalf of the author.

    Suite 6E, 2333 Government St., Victoria, B.C. V8T 4P4, CANADA

    10   9   8   7   6   5

    Contents

    Preface

    Acknowledgements

    Chapter 1 Introduction

    1.1 Grammar of logic, logic of grammar

    1.2 Learning the logic of language

    1.3 Type-logical structural generative linguistics

    Part I Logic of Language

    Chapter 2 Substructural deductive systems

    2.1 Formal propositional deduction

    2.2 Gentzen’s Natural Deduction

    2.3 Formal sequent languages and logics

    2.4 Gentzen’s sequent calculus

    Chapter 3 Categorial type logics

    3.1 The typed lambda calculus

    3.2 Categorial grammar

    3.3 Forms of Lambek’s calculus

    3.4 Type logics enriched

    3.5 Dutch verb clusters

    Part II Learning Language in Logic

    Chapter 4 Type-logical semantics

    4.1 Curry-Howard correspondence

    4.2 Formulae-as-types for Lambek systems

    4.3 A generalized Curry-Howard morphism

    4.4 The linguistic syntax-semantics connection

    Chapter 5 Learning type-logical grammar (a first try)

    5.1 Categorial grammars and their languages

    5.2 CG discovery from syntactic structures

    5.3 Semantic term-labeled languages

    5.4 Automated theorem proving in type logics

    5.5 Discovery of type-logical lexicons

    5.6 Optimally unified lexicons

    5.7 Learning from unsubtyped tls’s

    Chapter 6 Learning a new class of languages

    6.1 American Structuralist revival

    6.2 Distributional analysis through type unification

    6.2.1 Structural unification

    6.3 Minimum description length lexicons

    Part III Learnability and Linguistic Theory

    Chapter 7 Learnability

    7.1 Grammar frames and learnability

    7.2 Learnability of classical categorial lexicons

    7.3 Learnability of OUTL lexicons

    7.4 Learnability of structurally unified lexicons

    Chapter 8 The type-logical structure of linguistic theory

    8.1 On the notion syntactic constituent

    8.2 Syntactic structure versus proof structure

    8.3 On the notion syntactic category

    8.4 Generative capacity

    8.5 A comparison with Pinker’s learning scheme

    8.6 Type logic as a linguistic theory

    8.7 A Structuralist generative framework

    8.8 Future work: overcoming obstacles

    Appendix A Algebraic semantics for sequent calculi

    A.1 Abstract algebra

    A.2 Algebras of languages

    A.3 Algebras of deductive systems

    A.4 Subsuming special cases: an example

    A.5 Kripke semantics

    A.6 Algebraic semantics of type logics

    Bibliography

    Endnotes:

    Most of the formalizations in mathematics are based on some underlying idea, an intuitive notion which gives guidance and purpose. It is not easy to give a precise description of the nature of an idea; indeed a deeper idea may be almost impossible to communicate and so may be recognized only after it has been embodied in some formalization.

    Saunders Mac Lane

    Mathematics: Form and Function (1986)

    It is no matter of indifference whether the grammarian is content with a pre-scientific personal opinion on meaning-forms, or with notions empirically contaminated by historical [influence], e.g. by Latin grammar, or whether he keeps his eyes on a scientifically fixed, theoretically coherent system of pure meaning-forms. . .

    Edmund Husserl

    Logical Investigations vol. II, Investigation IV (1913)

    Lay it down and let me live the new language,

    Let me learn at every twist every turn.

    Jon Anderson, New Language,

    from the album The Ladder (1999) by Yes.

    Preface

    This book is my Ph.D. dissertation all grown up. Though this volume and my 1999 UCLA Linguistics dissertation share the same title and core ideas, the earlier work was woefully inadequate in many ways in which the present book is not. This is not to say, of course, that the present book isn’t woefully inadequate, but it is fair to say that many of the former inadequacies have been eliminated.

    Although certain publishers wouldn’t believe it, this book is in part a foundational project in computational linguistics. The term “computational linguistics” is nowadays taken to refer to a kind of engineering discipline whose primary goal is to get computers to deal with information presented by means of ordinary language. This project exemplifies my view of computational linguistics, which is not as above. Consider for a moment what the various “computational sciences” amount to. Computational biology means using computer models to simulate biological systems and extract answers to questions of biology. Computational fluid dynamics (CFD) involves using computer models to simulate fluid dynamical systems and extract relevant answers; you get the idea. CFD is a favorite example because it is a relatively simple theory that has been plagued by a long history of computational obstacles.

    The theory of CFD pretty much amounts to systems of equations that were first derived in the nineteenth century, now called the Navier-Stokes equations. For decades it has been thought, correctly, that if you want an answer to a fluid dynamical question, simply solve the Navier-Stokes equations. This last step proved to be very sticky since these equations can only be solved numerically (save a few special cases), and decades of work have been required to figure out decent methods for doing it; we are in some cases still waiting for sufficient computational power to get the answers we really want. In my view, computational linguistics can be like that: a real computational science in which the primary activities are the construction of mathematical and computational models of human language, and then undertaking efforts to solve the “equations.” I have found in my work, some of which is presented here, that the formulation of a good theory for modeling language is just the first important step in a long series of sticky problems, like how to compute the model in a reasonable time.

    The work herein is largely limited to the aforementioned first step, and it is thus a contribution to that unsung subfield known as “the mathematics of language.” A mathematical model is presented for certain aspects of language and its acquisition that is fitted with a computational model for solving the linguistic equations, as it were. Unfortunately, an adequate computational methodology for extracting answers in practical cases has not yet been developed, so the results here are strictly theoretical. This is not a downfall at this stage; after all, the Navier-Stokes equations were once “strictly theoretical,” too.

    I have undertaken to report my work and my research in this book without much regard for who the “intended audience” might be. I suppose it can be somewhat circularly defined as “those people who find it interesting,” and then I will have pleased my intended audience. But I suggest that such people are likely to be graduate students and researchers in mathematical linguistics, language learnability, and formal aspects of computational linguistics.

    At times, the material presented here is quite formalized and tedious. The main reason for doing this is the desire to provide other researchers with exact details of the algorithms developed, and of the functions which the algorithms compute. I found, in my own reading when I was first learning the background for this project, that literature that was not tediously formalized read breezily but left me without a complete understanding.

    I hope that there are many readers who can appreciate this project, although I realize that it will not be very accessible to most linguists in spite of my efforts to present most ideas from first principles. I thus also hope that the presentation of so many things from first principles does not too much bore those readers who are already well-versed in the background areas. Perhaps they will find in these presentations some valuable expository material for the classroom.

    The production of this book has been carried out by the author, including typing, typesetting, indexing, and cover design. It has been typed and edited with GNU Emacs, and typeset using the LaTeX system, with the Baskerville font family from Micropress, Inc. The final assembly was performed by the technical staff at Trafford.

    This is version 1.0 (beta)

    Chicago, Illinois, April 2004

    Acknowledgements

    I must first thank my Ph.D. advisor Ed Stabler, who showed great patience with the difficult problems that required his assistance. He has been willing to consult with me over a period of many years, often about matters at some distance from his own concerns, and has also provided much-needed guidance and encouragement at crucial junctures.

    Thanks are owed to Ed Keenan for many consultations on this work, and in particular for helping me reconstruct the foundations of abstract algebra from first principles. The other members of my dissertation committee, Carson Schütze and Stott Parker, deserve thanks for raising the overall quality of my original 1999 project. I thank Anna Szabolcsi for her sage advice (which I don’t think I followed, poor me) during the genesis of this work. She said, “make sure your dissertation is new, but not so new that no one will read it.”

    The original inspiration for this project was provided by a short course taught at UCLA during the Spring quarter of 1997 by Visiting Professors Michael Moortgat and Dick Oehrle. They presented the principles and applications of type-logical grammar with an enthusiastic, missionary style that was impossible to resist. I have since profited from interactions with them both, and in particular I owe thanks to Michael for financing a two-week visit to the University of Utrecht during the eleventh hour of my dissertation’s completion.

    I would like to extend thanks to the following list of colleagues (in no particular order), who have been willing to discuss my work and offer helpful ideas: Makoto Kanazawa, Henk Harkema, François Lamarche, Steve Pinker, Jerry Sadock, Jason Merchant, Larry Moss, Gerald Penn, Heinrich Wansing, Christophe Costa Florêncio, Gerhard Jäger, and Richard Moot.

    This book received much assistance from the participants at various conferences at which early versions of the work represented in this book were presented, including the 23rd Holiday Symposium: Algebraic Structures for Logic, held at New Mexico State University in 1999; the Van Gogh Meeting organized at Utrecht University in 1999 by Michael Moortgat; the GRACQ Learning Workshop organized at the University of Nantes in 2001 by Christian Retoré; Formal Grammars-Mathematics of Language 2001 at the University of Helsinki; the 39th meeting of the Chicago Linguistic Society (2003); and Mathematics of Language 8 at Indiana University (2003). The book was also helped by students in my courses Logics of Grammar and Computation, and Type Theory in Syntax and Semantics, both offered at The University of Chicago.

    I also extend warmest thanks to those people and institutions that have directly supported my work with money or resources, beginning with the UCLA Department of Linguistics. Parts of my work on the original dissertation were completed with the support of NSF LIS grant 9720410, Learning in Complex Environments by Natural and Artificial Systems. The dissertation was also partially supported by a scholarly exchange grant to Prof. Dr. M. Moortgat of OTS, University of Utrecht, and a 1998-99 UCLA Dissertation Year Fellowship. My recent work to expand and rewrite the project has been conducted while I was Visiting Assistant Professor of Linguistics at the University of Chicago, and I could not have done it without a Research Incentive grant courtesy of The University of Chicago Dept. of Computer Science. In addition, with the kind forbearance of my first mentor Michael Dobrovolsky, I spent the summers of 2002 and 2003 as a visiting scholar in the Linguistics Dept. at the University of Calgary working on this project.

    On a personal note, thanks to Mom and Dad for buying all those school books and a computer, and to Jacquie for her love and support. Finally, thanks go to my publishing company Trafford, for saving this work from whatever purgatory the major companies send unmarketable monographs to. I dedicate this work to my daughters Sandra and Brenna, who learn new language at every turn.

    Chapter 1

    Introduction

    In a recent article, Halpern et al. (2001) summarize the large variety of influences that the subject of logic has had on the field of computer science. Indeed, they conclude quite justifiably that computer science is the one area where logic has had the most profound effect, far greater than its influence on mathematics. One purpose of the present book is to take note of the potential for logic to profoundly affect yet another discipline with close ties to computer science: linguistics, and in particular psycholinguistics, wherein one is concerned with modeling the phenomenon of language and its acquisition by humans in a way that maintains a plausible connection to the human computational engine. We cannot say in what ways the human brain uses logic to compute natural languages, or to learn them. It is possible to demonstrate, however, a great potential for logic to provide models of language and its acquisition which use the power of formal deduction. Insofar as the brain’s computer is a kind of deductive engine, the present studies can be viewed as psychologically relevant in a broad sense.

    In this study, many commonplace assertions, assumptions, and paradigms of modern linguistics will be dropped or ignored. This serves as fair warning for any readers who might still be expecting a mainstream treatment. We do not do away with popular ideas and developments lightly, or without good reason. The main reason seems to be that the goals of our study cannot be met by typical approaches to linguistic theory, frequently because most linguistic research does not pursue these goals. We will spare the reader any attempt at summarizing the goals of most researchers, being content to focus on our own.

    Traditional syntactic categories employed in grammars, such as noun and verb, will not be used. These categories descend from antiquity, do not shed much light on the nature of language, and thus can only hinder our efforts. Our main thrust here is the development of a scheme by which a grammar for any natural language (suitably idealized) could be learned from linguistic data of a reasonable nature, and to prove mathematically that such learning could succeed. We draw upon many years of recent developments in type-logical grammar to settle on a grammatical framework which is lexicalized, in the sense that large amounts of information that would frequently be stored as grammatical rules are here stored in the syntactic categories assigned to words in the lexicon. The type-logical framework is chosen for a number of reasons, but chief among them is the tight and simple model for the syntax-semantics interface that connects the syntactic logic with the typed lambda calculus for semantics. We therefore orient our treatment around the development of grammars which generate languages consisting of term-labeled sentences or trees, which are sentences, possibly tree-structured, paired with terms of a lambda calculus which show the compositional meaning structure.
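
    The pairing of sentences with lambda terms can be made concrete with a small sketch. The following Python fragment is purely illustrative and is not the book's notation: the tuple encoding and the constants greet, sandra, and brenna are invented here. It represents a term-labeled sentence as a string paired with a lambda term, and normalizes the term by beta reduction to expose the compositional meaning structure.

```python
# Lambda terms encoded as nested tuples:
#   ("lam", var, body)  abstraction
#   ("app", f, a)       application
#   any other string    a variable or constant

def substitute(term, var, value):
    """Naive substitution of value for var (assumes no accidental capture)."""
    if term == var:
        return value
    if isinstance(term, tuple):
        if term[0] == "lam":
            _, v, body = term
            if v == var:               # var is rebound below; stop here
                return term
            return ("lam", v, substitute(body, var, value))
        if term[0] == "app":
            return ("app", substitute(term[1], var, value),
                           substitute(term[2], var, value))
    return term

def beta_reduce(term):
    """Normalize a term by repeatedly contracting beta redexes."""
    if isinstance(term, tuple):
        if term[0] == "app":
            f = beta_reduce(term[1])
            a = beta_reduce(term[2])
            if isinstance(f, tuple) and f[0] == "lam":
                return beta_reduce(substitute(f[2], f[1], a))
            return ("app", f, a)
        if term[0] == "lam":
            return ("lam", term[1], beta_reduce(term[2]))
    return term

# A term-labeled sentence: the string paired with its meaning term,
# built compositionally as ((lambda y. lambda x. greet x y) brenna) sandra.
meaning = ("app", ("app", ("lam", "y", ("lam", "x",
              ("app", ("app", "greet", "x"), "y"))), "brenna"), "sandra")
tls = ("Sandra greets Brenna", beta_reduce(meaning))
print(tls[1])  # ('app', ('app', 'greet', 'sandra'), 'brenna')
```

    The reduced term records which constant applies to which arguments, which is exactly the kind of compositional annotation the learning samples are assumed to carry.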

    Since type logic is a lexicalized framework, term-labeled string (TLS) or tree (TLT) languages are generated from grammars whose language-specific information is entirely within the lexical categories. The type logical system itself is then posited as an important universal component of grammar. We suppose it is part of the human learner’s innate linguistic endowment. Thus, while we set aside quite a number of the modern linguist’s favorite assumptions, the general hypothesis of an innate Universal Grammar is adopted; we will see how sorely this is required to prove learnability of the relevant kinds of grammars, and indeed, to accomplish anything in this direction. It simply does not seem possible, in the light of the results to be presented, for any system (human or machine) to learn grammars reliably and completely starting with nothing, as it were. The type-logical system adopted is not a novel contribution of our work, but while the idea of a Universal Type Logic underlying all natural languages has been nascent in the literature for some time, the emphasis given to the notion appears to be more overt here than in previous work.
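
    What "lexicalized" means here can be illustrated with a bare-bones classical (AB) categorial grammar in Python. This toy fragment is far weaker than the multimodal type logics the book employs, and the lexicon below is invented for illustration; the point is only that all language-specific information sits in the category assignments, while the combinatory mechanism, forward and backward application, is fixed and universal.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Basic:
    name: str

@dataclass(frozen=True)
class Slash:
    result: object   # category produced by the application
    arg: object      # category consumed
    direction: str   # "/" seeks its argument to the right, "\\" to the left

S, NP = Basic("s"), Basic("np")

# Hypothetical toy lexicon: a transitive verb gets category (np\s)/np.
LEXICON = {
    "Sandra": NP,
    "Brenna": NP,
    "greets": Slash(Slash(S, NP, "\\"), NP, "/"),
}

def reduce_once(cats):
    """Perform one forward or backward application, if any is possible."""
    for i in range(len(cats) - 1):
        left, right = cats[i], cats[i + 1]
        if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
            return cats[:i] + [left.result] + cats[i + 2:]
        if isinstance(right, Slash) and right.direction == "\\" and right.arg == left:
            return cats[:i] + [right.result] + cats[i + 2:]
    return None

def derives_sentence(words):
    """True iff the word string reduces to the sentence category s."""
    cats = [LEXICON[w] for w in words]
    while len(cats) > 1:
        cats = reduce_once(cats)
        if cats is None:
            return False
    return cats[0] == S

print(derives_sentence(["Sandra", "greets", "Brenna"]))  # True
```

    Changing the language means changing only the LEXICON dictionary; the deduction machinery never varies, which is the sense in which the type logic itself can be posited as universal.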

    The novel contributions offered here begin in the second part, with the development of a formalized discovery procedure for learning type-logical grammars for term-labeled tree languages from samples of such languages. The procedure is, in broad outline, similar in design to the grammar learning schemes of Pinker (1984). It begins by learning the basics of syntactic and semantic lexical categories by semantic bootstrapping, extracting all available information from the syntactico-semantic structures provided in the learning data.¹ The second stage applies a process of variable unification over the lexical categories (cf. Buszkowski and Penn 1990), and we consider two ways of performing this unification. Ultimately a unification procedure is settled on that attempts to perform, in Pinker’s terms, structure-dependent distributional learning, which is informed by a new formalization of the intersubstitutability criteria that were canonized by Bloomfield (1933) and the American Structuralist school.
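
    The flavor of the unification step can be conveyed by a generic first-order unification routine over category terms. This is a rough sketch only: it omits the occurs check, the tuple encoding is invented here, and it is not Buszkowski and Penn's actual algorithm. Category variables introduced during bootstrapping are written as strings beginning with "?", and unifying two hypothesized categories for the same word yields a substitution that collapses them into one lexical entry.

```python
def walk(c, subst):
    """Chase a variable through the substitution to its current value."""
    while isinstance(c, str) and c in subst:
        c = subst[c]
    return c

def unify(c1, c2, subst):
    """First-order unification over categories encoded as nested tuples;
    strings starting with '?' are variables. Returns an extended
    substitution, or None on failure. (No occurs check: sketch only.)"""
    c1, c2 = walk(c1, subst), walk(c2, subst)
    if c1 == c2:
        return subst
    if isinstance(c1, str) and c1.startswith("?"):
        return {**subst, c1: c2}
    if isinstance(c2, str) and c2.startswith("?"):
        return {**subst, c2: c1}
    if isinstance(c1, tuple) and isinstance(c2, tuple) and len(c1) == len(c2):
        for a, b in zip(c1, c2):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

def resolve(c, subst):
    """Apply the substitution throughout a category."""
    c = walk(c, subst)
    if isinstance(c, tuple):
        return tuple(resolve(x, subst) for x in c)
    return c

# Two hypothesized categories for one word: the bootstrapped entry with
# fresh variables, and a fully specified one; unification merges them.
cat1 = ("/", ("\\", "?x1", "s"), "?x2")
cat2 = ("/", ("\\", "np", "s"), "np")
s = unify(cat1, cat2, {})
print(resolve(cat1, s))  # ('/', ('\\', 'np', 's'), 'np')
```

    Which pairs of entries to attempt to unify, and how to score the resulting lexicons, is exactly where the two unification strategies considered in the text differ.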

    After these rigorous and explicit procedures have been developed, it is possible to mathematically analyze the learnability properties of various classes of TLS and TLT languages as definable with the help of the discovery procedures. In the end, it seems that TLT learning is the more suitable acquisition model, and we propose that human languages adhere to a restrictive property governing their syntactic and semantic structures (TLTs) which we term syntactic homogeneity. The syntactically homogeneous TLT languages form a new class of formal languages which intersects the Chomsky hierarchy in an interesting fashion: not all finite languages have this property, while many context-sensitive languages do. This encouraging feature of syntactic homogeneity aids the plausibility of the proposal that it really helps to define the nature of human languages, which the notorious classes of the Chomsky hierarchy do not really do. Moreover, our final learning procedure is biased in favor of such languages; in fact it can only learn syntactically homogeneous languages, and it learns them reliably and successfully, with mathematical precision.² This result leads us to propose that part of the innate human language faculty involves, as a complement to Universal Grammar, a biased learning procedure which assumes any learning sample to be a subset of a syntactically homogeneous language.

    As a warning to the hopeful reader it is only fair to state that not all of these procedures have yet been implemented as working software (though we prove that they could be, using computer science style algorithmic analysis). Part of the reason for this is that the theoretical results are the most important here, because we have undertaken a serious effort to model the complete acquisition of a language; this is not a machine learning project, in which the main goal is the practical success of the computer. Another reason is that the algorithms which have been implemented run extremely slowly because the problems being solved are intractable, indeed despicably so. Having proved the theoretical successes of our work, we leave it to future research to leap the implementational hurdles.

    1.1 Grammar of logic, logic of grammar

    Logic has come a long way in the twentieth century, and has expanded beyond the bounds of classical symbolic logic. Most science and linguistics students, other than logicians and some theoretical computer scientists, learn only about classical logic in some undergraduate class, that is if they learn anything about logic at all. In keeping with this, most non-logicians seem to think that logic is for modeling natural reasoning patterns. Indeed, that is what logic was invented to model, but there is considerably more to it nowadays. Logicians have worked to isolate the essence of logic, and it is now recognized that once logic is relieved of the burden of what it is supposed to be for, it is free to be useful for a variety of endeavors. The shape of the logic and its mathematical characteristics
