What are adpositional grammars?
Adpositional grammars (adgrams) are a novel grammar formalism that aims to give a general, cross-linguistic description of how human beings organise their linguistic mental spaces through the selection of one or another particular morphological and syntactic construction. Hence, adgrams deal with morphology, syntax and semantics alike (in this dissertation, I will not deal with phonology, as I will take only written texts as linguistic data). Adgrams are a highly lexicalised approach to natural language (NL) analysis. I say ‘adgrams’ instead of ‘adgram’ because each NL has its own autonomous adpositional grammar system. Here, NLs are considered synchronically, i.e., there is no treatment of how they develop and evolve.
The distinctive characteristic of adpositional grammars is that they are based on adpositions. For the moment, let us consider the term adposition merely as a hypernym of prepositions, postpositions and circumpositions, depending on the NL: for example, English and Italian have mainly prepositions, while Japanese and Turkish have mainly postpositions. Thus, unlike other grammar formalisms, which are based on a single NL and then adapted to others, adgrams are a cross-linguistic model from the beginning.
The linguistic part presented in this dissertation is derived from research in the field of adpositions carried out by Fabrizio A. Pennacchietti for over 30 years.5 His explorations dealt mainly with prepositions of Western and Semitic NLs, and were often published in journals for Semitists, which are difficult to access for the non-specialist.6 Moreover, the model was refined over the years without ever reaching a systematic presentation valid in general, i.e., for every NL. The present work aims to describe such a model in a rigorously formal way. Particular attention has been given to the definitions of the technical linguistic terms, as they are used here in a rather peculiar way. Readers proficient in linguistics might be annoyed by such precision, perhaps finding it pedantic; please be patient: this dissertation should be readable by computer scientists, who are not familiar with linguistic technicalities. The same advice applies to computer scientists when I introduce the computational formalisms: these may be elementary to computer people, but they cannot be taken for granted by linguists.
The dissertation is structured as follows. In the first part, the general model of adgrams is presented; in the second and third parts, a concrete instance of it is given. The first part is therefore more oriented towards the philosophical and linguistic aspects of adgrams, while afterwards the formal model is presented in a linguistic instance. While the general framework is derived from Pennacchietti’s work, the formal model is entirely mine, developed while working closely with Marco Benini. I have chosen the Esperanto language as the first instance of adgrams, for many reasons which I will discuss later. The second part also deals with a machine translation scenario, as I think that reliable machine translation is the best way to prove the formality and cross-linguistic validity of the underlying language models.
Pennacchietti has enriched and refined his model by extracting concepts and data from very different sources: his great merit has been the ability to adapt tools and structures from everywhere so as to build the adgram kernel, i.e., the analysis of each NL’s prepositional space (see below for details). I have grouped these sources into three major schools of linguistic research: (i) the Chomskyan school; (ii) the cognitive linguistics school; and (iii) various influences from structuralism and classic authors, in particular the school founded on Lucien Tesnière – the so-called dependency grammar school. This distinction does not mean that these schools have nothing in common; I have grouped the various authors into these broad categories because I think it is the best way to present the literature underlying adpositional grammars.7 The rest of this chapter is devoted to clarifying how these sources influence adgrams, before the adpositional grammar model itself is presented.
A final proviso is needed before going on: all linguistic data and formalisations are mine, and any mistake or error is mine alone. The ideas presented hereafter may or may not be shared by Pennacchietti, Benini, or anyone else.
Adpositional grammars are formal
Each adgram encapsulates the language model in a self-contained system, i.e., a device which defines the set of well-formed sentences which constitute the NL. Most probably, grammar becomes automatised in learning through the frequency of (un)successful use of linguistic patterns, so as to build syntagms and paradigms through contrastive collocation (see below for details about these terms); however, this dissertation does not deal with issues of NL acquisition, so NLs are fictitiously considered as already acquired. In this sense, adgrams pay tribute to Chomsky’s original purposes, as described in Syntactic Structures [Chomsky, 1957]: a language model can be formalised independently of the purposes for which speakers use NLs. Therefore, adgrams can be compared to other formal grammars, such as Combinatory Categorial Grammar (CCG), Tree-Adjoining Grammar (TAG) and Head-driven Phrase Structure Grammar (HPSG).8 Adgrams aim to specify with maximum precision the principles and rules which generate all and only the grammatical sentences of a NL. From a philosophical perspective, I agree with Chomsky’s belief that there exists an inbuilt structure in the mind which constrains linguistic variability. Adgrams aim precisely to describe this kind of constraint. An immediate corollary is that NLs are here neither approached as observed behaviours nor investigated in their sociolinguistic dimensions.9 Nevertheless, the rich linguistic data used in this dissertation is extracted from real corpora – in particular the multilingual parallel corpus of articles published in the international newspaper Le Monde Diplomatique.10
On the other hand, some assumptions of the Chomskyan tradition have no place in this approach. First, the primitive categories and functions normally used in the Chomskyan tradition, e.g., S, N, NP, PP, etc., are not considered valid in adgrams, as they have no immediate linguistic concretisation in terms of morphemes.11
Chomskyan linguistics has become more and more abstract over time, as the postulated entities and processes that constitute grammar trees have become more and more theoretical: for instance, the surface/deep level dichotomy, X-bar theory, etc., are complex structures very far from morphemic reality. In contrast, adgrams aim to adhere strictly to the morphological entities that are visible in linguistic productions. I will refer to this principle as the principle of linguistic adherence. Adgrams directly account for the logic underlying the production of morphemes in a sentence and their collocation (see below for details). As an immediate corollary, entities like ‘traces’ and ‘empty nodes’ are kept to a minimum: only well-known linguistic phenomena such as anaphora resolution or wh-movement are used. In NLs like English or Italian, these phenomena are marked by a different collocation pattern or by a specific morpheme devoted to marking the phenomenon itself. I am deeply convinced that linguists cannot claim that general rules govern linguistic structures while idiosyncratic and anomalous patterns are relegated to the periphery of the system as ‘exceptions’. Adgrams provide a highly general linguistic model which also gives a very precise account of nuances and ‘strange’ linguistic patterns. More specifically, adpositional grammar trees (adtrees) account for linguistic phenomena often relegated to the ‘periphery’ of language in approaches based on Chomskyan linguistics. Therefore, as the reader will see from chapter 2 onwards, adtrees are quite unlike any other linguistic tree published until now.
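To give a first programming-flavoured intuition of this claim – the actual definition comes in chapter 2, and every name and field below is an illustrative assumption of this sketch, not the formalism itself – one can imagine a tree type in which every node must carry a visible morpheme, so that traces and empty nodes are not even expressible:

    # A minimal sketch, not the adtree definition of chapter 2: every
    # node carries visible morphemic material, so empty nodes and traces
    # cannot be expressed, per the principle of linguistic adherence.
    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class Morpheme:
        form: str              # the visible form, e.g. 'rabbit', 'by', '-er'

    @dataclass(frozen=True)
    class AdTree:
        adposition: Morpheme   # the linking element at the hub
        governor: "Node"       # one branch
        dependent: "Node"      # the other branch

    Node = Union[Morpheme, AdTree]

    # 'shot by (the) farmer': 'by' links the event to a participant
    # (the article is dropped here only to keep the sketch short).
    example = AdTree(Morpheme("by"), Morpheme("shot"), Morpheme("farmer"))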
Adpositional grammars are computational
Since this is a computer science dissertation, and since it treats linguistic topics, it falls under the rubric of ‘computational linguistics’ and its subfield ‘natural language processing’ (NLP). I am deeply convinced that natural language engineering is, or should be, a testbed for general linguistic theory and formal NL descriptions – in brief, for language models. In my view, language models should have cross-linguistic validity and should always be tested with a formal model that can be run on a computer – perhaps the only testbed we have. It is important to note from the outset that the computational efficiency of the language model is not a theme of this dissertation. Cognitive linguists have a point when they claim that, for the most part, our linguistic production is not compositional, i.e., not the output of rule-based computation (this is called the rule/list fallacy).12 We probably store in our mind well-practised patterns of use at both the morphosyntactic and the semantic level. We learn to play an instrument or to drive in a similar way: initially we have to apply rules consciously, and our performance is controlled, slow, and full of errors. With practice, the sequence of applied rules becomes automatised, and performance becomes rapid and far less prone to errors. Something similar happens in learning NLs: we do not always need to apply fine-grained rules, as we automatise established patterns that we store and recall, which undoubtedly speeds up computation, i.e., increases efficiency. This means that the rules per se can be computationally inefficient. For the purposes of this dissertation, adgrams describe fine-grained rules as if there were no storage and recall of results. The goal here is to demonstrate that a concrete adgram, which is an instance of the language model, is formal, computable, and linguistically feasible.
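To illustrate the distinction in code – the function names and the placeholder analysis are mine, not part of the adgram formalism – storage and recall can be layered on top of the very same rules without changing them:

    # A minimal sketch of rules vs. stored patterns; only the caching
    # pattern matters here, the analysis itself is a placeholder.
    from functools import lru_cache

    def analyse(fragment: str) -> tuple:
        # Stand-in for expensive fine-grained rule application,
        # recomputed from scratch on every call: the stance taken
        # by the adgrams described in this dissertation.
        return tuple(fragment.split())

    @lru_cache(maxsize=None)
    def analyse_recalled(fragment: str) -> tuple:
        # The same rules, but results are stored and recalled like
        # automatised patterns: repeated use becomes fast even though
        # the rules per se stay computationally inefficient.
        return analyse(fragment)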
Another important limitation of the current model of adgrams concerns quotation and named-entity recognition, which I see as two faces of the same coin. Adgrams cannot “understand” sentences like Paris is a five letter word, nor correctly translate it into Italian as Parigi è una parola di sei lettere (in Italian ‘sei’ means ‘six’). Analogously, the English sentence Green Day don’t like Bush is completely out of scope, unless the named entities ‘Green Day’ and ‘Bush’ are correctly tagged a priori – e.g., which US president is being referred to? These kinds of sentences require a great deal of encyclopædic knowledge to be inserted into the machine, mostly a priori, and adgrams do not compute common-sense knowledge, but only linguistically centred knowledge (see later for details).
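The letter-counting case can be made concrete in two lines (a trivial illustration of mine): the predicate holds of the English name, not of the city, so a meaning-preserving translation has to change the number itself:

    # 'Paris is a five letter word' is about the name, not the city:
    print(len("Paris"))   # 5 -> the English sentence is true
    print(len("Parigi"))  # 6 -> Italian must say 'sei' (six) to stay true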
The formal model is modelled on a von Neumann machine with an intrinsically non-deterministic parser. Non-determinism is simulated by the backtracking primitive well known from Prolog, although the formalism is closer to the so-called logical frameworks, e.g., Isabelle [Paulson, 1990]. A fully developed abstract machine is built on top of the primitive abstract machine. The lexical analyser parses the input text and gives the appropriate adpositional tree(s) as output. See chapter 7 for details.
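As a rough illustration of how backtracking simulates non-determinism – this toy recogniser is mine and contains none of the adgram machinery – each rule below is a generator that yields every position at which it can succeed, and exhausting a generator amounts to backtracking, much as in Prolog:

    # Toy backtracking recogniser: generators simulate non-determinism.
    def word(options):
        def rule(tokens, i):
            if i < len(tokens) and tokens[i] in options:
                yield i + 1                       # one way to succeed
        return rule

    def seq(*rules):
        def rule(tokens, i):
            if not rules:
                yield i                           # nothing left to match
            else:
                for j in rules[0](tokens, i):     # try each alternative;
                    yield from seq(*rules[1:])(tokens, j)  # failure backtracks
        return rule

    phrase = seq(word({"the"}), word({"farmer", "rabbit"}))
    print(list(phrase("the rabbit".split(), 0)))  # [2]: one complete analysis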
Last but not least, adgrams are indebted to research carried out in the 1960s at the Centro di Cibernetica e di Attività Linguistiche (Centre of Cybernetics and Linguistic Activities) led by Silvio Ceccato, and in particular to the formalism called Correlational Grammar (see details in chapter 2).
Adpositional grammars are cognitive
The term ‘cognitive’ is not free from controversy in linguistics. If we take the philosophical position of nominalism in its extreme form, linguistic entities are merely a matter of linguistic convention: the set of real-world entities which may be called ‘dogs’, or the colour values that are described as ‘red’ in English, have nothing in common with their name. According to nominalism, there is no intersection between linguistics and cognitive science. Note that the original position of Ferdinand de Saussure, the father of modern linguistics and structuralism, is essentially nominalist: for him, the language system (langue) is a system of signs whose only meaning (sens) is their sound patterns (image acoustique).13 If the position ‘using a word appropriately is simply an internal linguistic fact’ were true, then Eliza, Weizenbaum’s famous computer program that mimics a natural-language dialogue between a Rogerian psychotherapist and a patient, would be as intelligent as a human being.14
At the other extreme, linguistic entities are merely instances of pre-existing categories like DOG and RED. In other words, categories exist independently of NLs and their users. This position is called realism. The fallacy of realism is revealed by the argument of the multiplicity of NLs: why should the hyperuranus (i.e., the Platonic place where real categories exist) be written in English and not in Italian, Chinese, or Tamil?15 Let me explain with a concrete example. The concept of BLUE can differ substantially if we change the NL in which we are describing it. For instance, in Romanian BLUE is albastru, derived from the Latin albus, ‘white’: the ancient Romans probably saw the colour of the sky as a kind of ‘dirty white’, as the Italian word celeste also suggests. The English word blue, in turn, is of Germanic origin.16 Hence, not only is it impossible to decide a priori which categories have the right to enter the hyperuranus, but it is also impossible to decide in which NL they would be formulated. Even if the inventory of phonemes of a given NL is limited and can be reasonably identified, the inventory of a priori semantic categories cannot.17
On the other hand, the very concept of a NL is a (reasonable) abstraction over the continuous variation of idiolects, i.e., the linguistic habits belonging to each person. This fact undermines the very foundations of radical nominalism. If each person had his or her own linguistic convention, without any further level of abstraction, it would be very difficult to explain how we can (mis)understand one another in a given NL. Things get even worse when we accept the argument of the multiplicity of natural languages: translation becomes a priori impossible. Of course, these problems vanish if the aim of linguistics is all and only to describe each language system as a monad, i.e., as a unique entity which has no perception of the real or even the mental world, as Ferdinand de Saussure seemed to believe, according to his famous Cours.18 Following Saussure, thought is inherently shapeless and so there is no pre-linguistic concept. In its extreme form, nominalism transforms itself into relativism, a position ascribed to Edward Sapir and Benjamin Whorf: concepts are entirely determined by the NL in use, and therefore looking for universal aspects in linguistics becomes nonsensical.
There is a third way, beyond nominalism and realism, namely conceptualism. Cognitive linguistics takes as its starting point the philosophical position of conceptualism: between real-world entities (referents) and linguistic entities (sentences), there is a set of intermediate entities which reside in our mind (concepts). Conceptualism has a number of advantages, and that is why I take this philosophical position for adgrams. First, it solves the problem of entities which have a linguistic existence but not a real one, such as the mythical chimera: chimeras simply exist as concepts without having a clear referent in the real world. Second, this position lets people share concepts: where there is an agreement about the referents, i.e., a linguistic convention, people can use the same linguistic entity in order to indicate the same concept. Therefore, it makes sense to speak of ‘cognitive linguistics’ as the discipline which analyses the relation between linguistic entities and concepts.
Adgrams are cognitive in the sense that they are indebted to some results of cognitive linguistics, which for the last 25 years has been an alternative to the Chomskyan approach.19 It aims at a cognitively plausible account of what it means to know a NL, of how NLs are acquired, and of how they are used. Cognitive linguists are careful to use sound linguistic data to support their theories, and that is why I take most examples from corpora of language-in-use, following the principle of linguistic adherence (see the previous section).
Unlike Chomsky’s hypothesis that language is an innate cognitive faculty – known as the Language Acquisition Device (LAD) – cognitive linguists take as their starting point the premise that language is not an autonomous cognitive faculty and consequently is not separate from non-linguistic cognitive abilities. To put it differently, the representation of linguistic knowledge should be similar to the representation of other conceptual knowledge. This does not mean that a unique configuration of cognitive abilities devoted to language does not exist; it means that the language ability requires cognitive components that are shared with other cognitive abilities. In other words, an autonomous level of organisation does not necessarily entail modularity.20 NLs, cognitive linguists say, are driven by established facts about human cognition, not by the internal logic of a theory. Consequently, grammar is considered a cognitive ability in all its parts – i.e., phonology, morphology, syntax, semantics and pragmatics.
This holistic vision of cognition implies that cognitive linguistics takes concepts and models from other disciplines – like psychology – to describe even the smallest and subtlest differences between sentences. In the words of Taylor [2002, 11]:
the very wording that we choose in order to linguistically encode a
situation rests on the manner in which the situation has been mentally
construed.
I will refer to this principle as the grammar as conceptualisation principle.21 As I understand the literature, cognitive grammar focuses on the following main areas: (i) categorisation; (ii) mental imagery and construal; (iii) metaphor; and (iv) inferencing and automatisation.22 In the field of categorisation, a special place is given to figure-ground organisation, whose prototype is visual perception.23 Figure-ground organisation inherits interesting characteristics from perception. In fact, perception is highly related to attention (e.g., you can focus your sight on this very word and, at the same time, attend to the periphery of your vision). Furthermore, figure-ground organisation can be reversed (e.g., you can look at this page as a complexly shaped white figure which obscures a black background). Finally, there are several levels of figure-ground organisation (e.g., if you are reading a paper version of this dissertation, your primary figure is the sequence of black letters while your primary background may be the desk you are sitting at; but there will also be a secondary figure against the wider background of the room you are sitting in).
While cognitive linguists have used the figure-ground organisation principle for semantics, adgrams use it to construct adtrees, i.e., for morphology and syntax, which I treat as two sides of the same coin (see chapter 2 for details). Therefore, the scene that is mentally construed by the speaker is encoded linguistically not only through lexical choices but also through the morphosyntactic resources employed. Let me explain with a couple of examples:24
• (1a.) The farmer shot the rabbit.
• (1b.) The rabbit was shot by the farmer.
• (2a.) The roof slopes gently downwards.
• (2b.) The roof slopes gently upwards.
In example 1 the different figure-ground organisation is due to morphosyntax, while in example 2 it is due to lexical choices. In fact, 1a presents the scene with the farmer as the figure and the rabbit as the ground of the event shot, while in 1b the scene is reversed: the rabbit is the figure, and by the farmer is merely a circumstance of the shot event, acting as ground. It is important to note at this point that the figure does not always coincide with the syntactic subject. In contrast, the difference presented in example 2 concerns how the speaker perceives the roof: in 2a it is mentally viewed from above, while in 2b it is mentally viewed from below. Adgrams give no cues about pragmatic and semantic aspects like those in example 2, but only about morphosyntactic phenomena like those in example 1.
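In data terms, the contrast in example 1 might be rendered as follows – a hypothetical encoding of mine, meant only to show that the lexical material stays constant while the morphosyntax reverses the figure-ground alignment:

    # Same lexical content, opposite figure-ground alignment (example 1).
    active = {   # 1a. 'The farmer shot the rabbit.'
        "event": "shoot",
        "figure": "farmer",   # foregrounded participant
        "ground": "rabbit",
    }
    passive = {  # 1b. 'The rabbit was shot by the farmer.'
        "event": "shoot",
        "figure": "rabbit",   # passive morphosyntax reverses the roles:
        "ground": "farmer",   # 'by the farmer' is a mere circumstance
    }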
If Chomskyan grammars were attacked by cognitive linguists as ‘syntactocentric’, adgrams can analogously be attacked as ‘morphocentric’, i.e., they give a central place to morphology. In fact, I argue that morphosyntax is also a key to semantics. Therefore, adgrams follow a semasiological perspective in the language-world approach: it goes from the language model to the world, asking ‘For this expression, what kinds of situations can be appropriately designated by it?’25
Table 1.1: Macro-levels of analysis in adgrams (specimens)

pragmatics            semantics            morphosyntax
speech act analysis   semantic relations   adpositional spaces
actant analysis       semantic features    adtrees
cleft sentences       lexical values
Of course, linguistic meaning cannot be reduced to a fully compositional model, as it involves not only the construal, i.e., the process whereby concepts are structured in a given linguistic expression, but also the content evoked. In other words, a linguistic expression can evoke the same content and nonetheless differ in meaning because it is construed in a different way. What I argue is that the construal is compositional, while content is encyclopædic in scope. Adtrees account for the choice of construals in terms of trajector/landmark alignment (see chapter 2). It is fair to say that adgrams are meaningful primarily in terms of construals: semantic structure is an inherent part of adgrams.
I have just used the term ‘concept’. With respect to concepts, adgrams borrow another expression from cognitive linguistics. According to cognitive grammar, each sentence forms a conceptual space.26 A conceptual space is formed by linguistic elements which are instances of concepts, together with their relations. The main point is that understanding is prior to judgements of semantic relations – such as synonymy, hyponymy, antonymy – and to pragmatic relations analysed in terms of speech acts, an analysis which involves implications and presuppositions, e.g., which mental states are entailed for each actant. To avoid confusion, I adopt the term ‘actant’, from Tesnière, to indicate the semantic roles in each phrase, while the term ‘participant’ simply indicates the two leaves of a minimal adtree (see chapter 2 for details).27 Semantic and pragmatic relations form content, while the conceptual space is construed (see Table 1.1). Each adtree aims to describe a conceptual space: content is included in the lexicon, i.e., in the leaves of the adtree (see chapter 3 for the treatment of the lexicon).
My understanding of the literature in cognitive linguistics is that semantics drives morphology and syntax, which are considered merely parts of it.28 In my view, the main limitation of the current cognitive linguistics approach is its premature and prejudicial refusal to apply a formal methodology to linguistic analysis. Most of this literature is not precise enough to be used in a language understanding or generation system. At the same time, cognitive linguists use conceptual spaces as starting points to describe semantics in terms of encyclopædic knowledge, an approach which raises many unsolved problems.29 In contrast, adgrams are driven by visible linguistic items – the morphemes – and they strictly follow the conceptual space described by each sentence, in keeping with the principle of linguistic adherence. Consequently, semantics is derived from the treatment of morphemes in a strictly formal way.
Adpositional grammars are structural
The main problem with conceptualism is the mapping between referents, concepts and linguistic entities. Do linguistic entities reflect some properties of the referents (mild realism), or do they simply reflect concepts in the mind (mild nominalism)? It seems quite odd to visualise concepts as ‘pictures in the head’, since it is impossible to depict the prototypical concept of TREE, not to mention the representation of concepts like LOVE or GOOD (that is why the allegories of figurative art are so interesting). We can only visualise instances of concepts, not their “real” properties. Therefore, we should consider concepts from a purely functional point of view: a concept is a principle of (flexible) categorisation, by means of which human beings can draw inferences. Following this line of reasoning, each morpheme – i.e., each linguistic element with a visible representation which is both atomic and meaningful – is a concept.30 In 1b the concepts involved are: rabbit, farm, shot, was, by, the, -er. At first glance, it may sound strange that a derivational morpheme like -er is treated as a concept. Let me explain through some more instances of example 1:
• (1a.) The farmer shot the rabbit.
• (1b.) The rabbit was shot by the farmer.
• (1c.) ?The farm shot the rabbit.
• (1d.) The farmers shot the rabbit.
The four sentences depict four different conceptual spaces. In example 1c, the farm is a very unlikely one, equipped with guns and maybe an AI: the model of the world behind such a sentence could be a science-fiction novel! What is important is that there is no prior assumption about the model of the world: it is taken as axiomatic in this dissertation that it is always possible to find an appropriate world of reference if the sentence is morphosyntactically sound, i.e., I am agnostic about world models.31 Following Chomsky, if the sentence is morphosyntactically sound, it is well-formed, as in the famous example colorless green ideas sleep furiously.32 In example 1d, there is more than one farmer shooting at the rabbit. The presence or absence of -er shows that the scene is construed in a totally different way; analogously, adding the morpheme -s changes the meaning of the sentence yet again.
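To make the morpheme-as-concept reading tangible, here is a toy segmenter of my own devising – the lexicon fragment is invented, and the actual treatment of the lexicon comes in chapter 3 – in which each visible morpheme returned counts as one concept:

    # Toy segmentation: every visible morpheme is one concept.
    ROOTS = {"farm", "rabbit", "shot", "was", "by", "the"}
    SUFFIXES = ("-s", "-er")        # outermost suffix first

    def segment(word):
        """Strip known suffixes right-to-left until a known root remains."""
        suffixes = []
        while word not in ROOTS:
            for suf in SUFFIXES:
                bare = suf.lstrip("-")
                if word.endswith(bare):
                    suffixes.insert(0, suf)
                    word = word[: -len(bare)]
                    break
            else:
                return [word]       # unknown word: give up, keep it whole
        return [word] + suffixes

    print(segment("farmer"))    # ['farm', '-er']        -> as in 1a/1b
    print(segment("farmers"))   # ['farm', '-er', '-s']  -> as in 1d
    print(segment("farm"))      # ['farm']               -> the odd 1c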