1.1 Introductory Remarks
The rise of definite and indefinite articles continues to puzzle historical linguists, despite a vast and growing body of literature on the subject. Interest in the grammaticalization of (in)definiteness is fueled partly by interest in the category of definiteness itself, and partly by the relatively large number of sources: a number of languages have developed articles in their literate histories, leaving us with corpora of texts in which the process may be observed. In this sense the morphologization of (in)definiteness, by which we understand the rise of its morphological exponents in the form of articles (definite, indefinite, or both), differs from the morphologization of case, which seems to predate extant sources, making its study a reconstruction rather than an analysis of data.
In this respect, the northern branch of the Germanic languages, today comprising Danish, Faroese, Icelandic, Norwegian and Swedish, presents a promising field of study. These closely related languages, with a common ancestor—Old Nordic—located in the not-too-distant past, at ca. 500–800 AD (Bandle et al. 2002), have all developed the definite article, and, with the exception of Icelandic, they have all developed the indefinite article too. However, the grammaticalization processes belong only partly to their common history; most have taken place in their individual histories, leading to slightly different scopes of use of the articles in each language, and different patterns of noun phrases, both definite and indefinite. The extant texts, although limited in size and stylistic variation to begin with, are of good enough quality to make the diachronic study of (in)definiteness possible, at least for Danish and Swedish (the eastern division) and for Icelandic (the western division). Not only do the texts exhibit the gradual rise of articles, they also illustrate the common elements of the process as well as its more isolated aspects, i.e., those limited to one language only.
Grammaticalization of (in)definiteness has been the subject of a number of studies, both theoretical and empirical, concerned with one language or language family. A fairly universal model of the grammaticalization of the indefinite article was proposed by Heine in 1997 (based to some extent on Givón 1981; see also Herslund 2012). The model was tested and confirmed against data from a number of languages (among them Danish, Swedish and Spanish; Pozas-Loyo 2010). As regards the definite article, some proposals have been put forward (beginning with Greenberg 1978), though so far none has been fully successful in identifying the transition from one stage of grammaticalization of the article to the next. The questions that remain to be answered are:
- What makes the numeral ‘one’ and the deictic ‘that’ (or its like) such good candidates for the role of articles? In other words, why, in spite of there being other potential articles in the making, do these two universally form the first stage of grammaticalization of (in)definiteness? We will refer to this question as ‘the puzzle of uniformity of sources’.
- What makes these forms so successful, even when, at least to begin with, they are in competition with other forms, such as possessive or indefinite pronouns? In other words, how does the competition against other potential candidates unfold? We will refer to this question as ‘the puzzle of success’.
- Why does the order of events (grammaticalizations) seem to be universally ‘definite first, indefinite second’? The studies of grammaticalization of (in)definiteness are for the most part concerned with either the rise of the definite or the rise of the indefinite article; however, the two developments are undoubtedly connected. In what ways does the rise of the definite article prepare the way for the rise of the indefinite article? But also, why is it not necessary for a language to develop both articles, as is the case with Icelandic? We will refer to this question as ‘the puzzle of definite first’.
- What is the diachronic bridge between the use of the definite article with anaphora and its use with unique referents (in more recent terms: between strong and weak definite articles)? Why can a demonstrative replace the definite article in the former context but not in the latter? And how are we to treat a context that is neither strong nor weak, i.e., indirect anaphora? This context itself is understudied diachronically, to say the least, and yet it seems to be crucial in the grammaticalization of the definite article. We will refer to this question as ‘the puzzle of indirect anaphora’.
This list of puzzles is by no means exhaustive, and each would require a full-scale research project, in particular one taking into consideration languages from different language families. In the present study, we will focus in particular on the last puzzle, basing our study on a corpus of texts written in Danish, Icelandic and Swedish between 1200 and 1550. By limiting the scope of the study to one aspect of grammaticalization, we aim to shed more light on the process and somewhat elaborate on the model of grammaticalization. By focusing on a group of closely related languages, we aim to demonstrate both the universality and the individuality of the process.
1.2 Definiteness in the Modern North Germanic
Modern North Germanic languages include the so-called Continental or Mainland languages: Danish, Norwegian (in two official varieties, bokmål and nynorsk) and Swedish; and the so-called Insular languages: Faroese and Icelandic. All of these languages have developed the definite article; all but Icelandic have also developed the indefinite. Articles have evolved in all of the other modern Germanic languages as well: English, German, Dutch and Frisian (see McColl Millar 2000 on the development of the definite article in English). Although the developments are common, they do not belong to the Proto-Germanic period, since no articles are attested in the most conservative texts, i.e., Runic inscriptions from before 800 AD. The now extinct Gothic exhibits some advancement in the formation of at least the definite article (Kovari 1984). It is, however, a matter of some debate to what extent this was the influence of the Greek original from which the only extant text, the New Testament, was translated (Askedal 2012). The northern branch of the Germanic language family has clearly developed differently from the rest, as the definite article in these languages is a suffix, while in other Germanic languages it is a free form, including the semi-article in Gothic.
The fundamental opposition in the article systems in North Germanic is one between definite and indefinite (as opposed to other potential articles, such as the partitive in French). The articles are either free lexemes (the indefinite and the preposed definite articles) or suffixes (the postposed definite article). The articles are inflected for number, gender and, in the Insular languages, case (Delsing, Vangsnes and Holmberg 2003).
The morphologization of (in)definiteness, i.e., the formation of morphological exponents in the form of articles, is typically a lengthy process with multiple stages, and this is also the case in North Germanic. The definite article seems to develop before the indefinite, and the preposed definite develops at a later stage than the postposed. It takes some time before the noun phrase acquires its modern form (slightly different in each language, Delsing 1993, Julien 2005). Also, the status of bare nouns (BNs) changes from being a marked alternative to an incipient definite article to being a marked alternative to a DP. The fact that the morphologization of (in)definiteness belongs at least partly to the individual history of each language is reflected in the variation of the NP structure, the functions of articles and the status of BNs among the languages.
In the modern North Germanic languages, the definite article is a suffix, always attached enclitically to the noun (in the Insular Scandinavian languages Icelandic and Faroese, to the case-inflected form of the noun). Its origins are to be sought in the distal demonstrative hinn ‘yon’. Apart from the postpositional article, in the Mainland Scandinavian languages (Danish, Norwegian, Swedish) and Faroese there is a preposed article den ‘that’, which can co-occur with the postpositional one (so-called double definiteness in Faroese, Norwegian and Swedish, see Börjars 1994 and LaCara 2011) or be in complementary distribution with it (Danish). Also, in all North Germanic languages apart from Icelandic, there is an indefinite article, related to the numeral en (Danish, Swedish, Norwegian bokmål), ein (Faroese, Norwegian nynorsk) or einn (Icelandic) ‘one’, etymologically PGmc (and early Proto-Nordic) *ainaz. An overview of the forms of NPs in North Germanic is given in Table 1.
Table 1
An overview of NPs in North Germanic
| Context | Article | Form | Languages |
|---|---|---|---|
| Single noun NP ‘the house’ | Definite article postposed | hus-et (house-def) | Danish, Norwegian, Swedish |
| | | hús-ið (house-def) | Faroese, Icelandic |
| NP with an adjective and noun ‘the big house’ | Definite article preposed only | det stor-e hus (def big-wk house) | Danish |
| | Double definiteness (preposed and postposed definite articles) | det stor-e hus-et (def big-wk house-def) | Norwegian |
| | | det stora hus-et (def big-wk house-def) | Swedish |
| | | hið/tað stór-a hús-ið (def big-wk house-def) | Faroese |
| | Definite article postposed only | stór-a hús-ið (big-wk house-def) | Icelandic |
| Single noun NP ‘a house’ | Indefinite article | et hus (indf house) | Danish, Norwegian |
| | | ett hus (indf house) | Swedish |
| | | eitt hús (indf house) | Faroese |
| | No indefinite article | hús (house) | Icelandic |
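The distribution of definite marking in NPs with an adjectival modifier, as summarized in Table 1, can be restated as a small lookup. The following is only an illustrative sketch: the names `Strategy`, `MODIFIED_DEFINITE` and `gloss_big_house` are hypothetical and introduced here for exposition, and the function returns abstract glosses rather than actual word forms.

```python
from enum import Enum


class Strategy(Enum):
    """Definite marking in NPs with an adjectival modifier (labels follow Table 1)."""
    PREPOSED_ONLY = "definite article preposed only"
    DOUBLE = "double definiteness"
    POSTPOSED_ONLY = "definite article postposed only"


# Language-to-strategy mapping, read off Table 1 (hypothetical encoding).
MODIFIED_DEFINITE = {
    "Danish": Strategy.PREPOSED_ONLY,
    "Norwegian": Strategy.DOUBLE,
    "Swedish": Strategy.DOUBLE,
    "Faroese": Strategy.DOUBLE,
    "Icelandic": Strategy.POSTPOSED_ONLY,
}


def gloss_big_house(language: str) -> str:
    """Return the gloss line for 'the big house' in the given language."""
    strategy = MODIFIED_DEFINITE[language]
    parts = []
    # A preposed article appears unless the language marks definiteness only on the noun.
    if strategy is not Strategy.POSTPOSED_ONLY:
        parts.append("def")
    parts.append("big-wk")
    # The suffixed article appears unless the language uses the preposed article alone.
    parts.append("house" if strategy is Strategy.PREPOSED_ONLY else "house-def")
    return " ".join(parts)


for language in MODIFIED_DEFINITE:
    print(f"{language}: {gloss_big_house(language)}")
```

The sketch covers only the ‘the big house’ rows of the table; single-noun NPs take the postposed article in all five languages and need no such branching.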
1.2.1 The Definite Article in Modern North Germanic Languages
1.2.1.1 Insular Languages
Icelandic and Faroese, the Insular North Germanic languages, have both developed the postposed definite article. In both languages, but in particular in Icelandic, the morphology of the article reflects a relatively conservative stage of development (Askedal 2012:73), as the article is inflected for case, independent of the case morphology of the noun. Consider the following examples from Icelandic (Table 2).
Table 2 presents the inflectional paradigms of three Icelandic nouns: hestur ‘horse’ (masculine), bók ‘book’ (feminine) and barn ‘child’ (neuter). The second, fourth and sixth columns show the bare nouns inflected for case and number, the third, fifth and seventh the nouns in the definite, also inflected for case and number. The hyphens mark morph boundaries. In the majority of the forms, the separate case inflection of the noun and the definite suffix is still clearly visible.
One form that is not as easy to segment as the others is the plural dative ending in -unum in all genders. Originally, the form consisted of the segment -uminum, with two dative endings -um, one of the noun, one of the article. This form has been reduced in both western and eastern branches of North Germanic: in the west to -unum as found in Icelandic, in the east to -omen as found in Old Danish and Old Swedish (in both languages the category of case was later reduced, and with it the plural definite dative).
The Faroese definite forms, as presented in Table 3, are similar in construction to the Icelandic ones, with double case inflection within one form. The difference is that, due to a more advanced process of case reduction, the genitive has become almost obsolete in modern Faroese, rendering the genitive definite forms relic forms rather than productive ones (Askedal 2012:76).
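The segmentation convention used in Tables 2 and 3, and the reduction of the older dative plural ending described above, can be made concrete with a short sketch. The helper names (`morphs`, `surface`, `reduce_dative_plural`) are hypothetical; the segmented forms are those listed for hestur ‘horse’ in Table 2, and the reduction is modelled as a plain substitution of endings, which is a simplification and not a claim about the actual sound changes involved.

```python
# Segmented definite singular forms of Icelandic hestur 'horse', as given in Table 2.
HESTUR_DEFINITE_SG = {
    "NOM": "hest-ur-in-n",
    "ACC": "hest-in-n",
    "DAT": "hest-i-n-um",
    "GEN": "hest-s-in-s",
}

# Outcomes of the older dative plural ending -uminum, as stated in the text:
# -unum in the western branch, -omen in the eastern branch.
DATIVE_PLURAL_OUTCOME = {"west": "unum", "east": "omen"}


def morphs(segmented: str) -> list:
    """Split a hyphen-segmented form into its morphs."""
    return segmented.split("-")


def surface(segmented: str) -> str:
    """Remove the morph boundaries to obtain the written form."""
    return segmented.replace("-", "")


def reduce_dative_plural(ending: str, branch: str) -> str:
    """Map the older ending -uminum to its reduced outcome in the given branch."""
    if ending != "uminum":
        raise ValueError("only the dative plural ending -uminum is modelled here")
    return DATIVE_PLURAL_OUTCOME[branch]


for case, form in HESTUR_DEFINITE_SG.items():
    print(case, surface(form), morphs(form))
```

In the genitive, for instance, the four morphs of hest-s-in-s show the noun's own ending -s followed by the article -in- with a second genitive ending, which is the double inflection referred to above.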
Table 2
Icelandic case inflection in bare nouns and definite nouns
| Case | hestur ‘horse’ | | bók ‘book’ | | barn ‘child’ | |
|---|---|---|---|---|---|---|
| SG | BN | definite | BN | definite | BN | definite |
| NOM | hest-ur | hest-ur-in-n | bók | bók-in | barn | barn-ið |
| ACC | hest | hest-in-n | bók | bók-in-a | barn | barn-ið |
| DAT | hest-i | hest-i-n-um | bók | bók-in-ni | barn-i | barn-i-n-u |
| GEN | hest-s | hest-s-in-s | bók-ar | bók-ar-in-nar | barn-s | barn-s-in-s |
| PL | BN | definite | BN | definite | BN | definite |
| NOM | hest-ar | hest-ar-n-ir | bæk-ur | bæk-ur-n-ar | börn | börn-in |
| ACC | hest-a | hest-a-n-a | bæk-ur | bæk-ur-n-ar | börn | börn-in |
| DAT | hest-um | hest-u-n-um | bók-um | bók-u-n-um | börn-um | börn-u-n-um |
| GEN | hest-a | hest-a-n-na | bók-a | bók-a-n-na | barn-a | barn-a-n-na |
Table 3
Faroese case inflection in bare nouns and definite nouns
| Case | armur ‘arm’ | | hurð ‘door’ | | barn ‘child’ | |
|---|---|---|---|---|---|---|
| SG | BN | definite | BN | definite | BN | definite |
| NOM | arm-ur | arm-ur-in | hurð | hurð-in | barn | barn-ið |
| ACC | arm | arm-in | hurð | hurð-ina | barn | barn-ið |
| DAT | arm-i | arm-i-n-um | hurð | hurð-ini | barni | barn-i-n-um |
| GEN | arm-s | arm-s-in-s* | hurðar | hurð-ar-in-nar | barns | barn-s-in-s |
| PL | BN | definite | BN | definite | BN | definite |
| NOM | arm-ar | arm-ar-nir | hurðar | hurð-ar-na-r | børn | børn-in-i |
| ACC | arm-ar | arm-ar-nar | hurðar | hurð-ar-na-r | børn | børn-in-i |
| DAT | ørm-um | ørm-u-n-um | hurðum | hurð-u-n-um | børnum | børn-u-n-um |
| GEN | arm-a | arm-a-n-na | hurða | hurð-a-n-na | barna | barn-a-n-na |
* Thráinsson et al. note that forms of this kind may also be simplified to arm-ins, marking the genitive only on the article, but they are very rarely used anyway (2012:94).
The two definite articles, the pre- and the postposed, are in complementary distribution in NPs with adjectival modifiers in Icelandic, i.e., they do not occur within one NP (examples in 1). The adjective in the definite NP is usually in the so-called weak inflection, although a combination of strong adjective and definite noun within one NP is still possible (example 2).
Icelandic (Askedal 2012:74)
(1) a. fljót-i   hestur-inn
       quick-wk  horse-def
       ‘the quick horse’
    b. hinn  fljót-i   hestur   (considered more formal)
       def   quick-wk  horse
       ‘the quick horse’
Icelandic (Thráinsson 2007:3)
(2) Rautt/??rauð-a   nef-ið    á   honum  glóði   í   myrkr-inu.
    red.st/??red-wk  nose-def  on  him    glowed  in  dark-def
    ‘His red nose glowed in the dark.’
The weak form of the adjective is unacceptable in (2) because it would imply that the person had more than one nose and that it was the red one that glowed in the dark, or at least that there is a contrast between the red nose and some other nose. As the strong form does not imply the existence of such a contrast, it is preferred in this context. In Faroese, on the other hand, the two definite articles occur within one NP if the NP includes an adjectival modifier, a phenomenon known as double definiteness, as in (3).
Faroese
(3) tann  reyð-a  bil-in
    def   red-wk  car-def
    ‘the red car’
1.2.1.2 Continental Languages
The article morphology of the Continental languages is simplified in comparison to the Insular languages. Case was lost by the 15th century (earlier in Danish, later in Swedish and Norwegian), which resulted in the loss of the so-called double inflection of the noun and the cliticized article, which we find in modern Insular languages. Also, the number of genders has been reduced from three to two in Danish, Swedish and some varieties of Norwegian (while other varieties retain the three-fold gender). Remnants of the older stage of language development can be found in fairly infrequent fossilized phrases such as havsens bunn ‘the bottom of the sea’ (Norwegian) or livsens rot ‘the root of life’ (Swedish).
As mentioned in section 1.2, definiteness in modern North Germanic languages can be realized both pre-nominally (with a definite article) and post-nominally (with a definite suffix on the head noun). All of the Continental languages have developed both pre- and postposed definite articles. The preposed article is used with adjectival modifiers. Languages differ in whether they allow the two articles to co-occur within one definite NP: Norwegian, Swedish and Faroese allow the so-called overbestemdhet ‘over-definiteness’ with both articles present, as in example (4), while in Danish the two articles are in complementary distribution, as in (5). Icelandic has not really developed the preposed definite article. The construction with preposed hinn is considered literary and stylistically marked (e.g., Sigurðsson 2006). However, recent research suggests that hinn has largely the same syntactic status in Modern Icelandic as Swedish den, in spite of different etymologies and developments (Pfaff 2015, 2017, Harðarson 2017).
Swedish
(4) den  vackr-a       dag-en
    def  beautiful-wk  day-def
    ‘the beautiful day’
Danish
(5) a. den  smukk-e       dag
       def  beautiful-wk  day
       ‘the beautiful day’
    b. *den  smukk-e       dag-en
       *def  beautiful-wk  day-def
       ‘the beautiful day’
For authors who consider both the pre- and the post-nominal determiners to be definite articles, the North Germanic languages become a natural object of study in terms of the weak–strong dichotomy in the semantics of definites. In this spirit, Ingason (2016a, 2016b) discusses Icelandic and Goodwin Davies (2016) discusses Swedish.
1.2.2 The Indefinite Article in Modern North Germanic Languages
1.2.2.1 Insular Languages
Among the five modern North Germanic languages, only Icelandic has not developed the indefinite article. There seem to have been some tendencies towards the grammaticalization of the numeral ‘one’ into an indefinite article in the 16th century (Kliś 2019). Why this development never truly became established, or whether it was perhaps suppressed, remains an open question and makes for a potentially fascinating field of study.
Faroese, on the other hand, has a fully formed indefinite article, just like the other Germanic languages (example 6). In contrast to the indefinite article in other Germanic languages, the Faroese indefinite has a plural form as well as a singular, as in (7), although its presence is limited to plurals denoting items typically found in pairs, such as shoes (Askedal 2012:75).¹ Otherwise, the plural indefinite NP is bare or includes the equivalent of nakrir ‘some’ (plural of nakar), as in (8).
Faroese
(6) ein   reyð-ur         hest-ur
    indf  red-nom.sg.st   horse-nom.sg
    ‘a red horse’
Faroese
(7) ein-ir      skógv-ar
    one-nom.pl  shoe-nom.pl
    ‘a pair of shoes’
Faroese (Thráinsson et al. 2012:132)
(8) Hann  var  heima  nakrar         dag-ar.
    he    was  home   some.acc.pl.m  day-acc.pl.m
    ‘He was at home for a few days.’
1.2.2.2 Continental Languages
All of the Continental languages have grammaticalized indefinite articles, etymologically continuations of the numeral ‘one’. The indefinite article is a free preposed lexeme. In indefinite NPs it is combined with the strong form of the adjective if an adjectival modifier is present, as in (9).
Swedish
(9) en    snabb    bil
    indf  fast.st  car
    ‘a fast car’
Unlike in Faroese, the article has only a singular form. As in Faroese, plural indefinite NPs are either bare or include indefinite pronouns: Swedish några, Danish nogle, Norwegian noen, all meaning ‘some’.
1.3 Aims, Scope and Organization of the Book
The book is organized as follows: in Chapter 2 we present current views on the grammaticalization of (in)definiteness and the models of grammaticalization proposed in the literature, with particular reference to studies of the rise of articles in North Germanic languages. In Chapter 3 we present the selection of texts used in the study and the tool used to annotate the texts, DiaDef. Chapter 4 presents data harvested from the corpus and its statistical analysis. In Chapter 5 we investigate the context of indirect anaphora in a diachronic perspective, presenting a revised model of the grammaticalization of the definite article. Closing remarks are given in Chapter 6.
The analyses presented in Chapters 4 and 5 are methodologically quite different. We have chosen to include both analyses in our study, since each has merits of its own. Statistical analysis allows us a more global view of the process of grammaticalization, making use of the fact that many sources have been digitized in recent years. An in-depth analysis of longer text passages, on the other hand, offers a chance to study each form in a broader context. For reasons of practicality, such analysis can be conducted only on a limited text sample; a statistical analysis of a larger database therefore makes the results more substantial.
A qualitative approach is and has long been a desideratum in diachronic linguistics. It is indispensable when dealing with limited data or when consideration of a larger text fragment is necessary to make sense of the data. However, the qualitative approach is perhaps at its best when dealing with clear cases—well-formed and well-defined categories. Jenset and McGillivray (2017) cite the case of English in and at, which can clearly function as prepositions, as opposed to concerning, regarding, following and given, which may be treated as prepositions in certain contexts or under certain circumstances, but hardly in the majority of cases (Jenset and McGillivray 2017:4). To deal with less prototypical cases, a probabilistic model based on quantitative rather than qualitative studies may be applied. Moreover, in the case of a category under development, in this instance (in)definiteness, such a model lets us escape the clear-cut dichotomy of demonstrative vs. definite article or numeral vs. indefinite article, allowing us to reveal the degree of ‘articleness’ of a given form rather than imposing a final taxonomy upon it. Our goal has been to study the grammaticalization of definite and indefinite articles by means of two different types of models: quantitative and qualitative (a ‘model parallelization’; Zuidema and de Boer 2014) with the aim of gaining new, and richer, insights into the morphologization of the category of (in)definiteness.
The results from both studies focus on the factors favouring the use of the (incipient) articles. These include types of nouns (such as countable vs. mass, singular vs. plural), their functions (subject, object, etc.), and the type of reference: i.e., whether the noun is marked as new (indefinite) or known (definite) because the discourse referent is known from the previous text, or rather because it is assumed to be universally known. We will specifically use the following terms throughout the book: direct anaphora to denote instances when the definite repeats a discourse referent introduced earlier, indirect anaphora for instances when the definite is a new discourse referent grounded in previous discourse, unique reference for instances when the discourse status of the referent is neutral, since the referent is known to both speaker and hearer based on their general knowledge (‘larger situation use’; Hawkins 1978), and finally, generic reference to denote instances of article use with reference to kinds and not individuals.
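The four reference types defined above can be summarized as a small decision procedure. This is a minimal sketch for orientation only: the class `Reference`, the function `classify` and its four boolean parameters are hypothetical names introduced here, the order in which the tests are applied is an assumption, and none of this is meant to represent the annotation scheme of DiaDef.

```python
from enum import Enum
from typing import Optional


class Reference(Enum):
    DIRECT_ANAPHORA = "direct anaphora"
    INDIRECT_ANAPHORA = "indirect anaphora"
    UNIQUE = "unique reference"
    GENERIC = "generic reference"


def classify(refers_to_kind: bool,
             introduced_earlier: bool,
             grounded_in_discourse: bool,
             generally_known: bool) -> Optional[Reference]:
    """Assign one of the four reference types to a definite NP.

    The precedence of the tests is an assumption made for this sketch.
    """
    if refers_to_kind:
        # Reference to kinds and not individuals.
        return Reference.GENERIC
    if introduced_earlier:
        # The definite repeats a discourse referent introduced earlier.
        return Reference.DIRECT_ANAPHORA
    if grounded_in_discourse:
        # A new discourse referent grounded in previous discourse.
        return Reference.INDIRECT_ANAPHORA
    if generally_known:
        # Known to speaker and hearer from general knowledge.
        return Reference.UNIQUE
    return None


# A new referent that is grounded in the preceding discourse:
print(classify(False, False, True, False).value)
```

A referent that satisfies none of the conditions is left unclassified (`None`) rather than forced into one of the four types.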
As this study is a diachronic one, we wished to avoid labeling the grammaticalizing forms as definite or indefinite articles, since we study them at different stages of their development. In the more theoretically oriented Chapters 1 and 2, we will therefore refer to the forms as incipient articles. In the chapters devoted to the study of the grammaticalizing articles in North Germanic we will use the notation -IN (in capitals) to refer to the incipient definite article and EN for the incipient indefinite article. The notations hint at the etymologies of the present-day articles (see Chapter 2).
¹ One reviewer points out that einir has also survived as a pluralia-tantum numeral in Icelandic (alongside tvennir, þrennir, fernir), but only in numeral uses (with plural nouns and nouns denoting natural pairs).