As the preceding discussion has shown, the word-phrase compound must be treated as a separate class of compound. Its prosodic signature differs from those of the other six compound types and appears to be a mix of their characteristics. Like foot-foot, foot-word, word-foot, word-word, and mono-phrasal compounds, word-phrase compounds lose the lexical accent of N1. The lexical accent of N2, however, is not lost and replaced by compound accent, as it is in word-foot and word-word compounds, but rather, it is retained, as it is in mono-phrasal compounds. Finally, the register tones of N1 and N2 are both retained, as they are in bi-phrasal compounds. Word-phrase compounds are shown in the examples below, given with the proposed word-phrase structure in Figure 70.



The difference among the other six types of compounds is attributed to differences in the results of syntax-prosody mapping, where the word-word compound is the result of a perfect match enforced by relevant Match constraints, and foot-foot, foot-word, word-foot, mono-phrasal, and bi-phrasal compounds arise due to phonological well-formedness constraints being ranked higher than the Match constraints, resulting in non-isomorphisms between their syntactic and prosodic structures, as discussed in Chapter 3. The structures reflecting these differences are given in Figure 71. A summary of the prosodic characteristics of the seven compound types (with the most productive patterns marked with asterisks (*)) is given in Table 20.



Prosodic structures of the other six Kansai Japanese compound types



Summary of prosodic realizations of Kansai Japanese compounds
The following figures (Figures 72, 73, and 74) are schematics of the phonological well-formedness constraints which prevent perfect matching and the compounds they are crucial in producing. Recall that the word-word compound results from a perfect match from syntactic structure.



WordBinarity → Foot-foot, word-foot, foot-word
![BinMaxHead(ω[+max, -min])-Leaves → Mono-phrasal](/display/book/9789004677647/inline-9789004677647_webready_content_m00228.jpg)
BinMaxHead(ω [+max, -min])-Leaves → Mono-phrasal
![BinMax-φ[+min] (BinMax-φ) → Bi-phrasal](/display/book/9789004677647/inline-9789004677647_webready_content_m00229.jpg)
Word-phrase compounds are not so straightforwardly derived from the competition involving these constraints, as discussed in Chapter 3. The prosodic structure I propose for word-phrase compounds involves a non-isomorphic mapping from the syntactic structure, where the maximal N terminal is mapped to a
The issue is compounded by the fact that, in Nakai’s dictionary, the word-phrase parse is almost never the sole prosodic possibility for a compound, which I refer to as the “no unique word-phrase parse problem.” Nakai (2002) offers several generalizations for what kinds of N2s, in terms of morphological composition and word origin (namely, foreign loanwords), may allow a word-phrase compound, but these generalizations are only descriptive, are subsets of criteria which predict other compound types, and are nonetheless still related to N2 length. As might be expected from an understanding of Kansai Japanese compound typology based heavily on N2 length, even with the morphological structure of N2 and its loanword status taken into account, this results in non-word-phrase parses being possible for word-phrase parse candidate words as well. Some other factor or set of factors seems to be involved.
The N2 length problem and the no unique word-phrase parse problem are discussed in more detail below. This chapter also explores and argues for possible explanations which are not entirely syntactic or phonological in nature but are rather also gradient, frequency-based, and usage-based, particularly informativeness. Some discussion of semantic factors is also offered, though an implementation of this is not pursued in the present analysis due to the limited data sample.
5.1 The N2 Length Problem and the No Unique Word-Phrase Parse Problem
In Chapter 3, I presented a syntax-prosody mapping account for the six compound types whose prosodic structures could be predicted based on the length of their second members, using constraints requiring minimal binarity for prosodic words, maximal binarity for heads of maximal prosodic words, and maximal binarity for minimal phonological phrases.
A problem arises when attempting to account for word-phrase compounds in the same way: no N2 length-based criterion can be formulated which can be attributed uniquely to the word-phrase structure, as all length-based criteria already describe other compound prosodic structures. Foot-foot and word-foot compounds arise when N2 is one to two moras in length, foot-word and word-word compounds arise when N2 is three to four moras in length, mono-phrasal compounds arise when N2 is five moras or longer, and bi-phrasal compounds arise when N2 is longer than three feet (six moras) in length. Given this, the only remaining length-based criterion which could describe word-phrase compounds uniquely is one in which N2 exceeds some threshold greater than three feet (six moras), that is, a length beyond what already characterizes bi-phrasal compounds, if the criterion is not simply the same as that for bi-phrasal compounds.
However, this criterion has limited, if any, use as a predictor for when word-phrase compounds occur. The longest words which Nakai records in his dictionary are ten moras long in total, all compound words. Ten mora words are fewer in number than nine mora words, and significantly fewer in number than seven or eight mora words. This is supported by a cross-linguistic generalization that long words are uncommon. In a survey of word lengths observed in translations of the Book of Mark in the Bible into 102 languages, Stanton (2016) finds first that 94% of the 19,239 Japanese words surveyed cluster around 1, 2, and 3 syllables in length (where each vowel is counted as a syllable), and a further 5% are 4 to 5 syllables in length. Words 6 syllables and longer together account for the remaining 1% of words. Second, taking the data from the 102 languages together and assuming based on Stanton’s discussion that the median percentage of each word length represents the average percentage of words of that length in the corpus of these languages, words of six or more syllables constitute only 1% of the corpus across 102 languages. Of course, because Stanton’s survey is based on syllables, where every vowel counts as a syllable, this means that six-mora words like sinkansen ‘Shinkansen,’ which has three vowels (syllables) but six moras due to the moraic nasals at the end of each syllable, are grouped with three-syllable, three-mora words like sakura ‘cherry blossom.’ But, if such longer words up to six moras are represented as the variable x, their number can be no greater than (16 – x)% of Stanton’s corpus, as she reports 16% of the Japanese words surveyed are three syllables in length. Words seven moras or greater will be included in the remaining 6% of words four syllables or longer.
To supplement this in more concrete moraic terms, of the 56,812 words in Sugito’s (1996) Osaka-Tokyo dictionary, only 5.7% are 7 moras or longer, corresponding well to the estimate from Stanton’s corpus. Given this, the relevant test cases for a particularly long N2 length criterion are quite uncommon, though it does not exclude the possibility of such a criterion.
The greater issue for such a length-based criterion concerns what N2 lengths are actually observed in word-phrase compounds. This is presented in Table 21 below, ordered from longest to shortest by the second column, N2 length, along with their accompanying N1 lengths and total lengths, all in moras, and the number of occurrences of each type.



Lengths in moras in entries with word-phrase prosody in Nakai (2002)
An examination of the 114 entries that fit the word-phrase parse in Nakai’s dictionary (which are mostly, though not entirely, compounds; fewer than five are non-compounds) reveals only five compound words with an N2 seven moras in length and no compound words with an N2 eight or more moras in length (which would require an N1 one or two moras in length in order to be included in Nakai’s dictionary, due to the longest entries being ten moras in total). Furthermore, of the 114 words, the majority (93) are seven, eight, or nine moras total in length, with 20 seven mora words, 57 eight mora words, and 16 nine mora words. Of the 57 eight mora words, 24 have four mora N2s, 22 have five mora N2s, 10 have six mora N2s, and one has a three mora N2. Among the 20 seven mora words, 17 have four mora N2s, and among the 16 nine mora words, 7 have five mora N2s, 4 have six mora N2s, and 4 have seven mora N2s. Again, if the distribution in Nakai’s dictionary is reflective of the distribution of compounds which may have the word-phrase parse in Kansai Japanese more broadly, then this distribution suggests that word-phrase compounds do not tend to have particularly long N2s. Indeed, they tend to have four to six mora N2s (99 out of the 114, 86.8%, of the entries in Nakai’s dictionary), which places them in the same territory in terms of N2 length as longer word-word compounds and mono-phrasal compounds. Interestingly, it seems that there is some clustering around 8 mora compound words, which have 4, 5, or 6 mora N2s, further suggesting some role of N2 length in the word-phrase compound.
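The tallies reported in the paragraph above can be cross-checked with a few lines of Python. This is only a minimal sketch: the counts are transcribed from the prose (not from Nakai’s dictionary itself), and the variable and function names are introduced here purely for illustration.

```python
# Counts transcribed from the prose above (word-phrase entries in Nakai 2002).
TOTAL_ENTRIES = 114

# Entries by total length in moras (only the lengths the text reports).
by_total_length = {7: 20, 8: 57, 9: 16}

# N2 lengths (in moras) among the 57 eight-mora words.
eight_mora_n2 = {4: 24, 5: 22, 6: 10, 3: 1}

# Entries whose N2 is four to six moras long, as reported in the text.
N2_FOUR_TO_SIX = 99


def share(count: int, total: int = TOTAL_ENTRIES) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


seven_to_nine = sum(by_total_length.values())
eight_mora_total = sum(eight_mora_n2.values())

print("7-9 mora words:", seven_to_nine, f"({share(seven_to_nine)}%)")
print("eight-mora breakdown sums to:", eight_mora_total)
print("four-to-six mora N2s:", f"{share(N2_FOUR_TO_SIX)}%")
```

The sums agree with the figures in the text (93 words of seven to nine moras, 57 eight-mora words, and 86.8% for four-to-six mora N2s).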
Given the tendencies shown above, N2 length in word-phrase compounds generally overlaps with the N2 lengths found in other compound types. Length is therefore likely to be an important factor in determining when a compound can be a word-phrase compound: based on the counts above, it seems possible for an N2 to be too short to yield a word-phrase compound, although cases with short N2s do exist, among them non-compound sequences such as uti-no hito ‘my husband, family member,’ lit. ‘I-GEN person.’ Nevertheless, N2 length is not by itself a sufficient criterion in the way that it predicts prosodic structure for other compound types. There must be some other factor or factors at play that open up the possibility of the word-phrase compound.
As a starting point for identifying the relevant factor or factors, let us consider Nakai’s (2002) descriptive generalizations in (157) of some of the characteristics of the word-phrase compounds in his dictionary. Examples from Nakai are given below each criterion. To aid in distinguishing the parses, for these and following examples in the chapter, a parenthetical is added to each compound with the following abbreviations: WP for word-phrase, M for mono-phrasal, B for bi-phrasal, and WW for word-word. The word-phrase parse is listed first in each example, followed by the mono-phrasal and/or bi-phrasal parses, and, in the one example where a word-word parse is also available, the word-word parse is given last.
Given that criteria (157b–c) refer to length and usually involve loanwords, their corresponding examples have English loanword N2s which are five and four moras in length, respectively. The examples under (157a) also have N2s which are five moras in length, but unlike the N2s in the examples under (157b–c), which are monomorphemic, the N2s in the examples under (157a) are both compounds. The N2 of (157ai) is the compound eiga-kan ‘movie theatre’ and, following the analysis of Kubozono, Ito, and Mester (1997) of each kanji in a Sino-Japanese compound being a separate Sino-Japanese morpheme, has three morphemes, ei ‘project (verb),’ ga ‘picture,’ and kan ‘building.’ The N2 of (157aii) is the compound kaigi-situ ‘conference room’ and has three morphemes, kai ‘meeting,’ gi ‘deliberation,’ and situ ‘room.’
These generalizations are helpful in suggesting that words can indeed be too short to participate in word-phrase compound mapping, in at least two ways that interact with each other. First, words may be too short in terms of mora count, as generalizations (157b–c) concern four to five mora loanword N2s, while generalization (157a) concerns compound word N2s, which are often four moras or longer. If a compound word has an N2 which is three moras in length or shorter, then, it is unlikely to have a word-phrase parse. Second, words may be too short in terms of morpheme count, interacting with word length in moras. Thus, a monomorphemic five-mora N2 may be long enough mora-wise to trigger the availability of the word-phrase parse, but a monomorphemic three-mora word may be too short to trigger word-phrase compounds, expectedly on moraic length grounds, but also because it is too small on morphemic length grounds. Even a bimorphemic three or four-mora compound word may be too short, such as daigaku ‘university,’ which is composed of the morphemes dai ‘large’ and gaku ‘study,’ or idoo ‘moving,’ which is composed of the morphemes i ‘shift’ and doo ‘move.’
However, there are limitations to the ability of these generalizations to predict whether the word-phrase parse is available. Nakai himself notes one: although many word-phrase compounds have an N2 which is a compound in which either element is three moras long or longer, there are also word-phrase compounds in which neither element in N2 is three moras long, such as the following examples in (158), given by Nakai.
These four examples have the N2s niwasi ‘gardener’ and harisi ‘acupuncturist.’ The second element in both compounds is the monomoraic Sino-Japanese morpheme si ‘master.’ In niwasi, the first element is the bimoraic native morpheme niwa ‘garden,’ while in harisi, the first element is the bimoraic native morpheme hari ‘acupuncture needle.’ Neither element in each N2 is at least three moras long. Such cases are relatively few in number in Nakai’s dictionary – only 9 of the 114 word-phrase entries have an N2 three moras in length, of which 6 have a polymorphemic N2. It is possible that a lexeme-specific effect is involved here, which may be like analogy effects described by Plag (2013), wherein compounds with the same N2 in English are more likely to have the same stress patterns. Here, (158a–b) both have onna ‘woman’ as their N1, and (158c–d) both have niwaka ‘bandwagon/fairweather’ as their N1.
The greater limitation, however, is that it is not possible to use any of the generalizations to reliably predict when a compound will have the word-phrase parse available to it, at least in terms of whether a word-phrase parse is recorded by Nakai, suggesting that these generalizations may point to factors which are necessary, but not sufficient for the word-phrase parse. The following are examples of compound words which fit Nakai’s descriptive generalizations but which are not recorded to have word-phrase parses. Examples (159a–b) fit the description of generalization (157a), examples (159c–d) fit the description of generalization (157b), and examples (159e–f) fit the description of generalization (157c), but none have recorded word-phrase parses.
First, let us consider (159a–b), which have N2s which are themselves compounds, as described in generalization (157a). Example (159a) has a five mora, three morpheme N2, kinenbutu, made up of the morphemes ki ‘account,’ nen ‘wish,’ and butu ‘thing,’ while (159b) has a five mora, three morpheme N2, hosyoonin ‘guarantor,’ made up of the morphemes ho ‘preserve,’ syoo ‘proof,’ and nin ‘person.’ Both have the same characteristics as (157ai) and (157aii), which also have five mora, three morpheme N2s. The N2 of (157ai) is eiga-kan ‘movie theatre,’ consisting of the morphemes ei ‘project (verb),’ ga ‘picture,’ and kan ‘building,’ while the N2 of (157aii) is kaigi-situ ‘conference room,’ consisting of the morphemes kai ‘meeting,’ gi ‘deliberation,’ and situ ‘room.’ Furthermore, all of these N2s have an element which is 3 moras long: kinen ‘commemoration,’ hosyoo ‘guarantee,’ eiga ‘movie,’ and kaigi ‘meeting.’ Despite this, neither (159a) nor (159b) has a recorded word-phrase parse in Nakai’s dictionary. Instead, (159a) has two mono-phrasal parses, one with the accent on the third mora of N2 and one with the accent on the second mora of N2, while (159b) has one accented mono-phrasal parse and one unaccented mono-phrasal parse. The fact that these have multiple reported mono-phrasal parses may be due to variation in the pronunciation of N2 in the speakers surveyed.
Turning to examples which have loanword N2s, (159c) and (159d) have the five mora loanword N2s sutirooru ‘styrene (from German styrol)’ and hoomuran ‘home run’ but do not have recorded word-phrase parses. (159c) has a mono-phrasal accented parse, while (159d) has a mono-phrasal unaccented parse. This is unlike the two compounds in (157bi) and (157bii), which also have five mora loanword N2s and have word-phrase parses, namely kurisumasu ‘Christmas’ in (157bi) and tyanpion ‘champion’ in (157bii).
Finally, (159e) and (159f) have four mora loanword N2s that are low-register and accented on the second mora when in isolation, ᴸsuke’eto ‘skate(s)’ and ᴸsuto’ppu ‘stop,’ but, again, neither compound has a word-phrase parse; instead both have mono-phrasal parses. This is unlike (157ci) and (157cii), which have the low-register, peninitial accented N2s suta’ndo ‘stand’ and misa’iru ‘missile’ and which both have word-phrase parses. Thus, as the examples in (159) demonstrate, simply having an N2 with the characteristics described in Nakai’s generalizations for N2s in word-phrase compounds is not sufficient for a compound to have the word-phrase parse available to it.
A final complication for identifying a criterion that can predict word-phrase compounds is the no unique word-phrase parse problem. Whatever criteria are involved in influencing the availability of the word-phrase parse, such criteria cannot in general uniquely categorize a compound as a word-phrase compound. Whereas compounds are generally reliably mapped to foot-foot, foot-word, word-foot, word-word, mono-phrasal and (to some extent) bi-phrasal compounds based on the length-based criteria previously discussed (with some compounds able to be parsed as either mono-phrasal or bi-phrasal), the word-phrase parse is never recorded to be the sole parse available to a compound. Rather, it is always one of several parses, usually alongside a mono-phrasal or bi-phrasal parse or both, but in shorter compounds, sometimes also alongside a word-word parse. Observe the following examples. Note that, in some cases, a compound may have multiple instantiations of the word-phrase parse, as Nakai’s data is based on multiple speakers from multiple Kansai locations. This can be seen in (160d), where examples (i–iii) all show N1 tihoo ‘region’ losing its lexical accent, the compound N2 koohuzei ‘delivery’ with its compound accent, and different registers on N1 and N2. Note that there is a parse ᴴtihoo-koohu’zei (160div) which is listed here as mono-phrasal, as this is how Nakai (2002) reported it. There is a possibility that it is another type of word-phrase parse, but this is not clear just from the dictionary.
Furthermore, in some cases, Nakai marks whether a prosodic pattern is uncommon compared to the other recorded patterns. Of the 114 entries with word-phrase parses, there are 30 cases in which the word-phrase parses are listed as uncommon patterns compared to the other, non-word-phrase patterns. There are several additional cases in which a compound has multiple word-phrase parses, where one word-phrase pattern is marked as uncommon, but the others are not – these cases are not included in the count of 30. An example of this is yagai-konsaato ‘outdoor concert,’ which has a common word-phrase parse ᴸyagai-ᴴkonsa’ato as well as an uncommon word-phrase parse ᴴyagai-ᴸkonsa’ato. In contrast, the word-phrase parse is listed as equal to the other parses in the remaining 75 cases. The word-phrase parses in the compounds in (160) above are among these 75. In only three cases is the word-phrase parse listed as the most common pattern. Two are ᴸniwaka-ᴴniwa’si ‘bandwagon/fairweather gardener’ and ᴸniwaka-ᴴhari’si ‘bandwagon/fairweather acupuncturist,’ which both have the word-phrase pattern as their most common pattern, alongside the uncommon mono-phrasal parse ᴸniwaka-niwa’si for the former and the two uncommon mono-phrasal parses ᴸniwaka-hari’si and ᴸniwaka-ha’risi for the latter. The third is nikai-tyuugaeri ‘double somersault,’ which has the word-phrase pattern ᴴnikai-ᴸtyuuga’eri as its most common pattern, and as its less common patterns, another (unaccented) word-phrase parse ᴴnikai-ᴸtyuugaeri, a mono-phrasal parse ᴴnikai-tyuuga’eri, and two bi-phrasal parses, ᴴni’kai-ᴸtyuugaeri and ᴴni’kai-ᴸtyuuga’eri. Accordingly, it seems that the norm is for the word-phrase parse to be co-available with other parses. Observe the following examples in (161).
Using an English equivalent of Nakai’s notation of a lowercase ‘s’ (for the first letter of sukunai ‘few’), I mark compounds which are less common compared to the others with “LC” following the compound type, within the parentheses.
This is perhaps unsurprising given that the length characteristics of compounds which have the word-phrase parse available are shared with other compound types, but it is a complicating factor nonetheless. For the purposes of the present analysis, word-phrase parses will be treated equally regardless of how common they are or whether they are the main parse for a given word. The present analysis aims to identify factors which may lead to the occurrence of the word-phrase parse in any case.
5.2 Discovering Additional Conditioning Factors on the Word-Phrase Parse
Having discussed the N2 length problem and the problem of uniqueness of the word-phrase parse, I turn to factors which may be relevant for the availability of the word-phrase parse. As discussed previously, I treat all compounds as having the same basic syntactic structure of two or more noun syntactic terminals combining to form a new noun syntactic terminal (structure from Chapter 3 repeated in Figure 75 below), and thus, it cannot be special syntactic factors which result in the availability of the word-phrase parse.



Syntactic structure of Japanese compounds
The discussion above argued that although there seems to be a lower limit on phonological or morphological length for whether a compound may have a word-phrase parse or not, no other phonological or morphological length factor can be identified. Given this, I look to non-syntactic, non-phonological, non-morphological factors for potential answers.
The issue of whether a word-phrase parse is available to a Kansai Japanese compound resembles in some respects a well-known issue in English compound prosody. English two-word compounds can be divided into two categories based on their prosody. In one category, compound words have what has been considered (for example by Chomsky and Halle 1968) special compound prosody, with the first element receiving stress, such as in the following compounds, where the compound stress is marked with an acute diacritic: ápple cake, télevision stand, ólive oil, and dógwalker. In the second category, compounds are stressed on their second element (or more precisely, are stressed on their second elements in addition to having a stress on the first element, per Bell and Plag (2012)), such as apple píe, winter sýmphony, and main ávenue. Analyses have long encountered difficulty accounting for these differences in a unified fashion, as it is clear that compounds in both groups share the same syntactic structure, which is especially evidenced in compounds that involve very similar elements, such as the dessert words apple cake and apple pie, which both have the same first element and which have semantically related, monosyllabic, heavy syllable second elements, but which nonetheless have different prosodic patterns. Accounts often have to invoke exceptions to the rules posited.
This is similar to the case of compound prosody in Kansai Japanese because Kansai Japanese compounds also clearly share the same syntactic structure despite having different prosodic structures, such as in the words tihoo-kan ‘regional administrator’ (word-foot), tihoo-gikai ‘regional congress’ (WW), and tihoo-koomuin ‘regional government worker’ (WP, M, B), which all have the same N1, tihoo ‘region,’ and different N2s of related semantic classes, kan ‘official,’ gikai ‘congress,’ and koomuin ‘government worker.’ An important aspect in which the Kansai Japanese and English cases differ is the level of variation found in how compounds can be pronounced. Whereas Kansai Japanese compounds that have a word-phrase parse available to them always have a non-word-phrase parse, usually mono-phrasal or bi-phrasal or both, available to them as well, there is generally less variation in how compound words are pronounced in English. In general, most speakers agree that compounds with the first element stressed have the first element stressed and that compounds with the second element stressed have the second element stressed. This is not to say that there is no variation, even of the sort commonly found in Kansai Japanese compounds. Bell and Plag (2012) briefly discuss that variation is observed in both production and perception. For example, on the production side, they refer to the cases of boy scout being pronounced either with left prominence as in bóy scout (common in American English) or with right prominence as in boy scóut (common in British English) and of ice cream having the pronunciations íce cream and ice créam in free variation. Anecdotally, I observe variation in my own pronunciation of Santa Cruz (a city in California in the United States), sometimes with right prominence as Santa Crúz, and sometimes with left prominence as Sánta Cruz.
On the perception side, Kunter (2010), conducting a prominence rating study in English noun-noun compounds, finds that less proficient raters have less reliable ratings when rating compounds with right prominence, which may suggest some variability in the perception of where compound stress occurs as well.
Possible factors that have been proposed for accounting for the variation in compound prosody which are neither syntactic nor phonological nor morphological include informativeness, semantic relationship between compound members, and pragmatic factors. Taking into account these factors represents a departure from syntax-prosody mapping accomplished primarily by constraints requiring constituent alignment or matching between syntactic structure and prosodic structure interacting with surface well-formedness constraints, but as the preceding discussion demonstrates, there seem to be no factors in the syntax, morphology, or phonology which are sufficient to explain the availability of the word-phrase parse. I thus turn to these non-syntactic, non-phonological, non-morphological factors that have previously been proposed and examine their utility for Kansai Japanese. After discussing informativeness, the semantic relationship between compound members, and pragmatic factors, I discuss informativeness in terms of Kansai Japanese and develop hypotheses connecting informativeness with the availability of the word-phrase parse. I will ultimately propose that informativeness does play a role in word-phrase parse availability.
5.2.1 Informativeness
In this discussion, I use the term “informativeness” following Bell and Plag (2012). This is a statistical/probabilistic measure, related to the notion of “information content” as defined for information theory by Shannon (1948). Bell and Plag use three measures of informativeness: absolute predictability, relative predictability, and semantic specificity. These terms are discussed below for their application in Bell and Plag’s study on English compounds.
For Bell and Plag, absolute predictability is measured as the raw frequency of N2 in a corpus, in which greater frequency indicates lower informativeness, and lower informativeness is hypothesized to result in lower likelihood of being stressed. “The raw frequency of N2” is a token-based measure and includes all occurrences of the N2 of a given compound being considered (such as pie in apple pie) in a corpus, regardless of whether it occurs alone or as the second member of a compound.
Relative predictability is the predictability of a member of a compound occurring with respect to another element. Bell and Plag use three conditional probability measures for relative predictability. The first is the conditional probability of N2 with respect to N1, obtained by dividing the frequency of the whole compound by the frequency of N1, where a higher conditional probability indicates lower informativeness, indicating a lower likelihood of being stressed. This measure, too, is a token-based measure. The second measure is the conditional probability of N2 occurring as the second member of a compound, which they refer to as the family size of N2, and is obtained by dividing 1 by the number of compound types that have a given N2. The third measure is the conditional probability of N2 given the family size of N1 (that is, the conditional probability of N1 occurring as the first member of a compound). These two measures are both type-based measures. An N1 having a larger family size means that the occurrence of a particular N2 is less probable, as a compound containing both N1 and N2 is only one of a large number of compounds containing N1. Lesser probability of that N1-N2 compound indicates greater informativeness of N2 and a greater likelihood of that N2 being stressed. Bell and Plag briefly discuss that it would also be possible to use a token-based family size measure, in which instead of counting compound types with a given N1 or N2, the sum of all compounds with a given N1 or N2 would be used as the family size measure. However, citing Schreuder and Baayen (1997), who report that type frequency is the more psychologically salient measure in compounds, Bell and Plag use only the type-based family size measure.
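To make the predictability measures described above concrete, the following is a minimal sketch in Python. The toy corpus, the noun frequencies, and all function names are hypothetical and introduced only for illustration; they are not data or code from Bell and Plag (2012), and the sketch assumes that every noun queried occurs in at least one compound type (otherwise the family size would be zero).

```python
from collections import Counter

# Hypothetical toy corpus of (N1, N2) compound tokens; NOT data from Bell and Plag.
compound_tokens = [
    ("apple", "pie"), ("apple", "pie"), ("apple", "cake"),
    ("apple", "tree"), ("cherry", "pie"), ("olive", "oil"),
]

# Hypothetical token frequencies of each noun anywhere in the corpus
# (alone or inside a compound); N2 frequency is the absolute predictability measure.
noun_freq = {"apple": 10, "cherry": 4, "olive": 5,
             "pie": 8, "cake": 6, "tree": 12, "oil": 20}

compound_freq = Counter(compound_tokens)   # token counts per compound
compound_types = set(compound_tokens)      # distinct compound types


def family_size_n1(n1: str) -> int:
    """Number of compound types with the given first member."""
    return sum(1 for first, _ in compound_types if first == n1)


def family_size_n2(n2: str) -> int:
    """Number of compound types with the given second member."""
    return sum(1 for _, second in compound_types if second == n2)


def cond_prob_token(n1: str, n2: str) -> float:
    """Token-based measure: compound frequency divided by N1 frequency."""
    return compound_freq[(n1, n2)] / noun_freq[n1]


def cond_prob_n2_family(n2: str) -> float:
    """Type-based measure: 1 divided by the family size of N2."""
    return 1 / family_size_n2(n2)


def cond_prob_n1_family(n1: str) -> float:
    """Type-based measure: 1 divided by the family size of N1."""
    return 1 / family_size_n1(n1)


print("N2 frequency (pie):", noun_freq["pie"])
print("P(pie | apple), token-based:", cond_prob_token("apple", "pie"))
print("1 / family size of N2 (pie):", cond_prob_n2_family("pie"))
print("1 / family size of N1 (apple):", cond_prob_n1_family("apple"))
```

On each measure, a higher value corresponds to a more predictable, hence less informative, N2, which under the hypotheses described above would make stress on N2 less likely.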
Finally, semantic specificity refers to how specific a word is, based on synsets, which are groups of words with similar meanings. The fewer synsets an N2 belongs to, the more specific it is, and the more informative it is, making it more likely to be stressed. Table 22 summarizes these factors.



Measures of informativeness investigated by Bell and Plag (2012)
Bell and Plag conducted an experiment testing these hypotheses (and hypotheses related to other, semantic factors, to be discussed in the following subsection) with 17 adult native speakers of British English. Participants were asked to read aloud compounds presented in the carrier sentence ‘She told me about the (compound).’ Items included 1,000 experimental item sentences containing noun-noun compounds and 2,000 fillers, consisting of 1,000 filler sentences containing simplex nouns and 1,000 filler sentences containing adjective-noun combinations. The compounds were taken from the demographic section of the British National Corpus (BNC), which consists of 4.23 million words of spontaneous conversation, meaning that any compounds present in this section are compounds that are actually used in daily conversation. The items were then reproduced four times yielding 12,000 tokens, which were then split into lists of 300 sentences with 100 experimental items and no repeated items. Lists were assigned to participants such that no speaker would repeat an item. Each participant read one to five lists, with most participants reading two or three lists, one list per session, with sessions separated by at least one day. 4,000 acceptable tokens were elicited, four tokens for each type, each token spoken by a different participant.
The tokens were also rated by two raters in terms of where they perceived the compound prominence to be – on the left or right word. Both raters had participated in a previous study by Kunter (2010, 2011) on the perception of compound prominence and had been identified in that study as being reliable raters, a group of listeners whose ratings agreed to a statistically significant extent. One rater gave prominence ratings both on-line during reading sessions and at a later time, while the other rater gave ratings only at a later time. Items were included for further analysis if the three ratings were unanimous, resulting in a total of 3,764 tokens. The remaining 236 tokens were excluded. An extra 512 tokens were excluded, as they had estimated family sizes that were disproportionately large compared to actual, manually calculated family sizes, due to a large portion of noun-noun collocations involving them either not being actual compounds, being homonyms, being part of high-frequency formulas (such as morning, meaning good morning), being likely to be mis-tagged, or which had very small family sizes. This exclusion resulted in 3,252 remaining tokens (representing 864 of the 1,000 original types) for analysis, for which it could be assumed that the estimated family sizes would be highly correlated with the actual family sizes.
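The token counts reported for this design can be checked with a few lines of arithmetic. The sketch below uses only the figures given above; the variable names are illustrative, and the number of lists is an inference from the reported totals rather than a figure stated by Bell and Plag.

```python
# Figures reported for the elicitation design.
experimental_items = 1_000
filler_items = 2_000
repetitions = 4
sentences_per_list = 300

total_tokens = (experimental_items + filler_items) * repetitions
experimental_tokens = experimental_items * repetitions
# Inferred, not stated in the source: lists needed to cover all tokens.
implied_lists = total_tokens // sentences_per_list

# Figures reported for the two exclusion steps.
non_unanimous_ratings = 236
unreliable_family_size = 512

after_rating = experimental_tokens - non_unanimous_ratings
analyzed_tokens = after_rating - unreliable_family_size

print(total_tokens, experimental_tokens, implied_lists)
print(after_rating, analyzed_tokens)
```

The two exclusion steps reproduce the reported totals of 3,764 and 3,252 tokens.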
Each measure of informativeness for the compounds that were tested was obtained or calculated from the BNC, which consists of 100 million words, in the case of absolute and relative predictability, and from the Wordnet lexical database, in the case of semantic specificity. Lemmatized frequencies of N2 (tokens) were collected and the family size of each N2 (types) calculated from the whole BNC. The conditional probability of N2 based on N1 frequency (a token-based measure) was calculated by collecting lemmatized frequencies for N1 from the BNC, then dividing compound frequencies by frequencies of N1. The conditional probability of N2 based on N1 family size (a type-based measure) was calculated by estimating the family size of N1 from the BNC and dividing 1 by the family size of N1. Synset counts for all N1s and N2s were obtained from either the Wordnet index file for nouns or the online version of the Oxford English Dictionary. Synset counts were extracted from the Wordnet for all words in the index file. Synset counts were obtained from the online version of the Oxford English Dictionary for nouns which did not occur in Wordnet. Finally, three proper nouns did not occur in either, and these were assumed to have one sense each.
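The difference between the token-based and type-based calculations can be illustrated with a minimal sketch. The compound tokens and lemma frequencies below are invented toy values, not BNC counts, and the function names are hypothetical; the sketch only mirrors the divisions described above and is not Bell and Plag's actual procedure.

```python
from collections import Counter

# Hypothetical toy data (not BNC counts): one tuple per compound token.
compound_tokens = [
    ("television", "stand"), ("television", "stand"),
    ("television", "set"), ("music", "stand"), ("kitchen", "table"),
]
# Hypothetical lemmatized frequencies of each N1 anywhere in the corpus.
n1_lemma_freq = {"television": 50, "music": 40, "kitchen": 20}

compound_freq = Counter(compound_tokens)  # token count per compound type


def n1_family_size(n1: str) -> int:
    """Type-based family size: distinct compound types with this N1."""
    return sum(1 for (first, _second) in compound_freq if first == n1)


def n2_family_size(n2: str) -> int:
    """Type-based family size: distinct compound types with this N2."""
    return sum(1 for (_first, second) in compound_freq if second == n2)


def n1_family_tokens(n1: str) -> int:
    """Token-based alternative: all compound tokens with this N1."""
    return sum(n for (first, _second), n in compound_freq.items() if first == n1)


def p_n2_given_n1_tokens(n1: str, n2: str) -> float:
    """Token-based measure: compound frequency divided by N1 frequency."""
    return compound_freq[(n1, n2)] / n1_lemma_freq[n1]


def p_n2_given_n1_types(n1: str) -> float:
    """Type-based measure: 1 divided by the family size of N1."""
    return 1 / n1_family_size(n1)
```

In this toy data, television has a type-based family size of 2 but a token-based family size of 3, which is the distinction at stake in choosing between the two kinds of family size measure.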
Bell and Plag conducted both a token-based analysis and a type-based analysis. In the token-based analysis, they find significant roles for informativeness in predicting compound stress type for all three types of informativeness measures (absolute predictability, relative predictability, and semantic specificity), in accordance with their hypotheses. That is, less informative N2s are less likely to receive stress. In more concrete terms, if stand in television stand has low informativeness, then this compound is more likely to be pronounced with left stress as télevision stand. However, if stand has high informativeness, then this compound is more likely to be pronounced with both left and right stress as télevision stánd. This is the expected outcome, given the relationship between informativeness and compound prominence location hypothesized by Bell and Plag. In addition to this, there is also an intuitive sense in which less informative N2s are less likely to receive stress. This can be thought of in terms of surprisal. In terms of token frequencies, a word with a larger frequency (and lower informativeness) is more likely to be the N2 of a compound with a given N1 than a word with smaller frequency (and higher informativeness). If the likelihood of a word being the N2 in a compound with a given N1 is higher, then when such an N1-N2 compound occurs, this has low surprisal, and the expected “default” left compound prominence arises. On the other hand, if the likelihood of a word being the N2 in a compound with a given N1 is lower, then when such an N1-N2 compound occurs, this has higher surprisal, which is then signaled by N2 receiving stress.
For the type-based analysis, only 541 of the 864 types were analyzed, as these were the types for which there was no inter-speaker variation in stress. This analysis yields “very similar” results to the token-based analysis, and they again find the significant roles for informativeness that were found in the token-based analysis. Considering surprisal in terms of types, a less informative, more frequent N2a (for example, one that belongs to a small N1 family of 5 types) is more likely to be the N2 of a compound with an N1a with a small family size than a more informative, less frequent N2b (for example, one that belongs to a large N1 family of 500 types) is to be the N2 of a compound with an N1b with a large family size. Thus, when N1a is the N1 of a compound, there is a high probability that N2 is N2a, because the sequence N1a-N2a is one of 5 possibilities in N1a’s small family. In this case, there is low surprisal, so the expected “default” left compound prominence arises. However, when N1b is the N1 of a compound, there is a lower probability that N2 is N2b, because the sequence N1b-N2b is one of 500 possibilities in N1b’s large family. In this case, there is higher surprisal than in the case of N1a-N2a and a higher likelihood of N2b receiving stress. We can conceive of right prominence, that is, stress occurring on N2 in English, then, as a prosodic signature of surprisal in N2’s appearance as an N2.
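The contrast between the two family sizes can be made concrete with a small worked example. The sketch assumes the standard information-theoretic definition of surprisal as the negative base-2 logarithm of a probability, which is not a formula given by Bell and Plag, and it treats every compound type in a family as equally probable, as the type-based measure does. The function name is illustrative.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Surprisal in bits, -log2(p); an assumed, standard definition."""
    return -math.log2(probability)


# Type-based conditional probability: 1 divided by the family size of N1.
p_small_family = 1 / 5    # N1a-N2a is one of 5 types in N1a's family
p_large_family = 1 / 500  # N1b-N2b is one of 500 types in N1b's family

print(round(surprisal_bits(p_small_family), 2))  # 2.32
print(round(surprisal_bits(p_large_family), 2))  # 8.97
```

Under these assumptions, the N2 of the large-family compound carries almost four times the surprisal of the N2 of the small-family compound, in line with its greater likelihood of receiving stress.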
Given this finding for a role of informativeness in English compound stress, I investigate the role of informativeness in Kansai Japanese compound prosody as well.
5.2.2 Semantics
As discussed in Chapter 3, compound nouns have the same general morphosyntactic structure (although they may differ in syntactic branchingness), consisting of two or more noun terminals which are combined to form new noun terminals, and new compound noun terminals can be created by iterating this combinatory process. However, despite the general uniformity of their morphosyntactic structures, compounds are not uniform when the semantics of how compound components relate to each other is taken into consideration. While one (or more) of the compound components specifies the meaning of the head of the compound in some way, the precise way that the component(s) specify the meaning of the head differs from compound to compound. Observe the following examples in (162), given with the relationship (as labeled by Bell and Plag) between the members indicated. Examples are from Bell and Plag (2012) and Bauer (2017).
In the same experiment described in the previous subsection, Bell and Plag also examined the connection between semantic relationships between compound members and right prominence in English compounds. Bell and Plag tested four semantic relations: N1 is a temporal location defining N2 (“temporal”), N1 is a spatial location defining N2 (“location”), N1 is a material or ingredient of N2 (“made of”), and NN is the name of a food item (“name of food item”). In the token-based analysis, there were significant main effects for the temporal, location, and made of semantic relations, and compounds that were classified as one of these categories had a higher chance of having right prominence. No effect was found for name of food item, and Bell and Plag hypothesize that this is because most of their name of food item compounds were also all part of the larger “made of” class, which may have subsumed any independent effect of the smaller category. Similar results were obtained in the type-based analysis.
The effect of semantic factors on compound prosody has been observed in Japanese as well. Kubozono (1993), discussing compounds which fail to undergo prosodic compounding, notes that certain semantic relationships between compound elements result in compounds which do not have special compound prosody (i.e., do not undergo prosodic compounding) and would be classified as bi-phrasal compounds in the present work. Importantly, N2 length does not play a role in these compounds being bi-phrasal. Some examples from each semantic relationship category discussed by Kubozono (1993) are given below in (163). As these are all bi-phrasal compounds, the components of each compound have the same prosodic characteristics that they have in isolation. Accordingly, only the compounded expressions are given.
Importantly, it is not necessarily the case that when the elements of a compound have one of the semantic relationships listed, the compound will be bi-phrasal. Kubozono also provides examples of compounds with the same semantic relationships which do undergo prosodic compounding. Kubozono notes that as these combinations become more established in the language, they become more likely to undergo prosodic compounding. Below in (164) are some examples given by Kubozono.
The same semantic effects are observed in Kansai Japanese as well, with these types of compounds also being pronounced as bi-phrasal compounds. Nakai (2002) notes that the conditions resulting in these compounds are the same as those in Tokyo Japanese, as described by Kubozono. Examples from Nakai are given in (165) below. Only the compounded expressions are given.
As in Tokyo Japanese, it is not necessarily the case that a compound with one of these semantic relationships will be bi-phrasal. Some examples of compounds in the categories above are given in (166) below. Examples are from Sugito (1996) and Nakai (2002).
Nakai does not mention these cases where prosodic compounding occurs instead of producing a bi-phrasal compound. If the conditioning factors are the same in Kansai Japanese as in Tokyo Japanese, then it may be possible that more established compounds are more likely to undergo prosodic compounding, as is the case in Tokyo Japanese, as described by Kubozono. I leave this question to future work.
Given this, semantic factors may also play a role in whether the word-phrase parse is available for a compound in Kansai Japanese. However, the amount of data (218 compounds) collected for the present study is insufficient for carrying out a proper analysis of the effect of semantic factors on Kansai Japanese prosody. For example, I observed at least the following ways in (167) to categorize compounds according to their semantics. Examples are also given.
In particular, it was not clear how to classify many compounds more specifically than “compound is a type of N2,” making it a rather heterogeneous class that likely has additional internal structure. As there are at least 8 semantic classes that the data can be divided into, and not every class has equal or otherwise comparable amounts of data, it may be difficult to draw conclusions based on semantics with the data collected for the present study. Accordingly, I leave the semantic factor-based analysis to future work, when more thorough data collection can be conducted to ensure comparable amounts of data for each semantic class.
5.2.3 Pragmatics
Kubozono (1993) also discusses pragmatic factors that play a role in prosodic variation in Japanese compounds, as longer compounds show variation in prosody. Some examples of this variation are given below in (168) from Kubozono (1993), along with the bracketing notation used to represent the branchingness of the compounds.
According to Kubozono, such variation is observed not only between speakers, but within speakers, and may vary not only from compound to compound within the same speaker, but the same compound may be pronounced with different prosodies across different utterances. Kubozono gives three pragmatic factors which are involved in conditioning when a given pronunciation may be used. First, speakers tend to prefer the bi-phrasal pronunciation in slow, careful speech, and the mono-phrasal pronunciation in fast, casual speech. Second, the more familiar a speaker is with a compound, the more likely they are to use the mono-phrasal pronunciation. Third, if a compound or one of its components is focused, then the bi-phrasal pronunciation is used.
These effects can be observed across Japanese dialects as well. As an example of the second pragmatic factor, Kubozono (p.c.) has informed me of the case of ganba-oosaka ‘Gamba Osaka (the name of a soccer team from Osaka).’ In Tokyo, this name is usually pronounced as the bi-phrasal compound ga’nba-oosaka (ga’nba + oosaka). However, in Osaka, where people are more likely to be familiar with the team, the name is often pronounced as the word-word compound ganba-o’osaka.
The present study does not have enough data to conduct a proper analysis of the effect of pragmatic factors on variation in compound prosody in Kansai Japanese, so this is left to future work. However, the fact that such factors can influence the way a compound is pronounced provides some support for the role of gradient, usage-based factors in influencing prosodic structure, especially in light of the no-unique word-phrase parse problem.
5.2.4 The Word-Phrase Parse in Kansai Japanese and Informativeness
As discussed above, the variability in Kansai Japanese compound prosodies, particularly in compounds with longer N2s, which may vary in whether they are pronounced with a word-phrase, mono-phrasal, or bi-phrasal parse, is reminiscent of the issue of whether a compound is pronounced with left or right prominence in English. Similar issues are involved, as well. Why can two compounds which have what is evidently the same input syntactic structure be pronounced in two different ways, reflecting different prosodic structures? Furthermore, why can some compounds be pronounced in multiple ways?
Given these similarities, for the present study, I investigated the role of informativeness in the availability of the word-phrase parse in Kansai Japanese. This work was conducted under the following general hypothesis: The availability of the word-phrase parse is correlated with some measure of informativeness.
An important way in which the study of Kansai Japanese word-phrase compounds differs from the study of compound prosody in English is that, while there is only one mark of “special” prosody in English, namely right prominence, there are potentially two marks of special prosody in Kansai Japanese word-phrase parses. First, the accent of N1 is lost. Second, the register of N2 is retained. Respectively, these are marks that, as I argue in Chapter 3, are signs that N1 has been mapped to a prosodic word and that N2 has been mapped as being contained within a phonological phrase. These marks can be conceptualized in at least two ways, descriptively speaking. In one way, the word-phrase parse could be conceived of as a modification of the other phrasal parse with recursive structure, the bi-phrasal parse. Word-phrase compounds are prosodically like bi-phrasal compounds except that the accent of N1 is lost rather than retained. In this conception, it is the loss of N1’s accent which is the mark of surprisal, reflecting something about the informativeness of N1. In the second way, the word-phrase parse could be conceived of as a modification of the mono-phrasal parse. Word-phrase compounds are prosodically like mono-phrasal compounds except that the register of N2 is retained rather than lost. In this conception, it is the retention of N2’s register which is the mark of surprisal, reflecting something about the informativeness of N2. Due to these possible conceptualizations of the relationship between word-phrase marking and surprisal, I investigate not only the informativeness of N2 on its own and in relation to N1 as Bell and Plag did for English, but also the informativeness of N1 on its own and in relation to N2.
For the present study, I utilize the conception of informativeness as it relates to corpus frequency as used by Bell and Plag (2012) and as discussed above. Thus, for Kansai Japanese, I use absolute predictability and relative predictability measures of informativeness. Absolute predictability refers to the raw frequencies of N1 and N2 in the corpus I used, the Balanced Corpus of Contemporary Written Japanese (BCCWJ; NINJAL 2022), regardless of whether N1/N2 occurs on its own or in a compound. I use four measures of relative predictability. The first two hold N1 constant and are 1) the conditional probability of N2 given N1 based on tokens, which is obtained by dividing the frequency of the whole compound by the frequency of N1, and 2) the conditional probability of N2 given N1’s family size, a type-based measure, which is obtained by dividing 1 (because a compound containing both a given N1 and N2 is only one compound in the entire family of N1) by the family size of N1. Similarly, the other two measures of relative predictability hold N2 constant and are 1) the conditional probability of N1 given N2 based on tokens, obtained by dividing the frequency of the whole compound by the frequency of N2, and 2) the conditional probability of N1 given N2’s family size counted as types, obtained by dividing 1 by the family size of N2. In the statistical analysis, I focus primarily on relative measures of predictability, because the measures of absolute predictability, the raw frequencies of N1 and N2, are part of the calculation of the token-based measures of conditional probability. I discuss this in more detail below. A summary of the relative predictability measures I used is given in Table 23.



Relative predictability measures used in the present study
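The four relative predictability measures can be sketched as simple ratios over corpus counts. This is a minimal illustration under stated assumptions, not the procedure applied to the BCCWJ: the class name, field names, and all counts in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CompoundCounts:
    """Hypothetical corpus counts for a single N1-N2 compound."""
    compound_freq: int   # tokens of the whole compound
    n1_freq: int         # raw frequency of N1 (absolute predictability)
    n2_freq: int         # raw frequency of N2 (absolute predictability)
    n1_family_size: int  # compound types with this N1 as first member
    n2_family_size: int  # compound types with this N2 as second member

    def p_n2_given_n1_tokens(self) -> float:
        """N1 held constant, token-based."""
        return self.compound_freq / self.n1_freq

    def p_n2_given_n1_family(self) -> float:
        """N1 held constant, type-based."""
        return 1 / self.n1_family_size

    def p_n1_given_n2_tokens(self) -> float:
        """N2 held constant, token-based."""
        return self.compound_freq / self.n2_freq

    def p_n1_given_n2_family(self) -> float:
        """N2 held constant, type-based."""
        return 1 / self.n2_family_size


# Invented counts for illustration only.
example = CompoundCounts(compound_freq=12, n1_freq=3000, n2_freq=400,
                         n1_family_size=150, n2_family_size=8)
print(example.p_n2_given_n1_tokens(), example.p_n1_given_n2_tokens())
```

Because the raw frequencies of N1 and N2 appear as the denominators of the two token-based measures, the absolute predictability measures are already part of those calculations.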
Extending Bell and Plag’s hypotheses regarding informativeness to Kansai Japanese, I use the following hypotheses. Hypotheses (169a–b) are based on the conception in which the surprisal being marked by the word-phrase parse concerns the informativeness of N1, while hypotheses (169c–d) are based on the conception in which the surprisal concerns the informativeness of N2.
An alternative conception of the hypotheses in (169a) and (169b) is that N1 is more likely to lose accent (with accent retention being the mark of surprisal) if it is less informative. This is a reasonable alternative, as this is closer to the situation in English, wherein an N2 loses its isolation stress if it is less informative. This version of the hypothesis is worth further consideration in future work, pending further investigation on what situation should be considered “default” in Kansai Japanese phrasal compounds, given that the word-phrase parse could be taken as surprisal from a mono-phrasal perspective with N2 retaining register or from a bi-phrasal perspective with N1 losing accent.
In order to investigate the informativeness of words in Kansai Japanese compounds and test these hypotheses, it was necessary to collect additional, novel data. I turn to this data collection in the next section.
5.3 Novel Fieldwork on the Word-Phrase Parse
In order to investigate whether informativeness plays a role in conditioning the possibility of the word-phrase parse in compounds, additional data beyond the 114 compounds reported by Nakai was collected. This data collection was undertaken to collect additional data on the availability of the word-phrase parse in compounds reported by Nakai to exhibit it, collect novel data on the availability of the word-phrase parse in compounds that have not been previously reported to exhibit it or which are not included in accent dictionaries, and collect novel data on compounds with the same first or second member as compounds previously reported to exhibit or not exhibit the word-phrase parse in order to compare them.
5.3.1 Materials
Novel items to be tested were constructed using the word-phrase compounds reported by Nakai (henceforth also referred to as “Nakai compounds”) as a basis. Many of the Nakai compounds were also included as items. Novel items were constructed using at least one of the following principles. Some compounds adhere to more than one construction principle, such as terebi-bangumihyoo ‘television program guide,’ which adheres to both principle (170a), as terebi-bangumi is a Nakai compound, and both have the same N1, and principle (170c), as bangumihyoo ‘program guide’ is itself a compound consisting of bangumi ‘program’ and hyoo ‘table.’
Principles (a) and (b) were selected because if these elements are present in Nakai compounds and some non-syntactic, non-phonological characteristic of these elements conditions the word-phrase parse, then using compounds with these same elements would allow for direct pairwise comparisons between Nakai compounds and novel data. An example of a compound adhering to principle (a) is tyuuoo-hakubutukan ‘central museum,’ which has the same N1 as the Nakai compound tyuuoo-koominkan ‘central public hall,’ while an example of a compound adhering to principle (b) is zinriki-hikooki ‘human-powered aircraft,’ which has the same N2 as the Nakai compound mokei-hikooki ‘model aircraft.’ Additional items beyond those having an N1 or N2 which is the same as the N1 or N2 of a Nakai compound were constructed by the same principle from novel data, using an N1 or N2 which is a component in a novel item which was found to have a word-phrase parse. For example, the Nakai compound utyuu-hikoosi ‘astronaut’ led to the creation of the novel item utyuu-booenkyoo ‘space telescope’ by adhering to principle (a). Utyuu-booenkyoo was found to have the word-phrase parse in my consultants’ productions, so a new item, denpa-booenkyoo ‘radio telescope,’ in which neither N1 nor N2 is present in a Nakai compound, was created, using the novel item utyuu-booenkyoo’s N2 and adhering to the second part of principle (b).
Principles (c), (d), and (e) were based on the descriptive generalizations given by Nakai, as discussed above, with some modifications. As observed by Nakai, many compounds with word-phrase prosody have an N2 which is itself a compound. Nakai specifically gives this generalization as a compound in which either of the elements is three moras or greater in length. However, he also notes several exceptions in which N2 is a compound, but neither component of N2 is three moras or greater, such as niwaka-niwasi ‘bandwagon/fairweather gardener,’ as discussed earlier. For this fieldwork’s construction principle (c), compounds with smaller N2 compounds were also considered in addition to N2 compounds which adhere to Nakai’s generalization. This was done in order to capture a wider range of N2 compound possibilities and to allow for the appearance of exceptions to Nakai’s generalization, like niwa-si ‘gardener’ appearing as N2. Principles (d) and (e) are based on Nakai’s generalizations that some word-phrase compounds have a loanword N2, which is either long (5+ moras in length), or which has low register and an accent on the second mora of the word. For this fieldwork, this latter observation was loosened to include accent anywhere in the middle of the word to allow for the consideration of more possible N2s. Shorter loanwords were considered as well, starting at 3 moras in length, as one Nakai compound has a 3-mora loanword N2, and there are several other Nakai compounds with a 4-mora loanword N2. This again allows for the appearance of exceptions to Nakai’s generalizations.
Principle (f) is loosely based on Nakai’s generalization involving low register N2s, as it is a mirror principle to this generalization, but it is also based on the observation that many of the Nakai compounds have a low register N1. Nakai reports word-phrase compounds with N1s and N2s having both high and low registers, resulting in a typology of four types of word-phrase compounds based on the registers of their input components – high register N1 and N2, high register N1 and low register N2, low register N1 and N2, and low register N1 and high register N2. Schematically, this typology can be represented as the following in (171), with x’s representing each mora and a hyphen separating N1 and N2. An accent is arbitrarily placed after the third mora in N2 to show that N2 has an accent (if it has one in isolation).
However, the word-phrase parse is most easily identified with compounds involving at least one low register component (171b–d). This is because when both components are high register, even if N1 loses its accent and N2 retains its register, it is difficult to distinguish between the word-phrase parse and the mono-phrasal parse, in which N1 loses its accent, and N2 acquires N1’s register. Schematically, a word-phrase compound with two high register components compares to a high-register mono-phrasal compound in the following example in (172).
The result of both would be a compound with a high tone plateau from the beginning until the N2-internal accent. As described in Kori (1987), when a high plateau encounters the high tone of a following word, the two essentially coalesce. Accordingly, it would be very difficult to tell such parses apart, if any distinction can be made at all. Using an N1 with a low register allows for a clear distinction to be made, regardless of whether N2 has a high or low register. If N1 and N2 are both low register, then an N1-final high tone (which is found in low register unaccented words in isolation) will split N1 and N2, distinguishing it from a low-register mono-phrasal compound, which would have a low tone plateau from the beginning of the compound until the N2-internal accent, as shown below in (173).
If N1 is low register and N2 is high register, N1 will surface with a low tone plateau, and N2 will surface with a high tone plateau until the accent, a pattern which is distinct from both low register mono-phrasal compounds as discussed above and high register mono-phrasal compounds, which, as discussed above, have a high tone plateau from the beginning of the compound until the N2-internal accent. Similarly, if N1 is high register and N2 is low register, N1 will surface with a high tone plateau until the end of N1, and N2 will begin with a low tone plateau that continues until the accent. These two possibilities are shown below in (174) with comparison to high and low register mono-phrasal compounds.
5.3.2 Methods
PowerPoint slides were prepared with each item of interest included in a frame conversation. In some cases, two related items were included in the same slide, such as siteiseki-ryookin ‘fare for designated seating’ and ziyuuseki-ryookin ‘fare for non-reserved seating.’ A picture representing the item (such as a picture of a fire alarm or a museum) or of something related to the item (such as a picture of a place where the item can be found or a situation using the item) was also included in each slide in order to provide additional information about what an item refers to. Frame conversations consisted of, at minimum, a question and an answer. For example, one frame conversation was [place]
Each PowerPoint slide deck included two slides per item, one in which one participant was the question asker and the other was the answerer, and an equivalent slide with the roles reversed, and dialect-appropriate changes made to reflect the reversed roles. An example of a conversation used during elicitation is given below. The item of interest is bolded.
5.3.3 Participants
The participants were two adult female native speakers of a Kansai Japanese dialect. Each participant lived in the Kansai Region of Japan for at least 20 years and has spent at least 15 years outside of the Kansai Region, either in Japan or abroad. Both now reside in the United States and are teachers of Standard Japanese. Both were informed that they would be participating in a study of differences in how compound words are pronounced in Kansai Japanese dialects. While the participants had prosody consistent with Kansai Japanese prosody (e.g., words have high and low registers, verbs and adjectives and their conjugations have Kansai Japanese prosody), the specific realizations of lexical items and compounds in their dialects differed from each other in terms of features such as register, accentedness, and accent location in accented words (e.g., one participant may pronounce a word with a high register, while the other pronounces it with a low register). The participants also consistently differed from each other on which sentence ending particles their dialects preferred.
5.3.4 Procedure
All sessions were conducted as group sessions with both participants. Due to restrictions related to the COVID-19 pandemic, sessions were conducted online by Zoom for about 4 months during the height of the pandemic. When restrictions were loosened, sessions were conducted in-person. Recording of sessions was accomplished with Zencastr, an online podcast recording service which creates local recordings instead of online cloud recordings. Local recordings have the advantage of ensuring the highest possible recording quality and protecting against any loss of relevant linguistic information due to connectivity issues or audio compression related to connectivity issues (Sanker et al. 2021). Zencastr was used for both online and in-person sessions. During Zoom sessions, audio was recorded from participants using the microphones on their laptops. During in-person sessions, audio was recorded with a FIFINE K668 USB microphone connected to a laptop with Zencastr recording the session.
Several days before each session, a PowerPoint slide deck containing the slides to be used for the session was sent to the participants on Google Drive to be edited. This was done to ensure that the frame conversations on each slide were in natural Kansai Japanese for each speaker prior to the session. Participants edited the slides on their own, with each participant correcting their own lines. Participants were compensated for their time in both elicitation sessions and PowerPoint editing sessions.
Each session was divided into two parts. The first part was a group elicitation section, during which both participants would read aloud conversations containing items from the prepared PowerPoint slides. Group elicitation was performed in order to reduce interference from Standard Japanese and to ensure that Kansai Japanese pronunciations were used during the production of items, as well as to obtain the pronunciation of compounds in a conversational context. All slides with one participant as the first speaker and the other participant as the second speaker were read before switching roles. This was done in order to reduce the influence of each participant’s pronunciations on the other participant’s pronunciations of target words.
The second part of each session consisted of one-on-one elicitation. During this part, participants were asked to produce items in isolation. Items were underlined on the PowerPoint slides (corresponding to the bolding in (175) above), which were again presented to the participants during this part. Participants were allowed to use the context of the frame conversations to help them maintain Kansai Japanese pronunciations if necessary. Once a participant had read an item in isolation, they were asked to pronounce each component of the item in isolation. Thus, a participant would produce an item like pengin-suizokukan ‘penguin aquarium’ in isolation, followed by pengin ‘penguin’ in isolation and suizokukan ‘aquarium’ in isolation. Each participant was asked to do this for about five to ten items depending on the number of items to be elicited and the amount of time remaining in the session. Once these had been completed, the participants would switch, and the process would be repeated until all items were elicited or there was no more remaining time in the session. In order to confirm the prosody of each item and its components, I repeated each pronunciation back to the participants until I received confirmation that the pronunciation was correct. I then recorded the obtained prosody on a sheet of paper containing all items to be elicited for the session. Where necessary to distinguish between accent locations, I presented self-produced or computer-synthesized pairwise comparisons and had participants confirm which production of the pair matched their production. Computer-synthesized productions were created using a voice synthesis program created by AI Inc. called A.I.VOICE.
For elicitation, participants were asked to produce items in the way that they would say them in their dialect. In some cases, they also offered alternative pronunciations that they expected they might hear from other speakers of their own dialect or of another Kansai Japanese dialect, or that they thought they might produce themselves on another occasion. In general, when participants gave a non-word-phrase pronunciation, they were not asked if a word-phrase pronunciation would be possible. Because of the possibility that the conditioning factors of the word-phrase parse are statistical in nature, it was determined that it would be more beneficial to obtain data on a larger range of items rather than take extra time to probe alternative pronunciations for each compound.
In addition to these primary elicitation tasks, the participants were occasionally asked questions about the compounds. These included questions that probed syntactic structure (which is suggested by where a speaker might place the genitive particle no, cf. probing whether John’s history book is a book of John’s history or a history book of John), questions about the meaning or assumed meaning of a compound, and questions about the naturalness of a compound or equivalent expressions that might be more natural than the item presented. For example, an important question for longer compounds such as toohoku-akusento-ziten ‘Tohoku Accent Dictionary’ is what underlying syntactic branching the compound has. This compound could mean an accent dictionary of/from/regarding the Tohoku region, or it could mean a dictionary of Tohoku accent. Participants were asked where they would place the genitive particle no, which was taken as indicative of what kind of syntactic branching the compound has. An example of probing naturalness was asking whether tanuki-nuigurumi ‘(intended) raccoon dog plush toy’ sounded natural or if there was a more natural expression (in this case, tanuki-nuigurumi was judged unnatural and the version with no, tanuki no nuigurumi, was offered as the more natural alternative). In some cases, these questions would lead to additional compounds suggested by the participants, which would then be elicited later in the session or in a subsequent session.
In total, 218 compounds were elicited from the participants.
5.3.5 Data Processing and Analysis
5.3.5.1 Obtaining Measures of Informativeness
As mentioned previously, the present study was interested in two measures of absolute predictability, one for N1 and one for N2, and four measures of relative predictability, two for N1 and two for N2. Data for these measures was collected from the Balanced Corpus of Contemporary Written Japanese (BCCWJ; NINJAL 2022), which is a corpus of approximately 100 million words of written Japanese collected from various media including general books, magazines, newspapers, legal documents, internet blogs, and other forms of print or digital written media spanning a period of 30 years from 1976 to 2006. This corpus was selected due to its large size and because many of the compounds elicited tend to appear in more formal discourse, which is more likely to be written.
The BCCWJ corpus is primarily interacted with using the National Institute for Japanese Language and Linguistics (NINJAL)’s Chunagon corpus search application. The BCCWJ’s database search function is divided into several search types: short unit word searches (
The long unit word search allows corpus users to search for longer word units, based on phrases. In this search function, kenkyuuzyo
Character string search was generally not used to collect informativeness measures, except in two main cases. The first case is when a compound was expected to exist in the corpus, such as minami-taiheiyoo
The position search, which allows users to search for words based on sample ID and position of the word in the corpus, was not used.
The absolute predictability measures, the raw corpus frequencies of N1 and N2, were obtained simply by running a search query for the word in question in a short unit word search in the BCCWJ, using multiple conditions if necessary (such as for N2s like kenkyuuzyo
The token-based relative predictability measures were calculated using the raw frequency counts for N1/N2 and the compounds. The conditional probability of N1 given N2 was calculated by dividing the raw corpus frequency of a compound containing N1 by the raw corpus frequency of N2. The conditional probability of N2 given N1 was calculated by dividing the raw corpus frequency of a compound containing N2 by the raw corpus frequency of N1.
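The two token-based ratios described above can be sketched as follows. This is only an illustration: the counts are invented rather than taken from the BCCWJ, the function and variable names are hypothetical, and the study's own calculations were not necessarily scripted this way.

```python
def token_conditional_probs(compound_freq: int, n1_freq: int, n2_freq: int) -> tuple[float, float]:
    """Return (P(N1 | N2), P(N2 | N1)) from raw token counts."""
    p_n1_given_n2 = compound_freq / n2_freq  # compound tokens / N2 tokens
    p_n2_given_n1 = compound_freq / n1_freq  # compound tokens / N1 tokens
    return p_n1_given_n2, p_n2_given_n1


# Hypothetical counts (assumption): the compound occurs 20 times,
# N1 occurs 400 times, and N2 occurs 1000 times.
cp_n1_given_n2, cp_n2_given_n1 = token_conditional_probs(
    compound_freq=20, n1_freq=400, n2_freq=1000
)
print(cp_n1_given_n2, cp_n2_given_n1)
```

With these invented counts, N1 is less predictable given N2 than N2 is given N1, because N2 is the more frequent component.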
Family sizes are the number of types of compounds with a given N1 or N2. Obtaining family sizes required a multi-condition short unit word search. In order to do this, the first short unit word lexeme in the constant component (for example, kenkyuu
Because of the search procedure, leaving the data as-is would result in an overestimation of how many compound types existed with a given N1 or N2. This is due to the fact that not every noun-noun sequence in the corpus is actually a compound. Many cases are in fact similar to the type discussed by Bell and Plag (2012) as the tea mother cases, which arise when two nouns come together because the second one is a vocative, as in the sentence Would you like some tea, mother?, which are not actually compounds. In Japanese, many adverbial phrases involve a word that was tagged as a noun. These include sequences such as sono ato [N2 of interest], literally ‘afterwards, N2’ or [N1 of interest] mainiti, literally ‘N1 everyday …’ occurring at a sentence boundary that was not marked. Each CSV downloaded for family size calculation was examined for such cases, and these cases were removed.
Three additional case types were removed as well. First, compounds involving a so-called Aoyagi prefix (Poser 1990b), such as doo
All remaining data not involving these cases were assumed as a heuristic to contain legitimate compounds and were retained due to time constraints. Under the assumption that legitimate compounds would appear in noun-noun sequence searches more frequently than tea mother-like sequences, given the removals above, it seems safe to assume that the great majority of the remaining data consists of legitimate compounds. A future study would involve more thorough and rigorous cleaning of the data.
Returning to the type-based measures of predictability, once the family size CSVs were cleaned up with the aforementioned removals, the conditional probability of N1 given N2 was calculated by dividing 1 (representing the one type in which N1 and N2 form a compound) by the family size of N2, and the conditional probability of N2 given N1 was calculated by dividing 1 by the family size of N1.
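As a minimal sketch of the type-based calculation, assuming a cleaned list of compound types is already available, family sizes can be counted and inverted as below. The compound list is invented for illustration (only pengin-suizokukan appears in the elicitation materials), and all names in the code are hypothetical.

```python
from collections import defaultdict

# Hypothetical cleaned compound types as (N1, N2) pairs; not corpus data.
compound_types = {
    ("pengin", "suizokukan"),
    ("iruka", "suizokukan"),
    ("kame", "suizokukan"),
    ("pengin", "nuigurumi"),
}

n1_family = defaultdict(set)  # N1 -> set of N2s it combines with
n2_family = defaultdict(set)  # N2 -> set of N1s it combines with
for n1, n2 in compound_types:
    n1_family[n1].add(n2)
    n2_family[n2].add(n1)


def type_conditional_probs(n1: str, n2: str) -> tuple[float, float]:
    """Return (P(N1 | N2), P(N2 | N1)) as 1 divided by the relevant family size."""
    return 1 / len(n2_family[n2]), 1 / len(n1_family[n1])


probs = type_conditional_probs("pengin", "suizokukan")
print(probs)
```

Here the N2 family has three members and the N1 family has two, so the first probability is 1/3 and the second is 1/2.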
In order to account for compounds with a frequency of 0 in the corpus, the Laplace transformation, as discussed in Brysbaert and Diependaele (2013), was used. The Laplace transformation involves adding 1 to every frequency and increasing the corpus size by the number of types in the corpus. Accordingly, I added 1 to every raw frequency and family size count in the data, such that no frequency for the data examined was 0.
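The add-one step can be illustrated with a short sketch. The counts and names below are hypothetical; the sketch covers only the step of adding 1 to each count, as described in this paragraph, and shows why it keeps the conditional probabilities (and their logarithms) defined for unattested compounds.

```python
def add_one(counts: dict[str, int]) -> dict[str, int]:
    """Add 1 to every count so that no value is 0."""
    return {name: value + 1 for name, value in counts.items()}


# Hypothetical raw counts (assumption): the compound itself is unattested.
raw = {"compound": 0, "n1": 399, "n2": 999}
smoothed = add_one(raw)

# Without smoothing this ratio would be 0 and its log would be undefined.
cp_n1_given_n2 = smoothed["compound"] / smoothed["n2"]
print(smoothed, cp_n1_given_n2)
```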
Finally, of the 218 compounds elicited from the participants (henceforth referred to as “participant compounds”) 10 were discarded for the present analysis. There were two reasons for this. In the first case, one or both components included so many results that processing them for family size determination would have been unfeasible for the present work. For example, a search of nippon
The participant compounds and Nakai compounds and their related informativeness measure values were pooled into the same document.
For the purposes of the present analysis, compounds were classified as having the word-phrase parse available or not based on whether at least one participant or Nakai reported a word-phrase parse. Although this comes with the obvious risk of collapsing all of Nakai’s consultants into one entity, ‘the Nakai dictionary,’ doing this allows for treating all of the Kansai Japanese data together, regardless of the specific production of any given speaker. This also simplifies the analysis by making the dependent variable, whether a word-phrase parse is available, a binary variable, rather than a ternary variable, such as ‘yes, both participants and the Nakai dictionary report a word-phrase parse,’ ‘yes, at least one, but not all three report a word-phrase parse,’ and ‘no one reports a word-phrase parse.’ The danger of collapsing all of Nakai’s consultants into a single entity is readily apparent in the case of a ternary variable. Specifically, because Nakai does not report how many speakers gave a word-phrase parse, it is not clear exactly how strong a ‘yes’ from the Nakai dictionary actually is. The use of a binary variable allows for taking Nakai’s reports into consideration without making any claims about the strength of a ‘yes’ report from the Nakai dictionary.
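The binary coding amounts to asking whether any source reports the parse. A minimal sketch, with hypothetical source labels and the y/n values used for the dependent variable:

```python
def word_phrase_available(reports: dict[str, bool]) -> str:
    """Code 'y' if at least one source reports a word-phrase parse, else 'n'."""
    return "y" if any(reports.values()) else "n"


# Hypothetical reports for one compound (assumption); a compound may have
# fewer sources than this, and the coding works the same way.
example = {"participant_1": False, "participant_2": True, "nakai": False}
coded = word_phrase_available(example)
print(coded)
```

Because the coding only checks for at least one positive report, it makes no claim about how many of Nakai's consultants, or how many participants, produced the parse.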
5.3.5.2 Visualizing the Data
All of the informativeness measures collected above were put into a single CSV file, along with their corresponding compounds and whether the compound was reported by at least one of the participants or the Nakai dictionary as having the word-phrase parse available. Whether a compound has the word-phrase parse (henceforth “Word-phrase parse?”) was plotted against the measures of informativeness in RStudio (RStudio Team 2020) to visualize the data using the ggplot2 package (Wickham 2016) and its violin plot function. The measures of informativeness have been converted to log scale using the log function in RStudio in order to aid in visualization and to reflect the fact that log converted measures of informativeness were used in the statistical analysis. Note that because of the log conversion, the range of values for log(measure of informativeness) on the y-axis differs from plot to plot. Box plots showing the median, 25th percentile, and 75th percentile are also provided, overlaid onto each violin plot. Several of the plots do not suggest anything, as the densities are similar whether the word-phrase parse is available or not. These plots are included with plots that do suggest a role for informativeness for completeness.
To reduce the effects of outliers on these visualizations, a process of outlier removal was undertaken. A data point was considered an outlier if the value of at least one of the frequency measures involved in the calculation of the conditional probabilities (i.e., the raw frequencies of the compound, N1, and N2, and the family sizes of N1 and N2) was at or above the 97.5th percentile, as calculated in R. Though somewhat stipulative, this method allows for the removal of true outliers, given the relatively small sample of data points and the fact that data points with very high values for these frequency measures usually had values that were several times larger than nearby, lower-percentile values. For example, in one case, the largest value (100th percentile) for the raw frequency of N1 was 42,935, whereas the 95th percentile as calculated in R was 15,773.30.
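The removal rule can be sketched as follows. The data are invented, the column names are hypothetical, and the percentile function uses linear interpolation between order statistics, which is the default rule ("type 7") of R's quantile(); whether the default was used in the study is an assumption.

```python
MEASURES = ["compound_freq", "n1_freq", "n2_freq", "n1_family", "n2_family"]


def percentile(values, p):
    """Percentile by linear interpolation between order statistics."""
    xs = sorted(values)
    h = (len(xs) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


def remove_outliers(rows, p=0.975):
    """Drop a row if any frequency measure is at or above that measure's cutoff."""
    cutoffs = {m: percentile([r[m] for r in rows], p) for m in MEASURES}
    return [r for r in rows if all(r[m] < cutoffs[m] for m in MEASURES)]


# Hypothetical data (assumption): ten rows, the last with a very large N1 frequency.
rows = [{m: i for m in MEASURES} for i in range(1, 11)]
rows[-1]["n1_freq"] = 1000
kept = remove_outliers(rows)
print(len(rows), len(kept))
```

Note that a row is removed as soon as one of its five measures reaches the cutoff, so a single extreme N1 frequency is enough to exclude a compound.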



Word-phrase parse? (wpalpha) vs. raw frequency of N1 (rfw1)



Word-phrase parse? (wpalpha) vs. raw frequency of N2 (rfw2)
The two violin plots in Figures 76 and 77 show “Word-phrase parse?” (represented as “wpalpha” with the values y(es) and n(o)) plotted against the raw frequencies in tokens of N1 (rfw1) and N2 (rfw2). The log(raw frequency) values begin at 0, since raw frequencies are always positive integers. Not much is suggested by these plots, as the densities for log(raw frequency) values for N1 and N2 are similar, regardless of whether a compound which contains N1 or N2 is pronounced with a word-phrase parse or not.
Turning to the relative measures of informativeness, “Word-phrase parse?” was plotted against the two token-based measures of informativeness, the conditional probability of N1 given N2 (cpn1gn2) (Figure 78), and the conditional probability of N2 given N1 (cpn2gn1) (Figure 79). Like the plots for raw frequency, these plots do not suggest much, as the box plots across wpalpha overlap in conditional probability values.



Word-phrase parse? (wpalpha) vs. conditional probability of N1 given N2 (tokens) (cpn1gn2)



Word-phrase parse? (wpalpha) vs. conditional probability of N2 given N1 (tokens) (cpn2gn1)
Because of the tendency for word-phrase compounds to have a compound N2, as noted by Nakai (2002), it is also useful to visualize the data split by whether N2 is a compound or not. In these plots, the bottom x-axis indicates wpalpha, while the top x-axis indicates whether N2 was a compound (y) or not (n). Like the preceding plots, there is overlap in the densities/box plots from plot to plot, so these plots do not suggest much about the role for informativeness in word-phrase parse availability.



Word-phrase parse? (wpalpha, bottom x-axis) vs. conditional probability of N1 given N2 (tokens) (cpn1gn2), split by N2 compound status (top x-axis)



Word-phrase parse? (wpalpha, bottom x-axis) vs. conditional probability of N2 given N1 (tokens) (cpn2gn1), split by N2 compound status (top x-axis)
“Word-phrase parse?” was also plotted against the type-based informativeness measures, conditional probability of N1 given N2 family size (cpn1gn2fs) (Figure 82), and conditional probability of N2 given N1 family size (cpn2gn1fs) (Figure 83). There is once again significant overlap in the box plots/violin density in these. Although the “yes” word-phrase parse has greater density than the “no” word-phrase parse in the middle ranges in Figure 82 for the conditional probability of N1 given N2 family size (cpn1gn2fs), there is greater density for “no” in the low ranges of log(cpn1gn2fs). On the other hand, the conditional probability of N2 given N1 family size (cpn2gn1fs) in Figure 83 has greater density with very low log(cpn2gn1fs) values for compounds with the word-phrase parse, when compared to compounds without the word-phrase parse. This could be taken to suggest a possible role for informativeness in word-phrase parse availability, when it comes to type-based conditional probabilities, which may be expected given Schreuder and Baayen (1997) reporting greater psychological salience of type frequencies.



Word-phrase parse? (wpalpha) vs. conditional probability of N1 given N2 family size (types) (cpn1gn2fs)



Word-phrase parse? (wpalpha) vs. conditional probability of N2 given N1 family size (types) (cpn2gn1fs)
The possible role of the conditional probability of N2 given N1’s family size is suggested further when these are split by whether N2 is a compound or not. Let us consider the conditional probability of N1 given N2’s family size first (cpn1gn2fs). As shown in Figure 84, when N2 is not a compound, it seems that lower conditional probability values occur more often when the compound does not have the word-phrase parse available. On the other hand, when N2 is a compound, there appears to be a weak tendency for lower conditional probability values to occur with compounds that have the word-phrase parse, as part of the box plot for “yes” on wpalpha is located below the 25th percentile of the box plot for “no.” I take this to suggest a role for informativeness in word-phrase prosody, possibly interacting with whether N2 is a compound or not.



Word-phrase parse? (wpalpha, bottom x-axis) vs. conditional probability of N1 given N2 family size (types) (cpn1gn2fs), split by N2 compound status (top x-axis)
Moving on to the conditional probability of N2 given N1’s family size (cpn2gn1fs) in Figure 85, there is a trend for lower conditional probability values to occur with compounds that have the word-phrase parse, whether N2 is a compound or not. When N2 is not a compound, the median conditional probability value for compounds with the word-phrase parse is lower than the 25th percentile of the conditional probability values for compounds without the word-phrase parse, and much of the right-hand box falls below this point as well. When N2 is a compound, the median conditional probability value for compounds with the word-phrase parse is approximately the same as the 25th percentile of the conditional probability values for compounds without the word-phrase parse, and some amount of the right-hand box falls below this point as well. As with the plots in Figure 84, I take this to suggest a role for informativeness in word-phrase prosody, again possibly interacting with whether N2 is a compound or not.



Word-phrase parse? (wpalpha, bottom x-axis) vs. conditional probability of N2 given N1 family size (types) (cpn2gn1fs), split by N2 compound status (top x-axis)
In the next section, I turn to a statistical analysis of a subset of the informativeness measures considered above in order to formally confirm a role for informativeness in word-phrase parse availability.
5.3.5.3 Statistical Modeling
As previously mentioned, whether a compound has a word-phrase parse available was considered a binary variable “yes, at least one source reports the word-phrase parse” vs. “no, no source reports the word-phrase parse” for the present study. Accordingly, a binomial logistic regression is appropriate for conducting a statistical analysis on this data (Gries 2013). The binary variable “Word-phrase parse?” was set as the dependent variable for this analysis, with the measures of informativeness treated as independent variables.
The six measures of informativeness discussed to this point are the raw frequency of N1 in the corpus, the raw frequency of N2 in the corpus, the conditional probability of N1 given N2 (tokens), the conditional probability of N2 given N1 (tokens), the conditional probability of N1 given N2’s family size (types), and the conditional probability of N2 given N1’s family size (types). As mentioned above, the raw frequencies are involved in the calculation of the conditional probability of N1 given N2 (tokens) and the conditional probability of N2 given N1 (tokens), so there is a correlation between the raw frequency measures and the conditional probabilities. One of the assumptions of a binomial logistic regression is that there are relatively low levels of correlation between the independent variables. High levels of correlation can make the results of a statistical analysis essentially uninterpretable, because it is then not clear which factors explain the data. In order to check the severity of correlations between independent variables, a variance inflation factor (VIF) was calculated for each independent variable in R. If raw frequencies are not removed from the model, the VIF for each factor ranges from relatively low, below 3, for predictors which are not related to informativeness measures, such as whether N2 is a compound, to extremely high, above 1 × 10¹³, for informativeness measure predictors. Accordingly, because the raw frequencies are used to calculate the conditional probabilities and are highly correlated with them, the raw frequencies were removed. With raw frequencies removed from the model, the VIF for each factor was relatively low, below 4.3, indicating relatively low levels of correlation between factors. This leaves only the four conditional probability measures as independent variables.
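For reference, the textbook definition of the variance inflation factor for a predictor j is given below, where R_j^2 is the coefficient of determination obtained by regressing predictor j on all of the other predictors. The notation is an assumption introduced here for illustration; the specific R function used to obtain the values reported above is not stated.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

As R_j^2 approaches 1, the VIF grows without bound, which is the pattern expected when one predictor is very nearly a function of the others.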
In addition to the informativeness measures, three additional factors were also included. These were the length of N2 in moras, known to be a factor for Kansai Japanese compounds in general, whether N2 is a foreign loanword or not (based on Nakai’s generalization that word-phrase compound N2s may be foreign loanwords), and whether N2 is a compound or not (based on Nakai’s generalization that word-phrase compound N2s tend to be compounds). For this last factor, N2s were considered compounds if they had the form of one of the compound types present in Kansai Japanese as discussed in this work. Sino-Japanese words made up of two Sino-Japanese morphemes, such as bangumi
Given that most of the word-phrase compounds reported by Nakai have a compound as an N2, there is some sense in which this is the “prototypical” word-phrase structure, at least descriptively speaking. The visualization of the data from the previous section also suggests some role for whether N2 is a compound in word-phrase compounds. Accordingly, I also considered the possibility that whether a compound has a “prototypical” N2 interacts with the informativeness measures. Thus, in addition to the simple factors of the four informativeness measures, as well as N2 length, N2’s status as a loanword, and N2’s status as a compound, I also considered the interaction of N2 being a compound with the conditional probabilities of N1 given N2 and N2 given N1 in both tokens and types. In addition, because N2 length in moras has long been known to be an important factor in Japanese compound prosody, I also considered the interaction of N2 length with the four measures of informativeness.
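Schematically, the predictors and interactions described in this and the preceding paragraphs correspond to a logistic model of the following general form. The symbols are introduced here for illustration only and are not the variable names used in R: x_1 to x_4 stand for the four conditional probability measures, L for the length of N2 in moras, F for whether N2 is a foreign loanword, and C for whether N2 is a compound.

```latex
\operatorname{logit} P(\text{word-phrase parse} = \text{yes})
  = \beta_0
  + \sum_{k=1}^{4} \beta_k x_k
  + \beta_5 L + \beta_6 F + \beta_7 C
  + \sum_{k=1}^{4} \gamma_k \, (x_k \times C)
  + \sum_{k=1}^{4} \delta_k \, (x_k \times L)
```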
The initial model for this binomial logistic regression, then, is as follows. The abbreviated variable name I used in R is given as well.
The informativeness factors were centered and log-transformed (represented in R by the scale() and log() functions) in order to reduce the influence of any remaining outliers below the 97.5th percentile in the data.
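A rough Python analogue of applying log() and then scale() is sketched below. It assumes the natural logarithm (the default of R's log()) and the default behavior of R's scale(), which centers on the mean and divides by the sample standard deviation; the order of operations, the use of these defaults, and the example values are all assumptions.

```python
import math
import statistics


def log_then_scale(values):
    """Natural-log transform, then center on the mean and divide by the sample SD."""
    logged = [math.log(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample SD (n - 1 denominator)
    return [(x - mean) / sd for x in logged]


# Hypothetical conditional probabilities spanning three orders of magnitude.
z = log_then_scale([0.001, 0.01, 0.1])
print(z)
```

Values that differ by a constant factor become evenly spaced after the log step, which is why very large raw values exert less pull on the model than they would untransformed.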
The formula used in R for this model is given in (177) below.
5.3.5.4 Results of the Model and Discussion
When this model is run in R using the generalized linear model glm() function, the following results in Figure 86 are given. This model was run after removing outliers from consideration according to the process described above in the section on visualization of the data.



Results of a binomial logistic regression on the model in (176)
As the results show, none of the single factors are significant. However, all four interactions between the conditional probability measures and whether N2 is a compound are significant, one interaction between a conditional probability measure (cpn2gn1fs) and N2’s length in moras is significant, and another interaction between a conditional probability measure (cpn2gn1) and N2’s length approaches significance. First, the interactions of whether N2 is a compound with the conditional probability of N2 given N1 family size (types) (cpn2gn1fs), the conditional probability of N1 given N2 (tokens) (cpn1gn2), and the conditional probability of N2 given N1 (tokens) (cpn2gn1) are significant at the p < 0.05 level. The interaction of N2’s length in moras with the conditional probability of N2 given N1 family size (types) (cpn2gn1fs) is also significant at the p < 0.05 level. Even more significant, however, is the interaction of whether N2 is a compound with the conditional probability of N1 given N2 family size (types) (cpn1gn2fs), which is significant at the p < 0.001 level. The interactions of whether N2 is a compound and of N2’s length in moras with the family size-based measures of informativeness may be expected if informativeness measures play a role in whether a word-phrase parse is available, given that type frequencies are psychologically more salient, as found by Schreuder and Baayen (1997). However, the model also finds a role for the interactions between whether N2 is a compound and the token-based informativeness measures, further suggesting a role for frequency in the word-phrase parse, whether frequency is measured in types or in tokens.
Second, unlike the case of English compound prosody, in which it is the informativeness of N2 which plays a role in double accenting (noting that Bell and Plag did not test N1 in the same way that N2 was tested), it may be the case that Kansai Japanese word-phrase compounds are more concerned with the informativeness of N1 than the informativeness of N2, if the greater significance of the interaction between whether N2 is a compound and the conditional probability of N1 given N2’s family size (cpn1gn2fs) can be taken as related to a larger role of N1’s informativeness in whether a compound can have the word-phrase parse. In this case, the former conception discussed above of N1 losing its accent being the mark of surprisal may be the state of affairs in Kansai Japanese, rather than the latter conception of N2 retaining its accent being the mark of surprisal. That said, the fact that there is a significant interaction between N2’s length in moras and the conditional probability of N2 given N1’s family size (cpn2gn1fs), despite no other interaction with N2 length being significant, as well as both interactions between whether N2 is a compound and conditional probabilities of N2 given both N1 family size (cpn2gn1fs) and N1 tokens (cpn2gn1) being significant, suggests that N2 is by no means unimportant. The informativeness of both N1 and N2 is important for the availability of the word-phrase parse.
As expected, the traditional factor of N2’s length in moras determining compound length is perhaps not as important for word-phrase compounds as whether N2 is a compound or not. However, the fact that one interaction with N2’s length is significant and another approaches significance suggests that N2 length still has a role to play, even if it plays less of a role than whether N2 is a compound or not. That N2 length has a role to play at all is perhaps not unexpected, given that, as previously mentioned, compounds generally seem to need an N2 larger than a certain number of moras to be eligible for the word-phrase parse. That there is an interaction between N2 length and one of the conditional probability measures confirms a role for N2 length, even if such a role is not the same as or as clear-cut as the role it has for other compound types.
Although whether N2 is a compound or not is not significant by itself as a factor, when it is considered interacting with the informativeness measures, its effect can be seen. This is interesting given that the compound status of N2 has not to this point played a role in the prosodic parse of a compound. While it cannot be the case that N2 being a compound is necessary, as word-phrase examples like gasorin-sutando and gureepu-huruutu, which both have simplex loanword N2s, show, when a compound N2 is combined with low informativeness in N1 or N2, a word-phrase parse is more likely to result. In the case of word-phrase compounds with simplex N2s, perhaps these are in some sense “super” informative, which may lead to the word-phrase parse, or perhaps the significant interaction of N2 informativeness (types) with the length of N2 in moras or the interaction of N2 informativeness (tokens) with the length of N2 in moras, which approaches significance, may be playing a role. There is also the possibility that some of these compounds are pseudo-compounds (Kubozono 2002, Karvonen 2005), as previously discussed in the section on the syntax-prosody mapping of mono-phrasal compounds, even though their parts also exist as independent words. This may be the case for gureepu-huruutu ‘grapefruit,’ as although gureepu ‘grape’ and huruutu ‘fruit’ both exist, gureepuhuruutu refers to a different fruit, not a grape, just as ‘grapefruit’ does in English, compared to ‘grape’ and ‘fruit.’ Further investigation of pseudo-compounds would be needed to determine whether the word-phrase parse is available to them, and if so, exactly how informativeness might influence the word-phrase parse. It may be that even though a gureepuhuruutu is not literally a gureepu-huruutu ‘fruit which is a grape, fruit of the grape plant,’ the informativeness of the N2 huruutu ‘fruit’ still plays a role. 
Similarly, irasutoreesyon ‘illustration’ shows compound accentuation irasuto-re’esyon in Tokyo Japanese (Kubozono 2002), but the division of the word results in the combination of irasuto ‘illustration’ (clipping of irasutoreesyon) and an element identical to reesyon ‘field/combat rations.’
In order to refine the model, I attempted to remove one of the least significant interactions in the model, which was the interaction between N2 length in moras and the conditional probability of N1 given N2 family size (cpn1gn2fs). When the original and the simplified models were compared with a chi-square test using the anova() function in R, the comparison returned a Pr(>Chi) of 0.5733, a value that indicates no significant difference in fit between the two models. The original model was kept. The original model has a p-value from the chi-squared distribution of 0.0008632245, as calculated by R, indicating that the model is highly significant.
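A chi-square comparison of two nested logistic models rests on the difference in their residual deviances. The sketch below illustrates that calculation for the case where the models differ by a single parameter (one degree of freedom), which is assumed here for the removal of one interaction between two numeric predictors. The deviance values are hypothetical, the function name is invented, and this is not a reimplementation of R's anova().

```python
import math


def lrt_p_value_df1(deviance_reduced: float, deviance_full: float) -> float:
    """p-value of a likelihood-ratio (chi-square) test with 1 degree of freedom.

    For 1 df, the chi-square upper tail P(X > x) equals erfc(sqrt(x / 2)).
    """
    stat = deviance_reduced - deviance_full
    if stat < 0:
        raise ValueError("the reduced model cannot fit better than the full model")
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical residual deviances (assumption): dropping the term raises
# the deviance only slightly, so the p-value is large.
p_small_change = lrt_p_value_df1(deviance_reduced=200.3, deviance_full=200.0)
# A larger increase in deviance gives a small p-value.
p_large_change = lrt_p_value_df1(deviance_reduced=210.0, deviance_full=200.0)
print(p_small_change, p_large_change)
```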
These results suggest, first, that informativeness may play a role in whether a compound can have the word-phrase parse in Kansai Japanese. Second, they suggest that neither informativeness measures nor morphological or phonological length factors are sufficient on their own to influence the availability of a word-phrase parse. Rather, this availability seems to come from some combination of factors, particularly the morphological status of N2 as a compound itself, in combination with the informativeness of N1 and N2, and to some extent the length of N2 in moras, particularly in combination with the informativeness of N2.
These results open up a path to further research in the role of gradient, usage/frequency-based measures in influencing prosodic structures in Japanese and in other languages. In particular, because the word-phrase parse is predicted as a separate prosodic structure, per the discussion in Chapter 3, it seems that informativeness does not simply influence a compound’s pronunciation itself, but rather, it does so because informativeness plays a role in mapping a compound syntactic structure to a word-phrase prosodic structure. If it is the case that informativeness is involved, however, this will necessitate an approach to syntax-prosody mapping that takes into account such gradient factors. One possible approach would be to attempt to implement Match Theory in approaches that use weighted constraints, such as Harmonic Grammar (Legendre, Miyata, and Smolensky 1990), Maximum Entropy Grammar (Goldwater and Johnson 2003), or Gradient Symbolic Computation (Smolensky and Goldrick 2016). For such grammars with weighted constraints, higher or lower weights may be assigned based on higher or lower values of informativeness. For example, in cases when the informativeness of one or both constituents of the compound is high (= greater surprisal), then the relative weight of a constraint that would force N2 to surface within a minimal phonological phrase co-extensive with N2, leaving N1 the daughter of only the maximal phonological phrase, may allow (perhaps optionally) the word-phrase parse to surface, even if no syntactic or phonological factors require the word-phrase parse to surface. If this is reasonable, it seems that an additional constraint forcing this phonological phrase would be required, as the constraints covered in the previous chapter would only put N2 in its own minimal phonological phrase if N2 were long enough. Such a constraint may be a simple “map X⁰ to
I briefly sketch this proposal here, using tyuuka-ryooriya ‘Chinese restaurant,’ a compound which has both mono-phrasal and word-phrase prosodic patterns available, as many word-phrase compounds have the same N2 length as mono-phrasal compounds. Only the correct mono-phrasal and word-phrase outputs are considered in this illustration, and only Match constraints are given here. The informativeness-dependent constraint is given here as Match(X⁰,
The first tableau displays a competition in which the informativeness of N1 and N2 are not particularly high (= lower surprisal), so the mono-phrasal parse would be expected. Because the informativeness of N1 and N2 are low, the weight of the Match(X⁰,
However, if the informativeness of one of the components is high (which may be related to, for example, the speaker’s individual experience with the component words in question or the current conversational context), then Match(X⁰,
This time, the score of candidate (179b), the word-phrase parse, is lower, and so it is selected as the winner. Speculating on the meaning of the scores (which were based on arbitrary numerical weight assignments to the relevant constraints, so care should be taken in interpreting the scores too heavily beyond simple optimum determination), if scores are close, as they are in this illustration, it may have bearing on the fact that the word-phrase parse is never the sole parse available for compounds which allow it. Further investigation is needed.
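The logic of the weighted-constraint competition can be sketched in a few lines. Everything in the sketch is hypothetical: the constraint labels are placeholders rather than the constraints of the analysis, and the violation profiles and weights are invented. Following the discussion above, each candidate's score is the weighted sum of its violations, and the candidate with the lowest score wins.

```python
def weighted_score(violations: dict[str, int], weights: dict[str, float]) -> float:
    """Sum of (constraint weight x number of violations) for one candidate."""
    return sum(weights[c] * n for c, n in violations.items())


def pick_winner(candidates: dict[str, dict[str, int]], weights: dict[str, float]) -> str:
    """Return the candidate with the lowest weighted violation score."""
    return min(candidates, key=lambda name: weighted_score(candidates[name], weights))


# Hypothetical violation profiles (assumption): each parse violates one constraint.
candidates = {
    "mono_phrasal": {"OTHER_MATCH": 0, "INFO_DEPENDENT_MATCH": 1},
    "word_phrase": {"OTHER_MATCH": 1, "INFO_DEPENDENT_MATCH": 0},
}

# The informativeness-dependent constraint gets a low weight when
# informativeness is low and a higher weight when it is high (invented values).
weights_low_info = {"OTHER_MATCH": 2.0, "INFO_DEPENDENT_MATCH": 1.0}
weights_high_info = {"OTHER_MATCH": 2.0, "INFO_DEPENDENT_MATCH": 3.0}

winner_low = pick_winner(candidates, weights_low_info)
winner_high = pick_winner(candidates, weights_high_info)
print(winner_low, winner_high)
```

Raising the weight of the informativeness-dependent constraint is the only change between the two runs, which is enough to flip the winner from the mono-phrasal to the word-phrase parse.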
I leave the exploration of this possibility, as well as the exploration of semantic and pragmatic factors mentioned earlier in the chapter, to future research. For now, it seems possible to say that informativeness may play a role in compound prosody in Japanese as well, and that more research into this area (as well as into other types of non-syntactic, non-morphological, and non-phonological influences on prosodic structure, such as the semantic relations between compound constituents and pragmatic considerations) is warranted.
It should be noted that yo is used in Kansai Japanese dialects as well, and for one participant, yo was a typical sentence ending particle in answers.