31 December 2009

The lentil

The native Basque word for 'lentil (Lens culinaris)' can be reconstructed as *tink-il:(a), *sink-il:(a), most of whose variants are restricted to the dialect spoken in the Navarrese valley of Salazar1: tindil, txindil, txingla, xingala. The form txintxil(a) spread also to the neighbouring Roncalese dialect, and further along in the Aragonese valley of Ansó we find tentilla, an interference with the genuine Romance form lentilla.

This word (also found in other Romance languages like Spanish lenteja, French lentille) is derived from Latin lenticula, a diminutive form of Latin lēns, lentis 'lentil', with cognates in Germanic (Old High German linsī, linsin), Slavic *lę̄tjā and Baltic (Lithuanian lę̃ši-s (-iō)).  This could be related to Semitic *ʕa-daʃ- 'lentil' with a shift d- > l- and a nasal infix.

The Basque forms tilista, txilista, dilista (Westernmost dialects)2 are a diminutive form *ti-lis-ta with a fossilized Berber article (feminine plural) ti-3 agglutinated.

I think this is a substrate loanword whose ultimate origin is Afro-Asiatic *da/ingw- 'a k. of beans; corn', a root widely attested in several branches.
1 In other dialects this word can refer to different plants, mostly of the vetch (Vicia) family.
2 The Gipuzkoan form dilista was chosen for representing the word in the standardized language or Euskera Batua (lit. 'United Basque language').
3 Also found in Sardinian.

26 December 2009

More about pigs (updated)

Basque urde 'pig' is a Neolithic Wanderwort also found in NEC *wHa:rttɬ’wǝ 'boar, pig' and IE *pork´-o- 'young pig, piglet' and whose ultimate origin is Austronesian *beRek 'domesticated pig'.

On the other hand, there's the isolated Roncalese form ti, apparently similar to Albanian thi < IE *suH- 'pig'1.

Basque herauts 'male boar', herause, heusi, iñaus, iraus(i), irusi, i(h)ausi 'heat of sow' is a compound *ena-uśi whose second member is probably related to Kartvelian *eʃw- 'boar, pig' and possibly also to Berber *kus- 'pig'2, ultimately deriving from a Vasco-Caucasian root 'ungulate' (see this post).

This word has been the subject of some confusion regarding the Aquitanian theonym HERAVSCORRITSEHE. Although the German linguist Hugo Schuchardt correctly proposed its first element to be related to Basque errauts, erhauts 'cinder, dust' (a compound from erre 'burnt' and hauts 'dust'), he changed his mind afterwards and linked it to herauts, being unaware of the variants which point to a nasal in Proto-Basque.

This mistake, resulting from a poor understanding of the orthographic conventions of the Aquitanian inscriptions (in Latin), which employed -R- to represent the trill rhotic (Basque <rr>) instead of the flap one (Basque <r>), as well as laziness in reconstructing the protoform of the Basque word, has been copied down by Vascologists to this day3.

This root is also found in Sardinian irrussu 'little boar', whose first member is related to IE *(w)eper-o- 'boar', an interesting Neolithic word linked to Arabic ʕifr-, ʕufr- 'pig, boar; piglet', ultimately deriving from a Eurasiatic root 'ungulate' also found in IE *kapr-o- 'male goat' and Arabic ɣafr-, ɣufr- 'young of deer or goat, goat kid'.
1 Which should be reconstructed as *suq- on account of Celtic *sukk-o-.
2 This is the source of Asturian gocho, Basque (Zuberoan) kutxu, Catalan cotx, Aragonese cochín, Spanish cochino, French cochon 'pig'. 
3 Gorrochategui: Estudio sobre la onomástica indígena de Aquitania (1984), pp. 330-331. 

22 December 2009

Hair and wool

Basque bil(h)o 'hair, mane' is an interesting word. In compound with the verb utzi, itzi 'to leave' it gives biluz 'naked', biluzi 'to get naked' (with many variants: biloiz, bilaiz(i), bileiz(i), biluxi, buluzi, buluxi).

The etymology of this word has puzzled more than one Vascologist, especially given its similarity to Latin pilus '(a single) hair'. But a borrowing can be safely discarded for phonetic reasons, because Latin -l- would have given -r- in Basque.

An alternative source from Latin villus 'tuft, lock of hair; hair' (probably an Italoid loanword parallel to Baltic *wil-na 'wool') was then proposed, but I find this unrealistic. My own view is that Basque bil(h)o and Latin pilus are one and the same Cantabrian word, ultimately related to PNC *p’VħVɫV 'feather; mane' (an etymology proposed by Bengtson).

Other 'hair' words found in IE languages come from this Vasco-Caucasian root. In particular, Greek púligges [pl.] 'hairs of the body' and Sanskrit pulakās [pl.] 'bristling hairs of the body' are similar to Nakh *pēla-k’ 'feather'.

The Tyrrhenian counterpart of bil(h)o is Basque ule 'hair', ile 'hair, wool', with several doublets in compounds. The meaning 'wool' suggested to some authors a possible borrowing from Gothic wulla < Proto-Germanic *wullō(n), but this rather makes me think PIE *w̥lH2neHa- 'wool' could be in fact a very old Vasco-Caucasian loanword: *p’VħVɫV > *bVlħV > *wVlħ-, with merging of b/w in PIE (where the voiced labial stop is very rare).

21 November 2009

Two roots for 'earth'

According to Bengtson, Basque lur, lu- 'earth' is related to PNC *lhemdɮɮwɨ 'earth'1. This root is also found as a loanword in substratal IE *lendh- 'open land, waste' (Celtic, Germanic, Balto-Slavic) and Uralic *lamti (*lamtз) 'lowland, meadow'. I suppose this term was introduced by Neolithic farmers of the Linear Ceramics (LBK) culture. From Gaulish *landā 'open land' come the toponym Landes and Basque landa 'field, lot'2.

There's also a widespread Paleo-Eurasian root *dVQV 'dirt, clay' whose Basque reflexes are lohi, lot- 'dirty, mud' and zohi, zot- 'clod of earth; brick', the first with the standard treatment of the initial dental stop and the second with assibilation. 

This root was borrowed into PIE *dhigh- 'wall, fortification' (Sanskrit sa-dih- 'mound, heap, wall', Avestan para-daēza 'enclosure'3, Greek teîkhos ~ toîkhos 'wall'), which in NW languages refers to clay-like substances (e.g. English dough). But the native PIE reflex is *dheghóm 'earth'.
1 Dialectally also 'uncultivated land, desert'. 
2 There is also the old collective Biscayan form landar 'desert'.
3 Borrowed into Greek parédeisos, hence English paradise.

16 November 2009

Sea meadows

Although considered by Vascologists to be a borrowing from Latin sŏlum1, Basque sor(h)o 'cultivated field; meadow'2 is actually an Iberian loanword related to PNC *dʒʒǝlV 'plain, plateau'3 whose native counterpart is the archaic Lapurdian zar(h)o 'meadow'. This means PNC *ddʒ corresponds to an apico-alveolar sibilant /ś/ (Basque <s>) in Iberian but a lamino-alveolar /s/ (Basque <z>) in Vasconic.

This root is related to Eurasiatic *TSolV 'steppe, valley, meadow', reflected in Slavic *selo 'arable field' but which in other IE languages underwent a semantic shift to 'sea, lake'4: Greek hélos 'wet meadow, marsh', Old Indian sáras- 'lake, pond, pool'.
1 Borrowed into Basque zoru 'ground, floor'.
2 With the Biscayan variant solo (B) 'field (prepared for sowing)'.
3 Bengtson proposed a different etymology from PNC *tʃʃHæɫu 'earth, ground, sand', which semantically doesn't fit.
4 To people who grew up inland, large expanses of water look like meadows.

10 November 2009

Putting the pieces together

I've received several complaints from some of my fellow amateurs asking me for a complete explanation of my "system", including a table of detailed sound correspondences between Starostin's PNC and (Proto)-Basque, instead of focusing on individual etymologies.

I think it should be clear from earlier posts that Basque is far from being a homogeneous language and the study of its prehistory can't be separated from substrate loanwords in Western European languages.

I've already addressed the distinction between two main lexicon layers in Basque: standard Basque (that is, those words which comply with Late (Mitxelena's) Proto-Basque phonotactics, for example haga 'stake, pole'), and non-standard Basque (those which don't, for example tako 'block of wood, wedge'). I consider the latter to be borrowings from one or more linguistic varieties to which I give the collective name of Pyrenaic1.

The observed differences between Pyrenaic and Proto-Basque are basically due to the conditioned aspiration of stops p, t, k into h, a sound change which I call Martinet's Law.  Therefore, I reconstruct a prehistoric stage of Basque before it which I name Early (Martinet's) Proto-Basque to differentiate it from Late (Mitxelena's) Proto-Basque. In this way, Pyrenaic and Early Proto-Basque look very similar.

Other isoglosses (mainly regarding vocalism in the first syllable2) made me distinguish between two language groups, Cantabrian and Tyrrhenian. Some Tyrrhenian words seem to have reached Basque through Iberian, as part of its lexicon is from that origin. For example ui (B) 'pitch' < *uni, from an earlier **guni (PNC *kk’wVnV 'mastics, tar'), with assimilation to *buni and regular loss of the labial stop3. By contrast, Basque koipe, goipe 'oil' < *goni-pe (a compound from the same root) keeps the velar stop.

Very often, Cantabrian words have Tyrrhenian counterparts and vice versa. For example, Basque taket 'stake, wedge; dumb, stupid' corresponds to Spanish zoquete 'block of wood; dumb, stupid'.
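
As a rough illustration, the vocalism isogloss can be sketched in Python. This is only a toy classifier built on the rule of thumb stated here (a/e in the first syllable for Cantabrian, o/u for Tyrrhenian); the function names are hypothetical, and plain spelling is used as a stand-in for the actual vowels, which is a simplifying assumption.

```python
from typing import Optional

# Assumption: the first written vowel stands for the first-syllable vowel.
CANTABRIAN_VOWELS = {"a", "e"}
TYRRHENIAN_VOWELS = {"o", "u"}


def first_vowel(word: str) -> Optional[str]:
    """Return the first plain vowel letter of the word, if any."""
    for ch in word.lower():
        if ch in "aeiou":
            return ch
    return None


def layer_by_vocalism(word: str) -> Optional[str]:
    """Guess the layer from first-syllable vocalism; None if undecided."""
    vowel = first_vowel(word)
    if vowel in CANTABRIAN_VOWELS:
        return "Cantabrian"
    if vowel in TYRRHENIAN_VOWELS:
        return "Tyrrhenian"
    return None  # e.g. i, which the rule of thumb does not cover


for form in ["taket", "zoquete", "taco", "tocón"]:
    print(form, layer_by_vocalism(form))
```

The rule only holds "in many words", so a None result (or even a wrong guess) is to be expected for forms outside that pattern.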
1 This name was first suggested by my Italian colleague Marco Moretti, who guided me in the earlier stages of my research. Pyrenaic is also the source of many substrate loanwords in Romance languages (especially Spanish) as well as part of the Iberian lexicon.
2 In many words, Cantabrian has a/e while Tyrrhenian has o/u.
3 Although Bengtson identified this process, he wrongly attributed it to native Basque.

06 November 2009

The magpie and the heron

There's a class of onomatopoeic roots *kVr- which designate several types of birds.

For example, Aragonese garza, Catalan garça, Italian gazza 'magpie' < *karkea is an Italoid loanword from PIE *ḱarhk-eh2- 'magpie' (Lithuanian šárka, Russian soróka)1, akin to Turkic *KArga 'crow' and PNC *q’q’HVrVq’V 'a k. of bird (magpie; eagle-owl)'.

The homonymous Spanish garza 'heron'2 is a loanword from Celtic *kor(x)sā 'heron, crane', probably through a Cantabrian intermediate *karsa. A similar root can be found in Latin ardea 'heron' < PIE *h1orhd-eh2- 'heron', akin to Turkic *Kordaj 'pelican; swan'.

Latin corvus 'raven' is a loanword from Afrasian *ɣurVb- 'crow, raven', and cornīx 'crow' is akin to Uralic *kOrnV 'raven'.
1 Probably also Celtic *kerkā 'hen'.
2 In Spanish, 'magpie' is urraca.

01 November 2009

Spears and poles

Basque langa 'enclosure, rustic door; bar, catch (on a door or window)'1 is related to Occitan/Catalan tanca id.2, with a sound shift *t- > l- in Proto-Basque.

This is an Italoid substrate loanword whose etymology is IE *tengh-s- 'pole' > Latin tēmō 'steering-wheel; spear (of a cart)' and Old English þīsl 'wagon-pole, shaft'3. This IE root is a -n- infixed variant of *(s)teg- 'pole, post' (English stake), which in turn is related to PNC *dwɨq’(w)V: 'log, stump'.

This Vasco-Caucasian root is found as a Vasconic loanword in Spanish taco, Catalan tac 'block of wood, wedge; wooden stick', French taquet (diminutive form) 'block of wood, wedge', as well as in Basque haga 'pole, stake'4. The Tyrrhenian variant is found in Spanish tocón (augmentative form), Portuguese tôco 'stump' and also in Catalan soca, French souche 'stump', Old French choque 'log' and Spanish zoquete (diminutive form) 'block of wood; dumb, stupid'5, with assibilation of the initial dental.
1 Mitxelena derives it from Romance *planka 'plank, board'.
2 Coromines proposes a derivation from a hypothetical Sorotaptic verb *tankō- 'to close'.
3 See Mallory & Adams (2006), p. 249.
4 There's also non-standard Basque tako 'circular piece of wood' and taket 'stake, wedge; dumb, stupid'.
5 Coromines considered this word to be an arabism. What a zoquete!

Scabies and iron

English iron (and related forms in other Germanic languages) is a loanword from Celtic *īsarno- 'iron', whose origin is the IE adjective *H1ésH2r̥-no- 'bloody', derived from *H1ésH2r̥- '(flowing) blood'.

Although nobody thought of it until now, *H1ésH2r̥-no- is also the origin of Spanish sarna 'scabies, mange', a pre-Romance word glossed by Latin authors. There are also Basque sarra 'rust, iron tartar'1 and Spanish sarro 'tartar, plaque', with assimilation -rn- > -rr-.

The source of this and other loanwords is an IE substrate language called Italoid by the Indo-Europeanist Francisco Villar2 and Sorotaptic by the Catalan linguist Joan Coromines3. According to them, Italoid shares isoglosses with Baltic and Italic in the IE dialectal cloud.
1 Only found in the Biscayan dialect, which also has sarna.
2 See Indoeuropeos y no indoeuropeos en la Hispania prerromana (2000).
3 Who has a quite different etymological proposal for this word.

The rock

Romance *rokka 'rock' (Italian rocca, French roche, Occitan ròca, Spanish roca, Portuguese rocha) is a Tyrrhenian loanword related to PNC *riqq’wA 'mountain, rock; cave'.

Basque arroka, harroka has a prothetic vowel also found in Gascon arròca1. This is because Basque doesn't allow word-initial rhotics.
1 The Basque word is probably a borrowing from Gascon.

24 October 2009

Dentalization of velar stops

Another non-standard phonological feature of Basque is dentalization of velar stops1. This phenomenon is very often accompanied by expressive palatalization2:

kakur 'dog' > ttattur
-ko 'diminutive suffix' > -t(t)o
konkor, kunkur 'hunchback(ed); bump' > ttonttor, t(t)unt(t)ur.
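
The three examples above can be reproduced with a very small Python sketch. It assumes that every k in the form is dentalized and that the palatalized result is spelled <tt>, as in the listed variants; the function name dentalize and the palatalized flag are hypothetical and only serve this illustration.

```python
def dentalize(form: str, palatalized: bool = True) -> str:
    """Replace each velar stop k by a dental stop.

    With palatalized=True the result is written <tt> (expressive
    palatalization); with False it is a plain <t>.
    """
    target = "tt" if palatalized else "t"
    return form.replace("k", target)


for form in ["kakur", "-ko", "konkor", "kunkur"]:
    print(form, ">", dentalize(form), "/", dentalize(form, palatalized=False))
```

The sketch treats dentalization as exceptionless, which real data are not: it only shows that the listed variants follow from one substitution applied at every syllable onset.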

Dentalized variants of Latin and Romance borrowings like *fongo 'mushroom' > onddo, kipula 'onion' > tipula, (k)upa 'barrel' > dupa show the process was still going on in the High Middle Ages.

Unlike assibilation, dentalization only affects velars and can occur at any syllable onset, not only in word-initial position. Sometimes, we can also find non-dentalized forms in Iberian corresponding to dentalized ones in Basque:

idi 'ox'3 ~ Iberian biki4 (PNC *bHe:mttɬɨ (˜ -u,-i)  'deer, mountain goat')
borda 'hut' (PNC *borGwV: (˜ -ǝ-) 'stall; shed, tower')
gurdi, burdi 'cart' ~ Iberian oŕke (PNC *k’wVrk’V 'something round or rotating')
idor 'dry'5 ~ Iberian bekoŕ6 (PNC *=iGGwAr 'dry, to dry')
urde 'pig' ~ Iberian uŕke (PNC *wHa:rttɬ’wǝ 'boar, pig')

Dentalization also arises in compound variants7 where a velar stop gets to final position:
begi 'eye' > *bet-
behi 'cow' (EPBasq. *pekhi) > *bet-
ogi 'bread' > *ot-
zohi 'clod of earth' (EPBasq. *tokhi) > *zot-
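
The four compound variants just listed can be derived mechanically, as in this hedged Python sketch. It assumes that the final vowel is dropped and that the consonant left in final position, written g or h (the latter reflecting EPBasq. *kh as stated above), becomes t; the function name combining_form is hypothetical.

```python
# Assumption: <g> and <h> are the modern reflexes of the velar stops
# that end up in final position once the last vowel is dropped.
VELAR_REFLEXES = {"g", "h"}


def combining_form(word: str) -> str:
    """Build the compound variant: drop the final vowel, dentalize a final velar."""
    stem = word[:-1] if word and word[-1] in "aeiou" else word
    if stem and stem[-1] in VELAR_REFLEXES:
        stem = stem[:-1] + "t"
    return stem + "-"


for word in ["begi", "behi", "ogi", "zohi"]:
    print(word, ">", combining_form(word))
```

Words whose stem does not end in g or h are returned with only the final vowel removed, which is a choice made for this sketch and not a claim about Basque morphology.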
1 Commonly found in children's language.
2 As in the case of assibilation (see the post on assibilation of initial stops), this is why Vascologists haven't identified it.
3 Also non-standard pit(t)ika 'goat kid' and similar forms.
4 Compare also Spanish pequeño/a (without palatalization!) and Italian piccino/a 'little' with Catalan petit/a and French petit/e 'id.'
5 There's also ador (L), a form contaminated by agor 'dry'.
6 There's also Basque pegor 'sterile, poor'.
7 Called combinatory forms by Vascologists.

17 October 2009

Assibilation of initial stops

The lamino-alveolar sibilant [s] <z> isn't a native consonant in Basque in word-initial position; its possible sources are the following:

1) The retention of an etymological sibilant in non-standard varieties vs. its loss in standard Basque, as in zapo 'toad' (a loanword from Semitic *ɬ’abb- 'a kind of lizard') vs. standard apo.

2) The result of Iberian *t- in loanwords. We've already seen that in a previous post.

3) The assibilation of an initial stop, either velar (most frequent) or labial (very rare), which often underwent expressive palatalization1 into [ʃ] <x> or [tʃ] <tx>. For example, non-standard kakur 'dog' has given the assibilated forms zakur, txakur, xakur, and Romance *capel 'hat' has given txapel, xapel, zapel.

An example of assibilation of a labial stop is zegi 'milky cow' vs. behi 'cow' < EPBsq. *pek:i.

The coalescence of assibilation and expressive palatalization in the same word has made Vascologists2 incapable of identifying the former. In Basque only coronal consonants [s, ś, ts, tś, d, t, n, l] (and in some dialects also [r, ŕ]) can undergo palatalization, so in order for a velar to be palatalized, it must first be converted to a sibilant.
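
The two-step ordering argued for here (assibilation first, palatalization second) can be sketched in Python as follows. Only the frequent velar case is modelled; the function names are hypothetical, and Romance *capel is respelled kapel purely so that the initial velar is written k.

```python
from typing import List


def assibilate_initial(form: str) -> str:
    """Step 1: turn an initial velar stop k- into the sibilant z-."""
    return "z" + form[1:] if form.startswith("k") else form


def palatalize_initial(form: str) -> List[str]:
    """Step 2: expressive palatalization of an initial z- into x- or tx-."""
    if form.startswith("z"):
        return ["x" + form[1:], "tx" + form[1:]]
    return []


def variants(form: str) -> List[str]:
    """All variants obtained by assibilation followed by palatalization."""
    sibilant = assibilate_initial(form)
    if sibilant == form:
        return []  # no initial velar stop, so nothing to assibilate
    return [sibilant] + palatalize_initial(sibilant)


print(variants("kakur"))  # ['zakur', 'xakur', 'txakur']
print(variants("kapel"))  # ['zapel', 'xapel', 'txapel']
```

Because palatalize_initial only accepts an initial sibilant, the palatalized forms cannot be produced directly from k-, which mirrors the restriction of palatalization to coronal consonants.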

The chronology of assibilation is ancient, as it's already found in Iberian. For example, zaldi 'horse' (Iberian saldu) is related to Germanic *kulta- 'colt'3, and zamar 'sheepskin jacket'4 is related to Germanic *xamísa- 'clothes, skirt'5 (PNC ʕa:mV ‘skin, cloth’).
1 A common linguistic device in Basque used to denote diminutive or affective meanings.
2 Bengtson among them.
3 Nikolayev reconstructs IE *g(w)Ald- 'foal, young of an ass', related to NEC *gwalV 'horse'.
4 A non-standard word borrowed into Romance (Spanish zamarra, Catalan samarra).
5 Hence Spanish camisa 'shirt'.

The fate of Proto-Basque *t-

Proto-Basque's consonant system had a tense/lax contrast rather than a voiceless/voiced one, so dental stops were *t: (tense) and *t (lax).

Initial *t- was apparently absent from Late PBsq., as d- is exceedingly rare in modern Basque outside of recent borrowings and verbal forms. Although Trask noticed this anomaly in Mitxelena's system, he didn't offer any explanation1.

The key is found in borrowings like lanjer (< French danger) or lizifrina (< Romance disciplina)2, where the original d- evolved into l-. This suggests EPBasq. *t- became Basque l-, as in these examples:

Basque langa 'enclosure, rustic door; bar, catch; crossbeam' ~ Catalan tanca < EPBsq. *tanka  (IE *tengh-s- 'pole')

Basque lan(h)o, laino 'cloud, fog' ~ non-std Basque t(t)anka 'drop' < EPBsq. *tank:A (PNC *t’Hænk’o 'drop, spray')

It looks like Iberian t was actually realized as a dental affricate [ts]3 which gave a lamino-alveolar sibilant [s] (Basque <z>)4 in loanwords from that language in Basque:

Basque zohi 'clod of earth; brick' < Iberian *tok:i but lohi 'mud', idoi 'pool, puddle; bog, marsh' < EPBsq. *(i-)tok:i (Starostin's PSC *[t]VQV́ 'dirty, clay')

EPBsq. *t:- gave Basque h or zero, as in these examples:

Basque haga 'stake, pole' < EPBasq. *t:akA ~ non-standard tako 'block of wood, wedge' (PNC *dwɨq’(w)V: 'log, stump')

Basque -ar 'male', Aquitanian HAR- ~ non-std Basque *-tar 'man' < EPBasq. *t:ar: (PNC *dlʒiwlV 'man, male')
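
As a hedged summary of the correspondences proposed in this post, the following Python sketch predicts the initial of the Basque reflex from the Early Proto-Basque initial (lax *t- > l-, tense *t:- > h- or zero) and checks it against the examples above. The function names are hypothetical, the notation t: for the tense stop follows the post, and only the initial segment is modelled.

```python
from typing import Set

VOWELS = "aeiou"


def predicted_initials(epb_form: str) -> Set[str]:
    """Possible Basque initials for an EPBasq. form ('' stands for zero)."""
    form = epb_form.lstrip("*")
    if form.startswith("t:"):  # tense dental stop
        return {"h", ""}
    if form.startswith("t"):   # lax dental stop
        return {"l"}
    return set()               # other initials are not covered by this sketch


def matches(epb_form: str, basque: str) -> bool:
    """True if the Basque word begins with one of the predicted initials."""
    word = basque.lstrip("-")
    for initial in predicted_initials(epb_form):
        if initial == "":
            # zero reflex: the word simply begins with a vowel
            if word and word[0] in VOWELS:
                return True
        elif word.startswith(initial):
            return True
    return False


pairs = [("*tanka", "langa"), ("*tank:A", "laino"),
         ("*t:akA", "haga"), ("*t:ar:", "-ar")]
for epb, basque in pairs:
    print(epb, ">", basque, matches(epb, basque))
```

Medial changes (such as the k/g difference between *tanka and langa) are deliberately left out, so the check says nothing about the rest of the word.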
1 The History of Basque (1997), p.136.
2 It's worth noticing these words are geographically restricted to a few dialects.
3 See Martinet's Économie des changements phonétiques (1955).
4 This sound shift is often quoted by Ibero-Basquists (defenders of a close relationship between Basque and Iberian) as being a native Basque one.

15 October 2009

Proto-Basque and non-standard Basque

Proto-Basque is the reconstructed ancestor of the historical Basque dialects, presumably spoken in the Late Iron Age and whose main features were worked out by the Vascologist Koldo Mitxelena in his magnum opus Fonética Histórica Vasca1. The consonantal system of Mitxelena's Proto-Basque didn't have a voiced/voiceless contrast but a fortis (tense)/lenis (lax) one. This was only relevant in medial position (between vowels), because only lenis phonemes could appear word-initially.

With regard to stops, the French linguist André Martinet (Économie des changements phonétiques, 1955) posited an older system modelled after Danish, in which plosives could be realized either as voiceless aspirated (fortis) or mild voiceless (lenis) at word-initial and as mild voiceless (fortis) and approximants (lenis) between vowels. In Mitxelena's Proto-Basque, the aspirated plosives would have evolved to [h], which had no phonemic value.

I call this process Martinet's Law, and for comparative purposes we should differentiate between Early Proto-Basque (before Martinet's Law) and Late Proto-Basque (after Martinet's Law), sometimes called "canonical" or "Mitxelena's" as opposed to Bengtson's.

In the Aquitanian inscriptions we can find forms with initial h- like HALSCO or HAVTENN2 vs. others with t- like TALSCO (Iberian talsku) or TAVTINN (Iberian tautin) and different geographical distribution. This isogloss is a particular case of Martinet's Law, which separates Proto-Basque from other linguistic varieties like Iberian.

There are also Basque words like k(h)ako 'hook'3 < IE *ko(n)g- 'peg, hook, claw' < Paleo-Eurasian *ɣoŋɣV 'peg, nail' or tara 'young branch' < IE *dhal- 'sprout' which don't follow Martinet's Law and constitute what I call non-standard Basque4. They are late loanwords from those linguistic varieties, whose speakers were stigmatized people like highlanders or nomadic shepherds, a fact which led to their ultimate extinction in the High Middle Ages (there's evidence in NW Catalonia's toponymy of a Bascoid language spoken there until approx. 1000 AD).

Contrary to Martinet's original proposal, the law can also apply between vowels. For example, Basque zahar 'old' < EPBasq. *sakh5 vs. Iberian sakaŕ (Starostin's PNC *tʃ’HǝqwV 'big').
1 For a summary of Mitxelena's work, see Trask's The history of Basque (1997).
2 Compare Basque hauta 'election, excellent' < Celtic *toutā- 'people'.
3 In the Salazarian dialect there's an isolated form kaiku 'aquiline' found as part of the compound sudur-kaiku 'aquiline nose'.
4 It's a pity Bengtson didn't recognize this as a genuine phenomenon.
5 Basque /z/ represents a lamino-alveolar sibilant [s].

The Vasco-Caucasian hypothesis

Although the idea of a genetic relationship between Basque and Caucasian languages was envisaged by 20th century linguists like Alfredo Trombetti, René Lafon and Karl Bouda, it wasn't properly formulated until circa 1970, when the Polish geographer Bogdan Zaborski grouped Basque, Caucasian languages and Burushaski into an Asianitic family.

In the 90's, the American linguist John Bengtson proposed a Macro-Caucasian (also called Vasco-Caucasian) phylum comprising Basque, North Caucasian1 and Burushaski (see his seminal paper), and being part of a larger Dene-Caucasian (also called Sino-Caucasian) phylum comprising Sino-Tibetan, Yenisseian and Na-Dené, first posited by the Russian linguists Sergei Starostin and Sergei Nikolayev2. Unfortunately, Bengtson's work is full of methodological and factual errors. This is why I have created an independent line of research, whose guidelines I'm going to explain in this blog.

In my view, the Vasco-Caucasian family spread through Europe in the Neolithic à la Renfrew, leaving substrate loanwords in the IE languages which superseded it in the Bronze Age. And although this might be correct as regards the whole picture, it needs some corrections at a smaller scale. For example, Etruscan (possibly a Vasco-Caucasian language) was brought to Italy by seafaring invaders from the Aegean (one of the Sea Peoples) who gave rise to the Iron Age Villanovan culture.
1 A hypothetical language family comprising NWC (Abkhaz-Adyghe) + Hatti and NEC (Nakh-Daghestanian) + Hurro-Urartian.
2 Who in 1994 published their North Caucasian Etymological Dictionary. In addition to the etymological on-line dictionaries of Dene-Caucasian branches (except Na-Dené) and Bengtson's "Proto-Basque" at Starostin's site, there's a sound correspondence table at Wikipedia.