Journal article, Mythologie française, Year: 2015

Reconstructing Early Human Migrations Using Folktales as Data: Research Report on a New, Quick and Efficient Method.

Une nouvelle méthode, rapide et efficace, pour reconstruire les premières migrations de l'humanité.

Julien d'Huy
  • Role: Author
  • PersonId : 926140

Abstract

Can we reconstruct ancient human migrations by using a statistical signal from the diffusion of certain tale-types? It is likely that the first human migrations to new regions of the world left a deep cultural footprint on the areas through which these ancient humans migrated. I have recently proposed that, even now, the signal of this past migration can be found by studying folktales (d’Huy 2015a, following a proposition in d’Huy 2015b); that proposal is reviewed here in English for readers who may not find the original French publication accessible. Indeed, part of oral folklore appears to be relatively stable across time at a large geographical scale, in some cases since the Palaeolithic Period (e.g. Berezkin 2013), and this can be at least partly correlated with a continuity in the genetic material of the people from whom the stories were recorded (e.g. d’Huy & Dupanloup 2014). In this work, I used Hans-Jörg Uther’s tale-type index (2004) as a database to construct a matrix in which the rows represent the cultural areas to be studied and the columns represent tale-types (i.e. recurring plot patterns in the narrative structures of traditional folktales). The advantage of Uther’s index is that it further develops the Aarne–Thompson classification system (resulting in the Aarne–Thompson–Uther classification system) and includes a greater number of international folktales in its expanded listing, even if the index is not without problems of Eurocentrism, geographical bias and inconsistent classification criteria. I focus on tale-types connected with wild animals, on the hypothesis that these tale-types allow us to reconstruct a deeper past than motifs linked to domestic animals.
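The area-by-tale-type matrix described above can be illustrated with a minimal sketch. The area names, the attested types and the helper names (`build_matrix`, `filter_areas`) are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
# Hypothetical toy records: which tale-types are attested in which area.
# These names and values are illustrative only, not the study's data.
attested = {
    "Area A": {"ATU 1", "ATU 8", "ATU 9"},
    "Area B": {"ATU 1", "ATU 9", "ATU 38"},
    "Area C": {"ATU 60"},
}


def build_matrix(attested):
    """Return (areas, tale_types, matrix); matrix[i][j] is 1 if type j is attested in area i."""
    areas = sorted(attested)
    tale_types = sorted(set().union(*attested.values()))
    matrix = [[1 if t in attested[a] else 0 for t in tale_types] for a in areas]
    return areas, tale_types, matrix


def filter_areas(attested, minimum=20):
    """Keep only areas with at least `minimum` attested tale-types (assumed threshold rule)."""
    return {a: types for a, types in attested.items() if len(types) >= minimum}


areas, tale_types, matrix = build_matrix(attested)
for area, row in zip(areas, matrix):
    print(area, row)
```

Rows are areas and columns are tale-types, so each row is a presence/absence profile that distance-based or model-based tools can compare.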
I employ two sections of Uther’s index: “Wild Animals” (in its entirety; tale-types ATU 1–99) and “Other Animals and Objects” (tale-types ATU 220–299, from which motifs involving interactions between wild and domestic animals and between wild animals and humans were removed; for a full listing, see d’Huy 2015a). From this, I developed two databases. The first database excludes those areas with fewer than 20 of the relevant tale-types identified, with the exception of the Basque language area, which has 19 relevant types. This yielded a database of 109 tale-types. I maintained the same cultural areas in the second database, which is built from the tales identified in the section “Other Animals and Objects” and serves to check the initial results. The second database contains 73 tale-types; certain regions do not reach the threshold of 20 tale-types, so these results should be read with caution. The principle of the test was to see whether two independent, unrelated databases (the tale-types used for each dataset are not the same), which may not be very strong on their own, might converge on stronger conclusions through mutually supportive results. Using SplitsTree4 (Huson & Bryant 2006) with the first database, I created a bio-neighbor-joining tree (uncorrected ‘p’ distance), a classic clustering method for building a phylogenetic tree. The delta score of this tree is 0.39. A delta score ranges from 0 to 1: a score of 0 indicates that the data perfectly fit a tree-like phylogenetic signal; a score of 0.39 shows a weak (yet present) phylogenetic signal, accompanied by strong borrowing between individual areas. The bio-neighbor-joining tree shows a progression from the African area to the Amerindian area, via both the Near East and the Far East, and a cluster associating the Basque and Northern European countries. From the Far East, one branch goes back to the other European areas.
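To make the two quantities in this paragraph concrete, here is a minimal sketch of the uncorrected p-distance between two presence/absence rows and of a quartet-based delta score. It assumes the common quartet definition (largest minus middle pair-sum, divided by largest minus smallest) averaged over all quartets; SplitsTree4's exact averaging may differ, and the function names are illustrative.

```python
from itertools import combinations


def p_distance(row_a, row_b):
    """Uncorrected p-distance: share of columns in which two rows differ."""
    differing = sum(1 for a, b in zip(row_a, row_b) if a != b)
    return differing / len(row_a)


def distance_matrix(matrix):
    """Pairwise p-distances between all rows of a presence/absence matrix."""
    n = len(matrix)
    return [[p_distance(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]


def quartet_delta(d, x, y, u, v):
    """Delta for one quartet: 0 when the four distances fit a tree, 1 when maximally box-like."""
    smallest, middle, largest = sorted(
        [d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]]
    )
    if largest == smallest:
        return 0.0  # all three pairings tie; treated as carrying no conflict
    return (largest - middle) / (largest - smallest)


def mean_delta(d):
    """Average quartet delta over every set of four taxa."""
    quartets = list(combinations(range(len(d)), 4))
    if not quartets:
        raise ValueError("at least four taxa are needed")
    return sum(quartet_delta(d, *q) for q in quartets) / len(quartets)


# Distances read off a four-leaf tree with unit branch lengths: perfectly tree-like.
tree_like = [[0, 2, 3, 3], [2, 0, 3, 3], [3, 3, 0, 2], [3, 3, 2, 0]]
print(mean_delta(tree_like))  # 0.0
```

A value near 0 means the distances behave like path lengths on a tree; values approaching 1 indicate conflicting signal such as borrowing.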
The geographical coherence of certain clusters with a complex settlement history, such as the African area (all the African areas plus the Indian area) or the American area, should be noted. In this light, it becomes unlikely that the stories were diffused to indigenous peoples only by Europeans during the period when Europeans were beginning to appropriate the lands of the American and African areas. Had the stories been diffused by Europeans, one would expect a random distribution. In the same way that newcomers to a region do not generally succeed in completely imposing their genes, thus allowing for the continued flow of genetic material from the earlier inhabitants of a region to later populations, immigrants may often largely assimilate the tales of indigenous peoples, not necessarily supplanting them with their own. Previously known tales could thus potentially persist by vertical diffusion through one wave of migration after the next. Under such circumstances, only a sufficiently substantial immigration, or a radical ontological change, could provoke the emergence of a completely new constellation of folklore material. If this working hypothesis is correct, the joint study of the presence and absence of certain tale-types should enable the reconstruction of the diverse human migrations that have gradually covered our planet. This leads to the question of how to demonstrate the stability of such primitive clusters of motifs. For the Eurasian area, I used the software SAM (Rangel et al. 2010) to calculate the correlation between the Jaccard distances separating the sets of tale-types attested in each pair of areas and the geographical distances separating those areas; this correlation proved very low. Geographical distance explains only 6.1% of the variance for the first database and 2.2% for the second database.
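The comparison of folklore distance with geographical distance can be sketched as follows. This is not the SAM procedure: it is a plain squared Pearson correlation over all pairs of areas, with great-circle distances computed by the haversine formula. The toy areas, coordinates and function names are assumptions made for illustration, and because pairwise distances are not independent, a permutation approach such as a Mantel test would be needed to assess significance.

```python
import math
from itertools import combinations


def jaccard_distance(set_a, set_b):
    """1 minus (shared tale-types / tale-types attested in either area)."""
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def variance_explained(attested, coords):
    """Squared correlation between geographic and Jaccard distances over all area pairs."""
    pairs = list(combinations(sorted(attested), 2))
    folk = [jaccard_distance(attested[a], attested[b]) for a, b in pairs]
    geo = [haversine_km(coords[a], coords[b]) for a, b in pairs]
    return pearson_r(geo, folk) ** 2


# Hypothetical toy data (not from the study).
toy_attested = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {7, 8}}
toy_coords = {"A": (0.0, 0.0), "B": (0.0, 10.0), "C": (40.0, 60.0)}
r_squared = variance_explained(toy_attested, toy_coords)
print(f"share of variance explained by distance: {r_squared:.3f}")
```

A small value of this share, as reported above, would mean that knowing how far apart two areas lie says little about how different their tale inventories are.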
Significantly, this low result for groups of identified tale-types suggests that only a small portion of the folktales identified with each cultural area studied have been borrowed from that population’s closest neighbors. It would show that vertical continuity within a given area is a better predictor of the content of oral folklore than a series of exchanges through local and regional contact networks. The delta score of the trans-continental bio-neighbor-joining tree may also be due in part to the late, post-colonial diffusion of certain folktales from Europe, which would weaken the phylogenetic signal of other areas, yet without hiding the primitive migration (i.e. the African or the American cluster). In order to check the clusters found with the bio-neighbor-joining tree, I used the software Structure 2.3.4, a Bayesian model-based algorithm that is widely used for clustering genetic data (Pritchard et al. 2000; Falush et al. 2003). Given the number of clusters (K), Structure is designed to estimate allele frequencies in each cluster and population memberships for every individual. With the online software StructureHarvester (Evanno et al. 2005), it is also possible to infer a population structure and estimate the most likely number (“K”) of founding populations. Using the first database, the number K is four, i.e. the areas studied as a whole appear to divide into four groups. Each of the four clusters has a significant fixation index (Fst). With genetic data, the Fst measures population differentiation according to genetic structure. An Fst of zero indicates no divergence between populations, whereas an Fst of one indicates complete isolation of populations. This statistical tool can be applied to test the stability of other elements of culture such as tale-types or motifs within a cultural area.
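As a rough illustration of what a fixation index measures when traits are tale-types rather than alleles, the sketch below uses the textbook heterozygosity form, Fst = (H_T - H_S) / H_T, with equal weight per cluster. This is an assumption made for illustration: Structure estimates its Fst values inside its own Bayesian model, so numbers from this simplified formula are not comparable with those reported here. The function name and input layout are hypothetical.

```python
def fst_binary(freqs_by_cluster):
    """Textbook Fst for presence/absence traits.

    freqs_by_cluster: one list per cluster, giving for each trait the share of
    that cluster's areas in which the trait is attested (values in 0..1).
    """
    n_clusters = len(freqs_by_cluster)
    n_traits = len(freqs_by_cluster[0])
    h_total = 0.0   # expected diversity if all clusters were pooled
    h_within = 0.0  # mean expected diversity inside clusters
    for j in range(n_traits):
        ps = [cluster[j] for cluster in freqs_by_cluster]
        p_bar = sum(ps) / n_clusters
        h_total += 2 * p_bar * (1 - p_bar)
        h_within += sum(2 * p * (1 - p) for p in ps) / n_clusters
    if h_total == 0:
        return 0.0  # no variation at all, hence no measurable differentiation
    return (h_total - h_within) / h_total


print(fst_binary([[0.5, 0.5], [0.5, 0.5]]))  # 0.0: identical clusters
print(fst_binary([[1.0, 0.0], [0.0, 1.0]]))  # 1.0: no trait shared between clusters
```

The two printed cases correspond to the endpoints described above: no divergence and complete separation.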
The first cluster I found associates the African area (East African, South African, Algerian, Moroccan, Egyptian, Namibian, Sudanese, Afro-American) and some of the Asian area (Chinese, Indian, Iranian, Kurdish, Tadzhik), with an Fst of 0.376. This result could potentially be a trace of the moment when Homo sapiens left Africa, or it could provide insight into continuous exchanges of population between Asia and Africa. The fact that all the African versions are brought together could imply a very ancient substratum. The second cluster shows significant grouping between the North European area (Byelorussian, Karelian, Finnish-Swedish, Sámi/Lappish, Norwegian, Polish, Swedish), the American area (Argentine, South American Indian, Mayan, Mexican), and the Japanese, Turkish, Basque and Flemish areas. It shows an Fst of 0.615. This cluster must be older than 15,000 years, when America was populated as a result of migrations across the Bering Strait. The grouping of Basque and North European folklore might be explained by the fact that the Franco-Cantabrian refuge area was the source of expansions of hunter-gatherers who recolonized much of northern and central Europe after the last glacial maximum (Torroni et al. 2001; Olalde et al. 2014). Note that a link between certain Basque and Sámi/Lappish-Scandinavian oral traditions has been observed previously (e.g. for the Cosmic Hunt and the Bear taboo, see d’Huy 2013a–b). The presence of the Flemish area remains difficult to explain. The third group clusters Southeast European areas (Bulgarian, Georgian, Greek), North European areas (Estonian, Latvian, Lithuanian, Finnish) and East European areas (Ukrainian, Russian), with an Fst score of 0.368. This group is geographically located between the first and the second cluster, yet surrounds the fourth cluster, which may indicate that the third cluster is younger than the first and the second clusters, and older than the fourth cluster.
The fourth cluster includes most of the West European area (German, Catalan, French, Frisian, Irish, Italian, Dutch, Portuguese) and one Central European country (Hungarian). This cluster, with an Fst of 0.36, may be linked to a tradition and circulation of written texts. Across these groups of cultural areas, the Fst score ranges from 36.04% to 61.64%, which is much higher than the genetic variation for these groups and suggests that the traditions in these areas, once established, have been very conservative. These results parallel those of Ross et al. (2013) and are considerably higher than the average Fst values calculated by Bell et al. (2009) for cultural elements (0.08) and for genetics (0.0053). These data also leave the impression that it is easier for two neighboring populations belonging to two clusters to exchange genetic material than it is for them to modify their folklore. On the basis of the combination of the bio-neighbor-joining and clustering approaches applied here, it becomes possible to propose a simple evolution for the folktales studied: the first humans and their folklore may have spread out from East Africa into the Near East and on to the Far East. After a deep reformulation or restructuring of the folklore (the Fst of the second group is very high), a second expansion took place from this region, carrying the folklore to Palaeolithic Europe and America. In Europe, this diffusion was followed by a Neolithic (?) substratum and a more recent and strictly western European superstratum. However, population movements and cultural exchange across Europe have been so extensive and stratified that it is methodologically very problematic to interpret the history behind the degrees of similarity between these European cultures’ folklore revealed by phylogenetic tools (d’Huy 2014/2015: 58–59). The most compelling clusters for comparison are those spanning different continents, where such stratified contacts cannot account for the patterns in the data.
In order to test this hypothesis of the history of diffusion, I used the Structure results to exclude from the corpus every area in which more than 30% of the data reflect an acculturation signal (i.e. less than 70% derives from a single origin). Because these areas have a complex history, they are expected to be intermediate between many others and to “blur” the phylogenetic signal. A tree-building algorithm cannot place such an area in an intermediate position between the many others around it, and this can bias its position. Accordingly, I found an inverse correlation between the delta score of individual areas and the weight of their main source (Pearson: -0.46238, p = 0.00093719; Spearman: -0.44254, p = 0.001635): the higher the delta score, the smaller the share of the main source and, consequently, the lower the coherence of the whole area. Once I had purged the areas with strong admixture effects, I created a new bio-neighbor-joining tree. The arrangement of this new tree agrees with the previous one. In order to check the results obtained from the larger corpus, I analysed the second, smaller corpus with Structure; the best number K of clusters found is two. The first cluster (Fst = 0.7198) is equivalent to a combination of the first and second clusters identified in the first corpus, with the exception of China and India. The second cluster (Fst = 0.0075) groups together the equivalent of the main part of the third and fourth groups of the first corpus, with the exception of Georgian, Frisian, Irish, Italian and Portuguese, which are connected with the first group. These results are remarkable because the number of tale-types used to make these calculations is in some cases much lower than for the same areas in the first corpus (the number of tale-types used for each area in the second corpus ranges from 3 to 42). To a certain extent, the second corpus corroborates the initial conclusions: a primitive Palaeolithic diffusion, and another, more recent (Neolithic?) diffusion in the European area.
Finally, I applied Mesquite 2.75 (Maddison & Maddison 2006; 2011) to the two versions of the first corpus (altered and unaltered), in order to establish the 100 most parsimonious trees and the two strict consensus trees that subsume all of them. These trees proved very similar to the clusters found above. Moreover, if both trees are rooted between the African and American areas, the relationship may offer indications concerning part of the proto-folklore that was carried in the initial migration of humanity from Africa. Of course, not all of the reconstructed proto-folklore (ATU 1, 8, 9, 38, 47A, 60) is necessarily fully accurate, as certain European tales were disseminated entirely during the post-colonisation period, yet much of it may be correct, as will be borne out through additional research from different perspectives in the future. Note that such reconstructions might also be done at any node of these trees (e.g. to reconstruct a Proto-Indo-European or a Pre-Amerindian folklore). This study is only a step, a pilot test on two sample corpora to assess what might be done on a larger scale. It is necessary to collaborate with other researchers in order to create ever larger mythological databases so that we may fine-tune our analysis models and develop new methodologies.
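Reconstructing whether a tale-type was present at a node of a tree can be illustrated with Fitch's parsimony pass on a single presence/absence character. This is a generic textbook sketch, not Mesquite's implementation; the tree shape, area names and states below are hypothetical.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a rooted binary tree.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    states: dict mapping each leaf name to 0 (absent) or 1 (present).
    Returns (candidate states at the root of `tree`, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-area tree and one tale-type attested in three of the areas.
toy_tree = (("Area A", "Area B"), ("Area C", "Area D"))
toy_states = {"Area A": 1, "Area B": 1, "Area C": 1, "Area D": 0}
root_states, changes = fitch(toy_tree, toy_states)
print(root_states, changes)  # {1} 1
```

In this toy case the most parsimonious reading is that the tale-type was present at the root and lost once, in Area D. Applying the same pass to a subtree gives the candidate states at that internal node.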
Drawing on two corpora of tales (one serving as a control corpus) and using statistical tools largely borrowed from phylogenetics, I show (1) that the geographical distance separating two folklores only weakly explains their variability, which suggests an essentially vertical transmission of tales; (2) that there are folkloric "super-areas", which probably reflect the history of human migrations; (3) that it is possible, on the basis of corpora of tale-types, to reconstruct ancient human migrations; and (4) that it is possible to reconstruct the folklore of humans at the time they left Africa and at different stages of their expansion across the world.
Main file
2015.3. Une nouvelle méthode, rapide et efficace, pour reconstruire les premières migrations de l'humanité. - Mythologie francaise, 259, juin, 66-82.pdf (15.59 MB)
Origin: Explicit agreement for this deposit

Dates and versions

halshs-01295102, version 1 (04-04-2016)

Licence

Attribution - CC BY 4.0

Identifiers

  • HAL Id: halshs-01295102, version 1

Cite

Julien d'Huy. Une nouvelle méthode, rapide et efficace, pour reconstruire les premières migrations de l'humanité. Mythologie française, 2015, 259, pp. 66-82. ⟨halshs-01295102⟩
