diff --git a/concepticondata/conceptlists.tsv b/concepticondata/conceptlists.tsv index 46da661d..5dcb224b 100644 --- a/concepticondata/conceptlists.tsv +++ b/concepticondata/conceptlists.tsv @@ -405,9 +405,9 @@ Zhong-2022-664 Zhong, Yin and Wan, Mingyu and Ahrens, Kathleen and Huang, Chu-Re Araujo-1996-289 Araújo, Gabriel Antunes 1996 289 basic, documentation German, Portuguese Maxakalí http://etnolinguistica.wdfiles.com/local--files/artigo%3Aaraujo-1996/araujo_1996_masakari.pdf Araujo1996 This list, compiled and translated to Portuguese in Araújo (1996), was originally documented during fieldwork by Curt Nimuendajú in 1939, with notes published in 1958. It features basic vocabulary from Maxacalí (Nuclear-Macro-Je), a language of Brazil. It also includes some examples from the possessive paradigm. Araújo's compilation, transcriptions, and additional comments are openly available online. Tjuka-2022-220 Tjuka, Annika 2022 220 body parts, specific, colors English Global https://doi.org/10.5281/zenodo.6226423 Tjuka2022a The list includes 220 concepts and each concept was categorized into color, emotion or human body part. It is the second version of [Tjuka-2021-192](:ref:Tjuka-2021-192) with 28 additional emotion concepts. The list contains 22 color concepts, 62 emotion concepts, and 136 human body part concepts. Wang-2021-90 Wang, Xiaosha and Bi, Yanchao 2021 90 norms, ratings English, Chinese Global https://doi.org/10.1177/09567976211003877 Wang2021 This list was used as part of a psycholinguistic and neuroimaging study on inter-subjective variability of word meaning representation at the neural level. It features 90 nominal stimuli in Chinese of varying degrees of relative abstractness (independent variable), ranging from referents of high concreteness (e.g. physical objects) to higher abstractness, as in the case of logical concepts (e.g. relationship, effect), including social and emotional concepts. 
-Tjuka-2022-784 Tjuka, Annika 2022 784 body parts, specific English Global https://doi.org/10.5281/zenodo.6365495 Tjuka2022b The list includes 134 body and 650 object concepts. The object concepts were further classified into ten semantic fields: animal, clothing, food, household items, instrument, landscape, plant, spatial relation, tool, and vehicle. All concepts were also categorized into two additional groups indicating whether they are visual or non-visual concepts. +Tjuka-2022-784 Tjuka, Annika 2022 784 body parts, specific English Global https://doi.org/10.5281/zenodo.6365495 Tjuka2022b The list includes 134 body and 650 object concepts. The object concepts were further classified into ten semantic fields: animal, clothing, food, household items, instrument, landscape, plant, spatial relation, tool, and vehicle. All concepts were also categorized into two additional groups indicating whether they are visual or non-visual concepts. KochGrunberg-1914-821 Koch-Grünberg, Theodor 1914 821 annotated, areal, basic German Tukanoan languages http://www.jstor.org/stable/40443137 Koch1914 List used by Koch-Grünberg to compare languages of the Tukanoan family (Desano, Yahuna, Yupua and Koretu). This list contains, among other things, words for human body parts, animals, plants, colors and objects. A considerable part of the list contains concepts that are culturally specific for the region where those languages were spoken. Theodore Koch-Grünberg largerly used the same concepts of this list to collect data from neighboring Tukanoan, Arawakan and Makuan languages in his 1902-1903 travels. Portuguese and English translations for the original German list were added. We have added all those entries from the original data where we can find a gloss with a concrete translation by the author, cases where only grammatical markers or other aspects not pertaining to the lexicon were discussed are thus not rendered here. 
812-832 -BarbosaEscobar-2021-12 Barbosa Escobar, Francisco and Velasco, Carlos and Motoki, Kosuke and Byrne, Derek Victor and Wang, Qian Janice 2021 12 specific English English, Chinese, Japanese, Spanish https://doi.org/10.1371/journal.pone.0252408 BarbosaEscobar2021 The list includes 12 emotion adjectives arranged in form of a wheel. The adjectives were based on the valence x arousal circumplex-inspired emotion questionnaire (CEQ, [Jaeger et al.](:bib:Jaeger2021)) and were translated from English into Mandarin Chinese, Japanese, and Spanish. +BarbosaEscobar-2021-12 Barbosa Escobar, Francisco and Velasco, Carlos and Motoki, Kosuke and Byrne, Derek Victor and Wang, Qian Janice 2021 12 specific English English, Chinese, Japanese, Spanish https://doi.org/10.1371/journal.pone.0252408 BarbosaEscobar2021 The list includes 12 emotion adjectives arranged in form of a wheel. The adjectives were based on the valence x arousal circumplex-inspired emotion questionnaire (CEQ, [Jaeger et al.](:bib:Jaeger2021)) and were translated from English into Mandarin Chinese, Japanese, and Spanish. Epps-2021-843 Van Epps, Briana, and Carling, Gerd and Sapir, Yair 2021 843 documentation, historical, norms English Old Norse, Norwegian, Old Swedish, Swedish, Jamtlandic, Elfdalian https://github.com/gerdcarling/gendernorthscandinavian Epps2021 This list is derived from of a study focusing on lexical gender assignment in six North Scandinavian varieties. It provides statistical information about semantic change in about 1300 cognate sets and their respective patterns of change from their ancestral forms in Old Norse. Since the data is arragned by cognate sets, identical elicitation glosses for concepts are distinguished by adding numerals to the end of the respective glosses. These are retained in the column LEXIBANK_GLOSS, but ignored in our representation. 
Appendix A Nordenskioeld-1905-50 Nordenskiöld, Erland 1905 50 basic, areal German South American languages https://www.pueblos-originarios.ucb.edu.bo/Record/106001946?lng=da Nordenskioeld1905 Nordenskioeld1905 This list compares five languages of the Madre de Dios region in the Peruvian selva across 50 basic concepts. While four languages are part of the hypothesized Pano-Takana language family, one language is part of the Harákmbut family. Of the Pano-Takana languages, two are unclassified with respect to the internal classification of the family. The data is based on fieldwork from the original author which was done in 1904 and 1905. 275-276 RonaTas-2011-431 Róna-Tas, András and Berta, Árpád 2011 431 historical English Hungarian, Old Hungarian, Early Ancient Hungarian, West Old Turkic RonaTas2011 This list features 431 Hungarian words that were loaned from West Old Turkic, compiled by András Róna-Tas. @@ -439,10 +439,10 @@ Strong-1911-152 Strong, W. Mersh 1911 152 basic English New Guinea languages S Blum-2024-501 Blum, Frederic and Barrientos, Carlos and Zariquiey, Roberto and List, Johann-Mattis 2024 501 basic English, Spanish Pano-Tacanan languages Blum2024 A list of basic vocabulary that serves as the basis for studying the relation between Panoan and Tacanan languages. The careful selection of relevant items is based on existing conceptlists like [Kaufman-1973-1028](:ref:Kaufman-1973-1028), [Girard-1971-559](:ref:Girard-1971-559) and [Oliveira-2014-517](:ref:Oliveira-2014-517). Smith-2017-679 Smith, Alexander D. 2017 679 basic English Burneo languages Smith2017 This concept list is used to investigate the relatedness and history of the Austronesian languages of Borneo. The study is based primarily on fieldwork in Borneo. It describes the historical phonologies of the languages, evaluates lower- and higher-level linguistic subgroupings, and discusses the history of population movements in Borneo from a linguistic perspective. 
i-693 Zalizniak-2024-4583 Anna Zalizniak and Anna Smirnitskaya and Maksim Russo (Rousseau) and Ilya Gruntov and Timur Maisak and Dmitry Ganenkov and Maria Bulakh and Maria Orlova and Marina Bobrik-Fremke and Oksana Dereza and Tatiana Mikhailova and Maria Bibaeva and Mikhail Voronov 2024 4583 ranked English global http://datsemshift.ru Zalizniak2024 This is the most recent dump of the DatSemShift database, which was retrieved on February 5 (2024) from the database and then converted to CLDF. The data contains several duplicates, which we have marked by a preceding asterisk, leaving them unlinked to Concepticon and also ignoring them in the network representations. DatSemShift -Lastra-1986-381 Lastra de Suárez, Yolanda 1986 381 English Spanish Lastra1986 This concept list is the result of a study of the dialectal regions in modern Nahuatl. 239-248 +Lastra-1986-381 Lastra de Suárez, Yolanda 1986 381 English Spanish Lastra1986 This concept list is the result of a study of the dialectal regions in modern Nahuatl. 239-248 Tjuka-2024-1310 Tjuka, Annika 2024 1310 English Vietnamese Tjuka2024 The list includes 1310 concepts for Vietnamese. The concepts are based on the items provided in the [Intercontinental Dictionary Series](:ref:Key-2016-1310). -Snodgrass-1980-260 Snodgrass, Joan G. and Vanderwart, Mary 1980 260 basic, naming test English global https://psycnet.apa.org/doi/10.1037/0278-7393.6.2.174 Snodgrass1980 This list comprises 260 concepts from several widely studied semantic categories that are to be elicited using a standardized set of 260 pictures for use in experiments investigating differences and similarities in the processing of pictures and words. The pictures are black-and-white line drawings executed according to a set of rules that provide consistency of pictorial representation. -MorenoMartinez-2012-360 Moreno-Martinez, Francisco J. and Montoro, Pedro R. 
2012 360 naming test English, Spanish global https://doi.org/10.1371/journal.pone.0037527.s002 MorenoMartinez2012 This work presents a new set of 360 high quality color images belonging to 23 semantic subcategories (ANIMALS, BIRDS, BODYPARTS, FLOWERS, FRUITS, INSECTS, MARINE CREATURES, NUTS, TREES, VEGETABLES, BUILDINGS, CLOTHING, DESK MATERIAL, FOOD, FURNITURE, JEWELLERY, KITCHEN UTENSILS, MUSICAL INSTRUMENT, SPORTS/GAMES, TOOLS, VEHICLES, WEAPONS, NATURE). The images were named by 230 Spanish native speakers and come with multiple psycholinguistic variables rated by the participants, listed here are mean values for: AoA (7-interval scale, 1 = 0–2 years; 7 = 13 years or more), familiarity (5-point lickert scale), typicality (5-point lickert scale), visual complexity (5-point lickert scale) and lexical frequency (obtained via the AltaVista search engine, log). +Snodgrass-1980-260 Snodgrass, Joan G. and Vanderwart, Mary 1980 260 basic, naming test English global https://psycnet.apa.org/doi/10.1037/0278-7393.6.2.174 Snodgrass1980 This list comprises 260 concepts from several widely studied semantic categories that are to be elicited using a standardized set of 260 pictures for use in experiments investigating differences and similarities in the processing of pictures and words. The pictures are black-and-white line drawings executed according to a set of rules that provide consistency of pictorial representation. +MorenoMartinez-2012-360 Moreno-Martinez, Francisco J. and Montoro, Pedro R. 2012 360 naming test English, Spanish global https://doi.org/10.1371/journal.pone.0037527.s002 MorenoMartinez2012 This work presents a new set of 360 high quality color images belonging to 23 semantic subcategories (ANIMALS, BIRDS, BODYPARTS, FLOWERS, FRUITS, INSECTS, MARINE CREATURES, NUTS, TREES, VEGETABLES, BUILDINGS, CLOTHING, DESK MATERIAL, FOOD, FURNITURE, JEWELLERY, KITCHEN UTENSILS, MUSICAL INSTRUMENT, SPORTS/GAMES, TOOLS, VEHICLES, WEAPONS, NATURE). 
The images were named by 230 Spanish native speakers and come with multiple psycholinguistic variables rated by the participants. Listed here are mean values for: AoA (7-interval scale, 1 = 0–2 years; 7 = 13 years or more), familiarity (5-point Likert scale), typicality (5-point Likert scale), visual complexity (5-point Likert scale) and lexical frequency (obtained via the AltaVista search engine, log). Migliazza-1972-714 Migliazza, Ernest 1972 714 English Yanomamic languages Migliazza1972 The list is part of a dissertion on the Yanomama languages. The dissertion gives information on the external situation of the language, an overview on major syntactic aspects, the phonology of the language, and a comparison between the four varieties with respect to their degree of intelligibility. 457 Berry-1987-195 Berry, Christine and Berry, Keith 1987 195 basic English, Indonesian West Papuan languages https://www.sil.org/resources/archives/37738 Berry1987 This concept list was compiled for a survey on West Papuan languages. It bears general resemblance with typical concept lists compiled for Papuan and Austronesian languages, but to our knowledge, the study does not mention a specific concept list that was used as a basic model. 62-80 Greenhill-2023-121 Greenhill, Simon J. and Haynie, Hannah J. and Ross, Robert M. and Chira, Angela M. and List, Johann-Mattis and Campbell, Lyle and Botero, Carlos A. and Gray, Russell D. 2023 121 English Uto-Aztecan languages https://doi.org/10.1353/lan.2023.0006 Greenhill2023 This list is part of a phylogenetic study on the history of the Uto-Aztecan language family. It analyzes lexical data from thirty-four Uto-Aztecan varieties and two Kiowa-Tanoan languages with Bayesian phylogenetic methods. The study infers the age of Proto-Uto-Aztecan and identify the most likely homeland of the proto-language. 81-107 @@ -450,8 +450,8 @@ Hwang-2021-60 Hwang, Yu M. And Yoonhye, Na and Sung-Bom P. 
2021 60 naming test, Needham-1897-262 Needham, Jack Francis 1897 262 basic English Muishaung https://archive.org/details/collectionoffewm00needrich Needham1897 This concept list was used to elicit lexical data for the Muishaung language of Myanmar and Northeast India. The list originally contains 262 items, but in a note column, additional items are added, displayed here with a *b* as number prefix. The original text occasionally uses *ditto* to refer to repeated words from preceding lines. Here, the intended word was used, preceded by an asterisk. 2-7 vanDort-2007-50 van Dort, Sandra and Vong, Etain and Razak, Rogayah A. and Kamal, Rahayu Mustaffa and Meng, Hooi Poh 2007 50 naming test, basic English Malay https://journalarticle.ukm.my/1033/1/jurnal64.pdf vanDort2007 This is a Malay version of the Boston Naming Test (M-BNT) and its normative data. The M-BNT follows closely the general administration procedures of the original Boston Naming Test (BNT) but is different in terms of item content. A total of 29 items from the original 60 items on the test were deemed culturally and linguistically valid for the Malay population and were thus retained. Gruehn-2008-200 Gruehn, Daniel and Smith, Jacqui 2008 200 ratings English, German German https://doi.org/10.3758/BRM.40.4.1088 Gruehn2008 This list of 200 German adjectives was compiled and used for testing age-dependent ratings of the items as well as for judging self-other relevance. Participants were asked to judge the given adjectives as human attributes. AGE -Tsaparina-2011-260 Tsaparina, Diana and Bonnie, Patrick and Méot, Alain 2011 260 naming test, basic English Russian, global https://doi.org/10.3758/s13428-011-0121-9 Tsaparina2011 This is the original [Snodgrass and Vanderwart (1980)](:bib:Snodgrass1980) list normed for Russian using the colorized pictures from Rossion and Portois (2004). 
The pictures were standardized on name agreement, image agreement, conceptual familiarity, imageability, and age of acquisition. -Dunabeitia-2018-750 Duñabeitia, Jon Andoni and Crepaldi, David and Meyer, Antje S. and New, Boris and Pliatsikas, Christos and Smolka, Eva and Brysbaert, Marc 2018 750 naming test, basic English English, Italian, Spanish, German, Western Flemish, French, global https://www.bcbl.eu/databases/multipic Dunabeitia2018 This is a new set of 750 colored pictures of concrete concepts called MultiPic. The MultiPic databank has been normed in six different European languages (British English, Spanish, French, Belgian Dutch (Western Flemish), Italian and German). All stimuli and norms are freely available. The pictures were standardized for name agreement and visual complexity. +Tsaparina-2011-260 Tsaparina, Diana and Bonnie, Patrick and Méot, Alain 2011 260 naming test, basic English Russian, global https://doi.org/10.3758/s13428-011-0121-9 Tsaparina2011 This is the original [Snodgrass and Vanderwart (1980)](:bib:Snodgrass1980) list normed for Russian using the colorized pictures from Rossion and Pourtois (2004). The pictures were standardized on name agreement, image agreement, conceptual familiarity, imageability, and age of acquisition. +Dunabeitia-2018-750 Duñabeitia, Jon Andoni and Crepaldi, David and Meyer, Antje S. and New, Boris and Pliatsikas, Christos and Smolka, Eva and Brysbaert, Marc 2018 750 naming test, basic English English, Italian, Spanish, German, Western Flemish, French, global https://www.bcbl.eu/databases/multipic Dunabeitia2018 This is a new set of 750 colored pictures of concrete concepts called MultiPic. The MultiPic databank has been normed in six different European languages (British English, Spanish, French, Belgian Dutch (Western Flemish), Italian and German). All stimuli and norms are freely available. The pictures were standardized for name agreement and visual complexity. 
Nishimoto-2005-359 Nishimoto, Takehiko and Miyawaki, Kaori and Ueda, Takashi and Une, Yuko and Takahashi, Masaru 2005 359 naming test, basic English Japanese https://static-content.springer.com/esm/art%3A10.3758%2FBF03192709/MediaObjects/Nishimoto-BRM-2005.zip Nishimoto2005 This is a 359-item list with norms in Japanese. It includes 260 redrawn pictures from Snodgrass and Vanderwart (1980). The authors provide a katakana spelling, a romanized spelling and an English translation. The items are normed on name agreement, AoA, Familiarity and mora length. Boukadi-2015-348 Boukadi, Mariem and Zouaidi, Cirine and Wilson, Maximiliano A. 2015 348 naming test, basic English Tunisian Arabic https://link.springer.com/article/10.3758/s13428-015-0602-3#Sec9 Boukadi2015 This is a 348-item dataset with normative data for Tunisian Arabic on name agreement, familiarity, subjective frequency, and imageability. The dataset uses the original line drawings taken from Cycowicz, Friedman, Rothstein, and Snodgrass (1997), which include Snodgrass and Vanderwart’s (1980) 260 pictures. Shao-2016-327 Shao, Zeshu and Stiegert, Julia 2016 327 naming test, basic English Dutch https://link.springer.com/article/10.3758/s13428-015-0613-0?fromPaywallRec=false#citeas Shao2016 This list includes naming latencies and norms for 327 photos of objects in Dutch for eight psycholinguistic variables: age of acquisition, familiarity, imageability, image agreement, objective and subjective visual complexity, word frequency, word length in syllables and letters, and name agreement. 
@@ -462,11 +462,12 @@ Dimitropoulou-2009-260 Dimitropoulou, María and Duñabeitia, Jon Andoni and Bli Rogic-2013-346 Rogić, Maja and Jerončić, Ana and Bošnjak, Marija and Sedlar, Ana and Hren, Darko and Deletis, Vedran 2013 346 naming test, basic English Croatian https://static-content.springer.com/esm/art%3A10.3758%2Fs13428-012-0308-8/MediaObjects/13428_2012_308_MOESM1_ESM.docx Rogic2013 This is a dataset for 346 visually presented objects in Croatian. It features some of the original [Snodgrass and Vanderwart (1980)](:bib:Snodgrass1980) pictures as well as Roach et al.'s PNT (1996) and Cycowicz et al. (1997) ones. They are standardized according to seven variables: naming latency, name agreement, familiarity, visual complexity, word length, number of syllables, and word frequency. On top of that, an individual item analyses were conducted to account for item difficulty (percentage of the sample who answered correctly) and discrimination (corrected item - total correlation). Liu-2011-435 Liu, Youyi and Meiling, Hao and Li, Ping and Shu, Hua 2011 435 naming test, basic English Chinese https://doi.org/10.1371/journal.pone.0016505.s001 Liu2011 This is a dataset with timed norms for 435 object pictures in Mandarin Chinese. These data include naming latency, name agreement, concept agreement, word length, and age of acquisition (AoA) based on children's naming and adult ratings, and several other adult ratings of concept familiarity, subjective word frequency, image agreement, image variability, and visual complexity. The picture stimuli were taken from multiple sources marked by numbers from 1 to 5: Cycowicz et al. (1997) - 1, Bonin et al. (2003) - 2, the PNT (Roach et al., 1996) - 3, Zhang & Yang (2003) -4, and other -5. Ramanujan-2019-158 Ramanujan, Keerthi and Weekes, Brendan S. 
2019 158 naming test, basic English Hindi https://doi.org/10.1017/S1366728918001177 Ramanujan2019 This is a dataset with Hindi norms for 158 object pictures from [Snodgrass and Vanderwart (1980)](:bib:Snodgrass1980) with stimuli selected from the color version of the dataset by Rossion&Pourtois (2004). From the original 260, those items were excluded that were e.g. culturally unfamiliar or that were uncommon in the Hindi visual context. The remaining data was normed on nameability, naming latency, name agreement, image agreement, familiarity and visual complexity, average latency, age of acquisition, word length in syllables, phonemes and alpha syllables and lastly, word frequency. -Dunabeitia-2022-500 Duñabeitia, Jon Andoni and Baciero, Ana and Antoniou, Kyriakos and Antoniou, Mark and Ataman, Esra and Baus, Cristina and Ben-Shachar, Michal and Çağlar, Ozan Can and Chromý, Jan and Comesaña, Montserrat and Filip, Maroš and Filipović Đurđević, Dušica and Gillon Dowens, Margaret and Hatzidaki, Anna and Januška, Jiří and Jusoh, Zuraini and Kanj, Rama and Kim, Say Young and Kırkıcı, Bilal and Leminen, Alina and Lohndal, Terje and Yap, Ngee Thai and Renvall, Hanna and Rothman, Jason and Royle, Phaedra and Santesteban, Mikel and Sevilla, Yamila and Slioussar, Natalia and Vaughan-Evans, Awel and Wodniecka, Zofia and Wulff, Stefanie and Pliatsikas, Christos 2022 500 naming test, basic English American English, Australian English, Basque, British English, Catalan, Cypriot Greek, Czech, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Korean, Lebanese Arabic, Malay, Malaysian English, Mandarin Chinese, Belgium Dutch, Netherlands Dutch, Norwegian, Polish, Portuguese, Quebec French, Rioplatense Spanish, Russian, Serbian, Slovak, Spanish, Turkish, Welsh, Cantonese, Galician, Ukrainian global https://figshare.com/articles/dataset/Untitled_Item/19328939 Dunabeitia2022 This is a continuation of the [Dunabeitia et al. 
(2018)](:bib:Dunabeitia2018)dataset, now with 35 language varieties (v.8) (American English, Australian English, Basque, Belgium Dutch, British English, Catalan, Cypriot Greek, Czech, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Korean, Lebanese Arabic, Malay, Malaysian English, Mandarin Chinese, Netherlands Dutch, Norwegian, Polish, Portuguese, Quebec French, Rioplatense Spanish, Russian, Serbian, Slovak, Spanish, Turkish, Welsh, Cantonese, Galician and Ukrainian) in total and with familiarity ratings for 500 chosen picture stimuli of the original 750 (currently not available for all languages). Apart from Familiarity, we provide the following variables: Shannon Diversity Scores, Modal Response Percentage, 'I don't know' Response Percentage, Idiosyncratic Response Percentage. +Dunabeitia-2022-500 Duñabeitia, Jon Andoni and Baciero, Ana and Antoniou, Kyriakos and Antoniou, Mark and Ataman, Esra and Baus, Cristina and Ben-Shachar, Michal and Çağlar, Ozan Can and Chromý, Jan and Comesaña, Montserrat and Filip, Maroš and Filipović Đurđević, Dušica and Gillon Dowens, Margaret and Hatzidaki, Anna and Januška, Jiří and Jusoh, Zuraini and Kanj, Rama and Kim, Say Young and Kırkıcı, Bilal and Leminen, Alina and Lohndal, Terje and Yap, Ngee Thai and Renvall, Hanna and Rothman, Jason and Royle, Phaedra and Santesteban, Mikel and Sevilla, Yamila and Slioussar, Natalia and Vaughan-Evans, Awel and Wodniecka, Zofia and Wulff, Stefanie and Pliatsikas, Christos 2022 500 naming test, basic English American English, Australian English, Basque, British English, Catalan, Cypriot Greek, Czech, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Korean, Lebanese Arabic, Malay, Malaysian English, Mandarin Chinese, Belgium Dutch, Netherlands Dutch, Norwegian, Polish, Portuguese, Quebec French, Rioplatense Spanish, Russian, Serbian, Slovak, Spanish, Turkish, Welsh, Cantonese, Galician, Ukrainian global https://figshare.com/articles/dataset/Untitled_Item/19328939 
Dunabeitia2022 This is a continuation of the [Dunabeitia et al. (2018)](:bib:Dunabeitia2018) dataset, now with 35 language varieties (v.8) (American English, Australian English, Basque, Belgium Dutch, British English, Catalan, Cypriot Greek, Czech, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Korean, Lebanese Arabic, Malay, Malaysian English, Mandarin Chinese, Netherlands Dutch, Norwegian, Polish, Portuguese, Quebec French, Rioplatense Spanish, Russian, Serbian, Slovak, Spanish, Turkish, Welsh, Cantonese, Galician and Ukrainian) in total and with familiarity ratings for 500 chosen picture stimuli of the original 750 (currently not available for all languages). Apart from Familiarity, we provide the following variables: Shannon Diversity Scores, Modal Response Percentage, 'I don't know' Response Percentage, Idiosyncratic Response Percentage. DellaRosa-2010-417 Della Rosa, Pasquale A. and Catricà, Eleonora and Vigliocco, Gabriella and Cappa, Stefano F. 2010 417 ratings English Italian https://doi.org/10.3758/BRM.42.4.1042 DellaRosa2010 This list of 417 Italian nouns was rated concerning mode of acquisition, concreteness, imageability, familiarity, age of acquisition, context availability, and abstractness. Ratings were given by 250 Italian native speakers. Ray-1889-78 Ray, Sidney H. 1889 78 English Oceanic languages https://zenodo.org/records/2403292 Ray1889 This list of 78 words was chosen to show the 'variation' in some Oceanic languages compared to Baki. 302-303 Schauenburg-2015-858 Schauenburg, Gesche and Ambrasat, Jens and Schröder, Tobias and Von Scheve, Christian and Conrad, Markus 2015 858 ratings German German https://doi.org/10.3758/s13428-014-0494-7 Schauenburg2015 This list contains ratings of valence, arousal, potency, authority and community for 858 German words. Participants were native German speakers recruited through university mailing lists. 
720-735 Janschewitz-2008-460 Janschewitz, Kristin 2008 460 ratings English English https://doi.org/10.3758/BRM.40.4.1065 Janschewitz2008 This list of 460 English taboo, emotionally valenced, and emotionally neutral words was rated for frequency, inappropriateness, valence, arousal, and imageability by 78 native-English-speaking college students. 1065-1074 Roest-2018-672 Roest, Sander A. and Visser, Tessa A. and Zeelenberg, René 2018 672 ratings English Dutch https://doi.org/10.3758/s13428-017-0890-x Roest2018 This list of 672 Dutch taboo, emotionally valenced, and emotionally neutral words was rated for valence, arousal, and personal as well as general tabooness by 78 psychology students. All participants were native speakers of Dutch. An additional GLOSS column was included here to ensure consistent formatting of the Dutch entries. Mappings were created with reference to the GLOSS column. 630-641 Soederholm-2013-420 Söderholm, Carina and Häyry, Emilia and Laine, Matti and Karrasch, Mira 2013 420 ratings English Finnish https://doi.org/10.1371/journal.pone.0072859 Soederholm2013 This list of 420 Finnish Nouns was rated for valence and arousal by 996 native Finnish speakers, aged 16 to 77. The results were presented in sum as well as split up by age and gender. For this purpose, four groups were created in the original study, which are encoded here as GROUP_01 (16-19 years old), GROUP_02 (20-30 years old), GROUP_03 (31-59 years old) and GROUP_04 (60-77 years old). The original dataset further includes surface and lemma frequencies, as well as bigram, initgram and fintrigram frequencies for each word, which were omitted here. e72859 -Eilola-2010-210 Eilola, Tiina M. and Havelka, Jelena 2010 210 ratings English, Finnish English, Finnish https://doi.org/10.3758/BRM.42.1.134 Eilola2010 This list of 210 words contains ratings of familiarity, valence, emotionality, offensiveness, and concreteness for Finnish and British English nouns, including 34 taboo words. 
Ratings were provided by native speakers of each language. For British English in particular, the aim of the study was to collect data comparable to the American English norms in the Affective Norms for English Words database [(Bradley & Lang, 1999)](:bib:Bradley1999). The present mappings were based on the English words. 134-140 \ No newline at end of file +Eilola-2010-210 Eilola, Tiina M. and Havelka, Jelena 2010 210 ratings English, Finnish English, Finnish https://doi.org/10.3758/BRM.42.1.134 Eilola2010 This list of 210 words contains ratings of familiarity, valence, emotionality, offensiveness, and concreteness for Finnish and British English nouns, including 34 taboo words. Ratings were provided by native speakers of each language. For British English in particular, the aim of the study was to collect data comparable to the American English norms in the Affective Norms for English Words database [(Bradley & Lang, 1999)](:bib:Bradley1999). The present mappings were based on the English words. 134-140 +Pache-2023-207 Pache, Matthias 2023 207 basic English Chibchan https://doi.org/10.1086/722240 Pache2023 This list was used for a comparative analysis of the Chibchan languages with the aim of revising their internal genealogical classification. The author claims that the list represents the Swadesh 207 list; however, it is unclear which list is meant exactly, since Swadesh never published a list containing 207 words. The list is likely very similar to [Comrie (1977)](:bib:Comrie1977) but uses slightly different glosses. The language data was gathered from existing sources on various Chibchan languages. 
81-103 \ No newline at end of file diff --git a/concepticondata/conceptlists/Pache-2023-207.tsv b/concepticondata/conceptlists/Pache-2023-207.tsv new file mode 100644 index 00000000..bbc81856 --- /dev/null +++ b/concepticondata/conceptlists/Pache-2023-207.tsv @@ -0,0 +1,208 @@ +ID NUMBER ENGLISH CONCEPTICON_ID CONCEPTICON_GLOSS +Pache-2023-207-1 1 I 1209 I +Pache-2023-207-2 2 all 98 ALL +Pache-2023-207-3 3 and 1577 AND +Pache-2023-207-4 4 animal 619 ANIMAL +Pache-2023-207-5 5 arrow 977 ARROW +Pache-2023-207-6 6 ash 646 ASH +Pache-2023-207-7 7 at 1461 AT +Pache-2023-207-8 8 back 1291 BACK +Pache-2023-207-9 9 bad 1292 BAD +Pache-2023-207-10 10 bark 1204 BARK +Pache-2023-207-11 11 because 1157 BECAUSE +Pache-2023-207-12 12 belly 1251 BELLY +Pache-2023-207-13 13 big 1202 BIG +Pache-2023-207-14 14 bird 937 BIRD +Pache-2023-207-15 15 black 163 BLACK +Pache-2023-207-16 16 blood 946 BLOOD +Pache-2023-207-17 17 bone 1394 BONE +Pache-2023-207-18 18 breast 1402 BREAST +Pache-2023-207-19 19 child 2099 CHILD +Pache-2023-207-20 20 cloud 1489 CLOUD +Pache-2023-207-21 21 cold 1287 COLD +Pache-2023-207-22 22 correct 1725 CORRECT (RIGHT) +Pache-2023-207-23 23 day 1225 DAY (NOT NIGHT) +Pache-2023-207-24 24 dirty 1230 DIRTY +Pache-2023-207-25 25 dog 2009 DOG +Pache-2023-207-26 26 dry 1398 DRY +Pache-2023-207-27 27 dull (as a knife) 379 BLUNT +Pache-2023-207-28 28 dust 2 DUST +Pache-2023-207-29 29 ear 1247 EAR +Pache-2023-207-30 30 earth 1228 EARTH (SOIL) +Pache-2023-207-31 31 egg 744 EGG +Pache-2023-207-32 32 eye 1248 EYE +Pache-2023-207-33 33 far 1406 FAR +Pache-2023-207-34 34 fat (noun) 323 FAT (ORGANIC SUBSTANCE) +Pache-2023-207-35 35 father 1217 FATHER +Pache-2023-207-36 36 feather 1201 FEATHER +Pache-2023-207-37 37 few 1242 FEW +Pache-2023-207-38 38 fingernail 1258 FINGERNAIL +Pache-2023-207-39 39 fire 221 FIRE +Pache-2023-207-40 40 fish 227 FISH +Pache-2023-207-41 41 five 493 FIVE +Pache-2023-207-42 42 flower 239 FLOWER +Pache-2023-207-43 43 fog 249 FOG +Pache-2023-207-44 44 
foot 1301 FOOT +Pache-2023-207-45 45 forest 420 FOREST +Pache-2023-207-46 46 four 1500 FOUR +Pache-2023-207-47 47 fruit 1507 FRUIT +Pache-2023-207-48 48 full 1429 FULL +Pache-2023-207-49 49 good 1035 GOOD +Pache-2023-207-50 50 grass 606 GRASS +Pache-2023-207-51 51 green 1425 GREEN +Pache-2023-207-52 52 guts 1334 GUTS +Pache-2023-207-53 53 hair 1040 HAIR +Pache-2023-207-54 54 hand 1277 HAND +Pache-2023-207-55 55 he 1211 HE +Pache-2023-207-56 56 head 1256 HEAD +Pache-2023-207-57 57 heart 1223 HEART +Pache-2023-207-58 58 heavy 1210 HEAVY +Pache-2023-207-59 59 here 136 HERE +Pache-2023-207-60 60 horn 1393 HORN (ANATOMY) +Pache-2023-207-61 61 how 1239 HOW +Pache-2023-207-62 62 husband 1200 HUSBAND +Pache-2023-207-63 63 ice 617 ICE +Pache-2023-207-64 64 if 1459 IF +Pache-2023-207-65 65 in 1460 IN +Pache-2023-207-66 66 knee 1371 KNEE +Pache-2023-207-67 67 lake 624 LAKE +Pache-2023-207-68 68 leaf 628 LEAF +Pache-2023-207-69 69 left 244 LEFT +Pache-2023-207-70 70 leg 1297 LEG +Pache-2023-207-71 71 liver 1224 LIVER +Pache-2023-207-72 72 long 1203 LONG +Pache-2023-207-73 73 louse 1392 LOUSE +Pache-2023-207-74 74 man 1554 MAN +Pache-2023-207-75 75 many 1198 MANY +Pache-2023-207-76 76 meat 2615 FLESH OR MEAT +Pache-2023-207-77 77 moon 1313 MOON +Pache-2023-207-78 78 mother 1216 MOTHER +Pache-2023-207-79 79 mountain 639 MOUNTAIN +Pache-2023-207-80 80 mouth 674 MOUTH +Pache-2023-207-81 81 name 1405 NAME +Pache-2023-207-82 82 near 1942 NEAR +Pache-2023-207-83 83 neck 1333 NECK +Pache-2023-207-84 84 new 1231 NEW +Pache-2023-207-85 85 night 1233 NIGHT +Pache-2023-207-86 86 nose 1221 NOSE +Pache-2023-207-87 87 not 1240 NOT +Pache-2023-207-88 88 old 1229 OLD +Pache-2023-207-89 89 one 1493 ONE +Pache-2023-207-90 90 other 197 OTHER +Pache-2023-207-91 91 person 683 PERSON +Pache-2023-207-92 92 rain 658 RAIN (PRECIPITATION) +Pache-2023-207-93 93 red 156 RED +Pache-2023-207-94 94 right 1019 RIGHT +Pache-2023-207-95 95 river 666 RIVER +Pache-2023-207-96 96 road 2457 PATH OR ROAD 
+Pache-2023-207-97 97 root 670 ROOT +Pache-2023-207-98 98 rope 1218 ROPE +Pache-2023-207-99 99 rotten 1728 ROTTEN +Pache-2023-207-100 100 round 1395 ROUND +Pache-2023-207-101 101 salt 1274 SALT +Pache-2023-207-102 102 sand 671 SAND +Pache-2023-207-103 103 sea 1474 SEA +Pache-2023-207-104 104 seed 714 SEED +Pache-2023-207-105 105 sharp (as a knife) 1396 SHARP +Pache-2023-207-106 106 short 1645 SHORT +Pache-2023-207-107 107 skin 763 SKIN +Pache-2023-207-108 108 sky 1732 SKY +Pache-2023-207-109 109 small 1246 SMALL +Pache-2023-207-110 110 smoke 778 SMOKE (EXHAUST) +Pache-2023-207-111 111 smooth 1234 SMOOTH +Pache-2023-207-112 112 snake 730 SNAKE +Pache-2023-207-113 113 snow 784 SNOW +Pache-2023-207-114 114 some 1241 SOME +Pache-2023-207-115 115 star 1430 STAR +Pache-2023-207-116 116 stick 1295 STICK +Pache-2023-207-117 117 stone 857 STONE +Pache-2023-207-118 118 straight 1404 STRAIGHT +Pache-2023-207-119 119 sun 1343 SUN +Pache-2023-207-120 120 tail 1220 TAIL +Pache-2023-207-121 121 that 78 THAT +Pache-2023-207-122 122 there 1937 THERE +Pache-2023-207-123 123 they 817 THEY +Pache-2023-207-124 124 thick 1244 THICK +Pache-2023-207-125 125 thin 2308 THIN +Pache-2023-207-126 126 this 1214 THIS +Pache-2023-207-127 127 three 492 THREE +Pache-2023-207-128 128 to bite 1403 BITE +Pache-2023-207-129 129 to blow 176 BLOW (WITH MOUTH) +Pache-2023-207-130 130 to breathe 1407 BREATHE +Pache-2023-207-131 131 to burn 2102 BURN +Pache-2023-207-132 132 to come 1446 COME +Pache-2023-207-133 133 to count 1420 COUNT +Pache-2023-207-134 134 to cut 1432 CUT +Pache-2023-207-135 135 to die 1494 DIE +Pache-2023-207-136 136 to dig 1418 DIG +Pache-2023-207-137 137 to drink 1401 DRINK +Pache-2023-207-138 138 to eat 1336 EAT +Pache-2023-207-139 139 to fall 1280 FALL +Pache-2023-207-140 140 to fear 1419 FEAR (BE AFRAID) +Pache-2023-207-141 141 to fight 1423 FIGHT +Pache-2023-207-142 142 to float 1574 FLOAT +Pache-2023-207-143 143 to flow 2003 FLOW +Pache-2023-207-144 144 to fly 1441 FLY (MOVE 
THROUGH AIR) +Pache-2023-207-145 145 to freeze 1431 FREEZE +Pache-2023-207-146 146 to give 1447 GIVE +Pache-2023-207-147 147 to hear 1408 HEAR +Pache-2023-207-148 148 to hit 1433 HIT +Pache-2023-207-149 149 to hold 1448 HOLD +Pache-2023-207-150 150 to hunt 1435 HUNT +Pache-2023-207-151 151 to kill 1417 KILL +Pache-2023-207-152 152 to know 3626 KNOW +Pache-2023-207-153 153 to laugh 1355 LAUGH +Pache-2023-207-154 154 to lie (as in a bed) 215 LIE DOWN +Pache-2023-207-155 155 to live 1422 BE ALIVE +Pache-2023-207-156 156 to play 1413 PLAY +Pache-2023-207-157 157 to pull 1455 PULL +Pache-2023-207-158 158 to push 1452 PUSH +Pache-2023-207-159 159 to rub 1449 RUB +Pache-2023-207-160 160 to say 1458 SAY +Pache-2023-207-161 161 to scratch 1436 SCRATCH +Pache-2023-207-162 162 to see 1409 SEE +Pache-2023-207-163 163 to sew 1457 SEW +Pache-2023-207-164 164 to sing 1261 SING +Pache-2023-207-165 165 to sit 1416 SIT +Pache-2023-207-166 166 to sleep 1585 SLEEP +Pache-2023-207-167 167 to smell 2124 SMELL +Pache-2023-207-168 168 to spit 1440 SPIT +Pache-2023-207-169 169 to split 1437 SPLIT +Pache-2023-207-170 170 to squeeze 1414 SQUEEZE +Pache-2023-207-171 171 to stab 1434 STAB +Pache-2023-207-172 172 to stand 1442 STAND +Pache-2023-207-173 173 to suck 1421 SUCK +Pache-2023-207-174 174 to swell 1573 SWELL +Pache-2023-207-175 175 to swim 1439 SWIM +Pache-2023-207-176 176 to think 2271 THINK +Pache-2023-207-177 177 to throw 1456 THROW +Pache-2023-207-178 178 to tie 1917 TIE +Pache-2023-207-179 179 to turn (intransitive) 1588 TURN +Pache-2023-207-180 180 to vomit 1278 VOMIT +Pache-2023-207-181 181 to walk 1443 WALK +Pache-2023-207-182 182 to wash 1453 WASH +Pache-2023-207-183 183 to wipe 1454 WIPE +Pache-2023-207-184 184 tongue (organ) 1205 TONGUE +Pache-2023-207-185 185 tooth 1380 TOOTH +Pache-2023-207-186 186 tree 906 TREE +Pache-2023-207-187 187 two 1498 TWO +Pache-2023-207-188 188 warm 1232 WARM +Pache-2023-207-189 189 water 948 WATER +Pache-2023-207-190 190 we 1212 WE 
+Pache-2023-207-191 191 wet 1726 WET +Pache-2023-207-192 192 what 1236 WHAT +Pache-2023-207-193 193 when 1238 WHEN +Pache-2023-207-194 194 where 1237 WHERE +Pache-2023-207-195 195 white 1335 WHITE +Pache-2023-207-196 196 who 1235 WHO +Pache-2023-207-197 197 wide 1243 WIDE +Pache-2023-207-198 198 wife 1199 WIFE +Pache-2023-207-199 199 wind 960 WIND +Pache-2023-207-200 200 wing 1257 WING +Pache-2023-207-201 201 with 1340 WITH +Pache-2023-207-202 202 woman 962 WOMAN +Pache-2023-207-203 203 worm 1219 WORM +Pache-2023-207-204 204 year 1226 YEAR +Pache-2023-207-205 205 yellow 1424 YELLOW +Pache-2023-207-206 206 you (plural) 1213 YOU +Pache-2023-207-207 207 you (singular) 1215 THOU \ No newline at end of file diff --git a/concepticondata/conceptlists/Pache-2023-207.tsv-metadata.json b/concepticondata/conceptlists/Pache-2023-207.tsv-metadata.json new file mode 100644 index 00000000..16df4e9b --- /dev/null +++ b/concepticondata/conceptlists/Pache-2023-207.tsv-metadata.json @@ -0,0 +1,54 @@ +{ + "@context": [ + "http://www.w3.org/ns/csvw", + { + "@language": "en" + } + ], + "dialect": { + "encoding": "utf-8-sig", + "delimiter": "\t", + "skipBlankRows": true + }, + "tables": [ + { + "tableSchema": { + "columns": [ + { + "datatype": { + "base": "string", + "format": "[a-zA-Z]+\\-[0-9]{4}\\-[0-9]+[a-z]?\\-[0-9]+[a-z]?$" + }, + "name": "ID" + }, + { + "datatype": { + "base": "string", + "format": "[0-9\\.]+([a-z\\\u2013]+)?$" + }, + "name": "NUMBER" + }, + { + "datatype": { + "base": "integer", + "minimum": "1" + }, + "name": "CONCEPTICON_ID" + }, + { + "datatype": "string", + "name": "CONCEPTICON_GLOSS" + }, + { + "datatype": "string", + "name": "ENGLISH" + } + ], + "primaryKey": [ + "ID" + ] + }, + "url": "Pache-2023-207.tsv" + } + ] +} \ No newline at end of file diff --git a/concepticondata/references/references.bib b/concepticondata/references/references.bib index d064adb8..a7caad1f 100644 --- a/concepticondata/references/references.bib +++ 
b/concepticondata/references/references.bib @@ -4932,4 +4932,15 @@ @Article{Eilola2010 pages = {134--140}, year = {2010}, Doi = {https://doi.org/10.3758/BRM.42.1.134} +} + +@Article{Pache2023, + title = {Pech and the Basic Internal Classification of Chibchan}, + author = {Pache, Matthias}, + journal = {International Journal of American Linguistics}, + volume = {89}, + number = {1}, + pages = {81--103}, + year = {2023}, + Doi = {10.1086/722240} } \ No newline at end of file