Commit d941910e authored by Francesco Sabatini

Added manual correction from BJA, UJ, HB and export species list to submit in TRY

parent b99e8124
......@@ -12,7 +12,7 @@ output:
number_sections: true
toc: true
toc_depth: 2
abstract: "This document describes the workflow (with contributions from Oliver Purschke, Jürgen Dengler and Florian Jansen) that was used to generate the taxonomic backbone that standardizes taxon names across (i) the global vegetation plot database sPlot version 3.0 and (ii) the global plant trait database TRY version 5."
abstract: "This document describes the workflow (with contributions from Oliver Purschke, Jürgen Dengler and Florian Jansen) that was used to generate the taxonomic backbone that standardizes taxon names across (i) the global vegetation plot database sPlot version 3.0 and (ii) the global plant trait database TRY version 5. "
urlcolor: blue
---
......@@ -26,8 +26,10 @@ urlcolor: blue
**Timestamp:** `r date()`
**Drafted:** Francesco Maria Sabatini
**Revised:**
**Version:** 1.0
**Revised:** Helge Bruelheide, Borja Jimenez-Alfaro
**Version:** 1.1
**Changes in Version 1.1:** Additional manual cleaning of species names by BJA, UJ and HB.
***
......@@ -57,7 +59,8 @@ mushroom <- c("Mycena", "Boletus", "Russula","Calocybe","Collybia","Amanita","Am
"Sarcodom","Sarcoscyphus","Scleroderma","Stropharia","Tylopilus","Typhula", "Calyptella", "Chrysopsora", "Lacrymaria", "Dermoloma",
"Agaricus","Alnicola", "Amanitina", "Bovista", "Cheilymenia","Clavulinopsis", "Clitocybe", "Entoloma", "Geaster", "Inocybe",
"Laccaria", "Laetiporus", "Lepista", "Macrolepiota", "Macrolepis", "Marasmius", "Panaeolus", "Psathyrella", "Psilocybe",
"Rickenella", "Sarcoscypha", "Vascellum", "Ramaria")
"Rickenella", "Sarcoscypha", "Vascellum", "Ramaria",
"Amphoroblasia", "Amphoroblastia")
```
......@@ -207,7 +210,7 @@ spec.list.TRY.sPlot <- spec.list.TRY.sPlot %>%
```
A total of `r nrow(spec.list.TRY.sPlot %>% filter(OriginalNames != Species))` species names were modified. Although substantially improved, the species list still contains many inconsistencies.
The total list submitted to TNRS containes `r length(unique(spec.list.TRY.sPlot$Species))` species names.
The total list submitted to TNRS contains `r length(unique(spec.list.TRY.sPlot$Species))` species names.
# Match names against Taxonomic Name Resolution Service ([TNRS](http://tnrs.iplantcollaborative.org))
......@@ -324,7 +327,7 @@ tnrs.res <- tnrs.res0 %>%
slice(1)
```
After this first step, there are `r sum(tnrs.res$Name_matched=="No suitable matches found.")` recprds for which no match was found. Another `r sum(tnrs.res$Overall_score<0.9)` were unreliably matched (overall match score <0.9).
After this first step, there are `r sum(tnrs.res$Name_matched=="No suitable matches found.")` records for which no match was found. Another `r sum(tnrs.res$Overall_score<0.9)` were unreliably matched (overall match score <0.9).
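The split described above can be sketched on toy data. This is only an illustration under stated assumptions: the data frame `tnrs.toy` and its values are invented, `Overall_score` is assumed to be numeric, and the actual workflow may apply further criteria when it builds `tnrs.res.certain` and `tnrs.res.uncertain`.

```r
library(dplyr)

# Toy stand-in for the TNRS output; all values are invented for illustration
tnrs.toy <- data.frame(
  Name_submitted = c("Acer campestre", "Quercus robur", "Xyz abc", "Fagus sylvatica"),
  Name_matched   = c("Acer campestre", "Quercus robur",
                     "No suitable matches found.", "Fagus sylvatica"),
  Overall_score  = c(1, 0.85, 0, 0.95),
  stringsAsFactors = FALSE
)

score.threshold <- 0.9  # same cut-off as quoted in the text

# Reliable matches: a match was found and the score reaches the threshold
tnrs.toy.certain <- tnrs.toy %>%
  filter(Name_matched != "No suitable matches found.",
         Overall_score >= score.threshold)

# Everything else goes to a second round of cleaning
tnrs.toy.uncertain <- tnrs.toy %>%
  filter(Name_matched == "No suitable matches found." |
           Overall_score < score.threshold)
```

Keeping the threshold in a single variable makes it easy to check how sensitive the number of uncertain records is to the cut-off.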
### Family level {#ID}
......@@ -421,14 +424,22 @@ tnrs.submit.iter2 <- data.frame(old=tnrs.res.uncertain$Name_submitted) %>%
mutate(new=old) %>%
mutate(new=tolower(new)) %>%
mutate(new=firstup(new)) %>%
mutate(new=gsub(" [0-9]*$", "", new)) %>%
mutate(new=gsub("^Str ", "", new)) %>%
mutate(new=gsub("^Unknown ", "", new)) %>%
mutate(new=firstup(new)) %>%
mutate(new=gsub(" [0-9]*$", "", new)) %>% #delete digits at end of object
mutate(new=gsub("^\\d+|\\d+$", "", new)) %>% #delete digits at the beginning or end of a string
mutate(new=gsub(" sp.$", "", new)) %>%
mutate(new=gsub(" sp$", "", new)) %>%
mutate(new=gsub(" species$", "", new)) %>%
mutate(new=gsub(" *$", "", new)) %>%mutate(new=gsub('^Agropyrum', 'Agropyron', new)) %>%
mutate(new=gsub(" *$", "", new)) %>%
mutate(new=gsub(" #$", "", new)) %>%
mutate(new=gsub(" m$", "", new)) %>%
mutate(new=gsub("acea ", "aceae ", new)) %>%
mutate(new=gsub('^Agropyrum', 'Agropyron', new)) %>%
mutate(new=gsub('^Anno ', 'Annona ', new)) %>%
mutate(new=gsub('Adpdytes dimidiata', 'Apodytes dimidiata', new)) %>%
mutate(new=gsub('Adenostorna fasciculaturn', 'Adenostoma fasciculaturn', new)) %>%
mutate(new=gsub('Adenostorna fasciculaturn', 'Adenostoma fasciculatum', new)) %>%
mutate(new=gsub('Arctostapliylos glallca', 'Arctostaphylos glauca', new)) %>%
mutate(new=gsub('Bituminosa bituminosa', 'Bituminaria bituminosa', new)) %>%
mutate(new=gsub('Causurina equisitifolia', 'Casuarina equisetifolia', new)) %>%
......@@ -523,9 +534,372 @@ tnrs.submit.iter2 <- data.frame(old=tnrs.res.uncertain$Name_submitted) %>%
mutate(new=gsub('^Albizzia ', 'Albizia ', new)) %>%
mutate(new=gsub('^Ipomoena ', 'Ipomoea ', new)) %>%
mutate(new=gsub('^Ipomea ', 'Ipomoea ', new)) %>%
mutate(new=gsub('Ipomo wolco', 'Ipomoea wolcottiana', new))
mutate(new=gsub('Ipomo wolco', 'Ipomoea wolcottiana', new)) %>%
## additional manual cleaning from UJ, BJA, HB
mutate(new=gsub('Abacaba palm', 'Oenocarpus balickii', new)) %>%
mutate(new=gsub('Acerkuomeii', 'Acer kuomeii', new)) %>%
mutate(new=gsub('Adelphacme minima', '', new)) %>%
mutate(new=gsub('Alder$', 'Alnus', new)) %>%
mutate(new=gsub('Amapa$', 'Tabebuia', new)) %>%
mutate(new=gsub('Amapa amargoso', 'Parahancornia amapa', new)) %>%
mutate(new=gsub('Amapa doce$', 'Tabebuia', new)) %>%
mutate(new=gsub('Amapai$', 'Tabebuia', new)) %>%
mutate(new=gsub('Amapaí$', 'Tabebuia', new)) %>%
mutate(new=gsub('Amapa m1', 'Tabebuia', new)) %>%
mutate(new=gsub('Amaranth$', 'Amaranthus', new)) %>%
mutate(new=gsub('Amophora fruticosa', 'Amorpha fruticosa', new)) %>%
mutate(new=gsub('Anacardiace ', 'Anacardiaceae ', new)) %>%
mutate(new=gsub('Anagallisarvensis', 'Anagallis arvensis', new)) %>%
mutate(new=gsub('Anemonenarcissiflora var.', 'Anemone narcissiflora', new)) %>%
mutate(new=gsub('Anenome ', 'Anemone ', new)) %>%
mutate(new=gsub('Anona ', 'Annona ', new)) %>%
mutate(new=gsub('Antylis ', 'Anthyllis ', new)) %>%
mutate(new=gsub('Apocyncadea gelbblueh$', 'Apocynaceae', new)) %>%
mutate(new=gsub('Aracium', 'Crepis', new)) %>%
mutate(new=gsub('Ardis mexic', 'Ardisia mexicana subsp. siltepecana', new)) %>%
mutate(new=gsub('Ardis verap', 'Ardisia verapazensis', new)) %>%
mutate(new=gsub('Argenomne hummemannii', 'Argemone hunnemanni', new)) %>%
mutate(new=gsub('Artabotus', 'Artabotrys', new)) %>%
mutate(new=gsub('Artemisiaintegrifolia', 'Artemisia integrifolia', new)) %>%
mutate(new=gsub('Asclepiacea$', 'Asclepiadaceae', new)) %>%
mutate(new=gsub('Asclep. klimmer', 'Asclepiadaceae', new)) %>%
mutate(new=gsub('Astartoseris triquetra', 'Lactuca triquetra', new)) %>%
mutate(new=gsub('Asteracee ', 'Asteraceae ', new)) %>%
mutate(new=gsub('Avenula glauc$', 'Avenula', new)) %>%
mutate(new=gsub('Baikea plurijuga', 'Baikiaea plurijuga', new)) %>%
mutate(new=gsub('Binse rundbl', 'Juncaceae', new)) %>%
mutate(new=gsub('Blättrige fabaceae th', 'Fabaceae', new)) %>%
mutate(new=gsub('Bonel macro$', 'Bonellia macrocarpa subsp. macrocarpa', new)) %>%
mutate(new=gsub('Boraginacee samtig', 'Boraginaceae', new)) %>%
mutate(new=gsub('Bri¢fitos', 'Bryophyta', new)) %>%
mutate(new=gsub('Bryophyte$', 'Bryophyta', new)) %>%
mutate(new=gsub('Bryopsida', 'Bryophyta', new)) %>%
mutate(new=gsub('Carallia macrophylla', 'Carallia', new)) %>%
mutate(new=gsub('Carexectabilis', 'Carex spectabilis', new)) %>%
mutate(new=gsub('Carex fein', 'Carex', new)) %>%
mutate(new=gsub('Cerania vermicularis', 'Thamnolia vermicularis', new)) %>%
mutate(new=gsub('Chamelauci merredin', 'Chamelaucium', new)) %>%
mutate(new=gsub('Chamelau drummon', 'Chamelaucium', new)) %>%
mutate(new=gsub('Charophyta', 'Characeae', new)) %>%
mutate(new=gsub('Cheiridopsis-keimlinge', 'Cheiridopsis', new)) %>%
mutate(new=gsub('Chenopodiacee$', 'Chenopodiaceae', new)) %>%
mutate(new=gsub('Chiangioden mexicanum', 'Chiangiodendron mexicanum', new)) %>%
mutate(new=gsub('Chiranthode pentadactylon', 'Chiranthodendron pentadactylon', new)) %>%
mutate(new=gsub('Chrysobalan ', 'Chrysobalanus ', new)) %>%
mutate(new=gsub('Cladapodiella', 'Cladopodiella', new)) %>%
mutate(new=gsub('Cleidium ', 'Cleidion ', new)) %>%
mutate(new=gsub('Collema/leptogium lichenoides', 'Collemataceae', new)) %>%
mutate(new=gsub('Comarostaph discolor', 'Comarostaphylis discolor', new)) %>%
mutate(new=gsub('Combretdodendrum africana', 'Combretodendrum africanum', new)) %>%
mutate(new=gsub('Commelinacaea floscopa', 'Floscopa glomerata', new)) %>%
mutate(new=gsub('Coyncia setigera', 'Coincya setigera', new)) %>%
mutate(new=gsub('Crataeva', 'Crateva', new)) %>%
mutate(new=gsub('Craterosperma', 'Rubiaceae', new)) %>%
mutate(new=gsub('Crespicium', 'Burseraceae', new)) %>%
mutate(new=gsub('Critoniadel nubigenus', 'Critoniadelphus nubigenus', new)) %>%
mutate(new=gsub('Crotalaria/vigna?', 'Fabaceae', new, fixed = TRUE)) %>% # literal match: '?' is otherwise a regex quantifier
mutate(new=gsub('Croto billb', 'Croton billbergianus subsp. pyramidalis', new)) %>%
mutate(new=gsub('Dana„ racemosa', 'Danae racemosa', new)) %>%
mutate(new=gsub('Deehasia', 'Dehaasia', new)) %>%
mutate(new=gsub('Dichapetala', 'Dichapetalum', new)) %>%
mutate(new=gsub('Distel bractea', 'Asteraceae', new)) %>%
mutate(new=gsub('Distelig asteraceae', 'Asteraceae', new)) %>%
mutate(new=gsub('Dodon visco', 'Dodonaea viscosa', new)) %>%
mutate(new=gsub('Doldenbluetler', 'Apiaceae', new)) %>%
mutate(new=gsub('Echinosurus capitatus', 'Poaceae', new)) %>%
mutate(new=gsub('Einähriges gras$', 'Poaceae', new)) %>%
mutate(new=gsub('Einähriges gras von gestern$', 'Poaceae', new)) %>%
mutate(new=gsub('Einblütiges rispengras', 'Poaceae', new)) %>%
mutate(new=gsub('Eiovaltrichtergrundblatt orchidee', 'Orchidaceae', new)) %>%
mutate(new=gsub('Elongata subsp.', 'Pohlia elongata', new)) %>%
mutate(new=gsub('Enriquebelt ', 'Enriquebeltrania ', new)) %>%
mutate(new=gsub('Entermorpha ', 'Enteromorpha ', new)) %>%
mutate(new=gsub('Erodiurn$', 'Erodium', new)) %>%
mutate(new=gsub('Euc. chloroclada x camaldulensis', 'Eucalyptus', new)) %>%
mutate(new=gsub('Euphorbiacée ipatouduluga gouduatché', 'Euphorbiaceae', new)) %>%
mutate(new=gsub('Fabacee kleeblatt stengel schwarzdrüsi', 'Fabaceae', new)) %>%
mutate(new=gsub('Fabaceenstrauch wie 132446 f', 'Fabaceae', new)) %>%
mutate(new=gsub('Fabaceenstr kleinbltrg', 'Fabaceae', new)) %>%
mutate(new=gsub('Fabacee wie lotus f', 'Fabaceae', new)) %>%
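mutate(new=gsub('Farn cystopteris', 'Cystopteris', new)) %>% # specific case first, before the generic 'Farn' rule
mutate(new=gsub('^Farn\\b', 'Pteridophyta', new)) %>%
mutate(new=gsub('^Fern\\b', 'Pteridophyta', new)) %>% # whole word only, so genera starting with 'Fern' are left intact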
mutate(new=replace(new, list=word(new, 1)=="Fingergras", values="Digitaria")) %>%
mutate(new=replace(new, list=word(new, 1)=="Fingerhirse", values="Digitaria")) %>%
mutate(new=gsub('Gelbe onagraceae', 'Onagraceae', new)) %>%
mutate(new=gsub('^Gramin(eae|ea|ia|e)\\b', 'Poaceae', new)) %>% # one rule for all variants, so 'Graminea' is not left as 'Poaceaea'
mutate(new=gsub('Grannenquecke', 'Poaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Gras",
values="Poaceae")) %>%
mutate(new=gsub('Gynostachi dicanthus', 'Gymnostachium diacanthus', new)) %>%
mutate(new=gsub('Hafer haarkranz', 'Poaceae', new)) %>%
mutate(new=gsub('Hapolosiphon', 'Hapalosiphon', new)) %>%
mutate(new=gsub('Heliocrysum', 'Helichrysum', new)) %>%
mutate(new=replace(new, list=word(new, 1)=="Hepaticae", values="Bryophyta")) %>%
mutate(new=gsub('Hepaticas', 'Bryophyta', new)) %>%
mutate(new=gsub('Hepatophyta', 'Bryophyta', new)) %>%
mutate(new=gsub('Hermerocalis', 'Hemerocallis', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Hirse",
values="Poaceae")) %>%
mutate(new=gsub('Hirte trian', 'Hirtella triandra subsp. media', new)) %>%
mutate(new=replace(new, list=word(new, 1)=="Hohlzahn", values="Lamiaceae")) %>%
mutate(new=gsub('Hondurodend urceolatum', 'Hondurodendron urceolatum', new)) %>%
mutate(new=gsub('Hornklee gelb', 'Fabaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Horstgras",
values="Poaceae")) %>%
mutate(new=replace(new,
list=word(new, 1)=="Huehnerhirse",
values="Digitaria")) %>%
mutate(new=gsub('Hydrocoleus lyngbyaceus', 'Hydrocoleum lyngbyaceum', new)) %>%
mutate(new=gsub('Hyernima nipensis', 'Hieronyma nipensis', new)) %>%
mutate(new=gsub('Hyeronima', 'Hieronyma', new)) %>%
mutate(new=gsub('Hypocal angusti', 'Hypocalymma angustifolium', new)) %>%
mutate(new=gsub('Hypocalym nambung', 'Hypocalymma', new)) %>%
mutate(new=gsub('Hyprium', 'Hypericum', new)) %>%
mutate(new=gsub('Igelkolben', 'Sparganium', new)) %>%
mutate(new=gsub('Ilexã‚â paraguariensis', 'Ilex', new)) %>%
mutate(new=gsub('Ipomea', 'Ipomoea', new)) %>%
mutate(new=gsub('Ipomoena', 'Ipomoea', new)) %>%
mutate(new=gsub('Jm kürbis stark behaart', 'Cucurbitaceae', new)) %>%
mutate(new=gsub('Juncaginacee/triglochin', 'Triglochin', new)) %>%
mutate(new=gsub('Juncas', 'Juncus', new)) %>%
mutate(new=gsub('Keilblatt cyperus', 'Cyperus', new)) %>%
mutate(new=gsub('Khh 3010 polygalacee', 'Polygalaceae', new)) %>%
mutate(new=gsub('Khh 3014 liliacee 3f„ch. kapsel schwarze samen', 'Liliaceae', new)) %>%
mutate(new=gsub('Khh 3024 brachiaria', 'Brachiaria', new)) %>%
mutate(new=gsub('Khh 3025 liliaceae gelbe blten breite bl„tter', 'Liliaceae', new)) %>%
mutate(new=gsub('Khh 3037 ficus', 'Ficus', new)) %>%
mutate(new=gsub('Khh 3054 ficus iteophylla miq.', 'Ficus', new)) %>%
mutate(new=gsub('Kl. borstgras', 'Poaceae', new)) %>%
mutate(new=gsub('Kleine malvaceae', 'Malvaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Kletter",
values="Asteraceae")) %>%
mutate(new=gsub('Klimmer asclepiadaceae', 'Asclepiadaceae', new)) %>%
mutate(new=gsub('Klimmer curcuvitaceae', 'Cucurbitaceae', new)) %>%
mutate(new=gsub('Kl. sauergras', 'Cyperaceae', new)) %>%
mutate(new=gsub('Knabenkraut gefleckt', 'Orchis', new)) %>%
mutate(new=gsub('Knubbelblüt. gras haarkranz vgl f', 'Poaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Koenigskerze",
values="Verbascum")) %>%
mutate(new=gsub('Kriechgras zynodon', 'Poaceae', new)) %>%
mutate(new=gsub('Kürbis', 'Cucurbitaceae', new)) %>%
mutate(new=gsub('Lamiaceen strauch', 'Lamiaceae', new)) %>%
mutate(new=gsub('Lamiacee orange', 'Lamiaceae', new)) %>%
mutate(new=gsub('Lamiales orobanchaceae + phrymaceae + plantaginaceae + scrophulariaceae', 'Orobanchaceae', new, fixed = TRUE)) %>% # literal match: '+' is otherwise a regex quantifier
mutate(new=gsub('Lantanacamara wandelrösschen', 'Lantana camara', new)) %>%
mutate(new=gsub('Lasiopeta watheroo k. shepherd & c. wilkins ks', 'Lasiopetalum', new)) %>%
mutate(new=gsub('Leg-inderteminada', 'Fabaceae', new)) %>%
mutate(new=gsub('Legu 1fiedrig groá schlank', 'Fabaceae', new)) %>%
mutate(new=gsub('Legume$', 'Fabaceae', new)) %>%
mutate(new=gsub('Leguminosae spgm', 'Fabaceae', new)) %>%
mutate(new=gsub('Leguminosea', 'Fabaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Leguminose",
values="Fabaceae")) %>%
mutate(new=gsub('Leheelo grass', 'Poaceae', new)) %>%
mutate(new=gsub('Lepid carra', 'Lepiderema', new)) %>%
mutate(new=gsub('Lich caloplaca', 'Caloplaca', new)) %>%
mutate(new=gsub('Liliacee', 'Liliaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Lilie",
values="Liliaceae")) %>%
mutate(new=gsub('Liliengewächs', 'Liliaceae', new)) %>%
mutate(new=gsub('Lisea', 'Litsea', new)) %>%
mutate(new=gsub('Lisymachia', 'Lysimachia', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Liverwort",
values="Bryophyta")) %>%
mutate(new=gsub('Livwort', 'Bryophyta', new)) %>%
mutate(new=gsub('Lonicerachrysantha', 'Lonicera chrysantha', new)) %>%
mutate(new=gsub('Lycoctamnus barbatus', 'Aconitum barbatum', new)) %>%
mutate(new=gsub('Lygopus', 'Lycopus', new)) %>%
mutate(new=gsub('Maitenus', 'Maytenus', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Malpighiace",
values="Malpighiaceae")) %>%
mutate(new=gsub('Malpighiales chrysobalanaceae + humiriaceae', 'Malpighiaceae', new, fixed = TRUE)) %>% # literal match: '+' is otherwise a regex quantifier
mutate(new=replace(new,
list=word(new, 1)=="Malve",
values="Malvaceae")) %>%
mutate(new=replace(new,
list=word(new, 1)=="Mammutgras",
values="Poaceae")) %>%
mutate(new=gsub('Mammutgrass', 'Poaceae', new)) %>%
mutate(new=gsub('Maqui guian', 'Maquira guianensis subsp. costaricana', new)) %>%
mutate(new=gsub('Marchantiophyta', 'Bryophyta', new)) %>%
mutate(new=gsub('Mariana aphylla', 'Maireana aphylla', new)) %>%
mutate(new=gsub('Mehrfingeriges ährengras', 'Poaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Melastomata",
values="Melastomataceae")) %>%
mutate(new=gsub('Mesembr minibl', 'Mesembryanthemum', new)) %>%
mutate(new=gsub('Mesostomma kotschyanum', 'Mesostemma kotschyana', new)) %>%
mutate(new=gsub('Microhepatics', 'Bryophyta', new)) %>%
mutate(new=gsub('Micromeria micrantha', 'Micromeria graeca subsp. micrantha', new)) %>%
mutate(new=gsub('Mimose minifiedrig f', 'Fabaceae', new)) %>%
mutate(new=gsub('Miniepilobium', 'Epilobium', new)) %>%
mutate(new=gsub('Minimargerite', 'Asteraceae', new)) %>%
mutate(new=gsub('Miniochna', 'Ochna', new)) %>%
mutate(new=gsub('Minischilf 132466 f', 'Poaceae', new)) %>%
mutate(new=gsub('Mistletoe', 'Viscum', new)) %>%
mutate(new=gsub('Mniaecia', 'Mniaceae', new)) %>%
mutate(new=gsub('Molemo', 'Turraea', new)) %>%
mutate(new=gsub('Molses', 'Bryophyta', new)) %>%
mutate(new=gsub('Momisa pigra', 'Mimosa pigra', new)) %>%
mutate(new=gsub('Monandrus squarrosus', 'Cyperus squarrosus', new)) %>%
mutate(new=gsub('Monchema debile', 'Monechma debile', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Monochna",
values="Polygalaceae")) %>%
mutate(new=replace(new,
list=word(new, 1)=="Moos",
values="Bryophyta")) %>%
mutate(new=gsub('Moospolster grau-grün', 'Bryophyta', new)) %>%
mutate(new=gsub('Mortonioden ', 'Mortoniodendron ', new)) %>%
mutate(new=gsub('Mos onbekend', 'Bryophyta', new)) %>%
mutate(new=gsub('Mossen overige', 'Bryophyta', new)) %>%
mutate(new=gsub('Mougetia', 'Mougeotia', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Musci",
values="Bryophyta")) %>%
mutate(new=gsub('Myciantes', 'Myrcianthes', new)) %>%
mutate(new=gsub('Myrciaã‚â pulchra', 'Myrcia pulchra', new)) %>%
mutate(new=gsub('Myrcianov.', 'Myrcia', new, fixed = T)) %>%
mutate(new=gsub('Myrsi coria', 'Myrsine coriacea', new)) %>%
mutate(new=gsub('Myrtaceenstrauch', 'Myrtaceae', new)) %>%
mutate(new=gsub('Nachtkerze fru dreispaltig', 'Onagraceae', new)) %>%
mutate(new=gsub('Neobartsia crenoloba', 'Bartsia crenoloba', new)) %>%
mutate(new=gsub('None$', 'Nonea', new)) %>%
mutate(new=gsub('Ocos adenophylla', 'Symplocos adenophylla', new)) %>%
mutate(new=gsub('Officinale subsp. group', 'Taraxacum officinale s.l.', new)) %>%
mutate(new=gsub('Orch$', 'Orchidaceae', new)) %>%
mutate(new=gsub('^Orchid\\b', 'Orchidaceae', new)) %>% # whole word only, so 'Orchidaceae' and 'Orchidee' are left intact
mutate(new=replace(new,
list=word(new, 1)=="Orchidee",
values="Orchidaceae")) %>%
mutate(new=replace(new,
list=word(new, 1) %in% c("Papilonacea", "Papilionacea"),
values="Fabaceae")) %>%
mutate(new=gsub('Pasania dodoniifolia', 'Lithocarpus dodonaeifolius', new)) %>%
mutate(new=gsub('Phoebengmoensis', 'Phoebe hungmoensis', new)) %>%
mutate(new=gsub('Picra antid$', 'Picramnia antidesma subsp. fessonia', new)) %>%
mutate(new=gsub('Pinopsida', 'Coniferae', new)) %>%
mutate(new=gsub('Pisonianov.', 'Pisonia', new, fixed=T)) %>%
mutate(new=gsub('Pithecellob ', 'Pithecellobium ', new)) %>%
mutate(new=gsub('Pithecocten', 'Pithecoctenium', new)) %>%
mutate(new=gsub('Pleradenoph longicuspis', 'Pleradenophora longicuspis', new)) %>%
mutate(new=gsub('Pleuranthod ', 'Pleuranthodendron ', new)) %>%
mutate(new=gsub('Poales', 'Poaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1) %in% c("Polygalacea", "Polygalacee"),
values="Polygalaceae")) %>%
mutate(new=replace(new,
list=word(new, 1) %in% c("Polygonaceae", "Polygonacee"),
values="Polygonaceae")) %>%
mutate(new=gsub('Polygonumlongisetum', 'Polygonum longisetum', new)) %>%
mutate(new=gsub('Posoq coria subsp. maxima', 'Posoqueria coriacea subsp. maxima', new)) %>%
mutate(new=gsub('Prosthecidi ', 'Prosthecidiscus ', new)) %>%
mutate(new=gsub('Pseudo bidens', '', new)) %>%
mutate(new=replace(new,
list=word(new, 1) %in%
c("Pseudobriza", "Pseudofingergras",
"Pseudogerste", "Puschelgras", "Quecke",
"Queckenblatt", "Queckengras",
"Roggen/hafer", "Ruchgras", "Silbergras",
"Suessgras"),
values="Poaceae")) %>%
mutate(new=gsub('Ptarmica', 'Achillea', new)) %>%
mutate(new=gsub('Pterost cauline leaves n. gibson & m.n. lyons', 'Pterostegia', new)) %>%
mutate(new=gsub('Quararibeaã‚â guianensis', 'Quararibea guianensis', new)) %>%
mutate(new=gsub('Rainfarn f', 'Asteraceae', new)) %>%
mutate(new=gsub('Ranke ipomoea', 'Ipomoea', new)) %>%
mutate(new=gsub('Ranke rubiaceae', 'Rubiaceae', new)) %>%
mutate(new=gsub('Rauwolfia', 'Rauvolfia', new)) %>%
mutate(new=gsub('Rheinfarn', 'Asteraceae', new)) %>%
mutate(new=gsub('Rhodostemon kunthiana', 'Rhodostemonodaphne kunthiana', new)) %>%
mutate(new=gsub('Riccardia/aneura', 'Bryophyta', new)) %>%
mutate(new=gsub('Rietgras steril 134051a', 'Poaceae', new)) %>%
mutate(new=gsub('Rosenbergio formosum', 'Rosenbergiodendron formosum', new)) %>%
mutate(new=gsub('Rotes puschelgras', 'Poaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Rubiacea",
values="Rubiaceae")) %>%
mutate(new=gsub('Rytidospe goomallin a.g. gunness et al. oakp 10/', 'Rytidosperma', new)) %>%
mutate(new=gsub('Salacia idoensis', 'Salacia', new)) %>%
mutate(new=gsub('Samphire', 'Amaranthaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1) %in%
c("Sauergras", "Schlanksegge", "Sedge",
"Segge", "Simse"),
values="Cyperaceae")) %>%
mutate(new=gsub('Scaev repen subsp. north sandp r.j. cranf & p.j. spenc', 'Scaevola repens', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Schachtelhalm",
values="Equisetaceae")) %>%
mutate(new=replace(new,
list=word(new, 1)=="Schnittlauch",
values="Amaryllidaceae")) %>%
mutate(new=gsub('Schwertlilie trocken', 'Iridaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1) %in% c("Scropholacea", "Scrophulariacea", "Scroph."),
values="Scrophulariaceae")) %>%
mutate(new=gsub('Sitzende onagraceae', 'Onagraceae', new)) %>%
mutate(new=gsub('Sonnenblume', 'Asteraceae', new)) %>%
mutate(new=gsub('Stachelgurke', 'Cucurbitaceae', new)) %>%
mutate(new=gsub('Stark behaarte malve', 'Malvaceae', new)) %>%
mutate(new=gsub('Staude asteraceae bl watteweich f', 'Asteraceae', new)) %>%
mutate(new=gsub('Staude crotalaria unterseite silber', 'Crotalaria', new)) %>%
mutate(new=gsub('Staude solanum', 'Solanaceae', new)) %>%
mutate(new=gsub('Staude tephrosia', 'Tephrosia', new)) %>%
mutate(new=gsub('Stipagrosist panicle gross', 'Stipagrostis', new)) %>%
mutate(new=gsub('Asteraceae u silber', 'Asteraceae', new)) %>%
mutate(new=gsub('Stratonostoc communeá', 'Stratonostoc commune', new)) %>%
mutate(new=gsub('Strauch asteraceae nadelblätt.', 'Asteraceae', new)) %>%
mutate(new=gsub('Strauch blatt wie salix reticulata astera', 'Asteraceae', new)) %>%
mutate(new=gsub('Strauch blatt wie salix reticulata astera 132534b', 'Asteraceae', new)) %>%
mutate(new=gsub('Strauch fabaceae gerieft schote', 'Fabaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Strauch" &
word(new,2)=="Rubiaceae",
values="Rubiaceae")) %>%
mutate(new=gsub('Fabaceae samtig bl lanzettlich', 'Fabaceae', new)) %>%
mutate(new=gsub('Ochna mini', 'Ochna', new)) %>%
mutate(new=gsub('Stryphnoden microstachyum', 'Stryphnodendron microstachyum', new)) %>%
mutate(new=gsub('Sumpfgladiole haarig', 'Gladiolus', new)) %>%
mutate(new=gsub('Sygnum ramphicarpa', 'Scrophulariaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1)=="Symplococar",
values="Symplococarpon")) %>%
mutate(new=gsub('Sysirinchium', 'Sisyrinchium', new)) %>%
mutate(new=gsub('Syzigium accuminatisima', 'Syzygium acuminatissimum', new)) %>%
mutate(new=gsub('Tabernaemon ', 'Tabernaemontana ', new)) %>%
mutate(new=gsub('Thalassodend', 'Thalassodendron', new)) %>%
mutate(new=gsub('Thinouia canescens', 'Thinouia', new)) %>%
mutate(new=gsub('Thistle', 'Asteraceae', new)) %>%
mutate(new=gsub('Trisetumicatum', 'Trisetum spicatum', new)) %>%
mutate(new=gsub('Undetermined sedge', 'Cyperaceae', new)) %>%
mutate(new=replace(new,
list=word(new, 1) %in%
c("Liverwort", "Liverworts", "Moss"),
values="Bryophyta")) %>%
mutate(new=gsub('Vismi bacci', 'Vismia baccifera subsp. ferruginea', new)) %>%
mutate(new=gsub('Weidenr”schen', 'Onagraceae', new)) %>%
mutate(new=gsub('Weißpelziger brauner Spross Asteracea', 'Asteraceae', new)) %>%
mutate(new=gsub('Wie stipagrostis', 'Poaceae', new)) %>%
mutate(new=gsub('Wincassia', 'Fabaceae', new)) %>%
mutate(new=gsub('xDactyloden st-quintini', 'Dactylodenia st-quintinii', new)) %>%
mutate(new=gsub('Zizyphus sp1 IUCN1', 'Zizyphus', new)) %>%
mutate(new=gsub('Zwiebel Lilaceae steril', 'Liliaceae', new)) %>%
mutate(new=gsub('Zwstr faurea', 'Faurea', new))
# delete remaining records of mushroom species
tnrs.submit.iter2 <- tnrs.submit.iter2 %>%
filter(!word(new,1) %in% mushroom)
......@@ -534,7 +908,7 @@ tnrs.submit.iter2 <- tnrs.submit.iter2 %>%
tnrs.submit.iter2 <- tnrs.submit.iter2 %>%
na.omit() %>%
group_by(old) %>%
mutate(family.lev=str_extract(word(new,1), pattern='([^\\s]+acea)')) %>%
mutate(family.lev=str_extract(word(new,1), pattern='([^\\s]+aceae)')) %>%
mutate(new=ifelse(is.na(family.lev), new, family.lev)) %>%
dplyr::select(-family.lev) %>%
ungroup()
......@@ -880,7 +1254,7 @@ Backbone <- spec.list.TRY.sPlot %>%
left_join(tnrs.res.certain %>%
bind_rows(tnrs.res.iter2.certain) %>%
bind_rows(tnrs.ncbi.certain) %>%
#reformat TPL output to tnrs output
#reformat TPL output to tnrs output
bind_rows(tpl.ncbi.certain %>%
rename(Name_submitted=Taxon,
Name_matched_url=ID,
......@@ -951,7 +1325,7 @@ summary(Backbone$Rank_correct)
Copy family info for taxa resolved at family level
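Why the family pattern has to end in `aceae` can be checked on a few toy names. This is a minimal sketch: the vector `taxa` is invented for illustration, and only `word()` and `str_extract()` from `stringr` are used.

```r
library(stringr)

# Invented examples: a bare family name, a family name with a suffix, a species
taxa <- c("Poaceae", "Fabaceae sp", "Acer campestre")
first.word <- word(taxa, 1)

# Pattern ending in 'acea' stops one letter short of the family name
old.pattern <- str_extract(first.word, pattern = "([^\\s]+acea)")

# Pattern ending in 'aceae' returns the complete family name
new.pattern <- str_extract(first.word, pattern = "([^\\s]+aceae)")

old.pattern  # "Poacea"  "Fabacea"  NA
new.pattern  # "Poaceae" "Fabaceae" NA
```

Names that are not family names yield `NA`, which is what allows the `ifelse(is.na(family.lev), ...)` step to leave them unchanged.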
```{r}
Backbone <- Backbone %>%
mutate(family.lev=str_extract(word(Name_correct,1), pattern='([^\\s]+acea)')) %>%
mutate(family.lev=str_extract(word(Name_correct,1), pattern='([^\\s]+aceae)')) %>%
mutate(Family_correct=ifelse(!is.na(Accepted_name_family),
Accepted_name_family,
family.lev)) %>%
......@@ -993,7 +1367,7 @@ sum(is.na(Backbone$Family_correct))
### Resolve genera with missing family info with `TNRS`
```{r}
```{r, eval=F}
Genera_submit <- Backbone %>%
filter(is.na(Family_correct)) %>%
......@@ -1105,6 +1479,14 @@ table(Backbone$is_vascular_species, exclude=NULL)
save(Backbone, file="../_output/Backbone3.0.RData")
```
## Export species list to request in TRY
```{r}
Backbone %>%
filter(grepl(pattern = "S", sPlot_TRY)) # grepl() returns one TRUE/FALSE per row, as filter() requires
```
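A possible way to turn such a selection into a file is sketched below. It is an assumption-laden illustration, not the export used for the TRY request: `Backbone.toy`, `species.for.TRY` and the output path are hypothetical, and the sketch simply keeps rows whose `sPlot_TRY` code contains "S".

```r
library(dplyr)
library(readr)

# Toy backbone; column names follow the document, values are invented
Backbone.toy <- data.frame(
  Name_short = c("Acer campestre", "Quercus robur", "Fagus sylvatica", NA),
  sPlot_TRY  = c("S", "ST", "T", "S"),
  stringsAsFactors = FALSE
)

species.for.TRY <- Backbone.toy %>%
  filter(grepl("S", sPlot_TRY, fixed = TRUE)) %>%  # one logical value per row
  filter(!is.na(Name_short)) %>%                   # drop unresolved names
  distinct(Name_short) %>%                         # one line per taxon
  arrange(Name_short)

# Hypothetical output location; a temporary directory keeps the sketch side-effect free
out.file <- file.path(tempdir(), "species_list_for_TRY.csv")
write_csv(species.for.TRY, out.file)
```

Writing only the distinct, non-missing names keeps the exported list free of duplicates that arise when several submitted names resolve to the same taxon.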
# Statistics
## Statistics for backbone combining names in `sPlot3.0` and `TRY5.0`
......@@ -1157,7 +1539,7 @@ kable((table(Backbone$Taxonomic_status, exclude=NULL)), caption = "Number of (st
**Total number of unique standardized taxon names and families:**
```{r, eval = T}
length(unique(Backbone$name_short_correct))-1 # minus 1 for NA
length(unique(Backbone$Name_short))-1 # minus 1 for NA
length(unique(Backbone$Family_correct))-1 # minus 1 for NA
```
......@@ -1190,14 +1572,12 @@ entries per resolved name. (Only first 20 shown") %>%
### Based on `unique` standardized names
Generate version of the backbone that only includes the unique resolved names in `name.short.correct`, and for the non-unique names, the first rows of duplicated name:
Generate a version of the backbone that includes only the unique resolved names in `Name_short`; for non-unique names, only the first row of each duplicated name is kept:
```{r, eval = T}
Backbone.uni <- Backbone %>%
distinct(Name_short, .keep_all = T) %>%
filter(!is.na(Name_short))
nrow(Backbone.uni)
```
There are `r nrow(Backbone.uni)` unique taxon names in the backbone.
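The de-duplication and the counting can be illustrated on toy data. This is a minimal sketch with an invented data frame `Backbone.toy`; it also shows why a single column should be counted with `length()` rather than `nrow()`.

```r
library(dplyr)

# Toy backbone: two submitted names resolve to the same taxon, one is unresolved
Backbone.toy <- data.frame(
  Name_submitted = c("Acer campestris", "Acer campestre", "Quercus robur", "Unknown 1"),
  Name_short     = c("Acer campestre", "Acer campestre", "Quercus robur", NA),
  stringsAsFactors = FALSE
)

Backbone.toy.uni <- Backbone.toy %>%
  distinct(Name_short, .keep_all = TRUE) %>%  # keeps the first row of each duplicated name
  filter(!is.na(Name_short))

n.rows  <- nrow(Backbone.toy.uni)               # rows of the data frame
n.names <- length(Backbone.toy.uni$Name_short)  # elements of the column vector
no.dim  <- nrow(Backbone.toy.uni$Name_short)    # NULL: a plain vector has no dimensions
```

Because `nrow()` returns `NULL` for a vector, an inline expression built on it prints nothing, whereas `length()` gives the intended count.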
......@@ -1212,10 +1592,10 @@ Backbone.uni.vasc <- Backbone.uni %>%
**Now, run the stats for unique resolved names (excluding non-vascular and non-matching taxa):**
```{r, eval = T}
nrow(Backbone.uni.vasc$Name_short)
length(Backbone.uni.vasc$Name_short)
```
There are `r nrow(Backbone.uni.vasc$name.short.correct)` unique (vascular plant) taxon names:
There are `r length(Backbone.uni.vasc$Name_short)` unique (vascular plant) taxon names:
```{r, eval = T, echo=F}
kable((table(Backbone.uni.vasc$sPlot_TRY)), caption = "Number of (standardized) vascular plant taxon names unique to, and shared between, TRY (S), sPlot (T) and the Alpine (A) dataset.") %>%
......@@ -1243,8 +1623,8 @@ kable((table(Backbone.uni.vasc$Status_correct, exclude=NULL)), caption = "Number
**Total number of unique standardized taxon names and families:**
```{r, eval = T}
length(unique(Backbone.uni.vasc$name_short))-1 # minus 1 for NA
length(unique(Backbone.uni.vasc$family_correct))-1
length(unique(Backbone.uni.vasc$Name_short))-1 # minus 1 for NA
length(unique(Backbone.uni.vasc$Family_correct))-1
```
## Stats for the corrected names in `sPlot` only:
......@@ -1284,7 +1664,7 @@ kable((table(Backbone.uni.sPlot$Status_correct, exclude=NULL)), caption = "Numbe
**Number of families in sPlot**:
```{r, eval = T}
nrow(unique(Backbone$Family.correct))
length(unique(Backbone$Family_correct))
```
**Done!**
......@@ -1325,4 +1705,4 @@ write_csv(toCheck_manual, path="../_derived/TPL/toCheck_Manual.csv")
user@local $ ssh user@idiv-gateway.ufz.de
\ No newline at end of file
user@local $ ssh user@idiv-gateway.ufz.de