MEMO: What should be done with Layer when it is consistently zero within a plot? Change it to NA? And what should be done when Layer==0 in a plot where layer information is otherwise available? TODO: add an explanation of the fields.
Timestamp: Fri Mar 13 02:52:06 2020
Drafted: Francesco Maria Sabatini
Revised:
version: 1.0
This report documents the construction of the DT table for sPlot 3.0. It is based on dataset sPlot_3.0.2, received on 24/07/2019 from Stephan Hennekens.
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(readr)
library(xlsx)
library(knitr)
library(kableExtra)
#save temporary files
write("TMPDIR = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('TMPDIR'), '.Renviron'))
write("R_USER = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('R_USER'), '.Renviron'))
#rasterOptions(tmpdir="/data/sPlot/users/Francesco/_tmp")
Search and replace unclosed quotation marks and escape them. Run in Linux terminal
# escape all double quotation marks. Run in Linux terminal
# sed 's/"/\\"/g' sPlot_3_0_2_species.csv > sPlot_3_0_2_species_test.csv
DT table is the species x plot matrix, in long format.
DT0 <- readr::read_delim("../sPlot_data_export/sPlot_3_0_2_species_test.csv",
delim="\t",
col_types = cols(
PlotObservationID = col_double(),
Taxonomy = col_character(),
`Taxon group` = col_character(),
`Taxon group ID` = col_double(),
`Turboveg2 concept` = col_character(),
`Matched concept` = col_character(),
Match = col_double(),
Layer = col_double(),
`Cover %` = col_double(),
`Cover code` = col_character(),
x_ = col_double()
)
)
nplots <- length(unique(DT0$PlotObservationID))
nspecies <- length(unique(DT0$`Matched concept`))
Species data include 43103312 species * plot records, across 1978589 plots. Before taxonomic resolution, there are 107676 species.
PlotObservationID | Taxonomy | Taxon group | Taxon group ID | Turboveg2 concept | Matched concept | Match | Layer | Cover % | Cover code | x_ |
---|---|---|---|---|---|---|---|---|---|---|
354857 | NO-Europe_lenoir | Vascular plant | 1 | Agrostis capillaris | Agrostis capillaris | 3 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Ammophila arenaria | Ammophila arenaria | 1 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Moss | 3 | Bryophyta species | Bryophyta species | 1 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Carex arenaria | Carex arenaria | 1 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Festuca rubra subsp. arenaria | Festuca arenaria | 3 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Galium verum | Galium verum | 3 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Linaria vulgaris | Linaria vulgaris | 3 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Poa pratensis subsp. pratensis | Poa pratensis subsp. pratensis | 3 | 0 | 1 | x | NA |
354857 | NO-Europe_lenoir | Vascular plant | 1 | Salix repens subsp. repens var. argentea | Salix repens subsp. repens | 2 | 0 | 1 | x | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Agrostis rupestris | Agrostis rupestris | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Avenella flexuosa | Avenella flexuosa | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Avenula versicolor | Helictochloa versicolor | 3 | 6 | 38 | 3 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Calamagrostis villosa | Calamagrostis villosa | 3 | 6 | 2 | | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Campanula alpina | Campanula alpina | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Carex sempervirens | Carex sempervirens | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Lichen | 4 | Cetraria islandica | Cetraria islandica | 1 | 9 | 13 | 2 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Empetrum nigrum | Empetrum nigrum | 3 | 6 | 2 | | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Geum montanum | Geum montanum | 3 | 6 | 2 | | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Hieracium alpinum | Hieracium alpinum | 3 | 6 | 2 | | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Homogyne alpina | Homogyne alpina | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Hypochaeris uniflora | Hypochaeris uniflora | 3 | 6 | 13 | 2 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Persicaria vivipara | Bistorta vivipara | 3 | 6 | 1 | r | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Potentilla aurea | Potentilla aurea | 3 | 6 | 2 | | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Pulsatilla alpina subsp. alba auct. sudet. & carpat. | Pulsatilla alpina subsp. alba | 3 | 6 | 1 | r | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Vaccinium gaultherioides | Vaccinium uliginosum | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Vaccinium myrtillus | Vaccinium myrtillus | 3 | 6 | 3 | 1 | NA |
1462431 | CS-Czechia_slovakia_2015 | Vascular plant | 1 | Vaccinium vitis-idaea | Vaccinium vitis-idaea | 3 | 6 | 2 | | NA |
1585163 | RU-Russia | Vascular plant | 1 | Betula pubescens | Betula pubescens | 3 | 1 | 38 | 3 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Betula pubescens | Betula pubescens | 3 | 2 | 38 | 3 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Calamagrostis arundinacea | Calamagrostis arundinacea | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Carex rhizina | Carex pediformis subsp. rhizodes | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Chelidonium majus | Chelidonium majus | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Convallaria majalis | Convallaria majalis | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Dryopteris carthusiana | Dryopteris carthusiana | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Galium mollugo | Galium mollugo | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Hypericum perforatum | Hypericum perforatum | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Melampyrum pratense | Melampyrum pratense | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Moehringia trinervia | Moehringia trinervia | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Oxalis acetosella | Oxalis acetosella | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Picea obovata | Picea obovata | 3 | 1 | 68 | 4 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Picea obovata | Picea obovata | 3 | 2 | 68 | 4 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Pinus sylvestris | Pinus sylvestris | 3 | 1 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Pinus sylvestris | Pinus sylvestris | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Rubus idaeus | Rubus idaeus | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Rumex acetosella | Rumex acetosella | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Sambucus racemosa | Sambucus racemosa | 3 | 4 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Solidago virgaurea | Solidago virgaurea | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Sorbus aucuparia | Sorbus aucuparia | 3 | 4 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Stellaria media | Stellaria media | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Vaccinium myrtillus | Vaccinium myrtillus | 3 | 6 | 3 | 1 | NA |
1585163 | RU-Russia | Vascular plant | 1 | Veronica officinalis | Veronica officinalis | 3 | 6 | 3 | 1 | NA |
Import taxonomic backbone
load("../_output/Backbone3.0.RData")
Match to DT0, using the Matched concept field as matching key. This is the field that was used to build, and resolve, the Backbone.
DT1 <- DT0 %>%
left_join(Backbone %>%
dplyr::select(Name_sPlot_TRY, Name_short, `Taxon group`, Rank_correct) %>%
rename(`Matched concept`=Name_sPlot_TRY,
Taxongroup_BB=`Taxon group`),
by="Matched concept") %>%
# Simplify Rank_correct
mutate(Rank_correct=fct_collapse(Rank_correct,
lower=c("subspecies", "variety", "infraspecies", "race", "forma"))) %>%
mutate(Rank_correct=fct_explicit_na(Rank_correct, "No_match")) %>%
mutate(Name_short=replace(Name_short,
list=Name_short=="No suitable",
values=NA))
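As a minimal sketch of what the join above does, the toy example below uses two small made-up tables (`dt_toy` and `bb_toy` are hypothetical, not part of sPlot). It only illustrates that concepts missing from the backbone keep their row but come back with `NA` in `Name_short`.

```r
library(dplyr)

# Hypothetical toy data, for illustration only (not taken from sPlot)
dt_toy <- tibble(
  PlotObservationID = c(1, 1, 2),
  `Matched concept` = c("Avenella flexuosa", "Tulostoma brumale", "Poa annua")
)
bb_toy <- tibble(
  Name_sPlot_TRY = c("Avenella flexuosa", "Poa annua"),
  Name_short     = c("Deschampsia flexuosa", "Poa annua")
)

# Same pattern as above: rename the backbone key, then left_join
joined_toy <- dt_toy %>%
  left_join(bb_toy %>% rename(`Matched concept` = Name_sPlot_TRY),
            by = "Matched concept")

# Concepts without a backbone entry keep their row but get NA in Name_short
unmatched_toy <- joined_toy %>%
  filter(is.na(Name_short)) %>%
  pull(`Matched concept`)
print(unmatched_toy)
```

Because it is a left join, the number of rows in `joined_toy` equals that of `dt_toy` as long as the backbone key is unique.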
Select species entries that changed after taxonomic standardization, as a way to check the backbone.
name.check <- DT1 %>%
dplyr::select(`Turboveg2 concept`:`Matched concept`, Name_short) %>%
rename(Name_TNRS=Name_short) %>%
distinct() %>%
mutate(Matched_short=word(`Matched concept`, start = 1L, end=2L)) %>%
filter(is.na(Name_TNRS) | Matched_short != Name_TNRS) %>%
dplyr::select(-Matched_short) %>%
arrange(Name_TNRS)
Turboveg2 concept | Matched concept | Name_TNRS |
---|---|---|
Cyclotella comta | Cyclotella comta | Cyclotella |
Lespedeza species #2 | Lespedeza species #2 | Lespedeza |
Alopecurus x brachystylus | Alopecurus x brachystylus | Alopecurus × |
Verbascum leptocladum | Verbascum leptocladum | Verbascum glabratum |
Anisomeridium polypori | Anisomeridium nyssaegenum | Anisomeridium nyssigenum |
Klasea boetica subsp. lusitanica | Klasea boetica subsp. lusitanica | Klasea baetica |
Arenaria armeniaca | Arenaria armeniaca | Eremogone armeniaca |
Lauraceae species [FB 87] | Lauraceae species [FB 87] | Lauraceae |
Lobelia species #5 | Lobelia species #5 | Lobelia |
Petrocoptis pyrenaica subsp. pyrenaica | Petrocoptis pyrenaica subsp. pyrenaica | Silene glaucifolia |
Plinia species | Plinia species | Plinia |
Stereocaulon species | Stereocaulon species | Stereocaulon |
Gesneria acuminata | Gesneria acuminata | Gesneria humilis |
Tephrocactus articulatus | Tephrocactus articulatus | Opuntia articulata |
Tulostoma brumale | Tulostoma brumale | NA |
Fontinalis species | Fontinalis species | Fontinalis |
Michelia shiluensis | Michelia shiluensis | Magnolia shiluensis |
Calamus aff. | Calamus aff. | Calamus |
Lespedeza daurica | Lespedeza daurica | Lespedeza davurica |
Arrabidaea truncata | Arrabidaea truncata | Fridericia truncata |
Launaea acanthoclada | Launaea acanthoclada | Launaea lanifera |
Jagera pseudorhus var. pseudorhus | Jagera pseudorhus var. pseudorhus | Cupania pseudorhus |
Alliaria_petiolata species | Alliaria_petiolata species | Alliaria petiolata |
Hevea cf. guianensis | Hevea cf. guianensis | Hevea guianensis |
Riccia beirichiana | Riccia beirichiana | Riccia beyrichiana |
Thesium billardieri | Thesium billardieri | Thesium billardierei |
Hugonia sp. | Hugonia sp. | Hugonia |
Coreopsis falcata | Coreopsis falcata | Coreopsis gladiata |
Vismia species [CMG 3029] | Vismia species [CMG 3029] | Vismia |
Bufonia mauritanica | Bufonia mauritanica | Bufonia perennis |
Check the most common species names from DT after matching to backbone
name.check.freq <- DT1 %>%
dplyr::select(`Turboveg2 concept`:`Matched concept`, Name_short) %>%
rename(Name_TNRS=Name_short) %>%
group_by(`Turboveg2 concept`, `Matched concept`, Name_TNRS) %>%
summarize(n=n()) %>%
mutate(Matched_short=word(`Matched concept`, start = 1L, end=2L)) %>%
filter(is.na(Name_TNRS) | Matched_short != Name_TNRS) %>%
dplyr::select(-Matched_short) %>%
ungroup() %>%
arrange(desc(n))
Turboveg2 concept | Matched concept | Name_TNRS | n |
---|---|---|---|
Deschampsia flexuosa | Avenella flexuosa | Deschampsia flexuosa | 126515 |
Festuca pratensis | Schedonorus pratensis | Festuca pratensis | 84008 |
Elymus repens | Elytrigia repens | Elymus repens | 82891 |
Phalaris arundinacea | Phalaroides arundinacea | Phalaris arundinacea | 75296 |
Bryophyta species | Bryophyta species | NA | 74393 |
Poa annua | Ochlopoa annua | Poa annua | 67460 |
Potentilla anserina | Argentina anserina | Potentilla anserina | 63786 |
Taraxacum sect. Ruderalia | Taraxacum sect. Taraxacum | Taraxacum | 58429 |
Taraxacum species | Taraxacum species | Taraxacum | 57167 |
Cornus sanguinea | Cornus sanguinea | Cornus controversa | 52651 |
Elytrigia repens | Elytrigia repens | Elymus repens | 51670 |
Taraxacum officinale | Taraxacum sect. Taraxacum | Taraxacum | 50502 |
Weinmannia racemosa | Weinmannia racemosa | Leiospermum racemosum | 38269 |
Bromus erectus | Bromopsis erecta | Bromus erectus | 33765 |
Cladonia species | Cladonia species | Cladonia | 32464 |
Avenella flexuosa | Avenella flexuosa | Deschampsia flexuosa | 30787 |
Rubus sect. Rubus | Rubus sect. Rubus | Rubus | 28684 |
Festuca arundinacea | Schedonorus arundinaceus | Festuca arundinacea | 26124 |
Trientalis europaea | Trientalis europaea | Lysimachia europaea | 25940 |
Rubus fruticosus aggr. | Rubus fruticosus aggr. | Rubus vestitus | 23669 |
Glaux maritima | Glaux maritima | Lysimachia maritima | 23306 |
Taraxacum officinale aggr. | Taraxacum sect. Taraxacum | Taraxacum | 22837 |
Rubus species | Rubus species | Rubus | 22098 |
Festuca gigantea | Schedonorus giganteus | Festuca gigantea | 20917 |
Taraxacum sectie Ruderalia | Taraxacum sect. Taraxacum | Taraxacum | 20888 |
Lophozonia menziesii | Lophozonia menziesii | Lophozonia | 20249 |
Juncus gerardi | Juncus gerardi | Juncus gerardii | 19094 |
Sphagnum species | Sphagnum species | Sphagnum | 18293 |
Festuca rupicola | Festuca stricta subsp. sulcata | Festuca rupicola | 18010 |
Rosa species | Rosa species | Rosa | 16657 |
Podocarpus laetus | Podocarpus laetus | Podocarpus spinulosus | 16356 |
Bromus tectorum | Anisantha tectorum | Bromus tectorum | 16305 |
Carex species | Carex species | Carex | 15744 |
Ripogonum scandens | Ripogonum scandens | Rhipogonum | 14984 |
Rubus hirtus | Rubus hirtus aggr. | Rubus proiectus | 14191 |
Avenula pubescens | Avenula pubescens | Helictotrichon pubescens | 13490 |
Notogrammitis billardierei | Notogrammitis billardierei | NA | 13117 |
Crataegus species | Crataegus species | Crataegus | 13072 |
Helictotrichon pubescens | Avenula pubescens | Helictotrichon pubescens | 12941 |
Erophila verna | Draba verna | Erophila verna | 12646 |
taxon group
Taxon group information is only available for 35708898 entries, but absent for 7394414. To improve the completeness of this field, we derive additional info from the Backbone, and merge it with the data already present in DT.
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga Lichen Moss Mushroom Stonewort
## 9497 324080 2035007 513 12166
## Unknown Vascular plant
## 7394414 33327635
DT1 <- DT1 %>%
mutate(`Taxon group`=ifelse(`Taxon group`=="Unknown", NA, `Taxon group`)) %>%
mutate(Taxongroup_BB=ifelse(Taxongroup_BB=="Unknown", NA, Taxongroup_BB)) %>%
mutate(`Taxon group`=coalesce(`Taxon group`, Taxongroup_BB)) %>%
dplyr::select(-Taxongroup_BB)
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga Lichen Moss Mushroom Stonewort
## 9991 366997 2090994 513 12166
## Vascular plant <NA>
## 40532073 90578
Those taxa for which a measure of Basal Area exists can be safely assumed to belong to vascular plants.
DT1 <- DT1 %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=`Cover code`=="x_BA",
values="Vascular plant"))
Cross-complement Taxon group information: whenever a taxon is assigned to a group in at least one record, assign it to that same group throughout the DT table.
DT1 <- DT1 %>%
left_join(DT1 %>%
filter(!is.na(Name_short)) %>%
filter(`Taxon group` != "Unknown") %>%
dplyr::select(Name_short, `Taxon group`) %>%
distinct(Name_short, .keep_all=T) %>%
rename(TaxonGroup_compl=`Taxon group`),
by="Name_short") %>%
mutate(`Taxon group`=coalesce(`Taxon group`, TaxonGroup_compl)) %>%
dplyr::select(-TaxonGroup_compl)
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga Lichen Moss Mushroom Stonewort
## 9994 367586 2100627 513 12166
## Vascular plant <NA>
## 40533651 78775
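A small sketch of the cross-complementing step on made-up records (`cc_toy` and `lookup_toy` are hypothetical names). It follows the same join-and-coalesce pattern, under the assumption that the first group found for a taxon is the one to propagate.

```r
library(dplyr)

# Hypothetical toy records: the same taxon is classified in one row only
cc_toy <- tibble(
  Name_short    = c("Cladonia", "Cladonia", "Sphagnum"),
  `Taxon group` = c("Lichen", NA, NA)
)

# One known group per taxon (first occurrence wins if there are several)
lookup_toy <- cc_toy %>%
  filter(!is.na(Name_short), !is.na(`Taxon group`)) %>%
  distinct(Name_short, .keep_all = TRUE) %>%
  rename(TaxonGroup_compl = `Taxon group`)

cc_filled <- cc_toy %>%
  left_join(lookup_toy, by = "Name_short") %>%
  mutate(`Taxon group` = coalesce(`Taxon group`, TaxonGroup_compl)) %>%
  select(-TaxonGroup_compl)

print(cc_filled)
```

Taxa that are never classified anywhere (here the single Sphagnum record) remain `NA`, which is why some records are still unassigned after this step.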
Check species with conflicting Taxon group information and fix manually.
#check for conflicts in attribution of genera to Taxon groups
DT1 %>%
filter(!is.na(Name_short)) %>%
filter(!is.na(`Taxon group`)) %>%
distinct(Name_short, `Taxon group`) %>%
mutate(Genus=word(Name_short,1)) %>%
dplyr::select(Genus, `Taxon group`) %>%
distinct() %>%
group_by(Genus) %>%
summarize(n=n()) %>%
filter(n>1) %>%
arrange(desc(n))
## # A tibble: 16 x 2
## Genus n
## <chr> <int>
## 1 Brachytheciastrum 2
## 2 Brachythecium 2
## 3 Chara 2
## 4 Characeae 2
## 5 Hepatica 2
## 6 Hypericum 2
## 7 Hypnum 2
## 8 Lamprothamnus 2
## 9 Leptorhaphis 2
## 10 Lychnothamnus 2
## 11 Nitella 2
## 12 Oxymitra 2
## 13 Pancovia 2
## 14 Peltaria 2
## 15 Tonina 2
## 16 Zygodon 2
Manually fix some known problems in Taxon group attribution. Some lists of taxa (e.g., lichen.genera, mushroom.genera) were defined when building the Backbone.
#Attach genus info
DT1 <- DT1 %>%
left_join(Backbone %>%
dplyr::select(Name_sPlot_TRY, Name_short) %>%
mutate(Genus=word(Name_short, 1, 1)) %>%
dplyr::select(-Name_short) %>%
rename(`Matched concept`=Name_sPlot_TRY),
by="Matched concept") %>%
mutate(`Taxon group`=fct_collapse(`Taxon group`,
Alga_Stonewort=c("Alga", "Stonewort")))
#manually fix some known problems
mosses.gen <- c("Hypnum", "Brachytheciastrum","Brachythecium",
"Zygodon", "Oxymitra", "Bryophyta", "Musci", '\\\"Moos\\\"')
vascular.gen <- c("Polystichum", "Hypericum", "Peltaria", "Pancovia", "Calythrix", "Ripogonum",
"Notogrammitis", "Fuscospora", "Lophozonia", "Rostellularia",
"Hesperostipa", "Microsorium", "Angiosperm","Dicotyledonae", "Spermatophy")
alga.gen <- c("Chara", "Characeae", "Tonina", "Nostoc", "Entermorpha", "Hydrocoleum" )
DT1 <- DT1 %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mosses.gen,
values="Moss")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% vascular.gen,
values="Vascular plant")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% alga.gen,
values="Alga_Stonewort")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% c(lichen.genera, "Lichenes"),
values="Lichen")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mushroom,
values="Mushroom"))
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga_Stonewort Lichen Moss Mushroom Vascular plant
## 23071 367587 2100767 513 40535429
## <NA>
## 75945
Delete all records of fungi, and use lists of genera to fix additional problems. While in the previous round the matching was done on the resolved Genus name, here the match is based on unresolved Genus names.
DT1 <- DT1 %>%
dplyr::select(-Genus) %>%
left_join(DT1 %>%
distinct(`Matched concept`) %>%
mutate(Genus=word(`Matched concept`, 1)),
by="Matched concept") %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mushroom,
values = "Mushroom")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% lichen.genera,
values="Lichen")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mosses.gen,
values="Moss")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% vascular.gen,
values="Vascular plant")) %>%
mutate(`Taxon group` = fct_explicit_na(`Taxon group`, "Unknown")) %>%
filter(`Taxon group`!="Mushroom") %>%
mutate(`Taxon group`=factor(`Taxon group`))
#dplyr::select(-Genus)
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga_Stonewort Lichen Moss Vascular plant Unknown
## 23071 367933 2103429 40572721 35721
After cross-checking all sources of information, the number of records without Taxon group information decreased to 35721.
Species abundance information varies across datasets and plots. While for the large majority of plots abundance values are returned as percentage cover, there is a subset where abundance is returned with different scales. These are marked in the column Cover code as follows:
x_BA - Basal Area
x_IC - Individual count
x_SC - Stem count
x_IV - Relative Importance
x_RF - Relative Frequency
x - Presence absence
It is not intuitive, however, that when Cover code belongs to one of the classes above, the actual abundance value is stored in the x_ column. This stems from the way these data are stored in TURBOVEG.
To make the cover data more user friendly, I simplify the way cover is stored, so that there are only two columns:
Ab_scale - to report the type of scale used
Abundance - to coalesce the cover\abundance values previously in the columns Cover % and x_.
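The two-column design can be sketched on a few made-up records (`ab_toy` and `special` are hypothetical names). The example assumes, as in the rest of this report, that a non-cover scale is only trusted when a value is actually present in x_.

```r
library(dplyr)

# Hypothetical toy records, for illustration only
ab_toy <- tibble(
  `Cover %`    = c(38, NA, 1, NA),
  `Cover code` = c("3", "x_BA", "x", "x_SC"),
  x_           = c(NA, 12.5, NA, NA)
)

# Cover codes that signal a non-cover abundance scale
special <- c("x_BA", "x_IC", "x_SC", "x_IV", "x_RF")

ab_toy <- ab_toy %>%
  mutate(
    # scale type: the special code if a value exists in x_, else percentage cover
    Ab_scale  = ifelse(`Cover code` %in% special & !is.na(x_), `Cover code`, "CoverPerc"),
    # abundance value: taken from x_ for special scales, from Cover % otherwise
    Abundance = ifelse(Ab_scale %in% special, x_, `Cover %`)
  )

print(ab_toy)
```

Note the last record: its code is x_SC but x_ is empty, so it falls back to CoverPerc and ends up with a missing Abundance.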
# Create Ab_scale field
DT1 <- DT1 %>%
mutate(Ab_scale = ifelse(`Cover code` %in%
c("x_BA", "x_IC", "x_SC", "x_IV", "x_RF") & !is.na(x_),
`Cover code`,
"CoverPerc"))
Fix some errors. There are some plots where all species have zeros in the field Cover %. Some of them are marked as p\a (Cover code=="x"), but others are not. Consider all these plots as presence\absence and transform Cover % to 1.
!! There are some other plots having layers with all zeros. These should be double-checked, but are not transformed here !!
allzeroes <- DT1 %>%
group_by(PlotObservationID) %>%
summarize(allzero=all(`Cover %`==0) ) %>%
filter(allzero==T) %>%
pull(PlotObservationID)
DT1 <- DT1 %>%
mutate(`Cover %`=replace(`Cover %`,
list=(PlotObservationID %in% allzeroes),
values=1)) %>%
mutate(`Cover code`=replace(`Cover code`,
list=(PlotObservationID %in% allzeroes),
values="x"))
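The all-zero rule can be checked on a made-up example (`zero_toy` is hypothetical). This sketch uses `ifelse()` instead of `replace()`, which should give the same result here.

```r
library(dplyr)

# Hypothetical toy records: plot 1 has only zeros, plot 2 has one real cover value
zero_toy <- tibble(
  PlotObservationID = c(1, 1, 2, 2),
  `Cover %`         = c(0, 0, 0, 5),
  `Cover code`      = c("x", "x", "1", "1")
)

# Plots where every record has zero cover
allzero_ids <- zero_toy %>%
  group_by(PlotObservationID) %>%
  summarize(allzero = all(`Cover %` == 0)) %>%
  filter(allzero) %>%
  pull(PlotObservationID)

# Recode those plots as presence/absence
zero_fixed <- zero_toy %>%
  mutate(`Cover %`    = ifelse(PlotObservationID %in% allzero_ids, 1, `Cover %`),
         `Cover code` = ifelse(PlotObservationID %in% allzero_ids, "x", `Cover code`))

print(zero_fixed)
```

Only plot 1 is recoded; the single zero in plot 2 is left untouched because the plot as a whole is not all-zero.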
Consider all plot-layer combinations where Cover code=="x" and all entries of the field Cover % equal 1 as presence\absence data, and set Ab_scale to "pa". This is done to avoid confusion with plots where Cover code=="x" but "x" is meant as a class in the cover scale used. For p\a plots, replace the field Cover % with NA, and assign the value 1 to the field x_.
#plots with at least one entry in Cover code=="x"
sel <- DT1 %>%
filter(`Cover code`=="x") %>%
distinct(PlotObservationID) %>%
pull(PlotObservationID)
DT1 <- DT1 %>%
left_join(DT1 %>%
filter(PlotObservationID %in% sel) %>%
group_by(PlotObservationID, Layer) %>%
mutate(to.pa= all(`Cover %`==1 & `Cover code`=="x")) %>%
distinct(PlotObservationID, Layer, to.pa),
by=c("PlotObservationID", "Layer")) %>%
replace_na(list(to.pa=F)) %>%
mutate(Ab_scale=ifelse(to.pa==T, "pa", Ab_scale)) %>%
mutate(`Cover %`=ifelse(to.pa==T, NA, `Cover %`)) %>%
mutate(x_=ifelse(to.pa==T, 1, x_)) %>%
dplyr::select(-to.pa)
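A minimal sketch of the plot-layer flag on made-up records (`pa_toy` is hypothetical). It uses `summarize()` rather than `mutate()` followed by `distinct()`, which should yield the same one-row-per-layer lookup.

```r
library(dplyr)

# Hypothetical toy records: in plot 2 the code "x" sits next to a real cover class
pa_toy <- tibble(
  PlotObservationID = c(1, 1, 2, 2),
  Layer             = c(0, 0, 6, 6),
  `Cover %`         = c(1, 1, 1, 38),
  `Cover code`      = c("x", "x", "x", "3")
)

# A plot-layer is presence/absence only if ALL its records are x with cover 1
pa_flag <- pa_toy %>%
  group_by(PlotObservationID, Layer) %>%
  summarize(to.pa = all(`Cover %` == 1 & `Cover code` == "x")) %>%
  ungroup()

print(pa_flag)
```

Plot 1 is flagged as p\a, while plot 2 is not, since there "x" coexists with another cover class in the same layer.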
There are also some plots having different cover scales in the same layer. They are not many, and I will reduce their cover value to p\a.
Find these plots first:
mixed <- DT1 %>%
distinct(PlotObservationID, Ab_scale, Layer) %>%
group_by(PlotObservationID, Layer) %>%
summarize(n=n()) %>%
filter(n>1) %>%
pull(PlotObservationID) %>%
unique()
length(mixed)
## [1] 335
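To make the definition of a mixed plot concrete, here is the same search run on made-up records (`mix_toy` is hypothetical). A plot counts as mixed only when two scales occur within one layer, not when different layers use different scales.

```r
library(dplyr)

# Hypothetical toy records, for illustration only
mix_toy <- tibble(
  PlotObservationID = c(1, 1, 2, 2),
  Layer             = c(1, 1, 1, 6),
  Ab_scale          = c("CoverPerc", "x_BA", "CoverPerc", "pa")
)

mixed_ids <- mix_toy %>%
  distinct(PlotObservationID, Layer, Ab_scale) %>%
  group_by(PlotObservationID, Layer) %>%
  summarize(n = n()) %>%      # number of distinct scales per plot-layer
  filter(n > 1) %>%
  pull(PlotObservationID) %>%
  unique()

print(mixed_ids)
```

Plot 1 is returned because its layer 1 holds two scales; plot 2 is not, which is consistent with the Multiple_scales class reported later for plots whose layers differ.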
Transform these plots to p\a and correct field Ab_scale. Note: the column Abundance is only created here.
DT1 <- DT1 %>%
mutate(Ab_scale=replace(Ab_scale,
list=PlotObservationID %in% mixed,
values="mixed")) %>%
mutate(`Cover %`=replace(`Cover %`,
list=Ab_scale=="mixed",
values=NA)) %>%
mutate(x_=replace(x_, list=Ab_scale=="mixed", values=1)) %>%
mutate(Ab_scale=replace(Ab_scale, list=Ab_scale=="mixed", values="pa")) %>%
#Create additional field Abundance to avoid overwriting original data
mutate(Abundance =ifelse(Ab_scale %in% c("x_BA", "x_IC", "x_SC", "x_IV", "x_RF", "pa"),
x_, `Cover %`)) %>%
mutate(Abundance=replace(Abundance,
list=PlotObservationID %in% mixed,
values=1))
Double check and summarize Ab_scales
scale_check <- DT1 %>%
distinct(PlotObservationID, Layer, Ab_scale) %>%
group_by(PlotObservationID) %>%
summarise(Ab_scale_combined=ifelse(length(unique(Ab_scale))==1,
unique(Ab_scale),
"Multiple_scales"))
nrow(scale_check)== length(unique(DT1$PlotObservationID))
## [1] TRUE
table(scale_check$Ab_scale_combined)
##
## CoverPerc Multiple_scales pa x_BA x_IC
## 1691454 2084 271057 6293 2092
## x_IV x_RF x_SC
## 146 585 4878
Transform abundances to relative abundance. For consistency with the previous version of sPlot, this field is called Relative cover.
Watch out - Even plots with p\a information are transformed to relative cover.
DT1 <- DT1 %>%
left_join(x=.,
y={.} %>%
group_by(PlotObservationID) %>%
summarize(tot.abundance=sum(Abundance)),
by=c("PlotObservationID")) %>%
mutate(Relative.cover=Abundance/tot.abundance)
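# check: relative covers are computed per plot, so they should sum to 1 in each plot.
# Note: num.layers below is the sum of the layer codes, not the number of layers,
# so this comparison flags almost every plot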
DT1 %>%
group_by(PlotObservationID) %>%
summarize(tot.cover=sum(Relative.cover),
num.layers=sum(unique(Layer))) %>%
filter(tot.cover != num.layers) %>%
nrow()
## [1] 1958816
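Since Relative.cover is obtained by dividing by the plot total, it should sum to 1 within each plot. The sketch below runs such a check on made-up data (`rc_toy` is hypothetical) and compares against 1 with a small tolerance rather than exact equality. The tolerance value is an assumption.

```r
library(dplyr)

# Hypothetical toy records, for illustration only
rc_toy <- tibble(
  PlotObservationID = c(1, 1, 1, 2, 2),
  Layer             = c(1, 6, 6, 6, 6),
  Abundance         = c(68, 3, 3, 1, 1)
)

# Relative cover = abundance divided by the total abundance of the plot
rc_toy <- rc_toy %>%
  group_by(PlotObservationID) %>%
  mutate(Relative.cover = Abundance / sum(Abundance)) %>%
  ungroup()

# Count plots whose relative covers do not sum to 1 (within tolerance)
n_bad <- rc_toy %>%
  group_by(PlotObservationID) %>%
  summarize(tot.cover = sum(Relative.cover)) %>%
  filter(abs(tot.cover - 1) > 1e-8) %>%
  nrow()

print(n_bad)
```

Plots containing a missing Abundance would give `NA` totals and be silently dropped by `filter()`, so they would need to be counted separately.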
DT2 <- DT1 %>%
dplyr::select(PlotObservationID, Name_short, `Turboveg2 concept`, Rank_correct, `Taxon group`, Layer:x_, Ab_scale, Abundance, Relative.cover ) %>%
rename(species_original=`Turboveg2 concept`,
species=Name_short,
taxon_group=`Taxon group`,
cover_perc=`Cover %`,
cover_code=`Cover code`)
The output of the DT table contains 43102875 records, over 1978589 plots. The total number of taxa is 116256 and 76912, before and after standardization, respectively. Information on the Taxon group is available for 76548 standardized species.
PlotObservationID | species | species_original | Rank_correct | taxon_group | Layer | cover_perc | cover_code | x_ | Ab_scale | Abundance | Relative.cover |
---|---|---|---|---|---|---|---|---|---|---|---|
354857 | Agrostis capillaris | Agrostis capillaris | species | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Ammophila arenaria | Ammophila arenaria | species | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | NA | Bryophyta species | higher | Moss | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Carex arenaria | Carex arenaria | species | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Festuca vaginata | Festuca rubra subsp. arenaria | species | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Galium verum | Galium verum | species | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Linaria vulgaris | Linaria vulgaris | species | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Poa pratensis | Poa pratensis subsp. pratensis | lower | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
354857 | Salix | Salix repens subsp. repens var. argentea | genus | Vascular plant | 0 | NA | x | 1 | pa | 1 | 0.1111111 |
1462431 | Agrostis rupestris | Agrostis rupestris | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Deschampsia flexuosa | Avenella flexuosa | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Helictochloa versicolor | Avenula versicolor | species | Vascular plant | 6 | 38 | 3 | NA | CoverPerc | 38 | 0.3838384 |
1462431 | Calamagrostis villosa | Calamagrostis villosa | species | Vascular plant | 6 | 2 | | NA | CoverPerc | 2 | 0.0202020 |
1462431 | Campanula alpina | Campanula alpina | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Carex sempervirens | Carex sempervirens | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Cetraria islandica | Cetraria islandica | species | Lichen | 9 | 13 | 2 | NA | CoverPerc | 13 | 0.1313131 |
1462431 | Empetrum nigrum | Empetrum nigrum | species | Vascular plant | 6 | 2 | | NA | CoverPerc | 2 | 0.0202020 |
1462431 | Geum montanum | Geum montanum | species | Vascular plant | 6 | 2 | | NA | CoverPerc | 2 | 0.0202020 |
1462431 | Hieracium alpinum | Hieracium alpinum | species | Vascular plant | 6 | 2 | | NA | CoverPerc | 2 | 0.0202020 |
1462431 | Homogyne alpina | Homogyne alpina | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Hypochaeris uniflora | Hypochaeris uniflora | species | Vascular plant | 6 | 13 | 2 | NA | CoverPerc | 13 | 0.1313131 |
1462431 | Persicaria vivipara | Persicaria vivipara | species | Vascular plant | 6 | 1 | r | NA | CoverPerc | 1 | 0.0101010 |
1462431 | Potentilla aurea | Potentilla aurea | species | Vascular plant | 6 | 2 | | NA | CoverPerc | 2 | 0.0202020 |
1462431 | Anemone scherfelii | Pulsatilla alpina subsp. alba auct. sudet. & carpat. | lower | Vascular plant | 6 | 1 | r | NA | CoverPerc | 1 | 0.0101010 |
1462431 | Vaccinium uliginosum | Vaccinium gaultherioides | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Vaccinium myrtillus | Vaccinium myrtillus | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0303030 |
1462431 | Vaccinium vitis-idaea | Vaccinium vitis-idaea | species | Vascular plant | 6 | 2 | | NA | CoverPerc | 2 | 0.0202020 |
1585163 | Betula pubescens | Betula pubescens | species | Vascular plant | 1 | 38 | 3 | NA | CoverPerc | 38 | 0.1397059 |
1585163 | Betula pubescens | Betula pubescens | species | Vascular plant | 2 | 38 | 3 | NA | CoverPerc | 38 | 0.1397059 |
1585163 | Calamagrostis arundinacea | Calamagrostis arundinacea | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Carex rhizina | Carex rhizina | lower | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Chelidonium majus | Chelidonium majus | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Convallaria majalis | Convallaria majalis | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Dryopteris carthusiana | Dryopteris carthusiana | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Galium mollugo | Galium mollugo | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Hypericum perforatum | Hypericum perforatum | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Melampyrum pratense | Melampyrum pratense | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Moehringia trinervia | Moehringia trinervia | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Oxalis acetosella | Oxalis acetosella | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Picea obovata | Picea obovata | species | Vascular plant | 1 | 68 | 4 | NA | CoverPerc | 68 | 0.2500000 |
1585163 | Picea obovata | Picea obovata | species | Vascular plant | 2 | 68 | 4 | NA | CoverPerc | 68 | 0.2500000 |
1585163 | Pinus sylvestris | Pinus sylvestris | species | Vascular plant | 1 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Pinus sylvestris | Pinus sylvestris | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Rubus idaeus | Rubus idaeus | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Rumex acetosella | Rumex acetosella | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Sambucus racemosa | Sambucus racemosa | species | Vascular plant | 4 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Solidago virgaurea | Solidago virgaurea | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Sorbus aucuparia | Sorbus aucuparia | species | Vascular plant | 4 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Stellaria media | Stellaria media | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Vaccinium myrtillus | Vaccinium myrtillus | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
1585163 | Veronica officinalis | Veronica officinalis | species | Vascular plant | 6 | 3 | 1 | NA | CoverPerc | 3 | 0.0110294 |
save(DT2, file = "../_output/DT_sPlot3.0.RData")
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
## [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] kableExtra_1.1.0 knitr_1.28 xlsx_0.6.3 forcats_0.5.0
## [5] stringr_1.4.0 dplyr_0.8.5 purrr_0.3.3 readr_1.3.1
## [9] tidyr_1.0.2 tibble_2.1.3 ggplot2_3.3.0 tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.0.0 xfun_0.12 rJava_0.9-11 haven_2.2.0
## [5] lattice_0.20-40 colorspace_1.4-1 vctrs_0.2.3 generics_0.0.2
## [9] viridisLite_0.3.0 htmltools_0.4.0 yaml_2.2.1 utf8_1.1.4
## [13] rlang_0.4.4 pillar_1.4.2 glue_1.3.1 withr_2.1.2
## [17] DBI_1.1.0 dbplyr_1.4.2 modelr_0.1.6 readxl_1.3.1
## [21] lifecycle_0.2.0 munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0
## [25] rvest_0.3.5 evaluate_0.14 xlsxjars_0.6.1 fansi_0.4.1
## [29] highr_0.8 broom_0.5.5 Rcpp_1.0.3 scales_1.1.0
## [33] backports_1.1.5 webshot_0.5.2 jsonlite_1.6.1 fs_1.3.2
## [37] hms_0.5.3 digest_0.6.23 stringi_1.4.6 grid_3.6.3
## [41] cli_2.0.2 tools_3.6.3 magrittr_1.5 crayon_1.3.4
## [45] pkgconfig_2.0.3 ellipsis_0.3.0 xml2_1.2.2 reprex_0.3.0
## [49] lubridate_1.7.4 assertthat_0.2.1 rmarkdown_2.1 httr_1.4.1
## [53] rstudioapi_0.11 R6_2.4.1 nlme_3.1-145 compiler_3.6.3