Timestamp: Thu Mar 25 09:23:03 2021
Drafted: Francesco Maria Sabatini
Revised: Helge Bruelheide
Version: 1.1
This report documents the construction of the DT table for sPlot 3.0. It is based on dataset sPlot_3.0.2, received on 24/07/2019 from Stephan Hennekens.
Caution: Layer information is not available for all species in each plot. In case of missing information Layer is set to zero.
Changes in version 1.1
1) Added explanation of fields
2) Fixed `taxon_group` of Friesodielsia
3) Only export the fields `Ab_scale` and `Abundance`
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(readr)
library(xlsx)
library(knitr)
library(kableExtra)
#save temporary files
write("TMPDIR = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('TMPDIR'), '.Renviron'))
write("R_USER = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('R_USER'), '.Renviron'))
#rasterOptions(tmpdir="/data/sPlot/users/Francesco/_tmp")
Search and replace unclosed quotation marks and escape them. Run in Linux terminal
# escape all double quotation marks. Run in Linux terminal
# sed 's/"/\\"/g' sPlot_3_0_2_species.csv > sPlot_3_0_2_species_test.csv
The DT table is the species × plot matrix, in long format.
DT0 <- readr::read_delim("../sPlot_data_export/sPlot_3_0_2_species_test.csv",
delim="\t",
col_types = cols(
PlotObservationID = col_double(),
Taxonomy = col_character(),
`Taxon group` = col_character(),
`Taxon group ID` = col_double(),
`Turboveg2 concept` = col_character(),
`Matched concept` = col_character(),
Match = col_double(),
Layer = col_double(),
`Cover %` = col_double(),
`Cover code` = col_character(),
x_ = col_double()
)
)
Match plots with those in header
load("../_output/header_sPlot3.0.RData")
DT0 <- DT0 %>%
filter(PlotObservationID %in% unique(header$PlotObservationID))
nplots <- length(unique(DT0$PlotObservationID))
nspecies <- length(unique(DT0$`Matched concept`))
# Plots in header but not in DT
empty.plots <- header %>%
filter(!PlotObservationID %in% unique(DT0$PlotObservationID)) %>%
pull(PlotObservationID)
The DT table includes 43093474 species × plot records, across 1977540 plots. Before taxonomic resolution, there are 107676 species. There are 97 plots that appear in `header` but have no records in DT. These are plots where the only species reported in Turboveg 3 are not identified (and not in the taxonomic list). Should these be deleted from `header`?
PlotObservationID | Taxonomy | Taxon group | Taxon group ID | Turboveg2 concept | Matched concept | Match | Layer | Cover % | Cover code | x_ |
---|---|---|---|---|---|---|---|---|---|---|
532404 | EU-Europe | Vascular plant | 1 | Amaranthus lividus | Amaranthus blitum | 3 | 6 | 13.0 | 2 | NA |
532404 | EU-Europe | Vascular plant | 1 | Amaranthus powellii | Amaranthus powellii | 3 | 6 | 3.0 | 1 | NA |
532404 | EU-Europe | Vascular plant | 1 | Amaranthus retroflexus | Amaranthus retroflexus | 3 | 6 | 13.0 | 2 | NA |
532404 | EU-Europe | Vascular plant | 1 | Brassica rapa | Brassica rapa | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Calystegia sepium | Calystegia sepium | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Capsella bursa-pastoris | Capsella bursa-pastoris | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Chamomilla recutita | Matricaria chamomilla | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Chenopodium album | Chenopodium album | 3 | 6 | 3.0 | 1 | NA |
532404 | EU-Europe | Vascular plant | 1 | Chenopodium ficifolium | Chenopodium ficifolium | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Chenopodium polyspermum | Lipandra polysperma | 3 | 6 | 3.0 | 1 | NA |
532404 | EU-Europe | Vascular plant | 1 | Cirsium arvense | Cirsium arvense | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Convolvulus arvensis | Convolvulus arvensis | 3 | 6 | 13.0 | 2 | NA |
532404 | EU-Europe | Vascular plant | 1 | Digitaria sanguinalis | Digitaria sanguinalis | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Echinochloa crus-galli | Echinochloa crus-galli | 3 | 6 | 3.0 | 1 | NA |
532404 | EU-Europe | Vascular plant | 1 | Galinsoga ciliata | Galinsoga quadriradiata | 3 | 6 | 3.0 | 1 | NA |
532404 | EU-Europe | Vascular plant | 1 | Galinsoga parviflora | Galinsoga parviflora | 3 | 6 | 38.0 | 3 | NA |
532404 | EU-Europe | Vascular plant | 1 | Geranium dissectum | Geranium dissectum | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Lamium purpureum | Lamium purpureum | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Lolium perenne | Lolium perenne | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Phacelia tanacetifolia | Phacelia tanacetifolia | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Polygonum lapathifolium | Persicaria lapathifolia | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Polygonum persicaria | Persicaria maculosa | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Setaria pumila | Setaria pumila | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Stachys arvensis | Stachys arvensis | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Stellaria media | Stellaria media | 3 | 6 | 3.0 | 1 | NA |
532404 | EU-Europe | Vascular plant | 1 | Taraxacum officinale | Taraxacum sect. Taraxacum | 3 | 6 | 2.0 | | NA |
532404 | EU-Europe | Vascular plant | 1 | Veronica persica | Veronica persica | 3 | 6 | 2.0 | | NA |
1648095 | RU-Russia | Vascular plant | 1 | Acer campestre | Acer campestre | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Acer platanoides | Acer platanoides | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Acer pseudoplatanus | Acer pseudoplatanus | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Aegonychon purpureocaeruleum | Aegonychon purpurocaeruleum | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Anemonoides nemorosa | Anemone nemorosa | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Anemonoides ranunculoides | Anemone ranunculoides | 3 | 6 | 10.0 | 10 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Asarum europaeum | Asarum europaeum | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Brachypodium sylvaticum | Brachypodium sylvaticum | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Moss | 3 | Brachythecium velutinum | Brachytheciastrum velutinum | 1 | 9 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Campanula rapunculoides | Campanula rapunculoides | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Carex contigua | Carex spicata | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Cerasus avium | Prunus avium | 3 | 1 | 12.0 | 12 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Cerasus avium | Prunus avium | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Cornus mas | Cornus mas | 3 | 4 | 13.0 | 13 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Cornus mas | Cornus mas | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Corylus avellana | Corylus avellana | 3 | 4 | 13.0 | 13 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Corylus avellana | Corylus avellana | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Crataegus curvisepala | Crataegus rhipidophylla | 3 | 4 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Euonymus europaea | Euonymus europaeus | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Euonymus verrucosa | Euonymus verrucosus | 3 | 4 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Euonymus verrucosa | Euonymus verrucosus | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Fraxinus excelsior | Fraxinus excelsior | 3 | 1 | 57.0 | 57 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Fraxinus excelsior | Fraxinus excelsior | 3 | 4 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Fraxinus excelsior | Fraxinus excelsior | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Gagea lutea | Gagea lutea | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Galeobdolon luteum | Lamium galeobdolon subsp. galeobdolon | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Lilium martagon | Lilium martagon | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Malus sylvestris | Malus sylvestris | 3 | 4 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Polygonatum multiflorum | Polygonatum multiflorum | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Primula veris | Primula veris | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Pulmonaria obscura | Pulmonaria obscura | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Quercus robur | Quercus robur | 3 | 1 | 8.0 | 8 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Quercus robur | Quercus robur | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Scilla bifolia | Scilla bifolia | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Viburnum lantana | Viburnum lantana | 3 | 4 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Viburnum lantana | Viburnum lantana | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Viola hirta | Viola hirta | 3 | 6 | 0.1 | .1 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Viola mirabilis | Viola mirabilis | 3 | 6 | 6.0 | 6 | NA |
1648095 | RU-Russia | Vascular plant | 1 | Viola odorata | Viola odorata | 3 | 6 | 20.0 | 20 | NA |
1839189 | NSW_Austalia | Unknown | 0 | Acacia aneura | Acacia aneura | 0 | 0 | 1.0 | x | NA |
Import taxonomic backbone
load("../_output/Backbone3.0.RData")
Match to DT0, using `Matched concept` as matching key. This is the field that was used to build, and resolve, the Backbone.
DT1 <- DT0 %>%
left_join(Backbone %>%
dplyr::select(Name_sPlot_TRY, Name_short, `Taxon group`, Rank_correct) %>%
rename(`Matched concept`=Name_sPlot_TRY,
Taxongroup_BB=`Taxon group`),
by="Matched concept") %>%
# Simplify Rank_correct
mutate(Rank_correct=fct_collapse(Rank_correct,
lower=c("subspecies", "variety", "infraspecies", "race", "forma"))) %>%
mutate(Rank_correct=fct_explicit_na(Rank_correct, "No_match")) %>%
mutate(Name_short=replace(Name_short,
list=Name_short=="No suitable",
values=NA))
Select species entries that changed after taxonomic standardization, as a way to check the backbone.
name.check <- DT1 %>%
dplyr::select(`Turboveg2 concept`:`Matched concept`, Name_short) %>%
rename(Name_TNRS=Name_short) %>%
distinct() %>%
mutate(Matched_short=word(`Matched concept`, start = 1L, end=2L)) %>%
filter(is.na(Name_TNRS) | Matched_short != Name_TNRS) %>%
dplyr::select(-Matched_short) %>%
arrange(Name_TNRS)
Turboveg2 concept | Matched concept | Name_TNRS |
---|---|---|
Lomatium species | Lomatium species | Lomatium |
Angelica_atropurpurea species | Angelica_atropurpurea species | Angelica atropurpurea |
Artemisia scoparlia | Artemisia scoparlia | Artemisia scoparia |
Alternanthera denticulata | Alternanthera denticulata | Alternanthera sessilis |
Eryngium x chevalieri | Eryngium x chevalieri | Eryngium × |
Anvillea radiata | Anvillea radiata | Anvillea garcinii |
Manilkara surinamensis | Manilkara surinamensis | Manilkara bidentata |
Acacia penninervis var. longiracemosa | Acacia penninervis var. longiracemosa | Racosperma penninerve |
Anthurium species #7 | Anthurium species #7 | Anthurium |
Plagiothecium succul | Plagiothecium succul | Plagiothecium |
Neslia species | Neslia species | Neslia |
Deplanchea species [M1] | Deplanchea species [M1] | Deplanchea |
Elsholtzia stauntoni | Elsholtzia stauntoni | Elsholtzia stauntonii |
Scrophulariaceae 2915 | Scrophulariaceae 2915 | Scrophulariaceae |
Melanthera aspera | Melanthera aspera | Melanthera nivea |
Lauraceae species [NPZ 5081] | Lauraceae species [NPZ 5081] | Lauraceae |
Betula pendula x pubesens | Betula x aurata | Betula aurata |
Algae (spp) | Algae (spp) | NA |
Celastrus_orbiculatus species | Celastrus_orbiculatus species | Celastrus orbiculatus |
Platysace sp. Eneabba (R. Hnatiuk 770001) | Platysace sp. Eneabba (R. Hnatiuk 770001) | Platysace |
[AT469 Indigofera] | [AT469 Indigofera] | Indigofera |
Scabiosa simplex | Lomelosia simplex | Scabiosa stellata |
Calamagrostis laguroides | Calamagrostis laguroides | Calamagrostis anthoxanthoides |
Guettarda species | Guettarda species | Guettarda |
Philodendron urbanianum | Philodendron urbanianum | Philodendron consanguineum |
Connarus species #1 | Connarus species #1 | Connarus |
Dactylorhiza baltica | Dactylorhiza majalis subsp. baltica | Dactylorhiza baltica |
Lonchocarpu michelianus | Lonchocarpu michelianus | Lonchocarpus michelianus |
Aglaia elliptifolia | Aglaia elliptifolia | Aglaia rimosa |
Eupatorium recurvans | Eupatorium recurvans | Eupatorium mohrii |
Check the most common species names from DT after matching to backbone
name.check.freq <- DT1 %>%
dplyr::select(`Turboveg2 concept`:`Matched concept`, Name_short) %>%
rename(Name_TNRS=Name_short) %>%
group_by(`Turboveg2 concept`, `Matched concept`, Name_TNRS) %>%
summarize(n=n()) %>%
mutate(Matched_short=word(`Matched concept`, start = 1L, end=2L)) %>%
filter(is.na(Name_TNRS) | Matched_short != Name_TNRS) %>%
dplyr::select(-Matched_short) %>%
ungroup() %>%
arrange(desc(n))
## `summarise()` has grouped output by 'Turboveg2 concept', 'Matched concept'. You can override using the `.groups` argument.
Turboveg2 concept | Matched concept | Name_TNRS | n |
---|---|---|---|
Deschampsia flexuosa | Avenella flexuosa | Deschampsia flexuosa | 126514 |
Festuca pratensis | Schedonorus pratensis | Festuca pratensis | 84008 |
Elymus repens | Elytrigia repens | Elymus repens | 82891 |
Phalaris arundinacea | Phalaroides arundinacea | Phalaris arundinacea | 75296 |
Bryophyta species | Bryophyta species | NA | 74393 |
Poa annua | Ochlopoa annua | Poa annua | 67460 |
Potentilla anserina | Argentina anserina | Potentilla anserina | 63786 |
Taraxacum sect. Ruderalia | Taraxacum sect. Taraxacum | Taraxacum | 58429 |
Taraxacum species | Taraxacum species | Taraxacum | 57167 |
Cornus sanguinea | Cornus sanguinea | Cornus controversa | 52651 |
Elytrigia repens | Elytrigia repens | Elymus repens | 51670 |
Taraxacum officinale | Taraxacum sect. Taraxacum | Taraxacum | 50502 |
Weinmannia racemosa | Weinmannia racemosa | Leiospermum racemosum | 38269 |
Bromus erectus | Bromopsis erecta | Bromus erectus | 33765 |
Cladonia species | Cladonia species | Cladonia | 32464 |
Avenella flexuosa | Avenella flexuosa | Deschampsia flexuosa | 30787 |
Rubus sect. Rubus | Rubus sect. Rubus | Rubus | 28684 |
Festuca arundinacea | Schedonorus arundinaceus | Festuca arundinacea | 26124 |
Trientalis europaea | Trientalis europaea | Lysimachia europaea | 25940 |
Rubus fruticosus aggr. | Rubus fruticosus aggr. | Rubus vestitus | 23669 |
Glaux maritima | Glaux maritima | Lysimachia maritima | 23305 |
Taraxacum officinale aggr. | Taraxacum sect. Taraxacum | Taraxacum | 22837 |
Rubus species | Rubus species | Rubus | 22098 |
Festuca gigantea | Schedonorus giganteus | Festuca gigantea | 20917 |
Taraxacum sectie Ruderalia | Taraxacum sect. Taraxacum | Taraxacum | 20888 |
Lophozonia menziesii | Lophozonia menziesii | Lophozonia | 20249 |
Juncus gerardi | Juncus gerardi | Juncus gerardii | 19094 |
Sphagnum species | Sphagnum species | Sphagnum | 18293 |
Festuca rupicola | Festuca stricta subsp. sulcata | Festuca rupicola | 18010 |
Rosa species | Rosa species | Rosa | 16657 |
Podocarpus laetus | Podocarpus laetus | Podocarpus spinulosus | 16356 |
Bromus tectorum | Anisantha tectorum | Bromus tectorum | 16302 |
Carex species | Carex species | Carex | 15744 |
Ripogonum scandens | Ripogonum scandens | Rhipogonum | 14984 |
Rubus hirtus | Rubus hirtus aggr. | Rubus proiectus | 14191 |
Avenula pubescens | Avenula pubescens | Helictotrichon pubescens | 13490 |
Notogrammitis billardierei | Notogrammitis billardierei | NA | 13117 |
Crataegus species | Crataegus species | Crataegus | 13072 |
Helictotrichon pubescens | Avenula pubescens | Helictotrichon pubescens | 12941 |
Erophila verna | Draba verna | Erophila verna | 12646 |
taxon group
`Taxon group` information is only available for 35699079 entries, but absent for 7394395. To improve the completeness of this field, we derive additional info from the `Backbone`, and merge it with the data already present in `DT`.
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga Lichen Moss Mushroom Stonewort
## 9497 324002 2034938 513 12166
## Unknown Vascular plant
## 7394395 33317963
DT1 <- DT1 %>%
mutate(`Taxon group`=ifelse(`Taxon group`=="Unknown", NA, `Taxon group`)) %>%
mutate(Taxongroup_BB=ifelse(Taxongroup_BB=="Unknown", NA, Taxongroup_BB)) %>%
mutate(`Taxon group`=coalesce(`Taxon group`, Taxongroup_BB)) %>%
dplyr::select(-Taxongroup_BB)
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga Lichen Moss Mushroom Stonewort
## 9991 366919 2090925 513 12166
## Vascular plant <NA>
## 40522355 90605
Taxa for which a measure of Basal Area exists can be safely assumed to be vascular plants.
DT1 <- DT1 %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=`Cover code`=="x_BA",
values="Vascular plant"))
Cross-complement `Taxon group` information. This means that, whenever a taxon is assigned to a group in one record, the same taxon is assigned to that group throughout the `DT` table.
DT1 <- DT1 %>%
left_join(DT1 %>%
filter(!is.na(Name_short)) %>%
filter(`Taxon group` != "Unknown") %>%
dplyr::select(Name_short, `Taxon group`) %>%
distinct(Name_short, .keep_all=T) %>%
rename(TaxonGroup_compl=`Taxon group`),
by="Name_short") %>%
mutate(`Taxon group`=coalesce(`Taxon group`, TaxonGroup_compl)) %>%
dplyr::select(-TaxonGroup_compl)
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga Lichen Moss Mushroom Stonewort
## 9994 367508 2100558 513 12193
## Vascular plant <NA>
## 40523933 78775
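The cross-complementing step above can be illustrated on a toy table. This is only a sketch: the three records are hypothetical and not taken from sPlot, and the lookup is built in a separate object (`lookup`, a name introduced here) rather than inline in the join.

```r
library(dplyr)

# Hypothetical toy records: the same taxon appears with and without a group
toy <- tibble::tibble(
  Name_short    = c("Hypnum cupressiforme", "Hypnum cupressiforme", "Quercus robur"),
  `Taxon group` = c("Moss", NA, NA)
)

# One known group per resolved name
lookup <- toy %>%
  filter(!is.na(Name_short), !is.na(`Taxon group`)) %>%
  distinct(Name_short, .keep_all = TRUE) %>%
  rename(TaxonGroup_compl = `Taxon group`)

# Fill missing groups from the lookup; names never seen with a group stay NA
toy_filled <- toy %>%
  left_join(lookup, by = "Name_short") %>%
  mutate(`Taxon group` = coalesce(`Taxon group`, TaxonGroup_compl)) %>%
  select(-TaxonGroup_compl)

toy_filled$`Taxon group`
```

The second record inherits "Moss" from the first, while the taxon that never carries a group remains `NA`.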
Check species with conflicting `Taxon group` information and fix manually.
#check for conflicts in attribution of genera to Taxon groups
DT1 %>%
filter(!is.na(Name_short)) %>%
filter(!is.na(`Taxon group`)) %>%
distinct(Name_short, `Taxon group`) %>%
mutate(Genus=word(Name_short,1)) %>%
dplyr::select(Genus, `Taxon group`) %>%
distinct() %>%
group_by(Genus) %>%
summarize(n=n()) %>%
filter(n>1) %>%
arrange(desc(n))
## # A tibble: 15 x 2
## Genus n
## <chr> <int>
## 1 Brachytheciastrum 2
## 2 Brachythecium 2
## 3 Chara 2
## 4 Characeae 2
## 5 Hepatica 2
## 6 Hypericum 2
## 7 Hypnum 2
## 8 Leptorhaphis 2
## 9 Lychnothamnus 2
## 10 Nitella 2
## 11 Oxymitra 2
## 12 Pancovia 2
## 13 Peltaria 2
## 14 Tonina 2
## 15 Zygodon 2
Manually fix some known problems in `Taxon group` attribution. Some lists of taxa (e.g., `lichen.genera`, `mushroom.genera`) were defined when building the `Backbone`.
#Attach genus info
DT1 <- DT1 %>%
left_join(Backbone %>%
dplyr::select(Name_sPlot_TRY, Name_short) %>%
mutate(Genus=word(Name_short, 1, 1)) %>%
dplyr::select(-Name_short) %>%
rename(`Matched concept`=Name_sPlot_TRY),
by="Matched concept") %>%
mutate(`Taxon group`=fct_collapse(`Taxon group`,
Alga_Stonewort=c("Alga", "Stonewort")))
#manually fix some known problems
mosses.gen <- c("Hypnum", "Brachytheciastrum","Brachythecium","Hypnum",
"Zygodon", "Oxymitra", "Bryophyta", "Musci", '\\\"Moos\\\"')
vascular.gen <- c("Polystichum", "Hypericum", "Peltaria", "Pancovia", "Calythrix", "Ripogonum",
"Notogrammitis", "Fuscospora", "Lophozonia", "Rostellularia",
"Hesperostipa", "Microsorium", "Angiosperm","Dicotyledonae", "Spermatophy",
"Oxymitra", "Friesodielsia")
alga.gen <- c("Chara", "Characeae", "Tonina", "Nostoc", "Entermorpha", "Hydrocoleum" )
DT1 <- DT1 %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mosses.gen,
values="Moss")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% vascular.gen,
values="Vascular plant")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% alga.gen,
values="Alga_Stonewort")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% c(lichen.genera, "Lichenes"),
values="Lichen")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mushroom,
values="Mushroom"))
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga_Stonewort Lichen Moss Mushroom Vascular plant
## 23098 367509 2100635 513 40525774
## <NA>
## 75945
Delete all records of fungi, and use lists of genera to fix additional problems. While in the previous round the matching was done on the resolved Genus name, here the match is based on unresolved Genus names.
DT1 <- DT1 %>%
dplyr::select(-Genus) %>%
left_join(DT1 %>%
distinct(`Matched concept`) %>%
mutate(Genus=word(`Matched concept`, 1)),
by="Matched concept") %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mushroom,
values = "Mushroom")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% lichen.genera,
values="Lichen")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% mosses.gen,
values="Moss")) %>%
mutate(`Taxon group`=replace(`Taxon group`,
list=Genus %in% vascular.gen,
values="Vascular plant")) %>%
mutate(`Taxon group` = fct_explicit_na(`Taxon group`, "Unknown")) %>%
filter(`Taxon group`!="Mushroom") %>%
mutate(`Taxon group`=factor(`Taxon group`))
#dplyr::select(-Genus)
table(DT1$`Taxon group`, exclude=NULL)
##
## Alga_Stonewort Lichen Moss Vascular plant Unknown
## 23098 367855 2103292 40563071 35721
After cross-checking all sources of information, the number of records without `Taxon group` information decreased to 35721.
Species abundance information varies across datasets and plots. While for the large majority of plots abundance values are returned as percentage cover, there is a subset where abundance is returned with different scales. These are marked in the column `Cover code` as follows:
x_BA - Basal Area
x_IC - Individual count
x_SC - Stem count
x_IV - Relative Importance
x_RF - Relative Frequency
x - Presence/absence
Note that, somewhat counterintuitively, when `Cover code` belongs to one of the classes above, the actual abundance value is stored in the `x_` column. This stems from the way the data is stored in `TURBOVEG`.
To make the cover data more user friendly, I simplify the way cover is stored, so that there are only two columns:
`Ab_scale` - to report the type of scale used
`Abundance` - to coalesce the cover/abundance values previously in the columns `Cover %` and `x_`.
# Create Ab_scale field
DT1 <- DT1 %>%
mutate(Ab_scale = ifelse(`Cover code` %in%
c("x_BA", "x_IC", "x_SC", "x_IV", "x_RF") & !is.na(x_),
`Cover code`,
"CoverPerc"))
Fix some errors. There are some plots where all species have zeros in the field `Cover %`. Some of them are marked as p/a (`Cover code=="x"`), but others are not. Consider all these plots as presence/absence and set `Cover %` to 1.
allzeroes <- DT1 %>%
group_by(PlotObservationID) %>%
summarize(allzero=all(`Cover %`==0) ) %>%
filter(allzero==T) %>%
pull(PlotObservationID)
DT1 <- DT1 %>%
mutate(`Cover %`=replace(`Cover %`,
list=(PlotObservationID %in% allzeroes),
values=1)) %>%
mutate(`Cover code`=replace(`Cover code`,
list=(PlotObservationID %in% allzeroes),
values="x"))
Consider all plot-layer combinations where every entry has `Cover code=="x"` and `Cover % == 1` as presence/absence data, and set `Ab_scale` to "pa". This is done to avoid confusion with plots where `Cover code=="x"` but "x" is meant as a class in the cover scale used. For p/a plots, replace the field `Cover %` with NA, and assign the value 1 to the field `x_`.
#plots with at least one entry in Cover code=="x"
sel <- DT1 %>%
filter(`Cover code`=="x") %>%
distinct(PlotObservationID) %>%
pull(PlotObservationID)
DT1 <- DT1 %>%
left_join(DT1 %>%
filter(PlotObservationID %in% sel) %>%
group_by(PlotObservationID, Layer) %>%
mutate(to.pa= all(`Cover %`==1 & `Cover code`=="x")) %>%
distinct(PlotObservationID, Layer, to.pa),
by=c("PlotObservationID", "Layer")) %>%
replace_na(list(to.pa=F)) %>%
mutate(Ab_scale=ifelse(to.pa==T, "pa", Ab_scale)) %>%
mutate(`Cover %`=ifelse(to.pa==T, NA, `Cover %`)) %>%
mutate(x_=ifelse(to.pa==T, 1, x_)) %>%
dplyr::select(-to.pa)
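The plot-layer rule above can be checked on a small example. This is a sketch on hypothetical records (not sPlot data), and it uses `summarize()` to get one flag per plot-layer combination instead of the `mutate()` plus `distinct()` used in the report.

```r
library(dplyr)

# Hypothetical toy records: plot 1 is pure presence/absence,
# plot 2 mixes an "x" entry with a regular cover class in the same layer
toy <- tibble::tibble(
  PlotObservationID = c(1, 1, 2, 2),
  Layer             = c(6, 6, 6, 6),
  `Cover %`         = c(1, 1, 1, 13),
  `Cover code`      = c("x", "x", "x", "2")
)

# A plot-layer combination is flagged only if ALL its entries are x with cover 1
toy_flag <- toy %>%
  group_by(PlotObservationID, Layer) %>%
  summarize(to.pa = all(`Cover %` == 1 & `Cover code` == "x"), .groups = "drop")

toy_flag
```

Only plot 1 is flagged. In plot 2 the "x" entry sits next to a regular cover class, so "x" is treated as part of the cover scale and the layer is left untouched.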
There are also some plots with different cover scales in the same layer. There are only a few of them, and I will reduce their cover values to p/a.
Find these plots first:
mixed <- DT1 %>%
distinct(PlotObservationID, Ab_scale, Layer) %>%
group_by(PlotObservationID, Layer) %>%
summarize(n=n()) %>%
filter(n>1) %>%
distinct(PlotObservationID) %>%
pull(PlotObservationID)
## `summarise()` has grouped output by 'PlotObservationID'. You can override using the `.groups` argument.
length(mixed)
## [1] 335
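What counts as "mixed" here is a plot with more than one scale within a single layer. The sketch below uses hypothetical records and `n_distinct()` (a swapped-in shortcut for the `distinct()` plus `n()` combination above) to show the difference from a plot that uses different scales in different layers.

```r
library(dplyr)

# Hypothetical toy records:
# plot 1 has two scales within layer 1, plot 2 has one scale per layer
toy <- tibble::tibble(
  PlotObservationID = c(1, 1, 2, 2),
  Layer             = c(1, 1, 1, 6),
  Ab_scale          = c("CoverPerc", "x_BA", "CoverPerc", "x_BA")
)

toy_mixed <- toy %>%
  group_by(PlotObservationID, Layer) %>%
  summarize(n_scales = n_distinct(Ab_scale), .groups = "drop") %>%
  filter(n_scales > 1) %>%
  distinct(PlotObservationID) %>%
  pull(PlotObservationID)

toy_mixed
```

Only plot 1 is returned. Plot 2 is not mixed in this sense; plots of that kind are the ones reported as `Multiple_scales` in the summary further down.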
Transform these plots to p/a and correct the field `Ab_scale`. Note: the column `Abundance` is only created here.
DT1 <- DT1 %>%
mutate(Ab_scale=replace(Ab_scale,
list=PlotObservationID %in% mixed,
values="mixed")) %>%
mutate(`Cover %`=replace(`Cover %`,
list=Ab_scale=="mixed",
values=NA)) %>%
mutate(x_=replace(x_, list=Ab_scale=="mixed", values=1)) %>%
mutate(Ab_scale=replace(Ab_scale, list=Ab_scale=="mixed", values="pa")) %>%
#Create additional field Abundance to avoid overwriting original data
mutate(Abundance =ifelse(Ab_scale %in% c("x_BA", "x_IC", "x_SC", "x_IV", "x_RF", "pa"),
x_, `Cover %`)) %>%
mutate(Abundance=replace(Abundance,
list=PlotObservationID %in% mixed,
values=1))
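The resulting two-column layout (`Ab_scale`, `Abundance`) can be illustrated on a few hypothetical records. This is a sketch, not sPlot data: it applies the same two `ifelse()` rules used above and leaves out the p/a and mixed-scale corrections.

```r
library(dplyr)

special <- c("x_BA", "x_IC", "x_SC", "x_IV", "x_RF")

# Hypothetical toy records: a regular cover value, a basal area value,
# and an individual count whose x_ value is missing
toy <- tibble::tibble(
  `Cover %`    = c(13, NA, 5),
  `Cover code` = c("2", "x_BA", "x_IC"),
  x_           = c(NA, 4.2, NA)
)

toy_out <- toy %>%
  # A special scale is kept only when a value is actually present in x_
  mutate(Ab_scale = ifelse(`Cover code` %in% special & !is.na(x_),
                           `Cover code`, "CoverPerc")) %>%
  # Abundance takes x_ for special scales and Cover % otherwise
  mutate(Abundance = ifelse(Ab_scale %in% c(special, "pa"), x_, `Cover %`))

toy_out %>% select(Ab_scale, Abundance)
```

The third record shows the fallback: its `Cover code` is a special scale, but since `x_` is missing it is treated as `CoverPerc` and its abundance is read from `Cover %`.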
Double check and summarize Ab_scales
scale_check <- DT1 %>%
distinct(PlotObservationID, Layer, Ab_scale) %>%
group_by(PlotObservationID) %>%
summarise(Ab_scale_combined=ifelse(length(unique(Ab_scale))==1,
unique(Ab_scale),
"Multiple_scales"))
nrow(scale_check)== length(unique(DT1$PlotObservationID))
## [1] TRUE
table(scale_check$Ab_scale_combined)
##
## CoverPerc Multiple_scales pa x_BA x_IC
## 1690405 2084 271057 6293 2092
## x_IV x_RF x_SC
## 146 585 4878
Transform abundances to relative abundance. For consistency with the previous version of sPlot, this field is called `Relative_cover`.
Watch out: even plots with p/a information are transformed to relative cover.
DT1 <- DT1 %>%
left_join(x=.,
y={.} %>%
group_by(PlotObservationID) %>%
summarize(tot.abundance=sum(Abundance)),
by=c("PlotObservationID")) %>%
mutate(Relative.cover=Abundance/tot.abundance)
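As a worked check of this step, the sketch below recomputes the relative cover of plot 532404 from the abundance values shown in the example tables of this report. It uses `group_by()` plus `mutate()` instead of the self-join above, which should give the same result for this purpose; `toy_rel` is a name introduced here for illustration.

```r
library(dplyr)

# Abundance values of plot 532404, copied in table order from this report
ab <- c(13, 3, 13, 2, 2, 2, 2, 3, 2, 3, 2, 13, 2, 3, 3, 38,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2)

toy_rel <- tibble::tibble(PlotObservationID = 532404, Abundance = ab) %>%
  group_by(PlotObservationID) %>%
  mutate(tot.abundance  = sum(Abundance),
         Relative.cover = Abundance / tot.abundance) %>%
  ungroup()

# Total abundance of the plot, and relative cover of the first record
toy_rel$tot.abundance[1]
toy_rel$Relative.cover[1]
```

The total abundance is 129, so a record with abundance 13 gets 13/129 ≈ 0.1007752, matching the `Relative_cover` value reported for this plot in the output table. Note that `sum()` returns `NA` if any `Abundance` in a plot is missing, in which case all relative covers of that plot are `NA` as well.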
DT2 <- DT1 %>%
dplyr::select(PlotObservationID, Name_short, `Turboveg2 concept`, Rank_correct, `Taxon group`, Layer:x_, Ab_scale, Abundance, Relative.cover ) %>%
rename(Species_original=`Turboveg2 concept`,
Species=Name_short,
Taxon_group=`Taxon group`,
Cover_perc=`Cover %`,
Cover_code=`Cover code`,
Relative_cover=Relative.cover) %>%
## change in Version 1.1.
dplyr::select(-x_, -Cover_perc)
The output of the DT table contains 43093037 records, over 1977540 plots. The total number of taxa is 116256 and 0, before and after standardization, respectively. Information on the `Taxon group` is available for 76548 standardized species.
PlotObservationID | Species | Species_original | Rank_correct | Taxon_group | Layer | Cover_code | Ab_scale | Abundance | Relative_cover |
---|---|---|---|---|---|---|---|---|---|
532404 | Amaranthus blitum | Amaranthus lividus | species | Vascular plant | 6 | 2 | CoverPerc | 13.0 | 0.1007752 |
532404 | Amaranthus powellii | Amaranthus powellii | species | Vascular plant | 6 | 1 | CoverPerc | 3.0 | 0.0232558 |
532404 | Amaranthus retroflexus | Amaranthus retroflexus | species | Vascular plant | 6 | 2 | CoverPerc | 13.0 | 0.1007752 |
532404 | Brassica rapa | Brassica rapa | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Calystegia sepium | Calystegia sepium | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Capsella bursa-pastoris | Capsella bursa-pastoris | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Matricaria chamomilla | Chamomilla recutita | higher | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Chenopodium album | Chenopodium album | species | Vascular plant | 6 | 1 | CoverPerc | 3.0 | 0.0232558 |
532404 | Chenopodium ficifolium | Chenopodium ficifolium | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Lipandra polysperma | Chenopodium polyspermum | species | Vascular plant | 6 | 1 | CoverPerc | 3.0 | 0.0232558 |
532404 | Cirsium arvense | Cirsium arvense | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Convolvulus arvensis | Convolvulus arvensis | species | Vascular plant | 6 | 2 | CoverPerc | 13.0 | 0.1007752 |
532404 | Digitaria sanguinalis | Digitaria sanguinalis | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Echinochloa crus-galli | Echinochloa crus-galli | species | Vascular plant | 6 | 1 | CoverPerc | 3.0 | 0.0232558 |
532404 | Galinsoga quadriradiata | Galinsoga ciliata | species | Vascular plant | 6 | 1 | CoverPerc | 3.0 | 0.0232558 |
532404 | Galinsoga parviflora | Galinsoga parviflora | species | Vascular plant | 6 | 3 | CoverPerc | 38.0 | 0.2945736 |
532404 | Geranium dissectum | Geranium dissectum | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Lamium purpureum | Lamium purpureum | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Lolium perenne | Lolium perenne | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Phacelia tanacetifolia | Phacelia tanacetifolia | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Persicaria lapathifolia | Polygonum lapathifolium | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Persicaria maculosa | Polygonum persicaria | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Setaria pumila | Setaria pumila | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Stachys arvensis | Stachys arvensis | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Stellaria media | Stellaria media | species | Vascular plant | 6 | 1 | CoverPerc | 3.0 | 0.0232558 |
532404 | Taraxacum | Taraxacum officinale | genus | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
532404 | Veronica persica | Veronica persica | species | Vascular plant | 6 | | CoverPerc | 2.0 | 0.0155039 |
1648095 | Acer campestre | Acer campestre | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Acer platanoides | Acer platanoides | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Acer pseudoplatanus | Acer pseudoplatanus | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Buglossoides purpurocaerulea | Aegonychon purpureocaeruleum | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Anemone nemorosa | Anemonoides nemorosa | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Anemone ranunculoides | Anemonoides ranunculoides | species | Vascular plant | 6 | 10 | CoverPerc | 10.0 | 0.0703730 |
1648095 | Asarum europaeum | Asarum europaeum | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Brachypodium sylvaticum | Brachypodium sylvaticum | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Brachytheciastrum velutinum | Brachythecium velutinum | species | Moss | 9 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Campanula rapunculoides | Campanula rapunculoides | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Carex spicata | Carex contigua | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Prunus avium | Cerasus avium | species | Vascular plant | 1 | 12 | CoverPerc | 12.0 | 0.0844476 |
1648095 | Prunus avium | Cerasus avium | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Cornus mas | Cornus mas | species | Vascular plant | 4 | 13 | CoverPerc | 13.0 | 0.0914849 |
1648095 | Cornus mas | Cornus mas | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Corylus avellana | Corylus avellana | species | Vascular plant | 4 | 13 | CoverPerc | 13.0 | 0.0914849 |
1648095 | Corylus avellana | Corylus avellana | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Crataegus rhipidophylla | Crataegus curvisepala | species | Vascular plant | 4 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Euonymus europaeus | Euonymus europaea | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Euonymus verrucosus | Euonymus verrucosa | species | Vascular plant | 4 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Euonymus verrucosus | Euonymus verrucosa | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Fraxinus excelsior | Fraxinus excelsior | species | Vascular plant | 1 | 57 | CoverPerc | 57.0 | 0.4011260 |
1648095 | Fraxinus excelsior | Fraxinus excelsior | species | Vascular plant | 4 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Fraxinus excelsior | Fraxinus excelsior | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Gagea lutea | Gagea lutea | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Lamium galeobdolon | Galeobdolon luteum | lower | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Lilium martagon | Lilium martagon | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Malus sylvestris | Malus sylvestris | species | Vascular plant | 4 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Polygonatum multiflorum | Polygonatum multiflorum | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Primula veris | Primula veris | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Pulmonaria obscura | Pulmonaria obscura | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Quercus robur | Quercus robur | species | Vascular plant | 1 | 8 | CoverPerc | 8.0 | 0.0562984 |
1648095 | Quercus robur | Quercus robur | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Scilla bifolia | Scilla bifolia | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Viburnum lantana | Viburnum lantana | species | Vascular plant | 4 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Viburnum lantana | Viburnum lantana | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Viola hirta | Viola hirta | species | Vascular plant | 6 | .1 | CoverPerc | 0.1 | 0.0007037 |
1648095 | Viola mirabilis | Viola mirabilis | species | Vascular plant | 6 | 6 | CoverPerc | 6.0 | 0.0422238 |
1648095 | Viola odorata | Viola odorata | species | Vascular plant | 6 | 20 | CoverPerc | 20.0 | 0.1407460 |
1839189 | Acacia aneura | Acacia aneura | species | Vascular plant | 0 | x | pa | 1.0 | 1.0000000 |
PlotObservationID - Plot ID, as in header.
Species - Resolved species name, based on taxonomic backbone.
Species_original - Original species name, as provided by data contributor.
Rank_correct - Taxonomic rank at which Species_original was matched.
Taxon_group - Possible entries are: Alga_Stonewort, Lichen, Moss, Vascular plant, Unknown.
Layer - Vegetation layer, as specified in Turboveg: 0: No layer specified, 1: Upper tree layer, 2: Middle tree layer, 3: Lower tree layer, 4: Upper shrub layer, 5: Lower shrub layer, 6: Herb layer, 7: Juvenile, 8: Seedling, 9: Moss layer.
Cover_code - Cover/abundance value in original data, before transformation to percentage cover.
Ab_scale - Abundance scale in original data. Possible values are: CoverPerc: Cover Percentage, pa: Presence absence, x_BA: Basal Area, x_IC: Individual count, x_SC: Stem count, x_IV: Relative Importance, x_RF: Relative Frequency.
Abundance - Abundance value, either as originally recorded or as transformed from the original Cover_code to quantitative values.
Relative_cover - Abundance of each species after being normalized to 1 in each plot.
save(DT2, file = "../_output/DT_sPlot3.0.RData")
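The Relative_cover field can be illustrated with a minimal sketch, assuming the normalization simply divides each Abundance value by the summed Abundance of its plot. This assumption is consistent with the rows shown for plot 1648095, where an Abundance of 10.0 corresponds to a Relative_cover of 0.0703730, implying a plot total of about 142.1. The data frame DT_toy and its values are hypothetical and are not taken from sPlot; the actual code used to build DT2 may differ.

```r
# Toy long-format species x plot table (hypothetical values, not sPlot data)
DT_toy <- data.frame(
  PlotObservationID = c(1, 1, 1, 2),
  Species = c("Fraxinus excelsior", "Viola odorata",
              "Acer campestre", "Acacia aneura"),
  Abundance = c(60, 30, 10, 1)
)

# Assumed normalization: divide each abundance by the total abundance
# of its plot, so that Relative_cover sums to 1 within each plot
plot_total <- ave(DT_toy$Abundance, DT_toy$PlotObservationID, FUN = sum)
DT_toy$Relative_cover <- DT_toy$Abundance / plot_total

print(DT_toy)
```

The same result can be obtained in the tidyverse style used elsewhere in this report by grouping on PlotObservationID with dplyr::group_by() and computing Abundance / sum(Abundance) inside dplyr::mutate(). A plot with a single presence/absence record, such as the toy plot 2, ends up with a Relative_cover of 1.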
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.7 LTS
##
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
## [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] kableExtra_1.3.4 knitr_1.31 xlsx_0.6.5 forcats_0.5.1
## [5] stringr_1.4.0 dplyr_1.0.5 purrr_0.3.4 readr_1.4.0
## [9] tidyr_1.1.3 tibble_3.0.1 ggplot2_3.3.0 tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.1.0 xfun_0.22 bslib_0.2.4 rJava_0.9-13
## [5] haven_2.3.1 colorspace_2.0-0 vctrs_0.3.6 generics_0.1.0
## [9] viridisLite_0.3.0 htmltools_0.5.1.1 yaml_2.2.1 utf8_1.2.1
## [13] rlang_0.4.10 jquerylib_0.1.3 pillar_1.4.3 glue_1.4.2
## [17] withr_2.4.1 DBI_1.1.1 gdtools_0.2.3 dbplyr_2.1.0
## [21] modelr_0.1.6 readxl_1.3.1 lifecycle_1.0.0 munsell_0.5.0
## [25] gtable_0.3.0 cellranger_1.1.0 rvest_1.0.0 evaluate_0.14
## [29] fansi_0.4.2 xlsxjars_0.6.1 highr_0.8 broom_0.7.0
## [33] Rcpp_1.0.5 scales_1.1.1 backports_1.2.1 webshot_0.5.2
## [37] jsonlite_1.7.2 systemfonts_1.0.1 fs_1.5.0 hms_1.0.0
## [41] digest_0.6.25 stringi_1.5.3 grid_3.6.3 cli_2.3.1
## [45] tools_3.6.3 magrittr_2.0.1 sass_0.3.1 crayon_1.4.1
## [49] pkgconfig_2.0.3 ellipsis_0.3.1 xml2_1.3.2 reprex_1.0.0
## [53] lubridate_1.7.10 svglite_1.2.3.2 assertthat_0.2.1 rmarkdown_2.7
## [57] httr_1.4.2 rstudioapi_0.13 R6_2.5.0 compiler_3.6.3