Timestamp: Tue Mar 30 14:35:20 2021
Drafted: Francesco Maria Sabatini
Revised: Helge Bruelheide
version: 1.2

This report documents 1) the construction of Community Weighted Means (CWMs) and Variances (CWVs); and 2) the classification of plots into forest/non-forest based on species growth forms. It complements species composition data from sPlot 3.0 and gap-filled plant functional traits from TRY 5.0, as received from Jens Kattge on Jan 21, 2020.
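
As a point of reference, the two community-level statistics can be sketched on a toy plot. This assumes the usual abundance-weighted definitions (CWM as the cover-weighted mean of a trait, CWV as the cover-weighted variance around it); the report's own implementation may differ in details such as the treatment of species without trait data. All species names and values below are invented.

```r
# Toy plot: three species with relative cover and one trait value each.
# Names and numbers are hypothetical, for illustration only.
cover <- c(sp_a = 0.5, sp_b = 0.3, sp_c = 0.2)
sla   <- c(sp_a = 10,  sp_b = 20,  sp_c = 30)

# Relative abundances, rescaled to sum to 1 within the plot.
p <- cover / sum(cover)

# CWM: abundance-weighted mean of the trait.
cwm <- sum(p * sla)

# CWV: abundance-weighted variance around the CWM.
cwv <- sum(p * (sla - cwm)^2)

print(c(CWM = cwm, CWV = cwv))
```

Rescaling the cover values first keeps the weights valid when only a subset of a plot's species has trait data.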

Changes in version 1.1 - Standardized growth form names in sPlot_traits.
Changes in version 1.2 - Improved match of species to traits; accounted for non-standardized species in CWM completeness.

library(tidyverse)
library(readr)
library(data.table)
library(knitr)
library(kableExtra)
library(stringr)
library(caret)
library(viridis)
#save temporary files
write("TMPDIR = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('TMPDIR'), '.Renviron'))
write("R_USER = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('R_USER'), '.Renviron'))

1 Data import, preparation and cleaning

#load("/data/sPlot/releases/sPlot3.0/DT_sPlot3.0.RData")
#load("/data/sPlot/releases/sPlot3.0/Backbone3.0.RData")
load("../_output/Backbone3.0.RData")
load("../_output/DT_sPlot3.0.RData")

1.1 Import TRY data

# Species, Genus, Family
try.species <- read_csv(
  "../_input/TRY5.0_v1.1/TRY_5_GapFilledData_2020/input_data/hierarchy.info.csv",
  locale = locale(encoding = "latin1")) 
# Original data without gap-filling. With species and trait labels
try.allinfo <- read_csv(
  "../_input/TRY5.0_v1.1/TRY_5_GapFilledData_2020/input_data/traits_x_georef_wide_table.csv", 
  locale = locale(encoding = "latin1"), 
                        col_types = paste0(c("dddccccc",rep("c", 84)), collapse=""))
# Individual-level gap-filled data - order as in try.allinfo
try.individuals0 <- read_csv(
  "../_input/TRY5.0_v1.1/TRY_5_GapFilledData_2020/gapfilled_data/mean_gap_filled_back_transformed.csv", 
  locale = locale(encoding = "latin1"))

There are 609355 individual observations from 52104 distinct (unresolved) species in 7960 distinct (unresolved) genera.

1.2 Attach resolved names from Backbone

try.species.names <- try.allinfo %>% 
  dplyr::select(Species, Genus, GrowthForm) %>% 
  left_join(Backbone %>% 
              dplyr::select(Name_sPlot_TRY, Name_short) %>% 
              rename(Species=Name_sPlot_TRY), 
            by="Species") %>% 
  dplyr::select(Species, Name_short, Genus, GrowthForm)

After attaching resolved names, the TRY data contain information on 50612 species.
Next, check how many of the species in sPlot have trait information available in TRY.

sPlot.species <- DT2 %>% 
  distinct(Species) 

sPlot.in.TRY <- sPlot.species %>% 
  filter(Species %in% (try.species.names %>% 
                                  distinct(Name_short) %>% 
                                  pull(Name_short))) 

Out of the 76912 standardized species names in sPlot 3.0, 29519 (38.4%) also occur in TRY 5.0. This number does not account for matches at the genus level.
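
The overlap check above boils down to a set-membership test. A minimal sketch on invented names (the vectors `splot_species` and `try_resolved` are hypothetical stand-ins for the species column of DT2 and the resolved TRY names):

```r
# Hypothetical stand-ins for the sPlot species list and the resolved TRY names.
splot_species <- c("Quercus robur", "Fagus sylvatica", "Poa annua", "Carex sp.")
try_resolved  <- c("Fagus sylvatica", "Poa annua", "Abies alba", NA)

# TRUE where a sPlot name also occurs among the resolved TRY names.
in_try <- splot_species %in% try_resolved

n_matched <- sum(in_try)
share     <- round(100 * n_matched / length(splot_species), 1)

print(c(matched = n_matched, percent = share))
```

Applied to the counts reported above, the same arithmetic, `round(100 * 29519 / 76912, 1)`, reproduces the 38.4% figure.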

1.3 Create legend of trait names

trait.legend <- data.frame(full=try.allinfo %>% 
                             dplyr::select(starts_with("StdValue_")) %>% 
                             colnames() %>% 
                             gsub("StdValue_", "", .) %>% 
                             sort()) %>%
  mutate(full=as.character(full)) %>% 
  mutate(traitcode=parse_number(full)) %>% 
  arrange(traitcode) %>% 
  dplyr::select(traitcode, everything()) %>% 
  mutate(full=gsub(pattern = "^[0-9]+_", replacement="", full)) %>% 
  mutate(short=c("StemDens", "RootingDepth","LeafC.perdrymass", "LeafN","LeafP",
                 "StemDiam","SeedMass", "Seed.length","LeafThickness","LDMC",
                 "LeafNperArea","LeafDryMass.single","Leaf.delta.15N","SeedGerminationRate",
                 "Seed.num.rep.unit","LeafLength","LeafWidth","LeafCN.ratio","Leaffreshmass",
                 "Stem.cond.dens","Chromosome.n","Chromosome.cDNAcont", 
                 "Disp.unit.leng","StemConduitDiameter","Wood.vessel.length",
                 "WoodFiberLength","SpecificRootLength.fine","SpecificRootLength",
                 "PlantHeight.veg","PlantHeight.generative","LeafArea.leaf.noPet",
                 "LeafArea.leaflet.noPet","LeafArea.leaf.wPet","LeafArea.leaflet.wPet",
                 "LeafArea.leaf.undef","LeafArea.leaflet.undef","LeafArea.undef.undef",
                 "SLA.noPet", "SLA.wPet","SLA.undef", "LeafWaterCont")) %>% 
  ## Add SLA (11) and plant height (18), which are missing from the allinfo file
  bind_rows(data.frame(traitcode=11, 
                       full="Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA)",
                       short="SLA")) %>% 
  bind_rows(data.frame(traitcode=18, 
                       full="Plant height (vegetative + generative)", 
                       short="PlantHeight")) %>%
  arrange(traitcode) %>% 
  #create a column to mark traits for which gap filled data is available.
  mutate(available=paste0("X", traitcode) %in% colnames(try.individuals0))

Legend of traits from TRY
traitcode | full | short | available
4 | Stem specific density (SSD) or wood density (stem dry mass per stem fresh volume)_g/cm3 | StemDens | TRUE
6 | Root rooting depth_m | RootingDepth | TRUE
11 | Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA) | SLA | TRUE
13 | Leaf carbon (C) content per leaf dry mass_mg/g | LeafC.perdrymass | TRUE
14 | Leaf nitrogen (N) content per leaf dry mass_mg/g | LeafN | TRUE
15 | Leaf phosphorus (P) content per leaf dry mass_mg/g | LeafP | TRUE
18 | Plant height (vegetative + generative) | PlantHeight | TRUE
21 | Stem diameter_m | StemDiam | TRUE
26 | Seed dry mass_mg | SeedMass | TRUE
27 | Seed length_mm | Seed.length | TRUE
46 | Leaf thickness_mm | LeafThickness | TRUE
47 | Leaf dry mass per leaf fresh mass (leaf dry matter content, LDMC)_g g-1 | LDMC | TRUE
50 | Leaf nitrogen (N) content per leaf area_g m-2 | LeafNperArea | TRUE
55 | Leaf dry mass (single leaf)_mg | LeafDryMass.single | TRUE
78 | Leaf nitrogen (N) isotope signature (delta 15N)_per mill | Leaf.delta.15N | TRUE
95 | Seed germination rate (germination efficiency)_% | SeedGerminationRate | TRUE
138 | Seed number per reproducton unit_number | Seed.num.rep.unit | TRUE
144 | Leaf length_mm | LeafLength | TRUE
145 | Leaf width_cm | LeafWidth | TRUE
146 | Leaf carbon/nitrogen (C/N) ratio_g/cm3 | LeafCN.ratio | TRUE
163 | Leaf fresh mass_g | Leaffreshmass | TRUE
169 | Stem conduit density (vessels and tracheids)_mm-2 | Stem.cond.dens | TRUE
223 | Species genotype: chromosome number_dimensionless | Chromosome.n | TRUE
224 | Species genotype: chromosome cDNA content_pg | Chromosome.cDNAcont | TRUE
237 | Dispersal unit length_mm | Disp.unit.leng | TRUE
281 | Stem conduit diameter (vessels, tracheids)_micro m | StemConduitDiameter | TRUE
282 | Wood vessel element length; stem conduit (vessel and tracheids) element length_micro m | Wood.vessel.length | TRUE
289 | Wood fiber lengths_micro m | WoodFiberLength | TRUE
614 | Fine root length per fine root dry mass (specific fine root length, SRL)_cm/g | SpecificRootLength.fine | FALSE
1080 | Root length per root dry mass (specific root length, SRL)_cm/g | SpecificRootLength | TRUE
3106 | Plant height vegetative_m | PlantHeight.veg | FALSE
3107 | Plant height generative_m | PlantHeight.generative | FALSE
3108 | Leaf area (in case of compound leaves: leaf, petiole excluded)_mm2 | LeafArea.leaf.noPet | FALSE
3109 | Leaf area (in case of compound leaves: leaflet, petiole excluded)_mm2 | LeafArea.leaflet.noPet | FALSE
3110 | Leaf area (in case of compound leaves: leaf, petiole included)_mm2 | LeafArea.leaf.wPet | FALSE
3111 | Leaf area (in case of compound leaves: leaflet, petiole included)_mm2 | LeafArea.leaflet.wPet | FALSE
3112 | Leaf area (in case of compound leaves: leaf, undefined if petiole in- or excluded)_mm2 | LeafArea.leaf.undef | TRUE
3113 | Leaf area (in case of compound leaves: leaflet, undefined if petiole is in- or excluded)_mm2 | LeafArea.leaflet.undef | TRUE
3114 | Leaf area (in case of compound leaves undefined if leaf or leaflet, undefined if petiole is in- or excluded)_mm2 | LeafArea.undef.undef | TRUE
3115 | Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA): petiole excluded_mm2 mg-1 | SLA.noPet | FALSE
3116 | Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA): petiole included_mm2 mg-1 | SLA.wPet | FALSE
3117 | Leaf area per leaf dry mass (specific leaf area, SLA or 1/LMA): undefined if petiole is in- or excluded_mm2 mg-1 | SLA.undef | FALSE
3120 | Leaf water content per leaf dry mass (not saturated)_g(W)/g(DM) | LeafWaterCont | TRUE
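
The legend is derived by splitting each trait column name into a numeric code and a descriptive label. A minimal base-R sketch of that parsing step, assuming column names of the form `StdValue_<code>_<label>` as implied by the code above; the three names and the `gapfilled_cols` header are made up for illustration:

```r
# Hypothetical column names following the StdValue_<code>_<label> pattern.
cols <- c("StdValue_26_Seed dry mass_mg",
          "StdValue_6_Root rooting depth_m",
          "StdValue_3106_Plant height vegetative_m")
# Hypothetical header of the gap-filled table (X1 is a row id, not a trait).
gapfilled_cols <- c("X1", "X6", "X26")

stripped  <- sub("^StdValue_", "", cols)
traitcode <- as.integer(sub("^([0-9]+)_.*$", "\\1", stripped))  # leading number
full      <- sub("^[0-9]+_", "", stripped)                      # remaining label

legend <- data.frame(traitcode, full, stringsAsFactors = FALSE)
legend <- legend[order(legend$traitcode), ]
rownames(legend) <- NULL

# A trait is available when the gap-filled table has a matching X<code> column.
legend$available <- paste0("X", legend$traitcode) %in% gapfilled_cols
print(legend)
```

Because the short names in the report are assigned by position after sorting on the trait code, the order of that vector has to match the sorted legend exactly.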

Use the trait legend to rename the trait columns of the try.individuals0 data frame.

#create string to rename traits
col.to <- trait.legend %>% 
  filter(available==T) %>% 
  pull(short) 
col.from <- trait.legend %>% 
  filter(available==T) %>% 
  mutate(traitcode=paste0("X", traitcode))  %>% 
  pull(traitcode) 

try.individuals <- try.individuals0 %>% 
              rename_at(col.from, .funs=function(x) col.to)
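
For comparison, the same renaming can be sketched in base R with a name-based lookup, which does not depend on the order of the legend rows. The toy `traits` and `legend` objects are hypothetical; this is an illustration of the idea, not the code used in the report.

```r
# Toy gap-filled table and legend (hypothetical values).
traits <- data.frame(X1 = 1:2, X4 = c(0.5, 0.7), X26 = c(3.1, 0.2))
legend <- data.frame(traitcode = c(26, 4),
                     short = c("SeedMass", "StemDens"),
                     stringsAsFactors = FALSE)

# Old column names as they appear in the gap-filled table.
from <- paste0("X", legend$traitcode)

# For every column, find its row in the legend (NA if it is not a trait).
idx <- match(names(traits), from)
names(traits)[!is.na(idx)] <- legend$short[idx[!is.na(idx)]]

print(names(traits))
```

Columns without a legend entry, such as the row identifier `X1`, keep their original names.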

1.3 Fix some known errors in the gap-filled matrix

Check traits at the individual level. There are some traits with unexpected negative entries:

try.species.names %>% 
    dplyr::select(Name_short) %>% 
    bind_cols(try.individuals %>% 
                  dplyr::select(-X1)) %>% 
  gather(variable, value, -Name_short) %>% 
  filter(value<0) %>% 
  group_by(variable) %>% 
  summarize(n=n())
## # A tibble: 5 x 2
##   variable                 n
##   <chr>                <int>
## 1 LDMC                   419
## 2 LeafC.perdrymass         9
## 3 Leaf.delta.15N      262283
## 4 SeedGerminationRate    120
## 5 StemDens               337

According to Jens Kattge, the negative entries for Leaf.delta.15N are legitimate, while those for the other traits are probably due to bad predictions. He suggested deleting these negative records.
Similarly, there are records with impossible values for height: some species are incorrectly predicted to be taller than 100 m, and some herbs are predicted to be taller than 10 m.

try.individuals <- try.species.names %>% 
  dplyr::select(Name_short) %>% 
  bind_cols(try.individuals)

toexclude <- try.individuals %>% 
  gather(variable, value, -X1, -Name_short) %>% 
  filter(variable != "Leaf.delta.15N") %>% 
  filter(value<0) %>% 
  pull(X1)

toexclude2 <- try.individuals %>% 
  filter(PlantHeight>100  & (!Name_short %in% c("Pseudotsuga menziesii", "Sequoia sempervirens"))) %>% 
  pull(X1)

toexclude3 <- try.individuals %>% 
  filter(X1 %in% (try.allinfo %>% 
                     filter(GrowthForm=="herb") %>% 
                     pull(X1))) %>% 
  filter(PlantHeight>10) %>% 
  pull(X1)

try.individuals <- try.individuals %>% 
  filter(!X1 %in% c(toexclude, toexclude2, toexclude3)) %>% 
  dplyr::select(-X1)

This results in the exclusion of 874 individuals and reduces the total number of species included in TRY to 50404.
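
The three exclusion rules can be illustrated on a handful of invented records. This is a base-R sketch; the data frame `ind` and its column names `id`, `species` and `growth_form` are hypothetical, while the thresholds and the two exempted species are those used above.

```r
# Six invented individual-level records.
ind <- data.frame(
  id             = 1:6,
  species        = c("Poa annua", "Poa annua", "Abies alba",
                     "Sequoia sempervirens", "Abies alba", "Poa annua"),
  growth_form    = c("herb", "herb", "tree", "tree", "tree", "herb"),
  LDMC           = c(0.30, -0.10, 0.40, 0.35, 0.38, 0.25),
  Leaf.delta.15N = c(-2.0, 1.0, -0.5, 0.3, 2.0, -1.0),
  PlantHeight    = c(0.2, 0.3, 120, 110, 40, 15),
  stringsAsFactors = FALSE
)
tall_ok <- c("Pseudotsuga menziesii", "Sequoia sempervirens")

# Rule 1: negative value in any trait except delta 15N, where negatives are valid.
check_cols <- setdiff(names(ind), c("id", "species", "growth_form", "Leaf.delta.15N"))
neg <- rowSums(ind[check_cols] < 0) > 0

# Rule 2: taller than 100 m, unless the species is one of the exempted giants.
too_tall <- ind$PlantHeight > 100 & !(ind$species %in% tall_ok)

# Rule 3: herbs taller than 10 m.
tall_herb <- ind$growth_form == "herb" & ind$PlantHeight > 10

drop  <- neg | too_tall | tall_herb
clean <- ind[!drop, ]
print(clean$id)
```

Record 1 survives despite its negative delta 15N, and the 110 m Sequoia is kept because it is on the exemption list.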

1.4 Calculate species and genus level trait means and sd

## Calculate species level trait means and sd. 
try.species.means <- try.individuals %>% 
  group_by(Name_short) %>% 
  #Add a field to indicate the number of observations per taxon
  left_join(x={.} %>% 
              summarize(n=n()), 
            y={.} %>% 
              summarize_at(.vars=vars(StemDens:LeafWaterCont ),
                           .funs=list(mean=~mean(.), sd=~sd(.))),
            by="Name_short") %>% 
  dplyr::select(Name_short, n, everything())

## Calculate genus level trait means and sd.
try.genus.means <- try.individuals %>% 
  mutate(Genus=word(Name_short, 1)) %>% 
  group_by(Genus) %>% 
  left_join(x={.} %>% 
              summarize(n=n()), 
            y={.} %>% 
              summarize_at(.vars=vars(StemDens:LeafWaterCont ),
                           .funs=list(mean=~mean(.), sd=~sd(.))),
            by="Genus") %>% 
  dplyr::select(Genus, n, everything())

The average number of observations per species and genus is 12.1 and 81.5, respectively. As many as 17443 species have only one observation (1250 at the genus level).
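
Taxa with a single observation get a mean but no standard deviation, which is why some rows in the table below show NA for every `_sd` column. A small base-R sketch on invented values makes this visible; `summarise_by` is a hypothetical helper, and the genus is taken as the first word of the species name, as in the code above.

```r
# Invented individual-level SLA values.
obs <- data.frame(
  Name_short = c("Poa annua", "Poa annua", "Poa annua", "Poa pratensis", "Abies alba"),
  SLA        = c(20, 24, 22, 18, 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count, mean and sd of `values` within each group.
summarise_by <- function(values, groups) {
  n <- tapply(values, groups, length)
  data.frame(taxon = names(n),
             n     = as.vector(n),
             mean  = as.vector(tapply(values, groups, mean)),
             sd    = as.vector(tapply(values, groups, sd)),
             stringsAsFactors = FALSE)
}

species <- summarise_by(obs$SLA, obs$Name_short)

# Genus = first word of the resolved species name.
genus <- summarise_by(obs$SLA, sub(" .*$", "", obs$Name_short))

print(species)
print(genus)
```

Pooling at genus level gives `Poa` four observations and a defined sd, whereas `Poa pratensis` alone has none.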

Example of trait means for 15 randomly selected species
Name_short n StemDens_mean RootingDepth_mean SLA_mean LeafC.perdrymass_mean LeafN_mean LeafP_mean PlantHeight_mean StemDiam_mean SeedMass_mean Seed.length_mean LeafThickness_mean LDMC_mean LeafNperArea_mean LeafDryMass.single_mean Leaf.delta.15N_mean SeedGerminationRate_mean Seed.num.rep.unit_mean LeafLength_mean LeafWidth_mean LeafCN.ratio_mean Leaffreshmass_mean Stem.cond.dens_mean Chromosome.n_mean Chromosome.cDNAcont_mean Disp.unit.leng_mean StemConduitDiameter_mean Wood.vessel.length_mean WoodFiberLength_mean SpecificRootLength_mean LeafArea.leaf.undef_mean LeafArea.leaflet.undef_mean LeafArea.undef.undef_mean LeafWaterCont_mean StemDens_sd RootingDepth_sd SLA_sd LeafC.perdrymass_sd LeafN_sd LeafP_sd PlantHeight_sd StemDiam_sd SeedMass_sd Seed.length_sd LeafThickness_sd LDMC_sd LeafNperArea_sd LeafDryMass.single_sd Leaf.delta.15N_sd SeedGerminationRate_sd Seed.num.rep.unit_sd LeafLength_sd LeafWidth_sd LeafCN.ratio_sd Leaffreshmass_sd Stem.cond.dens_sd Chromosome.n_sd Chromosome.cDNAcont_sd Disp.unit.leng_sd StemConduitDiameter_sd Wood.vessel.length_sd WoodFiberLength_sd SpecificRootLength_sd LeafArea.leaf.undef_sd LeafArea.leaflet.undef_sd LeafArea.undef.undef_sd LeafWaterCont_sd
Achillea santolinoides 3 0.345 0.571 14.476 441.195 21.681 1.796 0.245 0.007 0.198 1.288 0.298 0.248 1.573 30.786 -0.203 97.108 2264.368 70.512 0.670 24.783 0.128 75.311 27.722 7.375 1.871 24.884 336.054 649.116 7473.206 351.320 435.422 553.604 4.844 0.002 0.024 0.182 0.147 0.183 0.006 0.010 0.000 0.012 0.022 0.002 0.000 0.006 0.782 0.048 2.502 126.049 3.400 0.034 0.254 0.003 3.348 0.395 0.050 0.012 0.725 4.363 16.360 272.162 9.439 15.420 4.808 0.040
Bunchosia lindeniana 2 0.694 2.223 18.193 468.785 34.809 0.870 8.670 0.199 269.996 7.704 0.185 0.323 2.044 410.337 1.205 89.458 14.226 36.637 4.260 17.674 1.280 29.036 47.205 2.769 11.350 54.150 347.467 825.881 1096.051 4367.325 6110.530 6801.134 3.593 0.000 0.054 0.037 0.678 0.250 0.010 0.948 0.020 57.511 0.502 0.001 0.002 0.014 26.561 0.034 0.577 0.288 1.715 0.185 0.128 0.073 1.523 0.139 0.039 0.858 1.102 5.420 16.558 76.367 311.559 464.381 527.430 0.060
Tetradymia spinosa 2 0.789 1.179 9.757 460.611 23.001 1.508 1.072 0.042 3.865 4.085 0.269 0.358 2.519 6.098 1.500 83.179 187.277 19.054 0.489 28.366 0.018 114.938 28.184 4.687 4.001 28.978 372.017 999.803 1628.177 59.846 120.609 108.828 2.950 0.001 0.023 0.070 1.166 0.206 0.001 0.165 0.002 0.281 0.141 0.001 0.001 0.004 0.108 0.088 0.335 20.696 0.476 0.016 0.072 0.000 2.466 0.027 0.014 0.136 0.299 0.084 16.558 44.116 1.953 7.839 5.173 0.000
Jurinea mollis 2 0.520 1.432 9.365 443.393 17.680 1.304 0.549 0.023 3.317 4.771 0.222 0.359 1.886 42.381 0.836 91.074 15.745 38.633 0.622 28.401 0.121 112.475 29.003 3.755 6.835 19.676 234.661 571.631 3995.147 249.778 826.619 520.622 3.750 0.003 0.100 0.053 0.850 0.169 0.004 0.194 0.002 0.361 0.308 0.001 0.003 0.048 3.627 0.057 0.249 1.053 2.814 0.062 0.190 0.010 3.024 0.383 0.001 0.412 0.163 1.976 6.843 69.988 25.285 78.866 47.061 0.053
Buchnera cryptocephala 10 0.549 0.376 19.701 451.323 11.722 1.277 0.536 0.045 0.079 0.921 0.235 0.213 0.810 21.839 -1.763 94.178 12253.581 32.593 0.506 74.536 0.096 101.182 36.396 5.033 1.752 20.958 519.447 897.881 4422.321 338.256 386.743 169.734 4.695 0.009 0.055 0.651 2.218 0.153 0.030 0.414 0.014 0.022 0.127 0.003 0.010 0.020 8.148 0.211 1.417 4921.999 17.001 0.300 1.689 0.035 12.352 1.235 0.182 0.215 0.580 25.650 48.754 235.109 128.913 133.921 82.928 0.190
Burchellia bubalina 11 0.666 0.533 15.596 459.183 18.858 1.077 1.511 0.064 3.914 2.590 0.226 0.305 1.353 64.150 2.005 86.818 163.628 30.170 1.292 26.290 0.221 61.643 37.269 2.904 2.890 38.768 477.746 933.214 2650.308 796.886 847.684 1084.400 3.912 0.006 0.052 2.400 1.325 0.732 0.045 0.135 0.007 0.255 0.074 0.014 0.013 0.141 18.459 0.121 0.539 18.943 3.856 0.219 0.850 0.062 3.711 0.739 0.049 0.115 1.223 12.039 20.837 150.987 223.575 237.425 270.507 0.252
Pistacia weinmannifolia 30 0.882 1.450 7.309 477.690 13.704 1.709 2.174 0.043 45.512 5.339 0.240 0.449 1.987 271.247 0.327 86.233 85.891 59.229 1.811 38.550 0.548 31.732 27.430 2.815 4.049 54.609 540.676 1093.031 1107.147 2001.101 1139.384 2230.694 1.924 0.014 0.225 1.368 3.325 0.712 0.110 1.074 0.045 8.355 0.379 0.017 0.017 0.269 81.809 0.401 1.065 26.178 8.411 0.276 1.591 0.146 3.111 0.548 0.086 0.422 2.765 21.442 44.937 227.905 483.246 225.535 480.819 0.166
Kadsura oblongifolia 2 0.514 0.829 19.359 427.871 10.220 3.058 4.229 0.031 3.078 4.339 0.290 0.207 0.529 138.489 -1.361 93.898 673.276 77.789 2.348 42.878 0.679 140.228 77.437 14.088 4.200 22.119 1257.892 1032.570 1163.833 2206.074 4346.295 1353.863 6.456 0.003 0.019 0.323 0.434 0.006 0.050 0.084 0.001 0.043 0.023 0.004 0.001 0.008 0.719 0.117 0.091 46.478 4.494 0.191 0.240 0.005 2.338 0.460 0.169 0.024 0.136 8.038 7.849 52.111 45.657 21.170 53.337 0.019
Acacia botrydion 2 0.769 3.135 6.877 467.546 23.213 0.839 5.267 0.132 17.812 4.215 0.312 0.437 3.404 47.665 2.455 89.944 16.144 8.390 0.245 20.010 0.109 14.988 29.928 2.571 4.315 61.209 286.600 925.895 2174.853 239.212 119.598 479.059 1.734 0.001 0.115 0.105 0.155 0.295 0.004 0.308 0.005 1.529 0.071 0.002 0.001 0.007 1.635 0.117 0.118 0.025 0.322 0.011 0.338 0.004 0.306 0.461 0.011 0.100 1.031 2.036 2.448 34.534 8.339 5.019 26.458 0.009
Lepturus repens 3 0.455 0.676 17.770 445.026 18.194 1.294 0.579 0.013 1.288 2.335 0.172 0.327 1.094 29.854 0.617 90.294 114.353 18.764 0.929 27.757 0.100 68.560 33.178 5.145 2.728 36.103 475.040 829.876 6632.511 500.198 638.540 566.196 2.885 0.001 0.017 0.321 0.747 0.078 0.030 0.015 0.001 0.033 0.037 0.002 0.004 0.022 2.434 0.067 0.416 7.884 3.172 0.108 0.147 0.009 0.813 0.459 0.018 0.057 0.506 2.516 12.683 369.314 40.296 38.734 68.550 0.050
Klasea quinquefolia 4 0.451 0.805 23.893 441.999 22.953 1.610 0.476 0.007 6.790 4.377 0.200 0.231 1.045 284.440 0.766 90.129 462.454 111.338 5.114 25.875 1.305 62.669 25.974 4.013 6.276 35.693 502.740 939.699 6765.512 8081.462 3748.683 3728.697 5.332 0.004 0.022 0.315 0.776 0.204 0.013 0.045 0.000 0.343 0.121 0.001 0.003 0.007 39.894 0.067 0.256 56.002 5.831 0.382 0.168 0.191 0.824 0.386 0.065 0.148 1.635 14.818 16.642 188.434 1225.735 450.722 457.976 0.060
Pteralyxia laurifolia 3 0.556 0.422 15.908 468.858 18.477 1.420 3.803 0.096 13.252 5.239 0.235 0.262 1.261 197.885 2.688 88.691 109.707 83.420 4.314 31.790 0.780 22.988 35.189 2.193 5.272 46.777 627.163 1272.353 1211.859 2626.568 2454.744 3551.885 4.290 0.006 0.056 0.396 4.169 0.315 0.042 0.412 0.016 0.501 0.381 0.008 0.003 0.020 4.994 0.259 2.942 81.829 7.226 0.115 0.683 0.017 6.368 3.498 0.159 0.623 14.733 99.390 191.605 234.600 77.397 138.138 323.160 0.083
Valeriana nivalis 1 0.435 0.265 16.232 455.965 16.990 1.233 0.326 0.010 0.912 2.647 0.241 0.237 1.108 69.079 -2.584 79.407 184.889 121.880 2.533 26.488 0.296 68.176 27.144 2.668 3.431 66.639 579.466 1019.845 5030.976 1207.207 1083.840 1210.345 3.842 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
Santiria conferta 3 0.524 0.907 9.465 495.377 18.174 0.921 25.784 0.206 206.472 11.896 0.199 0.423 1.846 929.183 2.289 71.460 46.444 123.519 4.919 26.237 2.219 6.460 29.203 2.346 8.613 83.819 659.951 1515.346 557.324 8635.519 4563.218 8141.972 2.142 0.021 0.011 0.093 0.112 0.411 0.019 0.337 0.008 2.585 0.120 0.001 0.002 0.020 22.001 0.093 0.451 5.083 1.826 0.027 0.702 0.068 0.077 0.149 0.036 0.084 1.312 5.063 16.467 21.094 187.117 34.794 174.808 0.036
Dovyalis macrocalyx 8 0.493 1.716 10.552 418.626 23.642 1.484 6.851 0.137 17.462 3.596 0.224 0.400 2.246 144.809 -1.239 94.760 356.891 50.260 2.556 20.109 0.349 96.560 58.206 1.626 6.162 50.222 377.351 608.406 1608.738 1286.082 1308.393 1575.185 2.279 0.009 0.105 0.300 1.148 0.399 0.043 0.593 0.010 1.000 0.155 0.003 0.004 0.049 28.915 0.135 0.722 83.893 19.401 0.908 0.276 0.069 3.805 1.319 0.030 0.232 1.850 17.224 23.279 53.353 257.525 245.632 474.509 0.040

1.5 Match taxa based on species, if available, or Genus

Combine the species- and genus-level trait means into a single object, and check how many of these taxa match the (resolved) species names in DT2.

try.combined.means <- try.genus.means %>% 
  rename(Taxon_name=Genus) %>% 
  mutate(Rank_correct="genus") %>% 
  bind_rows(try.species.means %>% 
              rename(Taxon_name=Name_short) %>% 
              mutate(Rank_correct="species")) %>% 
  dplyr::select(Taxon_name, Rank_correct, everything()) %>% 
  filter(!is.na(Taxon_name)) ## added in Version1.2 to avoid matching NA species

total.matches <- DT2 %>%
  distinct(Species, Rank_correct) %>%
  mutate(Rank_correct=fct_recode(Rank_correct, 
  #many taxa reported as matched at a higher or lower rank were nevertheless resolved at species level
                                   "species"="higher", 
                                   "species"="lower")) %>% #added in Version1.2
  left_join(try.combined.means %>%
              dplyr::rename(Species=Taxon_name), 
            by=c("Species", "Rank_correct")) %>% 
  filter(!is.na(SLA_mean)) %>% 
  nrow()

The total number of matched taxa (either at species, or genus level) is 36085.
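
The matching logic joins on both the taxon name and its rank, after recoding "higher" and "lower" to "species". A base-R sketch with `merge()` on invented taxa (the `means` and `taxa` data frames are hypothetical):

```r
# Hypothetical combined trait means (species and genus rows).
means <- data.frame(
  Taxon_name   = c("Poa annua", "Poa", "Abies"),
  Rank_correct = c("species", "genus", "genus"),
  SLA_mean     = c(22, 21, 6),
  stringsAsFactors = FALSE
)

# Hypothetical taxa as they might appear in the vegetation data.
taxa <- data.frame(
  Species      = c("Poa annua", "Poa", "Abies alba", "Carex"),
  Rank_correct = c("species", "genus", "lower", "genus"),
  stringsAsFactors = FALSE
)

# Treat names matched at a higher or lower rank as species-level names.
taxa$Rank_correct[taxa$Rank_correct %in% c("higher", "lower")] <- "species"

# Left join on name and rank; unmatched taxa keep NA trait values.
matched <- merge(taxa, means,
                 by.x = c("Species", "Rank_correct"),
                 by.y = c("Taxon_name", "Rank_correct"),
                 all.x = TRUE)

n_matched <- sum(!is.na(matched$SLA_mean))
print(n_matched)
```

Here `Abies alba` stays unmatched because only a genus-level mean for `Abies` exists and the join requires the rank to agree as well.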

1.6 Calculate summary statistics for species- and genus-level mean traits

mysummary <- try.combined.means %>% 
               group_by(Rank_correct) %>% 
               summarize_at(.vars=vars(StemDens_mean:LeafWaterCont_sd),
                            .funs=list(min=~min(., na.rm=T),
                                       q025=~quantile(., 0.25, na.rm=T), 
                                       q50=~quantile(., 0.50, na.rm=T), 
                                       q75=~quantile(., .75, na.rm=T), 
                                       max=~max(., na.rm=T), 
                                       mean=~mean(., na.rm=T), 
                                       sd=~sd(., na.rm=T))) %>% 
  gather(variable, value, -Rank_correct) %>% 
  separate(variable, sep="_", into=c("variable", "mean.sd", "stat")) %>% 
  mutate(stat=factor(stat, levels=c("min", "q025", "q50", "q75", "max", "mean", "sd"))) %>% 
  spread(key=stat, value=value) %>% 
  arrange(desc(Rank_correct)) %>% 
  mutate_at(.vars=vars(min:sd),
            .funs=~round(.,3))
## Warning: attributes are not identical across measure variables;
## they will be dropped
Summary statistics for each trait, when summarized across species or genera
Rank_correct variable mean.sd min q025 q50 q75 max mean sd
species Chromosome.cDNAcont mean 0.006 2.150 2.992 4.595 1.067308e+03 4.344000e+00 8.248000e+00
species Chromosome.cDNAcont sd 0.000 0.026 0.047 0.090 1.761220e+02 9.200000e-02 1.020000e+00
species Chromosome.n mean 0.062 22.492 28.266 35.684 7.671771e+03 3.109500e+01 6.146000e+01
species Chromosome.n sd 0.000 0.242 0.400 0.692 1.621340e+02 6.700000e-01 1.482000e+00
species Disp.unit.leng mean 0.003 2.275 3.864 6.894 1.336988e+04 6.336000e+00 7.991700e+01
species Disp.unit.leng sd 0.000 0.069 0.150 0.308 3.050200e+02 2.910000e-01 2.006000e+00
species LDMC mean 0.012 0.243 0.321 0.383 1.372000e+00 3.140000e-01 9.700000e-02
species LDMC sd 0.000 0.003 0.004 0.008 2.250000e-01 8.000000e-03 1.000000e-02
species LeafArea.leaflet.undef mean 0.000 341.086 1039.523 2649.636 1.740828e+13 3.454087e+08 7.754031e+10
species LeafArea.leaflet.undef sd 0.000 20.712 76.618 245.071 3.366525e+11 1.021534e+07 1.854335e+09
species LeafArea.leaf.undef mean 0.000 343.274 1100.739 3115.795 1.341026e+11 2.666981e+06 5.973231e+08
species LeafArea.leaf.undef sd 0.000 22.932 87.299 292.514 3.408568e+09 1.038697e+05 1.877493e+07
species LeafArea.undef.undef mean 0.000 345.679 1175.076 3505.350 1.524769e+11 3.041432e+06 6.791673e+08
species LeafArea.undef.undef sd 0.000 24.273 99.108 369.801 4.133230e+09 1.261710e+05 2.276651e+07
species LeafCN.ratio mean 0.213 20.712 25.571 32.671 1.599229e+03 2.958100e+01 2.170300e+01
species LeafCN.ratio sd 0.000 0.233 0.406 0.817 1.686110e+02 7.880000e-01 1.705000e+00
species LeafC.perdrymass mean 93.949 441.375 456.737 474.191 1.131692e+03 4.565940e+02 2.970600e+01
species LeafC.perdrymass sd 0.000 0.804 1.230 1.880 7.672000e+01 1.736000e+00 2.234000e+00
species Leaf.delta.15N mean -31.434 -0.612 0.753 2.226 4.910200e+01 8.140000e-01 2.464000e+00
species Leaf.delta.15N sd 0.000 0.074 0.111 0.162 3.888400e+01 1.470000e-01 2.690000e-01
species LeafDryMass.single mean 0.000 27.531 88.300 280.900 1.023338e+17 2.030311e+12 4.558173e+14
species LeafDryMass.single sd 0.000 1.980 7.661 27.345 3.076084e+15 9.332780e+10 1.694356e+13
species Leaffreshmass mean 0.000 0.099 0.306 0.835 2.906497e+13 5.766516e+08 1.294618e+11
species Leaffreshmass sd 0.000 0.006 0.023 0.077 8.032411e+11 2.437018e+07 4.424379e+09
species LeafLength mean 0.000 29.485 57.716 97.430 3.089342e+08 6.260805e+03 1.376064e+06
species LeafLength sd 0.000 1.513 3.977 9.911 3.388372e+05 2.144100e+01 1.870873e+03
species LeafN mean 0.343 16.117 19.914 24.175 1.310526e+03 2.062200e+01 1.042300e+01
species LeafN sd 0.000 0.159 0.288 0.618 5.074300e+01 5.850000e-01 8.900000e-01
species LeafNperArea mean 0.002 1.196 1.489 1.878 4.016178e+04 2.426000e+00 1.788870e+02
species LeafNperArea sd 0.000 0.017 0.031 0.073 3.231700e+01 7.200000e-02 2.180000e-01
species LeafP mean 0.000 0.996 1.359 1.803 6.392164e+04 3.184000e+00 2.962400e+02
species LeafP sd 0.000 0.017 0.032 0.058 9.827822e+03 3.530000e-01 5.413300e+01
species LeafThickness mean 0.005 0.188 0.229 0.288 1.197200e+02 2.710000e-01 7.110000e-01
species LeafThickness sd 0.000 0.002 0.003 0.008 3.864000e+00 8.000000e-03 3.500000e-02
species LeafWaterCont mean 0.096 2.525 3.307 4.734 4.260880e+02 3.940000e+00 2.942000e+00
species LeafWaterCont sd 0.000 0.038 0.073 0.149 2.759500e+01 1.410000e-01 2.630000e-01
species LeafWidth mean 0.000 0.691 1.902 4.051 3.477895e+04 6.499000e+00 2.474640e+02
species LeafWidth sd 0.000 0.042 0.131 0.376 3.071545e+05 9.740000e+00 1.691858e+03
species PlantHeight mean 0.000 0.403 1.356 7.556 9.476600e+01 5.124000e+00 7.274000e+00
species PlantHeight sd 0.000 0.040 0.133 0.508 2.685300e+01 5.430000e-01 1.101000e+00
species RootingDepth mean 0.000 0.410 0.845 1.617 2.519319e+09 5.003461e+04 1.122161e+07
species RootingDepth sd 0.000 0.019 0.046 0.107 8.971563e+07 2.724145e+03 4.941680e+05
species SeedGerminationRate mean 5.441 83.595 88.845 93.287 2.416300e+02 8.756700e+01 1.031100e+01
species SeedGerminationRate sd 0.000 0.345 0.523 0.843 1.864950e+02 9.180000e-01 1.903000e+00
species Seed.length mean 0.004 1.817 3.029 5.599 1.089329e+04 4.931000e+00 5.310100e+01
species Seed.length sd 0.000 0.056 0.121 0.242 1.220200e+01 2.140000e-01 3.530000e-01
species SeedMass mean 0.000 0.730 4.336 36.226 9.521438e+08 1.938312e+04 4.241520e+06
species SeedMass sd 0.000 0.054 0.356 2.814 4.003387e+07 2.019808e+03 2.618936e+05
species Seed.num.rep.unit mean 0.000 42.925 180.756 791.517 2.225078e+25 4.416845e+20 9.910992e+22
species Seed.num.rep.unit sd 0.000 5.625 28.345 158.015 2.160083e+19 1.105318e+15 1.443143e+17
species SLA mean 0.000 10.691 14.520 18.798 5.490787e+03 1.576800e+01 2.874000e+01
species SLA sd 0.000 0.168 0.331 0.845 1.012120e+02 9.090000e-01 1.679000e+00
species SpecificRootLength mean 0.000 1337.343 2584.068 5186.847 3.071495e+09 2.474592e+05 2.296648e+07
species SpecificRootLength sd 0.000 48.687 111.589 254.694 1.335737e+08 6.094850e+03 7.740078e+05
species Stem.cond.dens mean 0.000 21.114 48.151 91.794 1.988212e+08 7.925137e+03 1.206861e+06
species Stem.cond.dens sd 0.000 0.689 1.873 4.707 2.011904e+06 6.649700e+01 1.108256e+04
species StemConduitDiameter mean 0.001 25.102 35.442 50.841 8.065496e+07 1.667588e+03 3.592628e+05
species StemConduitDiameter sd 0.000 0.549 1.006 1.852 1.181921e+03 2.189000e+00 9.759000e+00
species StemDens mean 0.007 0.433 0.535 0.637 2.640000e+00 5.410000e-01 1.480000e-01
species StemDens sd 0.000 0.003 0.005 0.009 1.253000e+00 1.000000e-02 1.700000e-02
species StemDiam mean 0.000 0.014 0.056 0.147 1.398249e+09 2.774159e+04 6.228112e+06
species StemDiam sd 0.000 0.001 0.003 0.011 4.882243e+07 1.481287e+03 2.689217e+05
species WoodFiberLength mean 12.143 668.311 834.577 1030.163 1.068162e+07 1.129436e+03 4.764160e+04
species WoodFiberLength sd 0.000 9.803 17.565 31.011 1.166290e+04 2.803700e+01 8.074500e+01
species Wood.vessel.length mean 3.460 313.875 413.751 537.506 2.082033e+06 5.573640e+02 1.082520e+04
species Wood.vessel.length sd 0.000 4.775 9.037 16.889 1.457577e+03 1.659700e+01 3.366500e+01
genus Chromosome.cDNAcont mean 0.006 2.332 3.178 4.970 1.067308e+03 4.910000e+00 1.835000e+01
genus Chromosome.cDNAcont sd 0.000 0.050 0.093 0.180 1.761220e+02 2.190000e-01 2.283000e+00
genus Chromosome.n mean 0.062 22.564 28.269 35.907 7.671771e+03 3.408600e+01 1.361540e+02
genus Chromosome.n sd 0.000 0.400 0.753 1.465 1.621340e+02 1.484000e+00 3.361000e+00
genus Disp.unit.leng mean 0.003 2.350 3.834 6.214 1.336988e+04 9.898000e+00 2.071740e+02
genus Disp.unit.leng sd 0.000 0.135 0.297 0.649 3.050200e+02 6.890000e-01 4.613000e+00
genus LDMC mean 0.021 0.234 0.307 0.365 1.372000e+00 3.020000e-01 1.000000e-01
genus LDMC sd 0.000 0.005 0.010 0.020 1.260000e-01 1.500000e-02 1.500000e-02
genus LeafArea.leaflet.undef mean 0.000 451.329 1047.617 2687.686 1.740828e+13 2.332414e+09 2.014976e+11
genus LeafArea.leaflet.undef sd 0.000 44.128 147.028 601.658 3.366525e+11 5.419567e+07 4.270672e+09
genus LeafArea.leaf.undef mean 0.000 404.595 1116.704 3186.863 1.341026e+11 1.799387e+07 1.552214e+09
genus LeafArea.leaf.undef sd 0.000 46.186 170.842 687.942 3.408568e+09 5.496733e+05 4.324006e+07
genus LeafArea.undef.undef mean 0.000 411.122 1167.387 3671.852 1.524769e+11 2.051781e+07 1.764895e+09
genus LeafArea.undef.undef sd 0.000 47.783 194.714 833.039 4.133230e+09 6.675087e+05 5.243295e+07
genus LeafCN.ratio mean 0.213 21.213 26.129 32.266 1.599229e+03 3.011100e+01 3.461600e+01
genus LeafCN.ratio sd 0.001 0.435 0.922 2.060 1.686110e+02 2.103000e+00 6.527000e+00
genus LeafC.perdrymass mean 95.729 440.577 454.526 471.232 1.131692e+03 4.545300e+02 3.514400e+01
genus LeafC.perdrymass sd 0.003 1.329 2.310 4.375 5.565700e+01 3.910000e+00 4.699000e+00
genus Leaf.delta.15N mean -31.434 -0.360 0.963 2.262 4.885800e+01 1.011000e+00 2.858000e+00
genus Leaf.delta.15N sd 0.000 0.122 0.212 0.379 1.000600e+01 3.190000e-01 3.530000e-01
genus LeafDryMass.single mean 0.000 30.225 93.616 273.934 1.023338e+17 1.371031e+13 1.184495e+15
genus LeafDryMass.single sd 0.000 4.094 15.861 59.587 3.076084e+15 4.950248e+11 3.902228e+13
genus Leaffreshmass mean 0.000 0.122 0.319 0.857 2.906497e+13 3.894021e+09 3.364218e+11
genus Leaffreshmass sd 0.000 0.013 0.050 0.178 8.032411e+11 1.292631e+08 1.018967e+10
genus LeafLength mean 0.000 31.476 56.824 92.774 3.089342e+08 4.168391e+04 3.575858e+06
genus LeafLength sd 0.000 2.621 6.763 18.834 3.388372e+05 8.008100e+01 4.309738e+03
genus LeafN mean 0.343 16.840 20.136 23.942 1.310526e+03 2.133800e+01 2.132400e+01
genus LeafN sd 0.001 0.299 0.660 1.681 5.074300e+01 1.283000e+00 1.636000e+00
genus LeafNperArea mean 0.002 1.224 1.492 1.843 4.016178e+04 7.058000e+00 4.648540e+02
genus LeafNperArea sd 0.000 0.032 0.074 0.217 3.231700e+01 1.610000e-01 4.700000e-01
genus LeafP mean 0.000 1.079 1.357 1.759 6.392164e+04 1.289400e+01 7.697680e+02
genus LeafP sd 0.000 0.031 0.063 0.138 9.827822e+03 1.711000e+00 1.246720e+02
genus LeafThickness mean 0.005 0.196 0.234 0.290 1.197200e+02 3.170000e-01 1.809000e+00
genus LeafThickness sd 0.000 0.004 0.009 0.021 3.343000e+00 1.900000e-02 6.000000e-02
genus LeafWaterCont mean 0.096 2.687 3.531 4.929 4.260880e+02 4.253000e+00 5.594000e+00
genus LeafWaterCont sd 0.000 0.077 0.163 0.372 2.759500e+01 3.060000e-01 5.290000e-01
genus LeafWidth mean 0.000 0.806 1.791 3.846 3.477895e+04 2.301000e+01 6.183080e+02
genus LeafWidth sd 0.000 0.078 0.238 0.768 3.071545e+05 5.070900e+01 3.896580e+03
genus PlantHeight mean 0.000 0.417 1.192 5.770 8.203000e+01 4.524000e+00 6.885000e+00
genus PlantHeight sd 0.000 0.067 0.217 1.025 2.685300e+01 1.020000e+00 1.858000e+00
genus RootingDepth mean 0.000 0.450 0.813 1.475 2.519319e+09 3.378525e+05 2.916066e+07
genus RootingDepth sd 0.000 0.037 0.086 0.198 8.971563e+07 1.444958e+04 1.138105e+06
genus SeedGerminationRate mean 6.898 84.387 88.784 92.519 2.396310e+02 8.772900e+01 1.137500e+01
genus SeedGerminationRate sd 0.001 0.539 0.988 2.030 4.806300e+01 1.783000e+00 2.315000e+00
genus Seed.length mean 0.004 1.856 3.001 5.077 1.089329e+04 6.656000e+00 1.375330e+02
genus Seed.length sd 0.000 0.106 0.234 0.508 3.097500e+01 4.720000e-01 8.990000e-01
genus SeedMass mean 0.000 0.878 4.277 28.603 9.521438e+08 1.301220e+05 1.102206e+07
genus SeedMass sd 0.000 0.135 0.869 5.765 4.003387e+07 1.064328e+04 6.031196e+05
genus Seed.num.rep.unit mean 0.000 59.105 249.765 963.571 2.225078e+25 2.982078e+21 2.575488e+23
genus Seed.num.rep.unit sd 0.000 14.004 65.787 373.467 9.602388e+20 1.586212e+17 1.218443e+19
genus SLA mean 0.000 11.534 15.070 18.147 5.490787e+03 1.731800e+01 7.070200e+01
genus SLA sd 0.000 0.339 0.793 2.676 4.479200e+01 1.951000e+00 2.724000e+00
genus SpecificRootLength mean 0.000 1401.251 2644.246 5204.402 3.071495e+09 1.619226e+06 5.964681e+07
genus SpecificRootLength sd 0.000 98.607 224.904 532.281 4.301703e+07 1.568093e+04 7.055693e+05
genus Stem.cond.dens mean 0.000 23.207 50.054 90.839 1.912493e+08 2.743381e+04 2.218101e+06
genus Stem.cond.dens sd 0.000 1.464 3.723 9.576 1.070819e+07 2.062249e+03 1.382138e+05
genus StemConduitDiameter mean 0.001 25.008 34.971 49.393 8.065496e+07 1.093975e+04 9.335763e+05
genus StemConduitDiameter sd 0.000 1.016 2.027 4.363 5.503930e+03 6.321000e+00 7.618700e+01
genus StemDens mean 0.015 0.444 0.539 0.627 2.640000e+00 5.420000e-01 1.480000e-01
genus StemDens sd 0.000 0.006 0.011 0.023 3.250000e-01 2.000000e-02 2.500000e-02
genus StemDiam mean 0.000 0.015 0.054 0.135 1.398249e+09 1.873329e+05 1.618448e+07
genus StemDiam sd 0.000 0.002 0.007 0.021 4.882243e+07 7.856923e+03 6.193467e+05
genus WoodFiberLength mean 12.143 706.990 848.093 1034.332 1.068162e+07 2.486672e+03 1.237900e+05
genus WoodFiberLength sd 0.000 17.850 34.223 64.295 4.830371e+03 5.557400e+01 1.078040e+02
genus Wood.vessel.length mean 3.460 331.386 428.820 540.582 2.082033e+06 9.565620e+02 2.618525e+04
genus Wood.vessel.length sd 0.000 8.979 18.005 37.694 2.797090e+03 3.545600e+01 7.398600e+01

2 Calculate CWMs and CWVs for each plot

Merge vegetation layers, where necessary. Combine cover values across layers

#Ancillary function
# Combine cover accounting for layers
combine.cover <- function(x){
    while (length(x)>1){
      x[2] <- x[1]+(100-x[1])*x[2]/100
      x <- x[-1]
    }
  return(x)
}

DT2.comb <- DT2 %>% 
  group_by(PlotObservationID, Species,Species_original, Rank_correct) %>% 
  summarize(Relative_cover=combine.cover(Relative_cover)) %>%
  ungroup() %>% 
  # re-normalize to 100%
  left_join(x=., 
            y={.} %>% 
              group_by(PlotObservationID) %>% 
              summarize(Tot.cover=sum(Relative_cover)), 
            by="PlotObservationID") %>% 
  mutate(Relative_cover=Relative_cover/Tot.cover) %>% 
  dplyr::select(-Tot.cover)
## `summarise()` has grouped output by 'PlotObservationID', 'Species', 'Species_original'. You can override using the `.groups` argument.
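
As a quick check of the layer-combination rule, the sketch below applies combine.cover to toy cover values (percentages chosen for illustration, not taken from sPlot). The rule treats each additional layer as covering the same fraction of the still-uncovered area as of the whole plot, so, assuming covers are expressed in percent, the result equals 100*(1 - prod(1 - x/100)).

```r
# Same function as above, repeated so that the example is self-contained
combine.cover <- function(x){
  while (length(x) > 1){
    x[2] <- x[1] + (100 - x[1]) * x[2] / 100
    x <- x[-1]
  }
  return(x)
}

# Toy cover values in percent (illustrative only)
two.layers   <- combine.cover(c(50, 40))      # 50 + 50 * 0.40
three.layers <- combine.cover(c(50, 40, 20))  # previous result + 30 * 0.20

# Equivalent closed form, assuming covers are percentages
closed.form <- 100 * (1 - prod(1 - c(50, 40, 20) / 100))

print(c(two.layers, three.layers, closed.form))
```

Because the closed form is a product, the order in which layers are combined does not affect the result.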

Calculate CWMs and CWVs, as well as plot coverage statistics (the proportion of total cover and the proportion of species for which trait information exists). To avoid misleading results, CWMs are calculated ONLY for plots with abundance information: all plots where Ab_scale=="pa" in ANY layer are excluded.

# Tag plots where at least one layer has only p\a information 
any_pa <- DT2 %>% 
  distinct(PlotObservationID, Ab_scale) %>% 
  group_by(PlotObservationID) %>% 
  summarize(any.pa=any(Ab_scale=="pa")) %>% 
  filter(any.pa==T) %>% 
  pull(PlotObservationID)
length(any_pa)
## [1] 272981
# Exclude plots above and merge species data table with traits
CWM0 <- DT2.comb %>%
  filter(!PlotObservationID %in% any_pa) %>% 
  mutate(Rank_correct=fct_recode(Rank_correct, 
                                 "species"="higher", 
                                 "species"="lower")) %>% #added in Version1.2
  left_join(try.combined.means %>%
              dplyr::rename(Species=Taxon_name) %>% 
              dplyr::select(Species, Rank_correct, ends_with("_mean")), 
            by=c("Species", "Rank_correct"))
# Calculate CWM for each trait in each plot
CWM1 <- CWM0 %>% 
  group_by(PlotObservationID) %>%
  summarize_at(.vars= vars(StemDens_mean:LeafWaterCont_mean),
               .funs = list(~weighted.mean(., Relative_cover, na.rm=T))) %>%
  dplyr::select(PlotObservationID, order(colnames(.))) %>%
  pivot_longer(-PlotObservationID, names_to="variable", values_to = "CWM")
# Calculate CWV
# Ancillary function
variance2.fun <- function(trait, abu){
  res <- as.double(NA)
  abu <- abu[!is.na(trait)]
  trait <- trait[!is.na(trait)]
  abu <- abu/sum(abu)
  if (length(trait)>1){
    # you need more than 1 observation to calculate variance
    # for calculation see 
    # http://r.789695.n4.nabble.com/Weighted-skewness-and-curtosis-td4709956.html
    m.trait <- weighted.mean(trait,abu)
    res <- sum(abu*(trait-m.trait)^2)
  }
  res
}

CWM3 <- CWM0 %>%
  group_by(PlotObservationID) %>%
  summarize_at(.vars= vars(StemDens_mean:LeafWaterCont_mean),
               .funs = list(~variance2.fun(., Relative_cover))) %>%
  dplyr::select(PlotObservationID, order(colnames(.))) %>%
  pivot_longer(-PlotObservationID, names_to="variable", values_to = "CWV")
# Calculate coverage for each trait in each plot
# changed in Version 1.2
CWM2 <- CWM0 %>%
  mutate(StemDens_mean=if_else(is.na(StemDens_mean),0,1) * Relative_cover) %>% 
  group_by(PlotObservationID) %>%
  summarize(trait.coverage=sum(StemDens_mean, na.rm=T))
## Count species having traits (converted to a proportion when joining below) #changed in version 1.2
CWM4 <- CWM0 %>%
  group_by(PlotObservationID) %>%
  summarize(n.sp.with.trait=sum(!is.na(StemDens_mean))) 


# Join together
CWM <- CWM1 %>%
  left_join(CWM2, by=c("PlotObservationID")) %>%
  left_join(CWM3, by=c("PlotObservationID", "variable")) %>%
  left_join(CWM4, by=c("PlotObservationID")) %>%
  left_join(CWM0 %>% 
              group_by(PlotObservationID) %>%
              summarize(sp.richness=n()), by=c("PlotObservationID")) %>%
  mutate(prop.sp.with.trait=n.sp.with.trait/sp.richness) %>%
  dplyr::select(PlotObservationID, variable, sp.richness, prop.sp.with.trait, trait.coverage, CWM, CWV) %>% 
  arrange(PlotObservationID)
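
To make the quantities in the CWM table concrete, here is a minimal base-R sketch for a single toy plot. The trait values and covers are invented for illustration. The calculation mirrors weighted.mean(..., na.rm=T) and variance2.fun above: species without trait data are dropped and the remaining weights are renormalized.

```r
# Toy plot: three species, the third without trait information
trait <- c(1, 3, NA)            # hypothetical trait values
cover <- c(0.50, 0.25, 0.25)    # relative cover, sums to 1

# CWM: cover-weighted mean over species with trait data
cwm <- weighted.mean(trait, cover, na.rm = TRUE)

# CWV: cover-weighted variance, weights renormalized over species with data
has.trait <- !is.na(trait)
w <- cover[has.trait] / sum(cover[has.trait])
cwv <- sum(w * (trait[has.trait] - cwm)^2)

# Coverage statistics
trait.coverage     <- sum(cover[has.trait])  # share of cover with trait data
prop.sp.with.trait <- mean(has.trait)        # share of species with trait data

print(round(c(CWM = cwm, CWV = cwv,
              trait.coverage = trait.coverage,
              prop.sp.with.trait = prop.sp.with.trait), 3))
```

In this toy case the two coverage measures differ (0.75 by cover, 0.667 by species), which is the contrast explored in the scatterplot of section 2.1.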

2.1 Explore CWM output

Community weighted means of 3 randomly selected plots
PlotObservationID variable sp.richness prop.sp.with.trait trait.coverage CWM CWV
129410 Chromosome.cDNAcont_mean 28 1.000 1.000 7.763 5.397200e+01
129410 Chromosome.n_mean 28 1.000 1.000 40.243 1.181020e+03
129410 Disp.unit.leng_mean 28 1.000 1.000 2.562 1.593000e+00
129410 LDMC_mean 28 1.000 1.000 0.264 3.000000e-03
129410 LeafArea.leaflet.undef_mean 28 1.000 1.000 1779.138 3.433267e+06
129410 LeafArea.leaf.undef_mean 28 1.000 1.000 1612.768 3.910708e+06
129410 LeafArea.undef.undef_mean 28 1.000 1.000 1676.513 1.331718e+06
129410 LeafCN.ratio_mean 28 1.000 1.000 20.515 2.779700e+01
129410 LeafC.perdrymass_mean 28 1.000 1.000 445.536 2.429060e+02
129410 Leaf.delta.15N_mean 28 1.000 1.000 -0.121 2.363000e+00
129410 LeafDryMass.single_mean 28 1.000 1.000 87.202 1.009834e+04
129410 Leaffreshmass_mean 28 1.000 1.000 0.332 1.300000e-01
129410 LeafLength_mean 28 1.000 1.000 107.930 3.743207e+03
129410 LeafN_mean 28 1.000 1.000 24.268 4.462100e+01
129410 LeafNperArea_mean 28 1.000 1.000 1.377 2.700000e-01
129410 LeafP_mean 28 1.000 1.000 1.818 2.680000e-01
129410 LeafThickness_mean 28 1.000 1.000 0.238 1.300000e-02
129410 LeafWaterCont_mean 28 1.000 1.000 4.409 3.595000e+00
129410 LeafWidth_mean 28 1.000 1.000 1.512 2.412000e+00
129410 PlantHeight_mean 28 1.000 1.000 0.593 1.160000e-01
129410 RootingDepth_mean 28 1.000 1.000 0.500 1.970000e-01
129410 SeedGerminationRate_mean 28 1.000 1.000 90.885 5.256100e+01
129410 Seed.length_mean 28 1.000 1.000 1.783 5.840000e-01
129410 SeedMass_mean 28 1.000 1.000 1.113 4.692000e+00
129410 Seed.num.rep.unit_mean 28 1.000 1.000 16850.286 1.411920e+09
129410 SLA_mean 28 1.000 1.000 20.986 4.807600e+01
129410 SpecificRootLength_mean 28 1.000 1.000 13842.987 1.320369e+08
129410 Stem.cond.dens_mean 28 1.000 1.000 66.194 2.167844e+03
129410 StemConduitDiameter_mean 28 1.000 1.000 57.672 2.623701e+03
129410 StemDens_mean 28 1.000 1.000 0.386 1.100000e-02
129410 StemDiam_mean 28 1.000 1.000 0.011 0.000000e+00
129410 WoodFiberLength_mean 28 1.000 1.000 861.796 6.491834e+04
129410 Wood.vessel.length_mean 28 1.000 1.000 458.175 4.760045e+04
1185844 Chromosome.cDNAcont_mean 3 0.667 0.208 1.402 2.320000e-01
1185844 Chromosome.n_mean 3 0.667 0.208 53.979 7.998800e+01
1185844 Disp.unit.leng_mean 3 0.667 0.208 3.054 7.100000e-02
1185844 LDMC_mean 3 0.667 0.208 0.166 0.000000e+00
1185844 LeafArea.leaflet.undef_mean 3 0.667 0.208 459.630 4.779943e+04
1185844 LeafArea.leaf.undef_mean 3 0.667 0.208 500.734 5.345584e+04
1185844 LeafArea.undef.undef_mean 3 0.667 0.208 318.528 2.013214e+04
1185844 LeafCN.ratio_mean 3 0.667 0.208 15.027 1.807000e+00
1185844 LeafC.perdrymass_mean 3 0.667 0.208 416.304 1.294080e+02
1185844 Leaf.delta.15N_mean 3 0.667 0.208 3.894 1.200000e-02
1185844 LeafDryMass.single_mean 3 0.667 0.208 9.458 1.829200e+01
1185844 Leaffreshmass_mean 3 0.667 0.208 0.064 1.000000e-03
1185844 LeafLength_mean 3 0.667 0.208 31.124 1.292560e+02
1185844 LeafN_mean 3 0.667 0.208 30.103 7.567000e+00
1185844 LeafNperArea_mean 3 0.667 0.208 0.585 9.000000e-03
1185844 LeafP_mean 3 0.667 0.208 3.591 7.730000e-01
1185844 LeafThickness_mean 3 0.667 0.208 0.213 0.000000e+00
1185844 LeafWaterCont_mean 3 0.667 0.208 7.524 3.100000e-02
1185844 LeafWidth_mean 3 0.667 0.208 1.438 2.260000e-01
1185844 PlantHeight_mean 3 0.667 0.208 1.884 9.200000e-02
1185844 RootingDepth_mean 3 0.667 0.208 0.166 2.000000e-03
1185844 SeedGerminationRate_mean 3 0.667 0.208 83.855 3.098000e+00
1185844 Seed.length_mean 3 0.667 0.208 3.231 8.900000e-02
1185844 SeedMass_mean 3 0.667 0.208 3.386 4.820000e-01
1185844 Seed.num.rep.unit_mean 3 0.667 0.208 1225.744 3.506249e+05
1185844 SLA_mean 3 0.667 0.208 52.283 1.906600e+01
1185844 SpecificRootLength_mean 3 0.667 0.208 2255.435 1.280565e+05
1185844 Stem.cond.dens_mean 3 0.667 0.208 72.012 6.224280e+02
1185844 StemConduitDiameter_mean 3 0.667 0.208 64.483 1.536110e+02
1185844 StemDens_mean 3 0.667 0.208 0.240 1.000000e-03
1185844 StemDiam_mean 3 0.667 0.208 0.038 0.000000e+00
1185844 WoodFiberLength_mean 3 0.667 0.208 980.389 1.074627e+04
1185844 Wood.vessel.length_mean 3 0.667 0.208 686.896 2.926415e+03
1471661 Chromosome.cDNAcont_mean 18 1.000 1.000 5.323 2.428300e+01
1471661 Chromosome.n_mean 18 1.000 1.000 28.888 8.353800e+01
1471661 Disp.unit.leng_mean 18 1.000 1.000 5.149 9.420000e+00
1471661 LDMC_mean 18 1.000 1.000 0.288 5.000000e-03
1471661 LeafArea.leaflet.undef_mean 18 1.000 1.000 4161.404 5.990357e+07
1471661 LeafArea.leaf.undef_mean 18 1.000 1.000 2050.572 5.934564e+06
1471661 LeafArea.undef.undef_mean 18 1.000 1.000 2987.544 2.431498e+07
1471661 LeafCN.ratio_mean 18 1.000 1.000 23.513 1.830100e+01
1471661 LeafC.perdrymass_mean 18 1.000 1.000 459.098 2.370090e+02
1471661 Leaf.delta.15N_mean 18 1.000 1.000 0.542 2.437000e+00
1471661 LeafDryMass.single_mean 18 1.000 1.000 111.060 2.467149e+04
1471661 Leaffreshmass_mean 18 1.000 1.000 0.457 4.660000e-01
1471661 LeafLength_mean 18 1.000 1.000 123.835 1.143678e+04
1471661 LeafN_mean 18 1.000 1.000 22.202 2.639700e+01
1471661 LeafNperArea_mean 18 1.000 1.000 1.127 1.450000e-01
1471661 LeafP_mean 18 1.000 1.000 1.897 5.900000e-01
1471661 LeafThickness_mean 18 1.000 1.000 0.186 5.000000e-03
1471661 LeafWaterCont_mean 18 1.000 1.000 4.405 3.171000e+00
1471661 LeafWidth_mean 18 1.000 1.000 1.442 1.650000e+00
1471661 PlantHeight_mean 18 1.000 1.000 0.574 2.704000e+00
1471661 RootingDepth_mean 18 1.000 1.000 0.412 4.800000e-02
1471661 SeedGerminationRate_mean 18 1.000 1.000 91.031 2.094400e+01
1471661 Seed.length_mean 18 1.000 1.000 3.428 3.224000e+00
1471661 SeedMass_mean 18 1.000 1.000 2.915 9.749000e+00
1471661 Seed.num.rep.unit_mean 18 1.000 1.000 10226.454 7.019958e+08
1471661 SLA_mean 18 1.000 1.000 22.881 4.507900e+01
1471661 SpecificRootLength_mean 18 1.000 1.000 11499.926 7.189515e+07
1471661 Stem.cond.dens_mean 18 1.000 1.000 77.633 2.777308e+03
1471661 StemConduitDiameter_mean 18 1.000 1.000 51.578 2.146180e+03
1471661 StemDens_mean 18 1.000 1.000 0.425 7.000000e-03
1471661 StemDiam_mean 18 1.000 1.000 0.015 1.000000e-03
1471661 WoodFiberLength_mean 18 1.000 1.000 891.655 5.324285e+04
1471661 Wood.vessel.length_mean 18 1.000 1.000 429.362 1.531078e+04

Scatterplot comparing trait coverage across plots, when based on relative cover and when based on the proportion of species richness

ggplot(data=CWM %>% 
         #all variables have the same coverage. Showcase with LDMC
         filter(variable=="LDMC_mean"), aes(x=trait.coverage, y=prop.sp.with.trait, col=log(sp.richness))) + 
  geom_point(pch="+", alpha=1/3) + 
  geom_abline(intercept = 0, slope=1, col=2, lty=2, lwd=.7) + 
  xlim(c(0,1)) + 
  ylim(c(0,1)) + 
  scale_color_viridis() + 
  theme_bw() +
  xlab("Trait coverage (Relative  cover)") + 
  ylab("Trait coverage (Proportion of species)") + 
  coord_equal()

Calculate summary statistics for trait coverage in plots

CWM.coverage <- CWM %>% 
  filter(variable=="LDMC_mean") %>% 
  summarize_at(.vars=vars(trait.coverage, prop.sp.with.trait),
                .funs=list(num.0s=~sum(.==0),
                           min=~min(., na.rm=T),
                           q025=~quantile(., 0.25, na.rm=T), 
                           q50=~quantile(., 0.50, na.rm=T), 
                           q75=~quantile(., .75, na.rm=T), 
                           max=~max(., na.rm=T), 
                           mean=~mean(., na.rm=T), 
                           sd=~sd(., na.rm=T))) %>% 
  pivot_longer(cols = 1:ncol(.), names_to="variable", values_to="value") %>% 
  separate(variable, sep="_", into=c("metric", "stat")) %>% 
  mutate(stat=factor(stat, levels=c("num.0s", "min", "q025", "q50", "q75", "max", "mean", "sd"))) %>% 
  pivot_wider(names_from = "stat")
Summary of plot-level coverage of CWM and CWVs
metric num.0s min q025 q50 q75 max mean sd
trait.coverage 10422 0 0.882 0.987 1 1 0.890 0.201
prop.sp.with.trait 10347 0 0.840 0.962 1 1 0.889 0.169

Calculate summary statistics for CWMs and CWVs

CWM.summary <- CWM %>% 
  rename(myvar=variable) %>% 
  group_by(myvar) %>% 
  summarize_at(.vars=vars(CWM:CWV),
                .funs=list(min=~min(., na.rm=T),
                           q025=~quantile(., 0.25, na.rm=T), 
                           q50=~quantile(., 0.50, na.rm=T), 
                           q75=~quantile(., .75, na.rm=T), 
                           max=~max(., na.rm=T), 
                           mean=~mean(., na.rm=T), 
                           sd=~sd(., na.rm=T))) %>% 
  gather(key=variable, value=value, -myvar) %>% 
  separate(variable, sep="_", into=c("metric", "stat")) %>% 
  mutate(stat=factor(stat, levels=c("min", "q025", "q50", "q75", "max", "mean", "sd"))) %>% 
  spread(key=stat, value=value) %>% 
  arrange(metric, myvar)
Summary of CWMs and CWVs across all plots
myvar metric min q025 q50 q75 max mean sd
Chromosome.cDNAcont_mean CWM 0.084 3.426 5.049 7.289000e+00 9.008000e+01 6.378000e+00 5.073000e+00
Chromosome.n_mean CWM 0.114 27.511 31.725 3.861300e+01 4.166200e+03 3.470000e+01 1.366200e+01
Disp.unit.leng_mean CWM 0.019 2.681 3.594 6.353000e+00 3.592797e+03 1.399700e+01 9.156600e+01
LDMC_mean CWM 0.014 0.243 0.284 3.350000e-01 9.680000e-01 2.870000e-01 6.900000e-02
LeafArea.leaflet.undef_mean CWM 0.001 507.521 1044.090 2.181974e+03 2.896005e+08 3.139681e+03 4.069999e+05
LeafArea.leaf.undef_mean CWM 0.007 484.284 1043.765 2.254141e+03 1.714323e+08 3.688766e+05 3.708649e+06
LeafArea.undef.undef_mean CWM 0.001 547.665 1186.965 2.187040e+03 2.239247e+07 2.017476e+03 2.479140e+04
LeafCN.ratio_mean CWM 0.986 20.104 23.453 2.799800e+01 8.780210e+02 2.560000e+01 1.470700e+01
LeafC.perdrymass_mean CWM 96.454 440.933 450.840 4.664590e+02 9.991090e+02 4.534850e+02 2.587500e+01
Leaf.delta.15N_mean CWM -12.462 -1.054 -0.116 8.060000e-01 4.310700e+01 8.000000e-03 2.001000e+00
LeafDryMass.single_mean CWM 0.000 30.792 63.186 1.299550e+02 9.401541e+04 1.133420e+02 3.344090e+02
Leaffreshmass_mean CWM 0.000 0.124 0.248 4.810000e-01 1.521780e+02 4.510000e-01 9.280000e-01
LeafLength_mean CWM 0.013 48.629 74.999 1.020730e+02 5.680312e+04 8.575800e+01 1.868610e+02
LeafN_mean CWM 3.830 19.117 22.247 2.514800e+01 2.882780e+02 2.245500e+01 5.297000e+00
LeafNperArea_mean CWM 0.002 1.146 1.316 1.593000e+00 8.794600e+01 1.436000e+00 5.510000e-01
LeafP_mean CWM 0.014 1.529 1.835 2.200000e+00 6.392164e+04 1.184280e+02 1.316728e+03
LeafThickness_mean CWM 0.005 0.189 0.221 2.770000e-01 5.413100e+01 3.150000e-01 1.061000e+00
LeafWaterCont_mean CWM 0.387 3.458 4.409 5.432000e+00 4.260880e+02 5.521000e+00 9.322000e+00
LeafWidth_mean CWM 0.005 0.766 1.343 2.548000e+00 2.675318e+04 5.747000e+00 1.239480e+02
PlantHeight_mean CWM 0.005 0.337 0.611 4.597000e+00 6.994000e+01 3.096000e+00 4.644000e+00
RootingDepth_mean CWM 0.004 0.356 0.515 7.640000e-01 7.274138e+04 8.000000e-01 5.740100e+01
SeedGerminationRate_mean CWM 7.122 84.063 88.908 9.256200e+01 2.416300e+02 8.799500e+01 6.766000e+00
Seed.length_mean CWM 0.061 1.915 2.502 4.084000e+00 1.089329e+04 1.502100e+01 2.134020e+02
SeedMass_mean CWM 0.000 0.939 2.239 1.844600e+01 3.486900e+06 5.578200e+02 1.635557e+04
Seed.num.rep.unit_mean CWM 0.000 1095.844 3681.489 1.521527e+04 1.747089e+21 7.647885e+15 2.903085e+18
SLA_mean CWM 1.473 15.282 20.029 2.453200e+01 5.490787e+03 3.416900e+01 1.258870e+02
SpecificRootLength_mean CWM 0.000 4427.166 7264.227 1.200942e+04 3.071495e+09 4.753325e+06 6.289158e+07
Stem.cond.dens_mean CWM 0.013 72.735 98.152 1.395610e+02 4.821523e+06 2.464080e+02 8.781285e+03
StemConduitDiameter_mean CWM 0.001 33.125 41.642 5.261600e+01 1.664311e+07 9.315900e+01 2.038130e+04
StemDens_mean CWM 0.053 0.362 0.420 4.970000e-01 2.640000e+00 4.350000e-01 1.150000e-01
StemDiam_mean CWM 0.000 0.010 0.024 1.000000e-01 8.882800e+01 6.800000e-02 1.620000e-01
WoodFiberLength_mean CWM 116.018 723.295 812.050 9.290410e+02 2.204640e+06 1.336279e+03 1.052422e+04
Wood.vessel.length_mean CWM 31.679 368.476 430.992 5.207070e+02 4.298853e+05 4.657020e+02 5.607070e+02
Chromosome.cDNAcont_mean CWV 0.000 5.486 13.601 3.300600e+01 2.138479e+04 4.308800e+01 8.812800e+01
Chromosome.n_mean CWV 0.000 70.244 154.866 3.402730e+02 1.451823e+07 4.454820e+02 2.828649e+04
Disp.unit.leng_mean CWV 0.000 1.500 4.005 1.677100e+01 3.227013e+06 1.971977e+04 1.669878e+05
LDMC_mean CWV 0.000 0.002 0.004 7.000000e-03 1.570000e-01 5.000000e-03 4.000000e-03
LeafArea.leaflet.undef_mean CWV 0.000 305355.239 1480599.502 5.933651e+06 3.653300e+16 2.132119e+11 7.765425e+13
LeafArea.leaf.undef_mean CWV 0.000 286067.289 1277038.295 6.710382e+06 7.347254e+15 4.971601e+13 4.236662e+14
LeafArea.undef.undef_mean CWV 0.000 415108.997 1223847.297 4.832361e+06 1.809148e+15 1.432148e+10 3.137218e+12
LeafCN.ratio_mean CWV 0.000 15.173 35.581 8.165800e+01 6.171788e+05 4.141010e+02 4.250786e+03
LeafC.perdrymass_mean CWV 0.000 178.696 341.761 5.938160e+02 1.057395e+05 6.270460e+02 2.929516e+03
Leaf.delta.15N_mean CWV 0.000 1.711 2.987 4.825000e+00 6.184630e+02 6.739000e+00 2.490100e+01
LeafDryMass.single_mean CWV 0.000 1122.020 5685.286 2.727744e+04 7.538101e+09 3.353393e+05 3.177519e+07
Leaffreshmass_mean CWV 0.000 0.020 0.090 4.560000e-01 4.203363e+04 2.001000e+00 1.052830e+02
LeafLength_mean CWV 0.000 1015.478 2309.007 4.959933e+03 1.631258e+10 2.256041e+05 2.560740e+07
LeafN_mean CWV 0.000 13.433 24.019 3.836400e+01 2.717296e+05 3.433700e+01 3.758840e+02
LeafNperArea_mean CWV 0.000 0.066 0.125 2.490000e-01 1.121773e+04 8.360000e-01 2.818400e+01
LeafP_mean CWV 0.000 0.148 0.281 5.140000e-01 1.021494e+09 3.970267e+06 4.166663e+07
LeafThickness_mean CWV 0.000 0.002 0.005 1.300000e-02 3.417576e+03 2.596000e+00 3.067900e+01
LeafWaterCont_mean CWV 0.000 1.478 2.889 4.857000e+00 4.497611e+04 3.088160e+02 2.570454e+03
LeafWidth_mean CWV 0.000 0.502 1.794 4.775000e+00 3.023911e+08 1.185149e+05 3.112720e+06
PlantHeight_mean CWV 0.000 0.022 0.101 2.662500e+01 8.584270e+02 2.283900e+01 4.709200e+01
RootingDepth_mean CWV 0.000 0.043 0.102 2.490000e-01 1.862532e+10 1.262666e+04 1.446747e+07
SeedGerminationRate_mean CWV 0.000 27.071 51.189 9.250700e+01 7.015952e+03 8.111900e+01 1.343330e+02
Seed.length_mean CWV 0.000 0.562 1.342 4.901000e+00 2.966463e+07 6.835872e+04 1.188197e+06
SeedMass_mean CWV 0.000 0.746 9.161 1.458790e+03 2.901391e+13 2.068233e+09 6.451240e+10
Seed.num.rep.unit_mean CWV 0.000 3851346.556 69398278.752 1.292193e+09 5.087198e+42 2.717614e+37 9.617583e+39
SLA_mean CWV 0.000 19.789 36.780 7.476200e+01 7.533142e+06 5.343016e+04 4.346775e+05
SpecificRootLength_mean CWV 0.000 11016253.868 35073771.403 7.739769e+07 2.358521e+18 6.608033e+15 9.503399e+16
Stem.cond.dens_mean CWV 0.000 1898.440 4411.973 1.036987e+04 3.520818e+13 1.439290e+09 8.830036e+10
StemConduitDiameter_mean CWV 0.000 220.008 493.755 1.054666e+03 1.065354e+15 2.239858e+09 1.359263e+12
StemDens_mean CWV 0.000 0.005 0.010 1.500000e-02 1.447000e+00 1.600000e-02 5.400000e-02
StemDiam_mean CWV 0.000 0.000 0.001 9.000000e-03 7.518611e+04 2.730000e-01 9.320000e+01
WoodFiberLength_mean CWV 0.000 28041.463 50942.454 8.603931e+04 1.868337e+13 1.936939e+08 2.399085e+10
Wood.vessel.length_mean CWV 0.000 13361.834 26348.170 5.199048e+04 7.096939e+11 1.569152e+06 9.053488e+08

3 Classify plots in is.forest or is.non.forest based on species traits

sPlot has two independent systems for classifying plots into vegetation types. The first relies on the expert opinion of data contributors and classifies plots into broad habitat types. These broad habitat types are coded using five dummy variables that are not mutually exclusive:
1) Forest
2) Grassland
3) Shrubland
4) Sparse vegetation
5) Wetland
A plot may belong to more than one formation, e.g. a savannah is categorized as Forest + Grassland (FG). Unfortunately, this classification is not consistently available across plots: the large majority of classified plots are in Europe.
The remaining unclassified plots therefore need at least a coarse classification. To this end, sPlot has used, since v2.1, a classification into forest and non-forest based on the share of trees and the layering of vegetation. Here, we derive the (mutually exclusive) is.forest and is.non.forest classification of plots.

3.1 Derive species level information on Growth Forms

We used different sources of information:
1) Data from the gap-filled trait matrix
2) Manual cleaning of the most common species for which growth form info is not available
3) Data from TRY (public dataset only) on all species with growth form info (Trait ID = 42)
4) Cross-match with species assigned to tree layer in DT table.

Step 1: Attach growth form trait information to DT table. Growth form information derives from TRY

DT.gf <- DT2 %>% 
  filter(Taxon_group=="Vascular plant") %>% 
  #join with try names, using resolved species names as key
  left_join(try.species.names %>% 
              dplyr::select(Name_short, GrowthForm) %>% 
              rename(Species=Name_short) %>% 
              distinct(Species, .keep_all=T), 
            by="Species") %>% 
  left_join(try.species.means %>% 
              dplyr::select(Name_short, PlantHeight_mean) %>% 
              rename(Species=Name_short), 
            by="Species")
# number of records without Growth Form info
sum(is.na(DT.gf$GrowthForm))
## [1] 4997688

Step 2: Select the most common species without growth-form information to export and check manually

top.gf.nas <- DT.gf %>% 
  filter(is.na(GrowthForm)) %>% 
  group_by(Species) %>% 
  summarize(n=n()) %>% 
  arrange(desc(n))
write_csv(top.gf.nas %>% 
            filter(n>1000), 
  path="../_derived/Species_missingGF.csv")

The first 47569 species account for 56.57% of the missing records. Assign growth forms manually, reimport and coalesce into DT.gf
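
As an illustration only, a share of this kind can be derived from a per-species count table. The sketch below uses invented counts and an illustrative threshold (the report exports species with n>1000). The names top.na, threshold and share are hypothetical and not part of the original workflow.

```r
# Hypothetical counts of records lacking growth form, per species
top.na <- data.frame(Species = c("sp.A", "sp.B", "sp.C", "sp.D"),
                     n = c(50, 30, 15, 5))
threshold <- 10   # illustrative value only

# Species above the threshold and their share of all missing records
selected   <- top.na[top.na$n > threshold, ]
n.selected <- nrow(selected)
share      <- 100 * sum(selected$n) / sum(top.na$n)

print(c(n.selected = n.selected, share.percent = share))
```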

# Import manually classified species - this info is also reported in Appendix 1
gf.manual <- read_csv("../_derived/Species_missingGF_complete.csv")
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   species = col_character(),
##   GrowthForm = col_character()
## )
DT.gf <- DT.gf %>% 
  left_join(gf.manual %>% 
              rename(GrowthForm.m=GrowthForm, Species=species),
            by="Species") %>% 
   mutate(GrowthForm=coalesce(GrowthForm, GrowthForm.m)) %>% 
   dplyr::select(-GrowthForm.m)

After manual completion, the number of records without growth form information decreased to 2331724.

Step 3: Import additional data on growth-form from TRY (Accessed 10 March 2020).
All public data on growth form were downloaded. Unmatched quotation marks in the txt file must first be escaped; this is done from the command line.

# escape all unmatched quotation marks. Run in Linux terminal
#sed 's/"/\\"/g' 8854.txt > 8854_test.txt
#sed "s/'/\\'/g" 8854_test.txt > 8854_test2.txt

Information on growth form is not standardized and has a myriad of levels. Extract and simplify to the few types used so far. When a species is attributed to multiple growth forms, use a majority vote.
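
The simplification relies on successive pattern replacements, so the order of the rules matters: a later rule overrides an earlier one when a string matches both. The following is a minimal base-R sketch on toy strings. It uses grepl instead of stringr and only a subset of the patterns, and simplify.gf is a hypothetical helper introduced here for illustration.

```r
# Hypothetical helper: reduced pattern set, same rule order as in the report
simplify.gf <- function(gf0){
  gf0 <- tolower(gf0)
  out <- gf0
  out[grepl("vine|climber|liana|epiphyte|aquatic", gf0)] <- "other"
  out[grepl("tree|conifer|^woody$", gf0)]               <- "tree"
  out[grepl("shrub|scrub|bamboo", gf0)]                 <- "shrub"
  out[grepl("herb|graminoid|fern|forb|grass", gf0)]     <- "herb"
  # anything not mapped to one of the four types is set to NA
  out[!out %in% c("other", "herb", "shrub", "tree")] <- NA
  out
}

toy <- c("Tree", "shrub/tree", "herbaceous vine", "Liana", "unknown")
simplified <- simplify.gf(toy)
print(data.frame(original = toy, simplified = simplified))
```

In this sketch "shrub/tree" ends up as "shrub" and "herbaceous vine" as "herb", because the shrub and herb rules are applied after the tree and other rules.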

all.gf0 <- read_delim("../_input/TRY5.0_v1.1/8854_test2.txt", delim="\t") 

all.gf <- all.gf0 %>% 
  filter(TraitID==42) %>% 
  distinct(AccSpeciesName, OrigValueStr) %>% 
  rename(GrowthForm0=OrigValueStr) %>% 
  mutate(GrowthForm0=tolower(GrowthForm0)) %>%
  filter(AccSpeciesName %in% sPlot.species$Species) %>% 
  mutate(GrowthForm_simplified= GrowthForm0) %>% 
  mutate(GrowthForm_simplified=replace(GrowthForm_simplified, 
                                       list=str_detect(GrowthForm0,
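                                       # build the pattern with paste0(): a line break inside the string 
                                       # would become part of the regular expression
                                       paste0("vine|climber|liana|carnivore|epiphyte|^succulent|lichen|parasite|",
                                              "hydrohalophyte|aquatic|cactous|parasitic|hydrophytes|carnivorous")), 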
                                       "other")) %>%
  mutate(GrowthForm_simplified=replace(GrowthForm_simplified, 
                                       list=str_detect(GrowthForm0, 
                                                       "tree|conifer|^woody$|palmoid|mangrove|gymnosperm"), 
                                       "tree")) %>% 
  mutate(GrowthForm_simplified=replace(GrowthForm_simplified, 
                                       list=str_detect(GrowthForm0, "shrub|scrub|bamboo"), "shrub")) %>%
  mutate(GrowthForm_simplified=replace(GrowthForm_simplified, 
                                       list=str_detect(GrowthForm0,
                                                       "herb|sedge|graminoid|fern|forb|herbaceous|grass|chaemaephyte|geophyte|annual"),
                                       "herb")) %>%
  mutate(GrowthForm_simplified=ifelse(GrowthForm_simplified %in% c("other", "herb", "shrub", "tree"), 
                                      GrowthForm_simplified, NA)) %>% 
  filter(!is.na(GrowthForm_simplified)) 

#Some species have multiple attributions - use a majority vote. Ties return "Unknown" and are filtered out below
get.mode <- function(x){
  if(length(unique(x))==1){
    return(as.character(unique(x)))} else{
    tmp <- sort(table(x), decreasing=T)
    if(tmp[1]!=tmp[2]){return(names(tmp)[1])} else {
    return("Unknown")}
    }
  }

all.gf <- all.gf %>% 
  group_by(AccSpeciesName) %>% 
  summarize(GrowthForm_simplified=get.mode(GrowthForm_simplified)) %>% 
  filter(GrowthForm_simplified!="Unknown")

table(all.gf$GrowthForm_simplified, exclude=NULL)  
## 
##  herb other shrub  tree 
## 21467  3429  7406  9194
#coalesce this info into DT.gf
DT.gf <- DT.gf %>% 
  left_join(all.gf %>% 
              rename(Species=AccSpeciesName), 
            by="Species") %>% 
  mutate(GrowthForm=coalesce(GrowthForm, GrowthForm_simplified)) %>% 
  dplyr::select(-GrowthForm_simplified)
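
The behaviour of the majority vote can be checked on toy vectors. The sketch below repeats get.mode (reformatted, same logic) so that it is self-contained, and applies it to three invented cases: a single attribution, a clear majority and a tie.

```r
# Majority vote with ties flagged as "Unknown" (same logic as get.mode above)
get.mode <- function(x){
  if (length(unique(x)) == 1){
    return(as.character(unique(x)))
  } else {
    tmp <- sort(table(x), decreasing = TRUE)
    if (tmp[1] != tmp[2]) return(names(tmp)[1]) else return("Unknown")
  }
}

# Invented growth-form attributions for three hypothetical species
votes <- list(single   = c("herb"),
              majority = c("tree", "tree", "shrub"),
              tie      = c("tree", "shrub"))
res <- sapply(votes, get.mode)
print(res)
```

Only the two most frequent classes are compared, which is sufficient to detect a tie for first place.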

Step 4: Cross-match. Classify as tree every species still lacking growth form information that occurs in the tree layer of at least one relevé. Conservatively, do this only when the record is at species level (exclude records at genus\family level)

other.trees <- DT.gf %>% 
  filter(Layer==1 & is.na(GrowthForm)) %>% 
  filter(Rank_correct=="species") %>% 
  distinct(Species, Layer, GrowthForm) %>% 
  pull(Species)

DT.gf <- DT.gf %>% 
  mutate(GrowthForm=replace(GrowthForm, 
                            list=Species %in% other.trees, 
                            values="tree"))

After cross-matching, the number of records without growth form information decreased to 1264620.

Average height per growth form
GrowthForm Height
herb 0.520
herb/shrub 1.982
herb\shrub 1.522
herb/shrub/tree 5.241
other 4.550
shrub 2.351
shrub/tree 5.094
shrub\tree 4.644
tree 13.077
NA 2.507

Classify species as tree or tall shrub vs. other. Make a compact table of species growth forms and create the fields is.tree.or.tall.shrub and is.not.tree.or.shrub.and.small.
Define a species as is.tree.or.tall.shrub when it is either classified as a tree OR has a height >=10.
Define a species as is.not.tree.or.shrub.and.small when it has a height <10, as long as it is not classified as a tree. When height is not available, it is sufficient that the species is classified as "herb" or "other".

GF <- DT.gf %>% 
  distinct(Species, GrowthForm, PlantHeight_mean) %>% 
  mutate(GrowthForm=fct_collapse(GrowthForm, 
                                 "herb/shrub"=c("herb\\shrub","herb/shrub"), 
                                 "shrub/tree"=c("shrub/tree", "shrub\\tree"))) %>% 
  ## define is.tree.or.tall
  mutate(is.tree.or.tall.shrub=NA) %>% 
  mutate(is.tree.or.tall.shrub=replace(is.tree.or.tall.shrub, 
                                       list=str_detect(GrowthForm, "tree"), 
                                       T)) %>% 
  mutate(is.tree.or.tall.shrub=replace(is.tree.or.tall.shrub, 
                                       list=PlantHeight_mean>=10, 
                                       T)) %>% 
  ## define is.not.tree.or.shrub.and.small 
  mutate(is.not.tree.or.shrub.and.small=NA) %>% 
  mutate(is.not.tree.or.shrub.and.small=replace(is.not.tree.or.shrub.and.small,
                                       list=PlantHeight_mean<10, 
                                       T)) %>% 
  mutate(is.not.tree.or.shrub.and.small=replace(is.not.tree.or.shrub.and.small,
                                       list=is.na(PlantHeight_mean) & str_detect(GrowthForm, "herb|other"), 
                                       T)) %>%   
  ## use each field in turn to define which of the records in the other is F
  mutate(is.not.tree.or.shrub.and.small=replace(is.not.tree.or.shrub.and.small,
                                       list= is.tree.or.tall.shrub==T,
                                       F)) %>% 
  mutate(is.tree.or.tall.shrub=replace(is.tree.or.tall.shrub,
                                       list= is.not.tree.or.shrub.and.small==T,
                                       F)) %>% 
  ## drop redundant field
  dplyr::select(-is.not.tree.or.shrub.and.small)
  

## cross-check classification  
table(GF$GrowthForm, GF$is.tree.or.tall.shrub, exclude=NULL)
##                  
##                   FALSE  TRUE  <NA>
##   herb            22430     2     0
##   herb/shrub         47     1     0
##   herb/shrub/tree     0     2     0
##   other            1646    42     0
##   shrub            5410    93  2323
##   shrub/tree          0   133     0
##   tree                0 13458     0
##   <NA>              818    50 26691
## Check for herb species classified as trees
GF %>% 
  filter(is.tree.or.tall.shrub & GrowthForm=="herb")
## # A tibble: 2 x 4
##   Species                   GrowthForm PlantHeight_mean is.tree.or.tall.shrub
##   <chr>                     <fct>                 <dbl> <lgl>                
## 1 Phyllostachys bambusoides herb                   16.6 TRUE                 
## 2 Bambusa vulgaris          herb                   14.2 TRUE

These are bamboo species and their heights seem reasonable.
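
The decision rules of this section can be restated compactly. The following base-R sketch is an illustrative reformulation, not the code used to build GF. classify.gf is a hypothetical helper, and the inputs are invented. It returns TRUE for trees or tall species, FALSE for small non-tree species, and NA when neither rule applies (e.g. a shrub without height data).

```r
# Hypothetical helper reproducing the rules on growth form and height
classify.gf <- function(growth.form, height){
  is.tree <- !is.na(growth.form) & grepl("tree", growth.form)
  is.tall <- !is.na(height) & height >= 10
  is.small <- (!is.na(height) & height < 10) |
    (is.na(height) & !is.na(growth.form) & grepl("herb|other", growth.form))
  out <- rep(NA, length(growth.form))
  out[is.small] <- FALSE
  out[is.tree | is.tall] <- TRUE   # tree or tall overrides small
  out
}

# Invented examples
gf     <- c("tree", "shrub", "herb", NA, "shrub", "herb")
height <- c(5,      NA,      NA,     12, 15,      0.5)
is.tree.or.tall.shrub <- classify.gf(gf, height)
print(data.frame(gf, height, is.tree.or.tall.shrub))
```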

3.2 Classify plots into forest\non-forest

Define a plot as forest if:
1) Has a total cover of the tree layer >=25% (from header)
2) Has a total cover in Layer 1 >=25% (from DT)
3) Has a total cover of tree or tall shrub species >=25% (from DT + TRY)
4) Has data on Basal area summing to 10 m2/ha

The first three criteria are declined to define non forest as follows:
1) Info on total cover of the tree layer is available and <25%
2) Info on total cover in Layer 1 is available and <25%
3) The relative cover of non tree species is >75%

Criteria 2 and 3 only apply to plots having cover data in percentage.
Reimport header file

load("../_output/header_sPlot3.0.RData")

Criterion 1

plot.vegtype1 <- header %>% 
  dplyr::select(PlotObservationID, `Cover tree layer (%)`) %>% 
  rename(Cover_trees=`Cover tree layer (%)`) %>% 
  mutate(is.forest=Cover_trees>=25) 

table(plot.vegtype1 %>% dplyr::select(is.forest), exclude=NULL)
## 
##   FALSE    TRUE    <NA> 
##   26211  191833 1759593

Criterion 2

# Select only plots having cover data in percentage
mysel <- (DT.gf %>% 
            distinct(PlotObservationID, Ab_scale) %>% 
            group_by(PlotObservationID) %>% 
            summarize(AllCovPer=all(Ab_scale=="CoverPerc")) %>% 
            filter(AllCovPer==T) %>% 
            pull(PlotObservationID))
# Excluded plots
nrow(header)-length(mysel)
## [1] 294880
plot.vegtype2 <- DT.gf %>% 
  filter(PlotObservationID %in% mysel ) %>% 
  filter(Layer %in% c(1,2,3)) %>% 
  # first sum the cover of all species in a layer
  group_by(PlotObservationID, Layer) %>% 
  summarize(Cover_perc=sum(Abundance)) %>% 
  # then combine cover across layers
  group_by(PlotObservationID) %>% 
  summarize(Cover_perc=combine.cover(Cover_perc)) %>% 
  mutate(is.forest=Cover_perc>=25) 
## `summarise()` has grouped output by 'PlotObservationID'. You can override using the `.groups` argument.
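# NB: as originally run, this call tabulated plot.vegtype1, so the counts printed below repeat criterion 1
table(plot.vegtype2 %>% dplyr::select(is.forest), exclude=NULL)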
## 
##   FALSE    TRUE    <NA> 
##   26211  191833 1759593
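
The helper combine.cover() used above is defined elsewhere in this report and is not shown in this section. As a minimal sketch, assuming it combines percentage covers under random overlap (each additional layer covers the same fraction of the ground that is still uncovered), it could look like the following. The name combine_cover_sketch is hypothetical and the actual helper may differ.

```r
# Hypothetical stand-in for combine.cover(); NOT the report's own definition.
# Assumes x holds covers in percent (0-100) and that layers overlap at random.
combine_cover_sketch <- function(x) {
  100 * (1 - prod(1 - x / 100))
}

# Two layers of 50% each leave 25% of the ground uncovered
combine_cover_sketch(c(50, 50))
# A layer with 0% cover does not change the result
combine_cover_sketch(c(20, 10, 0))
```

Under this assumption the combined cover never exceeds 100%, unlike a plain sum across layers.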

Criterion 3

plot.vegtype3 <- DT.gf %>% 
  #filter plots where all records are recorded as percentage cover
  filter(PlotObservationID %in% mysel ) %>% 
  # combine cover across layers
  group_by(PlotObservationID, Species) %>%
  summarize(cover_perc=combine.cover(Abundance)) %>%
  ungroup() %>% 
  # attach species Growth Form information
  left_join(GF, by="Species")%>% 
  group_by(PlotObservationID) %>% 
  summarize(cover_tree=sum(cover_perc*is.tree.or.tall.shrub, na.rm=T), 
            cover_non_tree=sum(cover_perc*(!is.tree.or.tall.shrub), na.rm=T), 
            cover_unknown=sum(cover_perc* is.na(is.tree.or.tall.shrub))) %>% 
  rowwise() %>% 
  ## classify plots based on cover of different growth forms
  mutate(tot.cover=sum(cover_tree, cover_non_tree, cover_unknown, na.rm=T)) %>% 
  mutate(is.forest=cover_tree>=25) %>% 
  mutate(is.non.forest=cover_tree<25 & (cover_non_tree/tot.cover)>.75)
## `summarise()` has grouped output by 'PlotObservationID'. You can override using the `.groups` argument.
table(plot.vegtype3 %>% dplyr::select(is.forest, is.non.forest), exclude=NULL)
##          is.non.forest
## is.forest   FALSE    TRUE    <NA>
##     FALSE   72087 1136069      10
##     TRUE   474591       0       0
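
The table shows that some plots are neither forest nor non-forest under this rule. A small sketch with made-up cover values (the data frame toy is hypothetical) illustrates how this happens when a large share of the cover belongs to species of unknown growth form:

```r
# Toy plots with hypothetical values: summed cover (%) by growth-form class
toy <- data.frame(
  plot           = c("A", "B", "C"),
  cover_tree     = c(60, 5, 10),
  cover_non_tree = c(30, 90, 40),
  cover_unknown  = c(10, 5, 50)
)
toy$tot.cover     <- toy$cover_tree + toy$cover_non_tree + toy$cover_unknown
# same thresholds as in the chunk above
toy$is.forest     <- toy$cover_tree >= 25
toy$is.non.forest <- toy$cover_tree < 25 & (toy$cover_non_tree / toy$tot.cover) > 0.75
toy[, c("plot", "is.forest", "is.non.forest")]
```

Plot A is forest and plot B is non-forest. Plot C has too little tree cover to be forest, but only 40% of its cover is known to be non-tree, so both flags are FALSE.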

Criterion 4

plot.vegtype4 <-  DT.gf %>% 
  filter(Ab_scale=="x_BA") %>% 
  group_by(PlotObservationID) %>% 
  summarize(tot.ba=sum(Abundance)) %>% 
  mutate(is.forest=tot.ba>10)

table(plot.vegtype4 %>% dplyr::select(is.forest), exclude=NULL)
## 
## FALSE  TRUE 
##  1358  5558

Combine classifications from the four criteria. Use a majority vote to assign plots. In case of ties, priority decreases from criterion 1 to criterion 4.

plot.vegtype <- header %>% 
  dplyr::select(PlotObservationID) %>% 
  left_join(plot.vegtype1 %>% 
              dplyr::select(PlotObservationID, is.forest), 
            by="PlotObservationID") %>% 
  left_join(plot.vegtype2 %>% 
              dplyr::select(PlotObservationID, is.forest), 
            by="PlotObservationID") %>% 
  left_join(plot.vegtype3 %>% 
              dplyr::select(PlotObservationID, is.forest, is.non.forest) %>% 
              rename(is.non.forest.x.x=is.non.forest), 
            by="PlotObservationID") %>% 
  left_join(plot.vegtype4 %>% 
              dplyr::select(PlotObservationID, is.forest), 
            by="PlotObservationID") %>% 
  ## assign vegtype based on majority vote. In case of ties use the order of criteria as ranking
  rowwise() %>% 
  mutate(mean.forest=mean(c(is.forest.x, is.forest.y, is.forest.x.x, is.forest.y.y), na.rm=T)) %>% 
  mutate(mean.forest2=coalesce(is.forest.x, is.forest.y, is.forest.x.x, is.forest.y.y)) %>% 
  mutate(is.forest=ifelse(mean.forest==0.5, mean.forest2, mean.forest>0.5)) %>%  
  # same for is.non.forest
  mutate(mean.non.forest=mean(c( (!is.forest.x), (!is.forest.y), is.non.forest.x.x, (!is.forest.y.y)), na.rm=T)) %>% 
  mutate(mean.non.forest2=coalesce( (!is.forest.x), (!is.forest.y), is.non.forest.x.x, (!is.forest.y.y))) %>% 
  mutate(is.non.forest=ifelse(mean.non.forest==0.5, mean.non.forest2, mean.non.forest>0.5)) %>% 
  # when both is.forest & is.non.forest are F transform to NA
  mutate(both.F=ifelse( ( (is.forest==F | is.na(is.forest)) & is.non.forest==F), T, F)) %>% 
  mutate(is.forest=replace(is.forest, list=both.F==T, values=NA)) %>% 
  mutate(is.non.forest=replace(is.non.forest, list=both.F==T, values=NA))

table(plot.vegtype %>% dplyr::select(is.forest, is.non.forest), exclude=NULL)
##          is.non.forest
## is.forest   FALSE    TRUE    <NA>
##     FALSE       0 1160476       7
##     TRUE   468259       0       0
##     <NA>        0       0  348895
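
The voting rule can be illustrated on single plots. The sketch below mirrors the mean/coalesce logic of the chunk above in base R; the function name vote_forest is hypothetical and not part of the workflow.

```r
# Hypothetical helper: majority vote over logical votes ordered by priority
# (criterion 1 first). Ties are resolved by the first non-missing vote.
vote_forest <- function(votes) {
  m <- mean(votes, na.rm = TRUE)   # share of TRUE among the available votes
  if (is.nan(m)) return(NA)        # no criterion available for this plot
  if (m == 0.5) votes[!is.na(votes)][1] else m > 0.5
}

vote_forest(c(TRUE, NA, FALSE, NA))     # tie: the higher-priority criterion decides
vote_forest(c(NA, FALSE, FALSE, TRUE))  # two votes against one
vote_forest(c(NA, NA, NA, NA))          # no information
```

Taking the first non-missing vote plays the same role as coalesce() in the pipeline: it only matters when the available votes are evenly split.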

3.3 Cross-check and validate

Cross-check with sPlot’s 5-class (incomplete) native classification provided by data contributors. Build a confusion matrix.

cross.check <- header %>% 
  dplyr::select(PlotObservationID, Forest) %>% 
  left_join(plot.vegtype %>% 
              dplyr::select(PlotObservationID, is.forest, is.non.forest) %>% 
              rename(Forest=is.forest, Other=is.non.forest) %>% 
              gather(isfor_isnonfor, value, -PlotObservationID) %>% 
              filter(value==T) %>% 
              dplyr::select(-value), 
            by="PlotObservationID") %>% 
  mutate(Other=1*Forest!=1) %>% 
  gather(veg_type, value, -PlotObservationID, -isfor_isnonfor) %>% 
  filter(value==1) %>% 
  dplyr::select(-value)

#Build a confusion matrix to evaluate the comparison  
u <- union(cross.check$isfor_isnonfor, cross.check$veg_type)
t <- table( factor(cross.check$isfor_isnonfor, u), factor(cross.check$veg_type, u))
confm <- caret::confusionMatrix(t)
Confusion matrix between sPlot’s native classification of habitats (columns), and classification based on four criteria based on vegetation layers and growth forms (rows)
          Forest    Other
Forest    381411    25588
Other      28020   973463
Formulas for the associated statistics are available on the help page of the caret package and in the associated references. The overall accuracy of the classification based on is.forest\is.non.forest, when tested against sPlot’s native habitat classification, is 0.96; the Kappa statistic is 0.91.
Associated statistics of confusion matrix by class
Statistic               Value
Sensitivity             0.9315636
Specificity             0.9743877
Pos Pred Value          0.9371301
Neg Pred Value          0.9720215
Precision               0.9371301
Recall                  0.9315636
F1                      0.9343385
Prevalence              0.2906896
Detection Rate          0.2707958
Detection Prevalence    0.2889629
Balanced Accuracy       0.9529756
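
The accuracy, Kappa, sensitivity and specificity reported above can be recomputed by hand from the four counts of the confusion matrix. The sketch below uses base R only and the standard definitions; the object names (cm, p.exp, kappa.stat) are chosen here for illustration.

```r
# Counts copied from the confusion matrix above
# rows = classification based on the four criteria, columns = sPlot native classification
cm <- matrix(c(381411, 28020, 25588, 973463), nrow = 2,
             dimnames = list(predicted = c("Forest", "Other"),
                             reference = c("Forest", "Other")))
n        <- sum(cm)
accuracy <- sum(diag(cm)) / n
# agreement expected by chance, from row and column totals
p.exp      <- sum(rowSums(cm) * colSums(cm)) / n^2
kappa.stat <- (accuracy - p.exp) / (1 - p.exp)
# "Forest" is treated as the positive class
sensitivity <- cm["Forest", "Forest"] / sum(cm[, "Forest"])
specificity <- cm["Other", "Other"] / sum(cm[, "Other"])
round(c(accuracy = accuracy, kappa = kappa.stat,
        sensitivity = sensitivity, specificity = specificity), 4)
```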
#check 
nrow(header.vegtype)==nrow(header)
## [1] TRUE

Through the process described above, we classified 1628735 plots, of which 468259 are forest and 1160476 are non-forest.
The total number of plots attributed to forest\non-forest (either from sPlot’s native classification or from the process above) is 1726506.

4 Export and update other objects

sPlot.traits <- sPlot.species %>% 
  arrange(Species) %>% 
  left_join(GF %>% 
              dplyr::select(Species, GrowthForm, is.tree.or.tall.shrub), 
            by="Species") %>% 
  left_join(try.combined.means %>% 
              rename(Species=Taxon_name), by="Species") %>% 
  ## some entries are duplicated (both at species and Genus level)
  ## Keep only genus-level averages
  group_by(Species) %>% 
  arrange(desc(n)) %>% 
  slice(1) %>% 
  ungroup() %>% 
  dplyr::select(-Rank_correct)
  
save(try.combined.means, CWM, sPlot.traits, trait.legend, file="../_output/Traits_CWMs_sPlot3.RData")

header <- header %>% 
  left_join(plot.vegtype %>% 
              dplyr::select(PlotObservationID, is.forest, is.non.forest),
            by="PlotObservationID") %>% 
  dplyr::select(PlotObservationID:ESY, is.forest:is.non.forest, everything())

save(header, file="../_output/header_sPlot3.0.RData")

APPENDIX

Appendix 1 - Growth forms of most common species

As assigned manually.

cat(readLines("../_derived/Species_missingGF_complete.csv"), sep = '\n')
species,GrowthForm
Taraxacum,herb
Quercus robur,tree
Corylus avellana,tree
Frangula alnus,shrub
Festuca ovina,herb
Vaccinium vitis-idaea,shrub
NA,NA
Rubus,shrub
Capsella bursa-pastoris,herb
Salix cinerea,tree
Solanum dulcamara,herb
Tripolium pannonicum,herb
Impatiens noli-tangere,herb
Ononis spinosa,shrub
Centaurea nigra,herb
Rubus ulmifolius,shrub
Alisma plantago-aquatica,herb
Spirodela polyrhiza,herb
Salix,NA
Helictochloa pratensis,herb
Ruscus aculeatus,shrub
Lophozonia,tree
Stachys recta,herb
Crataegus laevigata,shrub
Festuca rupicola,herb
Metrosideros diffusa,herb
Rhamnus cathartica,shrub\tree
Helianthemum oelandicum,herb
Dicksonia squarrosa,herb
Rosa,shrub
Carex viridula,herb
Podocarpus spinulosus,shrub
Pinus mugo,tree
Orthilia secunda,herb
Cyathea smithii,tree
Erica arborea,shrub\tree
Hippocrepis emerus,herb
Phillyrea latifolia,tree
Triglochin palustris,herb
Metrosideros fulgens,other
Apera spica-venti,herb
Crataegus,shrub
Blechnum discolor,herb
Blechnum novae-zelandiae,herb
Tragopogon pratensis,herb
Bellidiastrum michelii,herb
Sedum album,herb
Raphanus raphanistrum,herb
Quercus coccifera,tree
Quercus mongolica,tree
Hydrocharis morsus-ranae,herb
Camellia japonica,shrub\tree
Arbutus unedo,shrub\tree
Dactylorhiza majalis,herb
Trachelospermum asiaticum,other
Myosotis laxa,herb
Valeriana crispa,herb
Hieracium lachenalii,herb
Festuca drymeja,herb
Asplenium flaccidum,herb
Rubus australis,other
Adenostyles alpina,herb
Viola,herb
Hymenophyllum demissum,herb
Hieracium,herb
Senecio nemorensis,herb
Lemna,herb
Microsorum pustulatum,herb
Epilobium ciliatum,herb
Paederia foetida,herb
Ledum palustre,shrub
Arctostaphylos uva-ursi,shrub
Poaceae,herb
Epilobium,herb
Alchemilla,herb
Genista sagittalis,shrub
Blechnum nipponicum,herb
Biscutella laevigata,herb
Galeopsis,herb
Ribes uva-crispa,shrub
Prunus mahaleb,shrub\tree
Asparagus officinalis,shrub
Disporum smilacinum,herb
Brunella vulgaris,herb
Veronica anagallis-aquatica,herb
Rhododendron kaempferi,shrub
Festuca,herb
Lipandra polysperma,herb
Sedum rupestre,herb
Helictochloa versicolor,herb
Hymenophyllum nephrophyllum,herb
Cephalotaxus harringtonia,shrub\tree
Helleborus odorus,herb
Hyacinthoides non-scripta,herb
Artemisia maritima,shrub
Helictochloa bromoides,herb
Salix euxina,tree
Viburnum furcatum,shrub
Hymenophyllum multifidum,herb
Asplenium bulbiferum,herb
Cotinus coggygria,shrub
Juniperus phoenicea,shrub\tree
Artemisia indica,herb
Pieris japonica,shrub\tree
Genista scorpius,shrub
Viburnum wrightii,shrub
Ampelopsis glandulosa,other
Potentilla pusilla,herb
Blechnum fluviatile,herb
Rubus palmatus,shrub
Artemisia santonicum,herb\shrub
Senecio leucanthemifolius,herb
Thymus,herb
Solidago canadensis,herb
Echinops ritro,herb
Seseli elatum,herb
Cymbidium goeringii,herb
Pleioblastus argenteostriatus,herb
Reynoutria japonica,herb
Rubus angloserpens,shrub
Noccaea,herb
Smilax glauca,other
Polystichum spinulosum,herb
Scirpus maritimus,herb
Luzuriaga parviflora,herb
Bryonia cretica,other
Kadsura japonica,other
Betula,tree
Carex goodenoughii,herb
Thymus longicaulis,herb
Thelypteris limbosperma,herb
Callitriche,herb
Salix pentandra,tree
Chenopodiastrum murale,herb
Quercus,tree
Parthenocissus tricuspidata,other
Aria alnifolia,tree
Callicarpa mollis,shrub
Amaranthus hybridus,herb
Leptospermum scoparium,shrub\tree
Corylus sieboldiana,shrub
Pittosporum tobira,shrub\tree
Torilis arvensis,herb
Zanthoxylum bungeanum,shrub\tree
Crepis vesicaria,herb
Dioscorea tokoro,herb
Leptopteris superba,herb
Cyanus montanus,herb
Prunus cerasifera,shrub\tree
Salix appendiculata,shrub
Lathyrus laxiflorus,herb
Galeopsis ladanum,herb
Ericameria nauseosa,shrub
Cyclamen hederifolium,herb
Hymenophyllum revolutum,herb
Dendropanax trifidus,shrub\tree
Lastreopsis hispida,herb
Pilosella hoppeana,herb
Vandasina retusa,other
Oxybasis rubra,herb
Dianthus hyssopifolius,herb
Clinopodium nepeta,herb
Cardamine glanduligera,herb
Chamaesyce peplis,herb
Pueraria montana,other
Alyssum turkestanicum,herb
Minuartia sedoides,herb
Cyanus triumfettii,herb
Cyclosorus pozoi,herb
Cyclamen repandum,herb
Astilbe thunbergii,herb
Anthyllis montana,herb
Mitchella undulata,herb
Krascheninnikovia ceratoides,shrub
Dioscorea japonica,other
Sibbaldianthe bifurca,herb
Tripterospermum trinervium,NA
Cerasus jamasakura,tree
Hierochloe repens,herb
Festuca gautieri,herb
Salicornia perennans,herb
Salix atrocinerea,tree
Agrostis,herb
Oxybasis glauca,herb
Saxifraga exarata,herb
Hymenophyllum flabellatum,herb
Salix viminalis,shrub
Sasa borealis,herb\shrub
Puccinellia festuciformis,herb
Symplocos sawafutagi,shrub
Athyrium yokoscense,herb
Rubus buergeri,shrub
Prunus leveilleana,tree
Pertya scandens,shrub
Dioscorea quaternata,other
Cyathea dealbata,shrub\tree
Calamagrostis stricta,herb
Soldanella carpatica,herb
Selinum pyrenaeum,herb
Laurus nobilis,shrub\tree
Ononis natrix,shrub
Farfugium japonicum,herb
Cornus sanguinea,shrub
Vaccinium microcarpum,shrub
Limonium meyeri,herb
Vaccinium japonicum,shrub
Scandix pecten-veneris,herb
Lemmaphyllum microphyllum,herb
Amaranthus blitum,herb
Chimaphila maculata,herb
Euphorbia nicaeensis,herb\shrub
Dodonaea viscosa,shrub\tree
Coprosma microcarpa,shrub
Lomandra multiflora,herb
Microlaena stipoides,herb
Microstegium vimineum,herb
Pteretis struthiopteris,herb
Rumex scutatus,herb
Podospermum canum,herb
Ampelodesmos mauritanicus,herb
Tmesipteris tannensis,herb
Allium carinatum,herb
Hymenophyllum dilatatum,herb
Lindsaea trichomanoides,herb
Pilosella bauhini,herb
Hymenophyllum sanguinolentum,herb
Elaeagnus pungens,shrub
Vitis vinifera,other
Mespilus germanica,shrub\tree
Odontarrhena,NA
Myosotis,herb
Teucrium pyrenaicum,herb
Centaurea thuillieri,herb
Vaccinium smallii,shrub
Hymenophyllum,herb
Carex kitaibeliana,herb
Pogostemon stellatus,herb
Vicia,herb
Quercus dalechampii,tree
Sedum roseum,herb
Stauntonia hexaphylla,other
Pulmonaria affinis,herb
Vaccinium bracteatum,shrub\tree
Lonicera gracilipes,shrub
Dryopteris setosa,herb
Herniaria hirsuta,herb
Aralia elata,shrub\tree
Eurybia divaricata,herb
Hydrangea scandens,shrub
Mentha,herb
Lindera benzoin,shrub
Juniperus virginiana,tree
Ainsliaea acerifolia,herb
x Ammocalamagrostis,NA
Galium,herb
Ligustrum tschonoskii,shrub
Blechnum chambersii,herb
Ulex parviflorus,shrub
Artemisia gmelinii,herb
Paliurus spina-christi,shrub
Luzula,herb
Piper kadsura,other
Polygonum maritimum,herb
Ulmus,tree
Actinidia arguta,other
Chenopodiastrum hybridum,herb
Stemona lucida,other
Rubia tatarica,herb
Vaccinium hirtum,shrub
Rhododendron maximum,shrub
Anisocampium niponicum,herb
Sticherus cunninghamii,herb
Smilax sieboldii,other
Potentilla humifusa,herb
Cyathea colensoi,herb\shrub
Endiandra virens,tree
Polygonum equisetiforme,herb
Dryopteris lacera,herb
Hylodesmum podocarpum,herb
Rumex,herb
Aphananthe aspera,tree
Geranium solanderi,herb
Pseudopanax linearis,shrub
Sedum alpestre,herb
Lepisorus thunbergianus,herb
Aria japonica,tree
Elytrigia repens,herb
Ainsliaea apiculata,herb
Senecio,NA
Schisandra repanda,other
Cardamine,herb
Carex dolichostachya,herb
Potentilla supina,herb
Schizocodon soldanelloides,herb
Rhaphiolepis indica,shrub
Scilla lilio-hyacinthus,herb
Clinopodium menthifolium,herb
Aster,NA
Sasa palmata,herb
Brucea javanica,shrub
Anemone scherfelii,herb
Arundinella hirta,herb
Thymus nervosus,herb
Laportea bulbifera,herb
No suitable,NA
Potentilla montana,herb
Leptopteris hymenophylloides,herb
Solidago,herb
Compositae,NA
Pimpinella tragium,herb
Soldanella hungarica,herb
Leptorumohra mutica,herb
Artemisia pontica,herb
Verbascum,herb
Carex lenta,herb
Fraxinus chinensis,tree
Centranthus ruber,herb
Sesbania sesban,tree
Phormium colensoi,herb
Asparagus aphyllus,herb\shrub
Nasturtium,herb
Carex conica,herb
Lauraceae,NA
Dumasia truncata,other
Pilosella floribunda,herb
Goodenia geniculata,herb
Medicago intertexta,herb
Prunus,shrub\tree
Austrostipa scabra,herb
Juncus,herb
Sempervivum arachnoideum,herb
Thymus striatus,herb
Jasione crispa,herb
Echinochloa crusgalli,herb
Lindera glauca,shrub
Laburnum anagyroides,shrub
Oxalis pes-caprae,herb
Dianella nigra,herb
Jacobaea subalpina,herb
Campanula serrata,herb
Piptatherum coerulescens,herb
Carex pisiformis,herb
Geum sylvaticum,herb
Minuartia recurva,herb
Globularia repens,herb
Fraxinus,tree
Eucalyptus phaenophylla,tree
Osmorhiza aristata,herb
Leguminosae,NA
Helictochloa marginata,herb
Polygonatum lasianthum,herb
Rosa dumalis,shrub
Hymenophyllum scabrum,herb
Puccinellia gigantea,herb
Heloniopsis orientalis,herb
Anthemis cretica,herb
Styrax officinalis,shrub
Hosta sieboldiana,herb
Earina mucronata,herb
Calamagrostis hakonensis,herb
Tragopogon podolicus,herb
Thymus pulcherrimus,herb
Adenophora triphylla,herb
Aster ovatus,herb
Crepis lampsanoides,herb
Panicum boscii,herb
Pluchea dioscoridis,shrub
Amelanchier laevis,tree
Silene pusilla,herb
Eupatorium makinoi,herb
Polyphlebium venosum,herb
Uncinia,herb
Rubia argyi,other
Plagiogyria matsumureana,herb
Dryopteris,herb
Symphytum cordatum,herb
Ononis striata,herb
Allium,herb
Ruscus hypoglossum,shrub
Parathelypteris japonica,herb
Cyrtomium fortunei,herb
Ranunculus taisanensis,herb
Desmodium brachypodum,herb
Carex blepharicarpa,herb
Viburnum phlebotrichum,shrub
Atractylodes ovata,NA
Cichorium pumilum,herb
Ranunculus,herb
Cyperus gracilis,herb
Carex stenostachys,herb
Diplopterygium glaucum,herb
Sesleria rigida,herb
Centaurea,herb
Opuntia,other
Galium octonarium,herb
Pseudowintera axillaris,shrub\tree
Tricyrtis affinis,herb
Asplenium platyneuron,herb
Clematis terniflora,other
Parsonsia heterophylla,other
Raukaua edgerleyi,tree
Dianthus giganteiformis,herb
Viola sieheana,herb
Hosta sieboldii,herb
Sasa nipponica,herb
Cirsium,herb
Arachniodes standishii,NA
Paspalidium geminatum,herb
Alhagi graecorum,shrub
Cuscuta campestris,other
Allium saxatile,herb
Trifolium,herb
Persicaria longiseta,NA
Jacobaea maritima,NA
Acer shirasawanum,tree
Athyrium vidalii,herb
Centaurea nemoralis,herb
Circaea ×,herb
Dactylorhiza,herb
Xanthorrhoea acaulis,other
Cynoglossum,herb
Boehmeria silvestrii,herb\shrub
Serratula coronata,herb
Salix phylicifolia,shrub
Genista depressa,NA
Populus,tree
Phlegmariurus,NA
Atropa bella-donna,herb
Bignonia capreolata,other
Amelanchier,shrub\tree
Launaea nudicaulis,herb
Photinia glabra,tree
Suaeda acuminata,herb
Gonocarpus teucrioides,herb\shrub
Pulsatilla grandis,herb
Sesleria comosa,herb
Patzkea spadicea,herb
Koeleria nitidula,herb
Orobanche crenata,other
Achillea asiatica,herb
Paris tetraphylla,herb
Edraianthus graminifolius,herb
Clematis apiifolia,other
Thelypteris acuminata,herb
Patzkea paniculata,herb
Dichondra,herb
Dryopteris pseudomas,herb
Festuca hystrix,herb
Blechnum minus,herb
Maianthemum japonicum,herb
Millettia japonica,NA
Pteris cretica,herb
Leucanthemum rotundifolium,herb
Pyrrosia eleagnifolia,other
Elionurus citreus,herb
Ochlopoa supina,NA
Crocus veluchensis,herb
Galium maritimum,herb
Crepis albida,herb
Solidago curtisii,herb
Coptis trifolia,herb
Syneilesis palmata,herb
Chenopodium bonus-henricus,herb
Potentilla,herb
Artemisia lerchiana,herb
Lathyrus pisiformis,herb
Euphorbia plumerioides,NA
Ophiopogon planiscapus,herb
Ranunculus aduncus,herb
Scabiosa triniifolia,herb
Viola kusanoana,herb
Rytidosperma linkii,herb
Festuca dalmatica,herb
Berchemia racemosa,shrub
Lespedeza maximowiczii,shrub
Wisteria brachybotrys,NA
Quercus infectoria,shrub\tree
Asarum caucasicum,herb
Centaurea aspera,herb
Lechenaultia filiformis,NA
Tragopogon porrifolius,herb
Athyrium asplenioides,herb
Silene sericea,herb
Scrophularia alpestris,herb
Rhododendron pentandrum,NA
Thymus comosus,herb
Sanicula chinensis,herb
Inula oculus-christi,herb
Lamium,herb
Arachniodes aristata,NA
Onosma simplicissima,NA
Ranunculus pseudomontanus,herb
Corylus cornuta,shrub
Arachniodes sporadosora,NA
Orostachys spinosa,other
Olearia lacunosa,shrub\tree
Carthamus mitissimus,herb
Stewartia pseudocamellia,tree
Eucalyptus indurata,tree
Prosopis glandulosa,shrub\tree
Aurinia saxatilis,herb
Dampiera purpurea,herb\shrub
Cirsium nipponicum,NA
Patrinia villosa,NA
Galium pseudoaristatum,herb
Rhinanthus,herb
Leionema elatius,shrub
Arrhenatherum longifolium,herb
Limonium bellidifolium,herb
Brachiaria whiteana,herb
Adiantum capillus-veneris,herb
Vittadinia cuneata,herb
Carex rhizina,herb
Tephrosia,NA
Leontopodium nivale,herb
Crocus caeruleus,herb
Cuscuta,other
Pyrrosia lingua,herb
Ficaria fascicularis,herb
Pilosella peleteriana,herb
Dinebra decipiens,herb
Psychotria asiatica,shrub
Vicia pyrenaica,herb
Galax urceolata,herb
Aristolochia serpentaria,herb
Sedum brevifolium,herb
Impatiens atrosanguinea,herb
Dapsilanthus ramosus,herb
Nephrodium sabaei,herb
Silene rubella,herb
Blechnum procerum,herb
Phyllanthera grayi,tree
Lycopodium alpinum,herb
Codonopsis lanceolata,other
Persicaria senegalensis,herb
Bolboschoenus glaucus,herb
Clematis japonica,NA
Asplenium incisum,herb
Chrysothamnus,NA
Kunzea ericoides,shrub\tree
Elatostema involucratum,herb
Liriope minor,herb
Campanula spatulata,herb
Orobanche,other
Laserpitium krapffii,herb
Picrothamnus,NA
Thymus roegneri,herb
Achillea coarctata,herb
Cephalaria uralensis,herb
Artemisia nitrosa,herb
Ozothamnus tesselatus,NA
Sedum urvillei,herb
Lamium garganicum,herb
Pyrola asarifolia,herb
Orites lancifolius,shrub
Polygonatum falcatum,herb
Cerastium,herb
Gaultheria procumbens,herb
Keraudrenia hookeriana,NA
Polystichum polyblepharum,herb
Lindera sericea,NA
Paesia scaberula,herb
Litsea japonica,shrub
Crepis fraasii,herb
Hypecoum imberbe,herb
Plantago monosperma,herb
Quercus rosacea,tree
Halesia tetraptera,tree
Polystichum retrosopaleaceum,herb
Leptorumohra miqueliana,herb
Boehmeria spicata,shrub
lachenalii subsp.,NA
Amaranthus graecizans,herb
Cephalomanes obscurum,herb
Sedum amplexicaule,herb
Alectryon oleifolius,tree
Galium bungei,herb
Tmesipteris,NA
Blechnum filiforme,herb
Hieracium transylvanicum,herb
Viola orbiculata,herb
Spiraea crenata,shrub
Molinia japonica,herb
Actinidia polygama,other
Bursaria spinosa,shrub\tree
Acacia aneura,tree
Heterachne,NA
Oenanthe javanica,herb
Lemna aequinoctialis,herb
Calythrix,shrub
Senecio aegyptius,NA
Petasites frigidus,herb
Dalbergia densa,other
Carex morrowii,herb
Viola vaginata,herb
Alpinia intermedia,NA
Enkianthus campanulatus,NA
Leucopogon,NA
Menziesia ferruginea,shrub
Spiraea media,shrub
Dryopteris pacifica,herb
Minuartia setacea,herb
Salvia officinalis,herb
Coprosma dumosa,shrub
Bidens,NA
Aristida vagans,herb
Phragmites japonicus,herb
Lysimachia japonica,NA
Knautia arvernensis,herb
Ononis cristata,NA
Lamyropsis cynaroides,NA
Puccinellia tenuissima,NA
Burchardia congesta,herb
Galium trifidum,herb
Armeria canescens,herb
Minuartia laricifolia,herb
Carex reinii,herb
Picea,tree
Senna,NA
Asarum sieboldii,herb
Atriplex,NA
Pseudoraphis,NA
Symphyotrichum lateriflorum,herb
Panicum effusum,herb
Microlepia marginata,NA
Prunus apetala,shrub\tree
Alyssum obovatum,herb
Bromus,herb
Rubus pannosus,shrub
Sedobassia sedoides,herb
Alyssum hirsutum,herb
Astelia,NA
Prosartes lanuginosa,herb
Jacobaea adonidifolia,herb
Helleborus purpurascens,herb
Ulmus davidiana,tree
Campanula sparsa,herb
Gleichenia,NA
Veratrum maackii,NA
Sorghum virgatum,herb
Rhododendron lagopus,shrub
Blechnum nigrum,herb
Leucopogon muticus,shrub
Biscutella auriculata,herb
Geranium collinum,herb
Centranthus calcitrapae,herb
Oxalis griffithii,herb
Festuca pseudodalmatica,herb
Galatella angustissima,herb
Prenanthes,herb
Gaultheria myrsinoides,shrub
Sarcobatus baileyi,shrub
Vitis heyneana,other
Dioscorea gracillima,NA
Launaea fragilis,herb
Sonchus bulbosus,herb
Leptospermum polygalifolium,shrub
Digitaria,herb
Lycopodium volubile,herb
Aralia cordata,herb
Carex concinnoides,herb
Avenula pubescens,herb
Pleurospermum uralense,herb
Taraxacum hamatum,herb
Ranunculus reflexus,herb
Euphorbia subcordata,herb
Ferulago sylvatica,herb
Carthamus carduncellus,herb
Psychotria serpens,other
Sonchus,NA

SessionInfo

sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.7 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] viridis_0.5.1     viridisLite_0.3.0 caret_6.0-84      lattice_0.20-41  
##  [5] kableExtra_1.3.4  knitr_1.31        data.table_1.14.0 forcats_0.5.1    
##  [9] stringr_1.4.0     dplyr_1.0.5       purrr_0.3.4       readr_1.4.0      
## [13] tidyr_1.1.3       tibble_3.0.1      ggplot2_3.3.0     tidyverse_1.3.0  
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-152         fs_1.5.0             lubridate_1.7.10    
##  [4] webshot_0.5.2        httr_1.4.2           tools_3.6.3         
##  [7] backports_1.2.1      bslib_0.2.4          utf8_1.2.1          
## [10] R6_2.5.0             rpart_4.1-15         DBI_1.1.1           
## [13] colorspace_2.0-0     nnet_7.3-15          withr_2.4.1         
## [16] tidyselect_1.1.0     gridExtra_2.3        compiler_3.6.3      
## [19] cli_2.3.1            rvest_1.0.0          xml2_1.3.2          
## [22] labeling_0.4.2       sass_0.3.1           scales_1.1.1        
## [25] proxy_0.4-24         systemfonts_1.0.1    digest_0.6.25       
## [28] rmarkdown_2.7        svglite_1.2.3.2      pkgconfig_2.0.3     
## [31] htmltools_0.5.1.1    dbplyr_2.1.0         highr_0.8           
## [34] rlang_0.4.10         readxl_1.3.1         rstudioapi_0.13     
## [37] jquerylib_0.1.3      generics_0.1.0       farver_2.1.0        
## [40] jsonlite_1.7.2       ModelMetrics_1.2.2.2 magrittr_2.0.1      
## [43] Matrix_1.3-2         Rcpp_1.0.5           munsell_0.5.0       
## [46] fansi_0.4.2          gdtools_0.2.3        lifecycle_1.0.0     
## [49] stringi_1.5.3        yaml_2.2.1           MASS_7.3-53.1       
## [52] plyr_1.8.6           recipes_0.1.15       grid_3.6.3          
## [55] crayon_1.4.1         haven_2.3.1          splines_3.6.3       
## [58] hms_1.0.0            pillar_1.4.3         reshape2_1.4.4      
## [61] codetools_0.2-18     stats4_3.6.3         reprex_1.0.0        
## [64] glue_1.4.2           evaluate_0.14        modelr_0.1.6        
## [67] vctrs_0.3.6          foreach_1.5.1        cellranger_1.1.0    
## [70] gtable_0.3.0         assertthat_0.2.1     xfun_0.22           
## [73] gower_0.2.2          prodlim_2019.11.13   broom_0.7.0         
## [76] e1071_1.7-6          class_7.3-18         survival_3.2-10     
## [79] timeDate_3043.102    iterators_1.0.13     lava_1.6.9          
## [82] ellipsis_0.3.1       ipred_0.9-11