This report documents the construction of Community Weighted Means (CWMs) and Variances (CWVs), complementing species composition data from sPlot 3.0 and plant functional traits from TRY 5.0.
Functional traits were received from [Jens Kattge](jkattge@bgc-jena.mpg.de) on Jan 21, 2020.
This report documents 1) the construction of Community Weighted Means (CWMs) and Variances (CWVs); and 2) the classification of plots into forest\\non-forest based on species growth forms. It complements species composition data from sPlot 3.0 and gap-filled plant functional traits from TRY 5.0, as received from [Jens Kattge](jkattge@bgc-jena.mpg.de) on Jan 21, 2020.
The average number of observations per species and genus is `r round(mean(try.species.means$n),1)` and `r round(mean(try.genus.means$n),1)`, respectively. As many as `r sum(try.species.means$n==1)` species have only one observation (`r sum(try.genus.means$n==1)` at the genus level).
## Match taxa based on species, if available, or Genus
```{r, echo=F}
knitr::kable(try.species.means %>%
sample_n(15),
caption="Example of trait means for 15 randomly selected species", digits = 3) %>%
## 1.5 Match taxa based on species, if available, or Genus
Combine the species- and genus-level trait means into a single object, and check how many of these taxa match the (resolved) species names in `DT2`.
```{r, warning=F}
try.combined.means <- try.genus.means %>%
...
...
@@ -250,7 +260,7 @@ total.matches <- DT2 %>%
```
The total number of matched taxa (either at species, or genus level) is `r total.matches`.
### Calculate summary statistics for species- and genus-level mean traits
## 1.6 Calculate summary statistics for species- and genus-level mean traits
Calculate CWMs and CWVs, as well as plot coverage statistics (the proportion of total cover for which trait information exists, and the proportion of species for which we have trait information). To avoid misleading results, CWMs are calculated ONLY for plots for which we have some abundance information. All plots where `Ab_scale`=="pa" in **ANY** of the layers are therefore excluded.
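As a minimal sketch of the calculation described above, the toy example below computes a cover-weighted mean and variance for a single trait, together with the two coverage statistics. All object and column names other than `PlotObservationID` and `Ab_scale` (`toy.cover`, `toy.traits`, `Relative_cover`, `trait`) are hypothetical, and the variance uses one common definition (the cover-weighted squared deviation from the CWM, with weights rescaled over species that have trait data), which may differ from the one used in the actual pipeline.

```r
library(dplyr)

# Toy cover data (hypothetical): three plots, one of them presence/absence only
toy.cover <- data.frame(
  PlotObservationID = c(1, 1, 1, 2, 2, 3),
  species = c("sp_a", "sp_b", "sp_c", "sp_a", "sp_c", "sp_a"),
  Relative_cover = c(0.5, 0.3, 0.2, 0.6, 0.4, 1),
  Ab_scale = c(rep("CoverPerc", 5), "pa"),
  stringsAsFactors = FALSE
)

# Toy trait means (hypothetical): sp_c has no trait information
toy.traits <- data.frame(
  species = c("sp_a", "sp_b"),
  trait = c(10, 20),
  stringsAsFactors = FALSE
)

toy.cwm <- toy.cover %>%
  # drop every plot with presence/absence data in ANY of its records
  group_by(PlotObservationID) %>%
  filter(!any(Ab_scale == "pa")) %>%
  ungroup() %>%
  left_join(toy.traits, by = "species") %>%
  group_by(PlotObservationID) %>%
  summarize(
    # share of total cover belonging to species with trait data
    prop.cover.with.trait = sum(Relative_cover[!is.na(trait)]) / sum(Relative_cover),
    # share of species with trait data
    prop.sp.with.trait = mean(!is.na(trait)),
    # cover-weighted mean over species with trait data
    CWM = weighted.mean(trait, Relative_cover, na.rm = TRUE),
    # cover-weighted variance around the CWM (assumed definition)
    CWV = sum(Relative_cover[!is.na(trait)] * (trait[!is.na(trait)] - CWM)^2) /
      sum(Relative_cover[!is.na(trait)]),
    .groups = "drop"
  )

toy.cwm
```

In this toy data set, plot 3 is dropped because it only has presence/absence data, and the coverage columns show how much of each remaining plot is actually represented in its CWM.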
## Classify plots in `is.forest` or `is.non.forest` based on species traits
# 3 Classify plots in `is.forest` or `is.non.forest` based on species traits
sPlot has two independent systems for classifying plots into vegetation types. The first classifies plots into forest and non-forest, based on the share of trees and the layering of vegetation. The second classifies plots into broad habitat types and relies on the expert opinion of data contributors. Unfortunately, this second classification is not consistently available across all plots: the large majority of classified plots are in Europe. These broad habitat types are coded using 5 non-mutually exclusive dummy variables:
1) Forest - F
2) Grassland - G
...
...
@@ -465,7 +508,7 @@ A plot may belong to more than one formation, e.g. a Savannah is categorized as
\newline\newline
Derive the `is.forest` and `is.non.forest` classification of plots.
### Derive species level information on Growth Forms.
## 3.1 Derive species level information on Growth Forms.
We used different sources of information:
1) Data from the gap-filled trait matrix
2) Manual cleaning of the most common species for which growth trait info is not available
...
...
@@ -518,13 +561,14 @@ DT.gf <- DT.gf %>%
```
After manual completion, the number of records without growth form information decreased to `r sum(is.na(DT.gf$GrowthForm))`.
\newline\newline
Step 3: Import additional data on growth form from TRY (accessed 10 March 2020). All public data on growth form were downloaded. First, take care of unmatched quotation marks in the txt file. Do this from the command line.
Step 3: Import additional data on growth form from TRY (accessed 10 March 2020).
All public data on growth form were downloaded. First, take care of unmatched quotation marks in the txt file. Do this from the command line.
```{bash, eval=F}
# escape all unmatched quotation marks. Run in Linux terminal
#sed 's/"/\\"/g' 8854.txt > 8854_test.csv
#sed "s/'/\\'/g" 8854.txt > 8854_test.csv
```
Information on growth form is not standardized and has a myriad of levels. Extract and simplify these to the few types used so far. When a species is attributed to multiple growth forms, use a majority vote.
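The majority vote can be sketched as follows, assuming the raw levels have already been simplified to a few types. The object and column names (`toy.gf`, `toy.gf.vote`, `n.records`) are hypothetical, and ties are broken arbitrarily here by keeping the first growth form in sort order; the actual pipeline may resolve ties differently.

```r
library(dplyr)

# Toy records (hypothetical): several growth-form entries per species
toy.gf <- data.frame(
  species = c("sp_a", "sp_a", "sp_a", "sp_b", "sp_b", "sp_c"),
  GrowthForm = c("tree", "tree", "shrub", "herb", "herb", "shrub"),
  stringsAsFactors = FALSE
)

toy.gf.vote <- toy.gf %>%
  # number of records for each species x growth form combination
  count(species, GrowthForm, name = "n.records") %>%
  group_by(species) %>%
  # keep the most frequent growth form; with_ties = FALSE keeps one row per species
  slice_max(n.records, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(species, GrowthForm)

toy.gf.vote
```

Here `sp_a` is assigned "tree" because two of its three records agree, while species with a single growth form keep it unchanged.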
Cross-check against sPlot's native (incomplete) 5-class classification provided by data contributors. Build a confusion matrix.
```{r}
cross.check <- header %>%
...
...
@@ -805,7 +862,7 @@ Through the process described above, we managed to classify `r plot.vegtype %>%
\newline\newline
The total number of plots with attribution to forest\\non-forest (either coming from sPlot's native classification, or from the process above) is: `r header.vegtype %>% dplyr::select(-PlotObservationID) %>% filter(rowMeans(is.na(.)) < 1) %>% nrow()`.