Timestamp: Wed Mar 24 10:35:08 2021
Drafted: Francesco Maria Sabatini
Revised: Helge Bruelheide
Version: 1.2

This report documents the construction of the header file for sPlot 3.0. It is based on dataset sPlot_3.0.2, received on 24/07/2019 from Stephan Hennekens.

Changes in version 1.1
1) Excluded plots from Canada, as recommended by Custodian
2) Filled missing info from most of the ~2000 plots without country information from these datasets.
3) Corrected mismatched sBiomes and ecoregions
Changes in version 1.2
1) Reassigned coordinates to ~19.000 misplaced plots (mostly from SOPHY or in Hungary). Assigned country level centroids
2) Corrected mismatched CONTINENTS & Countries
3) Added graphs to check assignment to continents or countries

knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(purrr)
library(viridis)
library(readr)
library(xlsx)
library(knitr)
library(kableExtra)

## Spatial packages
library(rgdal)
library(sp)
library(rgeos)
library(raster)
library(rworldmap)
library(elevatr)
library(sf)
library(rnaturalearth)
library(dggridR)
library(shotGroups) #minCircle

#save temporary files
write("TMPDIR = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('TMPDIR'), '.Renviron'))
write("R_USER = /data/sPlot/users/Francesco/_tmp", file=file.path(Sys.getenv('R_USER'), '.Renviron'))
rasterOptions(tmpdir="/data/sPlot/users/Francesco/_tmp")

1 Import data

Import header data. Clean header data from quotation and double quotation marks from linux terminal.

# escape all double quotation marks. Run in Linux terminal
#sed 's/"/\\"/g' sPlot_3_0_2_header.csv > sPlot_3_0_2_header_test.csv

#more general alternative in case some " are already escaped
##first removing \s before all "s, and then adding \ before all ":
#sed 's/\([^\\]\)"/\1\\\"/g; s/"/\\"/g'

Import cleaned header data.

header0 <- readr::read_delim("../sPlot_data_export/sPlot_3_0_2_header_test.csv", 
                             locale = locale(encoding = 'UTF-8'),
                            delim="\t", col_types=cols(
  PlotObservationID = col_double(),
  PlotID = col_double(),
  `TV2 relevé number` = col_double(),
  Country = col_character(),
  `Cover abundance scale` = col_factor(),
  `Date of recording` = col_date(format="%d-%m-%Y"),
  `Relevé area (m²)` = col_double(),
  `Altitude (m)` = col_double(),
  `Aspect (°)` = col_double(),
  `Slope (°)` = col_double(),
  `Cover total (%)` = col_double(),
  `Cover tree layer (%)` = col_double(),
  `Cover shrub layer (%)` = col_double(),
  `Cover herb layer (%)` = col_double(),
  `Cover moss layer (%)` = col_double(),
  `Cover lichen layer (%)` = col_double(),
  `Cover algae layer (%)` = col_double(),
  `Cover litter layer (%)` = col_double(),
  `Cover open water (%)` = col_double(),
  `Cover bare rock (%)` = col_double(),
  `Height (highest) trees (m)` = col_double(),
  `Height lowest trees (m)` = col_double(),
  `Height (highest) shrubs (m)` = col_double(),
  `Height lowest shrubs (m)` = col_double(),
  `Aver. height (high) herbs (cm)` = col_double(),
  `Aver. height lowest herbs (cm)` = col_double(),
  `Maximum height herbs (cm)` = col_double(),
  `Maximum height cryptogams (mm)` = col_double(),
  `Mosses identified (y/n)` = col_factor(),
  `Lichens identified (y/n)` = col_factor(),
  COMMUNITY = col_character(),
  SUBSTRATE = col_character(),
  Locality = col_character(),
  ORIG_NUM = col_character(),
  ALLIAN_REV = col_character(),
  REV_AUTHOR = col_character(),
  Forest = col_logical(),
  Grassland = col_logical(),
  Wetland = col_logical(),
  `Sparse vegetation` = col_logical(),
  Shrubland = col_logical(),
  `Plants recorded` = col_factor(),
  `Herbs identified (y/n)` = col_factor(),
  Naturalness = col_factor(),
  EUNIS = col_factor(),
  Longitude = col_double(),
  Latitude = col_double(),
  `Location uncertainty (m)` = col_double(),
  Dataset = col_factor(),
  GUID = col_character()
)) %>% 
  rename(Sparse.vegetation=`Sparse vegetation`, 
         ESY=EUNIS) %>% 
  dplyr::select(-COMMUNITY, -ALLIAN_REV, -REV_AUTHOR, -SUBSTRATE) %>%   #too sparse information to be useful
  dplyr::select(-PlotID) #identical to PlotObservationID

The following column names occurred in the header of sPlot v2.1 and are currently missing from the header of v3.0
1. Syntaxon
2. Cover cryptogams (%)
3. Cover bare soil (%)
4. is.forest
5. is.non.forest
6. EVA
7. Biome
8. BiomeID
9. CONTINENT
10. POINT_X
11. POINT_Y
~~ Columns #1, #2, #3, #10, #11 will be dropped. The others will be derived below.

1.1 Exclude unreliable plots

Some canadian plots need to be removed, on indication of Laura Boisvert-Marsh from GIVD NA-CA-004. The plots (and corresponding PlotObservationID) are:

Fabot01 - 1707776
Fadum01, 02 & 03 - 1707779:1707781
Faers01 - 1707782
Pfe-f-08 - 1707849
Pfe-o-05- 1707854

header0 <- header0 %>% 
  filter(!PlotObservationID %in% c(1707776, 1707779:1707782, 1707849, 1707854)) %>% 
  filter(Dataset != "$Coastal_Borja") %>% 
  filter(Dataset != "$Coastal_Poland") 

1.2 Solve spatial problems

There are 2020 plots in the Nile dataset without spatial coordinates. Assign manually with wide (90km) location uncertainty.

header <- header0 %>% 
  mutate(Latitude=replace(Latitude, 
                          list=(is.na(Latitude) & Dataset=="Egypt Nile delta"), 
                          values=30.917351)) %>% 
  mutate(Longitude=replace(Longitude, 
                          list=(is.na(Longitude) & Dataset=="Egypt Nile delta"), 
                          values=31.138534)) %>% 
  mutate(`Location uncertainty (m)`=replace(`Location uncertainty (m)`, 
                          list=(is.na(`Location uncertainty (m)`) & Dataset=="Egypt Nile delta"), 
                          values=-90000))

There are two plots in the Romania Grassland Databse, ~4442 plots in the Japan database, and a few in the European Weed Vegetation Database whose lat\long are inverted. Correct.

toswap <- c(which(header$Dataset=="Japan" & header$Latitude>90), 
            which(header$Dataset=="Romania Grassland Database" & header$Longitude>40), 
            which(header$PlotObservationID==525283))
header[toswap, c("Latitude", "Longitude")] <- header[toswap, c("Longitude", "Latitude")]

There are 237546 plots without location uncertainty. As a first approximation, we assign the median of the respective dataset, as a negative value to indicate this is an estimation, rather than a measure.

header <- header %>% 
  left_join(header %>% 
              group_by(Dataset) %>% 
              summarize(loc.uncer.median=median(`Location uncertainty (m)`, na.rm=T)), 
            by="Dataset") %>% 
  mutate(`Location uncertainty (m)`=ifelse( is.na(`Location uncertainty (m)` & !is.na(Latitude)), 
                                            -abs(loc.uncer.median), 
                                            `Location uncertainty (m)`)) %>% 
  dplyr::select(-loc.uncer.median)

There are still 91943 plots with no estimation of location uncertainty.
Assign plot size to plots in the Patagonia dataset (input of Ana Cingolani)

header <- header %>% 
  mutate(`Relevé area (m²)`=ifelse( (Dataset=="Patagonia" & is.na(`Relevé area (m²)`)), 
                                    -900, `Relevé area (m²)`))

There are 518 plots from the dataset Germany_gvrd (EU-DE-014) having a location uncertainty equal to 2,147,483 km (!). These plots have a location reported. Replace with a more likely estimate (20 km)

header <- header %>% 
  mutate(`Location uncertainty (m)`=replace(`Location uncertainty (m)`, 
                                            list=`Location uncertainty (m)`==2147483647, 
                                            values=20000))

There are 103 plots in the “Balkan Vegetation Database” which are erroneously assigned to Bahrain, instead of Bosnia

header <- header %>% 
  mutate(Country=ifelse(Dataset=="Balkan Vegetation Database" & Country=="Bahrain", 
                        "Bosnia-Herzegovina", 
                        Country))

There is one plot in the Czech dataset which was sampled in Italy (near Civitella Alfedena) but has the centroid of CZ republic as coordinates

plot.alfedena <- header %>% 
  filter(Dataset=="Czechia_nvd" & Country=="Italy") %>% 
  pull(PlotObservationID)
# coordinates of Civitella Alfedena - 41.764975, 13.940494

header <- header %>% 
  mutate(Latitude=ifelse(PlotObservationID %in% plot.alfedena, 41.764975, Latitude)) %>% 
  mutate(Longitude=ifelse(PlotObservationID %in% plot.alfedena, 13.940494, Longitude)) %>% 
  mutate(`Location uncertainty (m)` =ifelse(PlotObservationID %in% plot.alfedena, 30000, `Location uncertainty (m)`)) 

Many plots in SOPHY were collected from neighbouring countries, but are located at the centroid of France. Similarly, there is a batch of 108 plots from the European Mire VDB, European Weed Vegetation Database, and WetVegEurope Database that although being located in Hungary, have spatial coordinates in Chad. To all these plots, I reassign the centroid of their respective country as coordinate, and the radius of the minimum circle enclosing the country as location uncertainty

#Centroid of France
#  Longitude Latitude
#   2.55196  46.5645

plot.to.correct <- header %>% 
  filter( (Dataset=="France_SOPHY" & Country!="France" & Longitude==2.55196 & Latitude==46.5645) | #SOPHY plots
            Country=="Hungary" & Latitude<20) ##Hungary plots


nrow(plot.to.correct)
## [1] 19049
plot.to.correct %>% 
  count(Country)
## # A tibble: 19 x 2
##    Country                      n
##    <chr>                    <int>
##  1 Albania                     17
##  2 Andorra, Principality of   110
##  3 Austria                     68
##  4 Belgium                   1716
##  5 Czech Republic               4
##  6 Germany                   6548
##  7 Hungary                    108
##  8 Italy                     1375
##  9 Liechtenstein                3
## 10 Luxembourg                 309
## 11 Monaco                       4
## 12 Netherlands                  3
## 13 Norway                     180
## 14 Portugal                    17
## 15 Slovenia                    10
## 16 Spain                     4541
## 17 Sweden                       3
## 18 Switzerland               3914
## 19 United Kingdom             119
countries.to.correct <- plot.to.correct %>% 
  distinct(Country) %>% 
  pull(Country)

plot.to.correct <- plot.to.correct %>% 
  pull(PlotObservationID)


## Import polygon of countries and subset 
countries.sel <- ne_countries(returnclass = "sf", scale=110) %>% 
  dplyr::select(geometry, name) %>% 
  mutate(name=ifelse(name=="Andorra", "Andorra, Principality of", name)) %>% 
  mutate(name=ifelse(name=="Czech Rep.", "Czech Republic", name)) %>% 
  filter(name %in% countries.to.correct)

# Some smaller countries cannot be resolved to the low res layer of countries above.
# Add them from another source
small.countries <- ne_countries(returnclass = "sf", scale=50) %>%  
  dplyr::select(geometry, name) %>% 
  filter(name %in% c("Andorra", "Monaco", "Liechtenstein")) %>% 
  mutate(name=ifelse(name=="Andorra", "Andorra, Principality of", name))
  

## For Norway I delete the svalbard islands to avoid centroid falling in the sea
norway <- countries.sel %>% 
  filter(name=="Norway") %>% 
  as_Spatial() %>% 
  spatialEco::explode() %>% 
  st_as_sf() %>% 
  slice(1)   # extract only norway mainland

# Bind all countries together, and replace Norway
countries.sel <- countries.sel %>% 
  filter(name != "Norway") %>% 
  bind_rows(norway) %>% 
  bind_rows(small.countries) 


# get centroids in lat long and radius (=location uncertainty) in meters (projection "eck4")
centroids <- list()
radius <- NULL
for(cc in seq_along(countries.to.correct)){
  cnt <- countries.sel %>% 
    filter(name ==countries.to.correct[cc])

  #get centroid
  centroids[[cc]] <- gCentroid(cnt %>% 
                                 as_Spatial(),byid = F)@coords
  
  ## transform polygons to points and get the minimum encolosing circle
  ## to determine ragius (in km)
  cnt.eck <- countries.sel %>% 
    st_transform(crs = "+proj=eck4") %>% 
    filter(name ==countries.to.correct[cc]) %>% 
    st_cast(to="MULTIPOINT") %>% 
    st_coordinates() %>% 
    as.data.frame() %>% 
    mutate_all(~(.)/1000) %>% #to km
    dplyr::select(point.x=X, point.y=Y)
  radius[cc] <- shotGroups::getMinCircle(cnt.eck)$rad
}

## build dataset of centroids and location uncertainty
cnt.centroids <- bind_rows(lapply(centroids, as.data.frame)) %>% 
  mutate(radius=radius*1000) %>% # transform raddius to meters
  mutate(Country=countries.to.correct)

Reassign coordinates to plots to correct, based on the centroid of their country

header <- header %>% 
  left_join(cnt.centroids, by="Country") %>% 
  mutate(Latitude=ifelse(PlotObservationID %in% plot.to.correct, y, Latitude)) %>% 
  mutate(Longitude=ifelse(PlotObservationID %in% plot.to.correct, x, Longitude)) %>% 
  mutate(`Location uncertainty (m)` =ifelse(PlotObservationID %in% plot.to.correct, radius, `Location uncertainty (m)`)) %>% 
  dplyr::select(-x,-y,-radius)

2 Formations

Fill out the columns Forest:Sparse.vegetation with NAs, where necessary. Create columns is.forest and is.non.forest using script developed for sPlot 2.1
I am not assigning plots to Faber-Langedon formation at this stage, as this is only possible for European plots having an ESY classification.

eunis.key <- read.xlsx("../_input/EUNIS_WFT.xlsx", sheetIndex = "Sheet1", endRow = 246) %>% 
  dplyr::select(EUNIS_code, NATURALNESS:SPARSE_VEG) %>% 
  mutate(EUNIS_code=as.character(EUNIS_code)) %>% 
  rename(ESY=EUNIS_code, 
         Naturalness=NATURALNESS, 
         Forest=FOREST,
         Shrubland=SCRUBLAND,
         Grassland=GRASSLAND,
         Wetland=WETLAND,
         Sparse.vegetation=SPARSE_VEG)#,

header <- header %>% # header.backup %>% 
  mutate(ESY=as.character(ESY)) %>% 
  #mutate(ESY=ifelse(ESY=="?", NA, ESY)) %>% 
  # Systematically assign some databases to forest
  mutate(Forest=ifelse(Dataset %in% 
                         c("Turkey Oak_Forest Database", "Turkey Forest Database", "Chile_forest", "Ethiopia"), 
                       T, Forest)) %>% 
  #fill up with F those rows where at least one column on formation is assigned
  rowwise() %>% 
  mutate(Any=any(Forest, Shrubland, Grassland, Wetland, Sparse.vegetation)) %>% 
  mutate(Forest=ifelse( (is.na(Forest) & Any), F, Forest))  %>%
  mutate(Shrubland=ifelse( (is.na(Shrubland) & Any), F, Shrubland))  %>% 
  mutate(Grassland=ifelse( (is.na(Grassland) & Any), F, Grassland))  %>% 
  mutate(Wetland=ifelse( (is.na(Wetland) & Any), F, Wetland))  %>% 
  mutate(Sparse.vegetation=ifelse( (is.na(Sparse.vegetation) & Any), F, Sparse.vegetation))  %>%
  ungroup() %>% 
  dplyr::select(-Any) %>% 
  mutate_at(vars(Forest:Shrubland), .funs=list(~.*1)) %>% 
  mutate(Naturalness=as.numeric(as.character(Naturalness))) %>% 
  ##join and coalesce with eunis.key
  left_join(eunis.key %>% 
              distinct(), by = "ESY") %>% 
    mutate(
        Forest = dplyr:::coalesce(Forest.x, Forest.y), 
        Shrubland = coalesce(Shrubland.x, Shrubland.y),
        Grassland = coalesce(Grassland.x, Grassland.y),
        Wetland = coalesce(Wetland.x, Wetland.y),
        Sparse.vegetation = coalesce(Sparse.vegetation.x, Sparse.vegetation.y),
        Naturalness = coalesce(Naturalness.x, Naturalness.y)
    ) %>% 
  dplyr::select(-ends_with(".x"), -ends_with(".y"))

3 Fix header and attach GIVD codes

Reduce number of factor levels for the column Plants recorded

header <- header %>% 
  mutate(`Plants recorded`=fct_recode(`Plants recorded`, 
                                      "All vascular plants"="complete vegetation",
                                      "All vascular plants"="Complete vegetation",
                                      "All vascular plants"="all vascular plants", 
                                      "All vascular plants"="complete", 
                                      "All vascular plants"="Complete vegetation (including non-terricolous tax",
                                      "All vascular plants"="Vascular plants",
                                      "All woody plants"="Woody plants",
                                      "All woody plants"="All woody species",
                                      "Woody plants >= 10 cm dbh"= "trees>=10cm dbh",
                                      "All trees & dominant understory"="All trees & dominant shrubs",
                                      "Woody plants >= 1 cm dbh" = "Plants >= 1 cm dbh"
                                      )) %>% 
  mutate(`Plants recorded`=factor(`Plants recorded`, exclude = "#N/A"))

Align consortium labels to those in sPlot’s consortium archive

databases <- read_csv("/data/sPlot/users/Francesco/_sPlot_Management/Consortium/Databases.out.csv")

header <- header %>% 
  mutate(Dataset=fct_recode(Dataset,
                            "BIOTA_South_Africa" = "BIOTA_South_Africa_3", 
                            "Kriti"="Cyprus_Bergmeier", 
                            "European Boreal Forest Database"="European Boreal Forest Database 1", 
                            "European Boreal Forest Database"="European Boreal Forest Database 2", 
                            "European Coastal Vegetation Database"= "European Coastal Vegetation Database-A", 
                            "Germany_vegetweb"="Germany_vegetweb2", 
                            "Germany_vegetweb"="Germany_vegetweb3",
                            "Ladakh"="Ladakh_2", 
                            "Netherlands"="Netherlands Military sites",
                            "NSW Australia" = "NSW Austalia",
                            )) %>% 
  left_join(databases %>% 
              dplyr::select(`GIVD ID`, label) %>% 
              rename(Dataset=label),
            by="Dataset")

4 Assign plots to spatial descriptors

Create spatial point dataframe for sPlot data to intersect with spatial layers

header.shp <- header %>%
  filter(!is.na(Longitude) | !is.na(Latitude))
header.shp <- SpatialPointsDataFrame(coords= header.shp %>% 
                                        dplyr::select(Longitude, Latitude),
                               proj4string = CRS("+init=epsg:4326"), 
                               data=data.frame(PlotObservationID= header.shp$PlotObservationID, 
                                               loc.uncert=header.shp$`Location uncertainty (m)`, 
                                               `GIVD ID`=header.shp$`GIVD ID`))
writeOGR(header.shp, dsn="../_derived/", layer="header.shp", driver="ESRI Shapefile", overwrite_layer = T)

Reimport shapefile

header.shp <- readOGR("../_derived/header.shp.shp")
## OGR data source with driver: ESRI Shapefile 
## Source: "/data/sPlot/users/Francesco/sPlot3/_derived/header.shp.shp", layer: "header.shp"
## with 1975186 features
## It has 3 fields
header.shp@data <- header.shp@data %>% 
  rename(PlotObservationID=PltObID, 
         loc.uncert=lc_ncrt,
         `GIVD ID`=GIVD_ID)
crs(header.shp) <- CRS("+init=epsg:4326")

4.1 Assign to Continents

Download and manipulate map of continents

sPDF <- rworldmap::getMap(resolution="coarse")
continent <- sPDF[,"continent"]
crs(continent) <- CRS("+init=epsg:4326")
continent@data[243,"continent"] <- "South America" ## Manually correct missing data
# create clipped version of continent to avoid going beyond 180 lON
coords <- data.frame(x=c(-180,180,180,-180),
                     y=c(-90,-90,90,90))
bboxc = Polygon(coords)
bboxc = SpatialPolygons(list(Polygons(list(bboxc), ID = "a")), proj4string=crs(continent))
continent_clipped <- gIntersection(continent[-137,], bboxc, byid=T) # polygon 137 gives problems... workaround
# convert to SpatialPolygonDataFrame
pid <- sapply(slot(continent_clipped, "polygons"), function(x) slot(x, "ID"))
# Create dataframe with correct rownames
p.df <- data.frame( ID=1:length(continent_clipped), row.names = pid)
# Coerce and re-assign
continent_clipped <- SpatialPolygonsDataFrame(continent_clipped, p.df)
continent_clipped@data <- continent@data[-137,, drop=F]
 

## same but high resolution (slower, but works better for plots along coastlines)
sPDF <- rworldmap::getMap(resolution="high")
continent.high <- sPDF[,"continent"]
crs(continent.high) <- CRS("+init=epsg:4326")
continent.high@data$continent <- fct_recode(continent.high@data$continent, "South America"="South America and the Caribbean")

Assign plots to continent

continent.out <- sp::over(x=header.shp, y=continent)
#overlay unassigned points to the high resolution layer of continent
toassign <- header.shp[which(is.na(continent.out$continent)),] #154782 remain to assign
crs(toassign) <- crs(continent)
continent.out2 <- sp::over(x=toassign, y=continent.high)
#merge first and second overlay 
continent.out$continent[is.na(continent.out$continent)] <- continent.out2$continent
toassign <- header.shp[which(is.na(continent.out$continent)),] #47610 remain to assign
crs(toassign) <- crs(continent)

There are 43105 plots remaining unassigned.

Match unassigned points to closest continent

#go parallel
ncores=12
library(parallel)
library(doParallel)
cl <- makeForkCluster(ncores, outfile="" )
registerDoParallel(cl)
  
clusterEvalQ(cl, {
  library(rgdal)
  library(raster)
  library(sp)
  library(elevatr)
  library(dplyr)
  })

nearestContinent <- foreach(i=1:length(toassign), .packages=c('raster'), .combine=rbind) %dopar% {  
#  print(i)
  ## create a subset of geoentities based on a 5° buffer radius around each target plot.
  tmp.buff <- gBuffer(toassign[i,], width=5) 
  tryCatch(
    tmp.mypredictor <- spatialEco::spatial.select(
      x = tmp.buff,
      y = continent_clipped,
      distance = 0.1,
      predicate = "intersect"
    ),
    error = function(e) {
      print(paste("Nothing close enough for plot", toassign@data$PlotObservationID[i]))
    }
  )
    # find nearest neighbour  
  nearest.tmp <- tryCatch(tmp.mypredictor@data[geosphere::dist2Line(toassign[i,],
                                                                    tmp.mypredictor)[,"ID"],],
                          error = function(e){
                            ee <- continent@data[1,, drop=F]
                            ee[1,] <- rep(NA, ncol(continent))
                            }
                          ) %>% 
    as.character()
  return(nearest.tmp)
  }

stopCluster(cl)
continent.out$continent[is.na(continent.out$continent)] <- nearestContinent[,1]
save(continent.out, file = "../_derived/continent.out.RData")

Reload, manipulate continent and attach to header

load("../_derived/continent.out.RData")
header <- header %>% 
  left_join(header.shp@data %>% 
              dplyr::select(PlotObservationID) %>% 
              bind_cols(continent.out),
            by="PlotObservationID") %>% 
  mutate(CONTINENT=factor(continent, 
                            levels=c("Africa", "Antarctica", "Australia", "Eurasia", "North America", "South America"),
                            labels=c("AF", "AN", "AU", "EU", "N-A", "S-A"))) %>% 
  
  dplyr::select(-continent)
Summarize
Number of plots per continent
CONTINENT num.plot
AF 26715
AN 19
AU 66811
EU 1775648
N-A 90170
S-A 15823
NA 2451

4.2 Assign to sBiomes

Extract unique sets of coordinates to speed up matching

unique.coord.shp <- SpatialPoints(coords = header.shp@coords %>% 
                                    as.data.frame() %>% 
                                    distinct(), 
                                  proj4string = header.shp@proj4string
                                  )

Extract sBiomes for each unique coordinate.

sBiomes <- "/data/sPlot/users/Francesco/Ancillary_Data/Biomes_sPlot/sBiomes.shp"

library(parallel)
library(doParallel)
cl <- makeForkCluster(12, outfile=paste0("../_derived/sBiomes/sBiomes_.log"))
registerDoParallel(cl)

nchunks <- 120
foreach(i=1:nchunks) %dopar% {
  source("A98_PredictorsExtract.R")
  PredExtr(unique.coord.shp, myfunction=NA,
                          output=paste0("../_derived/sBiomes/sBiomes_",i, ".csv"),
                          toextract=sBiomes, typp="shp", ncores=1, chunkn=nchunks, chunk.i=i)
}
stopCluster(cl)

Reimport and join to header

filelist <-list.files("../_derived/sBiomes/", pattern=".csv", full.names = T)
filelist.n <- as.numeric(str_extract(filelist, pattern="(\\d)+"))
filelist.order <- order(filelist.n)
filelist <- filelist[filelist.order]

sBiomes.out <- unique.coord.shp@coords %>% 
  as.data.frame() %>% 
  bind_cols(lapply(filelist, read_csv) %>% 
              bind_rows())

header <- header %>% 
  left_join(sBiomes.out %>% 
              dplyr::select(Latitude=coords.x2, 
                            Longitude=coords.x1, 
                            sBiome=Name,
                            sBiomeID=BiomeID), 
            by=c("Latitude", "Longitude"))

There are 2451 unassigned plots.

Summarize:
Number of plots per Biome
sBiome num.plot
Alpine 35601
Boreal zone 34698
Dry midlatitudes 69744
Dry tropics and subtropics 40410
Polar and subpolar zone 7902
Subtropics with winter rain 202135
Subtrop. with year-round rain 78472
Temperate midlatitudes 1479240
Tropics with summer rain 15332
Tropics with year-round rain 11652
NA 2451

4.3 Extract WWF Ecoregions

Extract ecoregion name and ID from Ecoregions of the World. Olson et al. 2001 (BioScience).
Computation was performed in EVE HPC cluster using function A98_PredictorsExtract.R. Divided in 99 chunks.

ecoreg.path <- "../../Ancillary_Data/Ecoregions_WWF/wwf_terr_ecos.shp"
#ecoreg <- readOGR("../../Ancillary_Data/Ecoregions_WWF", layer="wwf_terr_ecos")
#ecoreg@data <- ecoreg@data %>% 
#  dplyr::select(OBJECTID, ECO_NAME, REALM, BIOME, ECO_NUM, ECO_ID, eco_code)

library(parallel)
library(doParallel)
cl <- makeForkCluster(14, outfile="../_derived/wwf_ecoregions/wwf_ecoregions.log")
registerDoParallel(cl)

nchunks <- 98
foreach(i=1:nchunks) %dopar% {
  source("A98_PredictorsExtract.R")
 # options(echo = F, message=F)
  PredExtr(unique.coord.shp, myfunction=NA, 
           output=paste0("../_derived/wwf_ecoregions/wwf_ecoregions_", i, ".csv"),  
           toextract=ecoreg.path, typp="shp", ncores=1, chunkn=nchunks, chunk.i=i)
}
stopCluster(cl)

Reimport output and join to header

ecoreg.files <- list.files("../_derived/wwf_ecoregions/", pattern="wwf_ecoregions_[0-9]+.csv", full.names=T)
ecoreg.files <- ecoreg.files[order(as.numeric(str_extract(ecoreg.files, pattern="[0-9]+")))]
ecoreg.out <- do.call(rbind, lapply(ecoreg.files, function(x) {read_csv(x, 
  col_types=cols(
    .default = col_double(),
    ECO_NAME = col_character(),
    REALM = col_character(),
    G200_REGIO = col_character(),
    eco_code = col_character()))}))


header <- header %>% 
  left_join(ecoreg.out %>% 
              dplyr::select(Latitude, Longitude, ECO_NAME, ECO_ID) %>% 
              dplyr::rename(Ecoregion=ECO_NAME, EcoregionID=ECO_ID), 
            by=c("Latitude", "Longitude"))

There are 0 unassigned plots.

Summarize:
Number of plots in the 30 best represented Ecoregions
Ecoregion num.plot
Atlantic mixed forests 350084
Western European broadleaf forests 239724
Baltic mixed forests 191195
Central European mixed forests 122612
Pannonian mixed forests 95802
Alps conifer and mixed forests 78170
Carpathian montane forests 68018
Celtic broadleaf forests 64659
Sarmatic mixed forests 34920
Northeastern Spain and Southern France Mediterranean forests 34194
Taiheiyo evergreen forests 31629
English Lowlands beech forests 29728
Italian sclerophyllous and semi-deciduous forests 29611
Balkan mixed forests 28785
Cantabrian mixed forests 26603
Pyrenees conifer and mixed forests 24563
Tyrrhenian-Adriatic Sclerophyllous and mixed forests 22922
Iberian sclerophyllous and semi-deciduous forests 22575
East European forest steppe 16667
Great Basin shrub steppe 16131
Scandinavian and Russian taiga 15527
Eastern Australian temperate forests 15117
Dinaric Mountains mixed forests 14484
Pontic steppe 14085
Rodope montane mixed forests 13454
Caspian lowland desert 13443
Nihonkai montane deciduous forests 12414
Southeast Australia temperate savanna 12134
Colorado Plateau shrublands 10964
North Atlantic moist mixed forests 10591

4.4 Extract elevation

Extract elevation for each plot. Loops over tiles of 1 x 1°, projects to Mercator, and extracts elevation for plot coordinates, as well as 2.5, 50, and 97.5 quantiles for a buffer area having a radius equal to the location uncertainty of each plot (but only if location uncertainty < 50 km). DEM derives from package elevatr, which uses the Terrain Tiles on Amazon Web Services. Resolutions of DEM rasters vary by region. I set a zoom factor z=10, which corresponds to ~ 75-150 m. Sources are: SRTM, data.gov.at in Austria, NRCAN in Canada, SRTM, NED/3DEP 1/3 arcsec, data.gov.uk in United Kingdom, INEGI in Mexico, ArcticDEM in latitudes above 60°, LINZ in New Zealand, Kartverket in Norway, as described here.
Split data into tiles of 1 x 1 degrees, and create sp::SpatialPointsDataFrame files. Only for plots having a location uncertainty < 50 km, which corresponds to 1581658 plots.

header.tiles <- header %>%
  dplyr::select(PlotObservationID, Dataset, Longitude, Latitude, `Location uncertainty (m)`) %>%
  mutate(`Location uncertainty (m)`=abs(`Location uncertainty (m)`)) %>% 
  filter(`Location uncertainty (m)`<= 50000) %>%
  mutate_at(.vars=vars(Longitude, Latitude), 
            .funs=list(tile=~cut(., breaks = seq(-180,180, by=.2)))) %>%
  filter(!is.na(Longitude_tile) & !is.na(Latitude_tile) ) %>%
  mutate(tilenam=factor(paste(Longitude_tile, Latitude_tile)))

There are 1581658 plots out of 1977637 plots with Location uncertainty <= 50km (or absent). The total number of tiles is 30117.
Performed in EVE HPC cluster using function A97_ElevationExtract.R. Divided in 99 chunks.

cl <- makeForkCluster(14, outfile="")
registerDoParallel(cl)

clusterEvalQ(cl, {
  library(rgdal)
  library(raster)
  library(sp)
  library(elevatr)
  library(dplyr)})

# Divided in 99 chunks
elevation.out <- foreach(i=1:99, .combine=rbind) %dopar% {
  source("A97_ElevationExtract.R")
  ElevationExtract(header.shp, output, ncores=1, chunk.i=i)}
stopCluster(cl)

For those tiles that failed, extract elevation of remaining plots one by one

#create list of tiles for which dem could not be extracted
myfiles <- list.files("../_derived/elevatr/")
failed <- list.files("../_derived/elevatr/", pattern = "[A-Za-z]*_[0-9]+failed\\.RData$")
failed <- as.numeric(unlist(regmatches(failed, gregexpr("[[:digit:]]+", failed))))

#create SpatialPointsDataFrame
sp.tile0 <- SpatialPointsDataFrame(coords=header.tiles %>% 
                                    filter(tilenam %in% levels(header.tiles$tilenam)[failed]) %>%
                                    dplyr::select(Longitude, Latitude),
                                  data=header.tiles %>% 
                                    filter(tilenam %in% levels(header.tiles$tilenam)[failed]) %>%
                                    dplyr::select(-Longitude, -Latitude),
                                  proj4string = CRS("+init=epsg:4326"))
sp.tile0 <- spTransform(sp.tile0, CRSobj = CRS("+init=epsg:3857 +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0
                                               +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null
                                               +no_defs ")) #project to mercator
output.tile <- data.frame(NULL)

cl <- makeForkCluster(5, outfile="")
registerDoParallel(cl)

clusterEvalQ(cl, {
  library(rgdal)
  library(raster)
  library(sp)
  library(elevatr)
  library(dplyr)})

#Loop over all plots
elevation.failed <- foreach(i=1:nrow(sp.tile0), .packages=c('raster'), .combine=rbind) %dopar% { 
  sp.tile <- sp.tile0[i,]
  tryCatch(raster.tile <- get_elev_raster(sp.tile, z=10, 
                                          expand=max(sp.tile$`Location uncertainty (m)`)),
        error = function(e){
          print(paste("could not retrieve DEM for", sp.tile$PlotObservationID))}
          )
   if(!exists("raster.tile")) {
    output.tile <- data.frame(PlotObservationID==sp.tile$PlotObservationID, 
                              elevation=NA, 
                              Elevation_q2.5=NA, 
                              Elevation_median=NA,
                              Elevation_q97.5=NA,
                              DEM.res=NA)
    return(output.tile)
  } else {
  # clip dem tile with continent shape
  raster.tile <- mask(raster.tile, continent.high.merc)
  
  #extract and summarize elevation data
  elev.tile <- raster::extract(raster.tile, sp.tile, small=T)
  elev.tile.buffer <- raster::extract(raster.tile, sp.tile, 
                                      buffer=sp.tile$`Location uncertainty (m)`, small=T)
  elev.q95 <- t(round(mapply( quantile, 
                            x=elev.tile.buffer,
                            probs=rep(c(0.025, 0.5, 0.975), each=length(elev.tile)), na.rm=T)))
  output.tile <- data.frame(PlotObservationID=sp.tile$PlotObservationID, 
                            elevation=round(elev.tile), 
                            elev.q95, 
                            DEM.res=res(raster.tile)[1]) %>%
  rename(Elevation_q2.5=X2.5., Elevation_median=X50., Elevation_q97.5=X97.5.)
  return(output.tile)
  }
}
stopCluster(cl)
save(elevation.failed, file = "../_derived/elevatr/elevation_missing.RData")

Compose tiles into a single output, and export

myfiles <- list.files("../_derived/elevatr/", pattern = "elevation_tile_[0-9]+\\.RData$", full.names = T)

#create empty data.frame
elevation.out <- matrix(NA, nrow=nrow(header.tiles), ncol=6)
elevation.out <- as.data.frame(elevation.out)
colnames(elevation.out) <- c("PlotObservationID", "elevation", "Elevation_q2.5", "Elevation_median", "Elevation_q97.5","DEM.res")
elevation.out$PlotObservationID <- header.tiles$PlotObservationID

tmp <- NULL
for(i in 1:length(myfiles)){
  load(myfiles[i])
  #attach results to empty data.frame
  tmp <- bind_rows(tmp, output.tile)
  if(i %in% seq(5000, length(myfiles), by=5000)){
    mymatch <- base::match(x=tmp$PlotObservationID, table=elevation.out$PlotObservationID)
    mymatch <- mymatch[!is.na(mymatch)]
    elevation.out[mymatch,] <- tmp
    tmp <- NULL
    print(paste("Attached first", i, "files"))
  }
  if(i %in% seq(1,length(myfiles), by=250)){print(i)}
}

load(file = "../_derived/elevatr/elevation_missing.RData")

mymatch <- base::match(x=elevation.failed$PlotObservationID, table=elevation.out$PlotObservationID)
mymatch <- mymatch[!is.na(mymatch)]
elevation.out[mymatch,] <- elevation.failed

write_csv(elevation.out, path ="../_derived/elevatr/elevation.out.csv")

Reimport output, attach to header and check

elevation.out <- read_csv("../_derived/elevatr/elevation.out.csv")
Example of elevation output (10 randomly selected plots shown)
PlotObservationID elevation Elevation_q2.5 Elevation_median Elevation_q97.5 DEM.res
1120364 27 27 27 27 153
40172 2023 2023 2023 2023 153
223124 156 156 156 156 153
1408002 -25 -25 -25 -25 153
1197947 11 11 11 11 153
68524 107 107 107 107 153
287032 475 476 482 488 153
250392 289 289 289 289 153
1180919 17 17 17 17 153
1053589 0 -4 -2 0 153
summary(elevation.out %>% 
          dplyr::select(-PlotObservationID, -elevation))
##  Elevation_q2.5    Elevation_median Elevation_q97.5     DEM.res     
##  Min.   :-1018.0   Min.   :-474.0   Min.   :-474     Min.   : 76.4  
##  1st Qu.:   17.0   1st Qu.:  21.0   1st Qu.:  26     1st Qu.:153.0  
##  Median :  163.0   Median : 178.0   Median : 194     Median :153.0  
##  Mean   :  441.6   Mean   : 469.4   Mean   : 506     Mean   :153.1  
##  3rd Qu.:  593.0   3rd Qu.: 644.0   3rd Qu.: 698     3rd Qu.:153.0  
##  Max.   : 6155.0   Max.   :6155.0   Max.   :6155     Max.   :306.0  
##  NA's   :104818    NA's   :104818   NA's   :104818   NA's   :3082

There are 104818 plots without elevation info, corresponding to 5.3% of the number of matched plots. Please note that elevation was extracted only for plots with location uncertainty <50 km, i.e., 1581658 plots.
There are 51389 plots with elevation below sea level.
Join elevation data (only median)

header <- header %>% 
  left_join(elevation.out %>% 
              dplyr::select(PlotObservationID, Elevation_median) %>% 
              rename(elevation_dem=Elevation_median) %>% 
              distinct(PlotObservationID, .keep_all=T), 
            by="PlotObservationID")
Summary and check
Dataset with highest number of plots below sea level
GIVD ID Dataset num.plot
EU-NL-001 Netherlands 38530
EU-RU-002 Russia_volga 8768
EU-DK-002 Denmark 1197
EU-DE-001 Germany_vegmv 858
EU-FR-003 France_SOPHY 295
AS-JP-002 Japan 220
EU-RU-003 Russia_lysenko 159
EU-UA-006 Ukraine_onyshchenko 158
EU-DE-040 Schleswig-Holstein Db 146
EU-IE-001 Ireland_nvd 124
EU-00-016 Ammophiletea Database 118
EU-IT-001 VegItaly 106
EU-HR-001 Croatia_mix 95
EU-00-011 Basque Country Database 71
EU-GB-001 Britain_nvcd 67
EU-IT-011 Italy_uniroma 48
EU-00-004d Spain_sivim_macaronesia 44
EU-LT-001 Lithuania 38
EU-DE-014 Germany_gvrd 35
EU-DE-013 Germany_vegetweb 33
EU-DE-020 GrassVeg.DE 27
00-00-004 Euro-Asian tundra VDB 22
EU-00-022 European Mire VDB 21
NA-US-014 Aava 21
EU-00-028 European Weed Vegetation Database 20
EU-ES-001 Spain_sivim_wetlands 19
AS-CN-007 China_Northern_Mountains_2 15
AU-AU-002 Australia 12
EU-LV-001 Latvian Grassland VDB 11
EU-00-004f Spain_sivim_sclerophyllous_pinus 9
EU-DE-035 Germany Coastal VDB 8
EU-00-002 Nordic_Baltic EDGG 7
EU-00-018 Nordic Vegetation Database 7
EU-UA-001 Ukraine Grassland Database 7
EU-RO-008 Romania Grassland Database 6
EU-IT-010 Italy_HabItAlp 5
NA-US-006 USA_CVS 5
EU-00-004e Spain_sivim_sclerophyllous 4
EU-00-004 Spain_sivim 3
EU-00-027 European Boreal Forest Database 3
EU-AL-001 Albanian Vegetation Database 3
AF-ZA-003 Fynbos 2
AU-AU-003 NSW Australia 2
EU-BE-002 INBOVEG 2
AS-00-001 Korea 1
EU-00-004c Spain_sivim_grasslands 1
EU-00-019 Balkan Vegetation Database 1
EU-00-026 CircumMed+Euro Pine Forest database 1

Create Scatterplot between measured elevation in the field, and elevation derived from DEM

#join measured and derived elevation
mydata <- header %>% 
  dplyr::select(PlotObservationID, `Altitude (m)`, elevation_dem) %>%
  filter(!is.na(`Altitude (m)`) & !is.na(elevation_dem)) %>%
  rename(elevation_measured=`Altitude (m)`)

ggplot(data=mydata) + 
  geom_point(aes(x=elevation_measured, y=elevation_dem), alpha=1/10, cex=0.8) + 
  theme_bw() + 
  geom_abline(slope=0, intercept=0, col=2, lty=2) + 
  geom_abline(slope=1, intercept=1, col="Dark green")

4.5 Assign to countries

There is a minor number of plots (4064), not assigned to any countries. Fix that.

countries <- readOGR("../../Ancillary_Data/naturalearth/ne_110m_admin_0_countries.shp") 
## OGR data source with driver: ESRI Shapefile 
## Source: "/data/sPlot/users/Francesco/Ancillary_Data/naturalearth/ne_110m_admin_0_countries.shp", layer: "ne_110m_admin_0_countries"
## with 177 features
## It has 94 fields
## Integer64 fields read as strings:  POP_EST NE_ID
crs(countries) <- crs(header.shp)
tmp.sel <- header %>% 
  mutate(Country=ifelse(Country=="Former USSR", NA, Country)) %>% 
  filter(is.na(Country)) %>% 
  pull(PlotObservationID)

header.shp.nocountry <- header.shp[which(header.shp$PlotObservationID %in% tmp.sel),]
countries.out <- over(header.shp.nocountry, countries)
countries.out$PlotObservationID <- header.shp.nocountry@data$PlotObservationID


header <- header %>% 
  mutate(Country=ifelse(Country=="Former USSR", NA, Country)) %>% 
  left_join(countries.out %>% 
              dplyr::select(PlotObservationID, Country2=NAME), 
            by="PlotObservationID") %>% 
  mutate(Country=coalesce(Country, Country2)) %>% 
  dplyr::select(-Country2) %>% 
  mutate(Country=ifelse(Country=="Great Britain", "United Kingdom", Country)) %>% 
  mutate(Country=ifelse(Country=="Russia", "Russian Federation", Country)) 

Plots without country info are now only 1920.

5 Map of plots

Update header.shp

header.shp@data <- header.shp@data %>% 
  left_join(header %>% 
          dplyr::select(PlotObservationID, sBiome, CONTINENT, Country,
                        Ecoregion, GIVD.ID=`GIVD ID`), 
            by="PlotObservationID") 

header.sf <- header.shp %>% 
  st_as_sf() %>% 
  st_transform(crs = "+proj=eck4")

Basic Map of the world in Eckert projection

countries <- ne_countries(returnclass = "sf") %>% 
  st_transform(crs = "+proj=eck4") %>% 
  st_geometry()
graticules <- ne_download(type = "graticules_15", category = "physical",
                          returnclass = "sf") %>% 
  st_transform(crs = "+proj=eck4") %>% 
  st_geometry()
## OGR data source with driver: ESRI Shapefile 
## Source: "/data/sPlot/users/Francesco/_tmp/RtmpGE1i8r", layer: "ne_110m_graticules_15"
## with 35 features
## It has 5 fields
## Integer64 fields read as strings:  degrees scalerank
bb <- ne_download(type = "wgs84_bounding_box", category = "physical",
                  returnclass = "sf") %>% 
  st_transform(crs = "+proj=eck4") %>% 
  st_geometry()
## OGR data source with driver: ESRI Shapefile 
## Source: "/data/sPlot/users/Francesco/_tmp/RtmpGE1i8r", layer: "ne_110m_wgs84_bounding_box"
## with 1 features
## It has 2 fields
w3a <- ggplot() +
  geom_sf(data = bb, col = "grey20", fill = "white") +
  geom_sf(data = graticules, col = "grey20", lwd = 0.1) +
  geom_sf(data = countries, fill = "grey90", col = NA, lwd = 0.3) +
  coord_sf(crs = "+proj=eck4") +
  theme_minimal() +
  theme(axis.text = element_blank(), 
        legend.title=element_text(size=12), 
        legend.text=element_text(size=12),
        legend.background = element_rect(size=0.1, linetype="solid", colour = 1), 
        legend.key.height = unit(1.1, "cm"), 
        legend.key.width = unit(1.1, "cm")) +
  scale_fill_viridis()

Graph of plot density (hexagons)

header2 <- header %>% 
  filter(!is.na(Longitude) | !is.na(Latitude)) %>% 
  dplyr::select(PlotObservationID, Latitude, Longitude, `GIVD ID`) %>% 
  filter(!(abs(Longitude) >171 & abs(Latitude>70)))
dggs <- dgconstruct(spacing=300, metric=T, resround='down')
## Resolution: 6, Area (km^2): 69967.8493448681, Spacing (km): 261.246386348549, CLS (km): 298.479323187169
#Get the corresponding grid cells for each earthquake epicenter (lat-long pair)
header2$cell <- dgGEO_to_SEQNUM(dggs, header2$Longitude, header2$Latitude)$seqnum

#Calculate number of plots for each cell
header.out   <- header2 %>% 
  group_by(cell) %>% 
  summarise(value.out=log(n(), 10))

#Get the grid cell boundaries for cells 
grid   <- dgcellstogrid(dggs, header.out$cell, frame=F) %>%
  st_as_sf() %>% 
  mutate(cell = header.out$cell) %>% 
  mutate(value.out=header.out$value.out) %>% 
  st_transform("+proj=eck4") %>% 
  st_wrap_dateline(options = c("WRAPDATELINE=YES"))

## plotting
legpos <- c(0.160, .24)
(w3 <- w3a + 
       geom_sf(data=grid, aes(fill=value.out),lwd=0, alpha=0.9)    +
       geom_sf(data = countries, col = "grey10", fill=NA, lwd = 0.3) + 
       scale_fill_viridis(
         name="# plots", breaks=0:5, labels = c("1", "10", "100",
                                                "1,000", "10,000", "100,000"), option="viridis" ) + 
    #labs(fill="# plots") + 
    theme(legend.position = legpos +c(-0.06, 0.25))
)

ggsave(filename="../_pics/PlotDensityLog10_vir.png", width = 15, height = 7, units="in", dpi=300, plot=w3)

Graph of plot location by Dataset

(w4 <- w3a + 
         geom_sf(data=header.sf %>% 
                   mutate(GIVD.ID=fct_shuffle(GIVD.ID)), aes(col=factor(GIVD.ID)), pch=16, size=0.8, alpha=0.6) +
         geom_sf(data = countries, col = "grey10", fill=NA, lwd = 0.3) + 
         theme(legend.position = "none"))

ggsave(filename="../_pics/PlotDistrib_Dark2_shuffle1984.png", width = 15, height = 7, units="in", dpi=300, plot=w4) ## takes ~40' to render

Double check attribution to continents, Biomes and Ecoregions. Do it only on a subset of plots

tmp.sel <- header %>% 
  group_by(sBiome) %>% 
  sample_n(1000) %>% 
  pull(PlotObservationID)

#sBiomes
(w5 <- w3a + 
         geom_sf(data=header.sf %>% 
                   filter(PlotObservationID %in% tmp.sel), 
                 aes(col=factor(sBiome)), pch=16, size=0.8, alpha=0.6) +
         geom_sf(data = countries, col = "grey10", fill=NA, lwd = 0.3) + 
         scale_color_brewer(name="Biome", palette="Paired") + 
         guides(color  = guide_legend(override.aes = list(size = 5))))

#continent
tmp.sel <- header %>%
  filter(CONTINENT!="AN") %>% 
  group_by(CONTINENT) %>% 
  sample_n(1000) %>% 
  pull(PlotObservationID)
(w6 <- w3a + 
         geom_sf(data=header.sf %>% 
                   filter(PlotObservationID %in% tmp.sel), 
                 aes(col=factor(CONTINENT)), pch=16, size=0.8, alpha=0.6) +
         geom_sf(data = countries, col = "grey10", fill=NA, lwd = 0.3)+ 
         scale_color_brewer(name="Continent", palette="Paired") + 
         guides(color  = guide_legend(override.aes = list(size = 5))))

#Country
tmp.sel <- header %>%
  nest(-Country) %>% 
  left_join(header %>% 
              count(Country) %>% 
              mutate(samplesize=ifelse(n<50, n,50)), 
            by="Country") %>% 
  mutate(Sample = purrr::map2(data, samplesize, sample_n)) %>% 
  unnest(Sample) %>% 
  filter(Country %in% (header %>% 
                         distinct(Country) %>% 
                         sample_n(10) %>% 
                         pull(Country))) %>%   
  pull(PlotObservationID)
(w7 <- w3a + 
         geom_sf(data=header.sf %>% 
                   filter(PlotObservationID %in% tmp.sel), 
                 aes(col=factor(Country)), pch=16, size=0.8, alpha=0.6) +
         geom_sf(data = countries, col = "grey10", fill=NA, lwd = 0.3)+ 
         scale_color_brewer(name="Biome", palette="Paired") + 
         guides(color  = guide_legend(override.aes = list(size = 5))))

#Ecoregion - Only 10 random ecoregions tested
tmp.sel <- header %>%
  filter(Ecoregion %in% sample(unique(header$Ecoregion), 10)) %>% 
  pull(PlotObservationID)
(w7 <- w3a + 
         geom_sf(data=header.sf %>% 
                   filter(PlotObservationID %in% tmp.sel) %>% 
                   mutate(Ecoregion=factor(Ecoregion)), 
                 aes(col=factor(Ecoregion)), pch=16, size=0.8, alpha=0.6) +
         geom_sf(data = countries, col = "grey10", fill=NA, lwd = 0.3) + 
    theme(legend.position="bottom") + 
    guides(color  = guide_legend(override.aes = list(size = 5), nrow = 3)))

6 Fix output and export

#check 
nrow(header)==nrow(header0)
## [1] TRUE
header <- header %>% 
  dplyr::select(
    #Metadata
    PlotObservationID, Dataset, "GIVD ID", "TV2 relevé number", "ORIG_NUM", "GUID", 
    #Geographic info
    Longitude:"Location uncertainty (m)", Country, CONTINENT, sBiome, sBiomeID, Ecoregion, EcoregionID, Locality,
    #Methodological info
    "Relevé area (m²)", "Cover abundance scale", "Date of recording", "Plants recorded", 
    "Herbs identified (y/n)","Mosses identified (y/n)","Lichens identified (y/n)",
    #Topographical
    elevation_dem, "Altitude (m)", "Aspect (°)", "Slope (°)", 
    #Vegetation type
    Forest:Naturalness, ESY, 
    #Vegetation structure
                "Cover total (%)":"Maximum height cryptogams (mm)")
save(header, file = "../_output/header_sPlot3.0.RData")
Example of header (20 random rows shown)
PlotObservationID Dataset GIVD ID TV2 relevé number ORIG_NUM GUID Longitude Latitude Location uncertainty (m) Country CONTINENT sBiome sBiomeID Ecoregion EcoregionID Locality Relevé area (m²) Cover abundance scale Date of recording Plants recorded Herbs identified (y/n) Mosses identified (y/n) Lichens identified (y/n) elevation_dem Altitude (m) Aspect (°) Slope (°) Forest Shrubland Grassland Wetland Sparse.vegetation Naturalness ESY Cover total (%) Cover tree layer (%) Cover shrub layer (%) Cover herb layer (%) Cover moss layer (%) Cover lichen layer (%) Cover algae layer (%) Cover litter layer (%) Cover open water (%) Cover bare rock (%) Height (highest) trees (m) Height lowest trees (m) Height (highest) shrubs (m) Height lowest shrubs (m) Aver. height (high) herbs (cm) Aver. height lowest herbs (cm) Maximum height herbs (cm) Maximum height cryptogams (mm)
659114 France_SOPHY EU-FR-003 104091 NA 5BF03047-A890-4397-A806-B5891F18CC18 -3.617021 40.34866 568490.9 Spain EU Subtropics with winter rain 4 Iberian sclerophyllous and semi-deciduous forests 81209 ST LLORENC, VERS MATADEPERA N PRAT, E 50.0 Braun/Blanquet (old) 1971-01-01 NA NA NA NA NA 460 NA NA 1 0 0 0 0 1 G37 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
880984 Germany_vegmv EU-DE-001 21473 NA E0999444-8B4C-4ABB-B41C-767F8E7432A6 12.539978 54.42336 3880.0 Germany EU Temperate midlatitudes 6 Baltic mixed forests 80405 Prerow 1.7 Barkman, Doing & Segal 1992-01-01 NA NA N N 10 5 180 24 0 0 1 0 0 NA E 70 NA NA 20 NA NA NA 50 NA NA NA NA NA NA NA NA NA NA
1294160 Poland EU-PL-001 52377 NA 9582A7BD-0EEA-44D7-AE82-EBDD027A8FD4 16.740074 53.19266 NA Poland EU Temperate midlatitudes 6 Baltic mixed forests 80405 Kuznik 300.0 Braun/Blanquet (old) 1972-07-16 NA NA Y N NA NA 270 15 1 0 0 0 0 NA G NA NA 60 80 80 NA NA NA NA NA 35 NA NA NA NA NA NA NA
1457273 Slovakia_nvd EU-SK-001 629056 NA A30BC56C-CA27-4B08-86CE-C42BE2F0E785 20.200000 49.24861 -10.0 Slovak Republic EU Temperate midlatitudes 6 Carpathian montane forests 80504 Belianske Tatry, Havran, SV úbočie medzi dvoma svah.hrebeňmi z vrcholu SSV a SV smerom 15.0 Braun/Blanquet (new) 1987-07-16 NA NA Y Y NA 1950 23 30 0 1 0 0 0 NA Fb 100 NA NA 90 40 NA NA NA NA NA NA NA NA NA NA NA NA NA
1465142 Slovakia_nvd EU-SK-001 636930 NA 9FE6F1D9-5705-4212-8575-6862D84D667C 20.115556 49.18028 -10.0 Slovak Republic EU Alpine 10 Carpathian montane forests 80504 V. Tatry, Bielovodská dol., Ťažká dol. 500.0 Ordinale scale (1-9) 1994-07-08 NA NA Y Y 1534 1570 90 NA 1 0 0 0 0 1 G32 80 70 15 60 5 NA NA NA NA NA NA NA NA NA NA NA NA NA
480844 Denmark EU-DK-002 174451 NA A9FADE2D-837D-4373-8288-63A41753550A 9.903970 55.69568 15.0 Denmark EU Temperate midlatitudes 6 Baltic mixed forests 80405 NA 706.9 Presence/Absence 2005-01-01 NA NA NA NA NA 28 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
466896 Denmark EU-DK-002 148644 NA 4772E096-B377-45B5-A5A7-3C56B2252A44 8.767364 55.77278 15.0 Denmark EU Temperate midlatitudes 6 Atlantic mixed forests 80402 NA 78.5 Presence/Absence 2012-01-01 NA NA NA NA 26 27 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
1457594 Slovakia_nvd EU-SK-001 629377 NA 20ADF451-86B7-45A2-BF6C-21EC82906C75 22.520278 49.07472 -10.0 Slovak Republic EU Temperate midlatitudes 6 Carpathian montane forests 80504 Bukovské vrchy, Príkry, tesne pod vrcholom kopca 10.0 Braun/Blanquet (old) 1986-07-18 NA NA Y Y 933 951 90 15 0 0 1 0 0 NA E NA NA NA 95 1 NA NA NA NA NA NA NA NA NA NA NA NA NA
790431 Germany_gvrd EU-DE-014 46700 NA 0CFE7A82-071D-4772-B50E-C2CA1627FFF8 9.433367 51.29081 1000.0 Germany EU Temperate midlatitudes 6 Western European broadleaf forests 80445 NA 30.0 Braun/Blanquet (old) 1982-01-01 NA NA NA NA 213 207 NA NA 0 0 1 0 0 2 E34a NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
869794 Germany_vegmv EU-DE-001 7485 NA b78196e3-3c19-404346-bcbbb3-deb4230ca69f 12.789964 53.92342 3903.0 Germany EU Temperate midlatitudes 6 Baltic mixed forests 80405 Gnoien 100.0 Braun/Blanquet (old) 1964-01-01 NA NA Y NA 35 32 NA NA NA NA NA NA NA NA ? NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
219733 Czechia_nvd EU-CZ-001 103470 NA AD4515C9-FEAA-496B-8D91-5F6913567B47 14.357778 48.87222 -100.0 Czech Republic EU Temperate midlatitudes 6 Western European broadleaf forests 80445 Plešovice, řeka Vltava, říční km 265,0 16.0 Braun/Blanquet (old) 1995-07-13 NA NA Y N 448 -1 NA 0 0 0 0 1 0 1 C22b 60 NA NA 60 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
934467 Hungary EU-HU-003 204713 NA BED9BAB0-85E2-4267-9029-396E508C34D4 20.331121 47.99155 1000.0 Hungary EU Temperate midlatitudes 6 Pannonian mixed forests 80431 Szarvask‹; Szarvask‹ V rhegy 4.0 Braun/Blanquet (old) NA NA NA NA NA 290 302 325 65 0 0 1 0 0 1 E11g NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
172823 Britain_nvcd EU-GB-001 22816 NA 3A2EE015-5086-4AF1-BD4D-6354ABBED625 -9.000000 53.40000 620000.0 United Kingdom EU Temperate midlatitudes 6 Celtic broadleaf forests 80409 NA 49.0 Domin 1961-08-21 NA NA NA NA NA NA NA NA 0 0 1 0 0 NA E NA NA NA 60 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
710731 France_SOPHY EU-FR-003 156844 NA DAB16506-FCAF-490D-9479-6F9217EEDAD5 5.289229 43.50510 2.0 France EU Subtropics with winter rain 4 Northeastern Spain and Southern France Mediterranean forests 81215 VELAUX, ARBOIS, W NA Braun/Blanquet (old) 1980-01-01 NA NA NA NA 242 240 NA NA NA NA NA NA NA NA ? NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
153011 Britain_nvcd EU-GB-001 3004 NA 7877C36C-FA84-4C3C-A135-7A59D00B008F -2.731714 51.31522 100.0 United Kingdom EU Temperate midlatitudes 6 English Lowlands beech forests 80421 NA 16.0 Domin NA NA NA NA NA 247 300 40 10 0 1 0 1 0 2 F42 NA NA 100 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
406414 Denmark EU-DK-002 87726 NA 108B59C4-E7DC-413D-B66C-D0944F886CD2 8.419102 55.41979 15.0 Denmark EU Temperate midlatitudes 6 Atlantic mixed forests 80402 NA 78.5 Presence/Absence 2011-01-01 NA NA NA NA 4 4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
244836 Czechia_nvd EU-CZ-001 183089 NA D33FC434-59B3-49A6-99EE-5587ABED6960 17.603889 49.05667 -100.0 Czech Republic EU Temperate midlatitudes 6 Pannonian mixed forests 80431 Drslavice; Terasy-Vinohradné, asi 950 m SSV od kostela 16.0 Braun/Blanquet (new) 2006-06-23 NA NA Y N 246 240 135 15 0 0 1 0 0 2 E12a 50 NA NA 50 NA NA NA NA NA NA NA NA NA NA 20 NA 50 NA
1766151 Japan AS-JP-002 36290 NA 302B075D-AE69-492B-B5E2-ED9E453497B0 144.104000 44.09957 -10.0 Japan EU Temperate midlatitudes 6 Hokkaido deciduous forests 80423 NA 9.0 Presence/Absence NA NA NA NA NA 30 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
849810 Germany_vegetweb EU-DE-013 497 NA ff0eb84a-7a49-41ff-ac43-dfd0b8b0e71c 8.360000 52.55000 -432849.0 Germany EU Temperate midlatitudes 6 Atlantic mixed forests 80402 Diepholz-Eschholtwiesen [4km südl.] 18.0 Braun/Blanquet (old) 1946-01-01 NA NA NA NA NA 0 NA NA 0 0 0 1 0 NA C1C2 100 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
989467 Italy_uniroma EU-IT-011 13878 NA A6C788D6-9ECC-4BCE-AB55-E15A2693347A 10.978761 43.18074 75.0 Italy EU Subtropics with winter rain 4 Italian sclerophyllous and semi-deciduous forests 81211 Pressi Brenna, Radicondoli 200.0 Braun/Blanquet (old) 1993-01-01 NA NA NA NA 791 800 45 30 1 0 0 0 0 NA G 95 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

Supplementary Material

ANNEX 1 - Ancillary function - PredExtr

# define ancillary functions
robust.mean <- function(x){mean(x, na.rm=T)}
robust.mode <- function(x){if(any(x!=0)) {
  a <- x[which(x!=0)] #exclude zero (i.e. NAs)
  return(as.numeric(names(sort(table(a), decreasing=T))[1]))} else
    return(NA)}
robust.sd <- function(x){sd(x, na.rm=T)}

#main function
PredExtr <- function(x.shp, myfunction=NA, output=NA, 
                     toextract, typp=c("raster", "shp"), ncores, chunkn=1, chunk.i=NA){
  print("Load Packages")
  require(foreach)
  require(parallel)
  require(doParallel)
  require(raster)
  require(rgeos)
  require(rgdal)
  require(geosphere) 
  require(spatialEco)
  require(sf)

  
  if(ncores>1){
    print("go parallel")
    cl.tmp <- makeForkCluster(ncores, outfile="" )
    registerDoParallel(cl.tmp)
    `%myinfix%` <- `%dopar%`
  } else {`%myinfix%` <-  `%do%`}
  
  print(paste("Loading", typp, "data :", toextract))
  print(paste("output will be:", output))
  if(typp=="raster"){ mypredictor <- raster(toextract)} else
    mypredictor <- readOGR(toextract)
  if(is.character(x.shp)){
    header.shp <- readOGR(x.shp)
  } else {header.shp <- x.shp}
  print(length(header.shp))
  crs(mypredictor) <- crs(header.shp) #should be verified beforehand!

  ## Divide in chunks if requested
  if(chunkn>1 & !is.na(chunk.i)){
    print(paste("divide in", chunkn, " chunks and run on chunk n.", chunk.i))
    indices <- 1:length(header.shp)
    chunks <- split(indices, sort(indices%%chunkn))
    header.shp <- header.shp[chunks[[chunk.i]],]
    } 
  print(paste("length after chunking is ", length(header.shp)))
  if(!is.na(myfunction)){
    print("myfunction defined as")
    myfunction <- get(myfunction)
    print(myfunction)
  }
    

  
  if(typp=="raster"){
    print("start main foreach loop")
    out <- foreach(i=1:length(header.shp), .packages=c('raster'), .combine=rbind)  %myinfix% { 
    tmp <- raster::extract(mypredictor, header.shp[i,], 
                         buffer=min(10000,  max(250, header.shp@data$lc_ncrt[i])), fun=myfunction) 
  }
  if(!is.na(output)) {write.csv(out, file = output)} else{return(out)}
  } else {
    print("Match sequentially")
    out <- sp::over(x=header.shp, y=mypredictor) 
    i.toassign <- which(is.na(out[,1]))
    toassign <- header.shp[i.toassign,]
    print(paste(length(toassign), "plots not matched directly - seek for NN"))
    out <- header.shp@coords %>% 
      as.data.frame() %>% 
      rename(Longitude=1, Latitude=2) %>% 
      bind_cols(out)
  if(length(toassign)>0){
    # Crack if multipolygon
    if(any( (st_geometry_type(mypredictor %>% st_as_sf)) == "MULTIPOLYGON")) {
      ## explode multipolygon
      mypredictor.expl <- explode(mypredictor) %>% 
        as_Spatial()
      } else {
        mypredictor.expl <- mypredictor}
    print("start main foreach loop")
    nearest <- foreach(i=1:length(toassign), .packages=c('raster'), .combine=rbind) %myinfix% { 
      print(i)
      ## create a subset of geoentities based on a 5° buffer radius around each target plot.
      tmp.buff <- gBuffer(toassign[i,], width=5) 
      tryCatch(
        tmp.mypredictor <- spatialEco::spatial.select(
          x = tmp.buff,
          y = mypredictor.expl,
          distance = 0.1,
          predicate = "intersect"
        ),
        error = function(e) {
          print(paste("Nothing close enough for plot", i))
          return(mypredictor.expl)
        }
      )
      # find nearest neighbour  
      nearest.tmp <- tryCatch(tmp.mypredictor@data[geosphere::dist2Line(toassign[i,], tmp.mypredictor)[,"ID"],],
                              error = function(e){
                                print("Can't find NN!")
                                ee <- mypredictor@data[1,, drop=F]
                                ee[1,] <- rep(NA, ncol(mypredictor))
                                return(ee)
                                }
                              )
      nearest.tmp <- toassign@coords[i,,drop=F] %>% 
        as.data.frame() %>% 
        rename(Longitude=1, Latitude=2) %>% 
        bind_cols(nearest.tmp)
      return(nearest.tmp)
    }

    out[i.toassign,] <- nearest
   }
    if(!is.na(output)) {write.csv(out, file = output)} else{return(out)}
  }
  if(ncores>1) {stopCluster(cl.tmp)}
  }

ANNEX 2 - Ancillary function - ElevationExtract

ElevationExtract <- function(header, output, ncores, chunk.i){
  print("load packages")
  require(tidyverse)
  
  require(rgdal)
  require(sp)
  require(rgeos)
  require(raster)
  require(rworldmap)
  require(elevatr)
  
  require(parallel)
  require(doParallel)
  
  print("Import header and divide in tiles")
  header.shp <- readOGR(header)
  header.tiles <- header.shp@data %>% 
    bind_cols(as.data.frame(header.shp@coords)) %>% 
    rename(PlotObservationID=PltObID, Longitude=coords.x1, Latitude=coords.x2) %>% 
    mutate(lc_ncrt=abs(lc_ncrt)) %>% 
    filter(lc_ncrt <= 50000) %>%
    mutate_at(.vars=vars(Longitude, Latitude), 
              .funs=list(tile=~cut(., breaks = seq(-180,180, by=.2)))) %>%
    mutate(tilenam=factor(paste(Longitude_tile, Latitude_tile)))
    
  
  print("Get continent")
  sPDF <- rworldmap::getMap(resolution="high")
  continent.high <- sPDF[,"continent"]
  crs(continent.high) <- CRS("+init=epsg:4326")
  continent.high@data$continent <- fct_recode(continent.high@data$continent, "South America"="South America and the Caribbean")
  continent.high.merc <- spTransform(continent.high, CRS( "+init=epsg:3857 +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m
+nadgrids=@null +no_defs"))

  print("go parallel")
  cl <- makeForkCluster(ncores, outfile="")
  registerDoParallel(ncores)
  
  clusterEvalQ(cl, {
    library(rgdal)
    library(raster)
    library(sp)
    library(elevatr)
    library(dplyr)
  })

  print("create list of tiles still to do")
  myfiles <- list.files(path = output, pattern = "[A-Za-z]*_[0-9]+\\.RData$")
  done <- NULL
  done <- as.numeric(unlist(regmatches(myfiles, gregexpr("[[:digit:]]+", myfiles))))
  todo <- 1:nlevels(header.tiles$tilenam)
  if(length(done)>0) {todo <- todo[-which(todo %in% done)]}
  todo <- sample(todo, replace=F)
  print(paste(length(todo), "tiles to do"))
  
  print("divide in chunks")
  #divide in chunks
  todo.chunks <- split(todo, sort(todo%%10))
  
  print(paste("start main foreach loop on chunk n=", chunk.i))
  print(paste(length(todo.chunks[[chunk.i]]), "to do"))
  foreach(i = todo.chunks[[chunk.i]]) %dopar% {  
    print(paste("doing", i))
    #create sp and project data
    if(nrow(header.tiles %>% 
            filter(tilenam %in% levels(header.tiles$tilenam)[i])) ==0 ) next()
    sp.tile <- SpatialPointsDataFrame(coords=header.tiles %>% 
                                        filter(tilenam %in% levels(header.tiles$tilenam)[i]) %>%
                                        dplyr::select(Longitude, Latitude),
                                      data=header.tiles %>% 
                                        filter(tilenam %in% levels(header.tiles$tilenam)[i]) %>%
                                        dplyr::select(-Longitude, -Latitude),
                                      proj4string = CRS("+init=epsg:4326"))
    sp.tile <- spTransform(sp.tile, CRSobj = CRS("+init=epsg:3857 +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0
                                                 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null
                                                 +no_defs ")) #project to mercator
    
    #retrieve dem raster
    tryCatch(raster.tile <- get_elev_raster(sp.tile, z=9, expand=max(sp.tile$lc_ncrt), clip="bbox"),
             error = function(e){e}
    )
    if(!exists("raster.tile")) {
      raster.tile <- NA
      print(paste("tile", i, "doesn't work!, skip to next"))
      save(raster.tile, file = paste(output, "elevation_tile_", i, "failed.RData", sep=""))
      rm(raster.tile)
      } else {
    # clip dem tile with continent shape
    raster.tile <- mask(raster.tile, continent.high.merc)
    
    #extract and summarize elevation data
    elev.tile <- raster::extract(raster.tile, sp.tile, small=T)
    elev.tile.buffer <- raster::extract(raster.tile, sp.tile, 
                                        buffer=sp.tile@data$lc_ncrt, 
                                        small=T)
    tmp <- round(mapply( quantile, 
                         x=elev.tile.buffer,
                         #center=elev.tile,
                         probs=rep(c(0.025, 0.5, 0.975), each=length(elev.tile)), 
                         #lc_ncrt=sp.tile$lc_ncrt, 
                         na.rm=T))
    elev.q95 <- setNames(data.frame(matrix(tmp, ncol = 3, nrow = length(elev.tile.buffer))), 
                         c("Elevation_q2.5", "Elevation_median", "Elevation_q97.5"))
    output.tile <- data.frame(PlotObservationID=sp.tile$PlotObservationID, 
                              elevation=round(elev.tile), 
                              elev.q95, 
                              DEM.res=res(raster.tile)[1]) 
    
    #save output
    save(output.tile, file = paste(output, "elevation_tile_", i, ".RData", sep=""))
    print(paste(i, "done"))
  }}
  stopCluster(cl)
  }

ANNEX 3 - SessionInfo()

sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.7 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                 
##  [3] LC_TIME=en_US.UTF-8           LC_COLLATE=en_US.UTF-8       
##  [5] LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8      
##  [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8          
##  [9] LC_ADDRESS=en_US.UTF-8        LC_TELEPHONE=en_US.UTF-8     
## [11] LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] shotGroups_0.8      dggridR_2.0.3       rnaturalearth_0.1.0
##  [4] sf_0.9-3            elevatr_0.2.0       rworldmap_1.3-6    
##  [7] raster_3.0-7        rgeos_0.5-5         rgdal_1.5-23       
## [10] sp_1.4-5            kableExtra_1.3.4    knitr_1.31         
## [13] xlsx_0.6.5          viridis_0.5.1       viridisLite_0.3.0  
## [16] forcats_0.5.1       stringr_1.4.0       dplyr_1.0.5        
## [19] purrr_0.3.4         readr_1.4.0         tidyr_1.1.3        
## [22] tibble_3.0.1        ggplot2_3.3.0       tidyverse_1.3.0    
## 
## loaded via a namespace (and not attached):
##  [1] TH.data_1.0-10          colorspace_2.0-0        ellipsis_0.3.1         
##  [4] class_7.3-18            modeltools_0.2-23       fs_1.5.0               
##  [7] rstudioapi_0.13         proxy_0.4-24            farver_2.1.0           
## [10] fansi_0.4.2             mvtnorm_1.1-1           lubridate_1.7.10       
## [13] coin_1.3-1              xml2_1.3.2              codetools_0.2-18       
## [16] splines_3.6.3           robustbase_0.93-5       libcoin_1.0-6          
## [19] spam_2.6-0              jsonlite_1.7.2          rJava_0.9-13           
## [22] broom_0.7.0             dbplyr_2.1.0            compiler_3.6.3         
## [25] httr_1.4.2              backports_1.2.1         assertthat_0.2.1       
## [28] Matrix_1.3-2            cli_2.3.1               htmltools_0.5.1.1      
## [31] tools_3.6.3             dotCall64_1.0-1         rnaturalearthdata_0.1.0
## [34] gtable_0.3.0            glue_1.4.2              maps_3.3.0             
## [37] Rcpp_1.0.5              cellranger_1.1.0        jquerylib_0.1.3        
## [40] vctrs_0.3.6             svglite_1.2.3.2         xfun_0.22              
## [43] xlsxjars_0.6.1          rvest_1.0.0             CompQuadForm_1.4.3     
## [46] lifecycle_1.0.0         spatialEco_1.3-1        DEoptimR_1.0-8         
## [49] MASS_7.3-53.1           zoo_1.8-9               scales_1.1.1           
## [52] hms_1.0.0               parallel_3.6.3          sandwich_3.0-0         
## [55] RColorBrewer_1.1-2      fields_11.6             yaml_2.2.1             
## [58] gridExtra_2.3           gdtools_0.2.3           sass_0.3.1             
## [61] stringi_1.5.3           highr_0.8               maptools_1.1-1         
## [64] e1071_1.7-5             boot_1.3-27             rlang_0.4.10           
## [67] pkgconfig_2.0.3         systemfonts_1.0.1       matrixStats_0.58.0     
## [70] evaluate_0.14           lattice_0.20-41         tidyselect_1.1.0       
## [73] magrittr_2.0.1          R6_2.5.0                generics_0.1.0         
## [76] multcomp_1.4-16         DBI_1.1.1               pillar_1.4.3           
## [79] haven_2.3.1             foreign_0.8-76          withr_2.4.1            
## [82] units_0.7-1             survival_3.2-10         modelr_0.1.6           
## [85] crayon_1.4.1            utf8_1.2.1              KernSmooth_2.23-18     
## [88] rworldxtra_1.01         rmarkdown_2.7           grid_3.6.3             
## [91] readxl_1.3.1            reprex_1.0.0            digest_0.6.25          
## [94] classInt_0.4-3          webshot_0.5.2           stats4_3.6.3           
## [97] munsell_0.5.0           bslib_0.2.4