Commit 0297e6ab authored by ye87zine

changed bias metric yet again

parent 296d916c
@@ -37,7 +37,7 @@ Key findings:
 - RF performed best, GBM slightly worse, GLM worst
 - More occurrence records and larger range sizes tended to improve model performance
 - Higher range coverage correlated with better performance.
-- Range coverage bias and functional group showed some impact but were less consistent <!-- TODO: check after rerun -->
+- Range coverage bias and functional group showed some impact but were less consistent
 ## Analysis
@@ -263,10 +263,10 @@ bslib::card(plot, full_screen = T)
 #### Range coverage bias
-Range coverage bias was calculated as the as the minimum of total grid cells and total occurrences divided by the number of occupied cells.
+Range coverage bias was calculated as 1 minus the ratio of the actual range coverage and the hypothetical range coverage if all observations were maximally spread out across the range.
 $$
-RangeCoverageBias = \frac{min(N_{cells\_total}, N_{obs\_total})}{N_{cells\_occupied}}
+RangeCoverageBias = 1 - \frac{RangeCoverage}{min({N_{obs\_total}} / {N_{cells\_total}}, 1)}
 $$
 Higher bias values indicate that occurrence records are spatially more clustered within the range of the species.
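
As a quick sanity check of the revised formula, here is a worked example with made-up numbers (they are not taken from the analysis). It assumes that $RangeCoverage = N_{cells\_occupied} / N_{cells\_total}$, which is how the accompanying R code computes it, for a hypothetical species with 100 range cells, 50 occurrence records and 10 occupied cells:

```latex
\begin{aligned}
RangeCoverage &= \frac{N_{cells\_occupied}}{N_{cells\_total}} = \frac{10}{100} = 0.1 \\
\min\left(\frac{N_{obs\_total}}{N_{cells\_total}},\, 1\right) &= \min\left(\frac{50}{100},\, 1\right) = 0.5 \\
RangeCoverageBias &= 1 - \frac{0.1}{0.5} = 0.8
\end{aligned}
```

Had the same 50 records fallen into 50 distinct cells, the coverage would equal the hypothetical maximum of 0.5 and the bias would be 0.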
@@ -282,7 +282,7 @@ df_occs_total = occs_final %>%
 df_join = df_occs_total %>%
   dplyr::inner_join(df_cells_total, by = "species") %>%
   dplyr::inner_join(df_cells_occ, by = "species") %>%
-  dplyr::mutate(range_bias = pmin(cells_total, occs_total) / cells_occupied)
+  dplyr::mutate(range_bias = 1-((cells_occupied / cells_total) / pmin(occs_total / cells_total, 1)))
 df_plot = performance %>%
   inner_join(df_join, by = "species")
@@ -312,7 +312,7 @@ plot <- plot_ly(
 plot <- plot %>%
   layout(
     title = "Model Performance vs. Range coverage bias",
-    xaxis = list(title = "Range coverage bias", type = "log"),
+    xaxis = list(title = "Range coverage bias"),
     yaxis = list(title = "Value"),
     legend = list(x = 1.1, y = 0.5), # Move legend to the right of the plot
     margin = list(r = 150), # Add right margin to accommodate legend
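
The revised `range_bias` expression from this commit can be tried out on toy data. The sketch below is only an illustration: the data frame `df_toy`, its species labels and all of its numbers are hypothetical, and it uses base R rather than the `dplyr` pipeline from the commit. Only the column names `occs_total`, `cells_total` and `cells_occupied` and the formula itself are taken from the diff.

```r
# Hypothetical toy data; column names follow the diff, values are invented.
df_toy <- data.frame(
  species        = c("evenly_spread", "clustered", "oversampled"),
  occs_total     = c(50, 50, 400),
  cells_total    = c(100, 100, 100),
  cells_occupied = c(50, 10, 80)
)

# Actual range coverage: share of range cells with at least one record.
df_toy$range_coverage <- df_toy$cells_occupied / df_toy$cells_total

# Hypothetical coverage if every record fell into its own cell, capped at 1.
df_toy$max_coverage <- pmin(df_toy$occs_total / df_toy$cells_total, 1)

# Same expression as the new mutate() call in the commit.
df_toy$range_bias <- 1 - df_toy$range_coverage / df_toy$max_coverage

print(df_toy[, c("species", "range_coverage", "max_coverage", "range_bias")])
```

In this toy data the evenly spread species gets a bias of 0 and the clustered one a bias of about 0.8. Since a species can never occupy more cells than it has records or than its range contains, the metric stays between 0 and 1. A value of exactly 0 cannot be shown on a logarithmic axis, which fits with `type = "log"` being dropped from the x axis in the same commit, although the commit itself does not state the reason.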