changed bias metric yet again

0297e6ab · ye87zine · 296d916c · 0297e6ab
Commit 0297e6ab authored 8 months ago by ye87zine
--- a/R/performance_analysis.qmd
+++ b/R/performance_analysis.qmd
@@ -37,7 +37,7 @@ Key findings:
 -   RF performed best, GBM slightly worse, GLM worst
 -   More occurrence records and larger range sizes tended to improve model performance
 -   Higher range coverage correlated with better performance.
-   Range coverage bias and functional group showed some impact but were less consistent <!-- TODO: check after rerun -->
+-   Range coverage bias and functional group showed some impact but were less consistent
 ## Analysis
@@ -263,10 +263,10 @@ bslib::card(plot, full_screen = T)
 #### Range coverage bias
-Range coverage bias was calculated as the as the minimum of total grid cells and total occurrences divided by the number of occupied cells.
+Range coverage bias was calculated as 1 minus the ratio of the actual range coverage and the hypothetical range coverage if all observations were maximally spread out across the range.
 $$
-RangeCoverageBias = \frac{min(N_{cells\_total}, N_{obs\_total})}{N_{cells\_occupied}}
+RangeCoverageBias = 1 - \frac{RangeCoverage}{min({N_{obs\_total}} / {N_{cells\_total}}, 1)}
 $$
 Higher bias values indicate that occurrence records are spatially more clustered within the range of the species.
@@ -282,7 +282,7 @@ df_occs_total = occs_final %>%
 df_join = df_occs_total %>% 
  dplyr::inner_join(df_cells_total, by = "species") %>% 
  dplyr::inner_join(df_cells_occ, by = "species") %>% 
-  dplyr::mutate(range_bias = pmin(cells_total, occs_total) / cells_occupied)
+  dplyr::mutate(range_bias = 1-((cells_occupied / cells_total) / pmin(occs_total / cells_total, 1)))
 df_plot = performance %>% 
  inner_join(df_join, by = "species")
@@ -312,7 +312,7 @@ plot <- plot_ly(
 plot <- plot %>%
  layout(
    title = "Model Performance vs. Range coverage bias",
-    xaxis = list(title = "Range coverage bias", type = "log"),
+    xaxis = list(title = "Range coverage bias"),
    yaxis = list(title = "Value"),
    legend = list(x = 1.1, y = 0.5),  # Move legend to the right of the plot
    margin = list(r = 150),  # Add right margin to accommodate legend