Skip to content
Snippets Groups Projects
Commit a35fd42f authored by samuelsmock's avatar samuelsmock
Browse files

added new vignette and migrated dataframe search function to util.r

parent 7f9e9a1c
No related branches found
No related tags found
No related merge requests found
......@@ -12,6 +12,32 @@ plot_multi_histogram <- function(df, feature, label_column, hist=FALSE) { #funct
}
## The function below is intended to find a dataframe called $coef in the output
## regardless of the output data structure, which can vary
find_dataframe <- function(list_object, dataframe_name) {
# Check if the current object is a list
if (is.list(list_object)) {
# Check if the dataframe_name exists in the names of the current list object
if (dataframe_name %in% names(list_object)) {
# Check if the object corresponding to dataframe_name is a data frame
if (is.data.frame(list_object[[dataframe_name]])) {
return(list_object[[dataframe_name]]) # Return the data frame if found
}
}
# Recursively search through each element of the current list object
for (element in list_object) {
result <- find_dataframe(element, dataframe_name)
if (!is.null(result)) {
return(result) # Return the data frame if found in any nested list
}
}
}
return(NULL) # Return NULL if dataframe_name not found
}
#' Title Downloads and extracts external data to be used in the simulation
#'
......@@ -67,3 +93,4 @@ make_md <- function(f=file){
)
}
No preview for this file type
......@@ -154,31 +154,6 @@ test_that("Simulation results are reasonable", {
result1 <- sim_all(nosim = nosim, resps = resps, destype = destype, designpath = designpath, u = ul, bcoeff = bcoeff)
## The function below is intended to find a dataframe called $coef in the output
## regardless of the output data structure, which can vary
find_dataframe <- function(list_object, dataframe_name) {
# Check if the current object is a list
if (is.list(list_object)) {
# Check if the dataframe_name exists in the names of the current list object
if (dataframe_name %in% names(list_object)) {
# Check if the object corresponding to dataframe_name is a data frame
if (is.data.frame(list_object[[dataframe_name]])) {
return(list_object[[dataframe_name]]) # Return the data frame if found
}
}
# Recursively search through each element of the current list object
for (element in list_object) {
result <- find_dataframe(element, dataframe_name)
if (!is.null(result)) {
return(result) # Return the data frame if found in any nested list
}
}
}
return(NULL) # Return NULL if dataframe_name not found
}
# Now access the coef data frame to compare to the input
coeffNestedOutput <- find_dataframe(result1, "coefs")
......
---
title: "SE_Agri-vignette"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{SE_Agri-vignette}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
<!-- There is an issue when creating these vignettes using usethis::use_vignette and
devtools::build_vignettes() where the compilied vignette HTML files are placed in /doc
rather than inst/doc
Best Practice is to follow these steps
1. Create vignette using usethis::use_vignette("my-vignette")
2. After making changes, then run devtools::build_vignettes()
3. Rebuild using devtools::install(build_vignettes = TRUE)
4. Check that it is in the vignette environment using browseVigettes()
If vignette does not appear in gitHub, it is possibly due to a file heirarchy problem where rendered files
appear in /doc instead of /inst/doc
To avoid this run:
tools::buildVignettes(dir = ".", tangle=TRUE)
dir.create("inst/doc")
file.copy(dir("vignettes", full.names=TRUE), "inst/doc", overwrite=TRUE)
More info here: https://community.rstudio.com/t/browsevignettes-mypackage-saying-no-vignettes-found/68656/7
-->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Introduction
This vignette demonstrates how to use simulate situations in which different utility functions apply to diffferent subsets of respondents. We will use a sample dataset and utility functions to generate simulated data and analyze the results. First off it is good practice remove all objects (variables, functions, etc.) from the current workspace and load all R files in the package directory into the current R session.
```{r setup}
library(simulateDCE)
library(rlang)
library(formula.tools)
```
# Inititalize Variables
sim_all is the highest level function in the package and will run simulations for all designs contained in the specified design folder. Accordingly, this is generally the function the user will want to call. To prepare for using this function, a hypothesized utility function with corresponding beta coefficients representing the weight of each term must be declared in R.
The manipulation variable allows the user to assign different terms based on the values of column values in the experimental design file
```{r initialize}
bcoeff <- list(
basc = 4.2, ## very high asc
bprof = 0.3,
bexp = 0.3,
bdomestic = 0.3,
bforeign = 0.3,
bdamage = 0.6,
bprice = 0.2)
manipulations = list(alt1.professional= expr(alt1.initiator==1),
alt2.professional= expr(alt2.initiator==1),
alt1.expert = expr(alt1.initiator==2),
alt2.expert = expr(alt2.initiator==2),
alt1.domestic = expr(alt1.funding==1),
alt2.domestic = expr(alt2.funding==1),
alt1.foreign = expr(alt1.funding==2),
alt2.foreign = expr(alt2.funding==2))
#place your utility functions here
ul<- list(u1=
list(
v1 =V.1 ~ bprof*alt1.professional+ bexp * alt1.expert + bdomestic * alt1.domestic + bforeign * alt1.foreign + bdamage*alt1.damage + bprice * alt1.compensation,
v2 =V.2 ~ bprof*alt2.professional + bexp * alt2.expert + bdomestic * alt2.domestic + bforeign * alt2.foreign + bdamage*alt2.damage + bprice * alt2.compensation,
v3 =V.3 ~ basc)
)
```
# Other parameters
Besides these arguments the user must also specify the number of respondents in the simulated survey and the number of times to run the model. The number of respondents (resps) should be selected based on experimental design parameters, while the number of simulations (nosim) should be large enough to glean statistically significant data. It is best to use a small number for this while learning to use the package and a large number (at least 500) once the other parameters have been settled.
The simulation can be ran using spdesign or NGENE design files which will be contained in the design path. The design path and design type, must also be specified as strings:
```{r other}
designpath<- system.file("extdata","se_AGRI" ,package = "simulateDCE")
## can also be specified using relative path eg. designpath<- "Projects/CSA/Designs/"
#notes <- "This design consists of different heuristics. One group did not attend the methan attribute, another group only decided based on the payment"
notes <- "No Heuristics"
resps =240 # number of respondents
nosim=2 # number of simulations to run (about 500 is minimum)
## design type must be either 'spdesign' or 'ngene'
destype <- "ngene"
```
# Randomness
As several part of the simulation rely on random values within experimentally defined bounds, the output of a given simulation call using sim_all will vary each time it is called unless the seed for R's random number generator is set like so:
```{r random}
set.seed(3393)
```
# Output
The sim_all function returns a multidimensional R list containing graphs, simulated observations and a dataframe containing sumaries of estimated beta coefficients. In general these will be printed to the console, but the entire results can also be assigned to an r list object.
```{r output}
seagri <- simulateDCE::sim_all(nosim = nosim, resps=resps, destype = destype,
designpath = designpath, u= ul, bcoeff = bcoeff)
```
# Accessing the Output in R
The beta cofficients for each simulation are contained in a dataframe called coeffs located within {result}->{design}->coefs. and a summary table, which displays statistics of these b coefficients across all simulations is contained in {results}->{design}->summary. You can access these by either searching for a known design name or by browsing design names contained in the output list.
You can also save the results to disk using saveRDS(csa,file = "tests/manual-tests/csa.RDS")
```{r accessOutput}
designs = names(seagri[sapply(seagri, is.list)])
print(designs)
simulationCoeff <- seagri[[1]]$coefs
coeffSummary <- seagri[[1]]$summary
print(simulationCoeff)
print(coeffSummary)
## saveRDS(csa,file = "tests/manual-tests/csa.RDS")
```
......@@ -49,9 +49,6 @@ library(formula.tools)
sim_all is the highest level function in the package and will run simulations for all designs contained in the specified design folder. Accordingly, this is generally the function the user will want to call. To prepare for using this function, a hypothesized utility function with corresponding beta coefficients representing the weight of each term must be declared in R like so:
```{r initialize}
decisiongroups=c(0,0.7,1)
# pass beta coefficients as a list
bcoeff <- list(
bpreis = -0.01,
......@@ -80,6 +77,14 @@ ul<-list( u1 =
)
```
# Decision Groups
If the utility function set passed to sim all has more than 1 utility function group as is the case above (u1, u2 etc.) an additional variable, 'decision groups' must be given to tell the function what portion of respondents should be given each utility function.
```{r decision}
decisiongroups=c(0,0.7,1)
```
# Other parameters
Besides these arguments the user must also specify the number of respondents in the simulated survey and the number of times to run the model. The number of respondents (resps) should be selected based on experimental design parameters, while the number of simulations (nosim) should be large enough to glean statistically significant data. It is best to use a small number for this while learning to use the package and a large number (at least 500) once the other parameters have been settled.
......
......@@ -108,19 +108,23 @@ csa <- simulateDCE::sim_all(nosim = nosim, resps=resps, destype = destype,
# Accessing the Output in R
The beta cofficients for each simulation are contained in a dataframe called coeffs located within {result}->olddesign->coefs. and a summary table, which displays
statistics of these b coefficients across all simulations is contained in ->olddesign->summary.
The beta cofficients for each simulation are contained in a dataframe called coeffs within within a nested list structure output. A summary table showing the beta coefficient statistics is also made within each experimental design.
You can also save the results to disk using saveRDS(csa,file = "tests/manual-tests/csa.RDS")
```{r accessOutput}
## nested results are hard coded, if the design changes this must aswell
simulationCoeff <- csa$olddesign$coefs
coeffSummary <- csa$olddesign$summary
topLevelResults = names(csa[sapply(csa, is.list)])
print(topLevelResults)
##saves and prints the key results of the first expreimental design
simulationCoeff <- csa[[1]]$coefs
coeffSummary <- csa[[1]]$summary
print(simulationCoeff)
print(coeffSummary)
## saveRDS(csa,file = "tests/manual-tests/csa.RDS")
```
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment