Prepare data in the format required for the estimation procedure described in "Explaining Recruitment to Violent Extremism: A Bayesian Case-Control Approach."
Usage
data.prep(
shape,
survey,
shape_large.area_id_name = NA,
shape_large.area_id_num = NA,
shape_small.area_id_name = NA,
shape_small.area_id_num = NA,
survey_small.area_id_num = NA,
survey_small.area_id_name = NA,
drop.incomplete.records = NA,
colnames_X = NA,
interactions_list = NA,
scale_X = NA,
colname_y = NA,
contamination = T,
pi = NA,
large_area_shape = F
)
Arguments
- shape
sf object: shapefile data.
- survey
data.table data.frame, case-control data including common geographic ID.
- shape_large.area_id_name
string, large area name identifiers in the shapefile.
- shape_large.area_id_num
integer, large area identifiers in the shapefile.
- shape_small.area_id_name
string, small area name identifiers in the shapefile.
- shape_small.area_id_num
integer, small area identifiers in the shapefile.
- survey_small.area_id_num
integer, small area identifiers in the survey.
- survey_small.area_id_name
string, small area name identifiers in the survey.
- drop.incomplete.records
logical, should the function return complete data? Defaults to TRUE.
- colnames_X
character vector, covariates defining the design matrix X. Must be numeric.
- interactions_list
list, each element is a string of the form "a*b", where a and b are the names of two variables in colnames_X.
- scale_X
string, takes values "1sd" or "2sd".
- colname_y
string, variable name for the outcome variable. Must be numeric.
- contamination
logical, should the offset account for contamination? Defaults to TRUE.
- pi
numeric, scalar defining the prevalence of the outcome in the population of interest.
- large_area_shape
logical, should the function return a large-area shapefile? Defaults to FALSE.
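The scale_X and interactions_list arguments can be illustrated with a minimal base-R sketch. It does not reproduce the internals of data.prep. The helper names scale_by_sd and add_interactions, the toy data, the centring step, and the choice to scale before building interactions are all assumptions made for illustration: "1sd" and "2sd" are read here as dividing each centred covariate by one or two standard deviations, and each "a*b" string is read as the product of columns a and b.

```r
# Hypothetical helpers for illustration; NOT the internals of data.prep.

# Centre a numeric covariate and divide by 1 or 2 standard deviations
# (assumed reading of scale_X = "1sd" / "2sd").
scale_by_sd <- function(x, scale_X = c("1sd", "2sd")) {
  scale_X <- match.arg(scale_X)
  k <- if (scale_X == "1sd") 1 else 2
  (x - mean(x, na.rm = TRUE)) / (k * sd(x, na.rm = TRUE))
}

# Turn strings of the form "a*b" into product columns named "a*b".
add_interactions <- function(X, interactions_list) {
  for (term in interactions_list) {
    vars <- strsplit(term, "*", fixed = TRUE)[[1]]
    # Each term must name exactly two existing columns.
    stopifnot(length(vars) == 2, all(vars %in% colnames(X)))
    X[[term]] <- X[[vars[1]]] * X[[vars[2]]]
  }
  X
}

# Toy covariates (made-up values).
X <- data.frame(age = c(20, 30, 40, 50), income = c(1, 2, 3, 6))

# Scale every covariate, then append the requested interaction.
X_scaled <- as.data.frame(lapply(X, scale_by_sd, scale_X = "2sd"))
X_full <- add_interactions(X_scaled, list("age*income"))
```

Under this reading, each column scaled with "2sd" has a standard deviation of 0.5, which puts continuous covariates on roughly the same footing as binary indicators.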