Package: simstudy 0.9.2.9000

simstudy: Simulation of Study Data

Simulates data sets in order to explore modeling techniques or better understand data generating processes. The user specifies a set of relationships between covariates, and generates data based on these specifications. The final data sets can represent data from randomized control trials, repeated measure (longitudinal) designs, and cluster randomized trials. Missingness can be generated using various mechanisms (MCAR, MAR, NMAR).
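
The define-then-generate workflow described above can be illustrated with a minimal sketch. It assumes the package is installed; the variable names (`age`, `y`, `rx`), the parameter values, and the sample size are hypothetical choices made only for illustration, not taken from the package documentation.

```r
library(simstudy)

set.seed(2024)

# Build a definition table: a baseline covariate, then an outcome
# whose mean depends on that covariate (illustrative values only).
def <- defData(varname = "age", dist = "normal", formula = 50, variance = 100)
def <- defData(def, varname = "y", dist = "normal",
               formula = "10 + 0.5 * age", variance = 4)

# Generate 200 records from the definitions.
dd <- genData(200, def)

# Randomly assign each record to one of two balanced treatment arms,
# stored in a new column named "rx".
dd <- trtAssign(dd, nTrt = 2, grpName = "rx")

head(dd)
```

The result is a data.table with an `id` column plus the defined variables, which can then be passed to other functions listed under Exports (for example, to add periods, clusters, or missingness).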

Authors: Keith Goldfeld [aut, cre], Jacob Wujciak-Jens [aut]

simstudy_0.9.2.9000.tar.gz
simstudy_0.9.2.9000.zip (r-4.7) | simstudy_0.9.2.9000.zip (r-4.6) | simstudy_0.9.2.9000.zip (r-4.5)
simstudy_0.9.2.9000.tgz (r-4.6-x86_64) | simstudy_0.9.2.9000.tgz (r-4.6-arm64) | simstudy_0.9.2.9000.tgz (r-4.5-x86_64) | simstudy_0.9.2.9000.tgz (r-4.5-arm64)
simstudy_0.9.2.9000.tar.gz (r-4.7-arm64) | simstudy_0.9.2.9000.tar.gz (r-4.7-x86_64) | simstudy_0.9.2.9000.tar.gz (r-4.6-arm64) | simstudy_0.9.2.9000.tar.gz (r-4.6-x86_64)
simstudy_0.9.2.9000.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION
card.svg | card.png
simstudy/json (API)

# Install 'simstudy' in R:
install.packages('simstudy', repos = c('https://kgoldfeld.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/kgoldfeld/simstudy/issues

Pkgdown/docs site: https://kgoldfeld.github.io

Uses libs:
  • c++ – GNU Standard C++ Library v3

data-generation | data-simulation | simulation | statistical-models | cpp

11.33 score | 85 stars | 2 packages | 1.0k scripts | 1.1k downloads | 1 mention | 66 exports | 16 dependencies

Last updated from: 86b3002706. Checks: 13 OK. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | OK | 222
linux-devel-x86_64 | OK | 231
source / vignettes | OK | 414
linux-release-arm64 | OK | 220
linux-release-x86_64 | OK | 231
macos-release-arm64 | OK | 160
macos-release-x86_64 | OK | 298
macos-oldrel-arm64 | OK | 148
macos-oldrel-x86_64 | OK | 303
windows-devel | OK | 214
windows-release | OK | 189
windows-oldrel | OK | 157
wasm-release | OK | 200

Exports: addColumns, addCompRisk, addCondition, addCorData, addCorFlex, addCorGen, addDataDensity, addMarkov, addMultiFac, addPeriods, addSynthetic, betaGetShapes, blockDecayMat, blockExchangeMat, catProbs, defCondition, defData, defDataAdd, defMiss, defRead, defReadAdd, defReadCond, defRepeat, defRepeatAdd, defSurv, delColumns, gammaGetShapeRate, genCatFormula, genCluster, genCorData, genCorFlex, genCorGen, genCorMat, genCorOrdCat, genCrossed, genData, genDataDensity, genDummy, genFactor, genFormula, genMarkov, genMiss, genMixFormula, genMultiFac, genNthEvent, genObs, genOrdCat, genSpline, genSurv, genSynthetic, grouped, iccRE, logisticCoefs, mergeData, negbinomGetSizeProb, scenario_list, survGetParams, survParamPlot, trimData, trtAssign, trtObserve, trtStepWedge, updateDef, updateDefAdd, viewBasis, viewSplines

Dependencies: backports, BH, bigmemory, bigmemory.sri, data.table, fastglm, Formula, glue, lattice, Matrix, mvnfast, pbv, Rcpp, RcppArmadillo, RcppEigen, uuid

Framework for repeated simulations
Introduction | The simulation framework | Specifying scenarios | Example: power analysis for a cluster-randomized trial

Last update: 2026-03-03
Started: 2025-10-05

Targeted logistic model coefficients
Prevalence | Finding the intercept | Risk ratios | Risk differences | AUC

Last update: 2025-12-15
Started: 2023-06-16

Simulating Study Data
Overview | Defining the Data | Generating the data | Assigning treatment/exposure | More details on data definitions | Formulas | Distributions | beta | binary | binomial | categorical | clusterSize | custom | exponential | gamma | mixture | negBinomial | nonrandom | normal | noZeroPoisson | poisson | trtAssign | uniform | uniformInt | Generating multiple variables with a single definition | Adding data to an existing data table | defDataAdd/defRepeatAdd/readDataAdd and addColumns | defCondition and addCondition

Last update: 2024-06-29
Started: 2016-02-05

Customized Distributions
Example 1 | Example 2

Last update: 2024-05-13
Started: 2024-05-13

Clustered Data
Setting cluster sizes

Last update: 2023-11-23
Started: 2020-09-25

Correlated Data
Correlated data: additional distributions

Last update: 2023-11-23
Started: 2020-09-25

Correlation Matrices
Simple correlation matrix generation | Specifying a structure | Cluster-specific correlation matrices | More elaborate example | Block matrices for temporal data | Cross-sectional data | Exchangeable | Decay | Closed cohort | Generating block matrices and simulating data | Cross-sectional data with exchangeable correlation | Cross-sectional data with correlation decay | Cohort data with exchangeable correlation | Cohort data with correlation decay | Varying correlation matrices by cluster

Last update: 2023-11-23
Started: 2023-02-16

Dynamic Data Definition
Updating existing definition tables | Double-dot external variable reference | Using non-scalar double-dot variable reference

Last update: 2023-11-23
Started: 2020-10-05

Longitudinal Data
Longitudinal data with varying observation and interval times

Last update: 2023-11-23
Started: 2020-09-25

Missing Data
Longitudinal data with missingness

Last update: 2023-11-23
Started: 2020-09-25

Ordinal Categorical Data
Comparing response distributions of different populations | The cumulative proportional odds model | Simulation | Non-proportional odds | Correlated multivariate ordinal data

Last update: 2023-11-23
Started: 2020-09-25

Spline Data

Last update: 2023-11-23
Started: 2020-09-25

Survival Data
Weibull distribution | Generating standard survival data with censoring | Competing risks | Introducing non-proportional hazards | Generating parameters for survival distribution

Last update: 2023-11-23
Started: 2020-09-25

Treatment and Exposure
Assigned treatment | Assigned treatment using trtAssign distribution in defData | Observed treatment | Stepped-wedge design

Last update: 2023-11-23
Started: 2020-09-25

Readme and manuals

Help Manual

Help page | Topics
Add columns to existing data set | addColumns
Generating single competing risk survival variable | addCompRisk
Add a single column to existing data set based on a condition | addCondition
Add correlated data to existing data.table | addCorData
Create multivariate (correlated) data - for general distributions | addCorFlex
Create multivariate (correlated) data - for general distributions | addCorGen
Add data from a density defined by a vector of integers | addDataDensity
Add Markov chain | addMarkov
Add multi-factorial data | addMultiFac
Create longitudinal/panel data | addPeriods
Add synthetic data to existing data set | addSynthetic
Convert beta mean and precision parameters to two shape parameters | betaGetShapes
Create a block correlation matrix | blockDecayMat
Create a block correlation matrix with exchangeable structure | blockExchangeMat
Add single row to definitions table of conditions that will be used to add data to an existing definitions table | defCondition
Add single row to definitions table | defData
Add single row to definitions table that will be used to add data to an existing data.table | defDataAdd
Definitions for missing data | defMiss
Read external csv data set definitions | defRead
Read external csv data set definitions for adding columns | defReadAdd
Read external csv data set definitions for adding columns | defReadCond
Add multiple (similar) rows to definitions table | defRepeat
Add multiple (similar) rows to definitions table that will be used to add data to an existing data.table | defRepeatAdd
Add single row to survival definitions | defSurv
Delete columns from existing data set | delColumns
Distributions for Data Definitions | beta binary binomial categorical clusterSize distributions exponential gamma mixture negBinomial nonrandom normal noZeroPoisson poisson uniform
Convert gamma mean and dispersion parameters to shape and rate parameters | gammaGetShapeRate
Generate Categorical Formula | genCatFormula
Simulate clustered data | genCluster
Create correlated data | genCorData
Create multivariate (correlated) data - for general distributions | genCorFlex
Create multivariate (correlated) data - for general distributions | genCorGen
Create a correlation matrix | genCorMat
Generate crossed data | genCrossed
Calling function to simulate data | genData
Generate data from a density defined by a vector of integers | genDataDensity
Create dummy variables from a factor or integer variable | genDummy
Create factor variable from an existing (non-double) variable | genFactor
Generate a linear formula | genFormula
Generate Markov chain | genMarkov
Generate missing data | genMiss
Generate Mixture Formula | genMixFormula
Generate multi-factorial data | genMultiFac
Generate event data using longitudinal data, and restrict output to time until the nth event. | genNthEvent
Create an observed data set that includes missing data | genObs
Generate ordinal categorical data | genOrdCat
Generate spline curves | genSpline
Generate survival data | genSurv
Generate synthetic data | genSynthetic
Mark parameters as grouped | grouped
Generate variance for random effects that produce desired intra-class coefficients (ICCs) for clustered data. | iccRE
Determine intercept, treatment/exposure and covariate coefficients that can be used for binary data generation with a logit link and a set of covariates | logisticCoefs
Merge two data.tables without modifying inputs | mergeData
Convert negative binomial mean and dispersion parameters to size and prob parameters | negbinomGetSizeProb
Create list of parameter scenarios | scenario_list
Deprecated functions in simstudy | simstudy-deprecated
Get survival curve parameters | survGetParams
Plot survival curves | survParamPlot
Trim longitudinal data file once an event has occurred | trimData
Assign treatment | trtAssign
Observed exposure or treatment | trtObserve
Assign treatment for stepped-wedge design | trtStepWedge
Update definition table | updateDef
Update definition table | updateDefAdd
Plot basis spline functions | viewBasis
Plot spline curves | viewSplines