The Biostatistics Cornor: August 2013

Thursday, August 15, 2013

R function for Multiple Imputation of Longitudinal Binary Data Based on Propensity Score

Description

R function time1(x, m) and timex(x, inacol, nacol, m) return the imputed data based on propensity score and beta distribution (see details in reference paper). This multiple imputation is to impute the missing values according to time series and the later imputed values are dependent on the prior imputed data, which is different from R package "MICE" and SAS PROCE MI where the imputation is performed based on regression models.

Usage

time1(x, m)

timex(x, inacol, nacol, m)

Arguments

x              data frame of longtitudinal binary data to impute
m             number of replicates
inacol      a list of the column numbers/names before timex without NAs
nacol       the number of time before time x, for example total times is 5, to impute the time 3, nacol=2.

Examples

Copy the full R code from HERE or run the following scripts in R Gui:

Download the function code on Mac:

download.file("https://raw.github.com/devilszhang/blog/master/timex.r", destfile="./timex.r", method="curl")

Download the function code on Windows:

download.file("https://raw.github.com/devilszhang/blog/master/timex.r", destfile="./timex.r")

Source the code:

source("./timex.r")

Access the demo data from HERE or using:

require(RCurl)
x<-read.table(textConnection(getURL("https://raw.github.com/devilszhang/blog/master/longdata.txt")), header=T)

#Impute day0
x$t1imp<-time1(polio52, 5)#Impute day4

inacol<-c("t1imp")
nacol<-c("day4")
x$t2imp<-timex(x, inacol, nacol, 1)    #Impute day11

inacol<-c("t1imp", "t2imp")
nacol<-c("day11")
x$t3imp<-timex(x, inacol, nacol, 2) #Impute day18

inacol<-c("t1imp", "t2imp", "t3imp")
nacol<-c("day18")
x$t4imp<-timex(x, inacol, nacol, 3) #Impute day25

inacol<-c("t1imp", "t2imp", "t3imp", "t4imp")
nacol<-c("day25")
x$t5imp<-timex(x, inacol, nacol, 4)
#complete data
complx<-x[, c("t1imp", "t2imp", "t3imp", "t4imp", "t5imp")]
complx

References

Li X.M. et al. Statistics in Medicine, 2006, 25: 2107-2124.

Wednesday, August 14, 2013

R Function for Simulating the Sample Size, Type I error and Power in Adaptive Group Sequential Design

Description

The R function SAP() returns the planning sample size, simulated sample size, simulated standard error, simulated arm1/arm2 event rate and standard error, simulated type I error and statistical power for four methods (see references) with binary outcomes.

Usage

SAP(r, type1, type2, interim, tpi1, delta,p)
tra.size(type1, type2, pi1, pi2,delta) #return Chow's sample size
Gould.size(type1, type2, opi, delta) #return Gould Sample size
Shih.size(n,r, type1,type2,tpi1,tpi2,p,interim)
##return Shih's sample size, n is the sample size from either Gould or Chow.

Arguments

r number of replicates in simulation, such as 10000

type1 designed type1 error, such as 0.05

type2 designed type2 error, such as 0.2 (power=0.8)

interim the proportion of total information by which interim analysis is planned, such as 0.3

tpi1 designed arm 1 event rate, such as 0.3

delta minimum clinical effect size, such as 0.2

p the stratification factor for Shih's method, say 0.3 suggested by the paper.

Examples

To run the functions, you either to copy all the R codes from here or copy the following scripts to the R Gui window:

Download the function code on Mac:

download.file("https://raw.github.com/devilszhang/blog/master/SAP.r", destfile="./SAP.r", method="curl")

Download the function code on Windows:

download.file("https://raw.github.com/devilszhang/blog/master/SAP.r", destfile="./SAP.r")

Source the code:

source("./SAP.r")

Suppose you're planning a trial with the desired values in Arguments:

SAP(r=10000, type1=0.05, type2=0.2, interim=0.3, tpi1=0.3, delta=0.2, p=0.3)

References

Chow S.C., et al. Sample Size Calculations in Clinical Research, Marcel Dekker Inc.: New York, 2003.
Gould A. L., et al. Statistics in Medicine, 1992, 11: 55-66.
Shih W. J., et al. Statistics in Medicine, 1997, 16: 1913-1923. (Shih's method based on Gould)
Shih's method based on Chow.

Monday, August 12, 2013

Propensity Score Imputation for Binary Outcomes

Propensity score matching (PSM) / Bootstrap imputation for binary outcomes is not available either from R package "mice" or SAS PROC MI. Here is a R scripts for PSM method plus a pooling function for analyzing the results for m bootstraps.

An example using this demo dataset.
To do the imputation, you either need to copy all R code from here and here, Or copy the following scripts to download and source the functions.

Download the R functions on Mac:
download.file("https://raw.github.com/devilszhang/blog/master/boot.imp.r", destfile="./boot.imp.r", method="curl")

download.file("https://raw.github.com/devilszhang/blog/master/boot.pool.r", destfile="./boot.pool.r", method="curl")

Download the R functions on Windows:
download.file("https://raw.github.com/devilszhang/blog/master/boot.imp.r", destfile="./boot.imp.r")

download.file("https://raw.github.com/devilszhang/blog/master/boot.pool.r", destfile="./boot.pool.r")

Source the function:
source("./boot.imp.r")
source("./boot.pool.r")

#Read demo dataset from online:
     require(RCurl)
     demo<-read.table(textConnection(getURL("https://raw.github.com/devilszhang/blog/master/demo.txt")), header=T)

#STEP1:
#Calculating propensity score and grouping:
    fit1<-glm(miss~daysbirth+sex+lepidays+lepisevrt+dvomtdays, data=demo, family=binomial(link=logit), na.action="na.exclude")
    fitted<-fit1$fitted.values
    demo$napred<-predict(fit1, type="response", newdata=demo)
    demo<-demo[order(demo$napred), ]
    brks<-with(demo, quantile(napred, probs=c(0, 0.2, 0.4, 0.6, 0.8, 1)))
    demo<-within(demo, qgroup<-cut(napred, breaks=brks, labels=1:5, include.lowest=T))
    demo<-demo[order(demo$qgroup, demo$miss), ]
#STEP2
#do bootstrap imputation for m replicates
    set.seed(123456)
    m<-10
    boot<-boot.imp(demo, "qgroup", "rotabi", m)
    boot<-as.data.frame(boot)
    colnames(boot)<-paste("boot", 1:m, sep="")
    ####complete dataset
    cdemo<-cbind(demo, boot) ##complete dataset
#STEP3
#Pooling analysis
     fitall<-apply(boot, 2, function(x) summary(glm(x~daysbirth+sex+lepidays+lepisevrt+dvomtdays, data=demo, family=binomial(link=logit))))
    summary(boot.pool(fitall))#Lambda is the total variance due to missing values

#PLOTTING
    plot(density(na.omit(demo$rotabi)), col=4, lwd=4, ylim=c(0,10))
    nb<-paste("boot", 1:m, sep="")
    for (i in nb) {
       lines(density(cdemo[cdemo[,"miss"]==0, i]), col=2)
    }

The Biostatistics Cornor

Thursday, August 15, 2013

R function for Multiple Imputation of Longitudinal Binary Data Based on Propensity Score

Description

Usage

Arguments

Examples

References

Wednesday, August 14, 2013

R Function for Simulating the Sample Size, Type I error and Power in Adaptive Group Sequential Design

Description

Usage

Arguments

Examples

References

Monday, August 12, 2013

Propensity Score Imputation for Binary Outcomes

Blog Archive

Total Pageviews

Links