Title: | Calculate AUC-type measure when gold standard is continuous and the corresponding optimal linear combination of variables with respect to it. |
---|---|
Description: | The cgAUC can calculate the AUC-type measure of Obuchowski(2006) when gold standard is continuous, and find the optimal linear combination of variables with respect to this measure. |
Authors: | Yuan-chin I. Chang, Yu-chia Chang, and Ling-wan Chen |
Maintainer: | Yu-chia Chang <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.1 |
Built: | 2024-11-22 03:35:29 UTC |
Source: | https://github.com/cran/cgAUC |
In this package, the function cgAUC calculates the AUC-type measure proposed in Obuchowski (2006) when the gold standard is continuous.
Package: | cgAUC |
Type: | Package |
Version: | 1.2.1 |
Date: | 2014-08-24 |
License: | GPL (>=2) |
Yuan-chin I. Chang, Yu-chia Chang, and Ling-wan Chen
Maintainer: Yu-chia Chang <[email protected]>
Chang, YCI. Maximizing an ROC type measure via linear combination of markers when the gold reference is continuous. Statistics in Medicine 2012.
Obuchowski NA. An ROC-type measure of diagnostic accuracy when the gold standard is continuous-scale. Statistics in Medicine 2006; 25:481–493.
Obuchowski N. Estimating and comparing diagnostic tests accuracy when the gold standard is not binary. Statistics in Medicine 2005; 20:3261–3278.
Friedman JH, Popescu BE. Gradient directed regularization for linear regression and classification. Technical Report, Department of Statistics, Stanford University, 2004.
library(cgAUC)
n = 100; p = 5
r.x = matrix(rnorm(n * p), , p)  # raw data
r.z = r.x[, 1] + rnorm(n)  # gold standard
x = scale(r.x)  # standardized raw data
z = scale(r.z)  # standardized gold standard
h = n^(-1 / 2)
t1 = cgAUC(r.x, r.z, h, delta = 1, auto = FALSE, tau = 1, scale = 1)  # delta held constant
t1
t2 = cgAUC(r.x, r.z, h, delta = 1, auto = TRUE, tau = 1, scale = 1)  # delta chosen automatically
t2
Continuous function, used when the variable is continuous.
c_cntin(y, z, l, h)
y |
The potential variables. A matrix in which each column holds the values of one variable. It should be standardized in this application. |
z |
The gold standard variable. It should be standardized. |
l |
Linear combination. A vector. |
h |
The window width of the smoothing function. Its value should fall in the interval (n^(-1/2), n^(-1/5)), where n is the sample size. |
theta.sh.h.p |
The estimate of the theta of Chang(2012). |
var |
The variance of the estimate of the theta of Chang(2012). |
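To make the idea of a smoothed AUC-type estimate concrete, here is a minimal pure-R sketch. It is not the package's compiled code: the function name `smoothed_theta` is hypothetical, and the normal CDF used as the smooth replacement for the indicator is an assumption (the kernel actually used by c_cntin may differ). It estimates how often the combined score orders a pair of subjects the same way as the gold standard.

```r
# Hypothetical sketch, not the implementation behind c_cntin.
smoothed_theta <- function(y, z, l, h) {
  score <- as.vector(as.matrix(y) %*% l)  # combined marker for each subject
  z <- as.vector(z)
  d_score <- outer(score, score, "-")     # score_i - score_j for every pair
  d_z <- outer(z, z, "-")                 # z_i - z_j for every pair
  s <- pnorm(d_score / h)                 # ASSUMED smooth stand-in for I(score_i > score_j)
  keep <- d_z > 0                         # pairs where subject i has the larger gold standard
  sum(s[keep]) / sum(keep)
}

set.seed(1)
n <- 50
y <- scale(matrix(rnorm(n * 2), n, 2))       # two standardized variables
z <- as.vector(scale(y[, 1] + rnorm(n)))     # standardized gold standard
smoothed_theta(y, z, l = c(1, 0), h = n^(-1 / 3))
```

Under this assumed kernel, reversing the sign of l turns an estimate of theta into 1 - theta, which is why values below 0.5 can be flipped to values above 0.5.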
Yu-chia Chang
The function is currently defined as:
function(y, z, l, h) {
  .Call('cgAUC_c_cntin', PACKAGE = 'cgAUC', y, z, l, h)
}
Compute the derivative of theta.sh.h.p.
c_d_theta_sh_h_p(y, z, l, h)
y |
The potential variables. A matrix in which each column holds the values of one variable. It should be standardized in this application. |
z |
The gold standard variable. It should be standardized. |
l |
Linear combination. A vector. |
h |
The window width of the smoothing function. Its value should fall in the interval (n^(-1/2), n^(-1/5)), where n is the sample size. |
Computes the derivative of theta.sh.h.p with respect to the coefficients of the linear combination l.
d.theta.sh.h.p |
The derivative of theta.sh.h.p. |
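One way to sanity-check an analytic derivative such as the value returned here is to compare it with a finite-difference approximation. The sketch below uses a hypothetical helper, `numeric_gradient`, and a toy objective; it does not call the package and is not how the package computes the derivative.

```r
# Hypothetical helper: central finite-difference gradient of f at l.
numeric_gradient <- function(f, l, eps = 1e-6) {
  vapply(seq_along(l), function(k) {
    step <- replace(numeric(length(l)), k, eps)  # perturb only coordinate k
    (f(l + step) - f(l - step)) / (2 * eps)
  }, numeric(1))
}

# Toy objective standing in for theta as a function of the coefficients l.
toy_theta <- function(l) sum(l^2)
numeric_gradient(toy_theta, c(1, 2))
```

In practice `f` would wrap the estimate of theta as a function of l with y, z and h held fixed, and the result would be compared against d.theta.sh.h.p component by component.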
Yu-chia Chang
The function is currently defined as:
function(y, z, l, h) {
  .Call('cgAUC_c_d_theta_sh_h_p', PACKAGE = 'cgAUC', y, z, l, h)
}
Discrete function, used when the variable is discrete.
c_dscrt(y, z, l)
y |
The potential variables. A matrix in which each column holds the values of one variable. It should be standardized in this application. |
z |
The gold standard variable. It should be standardized. |
l |
Linear combination. A vector. |
Discrete function, used when the variable is discrete.
theta.h.p |
The estimate of theta when variable is discrete. |
var |
The variance of the estimate of theta. |
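The unsmoothed estimate can be illustrated with a short pure-R sketch of a pairwise concordance measure in the spirit of Obuchowski (2006): a pair scores 1 when the marker and the gold standard order the two subjects the same way, 0.5 when either is tied, and 0 otherwise. The name `obuchowski_theta` is hypothetical, and whether c_dscrt treats ties exactly this way is an assumption.

```r
# Hypothetical sketch of a pairwise concordance estimate; not the code behind c_dscrt.
obuchowski_theta <- function(score, z) {
  n <- length(z)
  s_score <- sign(outer(score, score, "-"))  # ordering of each pair by the marker
  s_z <- sign(outer(z, z, "-"))              # ordering of each pair by the gold standard
  agree <- s_score * s_z                     # +1 concordant, -1 discordant, 0 if tied
  psi <- (agree + 1) / 2                     # maps to 1, 0 and 0.5 respectively
  diag(psi) <- 0                             # a subject is never paired with itself
  sum(psi) / (n * (n - 1))
}

set.seed(2)
z <- rnorm(30)
score <- z + rnorm(30)   # a noisy marker of the gold standard
obuchowski_theta(score, z)
```

A perfectly concordant marker gives 1, a perfectly reversed one gives 0, and an uninformative marker gives a value near 0.5.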
Yu-chia Chang
The function is currently defined as:
function(y, z, l) {
  .Call('cgAUC_c_dscrt', PACKAGE = 'cgAUC', y, z, l)
}
Smooth function.
c_s_h(t, h)
t |
A value, the difference between any two subjects. |
h |
The window width of the smoothing function. Its value should fall in the interval (n^(-1/2), n^(-1/5)), where n is the sample size. |
Smooth function.
s_h |
The value of smooth function. |
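The role of h is easiest to see numerically. The sketch below assumes a normal-CDF smoothing function, which is a common choice but may not be the one c_s_h implements; `s_h_sketch` and `h_bounds` are hypothetical names. It also computes the two endpoints of the interval recommended for h at a given sample size.

```r
# ASSUMED kernel: a normal CDF scaled by the window width h.
s_h_sketch <- function(t, h) pnorm(t / h)

# Endpoints of the recommended interval (n^(-1/2), n^(-1/5)) for sample size n.
h_bounds <- function(n) c(lower = n^(-1 / 2), upper = n^(-1 / 5))

h_bounds(100)
# The same difference t is pushed closer to 0 or 1 as h shrinks.
sapply(c(0.4, 0.1), function(h) s_h_sketch(c(-0.2, 0, 0.2), h))
```

With a small h the smooth function behaves almost like the indicator of t > 0, while a larger h gives a flatter, easier to differentiate surface.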
Yu-chia Chang
The function is currently defined as:
function(t, h) {
  .Call('cgAUC_c_s_h', PACKAGE = 'cgAUC', t, h)
}
cgAUC calculates the AUC-type measure of Obuchowski (2006) when the gold standard is continuous, and finds the optimal linear combination of variables with respect to this measure.
cgAUC(x, z, h, delta = 1, auto = FALSE, tau = 1, scale = 1)
x |
The potential variables. A matrix in which each column holds the values of one variable. It should be standardized in this application. |
z |
The gold standard variable. It should be standardized. |
h |
The parameter that controls the window width of the smoothing function. |
delta |
A parameter used in TGDM. The default value is 1. |
auto |
If TRUE, find the optimal delta in TGDM using cross-validation. The default is FALSE. |
tau |
A parameter used in TGDM. The default value is 1. |
scale |
The data are scaled when scale = 1 and not scaled when scale = 0. The default value is 1. |
In this package, we use the threshold gradient descent method (TGDM) to find the optimal linear combination of variables that maximizes the AUC-type measure. Before using this function, all variables, including the gold standard variable, should be standardized. The function returns the following components:
Rev |
Rev = 0 means the coefficients are used as they are (l * 1); otherwise their signs are reversed (l * -1). |
l |
The estimate of coefficients for the optimal linear combination of variables. |
theta.sh.h.p |
The estimate of the theta of Chang(2012) for the optimal linear combination of variables. |
theta.sh.h.p.var |
The estimate of variance for the theta of Chang(2012). |
cntin.ri |
The estimate of the theta of Chang(2012) for each single variable. |
theta.h.p |
The estimate of the theta of Obuchowski(2006) for the optimal linear combination of variables. |
theta.h.p.var |
The estimate of variance for the theta of Obuchowski(2006). |
dscrt.ri |
The estimate of the theta of Obuchowski(2006) for each single variable. |
delta |
The value of delta. |
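The way delta and tau enter the search can be shown with a single update step. The sketch below mirrors the update in the function definition listed for cgAUC (threshold the derivative, move along the surviving components, rescale by the largest coefficient); the wrapper name `tgd_step` is hypothetical, and the derivative is passed in as a plain vector rather than computed by the package.

```r
# Hypothetical wrapper around one thresholded update of the coefficients l.
tgd_step <- function(l, grad, delta = 1, tau = 1) {
  keep <- grad >= tau * max(grad)    # components that pass the threshold
  l_new <- l + delta * keep * grad   # move only along the kept components
  l_new / max(l_new)                 # rescale so the largest coefficient is 1
}

l0 <- c(1, 0, 0)
grad <- c(0.2, 0.4, 0.1)
tgd_step(l0, grad, delta = 1, tau = 1)    # only the steepest component moves
tgd_step(l0, grad, delta = 1, tau = 0.4)  # a lower tau lets more components move
```

With tau = 1 only the component with the largest derivative is updated, so coefficients enter the combination one at a time; smaller values of tau update several components at once, and delta sets the step length.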
Yu-chia Chang
library(cgAUC)
n = 100; p = 5
r.x = matrix(rnorm(n * p), , p)  # raw data
r.z = r.x[, 1] + rnorm(n)  # gold standard
x = scale(r.x)  # standardized raw data
z = scale(r.z)  # standardized gold standard
h = n^(-1 / 2)
t1 = cgAUC(r.x, r.z, h, delta = 1, auto = FALSE, tau = 1, scale = 1)  # delta held constant
t1
t2 = cgAUC(r.x, r.z, h, delta = 1, auto = TRUE, tau = 1, scale = 1)  # delta chosen automatically
t2

The function is currently defined as:
function (x, z, h, delta = 1, auto = FALSE, tau = 1) {
  x = scale(x)
  z = scale(z)
  conv = FALSE
  n = dim(x)[1]
  p = dim(x)[2]
  cntin.ri = dscrt.ri = rep(0, p)
  id = diag(p)
  for (i in 1:p) {
    dscrt.ri[i] = dscrt(x, z, id[i, ])$theta.h.p
    cntin.ri[i] = cntin(x, z, id[i, ], h)$theta.sh.h.p
  }
  beta.i = ifelse(cntin.ri > 0.5, 1, -1)
  dscrt.ri = ifelse(dscrt.ri > 0.5, dscrt.ri, (1 - dscrt.ri))
  cntin.ri = ifelse(cntin.ri > 0.5, cntin.ri, (1 - cntin.ri))
  y = x * matrix(beta.i, n, p, byrow = TRUE)
  max.x = which(cntin.ri == max(cntin.ri))
  theta.sh.h.p = 0
  l = id[max.x, ]
  while (conv == FALSE) {
    d.l = d.theta.sh.h.p(y, z, l, h)
    max.d.l = max(d.l)
    ind.d.l = ifelse(d.l >= (tau * max.d.l), 1, 0) * d.l
    if (auto == TRUE) {
      delta = optimal.delta(y, z, l, h, ind.d.l)
    }
    l = l + delta * ind.d.l
    l = l/max(l)
    theta.temp = cntin(y, z, l, h)$theta.sh.h.p
    ifelse(abs(theta.temp - theta.sh.h.p) < 1e-04, conv <- TRUE, conv <- FALSE)
    theta.sh.h.p = theta.temp
  }
  optimal.dscrt = dscrt(y, z, l)
  theta.sh.h.p.var = cntin(y, z, l, h)$var
  l = l * beta.i
  return(list(l = l, theta.sh.h.p = theta.sh.h.p, theta.sh.h.p.var = theta.sh.h.p.var,
    cntin.ri = cntin.ri, theta.h.p = optimal.dscrt$theta.h.p,
    theta.h.p.var = optimal.dscrt$var, dscrt.ri = dscrt.ri, delta = delta))
}
Find the optimal delta.
optimal.delta(y, z, l, h, ind.d.l)
y |
The potential variables. A matrix in which each column holds the values of one variable. It should be standardized in this application. |
z |
The gold standard variable. It should be standardized. |
l |
Linear combination. A vector. |
h |
The window width of the smoothing function. Its value should fall in the interval (n^(-1/2), n^(-1/5)), where n is the sample size. |
ind.d.l |
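The thresholded derivative vector used as the update direction: components of the derivative of theta.sh.h.p that are at least tau times the largest component keep their value, and all other components are set to zero. |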
delta.star |
Optimal delta. |
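The search for delta can be sketched as a grid search over candidate step lengths. The example below follows the grid shown in the function definition for optimal.delta (50 values between 0 and 5), but it is only an illustration: `grid_search_delta` is a hypothetical name, the objective is a toy function rather than the estimate of theta, and ties are resolved by taking the first maximizer.

```r
# Hypothetical sketch: pick the step length that maximizes an objective on a grid.
grid_search_delta <- function(objective, l, direction,
                              grid = seq(0, 5, length.out = 50)) {
  values <- vapply(grid, function(d) {
    cand <- l + d * direction   # candidate coefficients for this step length
    objective(cand / max(cand)) # rescale so the largest coefficient is 1
  }, numeric(1))
  grid[which.max(values)]       # first grid point that attains the maximum
}

# Toy objective that prefers a second coefficient close to 0.5.
toy_objective <- function(v) -(v[2] - 0.5)^2
grid_search_delta(toy_objective, l = c(1, 0), direction = c(0, 1))
```

Because every candidate is rescaled by its largest coefficient, step lengths that differ only in overall scale give the same objective value, so the grid only needs to cover a moderate range.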
Yu-chia Chang
The function is currently defined as:
function (y, z, l, h, ind.d.l) {
  l.i = matrix(rep(l, times = 50), nrow = 50, byrow = TRUE)
  delta = seq(0, 5, length = 50)
  m = delta %*% t(ind.d.l)
  l.i = l.i + m
  l.i.max = apply(l.i, 1, max)
  l.i = l.i/l.i.max
  theta = rep(0, 50)
  for (i in 2:50) {
    theta[i] = cntin(y, z, l.i[i, ], h)$theta.sh.h.p
  }
  delta.star = delta[which(theta == max(theta))]
  return(delta.star)
}