-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path10-bivariate-models.Rmd
596 lines (405 loc) · 28.9 KB
/
10-bivariate-models.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
# Bivariate models {#bivariate-models}
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(RJafroc)
library(ggplot2)
library(mvtnorm)
library(gridExtra)
library(grid)
library(png)
library(plot3D)
library(plotly)
```
## How much finished 90% {#bivariate-models-how-much-finished}
## Introduction {#bivariate-models-introduction}
Until now the focus of this book has been on the single rating per case scenario where a reader interprets a set of cases in a single modality. This chapter describes two approaches to the problem of two readers interpreting a common set of cases. The first approach extends the binormal model to paired interpretations while the second extends the contaminated binormal model to paired interpretations. The corresponding software implementations are CORROC2 (for correlated ROC) and CORCBM (for correlated CBM), respectively (because of the paired interpretation the two ratings per case are correlated).
In my book [@chakraborty2017observer] Chapter 21 was devoted to the CORROC2 algorithm and usage of the software was described in some detail. At that time [@zhai2017bivariate] was in press. It describes an alternate algorithm that has several advantages over CORROC2. Therefore, I have reconsidered the focus of this chapter which now outlines the correlated binormal model fitting procedure but does not detail how to use CORROC2 software. Instead I now include the work presented in [@zhai2017bivariate] including interactive 3D visualization of the distributions which is not possible in print. The abbreviated description showing how to extend the binormal model to a correlated binormal model, described in the first section, will help the reader better understand how to extend the contaminated binormal model to the correlated contaminated binormal model, described in the second section.
## The bivariate binormal model {#bivariate-models-binormal}
Described next is the bivariate *normal* distribution followed by its extension to the bivariate *binormal* distribution.
### The bivariate normal distribution
Sampling from the bivariate normal distribution $N_2$ is as follows:
\begin{equation}
\overrightarrow{z} \sim N_2\left( \overrightarrow{\mu}, \Sigma \right)
(\#eq:multivariate-sampling-model)
\end{equation}
Here $\overrightarrow{z}$ is a length-$2$ vector, with components $(z_1,z_2)$, of the observed z-samples for each case, $\overrightarrow{\mu}$ is a length-$2$ vector containing the means of the bivariate normal distribution and $\Sigma$ is a $2 \times 2$ covariance matrix describing the variances and correlations of the observed z-samples.
The bivariate normal probability density function $\text{pdf}\left( \overrightarrow{z} ~~ \bigg \rvert ~~\overrightarrow{\mu}, ~~ \Sigma \right)$ is defined by:
\begin{equation}
\text{pdf}\left( z_1,z_2 ~~ \bigg \rvert ~~ \overrightarrow{\mu}, \Sigma \right)
= \frac{1}{2 \pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}}\exp\left( -\frac{t}{2\left( 1-\rho^2 \right)} \right)
(\#eq:bivariate-models-binormal-density-function)
\end{equation}
where
\begin{equation}
t=\frac{\left( z_1-\mu_1 \right)^2}{\sigma_1^2}-\frac{2\rho\left( z_1-\mu_1 \right)\left( z_2-\mu_2 \right)}{\sigma_1 \sigma_2}
+\frac{\left( z_2-\mu_2 \right)^2}{\sigma_2^2}
(\#eq:bivariate-models-binormal-density-function2)
\end{equation}
For a bivariate distribution the covariance matrix is:
\begin{equation}
\Sigma=
\left( \begin{matrix}
\sigma_1^2 & \rho \sigma_1 \sigma_2 \\
\rho \sigma_1 \sigma_2 &
\sigma_2^2
\end{matrix}
\right)(\#eq:bivariate-models-binormal-covariance-matrix)
\end{equation}
In these equations $\mu_1$ and $\sigma_1^2$ is the mean and variance for modality 1, $\mu_2$ and $\sigma_2^2$ is the mean and variance for modality 2 and $\rho$ is the correlation between the two modalities.
The `R` function to evaluate Eqn. \@ref(eq:bivariate-models-binormal-density-function) is `dmvnorm()`, for “density of multivariate normal distribution”, available via package `mvtnorm` [@R-mvtnorm]. Its usage is illustrated next for the following parameter values:
\begin{equation}
\left.\begin{aligned}
\overrightarrow{
\mu}&=
\left( \begin{matrix}
\mu_1 \\
\mu_2
\end{matrix}
\right)
=\left( \begin{matrix}
1.5 \\
2.0
\end{matrix}
\right) \\
\sigma_1 &= 1.1 \\
\sigma_2 &= 1.5 \\
\rho &= 0.6 \\
\Sigma&=
\left(
\begin{matrix}
\sigma_1^2 & \rho \sigma_1 \sigma_2 \\
\rho \sigma_1 \sigma_2 & \sigma_2^2
\end{matrix}\right)
=
\left(
\begin{matrix}
1.1^2 & 0.6 \times 1.1 \times 1.5 \\
0.6 \times 1.1 \times 1.5 & 1.5^2
\end{matrix}\right)\end{aligned}\right\}
(\#eq:bivariate-models-binormal-parameters)
\end{equation}
In the following code the call to get the pdf occurs at line 10.
```{r, attr.source = ".numberLines"}
mu1 <- 1.5 # mean modality 1
mu2 <- 2.0 # mean modality 2
var1 <- 1.1^2 # variance modality 1
var2 <- 1.5^2 # variance modality 2
rho <- 0.6 # correlation between modalities 1 and 2
# construct covariance matrix Sigma
Sigma <- matrix(c(var1, rho*sqrt(var1*var2),
rho*sqrt(var1*var2), var2),2)
x <- c(0.1,0.2) # the x-vector at which to evaluate pdf
pdf <- dmvnorm(x, mean = c(mu1,mu2), sigma = Sigma)
cat("density at x1 = 0.1 and x2 = 0.2 = ", pdf, "\n")
```
The parameters describing the bivariate normal distribution are $\overrightarrow{\mu}$ and $\Sigma$. The total number of parameters is five: two means, two variances and a correlation coefficient.
### The bivariate binormal model {#bivariate-models-binormal-formulae}
The bivariate binormal model is an extension of the bivariate model just described to *two truth-states*: non-diseased and diseased, each modeled separately and independently as a bivariate normal distribution. This could potentially result in a doubling of the number of parameters but by appropriate shifting and scale transformations one can ensure that the two means of the bivariate distribution for non-diseased cases are zeroes and the corresponding variances are unity. This reduces the total number of parameters to six: $\mu_{12}$, $\mu_{22}$, $\sigma_{12}$, $\sigma_{22}$, $\rho_1$, $\rho_2$. Here $\mu_{12}$, $\mu_{22}$ are the means for diseased cases for modality 1 and modality 2, respectively, $\sigma_{12}^2$, $\sigma_{22}^2$ are the corresponding variances and $\rho_1$, $\rho_2$ are the correlations between the two modalities for non-diseased and diseased cases, respectively.
> Notation: henceforth when there are two subscripts the first subscript refers to the modality and the second to the truth state. When there is only one subscript it refers to the truth state.
The bivariate binormal parameters are:
\begin{equation}
\left.\begin{aligned}
\overrightarrow{
\mu_1}&=
\left( \begin{matrix}
0 \\
0
\end{matrix}
\right)\\
\overrightarrow{
\mu_2}&=
\left( \begin{matrix}
\mu_{12} \\
\mu_{22}
\end{matrix}
\right)\\
\Sigma_1&=
\left(
\begin{matrix}
1 & \rho_1 \\
\rho_1 & 1
\end{matrix}
\right) \\
\Sigma_2 &= \left(
\begin{matrix}
\sigma_{12}^2 & \rho_2 \sigma_{12} \sigma_{22} \\
\rho_2 \sigma_{12} \sigma_{22} & \sigma_{22}^2
\end{matrix}
\right)
\end{aligned}\right\}
(\#eq:bivariate-models-binormal-all-parameters)
\end{equation}
The decision variable is $z_{ik_tt}$ where the $i$ subscript corresponds to the two modalities and the $t$ subscript corresponds to the two truth states. The paired-ratings $\left( z_{1k_11},z_{2k_11} \right)$ and $\left( z_{1k_22},z_{2k_22} \right)$, corresponding to z-samples from non-diseased and diseased cases, respectively, are abbreviated to $\overrightarrow{z_{1k_tt}}$ (the vectorization always "condenses" the two modalities):
\begin{equation}
\overrightarrow{z_{k_tt}}=
\left( \begin{matrix}
z_{1k_tt} \\
z_{2k_tt}
\end{matrix} \right)
(\#eq:bivariate-models-binormal-notation)
\end{equation}
$\overrightarrow{z_{k_tt}}$ is sampled as follows:
\begin{equation}
\overrightarrow{z_{k_tt}} \sim N_2\left( \overrightarrow{\mu_t}, \Sigma_t \right)
(\#eq:bivariate-models-binormal-sampling)
\end{equation}
The parameters $\overrightarrow{\mu_t}, \Sigma_t$ are truth-state dependent as described in Eqn. \@ref(eq:bivariate-models-binormal-all-parameters).
To complete the model one needs to include threshold parameters and a decision rule. Recall that for an R-rating *single* modality ROC task with allowed ratings $r = 1, 2, ..., \text{R}$, one needs $\text{R}-1$ thresholds $\zeta_1,\zeta_2,...,\zeta_{\text{R}-1}$. Defining $\zeta_0 = -\infty$ and $\zeta_{\text{R}} = +\infty$ the decision rule is to label a case with rating $r$ if the realized z-sample satisfies $\zeta_{r-1} < z \le \zeta_r$.
In the two-modality task the decision variable and the thresholds are modality dependent: i.e., $z_{1k_tt}$ and $z_{2k_tt}$ and two sets of thresholds are needed: $\overrightarrow{\zeta_1} \equiv \zeta_{11},\zeta_{12},...,\zeta_{1 (\text{R}-1)}$ and $\overrightarrow{\zeta_2} \equiv\zeta_{21},\zeta_{22},...,\zeta_{2 (\text{S}-1)}$, where $R$ is the number of allowed ratings in modality 1 and $S$ is the number of allowed ratings in modality 2 (they can be different). As before $\zeta_{10} = -\infty$ and $\zeta_{20} = -\infty$ and $\zeta_{1\text{R}} = +\infty$ and $\zeta_{2\text{S}} = +\infty$. The decision rule is to rate a case in modality $1$ with rating $r$ if $\zeta_{1(r-1)} < z_{1k_tt} \le \zeta_{1r}$ and the same case in modality $2$ is rated $s$ if $\zeta_{2(s-1)} < z_{2k_tt} \le \zeta_{2s}$.
### Visualizing the bivariate binormal density functions {#bivariate-models-binormal-multivariate-density-visualization}
It is helpful to visualize the pdfs. One needs two axes to depict the two z-samples and a third to depict the pdf. This is done for non-diseased cases in Fig. \@ref(fig:bivariate-models-binormal-pdf-non-diseased), and for diseased cases in Fig. \@ref(fig:bivariate-models-binormal-pdf-diseased). The plots are interactive and the reader may wish to experiment by clicking and dragging the cursor on them. Package `plotly` [@R-plotly] provides the visualization technique.
The parameters are as follows:
\begin{equation}
\left.\begin{aligned}
\overrightarrow{
\mu_2}
&=\left( \begin{matrix}
1.5 \\
2.0
\end{matrix}
\right)
\\
\Sigma_1&=
\left(
\begin{matrix}
1 & 0.3 \\
0.3 & 1
\end{matrix}
\right) \\
\Sigma_2 &= \left(
\begin{matrix}
1.1^2 & 0.6 \times 1.1 \times 1.5 \\
0.6 \times 1.1 \times 1.5 & 1.5^2
\end{matrix}
\right)
\end{aligned}\right\}
(\#eq:bivariate-models-correlated-binormal-parameters)
\end{equation}
```{r, echo=FALSE}
mu1 <- 1.5 # diseased mean modality 1
mu2 <- 2.0 # diseased mean modality 2
var1 <- (1.1)^2 # diseased variance modality 1
var2 <- (1.5)^2 # diseased variance modality 2
rho1 <- 0.3 # non-diseased correlation
rho2 <- 0.6 # diseased correlation
Sigma1 <- matrix(c(1, rho1, rho1, 1),2)
Sigma2 <- matrix(c(var1, rho2*sqrt(var1*var2), rho2*sqrt(var1*var2), var2),2)
x <- seq(-4, 6, by = 0.25)
y <- x
M <- mesh(x, y)
X <- M$x
Y <- M$y
PDF1 <- dmvnorm(cbind(as.vector(X), as.vector(Y)), mean = c(0,0), sigma = Sigma1)
dim(PDF1) <- c(length(x), length(y))
PDF2 <- dmvnorm(cbind(as.vector(X), as.vector(Y)), mean = c(mu1, mu2), sigma = Sigma2)
dim(PDF2) <- c(length(x), length(y))
scene <- list(camera = list(eye = list(x = -1.25, y = -2, z = 1.25)))
p1 <- plot_ly(x = x, y = y, z = PDF1, type = "surface") %>% layout(scene=scene)
p2 <- plot_ly(x = x, y = y, z = PDF2, type = "surface") %>% layout(scene=scene)
```
```{r echo=FALSE, bivariate-models-binormal-pdf-non-diseased, fig.cap="Interactive plot of the bivariate binormal pdf function for non-diseased cases for $\\rho_1 = 0.3$. The peak is at (0,0) and each marginal distributions has unit variance. Modality 1 is plotted along x and modality 2 along y.", fig.show='hold'}
p1
```
```{r echo=FALSE, bivariate-models-binormal-pdf-diseased, fig.cap="Interactive plot of the bivariate binormal pdf function for diseased cases for $\\rho_2 = 0.6$. The peak is at (1.5, 2.0) and the marginal distributions for modality 1 has standard deviation 1.1 and that for modality 2 has standard deviation 1.5.", fig.show='hold'}
p2
```
### Estimating bivariate binormal model parameters {#bivariate-models-binormal-multivariate-density-estimation}
Chapter \@ref(binormal-model) described a method for estimating the parameters of the univariate binormal model. It involved maximizing the likelihood function, i.e., the probability of the observed data as a function of the model parameters. The values of the parameters at the maximum are the maximum-likelihood estimates (MLEs).
With a bivariate model one is dealing with six non-threshold parameters $\overrightarrow{\mu_t}, \Sigma_t$ plus threshold parameters for each modality. Again, the starting point is the likelihood function. The non-diseased case counts in bin $r$ of the first modality and bin $s$ of the second modality is denoted by $K_{rs1}$ (i.e., the binning indices occur before the truth index). The corresponding diseased counts are denoted $K_{rs2}$. Each case yields two integer ratings, $r$ and $s$.
For non-diseased cases, the probability $p_{rs1}$ of a z-sample in bin $r$ of the first modality and bin $s$ of the second modality is determined by the “volume” under the bivariate distribution $N_2(\overrightarrow {\mu_1}, \Sigma_1)$ between modality-1 thresholds $\zeta_{1(r-1)}$ and $\zeta_{1r}$, and between modality-2 thresholds $\zeta_{2(s-1)}$ and $\zeta_{2s}$. For diseased cases the corresponding probability $p_{rs2}$ is the volume under the bivariate distribution $N_2(\overrightarrow {\mu_2}, \Sigma_2)$ between the same thresholds. These probabilities can be calculated using the `pmvnorm()` function in package [@R-mvtnorm].
The probability of observing $K_{rs1}$ non-diseased cases and $K_{rs2}$ diseased cases in bin $r$ in the first modality and bin $s$ in the second modality is:
\begin{equation}
\prod_{t=1}^{2}\left ( p_{rst} \right )^{K_{rst}}
\end{equation}
The logarithm of the likelihood function is given by (neglecting a combinatorial factor that does not depend on the parameters):
\begin{equation}
LL\left ( \overrightarrow{\mu_2}, \Sigma_1, \Sigma_2, \overrightarrow{\zeta_1},\overrightarrow{\zeta_2} \right ) = \sum_{t=1}^{2}\sum_{s=1}^{S}\sum_{r=1}^{R} K_{rst} \log(p_{rst})
\end{equation}
The MLEs of the parameters are obtained by maximizing the LL function, [@metz1980statistical; @metz1984new], implemented in CORROC2. A unique feature is that the software measures ratings-correlations at the underlying z-sample level. Much as I have emphasized that ratings are not "hard" numbers the CORROC2 estimated correlations are valid because the algorithm models the ratings as continuous variables and estimates the correlation based on the bivariate binormal model.
### Comments on CORROC2
CORROC2 was developed as described in [@metz1984new]. Subsequent revisions to the
program were made by Jong-Her Shen and more recently by Benjamin Herman [@metz1989corroc2; @metz1998statistical]. There are at least 116 citations to this software. One reason is that at the time (mid 1980s) it was the only software that allowed significance testing of paired single-reader datasets. However, no advances have been made in the intervening 3 decades, which would allow fitting proper ROC curves to paired datasets. There has been a shift to empirical AUC based analysis which does not require parametric modeling. CORROC2 is currently a relatively under-utilized tool. One reason is that it is not designed to analyze multiple readers interpreting a common set of cases in two or more modalities i.e., MRMC datasets, which is of current interest. The main weakness of CORROC2 is its dependence on the underlying binormal model which, as we have seen, is susceptible to improper ROC curves and degeneracy problems.
## The bivariate contaminated binormal model {#bivariate-models-contaminated-binormal}
The bivariate contaminated binormal model (BCBM) is the bivariate extension of the univariate contaminated binormal model described in Chapter \@ref(proper-roc-models). Corresponding to the two truth states the BCBM is defined by two bivariate distributions as described next.
### Non-diseased cases
For non-diseased cases the sampling is as follows:
\begin{equation}
\overrightarrow{z_{k_11}} \sim N_2\left( \overrightarrow{\mu_1}, \Sigma_1 \right)
(\#eq:bivariate-models-binormal-sampling-teq1)
\end{equation}
where
\begin{equation}
\left.\begin{aligned}
\overrightarrow{
\mu_1}&=
\left( \begin{matrix}
0 \\
0
\end{matrix}
\right)\\
\Sigma_1&=
\left(
\begin{matrix}
1 & \rho_1 \\
\rho_1 & 1
\end{matrix}
\right) \\
\end{aligned}\right\}
(\#eq:bivariate-models-cbm-non-diseased)
\end{equation}
### Diseased cases
Recall, Eqn. \@ref(eq:proper-roc-models-cbm-pdfd), that in the (univariate) CBM, for diseased cases the pdf is a mixture of two scaled unit variance distributions, one centered at zero and the other at $\mu$ that integrate (i.e., via the scale factors) to $(1-\alpha)$ and $\alpha$, respectively.
The extension to bivariate sampling needs to account for the possibility that the scale factors can be different in the two modalities, modeled by $\alpha_1$ and $\alpha_2$, and the z-sample means for cases where the disease is visible can be different in the two modalities, modeled by $\mu_{12}$ and $\mu_{22}$ and finally one must account for four possibilities described next:
1. The disease is not visible in either modality. The appropriate distribution is a scaled bivariate $N_2$ distribution centered at $\overrightarrow{\mu_{00}} \equiv (0,0)$ with correlation $\rho_{00}$ whose pdf integrates to $(1-\alpha_1)(1-\alpha_2)$.
1. The disease is visible in modality 1 but not visible in modality 2. The appropriate distribution is a scaled bivariate $N_2$ distribution centered at $\overrightarrow{\mu_{10}} \equiv (\mu_{12},0)$ with correlation $\rho_{10}$ whose pdf integrates to $\alpha_1(1-\alpha_2)$.
1. The disease is not visible in modality 1 but is visible in modality 2. The appropriate distribution is a scaled bivariate $N_2$ distribution centered at $\overrightarrow{\mu_{01}} \equiv (0,\mu_{22})$ with correlation $\rho_{01}$ whose pdf integrates to $(1-\alpha_1)\alpha_2$.
1. The disease is visible in both modalities 1 and 2. The appropriate distribution is a scaled bivariate $N_2$ distribution centered at $\overrightarrow{\mu_{11}} \equiv (\mu_{12},\mu_{22})$ with correlation $\rho_{11}$ whose pdf integrates to $\alpha_1\alpha_2$.
The means and covariance matrices for the four distributions are defined by:
\begin{equation}
\left.\begin{aligned}
\begin{matrix}
\overrightarrow{
\mu_{00}}&=
\left( \begin{matrix}
0 \\
0
\end{matrix}
\right)
& \Sigma_{00}&=
\left(
\begin{matrix}
1 & \rho_{00} \\
\rho_{00} & 1
\end{matrix}
\right) \\
\\
\overrightarrow{
\mu_{10}}&=
\left( \begin{matrix}
\mu_{12} \\
0
\end{matrix}
\right)
& \Sigma_{10}&=
\left(
\begin{matrix}
1 & \rho_{10} \\
\rho_{10} & 1
\end{matrix}
\right) \\
\\
\overrightarrow{
\mu_{01}}&=
\left( \begin{matrix}
0 \\
\mu_{22}
\end{matrix}
\right)
& \Sigma_{01}&=
\left(
\begin{matrix}
1 & \rho_{01} \\
\rho_{01} & 1
\end{matrix}
\right) \\
\\
\overrightarrow{
\mu_{11}}&=
\left( \begin{matrix}
\mu_{12} \\
\mu_{22}
\end{matrix}
\right)
& \Sigma_{11}&=
\left(
\begin{matrix}
1 & \rho_{11} \\
\rho_{11} & 1
\end{matrix}
\right) \\
\end{matrix}
\end{aligned}\right\}
(\#eq:bivariate-models-cbm-diseased)
\end{equation}
Therefore, the pdf of the diseased case bivariate distribution is:
\begin{equation}
\left.\begin{aligned}
\text{pdf}\left (z_{1k_22},z_{2k_22} \right ) =
&(1-\alpha_1)(1-\alpha_2)~~\text{pdf}\left ( z_{1k_22},z_{2k_22} ~| ~\overrightarrow{\mu_{00}}, \Sigma_{00}\right )\\
+& \alpha_1(1-\alpha_2)~~\text{pdf}\left ( z_{1k_22},z_{2k_22} ~| ~ \overrightarrow{\mu_{10}}, \Sigma_{10}\right )\\
+& (1-\alpha_1)\alpha_2~~\text{pdf}\left ( z_{1k_22},z_{2k_22} ~| ~ \overrightarrow{\mu_{01}}, \Sigma_{01}\right )\\
+& \alpha_1\alpha_2~~\text{pdf}\left ( z_{1k_22},z_{2k_22} ~| ~\overrightarrow{\mu_{11}}, \Sigma_{11}\right )\\
\end{aligned}\right\}
(\#eq:bivariate-models-cbm-pdf-diseased)
\end{equation}
In this equation $\text{pdf}\left ( z_{1k_22},z_{2k_22} ~| ~\overrightarrow{\mu}, \Sigma \right )$ denotes the bivariate normal pdf function with the specified mean-vector and covariance matrix, as defined in Eqn. \@ref(eq:bivariate-models-binormal-density-function).
3D interactive visualization plots that follow are for the following parameters:
```{r}
mu_1 <- 3.5
mu_2 <- 3
rho1 <- 0.3
rho4 <- 0.8
alpha_1 <- 0.5
alpha_2 <- 0.6
```
```{r echo=FALSE}
mu_00 <- c(0, 0)
mu_01 <- c(0, mu_2)
mu_10 <- c(mu_1, 0)
mu_11 <- c(mu_1, mu_2)
rho2 <- (rho1 + rho4) / 2
rho3 <- rho2
sigma1 <- rbind(c(1, rho1), c(rho1, 1))
sigma2 <- rbind(c(1, rho2), c(rho2, 1))
sigma3 <- rbind(c(1, rho3), c(rho3, 1))
sigma4 <- rbind(c(1, rho4), c(rho4, 1))
x <- seq(-3, 6, by = 0.05)
y <- x
M <- mesh(x, y)
X <- M$x
Y <- M$y
pdf <- (1 - alpha_1) * (1 - alpha_2) * dmvnorm(cbind(as.vector(X), as.vector(Y)), sigma = sigma1) +
(1 - alpha_1) * alpha_2 * dmvnorm(cbind(as.vector(X), as.vector(Y)), mean = mu_01, sigma = sigma2) +
alpha_1 * (1 - alpha_2) * dmvnorm(cbind(as.vector(X), as.vector(Y)), mean = mu_10, sigma = sigma3) +
alpha_1 * alpha_2 * dmvnorm(cbind(as.vector(X), as.vector(Y)), mean = mu_11, sigma = sigma4)
dim(pdf) <- c(length(x), length(y))
scene <- list(camera = list(eye = list(x = -1.25, y = -2, z = 1.25)))
# https://plotly.com/python/3d-camera-controls/
# scene <- list(camera = list(eye = list(x = 0, y = 0, z = 1)))
p <- plot_ly(x = x, y = y, z = pdf, type = "surface") %>% layout(scene=scene)
#print(hide_colorbar(p))
```
```{r bivariate-models-cbm-diseased, fig.cap="Interactive surface plot showing the 4 distributions that are needed to describe diseased case sampling.", fig.show='hold', fig.align = "center", echo = F}
p
```
Fig. \@ref(fig:bivariate-models-cbm-diseased) this pdf function is shown as a surface plot and four peaks are visible.
[@zhai2017bivariate] prove that with the above scale factors the pdf and CDF (cumulative distribution function) of the marginal distributions for diseased cases in modalities 1 and 2 have the same structure as in the univariate CBM, i.e., either condition, viewed in isolation, has the same sampling behavior as the univariate CBM.
The above model has four correlations in contrast to the bivariate binormal model which has two correlations. For parsimony the following assumptions are made:
1. When the disease is invisible in both modalities the correlation is the same as that for non-diseased cases, i.e., $\rho_{00} \equiv \rho_1$.
1. When the disease is visible in both modalities the correlation is $\rho_{11} \equiv \rho_2$.
1. When the disease is visible in only one modality the correlation is the mean of the non-diseased cases correlation and the correlation where disease is visible in both conditions, i.e., $\rho_{10} = \rho_{01} = (\rho_1+\rho_2)/2$.
### Estimation
This follows the procedure described previously for the correlated binormal model (i.e., CORROC2). One introduces binning thresholds, defines the likelihood function and applies a procedure for maximizing it. Details are in [@zhai2017bivariate] which also shows applications to two real datasets and a simulation study.
### Application to Van Dyke dataset
Fig. \@ref(fig:bivariate-models-figures1); CORCBM fits for Van Dyke data for reader 4. CORROC2 failed to converge.
```{r bivariate-models-figures1, echo=FALSE,out.width="50%", fig.cap="VanDyke dataset, reader 4. Left plot: modality 1, right plot: modality 2. CORROC2 failed to converge and therefore only CORCBM plots are shown.", fig.show='hold'}
knitr::include_graphics(c("images/corcbm/plots/VanDykeM1R4.png","images/corcbm/plots/VanDykeM2R4.png"))
```
Fig. \@ref(fig:bivariate-models-figures2); CORROC2 (dashed line) and CORCBM (solid line) fits for Van Dyke data for reader 5.
```{r bivariate-models-figures2, echo=FALSE,out.width="50%", fig.cap="VanDyke dataset, reader 5. Left plot: modality 1, right plot: modality 2; both CORROC2 (dashed line) and CORCBM (solid line) plots are shown. Note that CORCBM has higher AUC than CORROC2.", fig.show='hold'}
knitr::include_graphics(c("images/corcbm/plots/VanDykeM1R5.png","images/corcbm/plots/VanDykeM2R5.png"))
```
### Simulation study
Fig. \@ref(fig:bivariate-models-figures3); simulation parameters were $\mu_1 = 1.5$, $\mu_2 = 3$, $\alpha_1 = 0.4$, $\alpha_2 = 0.7$, $\rho_1 = 0.3$, $\rho_2 = 0.8$.
```{r bivariate-models-figures3, echo=FALSE,out.width="50%", fig.cap="CORCBM fits to simulated data for two modalities labeled \"conditions\". Top-left: 50 non-diseased and 50 diseased cases; Top-right: 100 non-diseased and 100 diseased cases; Bottom-left: 1000 non-diseased and 1000 diseased cases; Bottom-right: 5000 non-diseased and 5000 diseased cases. For 5000-5000 cases the operating points are almost exactly fitted by CORCBM.", fig.show='hold'}
knitr::include_graphics(c("images/corcbm/plots/Simu-50-50.png","images/corcbm/plots/Simu-100-100.png", "images/corcbm/plots/Simu-1000-1000.png", "images/corcbm/plots/Simu-5000-5000.png"))
```
## Discussion {#bivariate-models-discussion}
Described is an approach to fitting paired ROC datasets using a bivariate extension of the contaminated binormal model. Software implementing the method, CORCBM, was applied to two datasets that have been widely used to test methodological advances in ROC analysis. For comparison CORROC2, which implements the bivariate binormal model, was also applied to the same datasets. CORCBM was able to fit the paired data for all 9 readers (5 from the Van Dyke dataset and 4 from the Franken dataset) while CORROC2 was unable to fit one reader in the Van Dyke dataset. Inability to fit a dataset, especially for a reader whose performance is close to perfect is, in my opinion, a serious shortcoming. [@vannier1994craniosynostosis] have reported a study where CORROC2 was unable to fit 14 out of 16 clinical datasets. All CORROC2 fits are improper implying a range of predicted performance that is below the chance diagonal for expert radiologists, another serious shortcoming.
In our opinion extending PROPROC, a relatively complex algorithm, whose software is not in the public domain, to paired data would be a challenge. Moreover, since it is based on the binormal model, it would still be unable to yield reasonable fits to degenerate datasets. The authors have confirmed that given a dataset with 100 non-diseased and 100 diseased cases yielding a single operating point at (0, TPF), where 0 < TPF < 1, PROPROC yields AUC = 1 regardless of how small the value of TPF is, which is unreasonable. Likewise, given a dataset yielding a single operating point at (FPF, 0), where 0 < FPF < 1, PROPROC still yields AUC = 1 regardless of how large the value of FPF is, which is also unreasonable. The CBM model is simpler and has no issues with fitting degenerate datasets. In my opinion, the CBM model parameters are easier to interpret than other proper ROC models, and as shown in this work, it can be extended to the bivariate situation.
Our interest in this problem arose from a methodology validation issue. Multiple-reader multiple case (MRMC) study designs are widely used to compare modalities and several analysis methods have been proposed. The Dorfman Berbaum Metz method or the substantially equivalent Obuchowski and Rockette method have been used in a few hundred publications (Prof. Kevin Berbaum, private communication, ca. 2010). With such wide usage the results of any of which could be a critical driver of medical imaging technology, one needs to be sure that the analysis method is valid. Validating a methodology requires an MRMC ratings data simulator. The simulator used so far to validate DBM is that proposed by Roe and Metz. This simulator has several shortcomings. It assumes that the ratings are distributed like an equal-variance binormal model, which is not true for most clinical datasets. CBM provides a natural explanation of the observed unequal variance: since there are two types of disease – visible and not visible –the mixing of the two gives arise to a diseased case pdf that appears wider than a non-diseased case pdf. The Roe-Metz simulator is out dated: the parameter values are based on datasets then available (i.e., prior to 1997). Medical imaging technology has changed substantially in the intervening decades, and modalities that did not exist then, like digital mammography, breast and chest tomosynthesis, are in wide usage in the US. Finally, the method used to arrive at the Roe and Metz parameter values is not clearly described. Needed is a realistic simulator that is calibrated by a clearly defined method to current datasets.
My approach to this problem originally used CORROC2 to determine the different types of ratings-level correlations present in an ROC-MRMC dataset. The correlations were used to design a simulator that incorporated the measured values. However, CORROC2 frequently failed to converge in many instances and other discrepancies were found when CORROC2 was applied to newer datasets (CORROC2 gave reasonable results for the Van Dyke and the Franken datasets, but failed to converge on almost all newer datasets). This is what led to this work.
An application of CORCBM to calibrating a simulator to a clinical data set is described in the next Chapter TBA 23.