Introduction
Related works
Method
Stage 1
Multi-headed self-attention (MHSA)
Stage 2
Data | Field strength | Low risk | Medium risk | High risk |
---|---|---|---|---|
Internal | 1.5 T | 40 | 40 | 40 |
Internal | 3 T | 0 | 14 | 20 |
External | 1.5 T | 0 | 0 | 0 |
External | 3 T | 40 | 26 | 20 |

Field strength | Modality | Slice thickness | Axial resolution | Sequence |
---|---|---|---|---|
1.5 T | T2 | 3–4 mm | 0.325–0.625 mm | Turbo SE |
1.5 T | ADC | 3.6–4 mm | 2–2.5 mm | Single-shot EP |
3 T | T2 | 3–3.6 mm | 0.325–0.5 mm | Turbo SE |
3 T | ADC | 3.6–4 mm | 2–2.5 mm | Single-shot EP |
Dataset preparation
Pre-processing and augmentation
Dataset
Experimental setup
Model
Layer | Encoder convolutions | Encoder downsample | Encoder output (\(c\times x\times y\times z\)) | Decoder convolutions | Decoder upsample | Decoder output (\(c\times x\times y\times z\)) |
---|---|---|---|---|---|---|
Conv Block 1 | \(\begin{bmatrix} 3\times 3\times 1, 32 \\ 3\times 3\times 1, 32 \end{bmatrix}\) \(\times 2\) | 2D Max-Pooling | \(32\times 64\times 64\times 12\) | \(\begin{bmatrix} 3\times 3\times 3, 32 \\ 3\times 3\times 3, 256 \end{bmatrix}\) \(\times 2\) | Bi-linear | \(256\times 32\times 32\times 12\) |
Conv Block 2 | \(\begin{bmatrix} 3\times 3\times 1, 64 \\ 3\times 3\times 1, 64 \end{bmatrix}\) \(\times 2\) | 2D Max-Pooling | \(64\times 32\times 32\times 12\) | \(\begin{bmatrix} 3\times 3\times 1, 128 \\ 3\times 3\times 1, 128 \end{bmatrix}\) \(\times 2\) | Bi-linear | \(64\times 64\times 64\times 12\) |
Conv Block 3 | \(\begin{bmatrix} 3\times 3\times 1, 128 \\ 3\times 3\times 1, 128 \end{bmatrix}\) \(\times 2\) | 2D Max-Pooling | \(128\times 16\times 16\times 12\) | \(\begin{bmatrix} 3\times 3\times 1, 64 \\ 3\times 3\times 1, 64 \end{bmatrix}\) \(\times 2\) | Bi-linear | \(32\times 256\times 256\times 12\) |
Conv Block 4 | \(\begin{bmatrix} 3\times 3\times 3, 256 \\ 3\times 3\times 3, 256 \end{bmatrix}\) \(\times 2\) | None | \(128\times 16\times 16\times 12\) | \(\begin{bmatrix} 3\times 3\times 1, 32 \\ 3\times 3\times 1, 1 \end{bmatrix}\) \(\times 2\) | None | \(1\times 256\times 256\times 12\) |
Single-domain generalisation experiment
Perturbation experiments
Training and evaluation
Results
Single-domain generalisation
Method | Accuracy | Specificity | Precision | Recall | AUC |
---|---|---|---|---|---|
Discrete representations | |||||
Ours | \(\mathbf {0.731\pm 0.028}\) | \(\mathbf {0.720\pm 0.028}\) | \(\mathbf {0.727\pm 0.046}\) | \(\mathbf {0.736\pm 0.049}\) | \(\mathbf {0.739\pm 0.041}\) |
BigAug [6] | |||||
ResNet-50 | \(0.720\pm 0.073\) | \(0.701\pm 0.078\) | \(0.719\pm 0.055\) | \(0.724\pm 0.047\) | \(0.724\pm 0.053\) |
Hybrid vision transformer | \(0.726\pm 0.059\) | \(0.726\pm 0.033\) | \(0.720\pm 0.059\) | \(0.730\pm 0.047\) | \(0.731\pm 0.060\) |
3D Vision transformer | \(0.646\pm 0.083\) | \(0.663\pm 0.092\) | \(0.622\pm 0.082\) | \(0.648\pm 0.090\) | \(0.641\pm 0.087\) |
ProstAdv [23] | |||||
ResNet-50 | \(0.717\pm 0.066\) | \(0.708\pm 0.069\) | \(0.721\pm 0.068\) | \(0.729\pm 0.070\) | \(0.730\pm 0.057\) |
Hybrid vision transformer | \(0.722\pm 0.049\) | \(0.711\pm 0.045\) | \(0.726\pm 0.061\) | \(0.729\pm 0.047\) | \(0.729\pm 0.062\) |
Vision transformer | \(0.620\pm 0.064\) | \(0.631\pm 0.087\) | \(0.618\pm 0.077\) | \(0.633\pm 0.083\) | \(0.625\pm 0.082\) |
JiGen [24] | |||||
ResNet-50 | \(0.690\pm 0.052\) | \(0.678\pm 0.073\) | \(0.691\pm 0.088\) | \(0.696\pm 0.070\) | \(0.699\pm 0.062\) |
Hybrid vision transformer | \(0.701\pm 0.079\) | \(0.683\pm 0.099\) | \(0.702\pm 0.089\) | \(0.695\pm 0.082\) | \(0.704\pm 0.079\) |
Vision transformer | \(0.600\pm 0.103\) | \(0.595\pm 0.117\) | \(0.606\pm 0.092\) | \(0.610\pm 0.096\) | \(0.608\pm 0.094\) |
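
The four threshold-based metrics reported above follow directly from confusion-matrix counts, and AUC can be read as the probability that a positive case receives a higher score than a negative one. The sketch below is a minimal illustration that assumes a binary classification setting; the function names, counts, and scores are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, specificity, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp),  # true-negative rate
        "precision": tp / (tp + fp),    # positive predictive value
        "recall": tp / (tp + fn),       # sensitivity / true-positive rate
    }


def pairwise_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and scores, for illustration only.
metrics = classification_metrics(tp=30, fp=10, tn=28, fn=12)
auc = pairwise_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
print(metrics, auc)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for test sets of the size listed in the data tables; a rank-based implementation would be preferable for larger cohorts.
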
Perturbation experiments
Method | Baseline | Gauss | Poisson | S&P | Blur | Motion |
---|---|---|---|---|---|---|
Low risk group | ||||||
ResNet-50 | \(0.718\pm 0.074\) | \(0.665\pm 0.101\) | \(0.689\pm 0.092\) | \(0.680\pm 0.076\) | \(0.698\pm 0.086\) | \(0.664\pm 0.113\) |
Hybrid vision transformer | \(0.750\pm 0.046\) | \(0.708\pm 0.090\) | \(0.708\pm 0.083\) | \(0.698\pm 0.090\) | \(0.714\pm 0.084\) | \(0.685\pm 0.097\) |
Vision transformer | \(0.666\pm 0.086\) | \(0.616\pm 0.068\) | \(0.632\pm 0.064\) | \(0.616\pm 0.090\) | \(0.630\pm 0.074\) | \(0.624\pm 0.074\) |
Ours | \(0.777\pm 0.068\) | \(\mathbf {0.764\pm 0.061}\) | \(\mathbf {0.758\pm 0.083}\) | \(\mathbf {0.754\pm 0.052}\) | \(\mathbf {0.770\pm 0.071}\) | \(\mathbf {0.740\pm 0.084}\) |
Medium risk group | ||||||
ResNet-50 | \(0.732\pm 0.084\) | \(0.680\pm 0.114\) | \(0.694\pm 0.113\) | \(0.696\pm 0.083\) | \(0.699\pm 0.090\) | \(0.681\pm 0.113\) |
Hybrid vision transformer | \(0.773\pm 0.054\) | \(0.724\pm 0.090\) | \(0.706\pm 0.114\) | \(0.710\pm 0.084\) | \(0.727\pm 0.098\) | \(0.695\pm 0.104\) |
Vision transformer | \(0.670\pm 0.083\) | \(0.620\pm 0.075\) | \(0.636\pm 0.054\) | \(0.621\pm 0.078\) | \(0.641\pm 0.065\) | \(0.609\pm 0.079\) |
Ours | \(0.781\pm 0.075\) | \(\mathbf {0.770\pm 0.061}\) | \(\mathbf {0.761\pm 0.084}\) | \(\mathbf {0.759\pm 0.054}\) | \(\mathbf {0.768\pm 0.074}\) | \(\mathbf {0.742\pm 0.085}\) |
High risk group | ||||||
ResNet-50 | \(0.740\pm 0.090\) | \(0.694\pm 0.118\) | \(0.704\pm 0.114\) | \(0.708\pm 0.080\) | \(0.710\pm 0.093\) | \(0.688\pm 0.124\) |
Hybrid vision transformer | \(0.781\pm 0.068\) | \(0.729\pm 0.094\) | \(0.721\pm 0.118\) | \(0.728\pm 0.093\) | \(0.741\pm 0.134\) | \(0.694\pm 0.116\) |
Vision transformer | \(0.671\pm 0.093\) | \(0.636\pm 0.071\) | \(0.633\pm 0.051\) | \(0.628\pm 0.081\) | \(0.646\pm 0.064\) | \(0.609\pm 0.083\) |
Ours | \(0.785\pm 0.071\) | \(\mathbf {0.770\pm 0.069}\) | \(\mathbf {0.762\pm 0.081}\) | \(\mathbf {0.759\pm 0.054}\) | \(\mathbf {0.775\pm 0.077}\) | \(\mathbf {0.747\pm 0.084}\) |
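
One way to compare robustness across methods is to express each perturbed score as a percentage drop from the same method's unperturbed baseline. The sketch below does this for the low risk rows of the table above, using the reported mean values only. The helper name `relative_drop` is hypothetical, and this is not necessarily how any summary figures in the paper were derived.

```python
# Mean scores copied from the low risk rows of the perturbation table.
LOW_RISK = {
    "ResNet-50": {"Baseline": 0.718, "Gauss": 0.665, "Poisson": 0.689,
                  "S&P": 0.680, "Blur": 0.698, "Motion": 0.664},
    "Hybrid vision transformer": {"Baseline": 0.750, "Gauss": 0.708, "Poisson": 0.708,
                                  "S&P": 0.698, "Blur": 0.714, "Motion": 0.685},
    "Vision transformer": {"Baseline": 0.666, "Gauss": 0.616, "Poisson": 0.632,
                           "S&P": 0.616, "Blur": 0.630, "Motion": 0.624},
    "Ours": {"Baseline": 0.777, "Gauss": 0.764, "Poisson": 0.758,
             "S&P": 0.754, "Blur": 0.770, "Motion": 0.740},
}


def relative_drop(row):
    """Percentage drop of each perturbed score relative to the baseline score."""
    base = row["Baseline"]
    return {
        name: 100.0 * (base - value) / base
        for name, value in row.items()
        if name != "Baseline"
    }


for method, row in LOW_RISK.items():
    drops = relative_drop(row)
    worst = max(drops, key=drops.get)  # perturbation with the largest drop
    print(f"{method}: worst case {worst} ({drops[worst]:.2f}% drop)")
```

Because only the means are used, this comparison ignores the reported standard deviations, so small differences between methods should be read with caution.
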
Method | Low risk | Medium risk | High risk |
---|---|---|---|
ResNet-50 | 51.0 | 50.0 | 57.1 |
Hybrid vision transformer | 41.8 | 34.6 | 38.3 |
Vision transformer | 46.7 | 47.1 | 55.1 |