Introduction
To date, cancer remains one of the major causes of death worldwide; in 2020, approximately 20 million new cases were identified. The routine diagnosis of cancer is performed by invasive biopsy sampling, which is considered inaccurate [1], given that cancers are heterogeneous; thus, small biopsy samples cannot accurately characterize the given lesion as a whole [2]. In addition, biopsies are painful, increase the risk of infection, and, in general, may reduce the quality of life of patients [3]. Positron emission tomography (PET)/computed tomography (CT) and, more recently, PET/magnetic resonance imaging (MRI) hybrid imaging techniques have been playing a crucial role in in vivo cancer detection and characterization [4–6]. Radiomics is the process of extracting numerical features from medical images in order to characterize diseases in vivo [7]. Recent advancements in the field of PET radiomics combined with machine learning approaches have demonstrated promising results in predicting clinical endpoints [2, 4, 8–10]. Nevertheless, radiomic models are challenged by factors related to metabolic variations across patients in PET, as well as variations in imaging, delineation, and radiomic feature extraction parameters [7]. While the Imaging Biomarker Standardization Initiative (IBSI) [11] has helped to standardize the process of performing feature extraction from medical images, the recently reported joint EANM/SNMMI guideline for radiomics in nuclear medicine lays out the foundations of quantitative radiomics on a wide spectrum of analysis aspects to characterize diseases in vivo [7]. Still, various challenges remain, such as small training datasets combined with complex and difficult-to-interpret prediction models that do not support the process of clinical adoption. Consistently, to date, the wide-scale clinical adoption of AI-driven approaches relying on PET in cancer patients is yet to be witnessed [12].
Quantum computing is an emerging field with the promise to revolutionize computationally complex problems such as modeling and simulation, optimization, and artificial intelligence (AI) [13]. While classical computers operate with bits that can have a value of either 0 or 1, quantum computing operates with qubits that can represent both 0 and 1 values with a probabilistic outcome, by encoding complex information [13, 14]. Relying on quantum phenomena such as interference, superposition, and entanglement, the so-called quantum circuits can model complex real-life computational problems with simple qubit gate calculations [15]. While the public perception of quantum advantage is associated with superior computing speed, quantum advantage has many different forms. Specifically, one may encode an N-dimensional vector into log2(N) qubits, which results in a speedup as well as in a much lower quantum algorithmic complexity compared to its classic computing counterpart [13, 16]. This simplified search space naturally aids the training process of quantum machine learning (QML) approaches compared to their classic computing counterparts [13, 14]. Consistently, various QML studies have demonstrated the feasibility of both estimating [14] and achieving [13, 17, 18] a higher predictive performance when relying on quantum ML approaches compared to classic ML [19]. Recently, it has also been demonstrated that QML requires less training data than classic ML does to build high-performing predictive models [20]. To date, various quantum algorithms (a.k.a. quantum circuits) have been proposed for existing, so-called noisy intermediate-scale quantum (NISQ) computers [15]. Nevertheless, most problem fields cannot efficiently utilize NISQs due to their low qubit count and high noise levels [15]. Ongoing activities in this regard focus on proposing and implementing error mitigation techniques that can counterbalance quantum gate as well as measurement errors [21–24]. In general, the majority of quantum computing research focuses on extending the number of qubits and minimizing noise in future quantum hardware, and tends to underestimate the importance of existing NISQs, as they are challenging to scale [15]. In contrast, the mentioned advantageous properties of quantum computing render it an interesting candidate to further advance PET radiomic research.
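To make the qubit-count argument concrete, the following minimal sketch illustrates how an N-dimensional feature vector can be mapped to the amplitudes of a register of log2(N) qubits. It assumes amplitude encoding, which is consistent with the feature and qubit counts reported in this study (8 features with 3 qubits, 16 features with 4 qubits) but is not spelled out in this section; the function name `amplitude_encode` and the feature values are hypothetical and serve illustration only.

```python
import math


def amplitude_encode(features):
    """Return (amplitudes, n_qubits) for a real-valued feature vector.

    Assumption: amplitude encoding. The vector is zero-padded to the next
    power of two and scaled to unit Euclidean norm, so that the squared
    amplitudes sum to 1 as required for a valid quantum state.
    """
    n = len(features)
    if n == 0:
        raise ValueError("need at least one feature")
    # Smallest number of qubits whose 2**n_qubits amplitudes can hold n values.
    n_qubits = max(1, (n - 1).bit_length())
    padded = list(features) + [0.0] * (2 ** n_qubits - n)
    norm = math.sqrt(sum(x * x for x in padded))
    if norm == 0.0:
        raise ValueError("cannot encode the all-zero vector")
    return [x / norm for x in padded], n_qubits


# Hypothetical 16-feature vector; the values are made up for illustration.
features = [float(i + 1) for i in range(16)]
amplitudes, n_qubits = amplitude_encode(features)
print(n_qubits)  # 4
```

Note that this sketch only computes the target amplitudes classically; preparing such a state on hardware requires a state-preparation circuit, which is outside the scope of the example.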
In light of the challenges radiomics and machine learning are facing in the field of cancer research, we hypothesize that by relying on existing NISQs combined with novel error mitigation techniques, quantum advantage can be achieved in clinically relevant cancer cohorts. Therefore, this study had the following objectives: (a) to compare classic and quantum ML predictive performances relying on cross-validation techniques when predicting clinical endpoints in various cancer patients; (b) to investigate whether the magnitude of quantum advantage in terms of QML predictive performance can be accurately estimated in the cancer datasets prior to engaging with NISQs; and (c) to investigate the feasibility of utilizing a real quantum computer combined with novel error mitigation techniques for QML prediction in the collected cancer datasets.
Discussion
In this study, we proposed a comprehensive approach to optimize quantum circuits combined with error mitigation. These approaches made quantum machine learning (QML) in clinically relevant PET radiomic datasets feasible on both simulators and real quantum hardware. In addition, we compared results derived using QML with their classic ML (CML) counterparts, while ensuring a fair comparison by following the guidelines in [43].
Our findings confirm that quantum advantage can be efficiently estimated without engaging with quantum computing by relying on the previously proposed geometric difference (GDQ) score as defined in [14]. Accordingly, we found that when GDQ > 1.0, QML can outperform CML already in simulator environments with up to +4% balanced accuracy (BACC) and with a narrowed confidence interval (CI), implying improved robustness of QML. Furthermore, our quantum circuit optimization and error mitigation approaches resulted in feasible QML circuit evaluations on real quantum hardware when relying on simple circuits and a minimal number of circuit measurements.
On average, the geometric difference (GDQ) scores across the dataset variants were higher than 1.0 with 16 features and 4 qubits, implying a high likelihood of achieving QML advantage. When GDQ > 1.0 and QML failed to outperform CML, the underlying dataset had an SRT of 0.9, which is logical, as such a high SRT increases the number of redundant radiomic features and thus the chance of overfitting [7, 46, 47]. At the same time, all inferior QML approaches that had GDQ ≤ 1.0 were built with 8 features and 3 qubits. Correlating GDQ scores with QML-CML relative test BACCs demonstrated that when GDQ > 1.0, relatively high GDQ scores (e.g., > 5.0) do not necessarily yield higher-performing QML approaches compared to CML. Accordingly, proposing feature ranking approaches for QML that build solely on the maximization of GDQ is not advised, as feature redundancy and ML-specific behavior also have to be accounted for. Our study also found that with the utilized kernels, in case classical GDQ < 1.0, CML can only achieve the same result as QML with a GDQ ~ 1.0 at the expense of approximately 10 times higher computational cost. In general, quantum encoding does not create additional information from classical data, but the encoding step itself may transform classical data into quantum states in which the data is better separable. This phenomenon can be estimated by the GDQ score. Overall, we wish to emphasize that GDQ is a property of the data and not of the QML or CML algorithm; hence, a weak correlation between increasing GDQ and QML BACC (p = 0.136) was demonstrated in our experiments.
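As an illustration of how a geometric difference score between a classic and a quantum kernel may be computed from data alone, the sketch below follows the general form found in the quantum kernel literature, g = sqrt(|| sqrt(K_Q) K_C^-1 sqrt(K_Q) ||), where K_C and K_Q are Gram matrices over the same samples and the norm is the spectral norm. The trace normalization, the regularization term `reg`, the RBF kernel, and the classically computed fidelity kernel are all assumptions made for this example; the exact definition used in this study is the one given in [14] and may differ in these details.

```python
import numpy as np


def geometric_difference(k_classical, k_quantum, reg=1e-8):
    """sqrt of the spectral norm of sqrt(K_Q) (K_C + reg*I)^-1 sqrt(K_Q)."""
    k_c = np.asarray(k_classical, dtype=float)
    k_q = np.asarray(k_quantum, dtype=float)
    n = k_c.shape[0]
    # Assumption: rescale both Gram matrices to trace n so that the score
    # does not depend on the overall scale of either kernel.
    k_c = k_c * (n / np.trace(k_c))
    k_q = k_q * (n / np.trace(k_q))
    # Symmetric PSD square root of K_Q via its eigendecomposition;
    # tiny negative eigenvalues from round-off are clipped to zero.
    w, v = np.linalg.eigh(k_q)
    sqrt_q = (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T
    # Solve a linear system instead of forming the inverse explicitly.
    inner = sqrt_q @ np.linalg.solve(k_c + reg * np.eye(n), sqrt_q)
    return float(np.sqrt(np.linalg.norm(inner, ord=2)))


# Synthetic stand-in data: 20 samples with 8 features (not study data).
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 8))

# Classic kernel: RBF with a hypothetical bandwidth choice.
sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
k_rbf = np.exp(-sq_dists / x.shape[1])

# Stand-in "quantum" kernel: squared overlap of unit-norm (amplitude-encoded)
# vectors, evaluated classically.
states = x / np.linalg.norm(x, axis=1, keepdims=True)
k_fid = (states @ states.T) ** 2

print(f"geometric difference: {geometric_difference(k_rbf, k_fid):.3f}")
```

Because the score depends only on the two Gram matrices, it can be evaluated before any model is trained, which is in line with the statement that it characterizes the data rather than a particular learning algorithm.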
When comparing the overall test performance of QML and CML methods relying on the cross-validation scheme, we identified a clear trend towards a higher BACC with QML in comparison with CML, along with increased robustness in terms of CI ranges.
Test predictive performance comparison of QML algorithms revealed that quantum kernel methods (3-qubit BACC: 62–78%, 4-qubit BACC: 63–83%) outperformed qNN (3-qubit BACC: 63–73%, 4-qubit BACC: 63–76%), which is in accordance with prior findings demonstrating that quantum kernel-based training models can solve supervised classification tasks as well as or better than qNN learning models for small data samples [17]. The above measurements were performed in simulator environments that are executed on classical hardware and software. This implies that quantum advantage in clinically relevant radiomic datasets may be achieved without the need to use real quantum hardware. Nevertheless, this is only true due to the low feature and qubit counts as well as the relatively small data size, which is a generic property of many cancer cohorts [7]. Indeed, this property of QML advantage has been demonstrated in other studies [20]. A low feature count also supports the explainability of ML models and the process of biomarker identification in general [7, 47].
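For readers less familiar with the metric, balanced accuracy is the mean of the per-class recall values, which makes it less sensitive to class imbalance than plain accuracy. The sketch below shows this standard definition on made-up labels; it is not the evaluation code of this study, and the function name and label values are illustrative only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of all samples that truly belong to this class.
        idx = [i for i, y in enumerate(y_true) if y == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Made-up imbalanced example: 8 positive and 2 negative cases,
# with a model that always predicts the majority class.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

In this example, the majority-class predictor reaches 80% plain accuracy but only 50% balanced accuracy, which is why the latter is the more informative figure for imbalanced cohorts.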
While the above quantum advantage in simulator environments is encouraging, it is important to understand that real, noiseless quantum hardware may provide a higher fidelity than any simulator. Over time, NISQs will become less and less noisy. Real quantum hardware has properties such as interference, superposition, and entanglement that cannot be simulated on classical hardware with the same fidelity. This implies that future noiseless or error-corrected quantum hardware has the potential to further advance the predictive performance of QML approaches when exploiting quantum phenomena. This, however, will have to be evaluated and confirmed as part of future research, once noiseless or error-corrected quantum hardware is available. Here, our utilized quantum circuit optimization approach [48] combined with learning-based error mitigation (EM) techniques [49] yielded test performance in the quantum simplified SVM (qsSVM) approach comparable to the noiseless simulator results in both 3- and 4-qubit configurations (8 and 16 features, respectively). In contrast, the quantum kernel Gaussian process (qGP) QML approach underperformed even when relying on EM (BACC: 69% on IonQ with EM, 73% on simulator). While the quantum distance classifier (qDC) yielded results identical to those of qsSVM both with EM and on the simulator (BACC: 73%), qDC slightly underperformed on IonQ without EM (BACC: 64%). The reasons for this are manifold. The qsSVM runs only once, with a so-called Swap-test circuit. In comparison, the qGP runs three times with the Swap-test, while the qDC runs once, but with the so-called Hadamard-test, which requires a more complex circuit [13, 20]. It is crucial to understand which tests can be combined with which QML algorithms when utilizing real quantum hardware. For instance, the Swap-test can result in information loss regarding the sign of the train-test inner products that are required for the prediction [50]. Nevertheless, the Swap-test can be combined with qsSVM and qGP because they operate with positive semi-definite kernel matrices [30]. In contrast, the qDC approach requires the sign of the inner products to be preserved, which can be achieved with Hadamard-tests that are, in return, more complex. The relevance of quantum circuit complexity as well as the number of its measurements manifests in the noisy nature of existing quantum hardware, also referred to as noisy intermediate-scale quantum (NISQ) systems. This noise can degrade the output of quantum circuits, affecting QML predictive performance. Therefore, when relying on NISQs, these effects need to be mitigated. Accordingly, non-mitigated results in our study span a wide range of test BACCs (59–68%), while we achieved the highest test BACC of 73%, identical to the noiseless simulator results, with the qsSVM and qDC algorithms relying on our mitigation techniques on the IonQ Aria device. In this regard, our research demonstrates that even when utilizing circuit optimization and error mitigation in combination with radiomic data, understanding which QML algorithms and which circuit tests can and should be utilized together is imperative.
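The difference between the two tests with respect to the sign of the inner product can be illustrated numerically. The sketch below does not simulate the circuits themselves; it only evaluates the textbook ancilla probabilities of an ideal, noiseless Swap-test, P(0) = 1/2 + |⟨a|b⟩|²/2, and of an ideal Hadamard-test for the real part, P(0) = 1/2 + Re⟨a|b⟩/2, for real-valued unit vectors. The function names and example vectors are hypothetical.

```python
import math


def inner(a, b):
    """Inner product of two real-valued vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a real-valued vector to unit Euclidean norm."""
    norm = math.sqrt(inner(v, v))
    return [x / norm for x in v]


def swap_test_estimate(a, b):
    """Quantity recoverable from an ideal Swap-test: |<a|b>|^2 (sign is lost)."""
    p0 = 0.5 + 0.5 * inner(a, b) ** 2  # probability of measuring ancilla 0
    return 2.0 * p0 - 1.0


def hadamard_test_estimate(a, b):
    """Quantity recoverable from an ideal Hadamard-test: <a|b> with its sign."""
    p0 = 0.5 + 0.5 * inner(a, b)  # probability of measuring ancilla 0
    return 2.0 * p0 - 1.0


# Two hypothetical test states with opposite-sign overlap with the train state.
train = normalize([1.0, 0.0])
test_pos = normalize([1.0, 1.0])
test_neg = normalize([-1.0, 1.0])

print(swap_test_estimate(train, test_pos), swap_test_estimate(train, test_neg))
print(hadamard_test_estimate(train, test_pos), hadamard_test_estimate(train, test_neg))
```

Both test states yield the same Swap-test estimate, whereas the Hadamard-test estimates differ in sign. This is harmless for methods that only need a positive semi-definite kernel of squared overlaps, but not for a classifier whose decision depends on the sign of the inner product.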
Our study has multiple implications in the field of in vivo disease characterization, particularly when focusing on radiomic studies. First, relying on GDQ estimations [14], future research can estimate whether it makes sense to approach the given radiomics task in the quantum computing domain (test if GDQ > 1.0). Second, when a quantum advantage is anticipated, widely available simulator environments combined with appropriate kernel methods can yield superior predictive performance with QML compared to CML, which is especially pronounced with small data. Third, when engaging with real quantum hardware is an option, the high fidelity of NISQs relying on appropriate classic-to-quantum data encoding results in simpler, robust, and potentially interpretable models due to encoding N radiomic features with log2(N) qubits. This can also support endeavors to combine shallow and deep radiomic as well as non-imaging features [7] to potentially yield high-performing holistic QML models. In this case, our circuit optimization technique combined with error mitigation can help researchers minimize noise in existing NISQs and potentially even achieve higher predictive performance compared to simulators, should high-fidelity NISQ devices become widely available.
This study had limitations. First, it only operated with single-center cohorts; however, it utilized cross-validation to estimate the performance and to compare a wide range of classic and quantum ML approaches with different SRTs as well as feature counts. Second, due to restricted access to the IonQ device, the effects of circuit optimization and error mitigation could only be demonstrated with selected QML approaches and for one train-test setting, that is, in a hold-out validation scenario. Nevertheless, the essence of our findings in the context of utilizing NISQs in a feasible way was not related to this limitation. Last, while additional CML approaches could have been involved in our study, those either do not yet have a QML variant, or their QML variant, such as quantum random forests, is not designed to be executable on existing NISQs [51].