Research Papers

PUBMED Cancer: pancreatic cancer Method: machine learning

Towards transparent and interpretable screening: multi-biofluid FTIR spectroscopy with LLM-Augmented explainability for pancreatic cancer detection.

Zheng Tang, Olivia Irvine, Edward Duckworth, Chiara Maria Costanzo, K Lillis, Jiahao Ren, P M Anupama Bandaranayake, Bilal Al-Sarireh, Matthew Mortimer, Venkateswarlu Kanamarlapudi, Victoria Higginbotham, S H Chandrashekhara, Benjamin Mora, Debdulal Roy
Published 2026-07-01 00:00

This study addresses the challenge of early detection of pancreatic cancer by utilizing Fourier Transform Infrared (FTIR) spectroscopy combined with machine learning techniques. The research evaluates multiple datasets from urine and blood biofluids, achieving a maximum balanced accuracy of 96.9% with a combination of transmission-mode urine and filtered blood. Additionally, the study emphasizes the importance of transparency and interpretability in AI systems, proposing a language-model-assisted framework for explainability in diagnostic processes.

Early detection of pancreatic cancer remains a critical challenge in oncology, with current diagnostic methods often failing to identify the disease until advanced stages. However, diagnostic accuracy alone may be insufficient for clinical adoption as regulatory frameworks and clinical workflows increasingly demand transparent, interpretable AI systems. This study investigates Fourier Transform Infrared (FTIR) spectroscopy combined with machine learning for non-invasive pancreatic cancer detection using urine and blood biofluids, augmented by a language-model-assisted transparency framework to bridge spectral feature attributions and biochemical interpretation. Five datasets were evaluated: urine ATR-FTIR (61.7% balanced accuracy), urine transmission FTIR (74.8%), filtered blood (<10 kDa; 89.8%), and two matched urine-blood fusion datasets. Transmission-mode urine combined with filtered blood achieved the highest performance (96.9% balanced accuracy), exceeding either biofluid alone. To support transparency, we developed an LLM-augmented explainability pipeline incorporating Monte Carlo Tree Search (MCTS) for structured hypothesis exploration, a curated retrieval-augmented knowledge base (RAG), and reliability-gated explanations that acknowledge disagreement between feature attribution methods. Explainability methods showed substantial disagreement (mean Spearman ρ = 0.23-0.28), motivating a tiered strategy: wavenumber-level interpretation when methods agree (ρ ≥ 0.3, with knowledge base verification) and zone-level interpretation otherwise. These results highlight both the potential and current limitations of transparent spectroscopic diagnostics.
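The tiered strategy described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it computes the Spearman correlation between two feature-attribution vectors and picks an interpretation tier using the ρ ≥ 0.3 threshold from the abstract. The function names and the example attribution values are hypothetical, and the knowledge-base verification step that the abstract also requires is not modelled here.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_a * var_b) ** 0.5


def interpretation_tier(rho: float, threshold: float = 0.3) -> str:
    """Hypothetical gate: fine-grained interpretation only when methods agree."""
    return "wavenumber-level" if rho >= threshold else "zone-level"


# Made-up attribution scores from two explainability methods over five features.
method_1 = [0.40, 0.10, 0.30, 0.05, 0.15]
method_2 = [0.35, 0.20, 0.25, 0.02, 0.18]
rho = spearman_rho(method_1, method_2)
print(round(rho, 3), interpretation_tier(rho))
```

With the reported mean agreement of ρ = 0.23-0.28, a gate like this would fall back to zone-level interpretation in the typical case.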

PUBMED Cancer: non-small cell lung cancer Method: AI-enhanced multi-omics

Lung cancer as a global health challenge: Multidimensional biomarker research and therapeutic advances.

Dezhong Jin, Liangwang Zhong, Lai Chen
Published 2026-07-01 00:00

The paper discusses the challenges and advancements in lung cancer diagnostics, particularly focusing on the integration of multidimensional biomarkers and artificial intelligence. It highlights the limitations of traditional serum biomarkers and emphasizes the potential of liquid biopsies combined with AI-enhanced multi-omics to improve early detection and treatment monitoring. The authors advocate for the clinical adoption of validated AI-integrated platforms to enhance precision oncology for lung cancer.

Lung cancer, the leading cause of global cancer-related mortality, is categorized into small-cell and non-small-cell subtypes. The heterogeneous non-small-cell lung cancer group is further subcategorized primarily into adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, each underpinned by distinct molecular alterations. Although traditional serum biomarkers aid in subtype differentiation and treatment monitoring, their utility is limited by challenges such as poor specificity due to inflammatory confounders and the difficulty of dynamically tracking therapeutic resistance. Recent advances have identified emergent subtype-specific biomarkers that reflect metabolic reprogramming, epigenetic dysregulation, stemness signatures, and interactions within the immune microenvironment. By integrating analytes such as ctDNA, exosomal RNAs, and urinary DNA with multi-analyte panels and advanced imaging, liquid biopsies offer a promising avenue to enhance early detection accuracy, prognostication, and dynamic therapy monitoring. Nevertheless, clinical adoption is hindered by several challenges, including incomplete validation, the need for technical standardization, intratumoral heterogeneity, and inter-ethnic variability. The convergence of artificial intelligence (AI)-enhanced multi-omics with biomarker-guided therapeutics represents a transformative strategy with the potential to overcome resistance, mitigate ethnic disparities, and ultimately transform lung cancer into a chronic, manageable disease. Therefore, prioritizing clinically validated AI-integrated platforms is pivotal to achieving precision oncology.

PUBMED Cancer: ovarian cancer Method: large language model

Automated O-RADS Risk Stratification Using a Large Language Model Analysis of Narrative Ultrasound Reports.

Yanhui Guo, Jingjing Gong, Ruquan Jiang, Asmi Agarwal, Ruchika Goel, Richard Selingreund, Yujie Liu, Min Ren
Published 2026-07-01 00:00

This study presents an automated method for Ovarian-Adnexal Reporting and Data System (O-RADS) scoring using a large language model (LLM) to analyze ultrasound reports. A two-stage pipeline was developed, where the Lingshu LLM extracted features from narrative descriptions, which were then used to train various machine learning algorithms for predicting O-RADS scores. The method achieved a high accuracy of 0.803, indicating its potential to improve clinical workflow and reduce diagnostic variability in ovarian cancer risk assessment.

The Ovarian-Adnexal Reporting and Data System (O-RADS) is essential for standardizing the risk stratification of ovarian lesions detected on ultrasound. However, manual assignment of O-RADS scores is time-consuming and can vary between observers. This study investigates an automated method for O-RADS scoring using a large language model (LLM) to analyze narrative ultrasound reports. A two-stage pipeline was developed for automated O-RADS classification. Initially, the Lingshu LLM, specialized in medical language, extracted and embedded features from free-text descriptions of ovarian lesions. It identified key diagnostic features mentioned by sonologists. Subsequently, these features were used to train and evaluate several machine learning algorithms, including logistic regression (LR), support vector machines and random forests, to predict O-RADS scores (1-5). The proposed method was evaluated on a dataset of 513 cases using fivefold cross-validation. The pipeline using Lingshu model embeddings with LR achieved the highest accuracy of 0.803 [95% CI: 0.753, 0.853], a weighted-average F1-score of 0.819 [95% CI: 0.777, 0.861] and a macro-averaged AUROC of 0.948 [95% CI: 0.937, 0.959]. This outperformed the MedGemma model's pipeline, which had an accuracy of 0.760 [95% CI: 0.700, 0.820], F1-score of 0.787 [95% CI: 0.739, 0.835] and AUROC of 0.941 [95% CI: 0.911, 0.971]. This study introduces a novel approach to automate O-RADS scoring using LLMs for feature extraction and traditional machine learning for classification. The results indicate that this method can accurately stratify ovarian cancer risk, potentially improving clinical workflow efficiency and reducing diagnostic variability. This approach may support radiologists in making more consistent and timely assessments.
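For readers less familiar with the headline metrics, the sketch below shows how accuracy and a weighted-average F1-score are computed for a multi-class task such as O-RADS 1-5. It is illustrative only: the labels are made up, the function names are hypothetical, and it does not reproduce the study's cross-validation or confidence intervals. Here "weighted" means each class's F1 is weighted by its number of true cases.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of cases whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
             label: Hashable) -> float:
    """One-vs-rest F1 for a single class: 2TP / (2TP + FP + FN)."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Per-class F1 averaged with weights equal to each class's true count."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(class_f1(y_true, y_pred, label) * count
               for label, count in support.items()) / total


# Made-up O-RADS-style labels for six cases.
true_scores = [1, 1, 2, 2, 2, 3]
predicted_scores = [1, 2, 2, 2, 3, 3]
print(accuracy(true_scores, predicted_scores))
print(weighted_f1(true_scores, predicted_scores))
```

Because the weighting follows class frequency, a weighted F1 can exceed accuracy when frequent classes are predicted well, as in the reported 0.819 versus 0.803.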

PUBMED Cancer: breast cancer Method: unknown

Lathyrane diterpenoids with anti-renal fibrotic activity from the aboveground parts of Euphorbia wallichii.

Ren-Fen Ma, Kai-Yue Liu, Yin-Bo Pan, Hua Zhang
Published 2026-07-01 00:00

This study reports the isolation of eighteen lathyrane diterpenoids from Euphorbia wallichii, including nine novel compounds. The structures were characterized using MS and NMR data, and their biological activities were evaluated. While no significant inhibition was observed against various cancer cell lines, some compounds demonstrated a notable suppressive effect on fibrotic biomarkers in human kidney cells, particularly through the inhibition of the Wnt/β-catenin signaling pathway.

Eighteen lathyrane diterpenoids, comprising nine unreported (1-9) and nine known analogues (10-18), were isolated from the aboveground parts of Euphorbia wallichii. The structures of these compounds were established by detailed interpretation of MS and NMR data, with the absolute configurations assigned via comparison of experimental and calculated ECD spectra. Biological evaluation of selected compounds revealed no significant inhibition of NO production in RAW264.7 macrophages and no cytotoxicity against MDA-MB-231, A549, MCF-7, and HeLa cancer cell lines. However, several of them exerted a significant suppressive effect on TGF-β1-induced upregulation of fibrotic biomarkers in human kidney tubular HK-2 cells, and the anti-renal fibrotic potential of compound 8 may be associated with its inhibition of the Wnt/β-catenin signaling pathway.

PUBMED Cancer: osteosarcoma Method: support vector machine

Habitat-based MRI heterogeneity radiomics for predicting neoadjuvant chemotherapy response in osteosarcoma.

Shuo Wang, Qingsong Wang, Xing Wan, Xianghong Meng, Man Sun, Jinglai Sun, Xuyao Yu, Guangpu Wang, Lei Zhu, Hui Yu
Published 2026-07-01 00:00

This study aimed to develop a habitat-based MRI heterogeneity radiomics (H-radiomics) approach to predict the response of osteosarcoma patients to neoadjuvant chemotherapy (NAC). By analyzing MRI scans and employing feature selection techniques, the researchers identified key features that improved prediction accuracy. The combined model of H-radiomics and conventional radiomics outperformed individual models, achieving an area under the curve (AUC) of 0.91 in predicting treatment response.

Osteosarcoma is a highly heterogeneous malignant tumor with varied responses to neoadjuvant chemotherapy (NAC). This study developed heterogeneity radiomics (H-radiomics) based on habitat imaging to predict the treatment response of osteosarcoma patients after NAC. This study retrospectively included MRI scans (T1-weighted and T2-weighted) of osteosarcoma patients who underwent NAC and surgery at two centers between April 2015 and September 2024. Conventional radiomics (C-radiomics) features and habitat imaging-based H-radiomics features were extracted, with 2236 features obtained for each. Unsupervised reproducibility feature correlation analysis and the least absolute shrinkage and selection operator (LASSO) were used for feature selection, which resulted in 5 features being selected. Support vector machines (SVM) served as the classifier. C-radiomics, H-radiomics, and combined models were developed, and their performance was evaluated using the area under the receiver operating characteristic curve (AUC). The training set included 57 patients (mean age, 17 years ± 11 [SD], 29 men) from Center 1, while the external validation set included 48 patients (mean age, 18 years ± 14 [SD], 28 men) from Center 2. In the external test set, the H-radiomics model achieved an AUC of 0.86, outperforming the C-radiomics model, which had an AUC of 0.79. The combined model demonstrated the best performance, with an AUC of 0.91. Additionally, the combined model achieved an accuracy of 85%, sensitivity of 88%, and specificity of 83%. The combined model of H-radiomics and C-radiomics from multiparametric MRI demonstrates good performance in predicting the treatment response after NAC in osteosarcoma patients.

PUBMED Cancer: hepatocellular carcinoma Method: large language models

Large Language Models for the Differentiation of Benign and Malignant Liver Nodules based on Multimodal Prompts in Liver US Cases.

Sijia Lin, Yu Li, Rushuang Mao, Xuebin Zou, Yixin Hu, Hongsheng Ye, Xiaojun Wu, Liang Yang, Jichong He, Shilin Lu, Lingling Li, Jianhua Zhou
Published 2026-07-01 00:00

This study evaluates the performance of large language models (LLMs) in differentiating benign and malignant liver nodules using multimodal prompts in liver ultrasound cases. The research involved 400 liver ultrasound cases, with a focus on identifying the optimal input for LLMs. Results indicated that the combination of ultrasound images and medical history provided the highest diagnostic accuracy, with LLMs achieving performance comparable to senior radiologists.

Large language models (LLMs) that can process both images and text are increasingly being used in radiology. This study aimed to evaluate the performance of LLMs, including GPT-4 Omni (GPT-4o), Claude-3.5-Sonnet (Claude), and Gemini 1.5 Pro (Gemini), in differentiating benign and malignant nodules in liver US cases and to compare it with that of human readers. Four hundred liver US cases with pathologically confirmed liver nodules visible on B-mode US from January 2020 to November 2024 were randomly selected in this retrospective study. They were divided into a development set (n = 100) and a test set (n = 300). Five prompt groups for LLMs, comprising US image [I-only], image description [D-only], image and description [I+D], image and liver US e-textbook [I+T], and image and medical history [I+H], were evaluated to identify the optimal input in the development set. In the test set, the accuracy of LLMs in differentiating benign and malignant liver nodules was compared with that of human readers using McNemar's test. In the development set, the prompt group I+H exhibited the highest diagnostic accuracy for all LLMs in differentiating benign and malignant liver nodules and was therefore considered the optimal input (taking GPT-4o as an example: I-only, 57.0% [reference]; D-only, 62.0%, p = 0.55; I+D, 62.0%, p = 0.54; I+T, 62.0%, p = 0.36; I+H, 77.0%, p = 0.01). In the test set, LLMs with I+H outperformed the junior group and showed similar accuracy to the senior group (Junior, 70.0% [reference 1]; Senior, 78.3% [reference 2]; GPT-4o, 83.3%, p1 < 0.001, p2 = 0.10; Claude, 77.0%, p1 = 0.04, p2 = 0.72; Gemini, 75.3%, p1 = 0.14, p2 = 0.36). Large language models with US image and medical history inputs achieved accuracy comparable to senior radiologists and superior to junior radiologists in differentiating benign and malignant liver nodules.
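McNemar's test, used above to compare paired readers on the same cases, depends only on the discordant pairs: cases one reader got right and the other got wrong. The sketch below is a minimal exact (binomial) version. The abstract does not say whether the exact or the chi-square variant was used, so treat that choice as an assumption; the function names and the example data are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(correct_a: Sequence[bool],
                      correct_b: Sequence[bool]) -> Tuple[int, int]:
    """Count cases only reader A got right (b) and only reader B got right (c)."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both readers must be scored on the same cases")
    b = sum(a and not other for a, other in zip(correct_a, correct_b))
    c = sum(other and not a for a, other in zip(correct_a, correct_b))
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis each discordant pair is equally likely to
    favour either reader, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up per-case correctness for an LLM and a human reader on eight cases.
llm_correct = [True, True, True, False, True, True, False, True]
reader_correct = [True, False, True, False, False, True, True, False]
b, c = discordant_counts(llm_correct, reader_correct)
print(b, c, mcnemar_exact(b, c))
```

Cases that both readers classify correctly, or both misclassify, do not affect the p-value, which is why two readers with similar overall accuracy can still differ significantly if their errors fall on different cases.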

PUBMED Cancer: B-cell lymphoma Method: unknown

Discovery of 1H-pyrazolo[3,4-d]pyrimidin-4-ylamine derivatives as potent PI3Kδ/BTK dual-target inhibitors for the treatment of B-cell lymphoma.

Zunyuan Wang, Yingqiao Ye, Youkun Kang, Hongmei Zheng, Xinyue Chang, Xiangwei Xu, Chixiao Zhang, Wenhai Huang
Published 2026-07-01 00:00

This study investigates the development of 1H-pyrazolo[3,4-d]pyrimidin-4-ylamine derivatives as dual-target inhibitors of PI3Kδ and BTK for the treatment of B-cell lymphoma. The researchers synthesized 30 compounds, identifying compound 27 as particularly effective, demonstrating high inhibitory activity against both targets and favorable pharmacokinetic properties. The findings suggest that this compound could serve as a promising lead for further therapeutic development.

B-cell lymphoma (BCL) is a hematological malignancy with a relatively high incidence, and PI3Kδ and BTK play important roles in its development. In a preliminary investigation, we found that combining a PI3K inhibitor with a BTK inhibitor produced a greater therapeutic effect than either drug alone at both the cell and animal levels. Dual-target inhibitors of PI3Kδ and BTK were therefore expected to achieve an improved therapeutic window for BCL. Here, we designed and synthesized 30 compounds, among which compound 27 showed high inhibitory activity against both targets at the kinase level (IC50-PI3Kδ = 9.0 nM, IC50-BTK = 17.3 nM). Furthermore, at the cellular level, the inhibitory activity of 27 against JeKo-1 and H9 cells (IC50-JeKo-1 = 1.6 μM, IC50-H9 = 5.8 μM) was comparable to or exceeded that of the positive drug alone and in combination. Western blot analysis confirmed that compound 27 potently suppressed phosphorylation of BTK, PI3Kδ and their downstream effectors. In addition, compound 27 showed reduced cytotoxicity in H9c2 cardiomyocytes (LD50 = 247.3 μM) compared to the positive drug. Preliminary pharmacokinetic studies in rats revealed favorable plasma exposure profiles. These preliminary results collectively identified compound 27 as a promising lead candidate for further development against BCL.

PUBMED Cancer: prostate cancer Method: machine learning

Radiomics Applicability Domain Analysis Classification Framework (RADAN-CF): A method for evaluating prediction reliability in radiomics.

Pablo Rodríguez-Belenguer, Manuel Marfil-Trujillo, Aikaterini Vraka, Manolis Tsiknakis, Nikolaos Papanikolaou, Daniele Regge, Kostas Marias, Leonor Cerdá-Alberich, Luis Martí-Bonmatí, ProCAncer-I Consortium
Published 2026-07-01 00:00

The paper presents the Radiomics Applicability Domain Analysis - Classification Framework (RADAN-CF), aimed at evaluating the reliability of predictions in radiomics classification. It addresses the limitations of existing uncertainty estimation methods, particularly under distributional shifts, by integrating reliability criteria related to data representativeness and model behavior. The framework was validated on multiple radiomics datasets, demonstrating significant associations between prediction errors and reliability categories, thus enhancing the transparency of model deployment in clinical settings.

Radiomics-based machine learning models hold promise for clinical decision support, yet their deployment may be limited by the lack of transparent, prediction-level reliability assessment, especially under distributional shift. Existing uncertainty estimation methods mainly operate in probability space and may fail to identify unreliable predictions when test samples differ structurally or functionally from the training data. To address this gap, we propose the Radiomics Applicability Domain ANalysis - Classification Framework (RADAN-CF), a diagnostic approach for assessing the reliability of individual predictions in radiomics classification. RADAN-CF integrates six binary reliability criteria spanning two domains: data representativeness (A-C), describing the relationship between test samples and the training data manifold, and model behavior (D-F), capturing local inconsistencies in predictive responses. Criteria violations are aggregated into ordered reliability categories summarized using a qualitative traffic-light scheme. The framework was evaluated on six public radiomics datasets using five machine learning classifiers, resulting in 900 model configurations trained under a dissimilarity-based stratified partitioning strategy designed to challenge model generalization. Analyses included prediction-level error modeling, multiway ANOVA, correlation analysis between criteria, and assessment of frequently violated criterion combinations. External validation was performed on an independent cohort of 2689 prostate cancer patients from the ProCAncer-I project. Prediction error was significantly associated with RADAN-CF category, although the relationship was not strictly monotonic, with intermediate categories showing the largest error contributions. RADAN-CF criteria were largely complementary, as shown by low pairwise Spearman correlations (only 7.5% of cases with correlations higher than 0.5; p < 0.001). Multiway ANOVA confirmed RADAN-CF category as a significant factor after controlling for dataset and model effects (p < 10⁻¹²). Specific combinations of broken criteria, particularly A, B, C, and E, were significantly overrepresented among higher-error predictions (Wilcoxon test, p < 0.001). In external validation, correct predictions appeared across all traffic-light categories, confirming the diagnostic and risk-oriented nature of RADAN-CF. RADAN-CF provides a transparent, per-prediction diagnostic framework for assessing reliability in radiomics classification under distributional shift. By jointly accounting for data representativeness and model behavior, it complements traditional performance and uncertainty metrics and supports more cautious deployment of radiomics-based models.
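The aggregation step, in which six binary criteria are combined into an ordered category with a traffic-light label, can be pictured roughly as follows. The abstract does not give the actual mapping from violations to categories, so the cut-offs below (0 violations green, 1-2 amber, 3 or more red) are purely hypothetical placeholders, as are the function names; only the criterion letters A-F and their two domains come from the abstract.

```python
from typing import Dict, List

# Criterion letters from the abstract: A-C concern data representativeness,
# D-F concern model behavior. What each one tests is not specified there.
DATA_CRITERIA = ("A", "B", "C")
MODEL_CRITERIA = ("D", "E", "F")
ALL_CRITERIA = DATA_CRITERIA + MODEL_CRITERIA


def violated(flags: Dict[str, bool]) -> List[str]:
    """Return the letters of violated criteria, in A-F order."""
    missing = [c for c in ALL_CRITERIA if c not in flags]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return [c for c in ALL_CRITERIA if flags[c]]


def traffic_light(flags: Dict[str, bool],
                  amber_from: int = 1, red_from: int = 3) -> str:
    """Map a violation count to a label using HYPOTHETICAL cut-offs."""
    count = len(violated(flags))
    if count >= red_from:
        return "red"
    if count >= amber_from:
        return "amber"
    return "green"


# One prediction that violates criteria A and E (True means violated).
prediction_flags = {"A": True, "B": False, "C": False,
                    "D": False, "E": True, "F": False}
print(violated(prediction_flags), traffic_light(prediction_flags))
```

As the abstract stresses, such a label is diagnostic rather than predictive: correct predictions appeared in every category, so the label flags predictions that deserve scrutiny and does not mark them as wrong.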

PUBMED Cancer: non-small cell lung cancer Method: random forest

An intelligent fusion model for Ki-67 prediction in non-small cell lung cancer: A cloud-based prediction system integrating radiomics.

Zhenyu Cao, Xiaoling Xu, Guoqun Mao, Feng Cui, Zhongfeng Niu, Zongyu Xie, Hengfeng Shi, Cheng Yan, Jian Wang
Published 2026-07-01 00:00

This study presents a cloud-based prediction system for Ki-67 expression in non-small cell lung cancer (NSCLC) using a multimodal random forest model. The model integrates radiomic features extracted from CT images and deep learning scores generated by ResNet101. The combined approach achieved high AUCs and F1-scores across training, testing, and validation sets, demonstrating its potential for accurate preoperative prediction and personalized therapeutic strategies.

The expression level of Ki-67 affects the prognosis of patients with non-small cell lung cancer (NSCLC). Accurate preoperative prediction of Ki-67 expression in NSCLC is crucial for prognostic stratification. This multicenter retrospective study enrolled 876 NSCLC patients (January 2015-December 2024) from four institutions, randomly divided into training (n = 525), testing (n = 175), and external validation (n = 176) sets. Radiomic features were extracted from intratumoral and peritumoral (0-12 mm) regions on CT images to construct intra-, peri-, and combined (intra + peri) radiomic scores (Rad-score). Deep learning scores (DL-score) were generated using ResNet101 for whole-lung and tumor-specific analyses. A random forest model integrating Rad-scores, DL-scores, and clinical parameters (lobulation, emphysema, etc.) was developed and validated across all datasets. The combined model (intra + peri Rad-score, intra-tumor DL-score, and clinical features) achieved AUCs of 0.98 (95% CI: 0.97-0.99), 0.92 (0.88-0.96), and 0.92 (0.87-0.96) in training, testing, and external validation sets, with corresponding F1-scores of 0.90, 0.75, and 0.70. SHAP interpretation identified intra-tumor DL-score as the most significant predictor (feature contribution: 46.8%). The multimodal random forest model enables noninvasive and accurate Ki-67 prediction in NSCLC, demonstrating superior generalizability and interpretability to guide personalized therapeutic strategies. Integrating deep learning with intratumoral and peritumoral radiomics enhances the preoperative prediction of Ki-67 expression in patients with NSCLC.

PUBMED Cancer: general cancer Method: multimodal learning

Artificial intelligence in clinical oncology: Multimodal integration and translational development.

Ruichong Lin, Zhenhui Zhao, Zhonghai Liu, Jin Kang, Kang Zhang, Xiaoying Huang, Yunfang Yu
Published 2026-07-01 00:00

This paper reviews the integration of artificial intelligence (AI) in clinical oncology, focusing on the challenges and advancements in multimodal data utilization. It discusses how AI is enhancing personalized cancer management through improved representation learning and decision support systems. The review highlights the potential of AI in refining risk stratification and therapeutic recommendations while addressing ongoing challenges such as generalizability and ethical concerns.

Artificial intelligence (AI) is rapidly reshaping clinical oncology, as cancer care increasingly relies on integrating heterogeneous data streams spanning radiology, digital pathology, genomics, and longitudinal electronic health records. However, the sheer complexity and fragmentation of these multimodal inputs remain a major bottleneck for achieving truly personalized cancer management. Recent advances in AI, including foundation models, synthetic data generation, large language models, and agents, are enabling more robust representation learning, cross-modal reasoning, and clinically actionable decision support beyond what traditional single-modality systems can provide. AI-powered platforms are now accelerating molecular subtyping, refining risk stratification, and supporting individualized therapeutic recommendations by jointly modeling imaging, tissue architecture, and molecular landscapes. Moreover, emerging virtual cell and mechanistic foundation frameworks introduce a new computational paradigm for simulating cellular responses and drug-tumor interactions, offering predictive insights for treatment design and drug discovery. Despite these breakthroughs, critical challenges persist, including limited generalizability across patient populations and centers, insufficient prospective validation, regulatory uncertainty, scalability constraints, and ethical concerns surrounding fairness, transparency, and privacy. In this review, we synthesize the latest progress in multimodal oncology AI through a translational lens, emphasizing methodological trade-offs, validation readiness, and responsible deployment frameworks. We highlight how AI is moving from performance-driven benchmarking toward clinically trustworthy precision cancer care, with transformative implications for early detection, diagnosis, therapy optimization, drug development, and clinical trial design.
