Find the papers that actually matter
Search by concept, cancer type, source, or modeling approach. Every result is presented in a cleaner, review-friendly layout with summaries and direct access to the abstract.
Towards transparent and interpretable screening: multi-biofluid FTIR spectroscopy with LLM-Augmented explainability for pancreatic cancer detection.
Early detection of pancreatic cancer remains a critical challenge in oncology, with current diagnostic methods often failing to identify the disease until advanced stages. However, diagnostic accuracy alone may be insufficient for clinical adoption as regulatory frameworks and clinical workflows increasingly demand transparent, interpretable AI systems. This study investigates Fourier Transform Infrared (FTIR) spectroscopy combined with machine learning for non-invasive pancreatic cancer detection using urine and blood biofluids, augmented by a language-model-assisted transparency framework to bridge spectral feature attributions and biochemical interpretation. Five datasets were evaluated: urine ATR-FTIR (61.7% balanced accuracy), urine transmission FTIR (74.8%), filtered blood (<10 kDa; 89.8%), and two matched urine-blood fusion datasets. Transmission-mode urine combined with filtered blood achieved the highest performance (96.9% balanced accuracy), exceeding either biofluid alone. To support transparency, we developed an LLM-augmented explainability pipeline incorporating Monte Carlo Tree Search (MCTS) for structured hypothesis exploration, a curated retrieval-augmented knowledge base (RAG), and reliability-gated explanations that acknowledge disagreement between feature attribution methods. Explainability methods showed substantial disagreement (mean Spearman ρ = 0.23-0.28), motivating a tiered strategy: wavenumber-level interpretation when methods agree (ρ ≥ 0.3, with knowledge base verification) and zone-level interpretation otherwise. These results highlight both the potential and current limitations of transparent spectroscopic diagnostics.
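The tiered strategy in this abstract (wavenumber-level interpretation when attribution methods agree at ρ ≥ 0.3, zone-level otherwise) can be illustrated with a minimal sketch. The function names and the toy attribution vectors below are hypothetical and not taken from the paper; only the 0.3 threshold comes from the abstract, and the knowledge-base verification step is omitted.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman_rho(a, b):
    """Spearman correlation, computed as the Pearson correlation of ranks.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


def interpretation_level(attr_a, attr_b, threshold=0.3):
    """Pick the interpretation tier from the agreement of two attribution vectors."""
    rho = spearman_rho(attr_a, attr_b)
    level = "wavenumber" if rho >= threshold else "zone"
    return level, rho


# Hypothetical per-wavenumber importances from two attribution methods.
method_a = [0.1, 0.4, 0.2, 0.9, 0.5]
method_b = [0.2, 0.5, 0.1, 0.8, 0.6]
level, rho = interpretation_level(method_a, method_b)
print(level, round(rho, 2))
```

With more than two attribution methods, one plausible extension is to average the pairwise ρ values before applying the threshold; the abstract reports mean values but does not specify the aggregation.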
Lung cancer as a global health challenge: Multidimensional biomarker research and therapeutic advances.
Lung cancer, the leading cause of global cancer-related mortality, is categorized into small-cell and non-small-cell subtypes. The heterogeneous non-small-cell lung cancer group is further subcategorized primarily into adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, each underpinned by distinct molecular alterations. Although traditional serum biomarkers aid in subtype differentiation and treatment monitoring, their utility is limited by challenges such as poor specificity due to inflammatory confounders and the difficulty of dynamically tracking therapeutic resistance. Recent advances have identified emergent subtype-specific biomarkers that reflect metabolic reprogramming, epigenetic dysregulation, stemness signatures, and interactions within the immune microenvironment. By integrating analytes such as ctDNA, exosomal RNAs, and urinary DNA with multi-analyte panels and advanced imaging, liquid biopsies offer a promising avenue to enhance early detection accuracy, prognostication, and dynamic therapy monitoring. Nevertheless, clinical adoption is hindered by several challenges, including incomplete validation, the need for technical standardization, intratumoral heterogeneity, and inter-ethnic variability. The convergence of artificial intelligence (AI)-enhanced multi-omics with biomarker-guided therapeutics represents a transformative strategy with the potential to overcome resistance, mitigate ethnic disparities, and ultimately turn lung cancer into a chronic, manageable disease. Therefore, prioritizing clinically validated AI-integrated platforms is pivotal to achieving precision oncology.
Automated O-RADS Risk Stratification Using a Large Language Model Analysis of Narrative Ultrasound Reports.
The Ovarian-Adnexal Reporting and Data System (O-RADS) is essential for standardizing the risk stratification of ovarian lesions detected on ultrasound. However, manual assignment of O-RADS scores is time-consuming and can vary between observers. This study investigates an automated method for O-RADS scoring using a large language model (LLM) to analyze narrative ultrasound reports. A two-stage pipeline was developed for automated O-RADS classification. Initially, the Lingshu LLM, specialized in medical language, extracted and embedded features from free-text descriptions of ovarian lesions. It identified key diagnostic features mentioned by sonologists. Subsequently, these features were used to train and evaluate several machine learning algorithms, including logistic regression (LR), support vector machines and random forests, to predict O-RADS scores (1-5). The proposed method was evaluated on a dataset of 513 cases using fivefold cross-validation. The pipeline using Lingshu model embeddings with LR achieved the highest accuracy of 0.803 [95% CI: 0.753, 0.853], a weighted-average F1-score of 0.819 [95% CI: 0.777, 0.861] and a macro-averaged AUROC of 0.948 [95% CI: 0.937, 0.959]. This outperformed the MedGemma model's pipeline, which had an accuracy of 0.760 [95% CI: 0.700, 0.820], F1-score of 0.787 [95% CI: 0.739, 0.835] and AUROC of 0.941 [95% CI: 0.911, 0.971]. This study introduces a novel approach to automate O-RADS scoring using LLMs for feature extraction and traditional machine learning for classification. The results indicate that this method can accurately stratify ovarian cancer risk, potentially improving clinical workflow efficiency and reducing diagnostic variability. This approach may support radiologists in making more consistent and timely assessments.
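Because this abstract reports a weighted-average F1-score over five O-RADS classes, a small sketch of how that metric differs from a macro average may help. The labels below are toy values chosen for illustration, not data from the study, and the function names are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 per class, treating each class in turn as the positive class."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def weighted_f1(y_true, y_pred):
    """Average of per-class F1, weighted by each class's true support."""
    support = Counter(y_true)
    scores = per_class_f1(y_true, y_pred)
    return sum(scores[c] * n for c, n in support.items()) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (every class counts equally)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy example with three imbalanced classes.
y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 2, 2, 2, 3]
print(round(weighted_f1(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

The weighted average is dominated by frequent classes, so on an imbalanced O-RADS distribution it can look better or worse than the macro average depending on how the rare categories are handled.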
Lathyrane diterpenoids with anti-renal fibrotic activity from the aboveground parts of Euphorbia wallichii.
Eighteen lathyrane diterpenoids, comprising nine unreported (1-9) and nine known analogues (10-18), were isolated from the aboveground parts of Euphorbia wallichii. The structures of these compounds were established by detailed interpretation of MS and NMR data, with the absolute configurations assigned via comparison of experimental and calculated ECD spectra. Biological evaluation of selected compounds revealed no significant inhibition against NO production in RAW264.7 macrophages and no cytotoxicity against MDA-MB-231, A549, MCF-7, and HeLa cancer cell lines. However, several of them exerted a significant suppressing effect on TGF-β1-induced upregulation of fibrotic biomarkers in human kidney tubular HK-2 cells, and the anti-renal fibrotic potential of compound 8 may be associated with its inhibition on the Wnt/β-catenin signaling pathway.
Habitat-based MRI heterogeneity radiomics for predicting neoadjuvant chemotherapy response in osteosarcoma.
Osteosarcoma is a highly heterogeneous malignant tumor with varied responses to neoadjuvant chemotherapy (NAC). This study developed heterogeneity radiomics (H-radiomics) based on habitat imaging to predict the treatment response of osteosarcoma patients after NAC. This study retrospectively included MRI scans (T1-weighted and T2-weighted) of osteosarcoma patients who underwent NAC and surgery at two centers between April 2015 and September 2024. Conventional radiomics (C-radiomics) features and habitat imaging-based H-radiomics features were extracted, with 2236 features obtained for each. Unsupervised reproducibility feature correlation analysis and the least absolute shrinkage and selection operator (LASSO) were used for feature selection, which resulted in 5 features being selected. Support vector machines (SVM) served as the classifier. C-radiomics, H-radiomics, and combined models were developed, and their performance was evaluated using the area under the receiver operating characteristic curve (AUC). The training set included 57 patients (mean age, 17 years ± 11 [SD], 29 men) from Center 1, while the external validation set included 48 patients (mean age, 18 years ± 14 [SD], 28 men) from Center 2. In the external validation set, the H-radiomics model achieved an AUC of 0.86, outperforming the C-radiomics model, which had an AUC of 0.79. The combined model demonstrated the best performance, with an AUC of 0.91. Additionally, the combined model achieved an accuracy of 85%, sensitivity of 88%, and specificity of 83%. The combined model of H-radiomics and C-radiomics from multiparametric MRI demonstrates good performance in predicting the treatment response after NAC in osteosarcoma patients.
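The metrics reported in this abstract (AUC, accuracy, sensitivity, specificity) can be sketched as follows. This is an illustrative implementation on toy scores, not the study's code or data; the function names and the 0.5 decision threshold are assumptions.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),  # assumes at least one positive case
        "specificity": tn / (tn + fp),  # assumes at least one negative case
    }


# Toy data: 1 = responder, 0 = non-responder (hypothetical).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(auc(labels, scores), 3), threshold_metrics(labels, scores))
```

AUC is threshold-free, whereas accuracy, sensitivity, and specificity depend on the chosen cut-off, which is why the two kinds of figures are reported separately.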
Large Language Models for the Differentiation of Benign and Malignant Liver Nodules based on Multimodal Prompts in Liver US Cases.
Large language models (LLMs) that can process both images and text are increasingly being used in radiology. This study aimed to evaluate the performance of LLMs, including GPT-4 Omni (GPT-4o), Claude-3.5-Sonnet (Claude), and Gemini 1.5 Pro (Gemini), in differentiating benign and malignant nodules in liver US cases and to compare it with that of human readers. Four hundred liver US cases with pathologically confirmed liver nodules visible on B-mode US from January 2020 to November 2024 were randomly selected in this retrospective study. They were divided into a development set (n = 100) and a test set (n = 300). Five prompt groups for LLMs, namely US image [I-only], image description [D-only], image and description [I+D], image and liver US e-textbook [I+T], and image and medical history [I+H], were evaluated to identify the optimal input in the development set. In the test set, the accuracy of LLMs in differentiating benign and malignant liver nodules was compared with that of human readers using McNemar's test. In the development set, the I+H prompt group yielded the highest diagnostic accuracy for all LLMs and was therefore considered the optimal input (taking GPT-4o as an example: I-only, 57.0% [reference]; D-only, 62.0%, p = 0.55; I+D, 62.0%, p = 0.54; I+T, 62.0%, p = 0.36; I+H, 77.0%, p = 0.01). In the test set, LLMs with I+H outperformed the junior group and showed accuracy similar to that of the senior group (Junior, 70.0% [reference 1]; Senior, 78.3% [reference 2]; GPT-4o, 83.3%, p1 < 0.001, p2 = 0.10; Claude, 77.0%, p1 = 0.04, p2 = 0.72; Gemini, 75.3%, p1 = 0.14, p2 = 0.36). Large language models with US image and medical history inputs achieved accuracy comparable to senior radiologists and superior to junior radiologists in differentiating benign and malignant liver nodules.
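McNemar's test, used in this abstract to compare paired reader accuracy, depends only on the cases where the two readers disagree. The sketch below uses the exact binomial form; the abstract does not state which variant (exact or chi-square) the authors applied, and the per-case outcomes are toy values, not study data.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test for paired correct/incorrect outcomes.

    b = cases reader A got right and reader B got wrong; c = the reverse.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    Returns (b, c, p_value).
    """
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if y and not x)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy paired outcomes on 10 cases (True = correct diagnosis).
reader_a = [True] * 1 + [False] * 5 + [True] * 4
reader_b = [False] * 1 + [True] * 5 + [True] * 4
print(mcnemar_exact(reader_a, reader_b))
```

Cases that both readers classify correctly (or both incorrectly) do not affect the p-value, which is why two readers with fairly different accuracies can still yield a non-significant result when the number of discordant cases is small.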
Discovery of 1H-pyrazolo[3,4-d]pyrimidin-4-ylamine derivatives as potent PI3Kδ/BTK dual-target inhibitors for the treatment of B-cell lymphoma.
B-cell lymphoma (BCL) is a hematological malignancy with a relatively high incidence, and PI3Kδ and BTK play important roles in its development. In a preliminary investigation, we found that combining a PI3K inhibitor with a BTK inhibitor produced a greater therapeutic effect than single-drug administration at both the cellular and animal levels. Dual-target inhibitors of PI3Kδ and BTK were therefore expected to offer an improved therapeutic window for BCL. Here, we designed and synthesized 30 compounds, among which compound 27 showed high inhibitory activity against both targets at the kinase level (IC50-PI3Kδ = 9.0 nM, IC50-BTK = 17.3 nM). Furthermore, at the cellular level, the inhibitory activity of 27 against JeKo-1 and H9 cells (IC50-JeKo-1 = 1.6 μM, IC50-H9 = 5.8 μM) was comparable to or exceeded that of the positive drug alone and in combination. Western blot analysis confirmed that compound 27 potently suppressed phosphorylation of BTK, PI3Kδ and their downstream effectors. In addition, compound 27 showed reduced cytotoxicity in H9c2 cardiomyocytes (LD50 = 247.3 μM) compared to the positive drug. Preliminary pharmacokinetic studies in rats revealed favorable plasma exposure profiles. Taken together, these preliminary results identify compound 27 as a promising lead candidate for further development against BCL.
Radiomics Applicability Domain Analysis Classification Framework (RADAN-CF): A method for evaluating prediction reliability in radiomics.
Radiomics-based machine learning models hold promise for clinical decision support, yet their deployment may be limited by the lack of transparent, prediction-level reliability assessment, especially under distributional shift. Existing uncertainty estimation methods mainly operate in probability space and may fail to identify unreliable predictions when test samples differ structurally or functionally from the training data. To address this gap, we propose the Radiomics Applicability Domain ANalysis - Classification Framework (RADAN-CF), a diagnostic approach for assessing the reliability of individual predictions in radiomics classification. RADAN-CF integrates six binary reliability criteria spanning two domains: data representativeness (A-C), describing the relationship between test samples and the training data manifold, and model behavior (D-F), capturing local inconsistencies in predictive responses. Criteria violations are aggregated into ordered reliability categories summarized using a qualitative traffic-light scheme. The framework was evaluated on six public radiomics datasets using five machine learning classifiers, resulting in 900 model configurations trained under a dissimilarity-based stratified partitioning strategy designed to challenge model generalization. Analyses included prediction-level error modeling, multiway ANOVA, correlation analysis between criteria, and assessment of frequently violated criterion combinations. External validation was performed on an independent cohort of 2689 prostate cancer patients from the ProCAncer-I project. Prediction error was significantly associated with RADAN-CF category, although the relationship was not strictly monotonic, with intermediate categories showing the largest error contributions. RADAN-CF criteria were largely complementary, as shown by low pairwise Spearman correlations (only 7.5% of cases with correlations higher than 0.5; p < 0.001). Multiway ANOVA confirmed RADAN-CF category as a significant factor after controlling for dataset and model effects (p < 10⁻¹²). Specific combinations of violated criteria, particularly A, B, C, and E, were significantly overrepresented among higher-error predictions (Wilcoxon test, p < 0.001). In external validation, correct predictions appeared across all traffic-light categories, confirming the diagnostic and risk-oriented nature of RADAN-CF. RADAN-CF provides a transparent, per-prediction diagnostic framework for assessing reliability in radiomics classification under distributional shift. By jointly accounting for data representativeness and model behavior, it complements traditional performance and uncertainty metrics and supports more cautious deployment of radiomics-based models.
An intelligent fusion model for Ki-67 prediction in non-small cell lung cancer: A cloud-based prediction system integrating radiomics.
The expression level of Ki-67 affects the prognosis of patients with non-small cell lung cancer (NSCLC), so accurate preoperative prediction of Ki-67 expression is crucial for prognostic stratification. This multicenter retrospective study enrolled 876 NSCLC patients (January 2015-December 2024) from four institutions, randomly divided into training (n = 525), testing (n = 175), and external validation (n = 176) sets. Radiomic features were extracted from intratumoral and peritumoral (0-12 mm) regions on CT images to construct intra-, peri-, and combined (intra + peri) radiomic scores (Rad-score). Deep learning scores (DL-score) were generated using ResNet101 for whole-lung and tumor-specific analyses. A random forest model integrating Rad-scores, DL-scores, and clinical parameters (lobulation, emphysema, etc.) was developed and validated across all datasets. The combined model (intra + peri Rad-score, intra-tumor DL-score, and clinical features) achieved AUCs of 0.98 (95% CI: 0.97-0.99), 0.92 (0.88-0.96), and 0.92 (0.87-0.96) in the training, testing, and external validation sets, with corresponding F1-scores of 0.90, 0.75, and 0.70. SHAP interpretation identified intra-tumor DL-score as the most significant predictor (feature contribution: 46.8%). The multimodal random forest model enables noninvasive and accurate Ki-67 prediction in NSCLC, demonstrating superior generalizability and interpretability to guide personalized therapeutic strategies. Integrating deep learning with intratumoral and peritumoral radiomics enhances the preoperative prediction of Ki-67 expression in patients with NSCLC.
Artificial intelligence in clinical oncology: Multimodal integration and translational development.
Artificial intelligence (AI) is rapidly reshaping clinical oncology, as cancer care increasingly relies on integrating heterogeneous data streams spanning radiology, digital pathology, genomics, and longitudinal electronic health records. However, the sheer complexity and fragmentation of these multimodal inputs remain a major bottleneck for achieving truly personalized cancer management. Recent advances in AI, including foundation models, synthetic data generation, large language models, and agents, are enabling more robust representation learning, cross-modal reasoning, and clinically actionable decision support beyond what traditional single-modality systems can provide. AI-powered platforms are now accelerating molecular subtyping, refining risk stratification, and supporting individualized therapeutic recommendations by jointly modeling imaging, tissue architecture, and molecular landscapes. Moreover, emerging virtual cell and mechanistic foundation frameworks introduce a new computational paradigm for simulating cellular responses and drug-tumor interactions, offering predictive insights for treatment design and drug discovery. Despite these breakthroughs, critical challenges persist, including limited generalizability across patient populations and centers, insufficient prospective validation, regulatory uncertainty, scalability constraints, and ethical concerns surrounding fairness, transparency, and privacy. In this review, we synthesize the latest progress in multimodal oncology AI through a translational lens, emphasizing methodological trade-offs, validation readiness, and responsible deployment frameworks. We highlight how AI is moving from performance-driven benchmarking toward clinically trustworthy precision cancer care, with transformative implications for early detection, diagnosis, therapy optimization, drug development, and clinical trial design.