Find the papers that actually matter
Search by concept, cancer type, source, or modeling approach. Every result is presented in a cleaner, review-friendly layout with summaries and direct access to the abstract.
Region-guided decoupled fusion network for ultrasound-based classification of thyroid nodules with and without Hashimoto's thyroiditis.
Differentiating benign from malignant thyroid nodules is particularly challenging in patients with Hashimoto's thyroiditis (HT), where inflammatory changes can mimic cancer. We developed a region-guided decoupled fusion network (DFNet) that explicitly models intra- and peri-nodular transitions in both HT and non-HT nodules. By improving classification balance and interpretability, DFNet may help reduce unnecessary biopsies while preserving reliable detection of malignancy. In this multicenter retrospective study, 8667 patients (13,680 ultrasound images) from nine institutions were included. Nodules were confirmed histopathologically after surgery. Regions of interest (ROIs) representing intra- and peri-nodular areas were manually segmented, expanded/shrunk in fixed pixel increments, and normalized. A total of 1578 radiomic features were extracted from each ROI. DFNet employed a Swin Transformer backbone to obtain regional features, orthogonal constraint-based decomposition to separate common and region-specific representations, and HT-specific fusion before classification. Interpretability was achieved via Shapley Additive Explanations (SHAP) and correlation of deep features with radiomic descriptors. Performance was compared with 10 state-of-the-art architectures using accuracy (ACC), Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (AUC). Statistical significance was assessed using the DeLong test and t tests with Bonferroni correction. DFNet achieved the best results in validation (ACC 91.9%, MCC 76.4%, AUC 91.4%) and testing cohorts (ACC 93.6%, MCC 83.0%, AUC 92.4%), significantly outperforming alternatives (p<0.05). Peri-nodular features improved MCC by up to 12.9%, decoupled fusion by 6.1-9.0%, and HT-specific adaptation by 2.9-5.4%. SHAP highlighted biomarkers (e.g., GLDM-LDHGLE, LBP-2D-FO-TE, OFK) with HT-dependent patterns. 
DFNet improves thyroid nodule classification by modeling intra- to peri-nodular transitions and linking deep features with radiomic biomarkers, enabling more accurate and interpretable predictions that may help reduce unnecessary fine-needle aspiration biopsies.
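The abstract reports the Matthews correlation coefficient (MCC) alongside accuracy. As a minimal sketch of why the two can diverge on imbalanced data, the following computes both from a binary confusion matrix. The counts are hypothetical and are not taken from the study.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC for a binary confusion matrix; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, imbalanced counts (illustration only, not study data):
# 10 malignant and 90 benign nodules.
tp, fp, fn, tn = 5, 5, 5, 85
print(f"ACC = {accuracy(tp, fp, fn, tn):.3f}")
print(f"MCC = {matthews_corrcoef(tp, fp, fn, tn):.3f}")
```

With these made-up counts the accuracy is high because the majority class dominates, while the MCC is noticeably lower because half of the minority class is missed.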
Breast area affects the performance of a commercial artificial intelligence algorithm assessment of negative digital breast tomosynthesis exams.
To understand whether cancer-neutral image attributes (breast area and number of slices) impact an AI algorithm assessment of negative digital breast tomosynthesis (DBT) screening exams. This retrospective cohort study included women from a single institution whose screening mammogram was interpreted as negative between 2016 and 2019. All patients had at least 2 years follow-up without evidence of malignancy. Primary outcome measures were AI-calculated assessment of present and future likelihood of malignancy, quantified as a case and risk score. A multivariable linear regression model evaluated the relationship between patient demographics (age, race/ethnicity), image size (breast area, number of slices), and AI algorithm outputs (breast density, case score, risk score). There were 4842 female patients included in the study (mean age 55.0 ± 10.6 years). For case score, there was a positive association with breast area (p < 0.0001), as well as older age, breast density (scattered vs fatty), and race (White vs Asian and Black vs White, all p < 0.05). For risk score, there was also a positive association with breast area (p < 0.001), as well as older age, breast density (scattered vs fatty, heterogeneously dense vs scattered, extremely dense vs heterogeneously dense), and race (White vs Asian, all p < 0.05). Number of DBT slices was not significantly associated with either case or risk scores. Known breast cancer risk factors and one neutral characteristic (breast area) significantly impacted an AI algorithm's assessment of present and future likelihood of malignancy.
Habitat-based MRI heterogeneity radiomics for predicting neoadjuvant chemotherapy response in osteosarcoma.
Osteosarcoma is a highly heterogeneous malignant tumor with varied responses to neoadjuvant chemotherapy (NAC). This study developed heterogeneity radiomics (H-radiomics) based on habitat imaging to predict the treatment response of osteosarcoma patients after NAC. This study retrospectively included MRI scans (T1-weighted and T2-weighted) of osteosarcoma patients who underwent NAC and surgery at two centers between April 2015 and September 2024. Conventional radiomics (C-radiomics) features and habitat imaging-based H-radiomics features were extracted, with 2236 features obtained for each. Unsupervised reproducibility feature correlation analysis and the least absolute shrinkage and selection operator (LASSO) were used for feature selection, which resulted in 5 features being selected. Support vector machines (SVM) served as the classifier. C-radiomics, H-radiomics, and combined models were developed, and their performance was evaluated using the area under the receiver operating characteristic curve (AUC). The training set included 57 patients (mean age, 17 years ± 11 [SD], 29 men) from Center 1, while the external validation set included 48 patients (mean age, 18 years ± 14 [SD], 28 men) from Center 2. In the external test set, the H-radiomics model achieved an AUC of 0.86, outperforming the C-radiomics model, which had an AUC of 0.79. The combined model demonstrated the best performance, with an AUC of 0.91. Additionally, the combined model achieved an accuracy of 85%, sensitivity of 88%, and specificity of 83%. The combined model of H-radiomics and C-radiomics from multiparametric MRI demonstrates good performance in predicting the treatment response after NAC in osteosarcoma patients.
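The models above are compared by AUC. As a minimal sketch, the AUC can be read as the probability that a randomly chosen responder receives a higher model score than a randomly chosen non-responder, with ties counted as one half. The labels and scores below are hypothetical and the function name `roc_auc` is an illustrative choice, not part of the study's pipeline.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = good response) and classifier scores, illustration only.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently.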
Large Language Models for the Differentiation of Benign and Malignant Liver Nodules based on Multimodal Prompts in Liver US Cases.
Large language models (LLMs) that can process both images and text are increasingly being used in radiology. This study aimed to evaluate the performance of LLMs including GPT-4 Omni (GPT-4o), Claude-3.5-Sonnet (Claude), and Gemini 1.5 Pro (Gemini) in differentiating benign and malignant nodules in liver US cases and compare it with that of human readers. Four hundred liver US cases with pathologically confirmed liver nodules visible on B-mode US from January 2020 to November 2024 were randomly selected in this retrospective study. They were divided into a development set (n = 100) and a test set (n = 300). Five prompt groups for LLMs including US image [I-only], image description [D-only], image and description [I+D], image and liver US e-textbook [I+T], and image and medical history [I+H] were evaluated to identify the optimal input in the development set. In the test set, the accuracy of LLMs in differentiating benign and malignant liver nodules was compared with that of human readers using McNemar's test. In the development set, the prompt group I+H exhibited the highest diagnostic accuracy for all LLMs in differentiating benign and malignant liver nodules and was therefore considered the optimal input (taking GPT-4o as an example: I-only, 57.0% [as reference]; D-only, 62.0%, p = 0.55; I+D, 62.0%, p = 0.54; I+T, 62.0%, p = 0.36; I+H, 77.0%, p = 0.01). In the test set, LLMs with I+H outperformed the junior group and showed similar accuracy to the senior group (Junior, 70.0% [as reference1]; Senior, 78.3% [as reference2]; GPT-4o, 83.3%, p1 < 0.001, p2 = 0.10; Claude, 77.0%, p1 = 0.04, p2 = 0.72; Gemini, 75.3%, p1 = 0.14, p2 = 0.36). Large language models with US image and medical history inputs achieved accuracy comparable to senior radiologists and superior to junior radiologists in differentiating benign and malignant liver nodules.
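The reader comparison above relies on McNemar's test, which looks only at cases where the two raters disagree. The sketch below uses the exact binomial form of the test; the abstract does not say whether the exact or the chi-square variant was used, so that choice is an assumption, and the discordant counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: cases rater A got right and rater B got wrong
    c: cases rater A got wrong and rater B got right
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least this lopsided under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts for an LLM vs a reader group, illustration only.
print(mcnemar_exact(b=20, c=8))
```

Cases that both raters classify correctly, or both classify incorrectly, do not enter the statistic, which is why two raters with similar overall accuracy can still differ significantly, or not, depending on how their errors overlap.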
Design, synthesis, and Lead optimization of novel Quinazoline-based FLT3 inhibitors with potent anti-acute myelogenous leukemia activity.
FLT3 mutations, including internal tandem duplications (ITD) and tyrosine kinase domain (TKD) variants, are key drivers of acute myeloid leukemia (AML) and represent attractive therapeutic targets. Guided by a scaffold-hopping strategy based on G-749 (denfivontinib), a series of quinazoline-based derivatives was designed and synthesized to explore structure-activity relationships (SAR). Among them, compound W4 showed the most promising profile, exhibiting potent antiproliferative activity against MV4-11 and MOLM-13 cells and strong inhibition of FLT3-ITD (IC50 = 16.0 nM) and FLT3-D835Y (IC50 = 20.4 nM), while displaying negligible activity toward c-KIT kinase (IC50 > 100 μM). Mechanism studies indicated that W4 induced G0/G1 cell cycle arrest and apoptosis, accompanied by a reduction in intracellular reactive oxygen species levels and a loss of mitochondrial membrane potential. Collectively, these results identified W4 as a potent FLT3 inhibitor and provided valuable SAR insights for further scaffold optimization.
Design, synthesis, and anti-tumor evaluation of indolin-2-one derivatives based on 3D-QSAR.
Tropomyosin receptor kinase (TRK) plays a critical role in tumorigenesis, and its aberrant activation is strongly implicated in cancer progression and metastasis. In this study, a series of novel indolin-2-one derivatives were designed and synthesized using a 3D-QSAR-guided approach targeting TRK. The antitumor potential of these compounds was evaluated through a panel of in vitro bioassays. Several derivatives were found to reduce TRK phosphorylation levels in a cellular context and exhibited pronounced anti-proliferative and anti-migratory effects. Among them, compound IIIc demonstrated potent activity against HepG2 hepatocellular carcinoma cells, with an IC₅₀ value of 2.06 μM. Furthermore, ELISA-based phosphorylation assays revealed that treatment with compound IIIc resulted in a decrease in TRK phosphorylation, yielding an IC₅₀ of 0.19 μM, highlighting its therapeutic promise. Collectively, this study provides experimental evidence and a structural basis for the development of lead compounds capable of modulating TRK signaling in cancer therapy.
Diagnostic Accuracy of Ultrasound Radiomics for Cervical Lymph-Node Metastasis in Papillary Thyroid Carcinoma: Evidence Predominantly From Chinese Cohorts.
Pre-operative identification of cervical lymph-node metastasis (LNM) guides surgical extent in papillary thyroid carcinoma (PTC) but remains imperfect with conventional ultrasound. To quantify the diagnostic accuracy of ultrasound-based radiomics for predicting cervical LNM in PTC and to evaluate methodological quality and standardization across published studies. Six databases were searched to 8 March 2025 (PROSPERO CRD420252XXXX). Two reviewers independently screened records, extracted data, and assessed quality with QUADAS-2 and the Radiomics Quality Score. Pooled sensitivity, specificity, diagnostic odds ratio, and hierarchical summary area under the ROC curve (AUC) were calculated using bivariate random-effects models. Subgroup/meta-regression explored nodal compartment, modelling pipeline, and validation strategy; publication bias was examined with Deeks' funnel plot and trim-and-fill. Sixty studies (10,852 patients; 4,716 with LNM) met inclusion. Radiomics-only models achieved pooled sensitivity 0.72 (95% CI: 0.66-0.78), specificity 0.81 (0.74-0.86) and AUC = 0.83 (0.79-0.86). Adding clinical variables raised sensitivity to 0.79 and AUC to 0.88, but the gain was not significant (ΔAUC = 0.05; p = 0.33). Machine-learning pipelines outperformed deep learning (AUC = 0.86 vs. 0.79), and accuracy was highest for lateral nodes (AUC = 0.94). External-validation cohorts showed lower performance (AUC = 0.79). Heterogeneity was high (I² > 80%) yet estimates were robust after bias adjustment. Most included studies originated from China, which may limit generalizability to other populations. Ultrasound radiomics provides good non-invasive accuracy for cervical nodal staging in PTC, especially for lateral compartments, though its advantage over routine clinical variables is modest. Multi-center prospective studies using standardized acquisition and reporting are needed before routine clinical adoption.
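The abstract lists the diagnostic odds ratio (DOR) among the pooled measures without quoting its value. As a rough sketch of how sensitivity, specificity, likelihood ratios, and the DOR relate, the following plugs the reported pooled sensitivity and specificity into the textbook formulas. This is a simple plug-in calculation, not the bivariate random-effects pooling used in the review, so the result is only indicative and should not be read as the review's pooled DOR.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR-, the odds of a positive test in diseased vs non-diseased."""
    lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
    return lr_pos / lr_neg


# Pooled radiomics-only estimates quoted in the abstract.
sens, spec = 0.72, 0.81
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, "
      f"DOR = {diagnostic_odds_ratio(sens, spec):.1f}")
```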
Actinic keratosis staging in multimodal image data.
Actinic Keratosis (AK) is a common skin condition, usually appearing on sun-exposed areas, whose progression is associated with characteristic dermatoscopic and structural changes. Early detection of AK is crucial, as cancer progression may occur in changed skin. This study aimed to develop a multimodal, machine-learning-based framework combining dermatoscopic and high-frequency ultrasound (HFUS) data to automatically stage AK and identify early lesions. A dataset containing 222 pairs of dermatoscopic and HFUS images was clinically evaluated using the 3-point Zalaudek scale. Dermatoscopic images underwent ROI selection, hair removal, and extensive feature extraction (color, erythema, pigmentation, vessels, scales, pixel intensities, GLCM/LBP texture). HFUS images were divided into entry echo, sub-epidermal low-echoic band (SLEB), and dermis using a deep neural network, and then features describing the morphology and structure of the skin for each layer were extracted. A pre-trained EfficientNet network was used for feature extraction. Logistic Regression, k-Nearest Neighbors, Random Forests, Support Vector Machines and Multilayer Perceptrons with Sequential Feature Selection using 5-fold patient-wise cross-validation were used for feature-based classification. Additionally, multimodal TwinCNN was evaluated, with various pre-trained models as feature extractors. Combining dermatoscopic and HFUS features consistently outperformed single-modality models. Depending on the defined task, the models achieved over 80% accuracy (healthy, AK1-AK3), 78% (AK1-AK3), and almost 90% in the case of early AK detection vs. healthy and advanced AK on multimodal features. The TwinCNN model performed worse than classical machine-learning approaches, likely due to the limited size of the dataset and class imbalance. A multimodal framework integrating dermatoscopic and HFUS imaging enables accurate AK classification, surpassing single-modality approaches. 
Future work should expand multicenter datasets, improve automation of pre-processing steps, and explore enhanced neural multimodal fusion architectures.
An intelligent fusion model for Ki-67 prediction in non-small cell lung cancer: A cloud-based prediction system integrating radiomics.
The expression level of Ki-67 affects the prognosis of NSCLC patients. Accurate preoperative prediction of Ki-67 expression in non-small cell lung cancer (NSCLC) is crucial for prognostic stratification. This multicenter retrospective study enrolled 876 NSCLC patients (January 2015-December 2024) from four institutions, randomly divided into training (n = 525), testing (n = 175), and external validation (n = 176) sets. Radiomic features were extracted from intratumoral and peritumoral (0-12 mm) regions on CT images to construct intra-, peri-, and combined (intra + peri) radiomic scores (Rad-score). Deep learning scores (DL-score) were generated using ResNet101 for whole-lung and tumor-specific analyses. A random forest model integrating Rad-scores, DL-scores, and clinical parameters (lobulation, emphysema, etc.) was developed and validated across all datasets. The combined model (intra + peri Rad-score, intra-tumor DL-score, and clinical features) achieved AUCs of 0.98 (95% CI: 0.97-0.99), 0.92 (0.88-0.96), and 0.92 (0.87-0.96) in training, testing, and external validation sets, with corresponding F1-scores of 0.90, 0.75, and 0.70. SHAP interpretation identified intra-tumor DL-score as the most significant predictor (feature contribution: 46.8%). The multimodal random forest model enables noninvasive and accurate Ki-67 prediction in NSCLC, demonstrating superior generalizability and interpretability to guide personalized therapeutic strategies. Integrating deep learning with intratumoral and peritumoral radiomics enhances the preoperative prediction of Ki-67 expression in patients with non-small cell lung cancer.
Towards transparent and interpretable screening: multi-biofluid FTIR spectroscopy with LLM-Augmented explainability for pancreatic cancer detection.
Early detection of pancreatic cancer remains a critical challenge in oncology, with current diagnostic methods often failing to identify the disease until advanced stages. However, diagnostic accuracy alone may be insufficient for clinical adoption as regulatory frameworks and clinical workflows increasingly demand transparent, interpretable AI systems. This study investigates Fourier Transform Infrared (FTIR) spectroscopy combined with machine learning for non-invasive pancreatic cancer detection using urine and blood biofluids, augmented by a language-model-assisted transparency framework to bridge spectral feature attributions and biochemical interpretation. Five datasets were evaluated: urine ATR-FTIR (61.7% balanced accuracy), urine transmission FTIR (74.8%), filtered blood (<10 kDa; 89.8%), and two matched urine-blood fusion datasets. Transmission-mode urine combined with filtered blood achieved the highest performance (96.9% balanced accuracy), exceeding either biofluid alone. To support transparency, we developed an LLM-augmented explainability pipeline incorporating Monte Carlo Tree Search (MCTS) for structured hypothesis exploration, a curated retrieval-augmented knowledge base (RAG), and reliability-gated explanations that acknowledge disagreement between feature attribution methods. Explainability methods showed substantial disagreement (mean Spearman ρ = 0.23-0.28), motivating a tiered strategy: wavenumber-level interpretation when methods agree (ρ ≥ 0.3, with knowledge base verification) and zone-level interpretation otherwise. These results highlight both the potential and current limitations of transparent spectroscopic diagnostics.
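The tiered strategy described above can be sketched as a small gate on the rank agreement between two feature-attribution vectors: wavenumber-level interpretation when Spearman ρ ≥ 0.3, zone-level otherwise. The function names and attribution values below are hypothetical, and the knowledge-base verification step mentioned in the abstract is deliberately left out of this sketch.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no ranking information
    return cov / (vx * vy) ** 0.5


def interpretation_tier(attr_a, attr_b, threshold=0.3):
    """Pick the interpretation granularity from attribution agreement."""
    rho = spearman_rho(attr_a, attr_b)
    tier = "wavenumber-level" if rho >= threshold else "zone-level"
    return tier, rho


# Hypothetical attributions from two explainability methods over five wavenumbers.
method_a = [0.40, 0.10, 0.30, 0.05, 0.15]
method_b = [0.35, 0.20, 0.25, 0.10, 0.05]
print(interpretation_tier(method_a, method_b))
```

Gating on agreement keeps fine-grained biochemical claims for cases where independent attribution methods point at the same spectral features, and falls back to coarser spectral zones when they do not.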