Research Papers

PUBMED Cancer: thyroid cancer Method: Swin Transformer

Region-guided decoupled fusion network for ultrasound-based classification of thyroid nodules with and without Hashimoto's thyroiditis.

Jing Wen, Qijian Chen, Lijuan Luo, Hongqing Ma, Caihong Wang, Yuyu Hua, Dan Qin, Jinge Zhou, Ying Yang, Tingting Shen, Limei Liu, Juang Wen, Lihui Wang, Shi Zhou, Zhu Zeng
Published 2026-07-01 00:00

This study presents a region-guided decoupled fusion network (DFNet) designed to classify thyroid nodules in patients with and without Hashimoto's thyroiditis. The method enhances classification balance and interpretability, aiming to reduce unnecessary biopsies while maintaining reliable malignancy detection. DFNet demonstrated superior performance compared to ten state-of-the-art architectures, achieving high accuracy and area under the curve metrics in both validation and testing cohorts.

Differentiating benign from malignant thyroid nodules is particularly challenging in patients with Hashimoto's thyroiditis (HT), where inflammatory changes can mimic cancer. We developed a region-guided decoupled fusion network (DFNet) that explicitly models intra- and peri-nodular transitions in both HT and non-HT nodules. By improving classification balance and interpretability, DFNet may help reduce unnecessary biopsies while preserving reliable detection of malignancy. In this multicenter retrospective study, 8667 patients (13,680 ultrasound images) from nine institutions were included. Nodules were confirmed histopathologically after surgery. Regions of interest (ROIs) representing intra- and peri-nodular areas were manually segmented, expanded/shrunk in fixed pixel increments, and normalized. A total of 1578 radiomic features were extracted from each ROI. DFNet employed a Swin Transformer backbone to obtain regional features, orthogonal constraint-based decomposition to separate common and region-specific representations, and HT-specific fusion before classification. Interpretability was achieved via Shapley Additive Explanations (SHAP) and correlation of deep features with radiomic descriptors. Performance was compared with 10 state-of-the-art architectures using accuracy (ACC), Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (AUC). Statistical significance was assessed using the DeLong test and t tests with Bonferroni correction. DFNet achieved the best results in validation (ACC 91.9%, MCC 76.4%, AUC 91.4%) and testing cohorts (ACC 93.6%, MCC 83.0%, AUC 92.4%), significantly outperforming alternatives (p<0.05). Peri-nodular features improved MCC by up to 12.9%, decoupled fusion by 6.1-9.0%, and HT-specific adaptation by 2.9-5.4%. SHAP highlighted biomarkers (e.g., GLDM-LDHGLE, LBP-2D-FO-TE, OFK) with HT-dependent patterns. 
DFNet improves thyroid nodule classification by modeling intra- to peri-nodular transitions and linking deep features with radiomic biomarkers, enabling more accurate and interpretable predictions that may help reduce unnecessary fine-needle aspiration biopsies.
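
The abstract reports MCC alongside ACC as a measure of classification balance. The following minimal sketch illustrates why the two can diverge on an imbalanced cohort. The confusion-matrix counts are hypothetical and are not taken from the study; the function names are illustrative only.

```python
import math

def accuracy(tp, fp, fn, tn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical cohort with 90 malignant and 10 benign nodules.
# A classifier that calls every nodule malignant:
trivial = dict(tp=90, fp=10, fn=0, tn=0)
# A classifier that also recognizes most benign nodules:
balanced = dict(tp=85, fp=2, fn=5, tn=8)

for name, counts in (("trivial", trivial), ("balanced", balanced)):
    print(name, round(accuracy(**counts), 3), round(matthews_corrcoef(**counts), 3))
```

In this toy case the trivial classifier reaches 90% accuracy but an MCC of 0, which is the kind of imbalance that accuracy alone would hide.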

PUBMED Cancer: breast cancer Method: multivariable linear regression

Breast area affects the performance of a commercial artificial intelligence algorithm assessment of negative digital breast tomosynthesis exams.

Emily C Barre, Yinhao Ren, Derek L Nguyen, Joseph Y Lo, Lars J Grimm
Published 2026-07-01 00:00

This study investigates the influence of breast area and the number of slices on the performance of an AI algorithm in assessing negative digital breast tomosynthesis (DBT) exams. A retrospective cohort of 4842 women was analyzed to evaluate the relationship between demographic factors, image attributes, and AI-generated risk scores. The findings indicate that breast area positively correlates with both case and risk scores, highlighting its significance in AI assessments of malignancy likelihood.

This study aimed to determine whether cancer-neutral image attributes (breast area and number of slices) affect an AI algorithm's assessment of negative digital breast tomosynthesis (DBT) screening exams. This retrospective cohort study included women from a single institution whose screening mammogram was interpreted as negative between 2016 and 2019. All patients had at least 2 years of follow-up without evidence of malignancy. Primary outcome measures were the AI-calculated assessments of present and future likelihood of malignancy, quantified as a case score and a risk score, respectively. A multivariable linear regression model evaluated the relationship between patient demographics (age, race/ethnicity), image size (breast area, number of slices), and AI algorithm outputs (breast density, case score, risk score). There were 4842 female patients included in the study (mean age 55.0 ± 10.6 years). For case score, there was a positive association with breast area (p < 0.0001), as well as older age, breast density (scattered vs fatty), and race (White vs Asian and Black vs White, all p < 0.05). For risk score, there was also a positive association with breast area (p < 0.001), as well as older age, breast density (scattered vs fatty, heterogeneously dense vs scattered, extremely dense vs heterogeneously dense), and race (White vs Asian, all p < 0.05). Number of DBT slices was not significantly associated with either case or risk scores. Known breast cancer risk factors and one cancer-neutral characteristic (breast area) significantly impacted an AI algorithm's assessment of present and future likelihood of malignancy.

PUBMED Cancer: osteosarcoma Method: support vector machine

Habitat-based MRI heterogeneity radiomics for predicting neoadjuvant chemotherapy response in osteosarcoma.

Shuo Wang, Qingsong Wang, Xing Wan, Xianghong Meng, Man Sun, Jinglai Sun, Xuyao Yu, Guangpu Wang, Lei Zhu, Hui Yu
Published 2026-07-01 00:00

This study aimed to develop a habitat-based MRI heterogeneity radiomics (H-radiomics) approach to predict the response of osteosarcoma patients to neoadjuvant chemotherapy (NAC). By analyzing MRI scans and employing feature selection techniques, the researchers identified key features that improved prediction accuracy. The combined model of H-radiomics and conventional radiomics outperformed individual models, achieving an area under the curve (AUC) of 0.91 in predicting treatment response.

Osteosarcoma is a highly heterogeneous malignant tumor with varied responses to neoadjuvant chemotherapy (NAC). This study developed heterogeneity radiomics (H-radiomics) based on habitat imaging to predict the treatment response of osteosarcoma patients after NAC. This study retrospectively included MRI scans (T1-weighted and T2-weighted) of osteosarcoma patients who underwent NAC and surgery at two centers between April 2015 and September 2024. Conventional radiomics (C-radiomics) features and habitat imaging-based H-radiomics features were extracted, with 2236 features obtained for each. Unsupervised reproducibility feature correlation analysis and the least absolute shrinkage and selection operator (LASSO) were used for feature selection, which resulted in 5 features being selected. Support vector machines (SVM) served as the classifier. C-radiomics, H-radiomics, and combined models were developed, and their performance was evaluated using the area under the receiver operating characteristic curve (AUC). The training set included 57 patients (mean age, 17 years ± 11 [SD], 29 men) from Center 1, while the external validation set included 48 patients (mean age, 18 years ± 14 [SD], 28 men) from Center 2. In the external validation set, the H-radiomics model achieved an AUC of 0.86, outperforming the C-radiomics model, which had an AUC of 0.79. The combined model demonstrated the best performance, with an AUC of 0.91. Additionally, the combined model achieved an accuracy of 85%, sensitivity of 88%, and specificity of 83%. The combined model of H-radiomics and C-radiomics from multiparametric MRI demonstrates good performance in predicting the treatment response after NAC in osteosarcoma patients.
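
The models above are compared by AUC. As a rough illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder (the Mann-Whitney formulation). The labels and scores are made up for the example and do not come from the study; `roc_auc` is an illustrative helper, not the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs: 1 = good response, 0 = poor response.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_auc = roc_auc(example_labels, example_scores)
print(round(example_auc, 3))
```

Here 8 of the 9 responder/non-responder pairs are ordered correctly, so the AUC is 8/9.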

PUBMED Cancer: hepatocellular carcinoma Method: large language models

Large Language Models for the Differentiation of Benign and Malignant Liver Nodules based on Multimodal Prompts in Liver US Cases.

Sijia Lin, Yu Li, Rushuang Mao, Xuebin Zou, Yixin Hu, Hongsheng Ye, Xiaojun Wu, Liang Yang, Jichong He, Shilin Lu, Lingling Li, Jianhua Zhou
Published 2026-07-01 00:00

This study evaluates the performance of large language models (LLMs) in differentiating benign and malignant liver nodules using multimodal prompts in liver ultrasound cases. The research involved 400 liver ultrasound cases, with a focus on identifying the optimal input for LLMs. Results indicated that the combination of ultrasound images and medical history provided the highest diagnostic accuracy, with LLMs achieving performance comparable to senior radiologists.

Large language models (LLMs) that can process both images and text are increasingly being used in radiology. This study aimed to evaluate the performance of LLMs including GPT-4 Omni (GPT-4o), Claude-3.5-Sonnet (Claude), and Gemini 1.5 Pro (Gemini) in differentiating benign and malignant nodules in liver US cases and compare it with that of human readers. Four hundred liver US cases with pathologically confirmed liver nodules visible on B-mode US from January 2020 to November 2024 were randomly selected in this retrospective study. They were divided into a development set (n = 100) and a test set (n = 300). Five prompt groups for LLMs, namely US image [I-only], image description [D-only], image and description [I+D], image and liver US e-textbook [I+T], and image and medical history [I+H], were evaluated to identify the optimal input in the development set. In the test set, the accuracy of LLMs in differentiating benign and malignant liver nodules was compared with that of human readers using McNemar's test. In the development set, the prompt group I+H exhibited the highest diagnostic accuracy for all LLMs and was therefore considered the optimal input (taking GPT-4o as an example: I-only, 57.0% [reference]; D-only, 62.0%, p = 0.55; I+D, 62.0%, p = 0.54; I+T, 62.0%, p = 0.36; I+H, 77.0%, p = 0.01). In the test set, LLMs with I+H outperformed the junior group and showed similar accuracy to the senior group (Junior, 70.0% [reference 1]; Senior, 78.3% [reference 2]; GPT-4o, 83.3%, p1 < 0.001, p2 = 0.10; Claude, 77.0%, p1 = 0.04, p2 = 0.72; Gemini, 75.3%, p1 = 0.14, p2 = 0.36). Large language models with US image and medical history inputs achieved accuracy comparable to senior radiologists and superior to junior radiologists in differentiating benign and malignant liver nodules.
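
McNemar's test compares two readers on the same cases using only the discordant pairs, that is, cases one reader classified correctly and the other did not. The sketch below implements the exact (binomial) form of the test. The abstract does not state which variant the authors used, and the discordant counts here are hypothetical; the study's paired counts are not reported in the abstract.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: cases reader A got right and reader B got wrong.
    c: cases reader B got right and reader A got wrong.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant counts between an LLM and a reader group.
example_p = mcnemar_exact(b=5, c=15)
print(round(example_p, 4))
```

Cases on which both readers agree do not enter the statistic, so two readers can have noticeably different accuracies and still show a non-significant difference when few pairs are discordant.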

PUBMED Cancer: acute myeloid leukemia Method: unknown

Design, synthesis, and Lead optimization of novel Quinazoline-based FLT3 inhibitors with potent anti-acute myelogenous leukemia activity.

Wei Liu, Shuaibo Du, Miaomiao Wang, Shuhan Sun, Lei Wang, Jin Liu, Shengzheng Wang
Published 2026-07-01 00:00

This study focuses on the design and synthesis of novel quinazoline-based derivatives aimed at inhibiting FLT3 mutations, which are significant drivers of acute myeloid leukemia (AML). The compound W4 demonstrated strong antiproliferative activity against specific AML cell lines and effectively inhibited FLT3 mutations. The research provides insights into structure-activity relationships that can guide further optimization of these inhibitors.

FLT3 mutations, including internal tandem duplications (ITD) and tyrosine kinase domain (TKD) variants, are key drivers of acute myeloid leukemia (AML) and represent attractive therapeutic targets. Guided by a scaffold-hopping strategy based on G-749 (denfivontinib), a series of quinazoline-based derivatives was designed and synthesized to explore structure-activity relationships (SAR). Among them, compound W4 showed the most promising profile, exhibiting potent antiproliferative activity against MV4-11 and MOLM-13 cells and strong inhibition of FLT3-ITD (IC50 = 16.0 nM) and FLT3-D835Y (IC50 = 20.4 nM), while displaying negligible activity toward c-KIT kinase (IC50 > 100 μM). Mechanism studies indicated that W4 induced G0/G1 cell cycle arrest and apoptosis, accompanied by a reduction in intracellular reactive oxygen species levels and a loss of mitochondrial membrane potential. Collectively, these results identified W4 as a potent FLT3 inhibitor and provided valuable SAR insights for further scaffold optimization.

PUBMED Cancer: hepatocellular carcinoma Method: 3D-QSAR

Design, synthesis, and anti-tumor evaluation of indolin-2-one derivatives based on 3D-QSAR.

Xingdan Wang, Liying Zhang, Ziqi He, Xinyue Li, Jing Bai, Qidi Zhong
Published 2026-07-01 00:00

This study focuses on the design and synthesis of novel indolin-2-one derivatives aimed at targeting the tropomyosin receptor kinase (TRK) involved in cancer progression. Using a 3D-QSAR-guided approach, the antitumor potential of these compounds was evaluated through various in vitro bioassays. Notably, compound IIIc exhibited significant anti-proliferative effects against HepG2 hepatocellular carcinoma cells, demonstrating its potential as a therapeutic agent.

Tropomyosin receptor kinase (TRK) plays a critical role in tumorigenesis, and its aberrant activation is strongly implicated in cancer progression and metastasis. In this study, a series of novel indolin-2-one derivatives were designed and synthesized using a 3D-QSAR-guided approach targeting TRK. The antitumor potential of these compounds was evaluated through a panel of in vitro bioassays. Several derivatives were found to reduce TRK phosphorylation levels in a cellular context and exhibited pronounced anti-proliferative and anti-migratory effects. Among them, compound IIIc demonstrated potent activity against HepG2 hepatocellular carcinoma cells, with an IC₅₀ value of 2.06 μM. Furthermore, ELISA-based phosphorylation assays revealed that treatment with compound IIIc resulted in a decrease in TRK phosphorylation, yielding an IC₅₀ of 0.19 μM, highlighting its therapeutic promise. Collectively, this study provides experimental evidence and a structural basis for the development of lead compounds capable of modulating TRK signaling in cancer therapy.

PUBMED Cancer: papillary thyroid carcinoma Method: radiomics

Diagnostic Accuracy of Ultrasound Radiomics for Cervical Lymph-Node Metastasis in Papillary Thyroid Carcinoma: Evidence Predominantly From Chinese Cohorts.

Sara S Nabavizadeh, Farima Safari, Seyed Reza Abdipour Mehrian, Aref Ghanaatpisheh, Alireza Abbaspour, Mohammad Matin Karbalaee Alinazari, Mahtab Setayesh, Ali Nabavizadeh
Published 2026-07-01 00:00

This study evaluates the diagnostic accuracy of ultrasound-based radiomics for predicting cervical lymph-node metastasis (LNM) in patients with papillary thyroid carcinoma (PTC). The analysis included 60 studies with a total of 10,852 patients, revealing that radiomics-only models achieved a pooled sensitivity of 72% and specificity of 81%. The findings suggest that while ultrasound radiomics offers good accuracy for cervical nodal staging, its advantage over conventional clinical variables is limited. The study highlights the need for standardized methodologies in future research.

Pre-operative identification of cervical lymph-node metastasis (LNM) guides surgical extent in papillary thyroid carcinoma (PTC) but remains imperfect with conventional ultrasound. This review aimed to quantify the diagnostic accuracy of ultrasound-based radiomics for predicting cervical LNM in PTC and to evaluate methodological quality and standardization across published studies. Six databases were searched to 8 March 2025 (PROSPERO CRD420252XXXX). Two reviewers independently screened records, extracted data, and assessed quality with QUADAS-2 and the Radiomics Quality Score. Pooled sensitivity, specificity, diagnostic odds ratio, and hierarchical summary area under the ROC curve (AUC) were calculated using bivariate random-effects models. Subgroup/meta-regression explored nodal compartment, modelling pipeline, and validation strategy; publication bias was examined with Deeks' funnel plot and trim-and-fill. Sixty studies (10,852 patients; 4,716 with LNM) met inclusion. Radiomics-only models achieved pooled sensitivity 0.72 (95% CI: 0.66-0.78), specificity 0.81 (0.74-0.86) and AUC = 0.83 (0.79-0.86). Adding clinical variables raised sensitivity to 0.79 and AUC to 0.88, but the gain was not significant (ΔAUC = 0.05; p = 0.33). Machine-learning pipelines outperformed deep learning (AUC = 0.86 vs. 0.79), and accuracy was highest for lateral nodes (AUC = 0.94). External-validation cohorts showed lower performance (AUC = 0.79). Heterogeneity was high (I² > 80%), yet estimates were robust after bias adjustment. Most included studies originated from China, which may limit generalizability to other populations. Ultrasound radiomics provides good non-invasive accuracy for cervical nodal staging in PTC, especially for lateral compartments, though its advantage over routine clinical variables is modest. Multi-center prospective studies using standardized acquisition and reporting are needed before routine clinical adoption.
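
The abstract lists the diagnostic odds ratio (DOR) among the pooled metrics but does not report its value. As a back-of-the-envelope illustration only, the sketch below derives a DOR and likelihood ratios from the pooled sensitivity (0.72) and specificity (0.81) of the radiomics-only models. A DOR pooled through a bivariate random-effects model need not equal this plug-in figure, so the result should not be read as the study's estimate.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in diseased vs. non-diseased patients."""
    return (sensitivity / (1 - sensitivity)) * (specificity / (1 - specificity))

def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

# Pooled point estimates quoted in the abstract for radiomics-only models.
sens, spec = 0.72, 0.81
dor = diagnostic_odds_ratio(sens, spec)
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(round(dor, 2), round(lr_pos, 2), round(lr_neg, 2))
```

The identity DOR = LR+ / LR- holds by construction, which gives a quick consistency check when reading diagnostic meta-analyses.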

PUBMED Cancer: actinic keratosis Method: multimodal learning

Actinic keratosis staging in multimodal image data.

Anna Slian, Katarzyna Korecka, Adriana Polańska, Joanna Czajkowska
Published 2026-07-01 00:00

This study developed a multimodal machine-learning framework to stage Actinic Keratosis (AK) using dermatoscopic and high-frequency ultrasound (HFUS) data. The framework achieved over 80% accuracy in classifying different stages of AK and nearly 90% accuracy in detecting early lesions. The results indicate that combining features from both imaging modalities outperforms single-modality models.

Actinic Keratosis (AK) is a common skin condition, usually appearing on sun-exposed areas, whose progression is associated with characteristic dermatoscopic and structural changes. Early detection of AK is crucial, as cancer progression may occur in changed skin. This study aimed to develop a multimodal, machine-learning-based framework combining dermatoscopic and high-frequency ultrasound (HFUS) data to automatically stage AK and identify early lesions. A dataset containing 222 pairs of dermatoscopic and HFUS images was clinically evaluated using the 3-point Zalaudek scale. Dermatoscopic images underwent ROI selection, hair removal, and extensive feature extraction (color, erythema, pigmentation, vessels, scales, pixel intensities, GLCM/LBP texture). HFUS images were divided into entry echo, sub-epidermal low-echoic band (SLEB), and dermis using a deep neural network, and then features describing the morphology and structure of the skin for each layer were extracted. A pre-trained EfficientNet network was used for feature extraction. Logistic Regression, k-Nearest Neighbors, Random Forests, Support Vector Machines and Multilayer Perceptrons with Sequential Feature Selection using 5-fold patient-wise cross-validation were used for feature-based classification. Additionally, multimodal TwinCNN was evaluated, with various pre-trained models as feature extractors. Combining dermatoscopic and HFUS features consistently outperformed single-modality models. Depending on the defined task, the models achieved over 80% accuracy (healthy, AK1-AK3), 78% (AK1-AK3), and almost 90% in the case of early AK detection vs. healthy and advanced AK on multimodal features. The TwinCNN model performed worse than classical machine-learning approaches, likely due to the limited size of the dataset and class imbalance. A multimodal framework integrating dermatoscopic and HFUS imaging enables accurate AK classification, surpassing single-modality approaches. 
Future work should expand multicenter datasets, improve automation of pre-processing steps, and explore enhanced neural multimodal fusion architectures.
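
Patient-wise cross-validation, as used above, keeps all samples from one patient in the same fold so that a model is never tested on a patient it saw during training. The following is a minimal sketch of such a split under the assumption that some patients contribute more than one image pair. The patient identifiers are invented, and `patient_wise_folds` is an illustrative helper rather than the authors' implementation.

```python
import random
from collections import defaultdict

def patient_wise_folds(patient_ids, n_folds=5, seed=0):
    """Assign sample indices to folds so that no patient spans two folds.

    patient_ids[i] is the patient that sample i belongs to.
    Returns a list of n_folds lists of sample indices.
    """
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    patients = sorted(by_patient)           # deterministic starting order
    random.Random(seed).shuffle(patients)   # reproducible shuffle
    folds = [[] for _ in range(n_folds)]
    for i, pid in enumerate(patients):
        folds[i % n_folds].extend(by_patient[pid])  # round-robin by patient
    return folds

# Hypothetical sample-to-patient mapping (10 samples, 6 patients).
example_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p6", "p6"]
example_folds = patient_wise_folds(example_ids, n_folds=5, seed=0)
for k, fold in enumerate(example_folds):
    print(k, sorted(fold))
```

Round-robin assignment by patient does not balance class labels across folds; with a small, imbalanced dataset a stratified variant may be preferable.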

PUBMED Cancer: non-small cell lung cancer Method: random forest

An intelligent fusion model for Ki-67 prediction in non-small cell lung cancer: A cloud-based prediction system integrating radiomics.

Zhenyu Cao, Xiaoling Xu, Guoqun Mao, Feng Cui, Zhongfeng Niu, Zongyu Xie, Hengfeng Shi, Cheng Yan, Jian Wang
Published 2026-07-01 00:00

This study presents a cloud-based prediction system for Ki-67 expression in non-small cell lung cancer (NSCLC) using a multimodal random forest model. The model integrates radiomic features extracted from CT images and deep learning scores generated by ResNet101. The combined approach achieved high AUCs and F1-scores across training, testing, and validation sets, demonstrating its potential for accurate preoperative prediction and personalized therapeutic strategies.

The expression level of Ki-67 affects the prognosis of NSCLC patients. Accurate preoperative prediction of Ki-67 expression in non-small cell lung cancer (NSCLC) is crucial for prognostic stratification. This multicenter retrospective study enrolled 876 NSCLC patients (January 2015-December 2024) from four institutions, randomly divided into training (n = 525), testing (n = 175), and external validation (n = 176) sets. Radiomic features were extracted from intratumoral and peritumoral (0-12 mm) regions on CT images to construct intra-, peri-, and combined (intra + peri) radiomic scores (Rad-score). Deep learning scores (DL-score) were generated using ResNet101 for whole-lung and tumor-specific analyses. A random forest model integrating Rad-scores, DL-scores, and clinical parameters (lobulation, emphysema, etc.) was developed and validated across all datasets. The combined model (intra + peri Rad-score, intra-tumor DL-score, and clinical features) achieved AUCs of 0.98 (95% CI: 0.97-0.99), 0.92 (0.88-0.96), and 0.92 (0.87-0.96) in training, testing, and external validation sets, with corresponding F1-scores of 0.90, 0.75, and 0.70. SHAP interpretation identified intra-tumor DL-score as the most significant predictor (feature contribution: 46.8%). The multimodal random forest model enables noninvasive and accurate Ki-67 prediction in NSCLC, demonstrating superior generalizability and interpretability to guide personalized therapeutic strategies. Integrating deep learning with intratumoral and peritumoral radiomics enhances the preoperative prediction of Ki-67 expression in patients with non-small cell lung cancer.

PUBMED Cancer: pancreatic cancer Method: machine learning

Towards transparent and interpretable screening: multi-biofluid FTIR spectroscopy with LLM-Augmented explainability for pancreatic cancer detection.

Zheng Tang, Olivia Irvine, Edward Duckworth, Chiara Maria Costanzo, K Lillis, Jiahao Ren, P M Anupama Bandaranayake, Bilal Al-Sarireh, Matthew Mortimer, Venkateswarlu Kanamarlapudi, Victoria Higginbotham, S H Chandrashekhara, Benjamin Mora, Debdulal Roy
Published 2026-07-01 00:00

This study addresses the challenge of early detection of pancreatic cancer by utilizing Fourier Transform Infrared (FTIR) spectroscopy combined with machine learning techniques. The research evaluates multiple datasets from urine and blood biofluids, achieving a maximum balanced accuracy of 96.9% with a combination of transmission-mode urine and filtered blood. Additionally, the study emphasizes the importance of transparency and interpretability in AI systems, proposing a language-model-assisted framework for explainability in diagnostic processes.

Early detection of pancreatic cancer remains a critical challenge in oncology, with current diagnostic methods often failing to identify the disease until advanced stages. However, diagnostic accuracy alone may be insufficient for clinical adoption as regulatory frameworks and clinical workflows increasingly demand transparent, interpretable AI systems. This study investigates Fourier Transform Infrared (FTIR) spectroscopy combined with machine learning for non-invasive pancreatic cancer detection using urine and blood biofluids, augmented by a language-model-assisted transparency framework to bridge spectral feature attributions and biochemical interpretation. Five datasets were evaluated: urine ATR-FTIR (61.7% balanced accuracy), urine transmission FTIR (74.8%), filtered blood (<10 kDa; 89.8%), and two matched urine-blood fusion datasets. Transmission-mode urine combined with filtered blood achieved the highest performance (96.9% balanced accuracy), exceeding either biofluid alone. To support transparency, we developed an LLM-augmented explainability pipeline incorporating Monte Carlo Tree Search (MCTS) for structured hypothesis exploration, a curated retrieval-augmented knowledge base (RAG), and reliability-gated explanations that acknowledge disagreement between feature attribution methods. Explainability methods showed substantial disagreement (mean Spearman ρ = 0.23-0.28), motivating a tiered strategy: wavenumber-level interpretation when methods agree (ρ ≥ 0.3, with knowledge base verification) and zone-level interpretation otherwise. These results highlight both the potential and current limitations of transparent spectroscopic diagnostics.
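
The tiered strategy described above can be sketched as a simple gate on the rank agreement between two feature-attribution vectors: wavenumber-level interpretation when Spearman ρ is at least 0.3, zone-level otherwise. The attribution values below are invented, the function names are illustrative, and the knowledge-base verification step mentioned in the abstract is deliberately left out.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # constant attributions carry no ranking information
    return cov / (vx * vy) ** 0.5

def interpretation_tier(attr_a, attr_b, threshold=0.3):
    """Pick the interpretation granularity from attribution agreement."""
    rho = spearman_rho(attr_a, attr_b)
    tier = "wavenumber-level" if rho >= threshold else "zone-level"
    return tier, rho

# Hypothetical per-wavenumber importances from two attribution methods.
method_a = [0.1, 0.4, 0.3, 0.9, 0.2]
method_b = [0.2, 0.5, 0.1, 0.8, 0.3]
method_c = [0.9, 0.2, 0.3, 0.1, 0.4]
print(interpretation_tier(method_a, method_b))
print(interpretation_tier(method_a, method_c))
```

With the mean agreement reported in the abstract (ρ = 0.23-0.28), a gate like this would fall back to zone-level interpretation in the typical case.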
