Research Papers

PUBMED Cancer: gastric cancer Method: unknown

Blocking CEMIP2-mediated low-molecular-weight hyaluronic acid-TGFβ signaling inhibits chemotherapy-associated lymphatic metastasis in gastric cancer.

Huanjie Chen, Qinbo Cai, Yanlei Chen, Jiahuang Huang, Peng Shi, Rongman Xie, Shaoxiong Yi, Haobin Hou, Hongfa Wei, Yulong He, Huafeng Fu, Xinming Song, Dongjie Yang
Published 2026-07-01 00:00

This study investigates the role of CEMIP2-mediated degradation of hyaluronic acid in chemotherapy-associated lymphatic metastasis in gastric cancer. The authors demonstrate that low molecular weight hyaluronic acid promotes metastasis through CD44-ATF3 signaling, leading to the upregulation of TGFβ receptor TGFBR1. They developed bioengineered exosome mimics for targeted delivery of CEMIP2 siRNA, which effectively inhibited metastasis in vivo. The findings suggest that targeting CEMIP2 could be a promising therapeutic strategy in gastric cancer treatment.

Chemotherapy-associated metastasis is a major cause of failure of cancer treatment, especially neoadjuvant chemotherapy. Extracellular matrix (ECM) remodeling always accompanies chemotherapy, but its role in chemotherapy-associated metastasis is still unclear. Here, we reveal hyaluronidase-driven degradation of hyaluronic acid (HA) as a key mechanism underlying chemotherapy-associated lymphatic metastasis in gastric cancer. We found that chemotherapy-associated lymphatic metastasis of gastric cancer occurred during neoadjuvant chemotherapy in both patients and nude mice. The proportion of HA in the ECM increased significantly during chemotherapy. We also found that cell migration inducing hyaluronidase 2 (CEMIP2) is the most highly expressed hyaluronidase; it degrades HA into its effective form, low molecular weight HA (LMWHA), and promotes chemotherapy-associated lymphatic metastasis of gastric cancer. Mechanistically, CEMIP2-generated LMWHA activates CD44-ATF3 signaling to transcriptionally upregulate TGFβ receptor TGFBR1, driving metastasis. CEMIP2 is naturally highly expressed in gastric epithelium. To specifically target CEMIP2 and inhibit chemotherapy-associated lymphatic metastasis of gastric cancer, we developed bioengineered RGD-conjugated exosome mimics (EMs) for targeted delivery of CEMIP2 siRNA. This strategy potently suppressed chemotherapy-associated lymphatic metastasis in vivo. Crucially, our results position CEMIP2 as a therapeutic target to inhibit chemotherapy-associated metastasis of gastric cancer.

PUBMED Cancer: breast cancer Method: machine learning

Key predictive factors of breast cancer based on race using machine learning models.

Shuning Yin, Gaurav Nanda, Raji Sundararajan
Published 2026-07-01 00:00

This research investigates key factors influencing breast cancer risk, focusing on racial differences using machine learning and explainable AI. The study utilized data from the Breast Cancer Surveillance Consortium, applying various models to identify predictive factors. The analysis revealed that history of biopsy and age group were the strongest predictors across all racial groups, highlighting significant disparities in breast cancer risk among different demographics.

This research uses machine learning (ML) and explainable AI to investigate key factors influencing breast cancer risk, a major global issue, and how they differ by race. We used Breast Cancer Surveillance Consortium (BCSC) data, originally comprising 1.5 million unique combination records, from 6.7 million mammograms, collected between 2005 and 2017. Naïve Bayes, Logistic Regression, and Extreme Gradient Boosting models were applied to identify these key predictors. Variable importance and SHapley Additive exPlanations values were used to interpret the models and identify the most predictive factors. Analyses were stratified by six racial groups. History of biopsy (50.04%) and age group (25.85%) were the strongest predictors across all models and races. Menopausal status, breast density, and age at first childbirth were also important. White women had the highest overall incidence, particularly those over 65 (9.02 overall; 18.13 at age 65+ per 100,000), while Black women had higher rates in younger age groups (7.1 per 100,000 at age 18-29). Native American women showed higher rates in certain older age groups, whereas Asian/Pacific Islander and Other/Mixed groups had generally lower rates. ML and explainable AI applied to BCSC data identified key predictors and highlighted racial disparities among the most predictive factors for breast cancer risk.

PUBMED Cancer: non-small cell lung cancer Method: unknown

3-Deoxy-4-sulfonamido-butein derivatives promote cell cycle arrest and apoptosis by inhibiting EGFR/JAK2/STAT3 signaling in A549 lung cancer cells.

Jung Hwan Choi, Hyun-Ha Hwang, Jinwon Hong, Jeong Uk Kim, Geon Wan Roh, Yewon Cho, Joonseok Byun, Jeong-Hui Je, Hyeong-Chan Lee, Ji-Sung Yoo, Sangjun Lee, JinSeok Ha, Tae-Hyoun Kim, Seong-Gyu Ko, Jae Yeol Lee
Published 2026-07-01 00:00

This study investigates the synthesis and evaluation of novel 3-deoxy-4-sulfonamido-butein derivatives aimed at enhancing anticancer efficacy against non-small cell lung cancer (NSCLC). The compounds demonstrated significant antiproliferative activity in A549, H1299, and H1975 cell lines, inducing G2/M cell cycle arrest and apoptosis through inhibition of the EGFR/JAK2/STAT3 signaling pathway. Notably, the lead candidates showed effectiveness even in gefitinib-resistant NSCLC models, suggesting their potential as dual-targeting agents for treatment-resistant lung cancer.

Butein, a naturally occurring chalcone, exhibits diverse pharmacological potential but suffers from limited anticancer efficacy. To address this limitation and improve drug-likeness, we designed and synthesized a novel series of 3-deoxy-4-sulfonamido-butein derivatives. The synthesized compounds were evaluated for their antiproliferative activities against human non-small cell lung cancer (NSCLC) cells, including A549, H1299, and H1975 lines. Among them, compounds 8o (MRC-B-016) and 8p (MRC-B-018), characterized by electron-donating dialkylamino substituents, demonstrated profoundly enhanced growth inhibition compared to the parent compound without significant cytotoxicity in WI-38 normal lung cells. Mechanistic investigations revealed that 8o (MRC-B-016) and 8p (MRC-B-018) effectively induced G2/M cell cycle arrest and apoptotic cell death. Furthermore, these lead candidates operated through a convergent dual-targeting mechanism by directly inhibiting epidermal growth factor receptor (EGFR) tyrosine kinase activity while concurrently suppressing the downstream JAK2/STAT3 signaling axis. Molecular docking simulations corroborated these biological outcomes, displaying highly favorable binding interactions within both the EGFR kinase and STAT3 SH2 domains. Notably, these derivatives successfully suppressed oncogenic activation and colony formation even in gefitinib-resistant NSCLC models. Collectively, this study identifies sulfonamide-modified butein derivatives, particularly 8o (MRC-B-016) and 8p (MRC-B-018), as promising dual-targeting agents. Future research will focus on expanding the structure-activity relationship and conducting comprehensive in vivo evaluations to advance these chemotypes toward clinical translation for treatment-resistant lung cancer.

PUBMED Cancer: thyroid cancer Method: Swin Transformer

Region-guided decoupled fusion network for ultrasound-based classification of thyroid nodules with and without Hashimoto's thyroiditis.

Jing Wen, Qijian Chen, Lijuan Luo, Hongqing Ma, Caihong Wang, Yuyu Hua, Dan Qin, Jinge Zhou, Ying Yang, Tingting Shen, Limei Liu, Juang Wen, Lihui Wang, Shi Zhou, Zhu Zeng
Published 2026-07-01 00:00

This study presents a region-guided decoupled fusion network (DFNet) designed to classify thyroid nodules in patients with and without Hashimoto's thyroiditis. The method enhances classification balance and interpretability, aiming to reduce unnecessary biopsies while maintaining reliable malignancy detection. DFNet demonstrated superior performance compared to ten state-of-the-art architectures, achieving high accuracy and area under the curve metrics in both validation and testing cohorts.

Differentiating benign from malignant thyroid nodules is particularly challenging in patients with Hashimoto's thyroiditis (HT), where inflammatory changes can mimic cancer. We developed a region-guided decoupled fusion network (DFNet) that explicitly models intra- and peri-nodular transitions in both HT and non-HT nodules. By improving classification balance and interpretability, DFNet may help reduce unnecessary biopsies while preserving reliable detection of malignancy. In this multicenter retrospective study, 8667 patients (13,680 ultrasound images) from nine institutions were included. Nodules were confirmed histopathologically after surgery. Regions of interest (ROIs) representing intra- and peri-nodular areas were manually segmented, expanded/shrunk in fixed pixel increments, and normalized. A total of 1578 radiomic features were extracted from each ROI. DFNet employed a Swin Transformer backbone to obtain regional features, orthogonal constraint-based decomposition to separate common and region-specific representations, and HT-specific fusion before classification. Interpretability was achieved via Shapley Additive Explanations (SHAP) and correlation of deep features with radiomic descriptors. Performance was compared with 10 state-of-the-art architectures using accuracy (ACC), Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (AUC). Statistical significance was assessed using the DeLong test and t tests with Bonferroni correction. DFNet achieved the best results in validation (ACC 91.9%, MCC 76.4%, AUC 91.4%) and testing cohorts (ACC 93.6%, MCC 83.0%, AUC 92.4%), significantly outperforming alternatives (p<0.05). Peri-nodular features improved MCC by up to 12.9%, decoupled fusion by 6.1-9.0%, and HT-specific adaptation by 2.9-5.4%. SHAP highlighted biomarkers (e.g., GLDM-LDHGLE, LBP-2D-FO-TE, OFK) with HT-dependent patterns. 
DFNet improves thyroid nodule classification by modeling intra- to peri-nodular transitions and linking deep features with radiomic biomarkers, enabling more accurate and interpretable predictions that may help reduce unnecessary fine-needle aspiration biopsies.
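
The Matthews correlation coefficient (MCC) reported in the abstract above is computed from a binary confusion matrix and, unlike accuracy, stays informative when benign and malignant classes are imbalanced. The following is a minimal Python sketch of the standard formula; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Return the MCC for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention for
    the undefined case where a whole row or column is empty).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not the study's data).
mcc_example = matthews_corrcoef(tp=90, fp=10, tn=80, fn=20)
print(f"MCC = {mcc_example:.3f}")
```

MCC ranges from -1 to 1; the abstract reports it as a percentage, which corresponds to multiplying this value by 100.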

PUBMED Cancer: colon cancer Method: machine learning

Integrating network toxicology, machine learning, and experimental evidence reveals candidate targets and pathways in PCDD/F-related colon cancer.

Hanxiao Shen, Wei Zhu, Ding Wang, Yue Mou, Yuxin Huang, Yueying Yang, Zhen Liu, Qing Liu
Published 2026-07-01 00:00

This study investigates the role of polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs) in promoting colon cancer through a multidisciplinary approach that includes network toxicology and machine learning. The research identifies MMP7 as a core target, with its expression linked to immune cell infiltration in colon cancer tissues. Experimental evidence supports the findings, showing that exposure to TCDF increases Mmp7 expression and proinflammatory cytokines in murine colonic tissues.

Although previous studies have suggested that exposure to carcinogenic polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs) pollutants may increase the risk of colon cancer, the underlying molecular mechanisms remain unclear. In this study, we employed a multidisciplinary approach integrating network toxicology, machine learning, molecular docking, molecular dynamics (MD) simulations and in vivo experiments to investigate how PCDD/Fs may promote colon carcinogenesis. Machine learning algorithms converged on MMP7 as a core target; MMP7 expression was upregulated in colon cancer tissues and was associated with immune cell infiltration. Molecular docking and MD simulations further suggested stable interactions between the five representative PCDD/F congeners and the target proteins (MMP7, SRC, and HSP90AA1), supporting their potential involvement in disease progression. Consistent with these in silico findings, exposure of mice to 24 μg/kg TCDF significantly increased the expression of Mmp7 and Hsp90aa1 in murine colonic tissues, increased the levels of proinflammatory cytokines Ifn-γ, Il-1β, and Il-6, and downregulated the expression of Mucin 2 (MUC2). Connectivity Map analysis based on the PCDD/F-related gene signature identified five candidate compounds targeting MMP7 and HSP90AA1, of which four HSP90 inhibitors (tanespimycin, alvespimycin, NVP-AUY922 and AT-13387) showed negative connectivity scores, suggesting potential to reverse the pollutant-induced expression profile.

PUBMED Cancer: colorectal cancer Method: TabTransformer

Integration of deep learning and radiomic features from multiplex immunohistochemistry images for reproducible multi-outcome prediction in a multi-center study of colorectal cancer.

Yizhuo Yin, Zhe Sun, Xin Deng, Qing Fan
Published 2026-07-01 00:00

This study develops and validates a multimodal machine learning framework that integrates radiomic and deep learning features from multiplex immunohistochemistry images for predicting outcomes in colorectal cancer. The framework was tested on a large cohort of 2,117 patients across multiple centers, demonstrating superior performance in predicting tumor recurrence, survival status, TNM staging, and immune profiles. The results indicate that the integration of these features enhances prediction accuracy and generalizability in clinical settings.

This study aimed to develop and validate a robust, multimodal machine learning framework integrating radiomic and deep learning features from multiplex immunohistochemistry (mIHC) images for comprehensive outcome prediction in colorectal cancer (CRC). This multi-institutional retrospective study included 2,117 CRC patients from seven centers, with 1,548 cases used for model training and internal testing, and 569 for external validation. mIHC-stained whole-slide images targeting six immune markers (CD3, CD8, CD45RO, PD-1, LAG-3, Tim-3) were analyzed from two spatial compartments: tumor center and invasive margin. Radiomic features (n = 71/region/marker) were extracted using HistomicsTK, while 768-dimensional deep features were derived using a pre-trained Vision Transformer (ViT-B/16). Feature robustness across biomarkers was quantified via intraclass correlation coefficients (ICC ≥ 0.75). The retained features underwent multi-step selection (LASSO, MI, RFE) and were fused into a single feature space, followed by PCA-based dimensionality reduction. Five clinical tasks were modeled: tumor recurrence, survival status, overall survival duration, TNM staging, and immune profile classification. Classification models (TabTransformer, XGBoost, TabNet) and survival models (DeepSurv, CoxPH, RSF) were trained using 5-fold cross-validation and tested on independent cohorts. Fused features significantly outperformed individual modalities across all tasks. TabTransformer with LASSO-selected fused features achieved top performance: recurrence (AUC = 95.9%), survival status (AUC = 94.5%), TNM staging (macro-AUC = 91.0%), and immune profile (macro-AUC = 91.0%). For survival regression, DeepSurv achieved a C-index of 0.82 and time-dependent AUC of 0.82. Models exhibited strong generalizability, with negligible performance drop on external datasets. SHAP analysis confirmed feature interpretability, with fused features contributing the most across tasks.
This study demonstrates that fused mIHC-derived radiomic and deep features yield accurate, interpretable, and generalizable predictions for multiple CRC outcomes, supporting their integration into precision oncology workflows.
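
The C-index quoted for the survival models in the abstract above measures how often a model ranks two patients in the right order of risk. Below is a minimal pure-Python sketch of Harrell's concordance index, assuming right-censored data where `events[i] == 1` marks an observed event and a higher risk score should correspond to a shorter survival time. The function name and toy data are hypothetical; the study's own implementation is not described in the abstract.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has the shorter time and
    an observed event. Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for illustration only: the third subject is censored.
c_index = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.6, 0.1],
)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.82 in context.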

PUBMED Cancer: colorectal cancer Method: polynomial regression

An algorithm-enhanced stool DNA system improves the differential diagnosis of colorectal cancer versus Crohn's disease in high-risk symptomatic patients.

Le Gao, Zhe Guo, Zeyou Wang, Min Wang
Published 2026-07-01 00:00

This study developed and validated an algorithm-enhanced stool DNA system (FIT-sDNA-CA) to improve the differential diagnosis of colorectal cancer (CRC) versus Crohn's disease (CD) in high-risk symptomatic patients. By integrating various biomarkers and employing machine learning algorithms, the system demonstrated a positive predictive value of 69.65%, significantly outperforming traditional methods. The polynomial regression model was identified as the optimal approach, enhancing the specificity of CRC triage.

Crohn's disease (CD) and colorectal cancer (CRC) share many clinical symptoms, making non-invasive differential diagnosis difficult. FIT-sDNA is sensitive for CRC screening in average-risk populations but often gives false positives in CD patients due to inflammation-induced mucosal turnover. This study aimed to develop and validate an algorithm-enhanced system (FIT-sDNA-CA) to improve the specificity of CRC triage using current DNA tests. The study enrolled 312 subjects, comprising a training cohort of 234 confirmed patients and a prospective validation cohort of 78 potential patients initially diagnosed by clinicians with either CD or CRC. Machine learning algorithms integrated gender, age, fecal KRAS mutation, BMP3/NDRG4/SDC2 methylation, fecal calprotectin (FC), and fecal immunochemical test (FIT) results. After comparing eight algorithms, polynomial regression (PR) was determined to be the optimal model. The PR model demonstrated superior clinical applicability compared to long short-term memory (LSTM) networks (validation set AUC 0.906 vs 0.794). In CRC differential diagnosis, the FIT-sDNA-CA system achieved a positive predictive value of 69.65% (95% CI, 66.73-71.58), significantly higher than FIT (45.93%) and FC (22.92%). By integrating genetic, epigenetic, and inflammatory biomarkers, the FIT-sDNA-CA system effectively filters out confounding signals from intestinal inflammation, overcoming the low specificity limitation of traditional fecal DNA testing. As a highly accurate non-invasive triage tool, this system facilitates early risk stratification for patients with high-risk colorectal cancer symptoms and significantly reduces unnecessary endoscopic referrals.
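
The link the abstract draws between specificity and positive predictive value (PPV) follows from Bayes' rule: at a fixed sensitivity and disease prevalence, fewer false positives mean a larger share of positive results are true cancers. The Python sketch below illustrates that relationship. The sensitivity, specificity, and prevalence values are hypothetical and were chosen only for illustration; they are not figures from the study.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """PPV via Bayes' rule: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("test never returns a positive result")
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs: same sensitivity and prevalence, two specificities.
ppv_low_spec = positive_predictive_value(0.90, 0.50, 0.30)
ppv_high_spec = positive_predictive_value(0.90, 0.80, 0.30)
print(f"PPV at 50% specificity: {ppv_low_spec:.3f}")
print(f"PPV at 80% specificity: {ppv_high_spec:.3f}")
```

Because PPV depends on prevalence, values measured in a high-risk symptomatic cohort would not carry over unchanged to an average-risk screening population.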

PUBMED Cancer: non-small cell lung cancer Method: random forest

An intelligent fusion model for Ki-67 prediction in non-small cell lung cancer: A cloud-based prediction system integrating radiomics.

Zhenyu Cao, Xiaoling Xu, Guoqun Mao, Feng Cui, Zhongfeng Niu, Zongyu Xie, Hengfeng Shi, Cheng Yan, Jian Wang
Published 2026-07-01 00:00

This study presents a cloud-based prediction system for Ki-67 expression in non-small cell lung cancer (NSCLC) using a multimodal random forest model. The model integrates radiomic features extracted from CT images and deep learning scores generated by ResNet101. The combined approach achieved high AUCs and F1-scores across training, testing, and validation sets, demonstrating its potential for accurate preoperative prediction and personalized therapeutic strategies.

The expression level of Ki-67 affects the prognosis of patients with non-small cell lung cancer (NSCLC). Accurate preoperative prediction of Ki-67 expression in NSCLC is crucial for prognostic stratification. This multicenter retrospective study enrolled 876 NSCLC patients (January 2015-December 2024) from four institutions, randomly divided into training (n = 525), testing (n = 175), and external validation (n = 176) sets. Radiomic features were extracted from intratumoral and peritumoral (0-12 mm) regions on CT images to construct intra-, peri-, and combined (intra + peri) radiomic scores (Rad-score). Deep learning scores (DL-score) were generated using ResNet101 for whole-lung and tumor-specific analyses. A random forest model integrating Rad-scores, DL-scores, and clinical parameters (lobulation, emphysema, etc.) was developed and validated across all datasets. The combined model (intra + peri Rad-score, intra-tumor DL-score, and clinical features) achieved AUCs of 0.98 (95% CI: 0.97-0.99), 0.92 (0.88-0.96), and 0.92 (0.87-0.96) in training, testing, and external validation sets, with corresponding F1-scores of 0.90, 0.75, and 0.70. SHAP interpretation identified intra-tumor DL-score as the most significant predictor (feature contribution: 46.8%). The multimodal random forest model enables noninvasive and accurate Ki-67 prediction in NSCLC, demonstrating superior generalizability and interpretability to guide personalized therapeutic strategies. Integrating deep learning with intratumoral and peritumoral radiomics enhances the preoperative prediction of Ki-67 expression in patients with NSCLC.

PUBMED Cancer: osteosarcoma Method: support vector machine

Habitat-based MRI heterogeneity radiomics for predicting neoadjuvant chemotherapy response in osteosarcoma.

Shuo Wang, Qingsong Wang, Xing Wan, Xianghong Meng, Man Sun, Jinglai Sun, Xuyao Yu, Guangpu Wang, Lei Zhu, Hui Yu
Published 2026-07-01 00:00

This study aimed to develop a habitat-based MRI heterogeneity radiomics (H-radiomics) approach to predict the response of osteosarcoma patients to neoadjuvant chemotherapy (NAC). By analyzing MRI scans and employing feature selection techniques, the researchers identified key features that improved prediction accuracy. The combined model of H-radiomics and conventional radiomics outperformed individual models, achieving an area under the curve (AUC) of 0.91 in predicting treatment response.

Osteosarcoma is a highly heterogeneous malignant tumor with varied responses to neoadjuvant chemotherapy (NAC). This study developed heterogeneity radiomics (H-radiomics) based on habitat imaging to predict the treatment response of osteosarcoma patients after NAC. This study retrospectively included MRI scans (T1-weighted and T2-weighted) of osteosarcoma patients who underwent NAC and surgery at two centers between April 2015 and September 2024. Conventional radiomics (C-radiomics) features and habitat imaging-based H-radiomics features were extracted, with 2236 features obtained for each. Unsupervised reproducibility feature correlation analysis and the least absolute shrinkage and selection operator (LASSO) were used for feature selection, which resulted in 5 features being selected. Support vector machines (SVM) served as the classifier. C-radiomics, H-radiomics, and combined models were developed, and their performance was evaluated using the area under the receiver operating characteristic curve (AUC). The training set included 57 patients (mean age, 17 years ± 11 [SD], 29 men) from Center 1, while the external validation set included 48 patients (mean age, 18 years ± 14 [SD], 28 men) from Center 2. In the external validation set, the H-radiomics model achieved an AUC of 0.86, outperforming the C-radiomics model, which had an AUC of 0.79. The combined model demonstrated the best performance, with an AUC of 0.91. Additionally, the combined model achieved an accuracy of 85%, sensitivity of 88%, and specificity of 83%. The combined model of H-radiomics and C-radiomics from multiparametric MRI demonstrates good performance in predicting the treatment response after NAC in osteosarcoma patients.
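
The AUC used to compare the three models in the abstract above can be read as the probability that a randomly chosen responder receives a higher classifier score than a randomly chosen non-responder. The Python sketch below computes it directly from that pairwise definition (equivalent to the Mann-Whitney U statistic). The labels and scores are toy values for illustration, not study data, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = responder, 0 = non-responder (illustrative only).
auc_example = roc_auc(
    labels=[1, 1, 1, 0, 0, 0],
    scores=[0.9, 0.8, 0.4, 0.7, 0.3, 0.2],
)
print(f"AUC = {auc_example:.3f}")
```

With an external cohort of 48 patients, each additional correctly ordered pair shifts the AUC noticeably, so differences such as 0.86 versus 0.79 are best read alongside confidence intervals.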

PUBMED Cancer: hepatocellular carcinoma Method: large language models

Large Language Models for the Differentiation of Benign and Malignant Liver Nodules based on Multimodal Prompts in Liver US Cases.

Sijia Lin, Yu Li, Rushuang Mao, Xuebin Zou, Yixin Hu, Hongsheng Ye, Xiaojun Wu, Liang Yang, Jichong He, Shilin Lu, Lingling Li, Jianhua Zhou
Published 2026-07-01 00:00

This study evaluates the performance of large language models (LLMs) in differentiating benign and malignant liver nodules using multimodal prompts in liver ultrasound cases. The research involved 400 liver ultrasound cases, with a focus on identifying the optimal input for LLMs. Results indicated that the combination of ultrasound images and medical history provided the highest diagnostic accuracy, with LLMs achieving performance comparable to senior radiologists.

Large language models (LLMs) that can process both images and text are increasingly being used in radiology. This study aimed to evaluate the performance of LLMs including GPT-4 Omni (GPT-4o), Claude-3.5-Sonnet (Claude), and Gemini 1.5 Pro (Gemini) in differentiating benign and malignant nodules in liver US cases and compare it with that of human readers. Four hundred liver US cases with pathologically confirmed liver nodules visible on B-mode US from January 2020 to November 2024 were randomly selected in this retrospective study. They were divided into a development set (n = 100) and a test set (n = 300). Five prompt groups for LLMs including US image [I-only], image description [D-only], image and description [I+D], image and liver US e-textbook [I+T], and image and medical history [I+H] were evaluated to identify the optimal input in the development set. In the test set, the accuracy of LLMs in differentiating benign and malignant liver nodules was compared with that of human readers using McNemar's test. In the development set, the prompt group I+H exhibited the highest diagnostic accuracy for all LLMs in differentiating benign and malignant liver nodules and was therefore considered the optimal input (taking GPT-4o as an example, with I-only, 57.0% [as reference]; D-only, 62.0%, p = 0.55; I+D, 62.0%, p = 0.54; I+T, 62.0%, p = 0.36; I+H, 77.0%, p = 0.01). In the test set, LLMs with I+H outperformed the junior group and showed similar accuracy to the senior group (Junior, 70.0% [as reference 1]; Senior, 78.3% [as reference 2]; GPT-4o, 83.3%, p1 < 0.001, p2 = 0.10; Claude, 77.0%, p1 = 0.04, p2 = 0.72; Gemini, 75.3%, p1 = 0.14, p2 = 0.36). Large language models with US image and medical history inputs achieved accuracy comparable to senior radiologists and superior to junior radiologists in differentiating benign and malignant liver nodules.
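
McNemar's test, which the abstract above uses to compare readers on the same cases, looks only at discordant pairs: cases one reader got right and the other got wrong. The Python sketch below implements the exact (binomial) two-sided version. The abstract does not say whether the exact or chi-square form was used, so this choice is an assumption, and the discordant counts in the example are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases reader A classified correctly and reader B incorrectly.
    c: cases reader B classified correctly and reader A incorrectly.
    Under the null hypothesis, discordant cases split 50/50.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts for illustration only.
p_value = mcnemar_exact(b=3, c=12)
print(f"McNemar exact p = {p_value:.4f}")
```

Cases that both readers classified correctly, or both incorrectly, do not enter the statistic, which is why two readers with similar overall accuracy can still differ significantly, or not, depending on how their errors overlap.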
