Research Papers

PUBMED Cancer: ovarian cancer Method: large language model

Automated O-RADS Risk Stratification Using a Large Language Model Analysis of Narrative Ultrasound Reports.

Yanhui Guo, Jingjing Gong, Ruquan Jiang, Asmi Agarwal, Ruchika Goel, Richard Selingreund, Yujie Liu, Min Ren
Published 2026-07-01 00:00

This study presents an automated method for Ovarian-Adnexal Reporting and Data System (O-RADS) scoring using a large language model (LLM) to analyze ultrasound reports. A two-stage pipeline was developed, where the Lingshu LLM extracted features from narrative descriptions, which were then used to train various machine learning algorithms for predicting O-RADS scores. The method achieved a high accuracy of 0.803, indicating its potential to improve clinical workflow and reduce diagnostic variability in ovarian cancer risk assessment.

The Ovarian-Adnexal Reporting and Data System (O-RADS) is essential for standardizing the risk stratification of ovarian lesions detected on ultrasound. However, manual assignment of O-RADS scores is time-consuming and can vary between observers. This study investigates an automated method for O-RADS scoring using a large language model (LLM) to analyze narrative ultrasound reports. A two-stage pipeline was developed for automated O-RADS classification. Initially, the Lingshu LLM, specialized in medical language, extracted and embedded features from free-text descriptions of ovarian lesions. It identified key diagnostic features mentioned by sonologists. Subsequently, these features were used to train and evaluate several machine learning algorithms, including logistic regression (LR), support vector machines and random forests, to predict O-RADS scores (1-5). The proposed method was evaluated on a dataset of 513 cases using fivefold cross-validation. The pipeline using Lingshu model embeddings with LR achieved the highest accuracy of 0.803 [95% CI: 0.753, 0.853], a weighted-average F1-score of 0.819 [95% CI: 0.777, 0.861] and a macro-averaged AUROC of 0.948 [95% CI: 0.937, 0.959]. This outperformed the MedGemma model's pipeline, which had an accuracy of 0.760 [95% CI: 0.700, 0.820], F1-score of 0.787 [95% CI: 0.739, 0.835] and AUROC of 0.941 [95% CI: 0.911, 0.971]. This study introduces a novel approach to automate O-RADS scoring using LLMs for feature extraction and traditional machine learning for classification. The results indicate that this method can accurately stratify ovarian cancer risk, potentially improving clinical workflow efficiency and reducing diagnostic variability. This approach may support radiologists in making more consistent and timely assessments.
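
As an illustration of the weighted-average F1-score reported above, here is a minimal sketch of how that metric is commonly computed for a multi-class task such as O-RADS 1-5: each class's F1 is weighted by its share of the true labels. The function names and the toy labels are hypothetical and are not taken from the study; the authors' actual evaluation code is not described in the abstract.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Average of per-class F1, weighted by each class's support."""
    support = Counter(y_true)
    n = len(y_true)
    return sum(support[c] / n * per_class_f1(y_true, y_pred, c) for c in support)


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted score equals the true score."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels (O-RADS scores 1-5), for illustration only.
y_true = [1, 2, 2, 3, 4, 4, 5, 5]
y_pred = [1, 2, 3, 3, 4, 4, 5, 4]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Because the weights follow class frequency, a weighted F1 can differ from accuracy when classes are imbalanced, which may be why the abstract reports both.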

PUBMED Cancer: nasopharyngeal carcinoma Method: multimodal learning

Construction of an interpretable multimodal image model for differentiating T1-stage nasopharyngeal carcinoma from benign hyperplasia.

Kai-Jie Wang, Tian-Cheng Lin, Si-Jia Zuo, Zhi Fu, Xin Jin, Fu-Jin Liu, Gang Wu, Wei-Yuan Huang
Published 2026-07-01 00:00

This study aims to construct and validate a multimodal model that combines magnetic resonance imaging (MRI) and endoscopy to differentiate T1-stage nasopharyngeal carcinoma from benign hyperplasia. The model utilizes various feature selection techniques and machine learning algorithms, with the Multilayer Perceptron (MLP) model demonstrating the best performance. The results indicate a mean AUC of 0.98, showcasing the model's potential for clinical application in distinguishing between benign and malignant tissues.

Differentiating T1-stage nasopharyngeal carcinoma (NPC) from benign hyperplasia (BH) is challenging. This study aims to construct and validate a multimodal model combining magnetic resonance imaging (MRI) and endoscopy to distinguish T1-NPC from BH. Additionally, SHapley Additive exPlanations (SHAP) are used for model interpretability analysis. Data from 161 patients with histologically confirmed diagnoses between 2015 and 2022 were retrospectively collected, including 95 cases of T1-NPC and 66 cases of BH. Regions of interest (ROI) were drawn based on MRI and endoscopy to extract features. Feature selection techniques, such as elastic net, recursive feature elimination, and deep learning, were used to identify the optimal feature subset. Naive Bayes, Adaptive Boosting (AdaBoost), Light Gradient Boosting Machine (LightGBM), k-Nearest Neighbors (kNN), and Multilayer Perceptron (MLP) were applied to establish the MRI radiomics model and the MRI-endoscopy combined model. SHAP was used to perform interpretability analysis of the models. The MRI-endoscopy combined model outperformed the radiomics model, with the MLP-based model showing the best performance. The mean AUC of the test set reached 0.98, with an accuracy of 0.90, precision of 0.90, sensitivity of 0.93, and specificity of 0.86. SHAP analysis revealed that texture features (including GLSZM, GLCM, and GLRLM) and first-order features were critical for distinguishing T1-NPC from BH. Compared to traditional radiomics methods, the multimodal model combining MRI and endoscopy more accurately distinguishes between benign and malignant tissues. SHAP enables visualization of feature contributions and model predictions, highlighting the model's clinical potential.
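
The accuracy, precision, sensitivity, and specificity quoted above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical placeholders, not the study's test-set counts, which the abstract does not report.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics.

    Assumes every denominator is nonzero (at least one positive,
    one negative, and one positive prediction).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),      # positive predictive value
        "sensitivity": tp / (tp + fn),    # recall on malignant cases
        "specificity": tn / (tn + fp),    # recall on benign cases
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=9, fp=2, tn=8, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```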

PUBMED Cancer: breast cancer Method: HRNetV2

Artificial intelligence-assisted three-dimensional imaging of breast microinvasive carcinoma reveals larger invasive focus size in a substantial proportion of cases.

Yichieh Chien, Cher-Wei Liang, Chih-Yi Hsu, Yu-Chieh Lin, Yen-Yu Lin
Published 2026-07-01 00:00

This study focuses on microinvasive carcinoma of the breast, which is characterized by small invasive foci. A three-dimensional imaging method was developed to reassess the size of these foci, revealing that most cases previously reported as close to 1 mm were actually larger and were therefore re-classified. The annotated 3D images were also used to develop an AI program based on the HRNetV2 architecture to assist in annotating future cases. The findings suggest that AI-assisted 3D imaging can enhance the precision of diagnosing microinvasive carcinoma.

Microinvasive carcinoma of the breast is a unique type of malignancy characterized by the presence of small invasive foci (less than 1 mm in diameter) in a background of carcinoma in situ. The disease is the earliest stage of breast invasive carcinoma development, and patients diagnosed with this disease are often treated conservatively. However, diagnosing microinvasive carcinoma based on a single tissue section may underestimate the invasive focus size. We developed a three-dimensional (3D) imaging method to re-evaluate the invasive focus size in microinvasive carcinoma cases in which the original reported focus size was close to 1 mm. The 3D images were annotated and used to develop an artificial intelligence (AI) program based on the HRNetV2 architecture to assist in the annotation of future cases. We found that in 8 of 11 cases (72.7%), the foci were larger than 1 mm when the specimens were analyzed in 3D space, resulting in re-classification of the cases as T1a invasive carcinoma. Notably, in one of the reclassified cases, isolated tumor cells were identified in the sentinel lymph node biopsy. Our findings challenge the robustness of the microinvasion concept and indicate that AI-assisted 3D imaging is a valuable tool in precision diagnosis of microinvasive carcinoma.

PUBMED Cancer: non-small cell lung cancer Method: AI-enhanced multi-omics

Lung cancer as a global health challenge: Multidimensional biomarker research and therapeutic advances.

Dezhong Jin, Liangwang Zhong, Lai Chen
Published 2026-07-01 00:00

The paper discusses the challenges and advancements in lung cancer diagnostics, particularly focusing on the integration of multidimensional biomarkers and artificial intelligence. It highlights the limitations of traditional serum biomarkers and emphasizes the potential of liquid biopsies combined with AI-enhanced multi-omics to improve early detection and treatment monitoring. The authors advocate for the clinical adoption of validated AI-integrated platforms to enhance precision oncology for lung cancer.

Lung cancer, the leading cause of global cancer-related mortality, is categorized into small-cell and non-small-cell subtypes. The heterogeneous non-small-cell lung cancer group is further subcategorized primarily into adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, each underpinned by distinct molecular alterations. Although traditional serum biomarkers aid in subtype differentiation and treatment monitoring, their utility is limited by challenges such as poor specificity due to inflammatory confounders and the difficulty of dynamically tracking therapeutic resistance. Recent advances have identified emergent subtype-specific biomarkers that reflect metabolic reprogramming, epigenetic dysregulation, stemness signatures, and interactions within the immune microenvironment. By integrating analytes such as ctDNA, exosomal RNAs, and urinary DNA with multi-analyte panels and advanced imaging, liquid biopsies offer a promising avenue to enhance early detection accuracy, prognostication, and dynamic therapy monitoring. Nevertheless, clinical adoption is hindered by several challenges, including incomplete validation, the need for technical standardization, intratumoral heterogeneity, and inter-ethnic variability. The convergence of artificial intelligence (AI)-enhanced multi-omics with biomarker-guided therapeutics represents a transformative strategy with the potential to overcome resistance, mitigate ethnic disparities, and ultimately transform lung cancer into a chronic, manageable disease. Therefore, prioritizing clinically validated AI-integrated platforms is pivotal to achieving precision oncology.

PUBMED Cancer: breast cancer Method: unknown

Peptide-drug conjugates bearing an antimitotic Ahx-DA1 payload achieve potent antitumor activity in Her2-amplified and EGFR-positive KRAS-mutant cancers in vivo.

Akash Panja, Pousali Mitra, Iryna Tkachenko, Gary Gellerman
Published 2026-07-01 00:00

This study investigates the efficacy of peptide-drug conjugates (PDCs) utilizing the Ahx-DA1 payload in targeting HER2-amplified and EGFR-positive KRAS-mutant cancers. The research demonstrates that these PDCs exhibit potent cytotoxicity and high target specificity in various cancer cell lines and xenograft models. The findings indicate significant tumor growth inhibition in both HER2+ and EGFR+ KRAS-mutated cancer models, highlighting the potential of Ahx-DA1 as an effective therapeutic agent.

Peptide-drug conjugates (PDCs) represent a targeted cancer therapy strategy that combines tumor-homing peptides with potent cytotoxic payloads, offering a promising alternative to antibody-drug conjugates (ADCs) through improved tissue penetration, synthetic accessibility, and tumor selectivity. Auristatins (MMAE, MMAF, etc.), which are synthetic analogues of antimitotic dolastatin 10 (Dol-10), are widely used as ADC payloads; however, their systematic evaluation in PDC formats remains limited. In this study, we investigated Ahx-DA1, an enzymatically stable derivative of microtubule inhibitor DA1, a previously reported dolastatin-10 analogue, as a payload for PDCs. Two receptor-specific peptides, HER2-targeting peptide A9 and EGFR-binding peptide P6, were conjugated to Ahx-DA1 and evaluated in the HER2-overexpressing breast cancer BT-474 model and the EGFR-overexpressing KRAS-mutated colorectal (HCT116) and pancreatic (PANC1) models, respectively. A cell-based study of DA1-bearing PDCs revealed specific and potent cytotoxicity in cancer cell lines overexpressing the corresponding receptors, demonstrating high target specificity. The DA1-based PDCs exhibited high stability and favorable tolerability profiles across all the tested xenograft models. In vivo studies demonstrated pronounced tumor growth inhibition by A9-DA1 in HER2+ xenograft models and by P6-DA1 in EGFR+ KRAS-mutated colorectal and pancreatic xenograft models. Overall, our findings suggest that Ahx-DA1 is a highly effective auristatin-class payload for the development of DA1-based anticancer PDCs.

PUBMED Cancer: breast cancer Method: multivariable linear regression

Breast area affects the performance of a commercial artificial intelligence algorithm assessment of negative digital breast tomosynthesis exams.

Emily C Barre, Yinhao Ren, Derek L Nguyen, Joseph Y Lo, Lars J Grimm
Published 2026-07-01 00:00

This study investigates the influence of breast area and the number of slices on the performance of an AI algorithm in assessing negative digital breast tomosynthesis (DBT) exams. A retrospective cohort of 4842 women was analyzed to evaluate the relationship between demographic factors, image attributes, and AI-generated risk scores. The findings indicate that breast area positively correlates with both case and risk scores, highlighting its significance in AI assessments of malignancy likelihood.

This study aimed to understand whether cancer-neutral image attributes (breast area and number of slices) impact an AI algorithm's assessment of negative digital breast tomosynthesis (DBT) screening exams. This retrospective cohort study included women from a single institution whose screening mammogram was interpreted as negative between 2016 and 2019. All patients had at least 2 years of follow-up without evidence of malignancy. Primary outcome measures were AI-calculated assessments of present and future likelihood of malignancy, quantified as a case score and a risk score, respectively. A multivariable linear regression model evaluated the relationship between patient demographics (age, race/ethnicity), image size (breast area, number of slices), and AI algorithm outputs (breast density, case score, risk score). There were 4842 female patients included in the study (mean age 55.0 ± 10.6 years). For case score, there was a positive association with breast area (p < 0.0001), as well as older age, breast density (scattered vs fatty), and race (White vs Asian and Black vs White, all p < 0.05). For risk score, there was also a positive association with breast area (p < 0.001), as well as older age, breast density (scattered vs fatty, heterogeneously dense vs scattered, extremely dense vs heterogeneously dense), and race (White vs Asian, all p < 0.05). Number of DBT slices was not significantly associated with either case or risk scores. Known breast cancer risk factors and one neutral characteristic (breast area) significantly impacted an AI algorithm's assessment of present and future likelihood of malignancy.

PUBMED Cancer: unknown Method: single-cell multi-omics

The transformative role of single-cell analysis in multifactorial disorders research.

Chih-Yang Wang, Ching-Chung Ko, Sachin Kumar, Do Thi Minh Xuan, Hui-Ru Lin, Yung-Kuo Lee, Ngoc Uyen Nhi Nguyen, Pei-Ming Yang, Dahlak Daniel Solomon
Published 2026-07-01 00:00

This review discusses the role of single-cell analysis in understanding multifactorial inherited disorders (MIDs), emphasizing its potential to resolve cellular contexts that traditional methods struggle to address. It synthesizes current single-cell methodologies, including transcriptomic and proteomic techniques, and highlights their applications in revealing cell-type-specific regulatory circuits and responses to environmental factors. The authors argue that integrating multi-omics data is essential for advancing biomarker discovery and understanding complex traits, while also acknowledging the challenges that remain in this field.

Multifactorial inherited disorders (MIDs) arise from complex interactions between polygenic risk and environmental exposures, presenting major challenges for mechanistic discovery, patient stratification, and targeted therapy development. While traditional approaches like genome-wide association studies (GWAS) and bulk omics profiling have identified broad associations, they often struggle to resolve the cellular context in which these interactions drive pathogenesis. Emerging single-cell technologies now offer unprecedented resolution to dissect tissue heterogeneity, define rare or transient disease-relevant cell states, and map dynamic trajectories across tissues and disease stages. This review provides a comprehensive synthesis of current single-cell methodologies, including transcriptomic, epigenomic, proteomic, and spatial techniques, and their application to MID research. We explore how these tools are revealing cell-type-specific regulatory circuits, contextualizing the functional impact of inherited risk variants, and elucidating cellular responses to environmental perturbations. We propose that integrating single-cell multi-omics data is critical for illuminating the mechanistic basis of complex traits and for advancing biomarker discovery. However, significant challenges remain, including technical variability, limited cohort scalability, difficulties in multi-modal data integration, and a lack of standardized analytical workflows for polygenic diseases. Overcoming these barriers will require harmonized study designs, robust computational frameworks, and the incorporation of longitudinal and environmental exposure data. Ultimately, we conclude that single-cell analysis is poised to transform MID research, offering a powerful new paradigm for mechanistic insight, therapeutic innovation, and the realization of precision medicine.

PUBMED Cancer: actinic keratosis Method: multimodal learning

Actinic keratosis staging in multimodal image data.

Anna Slian, Katarzyna Korecka, Adriana Polańska, Joanna Czajkowska
Published 2026-07-01 00:00

This study developed a multimodal machine-learning framework to stage Actinic Keratosis (AK) using dermatoscopic and high-frequency ultrasound (HFUS) data. The framework achieved over 80% accuracy in classifying different stages of AK and nearly 90% accuracy in detecting early lesions. The results indicate that combining features from both imaging modalities outperforms single-modality models.

Actinic Keratosis (AK) is a common skin condition, usually appearing on sun-exposed areas, whose progression is associated with characteristic dermatoscopic and structural changes. Early detection of AK is crucial, as cancer progression may occur in changed skin. This study aimed to develop a multimodal, machine-learning-based framework combining dermatoscopic and high-frequency ultrasound (HFUS) data to automatically stage AK and identify early lesions. A dataset containing 222 pairs of dermatoscopic and HFUS images was clinically evaluated using the 3-point Zalaudek scale. Dermatoscopic images underwent ROI selection, hair removal, and extensive feature extraction (color, erythema, pigmentation, vessels, scales, pixel intensities, GLCM/LBP texture). HFUS images were divided into entry echo, sub-epidermal low-echoic band (SLEB), and dermis using a deep neural network, and then features describing the morphology and structure of the skin for each layer were extracted. A pre-trained EfficientNet network was used for feature extraction. Logistic Regression, k-Nearest Neighbors, Random Forests, Support Vector Machines and Multilayer Perceptrons with Sequential Feature Selection using 5-fold patient-wise cross-validation were used for feature-based classification. Additionally, multimodal TwinCNN was evaluated, with various pre-trained models as feature extractors. Combining dermatoscopic and HFUS features consistently outperformed single-modality models. Depending on the defined task, the models achieved over 80% accuracy (healthy, AK1-AK3), 78% (AK1-AK3), and almost 90% in the case of early AK detection vs. healthy and advanced AK on multimodal features. The TwinCNN model performed worse than classical machine-learning approaches, likely due to the limited size of the dataset and class imbalance. A multimodal framework integrating dermatoscopic and HFUS imaging enables accurate AK classification, surpassing single-modality approaches. Future work should expand multicenter datasets, improve automation of pre-processing steps, and explore enhanced neural multimodal fusion architectures.
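
The point of patient-wise cross-validation, as mentioned above, is that all images from one patient stay in the same fold, so no patient appears in both training and test data. The following is a minimal sketch of one way to build such folds; the greedy balancing strategy, the function name, and the patient IDs are assumptions for illustration, since the abstract does not say how the folds were constructed.

```python
from collections import defaultdict


def patient_wise_folds(patient_ids, n_folds=5):
    """Assign sample indices to folds so each patient lands in exactly one fold."""
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing (an assumption): place patients with the most
    # samples first, each into the currently smallest fold.
    for pid in sorted(by_patient, key=lambda p: (-len(by_patient[p]), p)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_patient[pid])
    return folds


# Hypothetical patient IDs, one per image pair.
ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p6", "p6"]
folds = patient_wise_folds(ids, n_folds=5)
for k, test_idx in enumerate(folds):
    train_idx = [i for j, f in enumerate(folds) if j != k for i in f]
    print(k, sorted(test_idx), len(train_idx))
```

Splitting by image rather than by patient could leak near-duplicate lesions into the test fold and inflate accuracy, which is the usual motivation for grouping.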

PUBMED Cancer: unknown Method: deep learning

Artificial intelligence applications in OCT and OCTA for diabetic retinopathy: A systematic review.

Meysam Tavakoli, Esmat Ramezanzadeh
Published 2026-07-01 00:00

This systematic review evaluates the applications of artificial intelligence, particularly deep learning, in the detection and analysis of diabetic retinopathy (DR) using optical coherence tomography (OCT) and OCT angiography (OCTA). The review highlights that deep learning approaches, especially convolutional neural networks and vision transformers, outperform traditional machine learning methods in identifying key DR biomarkers. Despite the promising performance metrics, challenges such as dataset limitations and model interpretability are noted, suggesting areas for future research.

Diabetic retinopathy (DR) is a leading cause of vision impairment worldwide. Optical coherence tomography (OCT) and OCT angiography (OCTA) provide detailed retinal imaging, enabling early detection of microvascular changes. This study aims to systematically review artificial intelligence (AI), particularly deep learning (DL), applications for DR detection and analysis using OCT and OCTA images. A comprehensive literature search was conducted across PubMed, Web of Science, Scopus, IEEE Xplore, and Embase for studies published up to March 2026. A total of 1007 articles were identified, of which 175 studies met the inclusion criteria following the PRISMA study selection process. DL-based approaches consistently demonstrated superior performance compared to traditional machine learning (ML) methods, with reported AUC values typically ranging from 0.90 to 0.99 across classification and segmentation tasks. Convolutional neural networks (CNNs), Vision Transformers (ViTs), and encoder-decoder architectures such as U-Net showed strong performance in detecting key DR biomarkers, including microaneurysms, macular edema, and neovascularization. However, performance variability was observed depending on dataset size, imaging modality, and annotation quality. AI-driven analysis of OCT and OCTA images offers significant potential for automated DR detection. Despite promising results, challenges such as limited public datasets, lack of cross-institutional validation, and model interpretability remain. Future research should focus on multimodal integration, explainable AI, and large-scale validation to enhance clinical applicability.

PUBMED Cancer: hepatocellular carcinoma Method: large language models

Large Language Models for the Differentiation of Benign and Malignant Liver Nodules based on Multimodal Prompts in Liver US Cases.

Sijia Lin, Yu Li, Rushuang Mao, Xuebin Zou, Yixin Hu, Hongsheng Ye, Xiaojun Wu, Liang Yang, Jichong He, Shilin Lu, Lingling Li, Jianhua Zhou
Published 2026-07-01 00:00

This study evaluates the performance of large language models (LLMs) in differentiating benign and malignant liver nodules using multimodal prompts in liver ultrasound cases. The research involved 400 liver ultrasound cases, with a focus on identifying the optimal input for LLMs. Results indicated that the combination of ultrasound images and medical history provided the highest diagnostic accuracy, with LLMs achieving performance comparable to senior radiologists.

Large language models (LLMs) that can process both images and text are increasingly being used in radiology. This study aimed to evaluate the performance of LLMs, including GPT-4 Omni (GPT-4o), Claude-3.5-Sonnet (Claude), and Gemini 1.5 Pro (Gemini), in differentiating benign and malignant nodules in liver US cases and to compare it with that of human readers. Four hundred liver US cases with pathologically confirmed liver nodules visible on B-mode US from January 2020 to November 2024 were randomly selected in this retrospective study. They were divided into a development set (n = 100) and a test set (n = 300). Five prompt groups for LLMs, including US image [I-only], image description [D-only], image and description [I+D], image and liver US e-textbook [I+T], and image and medical history [I+H], were evaluated to identify the optimal input in the development set. In the test set, the accuracy of LLMs in differentiating benign and malignant liver nodules was compared with that of human readers using McNemar's test. In the development set, the prompt group I+H exhibited the highest diagnostic accuracy for all LLMs in differentiating benign and malignant liver nodules and was therefore considered the optimal input (taking GPT-4o as an example, with I-only, 57.0% [as reference]; D-only, 62.0%, p = 0.55; I+D, 62.0%, p = 0.54; I+T, 62.0%, p = 0.36; I+H, 77.0%, p = 0.01). In the test set, LLMs with I+H outperformed the junior group and showed similar accuracy to the senior group (Junior, 70.0% [as reference1]; Senior, 78.3% [as reference2]; GPT-4o, 83.3%, p1 < 0.001, p2 = 0.10; Claude, 77.0%, p1 = 0.04, p2 = 0.72; Gemini, 75.3%, p1 = 0.14, p2 = 0.36). Large language models with US image and medical history inputs achieved accuracy comparable to senior radiologists and superior to junior radiologists in differentiating benign and malignant liver nodules.
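
McNemar's test, used above to compare LLM and human-reader accuracy on the same cases, depends only on the discordant pairs: cases one reader got right and the other got wrong. Below is a minimal sketch of the exact (binomial) form of the test. Whether the authors used the exact or the chi-square variant is not stated, and the counts shown are hypothetical, not the study's data.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: cases reader A classified correctly and reader B incorrectly.
    c: cases reader B classified correctly and reader A incorrectly.
    Under the null hypothesis, each discordant case is equally
    likely to fall in either group (binomial with p = 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(b=9, c=1)
print(f"p = {p_value:.4f}")
```

Cases that both readers classify correctly, or both incorrectly, do not enter the statistic, so two readers can have different accuracies without a significant difference when few cases are discordant.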
