Research Papers

ARXIV Cancer: brain tumor Method: Prior-Guided ROI Reasoning Network

PGR-Net: Prior-Guided ROI Reasoning Network for Brain Tumor MRI Segmentation

Jiacheng Lu, Hui Ding, Shiyu Zhang, Guoping Huo
Published 2026-03-23 06:45

This paper presents PGR-Net, a Prior-Guided ROI Reasoning Network for segmenting brain tumors in MRI scans. The method uses data-driven spatial priors to focus computation on regions of interest (ROIs) and improve localization precision. On BraTS-2019, BraTS-2023, and MSD Task01, PGR-Net outperforms existing segmentation methods with 8.64M parameters, reaching Whole Tumor Dice scores between 89.02% and 91.82%.

Brain tumor MRI segmentation is essential for clinical diagnosis and treatment planning, enabling accurate lesion detection and radiotherapy target delineation. However, tumor lesions occupy only a small fraction of the volumetric space, resulting in severe spatial sparsity, while existing segmentation networks often overlook clinically observed spatial priors of tumor occurrence, leading to redundant feature computation over extensive background regions. To address this issue, we propose PGR-Net (Prior-Guided ROI Reasoning Network), an explicit ROI-aware framework that incorporates a data-driven spatial prior set to capture the distribution and scale characteristics of tumor lesions, providing global guidance for more stable segmentation. Leveraging these priors, PGR-Net introduces a hierarchical Top-K ROI decision mechanism that progressively selects the most confident lesion candidate regions across encoder layers to improve localization precision. We further develop the WinGS-ROI (Windowed Gaussian-Spatial Decay ROI) module, which uses multi-window Gaussian templates with a spatial decay function to produce center-enhanced guidance maps, thus directing feature learning throughout the network. With these ROI features, a windowed RetNet backbone is adopted to enhance localization reliability. Experiments on BraTS-2019/2023 and MSD Task01 show that PGR-Net consistently outperforms existing approaches while using only 8.64M parameters, achieving Dice scores of 89.02%, 91.82%, and 89.67% on the Whole Tumor region. Code is available at https://github.com/CNU-MedAI-Lab/PGR-Net.
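
The two ideas named in this abstract, Top-K selection of candidate regions and a center-enhanced Gaussian guidance map, can be illustrated with a small Python sketch. The function names, the toy scores and centers, the single fixed sigma, and the elementwise-max combination are assumptions made for illustration; this is not the PGR-Net implementation, which the abstract describes as hierarchical and multi-window.

```python
import math

def top_k_rois(scores, k):
    """Return the indices of the k highest-scoring candidate regions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def gaussian_guidance(height, width, center, sigma):
    """2D map that equals 1 at `center` and decays with squared distance."""
    cy, cx = center
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]

# Hypothetical confidence scores and centers for five candidate regions.
scores = [0.10, 0.85, 0.30, 0.92, 0.05]
centers = [(0, 0), (2, 2), (4, 1), (1, 3), (4, 4)]

selected = top_k_rois(scores, k=2)
maps = [gaussian_guidance(5, 5, centers[i], sigma=1.0) for i in selected]

# Combine the per-ROI maps with an elementwise max (an assumed choice).
guidance = [[max(m[y][x] for m in maps) for x in range(5)] for y in range(5)]
```

The resulting `guidance` map peaks at the centers of the two selected regions and falls off elsewhere, so it could be used to weight features toward likely lesion locations.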

ARXIV Cancer: general cancer Method: relational graph convolutional network

SynLeaF: A Dual-Stage Multimodal Fusion Framework for Synthetic Lethality Prediction Across Pan- and Single-Cancer Contexts

Zheming Xing, Siyuan Zhou, Ruinan Wang, Rui Han, Shiming Zhang, Shiqu Chen, Yurui Huang, Jiahao Ma, Yifan Chen, Xuan Wang, Yadong Wang, Junyi Li
Published 2026-03-23 05:55

This study presents SynLeaF, a dual-stage multimodal fusion framework for predicting synthetic lethality in both pan-cancer and single-cancer contexts. The framework fuses four omics data types with a VAE-based cross-encoder and uses a relational graph convolutional network to learn gene representations from biomedical knowledge graphs. A dual-stage training scheme with feature-level knowledge distillation targets modality laziness, and SynLeaF achieves the best performance in 17 of 19 evaluated scenarios.

Accurate prediction of synthetic lethality (SL) is important for guiding the development of cancer drugs and therapies. SL prediction faces significant challenges in the effective fusion of heterogeneous multi-source data. Existing multimodal methods often suffer from "modality laziness" due to disparate convergence speeds, which hinders the exploitation of complementary information. This is also one reason why most existing SL prediction models cannot perform well on both pan-cancer and single-cancer SL pair prediction. In this study, we propose SynLeaF, a dual-stage multimodal fusion framework for SL prediction across pan- and single-cancer contexts. The framework employs a VAE-based cross-encoder with a product-of-experts mechanism to fuse four omics data types (gene expression, mutation, methylation, and CNV), while simultaneously utilizing a relational graph convolutional network to capture structured gene representations from biomedical knowledge graphs. To mitigate modality laziness, SynLeaF introduces a dual-stage training mechanism employing feature-level knowledge distillation with adaptive uni-modal teacher and ensemble strategies. In extensive experiments across eight specific cancer types and a pan-cancer dataset, SynLeaF achieves superior performance in 17 out of 19 scenarios. Ablation studies and gradient analyses further validate the critical contributions of the proposed fusion and distillation mechanisms to model robustness and generalization. To facilitate community use, a web server is available at https://synleaf.bioinformatics-lilab.cn.
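
For readers unfamiliar with product-of-experts fusion, the standard rule for combining independent one-dimensional Gaussian experts is precision weighting. The sketch below shows that textbook rule only; the function name and the toy per-modality values are assumptions, and the abstract does not say whether SynLeaF adds a prior expert or other terms.

```python
def product_of_experts(means, variances):
    """Fuse 1D Gaussian experts: precisions add, means are precision-weighted."""
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    fused_mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    fused_variance = 1.0 / total_precision
    return fused_mean, fused_variance

# Hypothetical latent estimates from four modalities
# (expression, mutation, methylation, CNV), one latent dimension each.
means = [0.0, 2.0, 1.0, 1.0]
variances = [1.0, 1.0, 0.5, 0.5]
fused_mean, fused_variance = product_of_experts(means, variances)
```

Experts with smaller variance (higher confidence) pull the fused mean toward their own estimate, and the fused variance is never larger than the smallest input variance.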

ARXIV Cancer: breast cancer Method: parameter-efficient prompt tuning

Parameter-efficient Prompt Tuning and Hierarchical Textual Guidance for Few-shot Whole Slide Image Classification

Jayanie Bogahawatte, Sachith Seneviratne, Saman Halgamuge
Published 2026-03-23 02:50

This paper presents an approach for few-shot weakly supervised whole slide image classification (FSWC) that combines a parameter-efficient prompt tuning method with a hierarchical textual guidance strategy. The method aims to reduce computational cost while leveraging the pre-trained knowledge of vision-language models (VLMs) and the hierarchical structure of whole slide images. Evaluations on pathology datasets for breast, lung, and ovarian cancers show improvements of up to 10.9%, 7.8%, and 13.8%, respectively, over prior FSWC methods, with 18.1% fewer trainable parameters on the breast and lung datasets and 5.8% fewer on the ovarian dataset.

Whole Slide Images (WSIs) are gigapixel in scale and are typically partitioned into small instances in WSI classification pipelines for computational feasibility. However, obtaining extensive instance-level annotations is costly, making few-shot weakly supervised WSI classification (FSWC) crucial for learning from limited slide-level labels. Recently, pre-trained vision-language models (VLMs) have been adopted in FSWC, yet they exhibit several limitations. Existing prompt tuning methods in FSWC substantially increase both the number of trainable parameters and inference overhead. Moreover, current methods discard instances with low alignment to text embeddings from VLMs, potentially leading to information loss. To address these challenges, we propose two key contributions. First, we introduce a new parameter-efficient prompt tuning method by scaling and shifting features in the text encoder, which significantly reduces the computational cost. Second, to leverage not only the pre-trained knowledge of VLMs, but also the inherent hierarchical structure of WSIs, we introduce a WSI representation learning approach with a soft hierarchical textual guidance strategy without utilizing hard instance filtering. Comprehensive evaluations on pathology datasets covering breast, lung, and ovarian cancer types demonstrate consistent improvements of up to 10.9%, 7.8%, and 13.8%, respectively, over the state-of-the-art methods in FSWC. Our method reduces the number of trainable parameters by 18.1% on both breast and lung cancer datasets, and 5.8% on the ovarian cancer dataset, while also excelling at weakly-supervised tumor localization. Code at https://github.com/Jayanie/HIPSS.
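
The "scaling and shifting features" idea can be sketched as a per-channel affine transform applied to a frozen layer's output, with only the scale and shift vectors trained. The names, the feature dimension of 4, and the identity initialization are assumptions for illustration; where the authors place these transforms inside the text encoder is not stated in the abstract.

```python
def scale_shift(features, gamma, beta):
    """Per-channel affine transform: out[i] = gamma[i] * features[i] + beta[i]."""
    return [g * f + b for f, g, b in zip(features, gamma, beta)]

dim = 4                      # hypothetical feature width
gamma = [1.0] * dim          # scale, initialized to identity (assumed)
beta = [0.0] * dim           # shift, initialized to zero (assumed)

frozen_output = [0.5, -1.2, 3.0, 0.0]   # toy output of a frozen encoder layer
tuned = scale_shift(frozen_output, gamma, beta)

# Only gamma and beta would be trained: 2 * dim parameters per tuned layer.
trainable_params = len(gamma) + len(beta)
```

With identity initialization the tuned model starts out identical to the frozen one, and the number of trainable parameters grows linearly in the feature width rather than quadratically as it would for a full weight matrix.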

ARXIV Cancer: brain tumor Method: vision transformer

Enhancing Brain Tumor Classification Using Vision Transformers with Colormap-Based Feature Representation on BRISC2025 Dataset

Faisal Ahmed
Published 2026-03-22 13:46

This study presents a deep learning framework utilizing Vision Transformers (ViT) with colormap-based feature representation to enhance the classification of brain tumors from MRI scans. The approach aims to improve multi-class classification performance by leveraging transformer architectures to capture long-range dependencies and emphasizing structural variations through color mapping. Experiments on the BRISC2025 dataset demonstrate a classification accuracy of 98.90% and an AUC of 99.97%, indicating the method's effectiveness and potential for clinical applications.

Accurate classification of brain tumors from magnetic resonance imaging (MRI) plays a critical role in early diagnosis and effective treatment planning. In this study, we propose a deep learning framework based on Vision Transformers (ViT) enhanced with colormap-based feature representation to improve multi-class brain tumor classification performance. The proposed approach leverages the ability of transformer architectures to capture long-range dependencies while incorporating color mapping techniques to emphasize important structural and intensity variations within MRI scans. Experiments are conducted on the BRISC2025 dataset, which includes four classes: glioma, meningioma, pituitary tumor, and non-tumor cases. The model is trained and evaluated using standard performance metrics such as accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). The proposed method achieves a classification accuracy of 98.90%, outperforming baseline convolutional neural network models including ResNet50, ResNet101, and EfficientNetB2. In addition, the model demonstrates strong generalization capability with an AUC of 99.97%, indicating high discriminative performance across all classes. These results highlight the effectiveness of combining Vision Transformers with colormap-based feature enhancement for accurate and robust brain tumor classification and suggest strong potential for clinical decision support applications.
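
As a rough illustration of colormap-based representation, a grayscale slice can be mapped to three color channels before being passed to a model that expects RGB input. The abstract does not name the colormap used, so the two-color linear ramp below (blue for low intensity, red for high) and the function name are purely assumptions.

```python
def apply_colormap(gray, low=(0, 0, 255), high=(255, 0, 0)):
    """Map intensities in [0, 1] to RGB by linear interpolation between two colors."""
    colored = []
    for row in gray:
        colored_row = []
        for value in row:
            t = min(max(value, 0.0), 1.0)  # clamp to the valid range
            pixel = tuple(round(lo + (hi - lo) * t) for lo, hi in zip(low, high))
            colored_row.append(pixel)
        colored.append(colored_row)
    return colored

# Toy 1x3 "slice" with normalized intensities.
rgb = apply_colormap([[0.0, 1.0, 1.7]])
```

Out-of-range intensities are clamped, so the third pixel receives the same color as the second.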

ARXIV Cancer: brain tumor Method: deep learning

DGRNet: Disagreement-Guided Refinement for Uncertainty-Aware Brain Tumor Segmentation

Bahram Mohammadi, Yanqiu Wu, Vu Minh Hieu Phan, Sam White, Minh-Son To, Jian Yang, Michael Sheng, Yang Song, Yuankai Qi
Published 2026-03-22 06:56

This paper presents the Disagreement-Guided Refinement Network (DGRNet), a framework for brain tumor segmentation from MRI scans. It addresses two gaps: the lack of reliable uncertainty quantification in single-model predictions and the under-utilization of information in radiology reports. On the TextBraTS dataset, the method improves on the state of the art by 2.4% in Dice and 11% in HD95 while providing uncertainty estimates.

Accurate brain tumor segmentation from MRI scans is critical for diagnosis and treatment planning. Despite the strong performance of recent deep learning approaches, two fundamental limitations remain: (1) the lack of reliable uncertainty quantification in single-model predictions, which is essential for clinical deployment because the level of uncertainty may impact treatment decision-making, and (2) the under-utilization of rich information in radiology reports that can guide segmentation in ambiguous regions. In this paper, we propose the Disagreement-Guided Refinement Network (DGRNet), a novel framework that addresses both limitations through multi-view disagreement-based uncertainty estimation and text-conditioned refinement. DGRNet generates diverse predictions via four lightweight view-specific adapters attached to a shared encoder-decoder, enabling efficient uncertainty quantification within a single forward pass. Afterward, we build disagreement maps to identify regions of high segmentation uncertainty, which are then selectively refined according to clinical reports. Moreover, we introduce a diversity-preserving training strategy that combines pairwise similarity penalties and gradient isolation to prevent view collapse. The experimental results on the TextBraTS dataset show that DGRNet improves on state-of-the-art segmentation accuracy by 2.4% in Dice and 11% in HD95, the two main metrics, while providing meaningful uncertainty estimates.
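
One simple way to turn several view predictions into a disagreement map is the per-pixel variance of their foreground probabilities. The sketch below assumes that definition, along with a hypothetical threshold for selecting pixels to refine; the abstract does not specify how DGRNet computes disagreement.

```python
def disagreement_map(view_probs):
    """Per-pixel variance across views; each view is a 2D list of probabilities."""
    n_views = len(view_probs)
    height, width = len(view_probs[0]), len(view_probs[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [view[y][x] for view in view_probs]
            mean = sum(values) / n_views
            row.append(sum((v - mean) ** 2 for v in values) / n_views)
        result.append(row)
    return result

# Four toy views of a 1x2 image: they agree on pixel 0 and split on pixel 1.
views = [[[0.9, 0.1]], [[0.9, 0.9]], [[0.9, 0.1]], [[0.9, 0.9]]]
disagreement = disagreement_map(views)

# Flag pixels whose variance exceeds a hypothetical threshold for refinement.
threshold = 0.05
refine_mask = [[value > threshold for value in row] for row in disagreement]
```

Only the second pixel is flagged, which matches the intuition that refinement effort should go where the views contradict each other.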

ARXIV Cancer: brain cancer Method: text-modulated soft cascade architecture

Hierarchical Text-Guided Brain Tumor Segmentation via Sub-Region-Aware Prompts

Bahram Mohammadi, Ta Duc Huy, Afrouz Sheikholeslami, Qi Chen, Vu Minh Hieu Phan, Sam White, Minh-Son To, Xuyun Zhang, Amin Beheshti, Luping Zhou, Yuankai Qi
Published 2026-03-22 06:45

This paper presents TextCSP, a framework for brain tumor segmentation that integrates radiological descriptions with imaging data. The method predicts tumor sub-regions hierarchically in a coarse-to-fine manner, using specialized text representations for each sub-region. On the TextBraTS dataset, it improves on state-of-the-art methods by 1.7% in Dice and 6% in HD95.

Brain tumor segmentation remains challenging because the three standard sub-regions, i.e., whole tumor (WT), tumor core (TC), and enhancing tumor (ET), often exhibit ambiguous visual boundaries. Integrating radiological description texts with imaging has shown promise. However, most multimodal approaches compress a report into a single global text embedding shared across all sub-regions, overlooking their distinct clinical characteristics. We propose TextCSP (text-modulated soft cascade architecture), a hierarchical text-guided framework that builds on the TextBraTS baseline with three novel components: (1) a text-modulated soft cascade decoder that predicts WT->TC->ET in a coarse-to-fine manner consistent with their anatomical containment hierarchy; (2) sub-region-aware prompt tuning, which uses learnable soft prompts with a LoRA-adapted BioBERT encoder to generate specialized text representations tailored for each sub-region; (3) text-semantic channel modulators that convert the aforementioned representations into channel-wise refinement signals, enabling the decoder to emphasize features aligned with clinically described patterns. Experiments on the TextBraTS dataset demonstrate consistent improvements across all sub-regions against state-of-the-art methods by 1.7% and 6% on the main metrics Dice and HD95.
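
The containment hierarchy (ET inside TC inside WT) can be respected by gating each finer prediction with the coarser one. The sketch below is one assumed reading of a "soft cascade" for a single voxel: each finer probability is the coarser probability multiplied by a sigmoid gate. The function names and logit values are hypothetical, and the paper's decoder may differ.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def soft_cascade(logit_wt, logit_tc, logit_et):
    """Gate each finer sub-region by the coarser one: WT -> TC -> ET."""
    p_wt = sigmoid(logit_wt)
    p_tc = p_wt * sigmoid(logit_tc)   # TC cannot exceed WT
    p_et = p_tc * sigmoid(logit_et)   # ET cannot exceed TC
    return p_wt, p_tc, p_et

# Toy logits for one voxel.
p_wt, p_tc, p_et = soft_cascade(2.0, 0.5, -1.0)
```

Because each gate lies strictly between 0 and 1, the ordering p_et <= p_tc <= p_wt holds for any logits, so the output never places enhancing tumor outside the tumor core.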

ARXIV Cancer: breast cancer Method: gradient boosting

First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution

Drake Caraker, Bryan Arnold, David Rhoads
Published 2026-03-22 02:59

This paper investigates first-mover bias in gradient boosting, a concentration of feature importance caused by sequential residual fitting. The authors propose a method called DASH to mitigate this bias and evaluate it on the Breast Cancer dataset, where stability of feature importance rankings rises from 0.32 to 0.93 against a tree-count-matched baseline. They also introduce two diagnostic tools that detect first-mover bias without ground truth.

We isolate and empirically characterize first-mover bias -- a path-dependent concentration of feature importance caused by sequential residual fitting in gradient boosting -- as a specific mechanistic cause of the well-known instability of SHAP-based feature rankings under multicollinearity. When correlated features compete for early splits, gradient boosting creates a self-reinforcing advantage for whichever feature is selected first: subsequent trees inherit modified residuals that favor the incumbent, concentrating SHAP importance on an arbitrary feature rather than distributing it across the correlated group. Scaling up a single model amplifies this effect -- a Large Single Model with the same total tree count as our method produces the worst explanations of any approach tested. We demonstrate that model independence is sufficient to resolve first-mover bias in the linear regime, and remains the most effective mitigation under nonlinear data-generating processes. Both our proposed method, DASH (Diversified Aggregation of SHAP), and simple seed-averaging (Stochastic Retrain) restore stability by breaking the sequential dependency chain, confirming that the operative mechanism is independence between explained models. At rho=0.9, both achieve stability=0.977, while the single-best workflow degrades to 0.958 and the Large Single Model to 0.938. On the Breast Cancer dataset, DASH improves stability from 0.32 to 0.93 (+0.61) against a tree-count-matched baseline. DASH additionally provides two diagnostic tools -- the Feature Stability Index (FSI) and Importance-Stability (IS) Plot -- that detect first-mover bias without ground truth, enabling practitioners to audit explanation reliability before acting on feature rankings. Software and reproducible benchmarks are available at https://github.com/DrakeCaraker/dash-shap.
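
The effect of averaging explanations over independently trained models can be shown with toy numbers. The sketch below illustrates plain seed-averaging (the abstract's Stochastic Retrain), not DASH itself; the importance vectors are invented values standing in for mean absolute SHAP importances from two models trained with different seeds, where features 0 and 1 are assumed to be strongly correlated.

```python
def average_importance(per_model_importance):
    """Average feature-importance vectors across independently trained models."""
    n_models = len(per_model_importance)
    return [sum(column) / n_models for column in zip(*per_model_importance)]

# Invented importances for three features from two seeds. Each model
# concentrates credit on whichever correlated feature it happened to split on first.
model_a = [0.9, 0.1, 0.3]
model_b = [0.1, 0.9, 0.3]

averaged = average_importance([model_a, model_b])
```

In this toy case the averaged vector assigns equal importance to the two correlated features, whereas either single model ranks one of them far above the other.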

ARXIV Cancer: skin cancer Method: vision-language model

SkinCLIP-VL: Consistency-Aware Vision-Language Learning for Multimodal Skin Cancer Diagnosis

Zhixiang Lu, Shijie Xu, Kaicheng Yan, Xuyue Cai, Chong Zhang, Yulong Li, Angelos Stefanidis, Anh Nguyen, Jionglong Su
Published 2026-03-22 02:07

This paper introduces SkinCLIP-VL, a resource-efficient framework for skin cancer diagnosis using vision-language models. The method integrates a frozen CLIP encoder with a lightweight, quantized Qwen2.5-VL model adapted via LoRA, and trains it with a Consistency-aware Focal Alignment (CFA) Loss. On the ISIC and Derm7pt benchmarks, SkinCLIP-VL surpasses 13B-parameter baselines by 4.3-6.2% in accuracy with 43% fewer parameters.

The deployment of vision-language models (VLMs) in dermatology is hindered by the trilemma of high computational costs, extreme data scarcity, and the black-box nature of deep learning. To address these challenges, we present SkinCLIP-VL, a resource-efficient framework that adapts foundation models for trustworthy skin cancer diagnosis. Adopting a frozen perception, adaptive reasoning paradigm, we integrate a frozen CLIP encoder with a lightweight, quantized Qwen2.5-VL via low-rank adaptation (LoRA). To strictly align visual regions with clinical semantics under long-tailed distributions, we propose the Consistency-aware Focal Alignment (CFA) Loss. This objective synergizes focal re-weighting, semantic alignment, and calibration. On ISIC and Derm7pt benchmarks, SkinCLIP-VL surpasses 13B-parameter baselines by 4.3-6.2% in accuracy with 43% fewer parameters. Crucially, blinded expert evaluation and out-of-distribution testing confirm that our visually grounded rationales significantly enhance clinical trust compared to traditional saliency maps.
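
Of the three ingredients the CFA Loss is said to combine, focal re-weighting has a well-known standard form: the cross-entropy term is scaled by (1 - p)^gamma so that confidently classified samples contribute less. The sketch below shows only that standard focal term for a single sample; the alignment and calibration terms are not specified in the abstract and are omitted, and gamma = 2 is an assumed value.

```python
import math

def focal_loss(p_true, gamma=2.0):
    """Standard focal term: -(1 - p)^gamma * log(p), with p the true-class probability."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

easy = focal_loss(0.9)   # well-classified sample, strongly down-weighted
hard = focal_loss(0.1)   # misclassified sample, keeps most of its loss
```

Setting gamma to 0 recovers ordinary cross-entropy, which makes the re-weighting easy to ablate under long-tailed class distributions.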

ARXIV Cancer: general cancer Method: multiple-instance learning

GOLDMARK: Governed Outcome-Linked Diagnostic Model Assessment Reference Kit

Chad Vanderbilt, Gabriele Campanella, Siddharth Singi, Swaraj Nanda, Jie-Fu Chen, Ali Kamali, Amir Momeni Boroujeni, David Kim, Mohamed Yakoub, Jamal Benhamida, Meera Hameed, Neeraj Kumar, Gregory Goldgof
Published 2026-03-21 15:09

The paper presents GOLDMARK, a standardized benchmarking framework for computational biomarkers derived from histopathology images. It builds on slide-level multiple-instance learning with pathology foundation models and releases structured intermediate representations, predefined splits, trained models, and evaluation outputs. Across 33 tumor-biomarker tasks, mean AUROC was 0.689 on TCGA and 0.630 on MSKCC. GOLDMARK aims to improve reproducibility and comparability in computational pathology research.

Computational biomarkers (CBs) are histopathology-derived patterns extracted from hematoxylin-eosin (H&E) whole-slide images (WSIs) using artificial intelligence (AI) to predict therapeutic response or prognosis. Recently, slide-level multiple-instance learning (MIL) with pathology foundation models (PFMs) has become the standard baseline for CB development. While these methods have improved predictive performance, computational pathology lacks standardized intermediate data formats, provenance tracking, checkpointing conventions, and reproducible evaluation metrics required for clinical-grade deployment. We introduce GOLDMARK (https://artificialintelligencepathology.org), a standardized benchmarking framework built on a curated TCGA cohort with clinically actionable OncoKB level 1-3 biomarker labels. GOLDMARK releases structured intermediate representations, including tile coordinate maps, per-slide feature embeddings from canonical PFMs, quality-control metadata, predefined patient-level splits, trained slide-level models, and evaluation outputs. Models are trained on TCGA and evaluated on an independent MSKCC cohort with reciprocal testing. Across 33 tumor-biomarker tasks, mean AUROC was 0.689 (TCGA) and 0.630 (MSKCC). Restricting to the eight highest-performing tasks yielded mean AUROCs of 0.831 and 0.801, respectively. These tasks correspond to established morphologic-genomic associations (e.g., LGG IDH1, COAD MSI/BRAF, THCA BRAF/NRAS, BLCA FGFR3, UCEC PTEN) and showed the most stable cross-site performance. Differences between canonical encoders were modest relative to task-specific variability. GOLDMARK establishes a shared experimental substrate for computational pathology, enabling reproducible benchmarking and direct comparison of methods across datasets and models.
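
The headline metric here is AUROC averaged over tasks. As a reminder of what is being averaged, AUROC equals the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as half. The sketch below computes it that way; the task names and scores are invented and do not come from GOLDMARK.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Invented per-task predictions: (labels, scores).
tasks = {
    "task_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "task_b": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}
per_task = {name: auroc(y, s) for name, (y, s) in tasks.items()}
mean_auroc = sum(per_task.values()) / len(per_task)
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; rank-based implementations are preferable for large cohorts.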

ARXIV Cancer: colorectal cancer Method: large language model

Deep reflective reasoning in interdependence constrained structured data extraction from clinical notes for digital health

Jingwei Huang, Kuroush Nezafati, Zhikai Chi, Ruichen Rong, Colin Treager, Tingyi Wanyan, Yueshuang Xu, Xiaowei Zhan, Patrick Leavey, Guanghua Xiao, Wenqi Shi, Yang Xie
Published 2026-03-20 19:05

This paper presents deep reflective reasoning, a framework for extracting structured information from clinical notes when variables are interdependent. A large language model agent iteratively critiques and revises its outputs to keep them consistent with each other, the input text, and retrieved domain knowledge. The framework was evaluated in three oncology applications (colorectal cancer synoptic reporting, Ewing sarcoma CD99 immunostaining pattern identification, and lung cancer tumor staging) and improved extraction accuracy in each.

Extracting structured information from clinical notes requires navigating a dense web of interdependent variables where the value of one attribute logically constrains others. Existing Large Language Model (LLM)-based extraction pipelines often struggle to capture these dependencies, leading to clinically inconsistent outputs. We propose deep reflective reasoning, a large language model agent framework that iteratively self-critiques and revises structured outputs by checking consistency among variables, the input text, and retrieved domain knowledge, stopping when outputs converge. We extensively evaluate the proposed method in three diverse oncology applications: (1) On colorectal cancer synoptic reporting from gross descriptions (n=217), reflective reasoning improved average F1 across eight categorical synoptic variables from 0.828 to 0.911 and increased mean correct rate across four numeric variables from 0.806 to 0.895; (2) On Ewing sarcoma CD99 immunostaining pattern identification (n=200), the accuracy improved from 0.870 to 0.927; (3) On lung cancer tumor staging (n=100), tumor stage accuracy improved from 0.680 to 0.833 (pT: 0.842 -> 0.884; pN: 0.885 -> 0.948). The results demonstrate that deep reflective reasoning can systematically improve the reliability of LLM-based structured data extraction under interdependence constraints, enabling more consistent machine-operable clinical datasets and facilitating knowledge discovery with machine learning and data science towards digital health.
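
The critique-and-revise loop with a convergence stop can be sketched without any model calls. In the sketch below, `critique` and `revise` are hypothetical stand-ins for the LLM steps, and the single consistency rule (no tumor size when no tumor is present) is a toy example rather than a rule taken from the paper.

```python
def reflective_extract(draft, critique, revise, max_rounds=5):
    """Repeat critique and revision until the output stops changing."""
    current = dict(draft)
    for _ in range(max_rounds):
        issues = critique(current)
        revised = revise(current, issues)
        if revised == current:   # converged: nothing left to change
            break
        current = revised
    return current

def critique(record):
    """Toy consistency check between two interdependent variables."""
    issues = []
    if record.get("tumor_present") is False and record.get("tumor_size_cm") is not None:
        issues.append("tumor_size_cm")
    return issues

def revise(record, issues):
    """Toy revision: clear every field flagged by the critique."""
    fixed = dict(record)
    for field in issues:
        fixed[field] = None
    return fixed

draft = {"tumor_present": False, "tumor_size_cm": 2.5}
result = reflective_extract(draft, critique, revise)
```

The `max_rounds` cap is an assumed safeguard so the loop terminates even if revisions keep oscillating.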
