Research Papers

ARXIV Cancer: brain cancer Method: generative adversarial network

Brain MR Image Synthesis with 3D Multi-Contrast Self-Attention GAN

Zaid A. Abod, Furqan Aziz
Published 2026-03-31 13:43

This study presents 3D-MC-SAGAN, a 3D Multi-Contrast Self-Attention Generative Adversarial Network designed to synthesize high-quality missing MRI modalities from a single T2-weighted input. The method incorporates a multi-scale encoder-decoder architecture and a Memory-Bounded Hybrid Attention block to efficiently capture long-range dependencies while preserving tumor characteristics. Experimental results indicate that 3D-MC-SAGAN achieves state-of-the-art performance in generating anatomically plausible MRI contrasts and maintains tumor segmentation accuracy comparable to fully acquired multi-modal inputs.

Complete and high-quality multi-modal Magnetic Resonance Imaging (MRI) is essential for accurate neuro-oncological assessment, as each contrast provides complementary anatomical and pathological information. However, acquiring all modalities (e.g., T1c, T1n, T2w, T2f) for every patient is often impractical due to prolonged scan times, cost, and patient discomfort, potentially limiting comprehensive tumour evaluation. We propose 3D-MC-SAGAN (3D Multi-Contrast Self-Attention Generative Adversarial Network), a unified 3D multi-contrast synthesis framework that generates high-fidelity missing modalities from a single T2w input while explicitly preserving tumour characteristics. The model employs a multi-scale 3D encoder-decoder generator with residual connections and a novel Memory-Bounded Hybrid Attention (MBHA) block to capture long-range dependencies efficiently, and is trained with a WGAN-GP critic and an auxiliary domain classification head to produce T2f, T1n, and T1c volumes within a unified network. To ensure anatomical and pathological fidelity, we incorporate a frozen 3D U-Net-based segmentation network that enforces a tumour-consistency constraint during training. A composite objective combining adversarial, reconstruction, perceptual, structural similarity, contrast-classification, and segmentation-guided losses further promotes both global realism and tumour-preserving structure. Extensive experiments on 3D brain MRI datasets demonstrate that 3D-MC-SAGAN achieves state-of-the-art quantitative performance and produces visually coherent, anatomically plausible contrasts with improved distributional realism. Importantly, the proposed method maintains tumour segmentation accuracy comparable to that achieved using fully acquired multi-modal inputs, highlighting its potential to reduce acquisition burden while preserving clinically meaningful information.
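
The abstract names a WGAN-GP critic and six loss terms but gives no formulas. As a rough sketch, the critic loss below is the standard WGAN-GP form, and the generator objective is written as a plain weighted sum of the six named terms. The weights (the lambda symbols) and the weighted-sum form are assumptions for illustration, not values or structure taken from the paper. Here x is a real volume, x-tilde a synthesized volume, and x-hat a random interpolation between the two.

```latex
% Critic loss with gradient penalty (standard WGAN-GP form)
\mathcal{L}_{D} = \mathbb{E}_{\tilde{x}}\left[D(\tilde{x})\right]
                - \mathbb{E}_{x}\left[D(x)\right]
                + \lambda_{\mathrm{gp}}\,\mathbb{E}_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right]

% Generator objective: assumed weighted sum of the six terms listed in the abstract
\mathcal{L}_{G} = \lambda_{\mathrm{adv}}\mathcal{L}_{\mathrm{adv}}
                + \lambda_{\mathrm{rec}}\mathcal{L}_{\mathrm{rec}}
                + \lambda_{\mathrm{perc}}\mathcal{L}_{\mathrm{perc}}
                + \lambda_{\mathrm{ssim}}\mathcal{L}_{\mathrm{ssim}}
                + \lambda_{\mathrm{cls}}\mathcal{L}_{\mathrm{cls}}
                + \lambda_{\mathrm{seg}}\mathcal{L}_{\mathrm{seg}}
```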

ARXIV Cancer: brain cancer Method: generative adversarial network

Brain MR Image Synthesis with Multi-contrast Self-attention GAN

Zaid A. Abod, Furqan Aziz
Published 2026-03-31 13:43

This study presents 3D-MC-SAGAN, a generative adversarial network designed to synthesize missing multi-modal MRI images from a single T2 input for neuro-oncological assessment. The model employs a multi-scale 3D encoder-decoder architecture and incorporates a segmentation-consistency constraint to maintain lesion morphology. Evaluation on brain MRI datasets shows that the method achieves state-of-the-art performance in generating high-fidelity images while preserving tumor characteristics.

Accurate and complete multi-modal Magnetic Resonance Imaging (MRI) is essential for neuro-oncological assessment, as each contrast provides complementary anatomical and pathological information. However, acquiring all modalities (e.g., T1c, T1n, T2, T2f) for every patient is often impractical due to time, cost, and patient discomfort, potentially limiting comprehensive tumour evaluation. We propose 3D-MC-SAGAN (3D Multi-Contrast Self-Attention Generative Adversarial Network), a unified 3D multi-contrast synthesis framework that generates high-fidelity missing modalities from a single T2 input while explicitly preserving tumour characteristics. The model employs a multi-scale 3D encoder-decoder generator with residual connections and a novel Memory-Bounded Hybrid Attention (MBHA) block to capture long-range dependencies efficiently, and is trained with a WGAN-GP critic and an auxiliary contrast-conditioning branch to produce T2f, T1n, and T1c volumes within a single unified network. A frozen 3D U-Net-based segmentation module introduces a segmentation-consistency constraint to preserve lesion morphology. The composite objective integrates adversarial, reconstruction, perceptual, structural similarity, contrast-classification, and segmentation-guided losses to align global realism with tumour-preserving structure. Extensive evaluation on 3D brain MRI datasets demonstrates that 3D-MC-SAGAN achieves state-of-the-art quantitative performance and generates visually coherent, anatomically plausible contrasts with improved distribution-level realism. Moreover, it maintains tumour segmentation accuracy comparable to that of fully acquired multi-modal inputs, highlighting its potential to reduce acquisition burden while preserving clinically meaningful information.

ARXIV Cancer: melanoma Method: UNet

Exploring the Impact of Skin Color on Skin Lesion Segmentation

Kuniko Paxton, Medina Kapo, Amila Akagić, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos
Published 2026-03-31 12:49

This study investigates the impact of skin color on the segmentation of skin lesions in melanoma, emphasizing the importance of early detection. The authors evaluate three advanced segmentation architectures on public dermoscopic datasets and introduce a novel continuous pigment analysis. The results indicate that low lesion-skin contrast is linked to increased segmentation errors, suggesting that addressing contrast issues is crucial for improving segmentation performance.

Skin cancer, particularly melanoma, remains a major cause of morbidity and mortality, making early detection critical. AI-driven dermatology systems often rely on skin lesion segmentation as a preprocessing step to delineate the lesion from surrounding skin and support downstream analysis. While fairness concerns regarding skin tone have been widely studied for lesion classification, the influence of skin tone on the segmentation stage remains under-quantified and is frequently assessed using coarse, discrete skin tone categories. In this work, we evaluate three strong segmentation architectures (UNet, DeepLabV3 with a ResNet50 backbone, and DINOv2) on two public dermoscopic datasets (HAM10000 and ISIC2017) and introduce a continuous pigment or contrast analysis that treats pixel-wise ITA values as distributions. Using Wasserstein distances between within-image distributions for skin-only, lesion-only, and whole-image regions, we quantify lesion-skin contrast and relate it to segmentation performance across multiple metrics. Within the range represented in these datasets, global skin tone metrics (Fitzpatrick grouping or mean ITA) show weak association with segmentation quality. In contrast, low lesion-skin contrast is consistently associated with larger segmentation errors across models, indicating that boundary ambiguity and low contrast are key drivers of failure. These findings suggest that fairness improvements in dermoscopic segmentation should prioritize robust handling of low-contrast lesions, and that distribution-based pigment measures provide a more informative audit signal than discrete skin-tone categories.
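
The contrast measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. It assumes pixels are already converted to CIELAB, uses the conventional Individual Typology Angle definition, ITA = arctan((L* - 50) / b*) in degrees, and computes a one-dimensional Wasserstein-1 distance between two empirical samples. The function names and the sample pixel values are hypothetical.

```python
import math

def ita_degrees(l_star, b_star):
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    atan2 matches arctan((L* - 50) / b*) for b* > 0, the usual range
    for skin, and stays defined when b* is zero.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))

def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two 1D empirical samples.

    Integrates |F_x - F_y| over the merged support, so the two samples
    may have different sizes (e.g. lesion vs. skin pixel counts).
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total, i, j = 0.0, 0, 0
    for left, right in zip(points, points[1:]):
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total

# Hypothetical (L*, b*) pixels for the skin-only and lesion-only regions.
skin_pixels = [(70.0, 15.0), (68.0, 16.0), (72.0, 14.0)]
lesion_pixels = [(40.0, 12.0), (45.0, 10.0)]

skin_ita = [ita_degrees(l, b) for l, b in skin_pixels]
lesion_ita = [ita_degrees(l, b) for l, b in lesion_pixels]
contrast = wasserstein_1d(skin_ita, lesion_ita)  # larger means higher lesion-skin contrast
```

A small distance between the two ITA distributions corresponds to the low-contrast case that the abstract links to larger segmentation errors.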

ARXIV Cancer: head and neck squamous cell carcinoma Method: markerless surface registration

All-in-One Augmented Reality Guided Head and Neck Tumor Resection

Yue Yang, Matthieu Chabanas, Carrie Reale, Annie Benson, Jason Slagle, Matthew Weinger, Michael Topf, Jie Ying Wu
Published 2026-03-31 09:38

This study presents an augmented reality (AR) system designed to improve the precision of intraoperative re-resection in head and neck squamous cell carcinoma by relocalizing positive margins. The system utilizes HoloLens 2 for depth sensing and automated markerless surface registration. Results from a silicone phantom study indicate that AR guidance significantly reduces localization errors compared to traditional verbal guidance. The findings suggest that markerless AR can enhance the accuracy of surgical procedures involving positive margins.

Positive margins are common in head and neck squamous cell carcinoma, yet intraoperative re-resection is often imprecise because margin locations are typically communicated verbally from pathology. We present an all-in-one augmented reality (AR) system that relocalizes positive margins from a resected specimen to the resection bed and visualizes them in situ using HoloLens 2 depth sensing and fully automated markerless surface registration. In a silicone phantom study with six medical trainees, markerless registration achieved target registration errors comparable to a marker-based baseline (median 1.8 mm vs. 1.7 mm; maximum < 4 mm). In a margin relocalization task, AR guidance reduced error from verbal guidance (median 14.2 mm) to a few millimeters (median 3.2 mm), with all AR localizations within 5 mm error. These results support the feasibility of markerless AR margin guidance for more precise intraoperative re-excision.
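
For readers unfamiliar with the metric, target registration error (TRE) is the distance between where registration places a target point and where that point truly lies. The sketch below shows how per-target errors and their median could be computed. The coordinates are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
import math
from statistics import median

def target_registration_errors(registered, reference):
    """Euclidean distance (same units as the inputs) for each target pair."""
    if len(registered) != len(reference):
        raise ValueError("point lists must have the same length")
    return [math.dist(p, q) for p, q in zip(registered, reference)]

# Illustrative 3D target positions in millimetres (not study data).
registered_targets = [(10.0, 20.0, 5.0), (31.0, 12.0, 8.0), (15.5, 40.0, 2.0)]
reference_targets = [(11.0, 20.0, 5.0), (31.0, 14.0, 8.0), (15.5, 40.0, 5.0)]

errors = target_registration_errors(registered_targets, reference_targets)
median_tre = median(errors)  # summary statistic of the kind reported in the abstract
max_tre = max(errors)
```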

ARXIV Cancer: cholangiocarcinoma Method: deep learning

NeoNet: An End-to-End 3D MRI-Based Deep Learning Framework for Non-Invasive Prediction of Perineural Invasion via Generation-Driven Classification

Youngung Han, Minkyung Cha, Kyeonghun Kim, Induk Um, Myeongbin Sho, Joo Young Bae, Jaewon Jung, Jung Hyeok Park, Seojun Lee, Nam-Joon Kim, Woo Kyoung Jeong, Won Jae Lee, Pa Hong, Ken Ying-Kai Liao, Hyuk-Jae Lee
Published 2026-03-31 08:51

The study introduces NeoNet, a comprehensive 3D deep learning framework aimed at non-invasive prediction of perineural invasion (PNI) in cholangiocarcinoma. The framework comprises three modules: NeoSeg for tumor localization, NeoGen for generating synthetic image patches, and NeoCls for final prediction using a specialized attention network. In a 5-fold cross-validation, NeoNet demonstrated superior performance compared to baseline models, achieving a maximum AUC of 0.7903.

Minimizing invasive diagnostic procedures to reduce the risk of patient injury and infection is a central goal in medical imaging. Yet noninvasive diagnosis of perineural invasion (PNI), a critical prognostic factor involving infiltration of tumor cells along the surrounding nerve, remains challenging because clear and consistent imaging criteria for identifying PNI are lacking. To address this challenge, we present NeoNet, an integrated end-to-end 3D deep learning framework for PNI prediction in cholangiocarcinoma that does not rely on predefined image features. NeoNet integrates three modules: (1) NeoSeg, utilizing a Tumor-Localized ROI Crop (TLCR) algorithm; (2) NeoGen, a 3D Latent Diffusion Model (LDM) with ControlNet, conditioned on anatomical masks to generate synthetic image patches, specifically balancing the dataset to a 1:1 ratio; and (3) NeoCls, the final prediction module. For NeoCls, we developed the PNI-Attention Network (PattenNet), which uses the frozen LDM encoder and specialized 3D Dual Attention Blocks (DAB) designed to detect subtle intensity variations and spatial patterns indicative of PNI. In 5-fold cross-validation, NeoNet outperformed baseline 3D models and achieved the highest performance with a maximum AUC of 0.7903.
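
The headline figure is a maximum AUC across cross-validation folds. As a hedged illustration of what that statistic means, the sketch below computes AUC through its pairwise-ranking interpretation (the fraction of positive/negative pairs in which the positive case scores higher, ties counted as half) and takes the maximum over folds. The labels and scores are made up, and the helper name is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative per-fold (labels, predicted PNI scores); not data from the paper.
folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2]),
]
fold_aucs = [roc_auc(y, s) for y, s in folds]
best_auc = max(fold_aucs)
mean_auc = sum(fold_aucs) / len(fold_aucs)
```

Reporting the mean across folds alongside the maximum gives a less optimistic picture, since the maximum reflects only the best-performing split.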

ARXIV Cancer: oncology Method: large language models

AI-Generated Prior Authorization Letters: Strong Clinical Content, Weak Administrative Scaffolding

Moiz Sadiq Awan, Maryam Raza
Published 2026-03-31 07:40

This study evaluates the effectiveness of three large language models (LLMs) in generating prior authorization letters across various medical specialties, including oncology. The models produced letters with strong clinical content but revealed significant gaps in meeting real-world administrative requirements. The findings highlight the need for improved administrative support systems to complement the clinical capabilities of LLMs.

Prior authorization remains one of the most burdensome administrative processes in U.S. healthcare, consuming billions of dollars and thousands of physician hours each year. While large language models have shown promise across clinical text tasks, their ability to produce submission-ready prior authorization letters has received only limited attention, with existing work confined to single-case demonstrations rather than structured multi-scenario evaluation. We assessed three commercially available LLMs (GPT-4o, Claude Sonnet 4.5, and Gemini 2.5 Pro) across 45 physician-validated synthetic scenarios spanning rheumatology, psychiatry, oncology, cardiology, and orthopedics. All three models generated letters with strong clinical content: accurate diagnoses, well-structured medical necessity arguments, and thorough step therapy documentation. However, a secondary analysis of real-world administrative requirements revealed consistent gaps that clinical scoring alone did not capture, including absent billing codes, missing authorization duration requests, and inadequate follow-up plans. These findings reframe the question: the challenge for clinical deployment is not whether LLMs can write clinically adequate letters, but whether the systems built around them can supply the administrative precision that payer workflows require.

ARXIV Cancer: unknown Method: adversarial fine-tuning

Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning

Bilgehan Sel, Xuanli He, Alwin Peng, Ming Jin, Jerry Wei
Published 2026-03-30 22:10

The paper presents Trojan-Speak, an adversarial fine-tuning method designed to bypass safety measures in AI models, specifically targeting Anthropic's Constitutional Classifiers. The method employs curriculum learning and GRPO-based hybrid reinforcement learning to teach models a communication protocol that successfully evades content classification. Notably, Trojan-Speak achieves over 99% classifier evasion with minimal degradation in reasoning performance, highlighting the limitations of current LLM-based content classifiers in preventing information disclosure.

Fine-tuning APIs offered by major AI providers create new attack surfaces where adversaries can bypass safety measures through targeted fine-tuning. We introduce Trojan-Speak, an adversarial fine-tuning method that bypasses Anthropic's Constitutional Classifiers. Our approach uses curriculum learning combined with GRPO-based hybrid reinforcement learning to teach models a communication protocol that evades LLM-based content classification. Crucially, while prior adversarial fine-tuning approaches report more than 25% capability degradation on reasoning benchmarks, Trojan-Speak incurs less than 5% degradation while achieving 99+% classifier evasion for models with 14B+ parameters. We demonstrate that fine-tuned models can provide detailed responses to expert-level CBRN (Chemical, Biological, Radiological, and Nuclear) queries from Anthropic's Constitutional Classifiers bug-bounty program. Our findings reveal that LLM-based content classifiers alone are insufficient for preventing dangerous information disclosure when adversaries have fine-tuning access, and we show that activation-level probes can substantially improve robustness to such attacks.

ARXIV Cancer: general cancer Method: contrastive learning

ChemCLIP: Bridging Organic and Inorganic Anticancer Compounds Through Contrastive Learning

Mohamad Koohi-Moghadam, Hongzhe Sun, Hongyan Li, Kyongtae Tyler Bae
Published 2026-03-30 15:28

The paper presents ChemCLIP, a dual-encoder contrastive learning framework designed to bridge the gap between organic and inorganic anticancer compounds. By utilizing a dataset of 44,854 organic compounds and 5,164 metal complexes across 60 cancer cell lines, the method learns unified representations based on shared anticancer activities. The study evaluates various molecular encoding strategies, finding that Morgan fingerprints provide the best performance in aligning and classifying these compounds.

The discovery of anticancer therapeutics has traditionally treated organic small molecules and metal-based coordination complexes as separate chemical domains, limiting knowledge transfer despite their shared biological objectives. This disparity is particularly pronounced in available data, with extensive screening databases for organic compounds compared to only a few thousand characterized metal complexes. Here, we introduce ChemCLIP, a dual-encoder contrastive learning framework that bridges this organic-inorganic divide by learning unified representations based on shared anticancer activities rather than structural similarity. We compiled complementary datasets comprising 44,854 unique organic compounds and 5,164 unique metal complexes, standardized across 60 cancer cell lines. By training parallel encoders with activity-aware hard negative mining, we mapped structurally distinct compounds into a shared 256-dimensional embedding space where biologically similar compounds cluster together regardless of chemical class. We systematically evaluated four molecular encoding strategies: Morgan fingerprints, ChemBERTa, MolFormer, and Chemprop, through quantitative alignment metrics, embedding visualizations, and downstream classification tasks. Morgan fingerprints achieved superior performance with an average alignment ratio of 0.899 and downstream classification AUCs of 0.859 (inorganic) and 0.817 (organic). This work establishes contrastive learning as an effective strategy for unifying disparate chemical domains and provides empirical guidance for encoder selection in multi-modal chemistry applications, with implications extending beyond anticancer drug discovery to any scenario requiring cross-domain chemical knowledge transfer.
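
The abstract does not state the exact training loss. Given the dual-encoder design and the CLIP-style name, one plausible form is a symmetric InfoNCE objective over paired embeddings, sketched below in plain Python. This is an assumption for illustration only: it omits the activity-aware hard negative mining described above, the temperature value is arbitrary, and all names are hypothetical.

```python
import math

def _normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def _diagonal_cross_entropy(logits):
    """Mean of -log softmax(row)[i], treating entry i as the true match."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_info_nce(organic, inorganic, temperature=0.07):
    """CLIP-style loss: row i of each input is assumed to be a matched pair."""
    a = [_normalize(v) for v in organic]
    b = [_normalize(v) for v in inorganic]
    logits = [[sum(x * y for x, y in zip(u, w)) / temperature for w in b]
              for u in a]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_diagonal_cross_entropy(logits)
                  + _diagonal_cross_entropy(transposed))

# Toy 2D embeddings: matched pairs point the same way, so the loss is near zero.
aligned_loss = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]],
                                  [[2.0, 0.0], [0.0, 3.0]])
# Swapping the pairing makes each "match" orthogonal, so the loss is large.
swapped_loss = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]],
                                  [[0.0, 3.0], [2.0, 0.0]])
```

Minimizing a loss of this kind pulls paired organic and inorganic embeddings together while pushing unpaired ones apart, which is the behaviour the shared embedding space relies on.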

ARXIV Cancer: brain tumor Method: ensemble model

Optimized Weighted Voting System for Brain Tumor Classification Using MRI Images

Ha Anh Vu
Published 2026-03-30 12:24

This paper presents a weighted ensemble learning approach for the classification of brain tumors using MRI images. The method integrates multiple classifiers, including deep learning models and traditional machine learning techniques, to enhance classification performance. Experimental results indicate that the proposed system achieves state-of-the-art accuracy, outperforming existing models in the field.

The accurate classification of brain tumors from MRI scans is essential for effective diagnosis and treatment planning. This paper presents a weighted ensemble learning approach that combines deep learning and traditional machine learning models to improve classification performance. The proposed system integrates multiple classifiers, including ResNet101, DenseNet121, Xception, CNN-MRI, ResNet50 with edge-enhanced images, SVM, and KNN with HOG features. A weighted voting mechanism assigns higher influence to models with better individual accuracy, ensuring robust decision-making. Image processing techniques such as Balance Contrast Enhancement, K-means clustering, and Canny edge detection are applied to enhance feature extraction. Experimental evaluations on the Figshare and Kaggle MRI datasets demonstrate that the proposed method achieves state-of-the-art accuracy, outperforming existing models. These findings highlight the potential of ensemble-based learning for improving brain tumor classification, offering a reliable and scalable framework for medical image analysis.
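
A weighted vote of this kind can be sketched as follows. The abstract says only that more accurate models receive more influence. Using each model's accuracy directly as its weight is an assumption here, since the paper's optimized weighting scheme is not described in the abstract. The accuracies and class labels below are invented for illustration.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions maps model name -> predicted label;
    weights maps model name -> non-negative weight (e.g. validation accuracy).
    Ties go to the label that was seen first.
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        scores[label] += weights[model]
    return max(scores, key=scores.get)

# Hypothetical accuracies and per-model predictions for one MRI scan.
accuracies = {"ResNet101": 0.95, "SVM": 0.40, "KNN": 0.45}
predictions = {"ResNet101": "glioma", "SVM": "meningioma", "KNN": "meningioma"}

final_label = weighted_vote(predictions, accuracies)
```

In this example a simple majority vote would pick "meningioma" (two votes to one), while the weighted vote picks "glioma" because the single stronger model outweighs the two weaker ones (0.95 against 0.85).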

ARXIV Cancer: brain cancer Method: physics-embedded deep learning

Physics-Embedded Feature Learning for AI in Medical Imaging

Pulock Das, Al Amin, Kamrul Hasan, Rohan Thompson, Azubike D. Okpalaeze, Liang Hong
Published 2026-03-30 05:45

This paper presents PhysNet, a physics-embedded deep learning framework designed to enhance interpretability and robustness in tumor classification. By integrating tumor growth dynamics into the feature learning process of a convolutional neural network, PhysNet aims to provide physically consistent predictions and insights into tumor behavior. Experimental results on a large brain MRI dataset indicate that PhysNet outperforms several state-of-the-art deep learning models, achieving superior classification accuracy and interpretability.

Deep learning (DL) models have achieved strong performance in intelligent healthcare settings, yet most existing approaches operate as black boxes and ignore the physical processes that govern tumor growth, limiting interpretability, robustness, and clinical trust. To address this limitation, we propose PhysNet, a physics-embedded DL framework that integrates tumor growth dynamics directly into the feature learning process of a convolutional neural network (CNN). Unlike conventional physics-informed methods that impose physical constraints only at the output level, PhysNet embeds a reaction-diffusion model of tumor growth within intermediate feature representations of a ResNet backbone. The architecture jointly performs multi-class tumor classification while learning a latent tumor density field, its temporal evolution, and biologically meaningful physical parameters, including tumor diffusion and growth rates, through end-to-end training. This design is necessary because purely data-driven models, even when highly accurate or ensemble-based, cannot guarantee physically consistent predictions or provide insight into tumor behavior. Experimental results on a large brain MRI dataset demonstrate that PhysNet outperforms multiple state-of-the-art DL baselines, including MobileNetV2, VGG16, VGG19, and ensemble models, achieving superior classification accuracy and F1-score. In addition to improved performance, PhysNet produces interpretable latent representations and learned bio-physical parameters that align with established medical knowledge, highlighting physics-embedded representation learning as a practical pathway toward more trustworthy and clinically meaningful medical AI systems.
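
The abstract does not give the equation that PhysNet embeds. A common reaction-diffusion model of tumor growth is the Fisher-Kolmogorov form, du/dt = D * laplacian(u) + rho * u * (1 - u), where u is a normalized tumor density, D a diffusion rate, and rho a growth rate. The sketch below assumes that form and advances a 1D density by one explicit finite-difference step. It illustrates the physics only, not the network, and the function and parameter names are hypothetical.

```python
def reaction_diffusion_step(u, diffusion, growth, dx=1.0, dt=0.1):
    """One explicit Euler step of du/dt = D * u_xx + rho * u * (1 - u).

    u is a list of densities in [0, 1]; boundaries are zero-flux.
    The explicit scheme is stable when dt * diffusion / dx**2 <= 0.5.
    """
    n = len(u)
    nxt = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]       # mirror value at the left edge
        right = u[i + 1] if i < n - 1 else u[i]  # mirror value at the right edge
        laplacian = (left - 2.0 * u[i] + right) / dx ** 2
        reaction = growth * u[i] * (1.0 - u[i])  # logistic growth term
        nxt.append(u[i] + dt * (diffusion * laplacian + reaction))
    return nxt

# A single saturated voxel spreads into its neighbours over one step.
density = [0.0, 0.0, 1.0, 0.0, 0.0]
density_next = reaction_diffusion_step(density, diffusion=0.1, growth=0.5)
```

In a physics-embedded network, parameters playing the role of `diffusion` and `growth` would be learned, and a residual of this equation could serve as a consistency penalty on the latent density field.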
