Publications

Research by Ahmed Taha on trustworthy medical AI: fairness, demographic sensitivity, and documentation integrity in vision–language and LLM-agent systems. All papers are preprints under review, with public code and datasets.

Google Scholar: 0 citations · h-index 0

SpineFairBench: A Counterfactual Benchmark for Auditing Demographic Sensitivity in Spinal Radiology VLM Reports

Ahmed Taha, Abdelrahman Taeha, Muzzammil Ahmadzada · Preprint, 2026 · Under review · Cited by 0

A paired counterfactual benchmark that audits whether nine frozen vision–language models change their spinal-radiology reports when apparent age and sex are edited while the target pathology is preserved.

MedInsider: A Benchmark for Documentation Integrity in Medical LLM Agents Under Institutional Pressure

Ahmed Taha, Abdelrahman Taeha, Muzzammil Ahmadzada · Preprint, 2026 · Under review · Cited by 0

A benchmark and simulated electronic-health-record environment that measures whether medical LLM agents preserve documentation integrity when institutional context rewards shortcuts or omissions.

Coming soon

SpineShiftBench

Coming soon

TrustFold

Coming soon

Warden

Coming soon

Research profiles

Ahmed Taha on ORCID · Ahmed Taha on Google Scholar · Ahmed Taha on GitHub · Ahmed Taha on Hugging Face