Welcome to My Site

About Me

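I am Zilong Wang (王子龙), a Senior Researcher in the Machine Learning Group at Microsoft Research Asia (Shanghai). My research sits at the intersection of AI4Health, Foundation Models, and Human-AI Interaction. I aim to build AI systems that are reliable, generalizable, and closely matched to real-world needs, especially in high-stakes settings such as healthcare. I received my M.D. in Clinical Medicine from Shanghai Medical College, Fudan University, in 2018. Clinical training shaped my research methodology: I start from medical problems and emphasize model robustness on heterogeneous real-world data, consistency with clinical reasoning, and safe deployment in complex workflows.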

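My research follows three main threads. In AI4Health, I work on medical imaging and multimodal intelligent systems for screening, diagnosis, and long-term disease management. In Foundation Models, I explore architecture design, evaluation methods, and reinforcement learning optimization strategies for multimodal foundation models and medical large language models (LLMs), with the goal of improving generalization, interpretability, and clinical trustworthiness. In Human-AI Interaction, I study how people interact with advanced AI systems (including multimodal and agentic models) in real settings such as clinical workflows, assistive technology, and applications for older adults. This work emphasizes human-in-the-loop design that lets users query, verify, correct, and steer AI behavior, which improves the transparency, controllability, and trustworthiness of the system.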

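Before joining Microsoft Research Asia in 2023, I served as CTO of a medical-technology startup, where I led the development of several AI Software as a Medical Device (SaMD) products and carried them through clinical validation, regulatory registration, and market access. This gave me end-to-end experience in translating research into deployed products. I was named to the Forbes China 30 Under 30 list in 2020 and the Hurun U30 China Entrepreneurial Leaders list in 2021, and I serve as an executive committee member of the Digital Medicine Technical Committee of the China Computer Federation (CCF).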

Latest News

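[Feb 2026] We released OMGs, a multidisciplinary agent system for ovarian tumours. It is an LLM-driven multi-agent framework designed to support multidisciplinary team (MDT) decision-making across the full course of ovarian tumour care. In a multi-center evaluation, OMGs performed on par with expert MDT consensus, showing the potential of collaborative agent systems for high-stakes clinical decision support.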

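[Jan 2026] Four papers were accepted to ICLR 2026 as posters. They cover the interpretability of visual prompt tuning, a unified foundation model for multimodal Alzheimer's disease diagnosis, a reasoning-driven multimodal LLM for domain generalization, and a method that corrects the over-dominance of low-probability tokens in reinforcement learning for LLMs.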

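[Jan 2026] Three papers were accepted at top conferences: one at AAAI 2026 (tuning medical foundation models for inner ear temporal bone CT analysis) and two at CHI 2026, one on accessibility for screen reader users in the vibe coding era and one on human-computer interaction for screen reader users in computer use.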

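[Jan 2026] We introduced GI-Bench, a benchmark that covers 20 fine-grained lesion categories and systematically evaluates multimodal large language models (MLLMs) across a five-stage clinical workflow in gastrointestinal endoscopy, with the aim of advancing evaluation standards grounded in real clinical processes.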

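[Aug 2025] We open-sourced Agent Lightning⚡, a framework that lets developers train any AI agent with reinforcement learning (RL). It decouples agent execution from model training and integrates with mainstream frameworks such as LangChain, AutoGen, and CrewAI with almost no code changes.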

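[Aug 2025] We released preprints on two medical foundation models: RenalCLIP (a vision-language foundation model for precision oncology in kidney cancer) and DermINO (a versatile dermatology foundation model based on a multi-view hybrid pretraining strategy).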

Open-Source Projects

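OpenOE-Lite: an open-source, lightweight reproduction of an OpenEvidence-style evidence-based medicine question-answering system. With only an LLM API key, no local vector store, and no corpus preprocessing, it generates citation-aligned, evidence-based answers in real time from the 250 million open scholarly works in OpenAlex. The system uses a six-stage pipeline (safety boundary → three-perspective query augmentation → multi-route retrieval → RRF deduplication and ranking → small-model evidence gating → evidence-based answer generation), and its modular architecture supports a smooth upgrade from the Lite core to a full RAG setup that integrates a local vector store and clinical guidelines.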
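The RRF step in the pipeline above can be illustrated with a minimal sketch of Reciprocal Rank Fusion, which merges several ranked result lists and deduplicates them by document ID. This is not taken from the OpenOE-Lite code base: the function name `rrf_fuse`, the document IDs, and the constant `k = 60` (the value commonly used in the RRF literature) are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Hypothetical helper, not the project's own API.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        seen = set()  # count a document at most once per list
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in seen:
                continue
            seen.add(doc_id)
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Three retrieval routes returning overlapping (made-up) work IDs.
    routes = [["W1", "W2", "W3"], ["W2", "W4"], ["W3", "W2"]]
    print(rrf_fuse(routes))
```

Because every document ends up with a single fused score, deduplication falls out of the ranking step: a work returned by several retrieval routes appears once, and ranks higher the more routes agree on it.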

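Anji-Bridge (安济桥): a PDF-to-knowledge bridge for LLMs and agents that converts PDFs into structured, semantic Markdown/JSON suited to large models and agents. It uses PaddleOCR-VL for layout-aware OCR and the Ovis2.5-9B vision-language model for image understanding and caption generation, then applies enhancements at the AST level, including heading correction, decorative-element filtering, and multi-format export. It supports batch processing and portable output with base64 embedding.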

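Mandarin Speech Prosody Benchmark (MSPB): the companion repository for the Interspeech 2025 paper "Can AI Understand Mandarin Speech Prosody?". MSPB is a linguistically motivated benchmark for Mandarin prosody understanding. It contains 178 test items that were recorded by human speakers and verified by experts, and it covers 8 task categories (sentence mood/intonation, prosodic ambiguity, focus marking, focus operators, scalar implicature, irony, and emotional prosody with and without context). It systematically evaluates how well Speech LLMs understand prosodic cues at the phonological, syntactic, semantic, and pragmatic levels.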

Contact

Selected Publications

2026

  • GI-Bench: A Panoramic Benchmark Revealing the Knowledge-Experience Dissociation of Multimodal Large Language Models in Gastrointestinal Endoscopy Against Clinical Standards.
    Zhu, Yan, Luo, Te, Fu, Pei-Yao, Zhang, Zhen, Wang, Zi-Long, Qu, Yi-Fan, Geng, Zi-Han, Xu, Jia-Qi, Yao, Lu, Ma, Li-Yun, Su, Wei, Chen, Wei-Feng, et al.
    arXiv preprint arXiv:2601.08183, 2026 · Link · DOI
  • OMGs: A multi-agent system supporting MDT decision-making across the ovarian tumour care continuum.
    Zhang, Yangyang, Wang, Zilong, Xu, Jianbo, Chen, Yongqi, Han, Chu, Zhang, Zhihao, Liu, Shuai, Li, Hui, Zhang, Huiping, Liu, Ziqi, Chen, Jiaxin, Zhu, Jun, et al.
    arXiv preprint arXiv:2602.13793, 2026 · Link · DOI
  • Exploring interpretability for visual prompt tuning with cross-layer concepts.
    Wang, Yubin, Jiang, Xinyang, Cheng, De, Zhao, Xiangqian, Wang, Zilong, Li, Dongsheng, Zhao, Cairong
    ICLR 2026 (poster), 2026 · Link
  • Joint adaptation of uni-modal foundation models for multi-modal Alzheimer's disease diagnosis.
    Gu, Wentao, Li, Yuquan, Jiang, Xinyang, Wang, Zilong, Li, Dongsheng, Li, Zehui, Dong, Zijian, Zhao, Cairong
    ICLR 2026 (poster), 2026 · Link
  • Reasoning-driven multimodal LLM for domain generalization.
    Xu, Zhipeng, Wang, Zilong, Jiang, Xinyang, Li, Dongsheng, Cheng, De, Wang, Nannan
    ICLR 2026 (poster), 2026 · Link
  • Do not let low-probability tokens over-dominate in RL for LLMs.
    Yang, Zhihe, Luo, Xufang, Wang, Zilong, Han, Dongqi, He, Zhiyuan, Li, Dongsheng, Xu, Yunjian
    ICLR 2026 (poster), 2026 · Link
  • Programmers Who Use Screen Readers in the Vibe Coding Era: Adaptation, Empowerment, and New Accessibility Landscape.
    Chen, Nan, Qiu, Luna K., Wang, Arran Zeyu, Wang, Zilong, Yang, Yuqing
    CHI 2026, 2026 · Link · DOI
  • From Struggle to Success: Context-Aware Guidance for Screen Reader Users in Computer Use.
    Chen, Nan, Lu, Jing, Wang, Zilong, Qiu, Luna K., Chen, Siming, Yang, Yuqing
    CHI 2026, 2026
  • Tuning Medical Foundation Models for Inner Ear Temporal CT Analysis with Plug-and-play Domain Knowledge Aggregator.
    Wan, Weixun, Jiang, Xinyang, Wang, Zilong, Li, Bei, Zhao, Cairong
    AAAI 2026, 2026
  • ReMe: Scaffolding Personalized Cognitive Training via Controllable LLM-Mediated Conversations.
    Wang, Zilong, Chen, Nan, Qiu, Luna K., Yue, Ling, Guo, Geli, Ou, Yang, Jiang, Shiqi, Yang, Yuqing, Qiu, Lili
    CHI 2026 (LBW/poster), 2026 · Link · DOI

2025

  • EEGChaT: A Transformer-Based Modular Channel Selector for SEEG Analysis.
    Wang, Chen, Wang, Yansen, Han, Dongqi, Wang, Zilong, Li, Dongsheng
    arXiv preprint arXiv:2510.13592, 2025 · Link · DOI
  • Towards Ultra-low Framerate Ultrasound Localization Microscopy on Human Brain with Artificial Intelligence.
    Jiang, Xinyang, Zhong, Chuanyu, Wan, Weixun, Qu, Zefan, Wang, Zilong, Zhang, Xingxuan, Xu, Xiang, Wei, Linglin, Sun, Dailin, Wang, Yu
    Preprint, 2025
  • Can AI Understand Mandarin Speech Prosody? A Framework and Benchmark Showcase.
    Wang, Zilong, Zhang, Xiaoxue, Jiang, Xinyang, Song, Kaitao, Yu, Jue
    Interspeech 2025, 2025
  • Segmentation Helps Understanding: Mask-Infused Vision-Language Pre-training for 3D Medical Images.
    Hu, Yuqi, Luo, Xufang, Wang, Zilong
    Preprint, 2025
  • A Disease-Centric Vision-Language Foundation Model for Precision Oncology in Kidney Cancer.
    Tao, Yuhui, Zhao, Zhongwei, Wang, Zilong, Luo, Xufang, Chen, Feng, Wang, Kang, Wu, Chuanfu, Zhang, Xue, Zhang, Shaoting, Yao, Jiaxi, Jin, Xingwei, Jiang, Xinyang, et al.
    arXiv preprint arXiv:2508.16569, 2025 · Link · DOI
  • DermINO: Hybrid Pretraining for a Versatile Dermatology Foundation Model.
    Xu, Jingkai, Cheng, De, Zhao, Xiangqian, Yang, Jungang, Wang, Zilong, Jiang, Xinyang, Luo, Xufang, Chen, Lili, Ning, Xiaoli, Li, Chengxu, Zhou, Xinzhu, Song, Xuejiao, et al.
    arXiv preprint arXiv:2508.12190, 2025 · Link · DOI
  • Learning Robust Representations for Medical Images via Unifying (Self-)Supervisions.
    He, Xiaoxuan, Luo, Xufang, Yang, Yifan, Jiang, Xinyang, Wang, Zilong, Usuyama, Naoto, Zhang, Sheng, Poon, Hoifung, Yang, Yuqing, Li, Dongsheng, Qiu, Lili
    ICLR 2025 submission (OpenReview), 2025 · Link
  • Agent Lightning: Train ANY AI Agents with Reinforcement Learning.
    Luo, Xufang, Zhang, Yuge, He, Zhiyuan, Wang, Zilong, Zhao, Siyun, Li, Dongsheng, Qiu, Luna K., Yang, Yuqing
    arXiv preprint arXiv:2508.03680, 2025 · Link · DOI
  • AI-assisted facial analysis in healthcare: From disease detection to comprehensive management.
    Patterns, 2025 · Link

2024

  • Screening chronic kidney disease through deep learning utilizing ultra-wide-field fundus images.
    Zhao, Xinyu, Gu, Xingwang, Meng, Lihui, Chen, Yongwei, Zhao, Qing, Cheng, Shiyu, Zhang, Wenfei, Cheng, Tiantian, Wang, Chuting, Shi, Zhengming, Jiao, Shengyin, Jiang, Changlong, Jiao, Guofang, Teng, Da, Sun, Xiaolei, Zhang, Bilei, Li, Yakun, Lu, Huiqin, Chen, Changzheng, Zhang, Hao, Yuan, Ling, Su, Chang, Zhang, Han, Xia, Song, Liang, Anyi, Li, Mengda, Zhu, Dan, Xue, Meirong, Sun, Dawei, Li, Qiuming, Zhang, Ziwu, Zhang, Donglei, Lv, Hongbin, Ahmat, Rishet, Wang, Zilong, et al.
    npj Digital Medicine 7:275, 2024 · Link · DOI
  • DualStreamFoveaNet: A dual stream fusion architecture with anatomical awareness for robust fovea localization.
    Song, Sifan, Wang, Jinfeng, Wang, Zilong, Wang, Hongxing, Su, Jionglong, Ding, Xiaowei, Dang, Kang
    IEEE Journal of Biomedical and Health Informatics 28(12):7217–7229, 2024 · Link · DOI
  • LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation.
    Wang, Zilong, Luo, Xufang, Jiang, Xinyang, Li, Dongsheng, Qiu, Lili
    arXiv preprint arXiv:2404.00998, 2024 · Link · DOI

2023

  • Early detection of visual impairment in young children using a smartphone-based deep learning system.
    Chen, Wenben, Li, Ruiyang, Yu, Qinji, Xu, Andi, Feng, Yile, Wang, Ruixin, Zhao, Lanqin, Lin, Zhenzhe, Yang, Yahan, Lin, Duoru, Wu, Xiaohang, Chen, Jingjing, Liu, Zhenzhen, Wu, Yuxuan, Dang, Kang, Qiu, Kexin, Wang, Zilong, et al.
    Nature Medicine 29(2):493–503, 2023 · Link · DOI

2020

  • Artificial intelligence-enabled screening for diabetic retinopathy: A real-world, multicenter and prospective study.
    Zhang, Yifei, Shi, Juan, Peng, Ying, Zhao, Zhiyun, Zheng, Qidong, Wang, Zilong, Liu, Kun, Jiao, Shengyin, Qiu, Kexin, Zhou, Ziheng, Yan, Li, Zhao, Dong, et al.
    BMJ Open Diabetes Research & Care 8(1):e001596, 2020 · Link · DOI