Artificial Intelligence and Medicine

A Comparative Study of Virtual Standardized Patient Systems Based on ChatGPT-4o and DeepSeek in Medical Interview Teaching

Pages: 1346-1352

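Keywords: virtual standardized patient; large language model; AI; doctor-patient communication; generative artificial intelligence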

Background: Virtual standardized patients (VSPs) are a relatively new teaching tool in medical education and are widely used to improve students' clinical interviewing skills. With the rapid development of generative artificial intelligence, VSP systems built on large language models (LLMs) have become a research focus. However, systematic comparisons of how different LLMs perform in the simulated-patient role are still lacking. Objective: To compare the suitability of two mainstream LLMs, ChatGPT-4o and DeepSeek, for VSP simulation, and to evaluate differences in history taking, linguistic naturalness, clue guidance, and teaching support. Methods: A quasi-experimental study enrolled 60 fourth-year undergraduate students of clinical medicine at a medical school. All participants had completed the Diagnostics course and had basic interviewing skills. Students were divided into two groups according to whether the last digit of their student ID number was odd or even, and each group interacted with a VSP system driven by either ChatGPT-4o or DeepSeek. Each participant conducted a text-based simulated interview of a patient presenting with acute appendicitis and, after completing the history, submitted a preliminary diagnosis and a structured experience questionnaire. Results: ChatGPT-4o performed better in structured information integration, clue guidance, and technical stability, whereas DeepSeek showed greater linguistic warmth and emotional responsiveness, reflecting a more humanistic communication style. Conclusions: Different LLMs have different strengths in VSP applications, so system selection and design can be tailored to specific teaching objectives. Future research could extend to other diseases, interaction modalities, and evaluation dimensions to assess more fully the adaptability and teaching effectiveness of LLM-driven VSPs in medical education.
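The allocation rule described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `assign_group`, the ID format, and the pairing of odd or even digits with a particular model are all assumptions, since the abstract does not state which parity was paired with which system.

```python
def assign_group(student_id: str) -> str:
    """Assign a participant to a VSP group by the parity of the ID's last digit.

    Assumption (not stated in the abstract): odd -> ChatGPT-4o, even -> DeepSeek.
    """
    # int() raises ValueError if the last character is not a digit.
    last_digit = int(student_id.strip()[-1])
    return "ChatGPT-4o" if last_digit % 2 == 1 else "DeepSeek"


# Hypothetical student IDs, for illustration only.
example_ids = ["2021001", "2021002", "2021013", "2021024"]
groups = {sid: assign_group(sid) for sid in example_ids}
```

Allocation by parity is systematic rather than random, which is consistent with the study being described as quasi-experimental.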

Original Articles

Correlation of Serum IL-2 and TNF-α Levels with Cognitive Function in Patients with Depressive Disorder

Pages: 58-59

Keywords: depressive disorder; IL-2; TNF-α; cognitive function

Objective: To measure serum IL-2 and TNF-α levels in patients with depressive disorder and to examine their correlation with cognitive function. Methods: Serum IL-2 and TNF-α levels were measured by enzyme-linked immunosorbent assay (ELISA) in 100 patients with depressive disorder (observation group) and 100 healthy individuals (control group). Depression severity was assessed with the Hamilton Depression Rating Scale (HAMD) and cognitive status with the Loewenstein Occupational Therapy Cognitive Assessment (LOTCA), and correlation analyses were performed between cytokine levels and these scores. Results: Serum IL-2 and TNF-α levels were significantly higher in the observation group than in the control group (P<0.05). IL-2 and TNF-α levels were positively correlated with HAMD scores and with LOTCA total scores (P<0.05). Conclusion: In patients with depressive disorder, serum IL-2 and TNF-α levels are positively correlated with depression severity and with cognitive status.
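The correlation analysis mentioned in the Methods can be illustrated with a short Python sketch. The abstract does not name the coefficient used, so the Pearson coefficient here is an assumption, as is the helper name `pearson_r`. The numbers are made-up placeholders, not data from the study.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Placeholder values for illustration only (NOT study data).
il2_levels = [1.0, 2.0, 3.0, 4.0, 5.0]
hamd_scores = [10.0, 12.0, 15.0, 19.0, 24.0]
r = pearson_r(il2_levels, hamd_scores)
```

A value of `r` close to 1 indicates a strong positive linear association. Testing whether it differs from zero, as the reported P values imply, would need an additional significance test that this sketch omits.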