Abstract
ChatGPT is a large language model with numerous applications. However, its effectiveness in dermatopathology has not yet been formally evaluated. Diagnostically challenging cases often require significant workup with immunohistochemical tests. Here we evaluate the performance of ChatGPT (3.5; v. May 3) in providing diagnostically useful information on immunohistochemistry relating to entities encountered by dermatopathologists. We first queried ChatGPT for the immunophenotypes of specific cutaneous neoplasms (e.g., spindle cell lipoma). Next, we prompted ChatGPT to provide panels of immunohistochemical tests to interrogate broader differentials (e.g., spindle cell neoplasms). Additionally, we asked ChatGPT to provide references for these entities. Lastly, we directed ChatGPT to provide sample pathology reports incorporating typical immunostaining results. Each query was repeated 10 times to assess reproducibility, and the recommendations were compiled, quantified, and compared with standard references. We found that ChatGPT provided sensitive and specific immunohistochemical panels for most cases and differentials (97%). However, in a substantial proportion of cases (34%), ChatGPT also provided factually incorrect or misleading recommendations, a problem that was most pronounced for adnexal carcinomas (91%). ChatGPT rarely provided clinically useful citations (15%), and over half of its citations were confabulated (52%). ChatGPT was able to integrate immunohistochemical results adequately into pathology reports. These data suggest that the current model of ChatGPT may be useful for rapidly generating immunohistochemical panels for dermatologic diagnoses and for drafting sample pathology reports. However, extreme caution should be exercised given the tendency of ChatGPT to fabricate material. Future natural language processing models designed to augment diagnostic medicine may have immense value in dermatopathologists’ workflow.