Human vs. Machine: Assessing Translation Quality of Four-character Terms in the Classical Chinese Medical Text Huangdi Neijing
LIU Dongge
School of Languages and Cultures, Youjiang Medical University for Nationalities, China.
ZHAO Meijuan *
School of Languages and Cultures, Youjiang Medical University for Nationalities, China.
*Author to whom correspondence should be addressed.
Abstract
With the rapid advancement of technology, machine translation is increasingly applied in medical translation. As the foundational text of Traditional Chinese Medicine (TCM), the Huangdi Neijing plays a critical role in shaping TCM theory and clinical practice. This study examines the translation of four-character terms in the Huangdi Neijing and compares the quality of human and machine translation. A mixed-methods approach, combining qualitative and quantitative analysis, was employed. A total of 463 samples were selected and translated with four mainstream machine translation systems: Youdao, DeepL, ChatGPT-4o, and DeepSeek. The outputs were evaluated against human translations using BLEU and TER scores. ChatGPT-4o achieved the highest BLEU score (0.60), indicating the greatest lexical similarity to the human translation, and the lowest TER score (0.99), suggesting the fewest required edits. These results suggest that ChatGPT-4o is currently the most effective of the tested systems for this specialized translation task. Nevertheless, machine translation still exhibits notable limitations when applied to TCM texts. Human translation continues to outperform machine-generated output, particularly in conveying semantic precision and preserving the cultural and conceptual subtleties embedded in medical discourse. This study contributes empirical evidence on the performance of large language models in translating culturally and linguistically complex four-character expressions from classical TCM texts.
Keywords: Comparison between human and machine translation, four-character terms, translation quality assessment, traditional Chinese medical classics, Huangdi Neijing
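To make the two metrics named in the abstract concrete, the following is a minimal Python sketch of how a BLEU-style and a TER-style score can be computed for one machine output against one human reference. It is an illustration under stated assumptions, not the study's evaluation pipeline: the function names `simple_bleu` and `simple_ter` are hypothetical, BLEU is restricted to unigrams and bigrams without smoothing, TER is approximated by word-level edit distance without the block-shift operation of full TER, and the example strings are invented rather than taken from the 463 samples.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def simple_bleu(hypothesis, reference, max_n=2):
    """Sentence-level BLEU with uniform weights up to max_n and a brevity penalty.

    Assumption: no smoothing, so the score is 0.0 if any n-gram precision is zero.
    """
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        total = sum(hyp_counts.values())
        # Clipped overlap: each hypothesis n-gram counts at most as often as in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)


def simple_ter(hypothesis, reference):
    """Word-level edit distance divided by reference length.

    Assumption: insertions, deletions, and substitutions only; full TER also
    allows block shifts, which this sketch omits.
    """
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    if not ref:
        raise ValueError("reference must not be empty")
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example strings, not drawn from the study's data.
reference = "harmony of yin and yang"
hypothesis = "balance of yin and yang"
print(round(simple_bleu(hypothesis, reference), 3))  # 0.775
print(simple_ter(hypothesis, reference))             # 0.2
```

Higher BLEU means more n-gram overlap with the reference, while lower TER means fewer edits are needed to turn the output into the reference. Because TER divides the edit count by the reference length, it is not capped at 1, which matters when scoring very short segments such as four-character terms.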