Exploring the Semantic Boundaries of LLM Translation through 'The Moon and Sixpence': A Comparative Study of Claude 4.5 and Human Translators

Doclingo Research, November 27, 2025

Introduction: Considerations Beyond Model Parameters

With the successive releases of Claude 4.5 and Gemini 3, the performance of large language models (LLMs) in logical reasoning and code generation appears to have reached a saturation point. However, for high-context texts, especially literary works involving complex metaphors and cultural references, does stacking parameters translate directly into better translation quality?

As the technical team of Doclingo, we approach this with caution. To explore the true capabilities of AI translation, we selected Maugham's The Moon and Sixpence as a test sample, comparing the current top models with classic human translations in a parallel corpus.

This article will strip away marketing rhetoric and objectively analyze the current limitations and breakthroughs of AI translation from both linguistic and engineering perspectives.


Dimension One: Lexical Accuracy vs. Stylistic Fidelity

In the first round of testing, we focused on whether the translation preserved the unique narrative voice of the original author.

Sample Analysis: The original text uses the word "Accident" to describe the acquisition of a certain social status.

AI Models (Claude 4.5/Gemini 3): Tend to process the word as "accident" or "chance" in standard semantics. From a dictionary definition perspective, this is accurate. However, the translation presents a flat, "explanatory" style, losing the underlying irony of the original text.

Human Translator: Translates it as "born by chance."

Analysis Conclusion: The human translator's handling is not merely a semantic conversion but a reconstruction of the literary theme of "fate." AI currently remains at the "decoding-encoding" logical level, lacking sensitivity to the emotional granularity of the text. It can convey information accurately but struggles to reproduce the author's incisive questioning tone. This reveals AI's first shortcoming: Stylistic Homogenization.


Dimension Two: Explicit Semantics vs. Metaphorical Mapping

The core difficulty of literary translation lies in handling "implications." We tested the model's ability to concretize abstract concepts.

Sample Analysis: The original text uses the simple locative word "here."

AI Model: Adopts a literal translation strategy, rendering it as "here" or "this place."

Human Translator: Processes it as "coming into this world."

Analysis Conclusion: This reflects an essential difference in how the two map meaning. AI pursues statistical maximum likelihood and therefore tends to choose the safest literal translation, whereas the human translator, drawing on an understanding of the book's main theme, performs creative semantic enrichment. In a literary context, the most "faithful" literal translation often deviates the most from the original meaning.


Dimension Three: Context Window vs. Global Coherence

Although models like Claude 4.5 have greatly expanded the context window, we still observed "context drift" when processing lengthy documents.

  • Terminology Oscillation: The same proper noun is translated differently in different chapters.
  • Tone Disruption: The dialogue style of characters fails to maintain consistency, sometimes sounding classical, sometimes modern.

This indicates that merely increasing the token limit does not fully resolve the coherence issues of global narrative logic. The "macro framework" constructed by human translators remains a barrier that algorithms struggle to overcome.
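Terminology oscillation in particular is easy to check for mechanically. The following is a minimal sketch, assuming the rendering of each proper noun has already been extracted per chapter; the function name `find_term_oscillation` and the sample data are hypothetical placeholders for illustration, not results from the evaluation.

```python
from collections import defaultdict


def find_term_oscillation(chapter_renderings):
    """Return source terms that received more than one rendering.

    chapter_renderings: list of dicts, one per chapter, each mapping a
    source term to the rendering used for it in that chapter.
    """
    seen = defaultdict(set)
    for renderings in chapter_renderings:
        for source_term, target_term in renderings.items():
            seen[source_term].add(target_term)
    # Keep only the terms whose rendering changed somewhere in the book.
    return {
        term: sorted(variants)
        for term, variants in seen.items()
        if len(variants) > 1
    }


# Placeholder data (assumption): two chapters, one name rendered two ways.
chapters = [
    {"Strickland": "variant_A", "Stroeve": "variant_C"},
    {"Strickland": "variant_B", "Stroeve": "variant_C"},
]
oscillating = find_term_oscillation(chapters)
print(oscillating)
```

A check like this only flags that a term drifted; deciding which rendering is correct remains a judgment for the human reviewer.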


Industry Reflection and Engineering Solutions

In summary, our evaluation concludes: AI is an excellent language engineer but not yet a qualified literary artist. It addresses the "breadth" and "speed" of translation but still exhibits significant gaps in "depth" and "warmth."

Based on this understanding, we did not blindly pursue "full automation" in building Doclingo's product architecture; instead, we are committed to creating a "human-in-the-loop" professional workflow:

1. Structural Pre-processing

Utilizing Doclingo's unique mirror layout engine, we parse PDF documents before sending them to the LLM. This not only preserves complex formatting but, more importantly, aids AI in better understanding the logical relationships between text blocks through the restoration of the physical layout, thereby enhancing the contextual accuracy of translations.
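As a rough illustration of why layout matters (this is not the mirror layout engine itself), the sketch below orders extracted text blocks by column and vertical position before they are joined into model input. The `TextBlock` type, the fixed two-column assumption, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str
    x0: float  # left edge of the block
    y0: float  # top edge of the block (smaller means higher on the page)


def reading_order(blocks, page_width, columns=2):
    """Order blocks column by column, top to bottom within each column."""
    column_width = page_width / columns

    def sort_key(block):
        # Clamp so a block at the far right edge still lands in the last column.
        column = min(int(block.x0 // column_width), columns - 1)
        return (column, block.y0, block.x0)

    return sorted(blocks, key=sort_key)


# Blocks as a naive extractor might return them: out of reading order.
blocks = [
    TextBlock("right column, top", x0=320, y0=50),
    TextBlock("left column, bottom", x0=40, y0=400),
    TextBlock("left column, top", x0=40, y0=50),
]
ordered = reading_order(blocks, page_width=600)
model_input = "\n\n".join(block.text for block in ordered)
print(model_input)
```

Real documents need more than this (headers, footnotes, tables, irregular columns), but even a simple ordering step keeps a sentence from being interrupted by text from a neighboring column.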

2. Augmented Memory

To address consistency issues in long documents, we add a layer of dynamic terminology management on top of the native capabilities of mainstream large models such as Gemini and GPT. This keeps key terms and writing style uniform across documents of tens of thousands of words, compensating for the native models' "attention decay."
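One common way to build such a layer is to inject the relevant glossary entries into the prompt for each chunk. The sketch below assumes that approach; it is not a description of Doclingo's internal implementation, and the function names, prompt wording, glossary values, and sample sentence are all hypothetical.

```python
def select_glossary(chunk, glossary):
    """Keep only the glossary entries whose source term occurs in this chunk."""
    return {src: tgt for src, tgt in glossary.items() if src in chunk}


def build_prompt(chunk, glossary, target_language):
    """Assemble a translation prompt that pins the renderings of known terms."""
    entries = select_glossary(chunk, glossary)
    lines = [f"Translate the following passage into {target_language}."]
    if entries:
        lines.append("Use these fixed renderings for the listed terms:")
        lines.extend(f"- {src} => {tgt}" for src, tgt in sorted(entries.items()))
    lines.append("")
    lines.append(chunk)
    return "\n".join(lines)


# Placeholder glossary and passage (assumptions, for illustration only).
glossary = {
    "Strickland": "<approved rendering 1>",
    "Stroeve": "<approved rendering 2>",
}
chunk = "Strickland said nothing for a while."
prompt = build_prompt(chunk, glossary, target_language="French")
print(prompt)
```

Filtering the glossary per chunk keeps the prompt short, and because the same approved rendering is supplied every time a term appears, consistency no longer depends on the model recalling a choice it made many pages earlier.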

3. Expert Interaction

We position AI as a "draft generator." Doclingo provides a professional bilingual comparison and proofreading interface designed to empower human experts, so that you can focus on high-value polishing such as "born by chance" rather than spending time on routine first-pass translation of the source text.

Conclusion

Technology should not be an excuse to replace humans but a lever to extend human wisdom.

At Doclingo, we are dedicated to making cutting-edge models like Claude 4.5 and Gemini 3 truly applicable in professional scenarios, bridging the gap between algorithms and the human spirit through rigorous engineering methods.

Copyright © 2025 Doclingo. All Rights Reserved.