
Build an AI Translation App with Python: Beyond Google Translate

Build a context-aware translation app using LLMs that handles idioms, cultural context, and domain-specific terminology. Full Python code with Flask API and simple web frontend.

By EgoistAI

Google Translate handles 100+ billion words per day. It’s fast, free, and good enough for getting the gist of a menu in Tokyo. But “good enough” isn’t good enough for business documents, marketing copy, technical documentation, or any text where nuance matters.

The problem with traditional machine translation is that it tends to translate words rather than meaning. “It’s raining cats and dogs” can come out as literal animals falling from the sky. Technical jargon gets mangled, cultural context is ignored, and tone is lost.

LLM-based translation addresses these problems by taking context into account before translating. In this tutorial, we’ll build a translation app that handles idioms, domain terminology, and cultural adaptation, the areas where Google Translate most often falls short.


Architecture

User Input (text + source lang + target lang + context)
  → Preprocessing (language detection, text normalization)
  → Translation Engine (Claude API or local model)
  → Post-processing (formatting, quality check)
  → Output (translated text + alternatives + notes)
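
The preprocessing stage lists text normalization, which the engine code in this tutorial does not implement on its own. A minimal sketch of what that step could look like, assuming Unicode NFC composition and whitespace cleanup are what we want (`normalize_text` is a hypothetical helper, not part of the original code):

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Clean up user input before it is sent to the translation engine."""
    # Compose characters so "e" + combining accent becomes a single "é"
    text = unicodedata.normalize("NFC", text)
    # Convert Windows and old Mac line endings to "\n"
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces and tabs, but keep line breaks
    text = re.sub(r"[ \t]+", " ", text)
    # Trim leading and trailing spaces on every line
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Allow at most one blank line between paragraphs
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Paragraph breaks are kept on purpose, because the translation prompt asks the model to preserve formatting.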

Setup

mkdir ai-translator && cd ai-translator
python3 -m venv venv
source venv/bin/activate
pip install flask anthropic langdetect python-dotenv
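
The Flask app later reads `ANTHROPIC_API_KEY` from the environment (loaded from a `.env` file by python-dotenv), so the key has to be in place before the first request. A small sketch of a fail-fast check, assuming you want a clear error at startup instead of a failed API call later; `require_api_key` is a hypothetical helper name:

```python
import os


def require_api_key(var_name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment or fail with a clear message."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Add it to a .env file or export it "
            "in your shell before starting the app."
        )
    return key
```

If you use it, call it after `load_dotenv()` so values from `.env` are already loaded.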

The Translation Engine

# translator.py
"""AI-powered translation engine with context awareness."""

import json
import anthropic
from langdetect import detect


class AITranslator:
    def __init__(self, api_key: str):
        self.client = anthropic.Anthropic(api_key=api_key)
    
    def detect_language(self, text: str) -> str:
        """Detect the language of input text."""
        try:
            return detect(text)
        except Exception:
            return "unknown"
    
    def translate(
        self,
        text: str,
        target_lang: str,
        source_lang: str = "auto",
        context: str = "",
        domain: str = "general",
        formality: str = "neutral"
    ) -> dict:
        """
        Translate text with context awareness.
        
        Args:
            text: Text to translate
            target_lang: Target language (e.g., "Japanese", "Spanish")
            source_lang: Source language ("auto" for detection)
            context: Additional context about the text
            domain: Domain for terminology (general, medical, legal, tech, marketing, academic)
            formality: Formality level (formal, neutral, casual)
        
        Returns:
            dict with translation, alternatives, and notes
        """
        if source_lang == "auto":
            detected = self.detect_language(text)
            source_lang = detected
        
        domain_instructions = {
            "general": "",
            "medical": "Use precise medical terminology. Maintain clinical accuracy.",
            "legal": "Use exact legal terminology. Preserve legal meaning precisely.",
            "tech": "Use standard technical terminology. Keep code/commands untranslated.",
            "marketing": "Adapt for cultural resonance. Prioritize impact over literal meaning.",
            "academic": "Maintain academic register. Preserve citation formats.",
        }
        
        formality_instructions = {
            "formal": "Use formal register and honorifics where appropriate.",
            "neutral": "Use standard register.",
            "casual": "Use conversational, casual language.",
        }
        
        prompt = f"""Translate the following text from {source_lang} to {target_lang}.

Text to translate:
"{text}"

{f'Context: {context}' if context else ''}
Domain: {domain}
{domain_instructions.get(domain, '')}
Formality: {formality}
{formality_instructions.get(formality, '')}

Return ONLY valid JSON:
{{
  "translation": "the translated text",
  "alternatives": ["1-2 alternative translations if applicable"],
  "notes": ["any important translation notes, cultural context, or ambiguities"],
  "idioms_adapted": ["list any idioms that were culturally adapted rather than literally translated"],
  "confidence": 0.0 to 1.0
}}

Rules:
- Translate meaning, not just words
- Adapt idioms and cultural references for the target culture
- Preserve formatting (paragraphs, lists, emphasis)
- For technical terms with no standard translation, keep original in parentheses
- Flag any ambiguous passages in notes"""

        response = self.client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            system=(
                "You are a professional translator with expertise in "
                "cultural adaptation and domain-specific terminology. "
                "Always return valid JSON."
            ),
            messages=[{"role": "user", "content": prompt}]
        )
        
        result_text = response.content[0].text
        if "```json" in result_text:
            result_text = result_text.split("```json")[1].split("```")[0]
        elif "```" in result_text:
            result_text = result_text.split("```")[1].split("```")[0]
        
        try:
            result = json.loads(result_text.strip())
        except json.JSONDecodeError:
            result = {
                "translation": result_text.strip(),
                "alternatives": [],
                "notes": ["JSON parsing failed; raw translation returned"],
                "idioms_adapted": [],
                "confidence": 0.5
            }
        
        result['source_language'] = source_lang
        result['target_language'] = target_lang
        result['original'] = text
        
        return result
    
    def translate_batch(
        self,
        texts: list[str],
        target_lang: str,
        **kwargs
    ) -> list[dict]:
        """Translate multiple texts."""
        return [
            self.translate(text, target_lang, **kwargs)
            for text in texts
        ]
    
    def compare_translations(
        self,
        text: str,
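        target_lang: str,
        source_lang: str = "auto"
    ) -> dict:
        """Generate multiple translation styles for comparison."""
        # Resolve "auto" to a detected language code, as translate() does;
        # otherwise the prompt would literally say "Translate from auto".
        if source_lang == "auto":
            source_lang = self.detect_language(text)
        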
        styles = {
            "literal": "Translate as literally as possible while remaining grammatically correct.",
            "natural": "Translate for natural, fluent reading in the target language.",
            "adapted": "Freely adapt for maximum cultural resonance and impact.",
        }
        
        results = {}
        for style_name, instruction in styles.items():
            response = self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                messages=[{
                    "role": "user",
                    "content": (
                        f"Translate from {source_lang} to {target_lang}. "
                        f"Style: {instruction}\n\n"
                        f"Text: \"{text}\"\n\n"
                        f"Return only the translation, nothing else."
                    )
                }]
            )
            results[style_name] = response.content[0].text.strip().strip('"')
        
        return {
            "original": text,
            "source_language": source_lang,
            "target_language": target_lang,
            "translations": results
        }
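
The fence-splitting logic in `translate()` works when the model wraps its answer in a code fence, but it falls back to the raw text as soon as the JSON is surrounded by any other prose. A hedged alternative, assuming the reply contains one JSON object somewhere in it: scan for the first position where a complete object parses. `extract_json` is a hypothetical helper, not part of the class above.

```python
import json
from typing import Optional


def extract_json(raw: str) -> Optional[dict]:
    """Return the first JSON object found in raw model output, or None."""
    decoder = json.JSONDecoder()
    # Try each "{" as a possible start; raw_decode ignores trailing text
    start = raw.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = raw.find("{", start + 1)
    return None
```

When it returns `None`, the existing fallback dictionary (raw text plus a parsing note) still applies.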

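`translate_batch` sends one request at a time, so a batch of 20 texts takes roughly 20 times as long as a single call. A minimal sketch of running the calls in a thread pool, assuming the API client can be shared across threads and your rate limit allows a few parallel requests; `translate_concurrently` and `max_workers=4` are illustrative choices, not part of the original code:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def translate_concurrently(
    translate_one: Callable[[str], dict],
    texts: list[str],
    max_workers: int = 4,
) -> list[dict]:
    """Run translate_one over texts in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs
        return list(pool.map(translate_one, texts))
```

With the class above it could be called as `translate_concurrently(lambda t: translator.translate(t, "Spanish"), texts)`. An exception in any single call propagates to the caller, so wrap `translate_one` if you prefer per-item error results.
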
The Web API

# app.py
"""Flask API for the translation service."""

import os
from flask import Flask, request, jsonify, render_template_string
from dotenv import load_dotenv
from translator import AITranslator

load_dotenv()

app = Flask(__name__)
translator = AITranslator(api_key=os.getenv('ANTHROPIC_API_KEY'))

SUPPORTED_LANGUAGES = [
    "English", "Spanish", "French", "German", "Italian",
    "Portuguese", "Japanese", "Korean", "Chinese (Simplified)",
    "Chinese (Traditional)", "Arabic", "Hindi", "Russian",
    "Dutch", "Swedish", "Polish", "Turkish", "Thai",
    "Vietnamese", "Indonesian"
]


@app.route('/')
def index():
    return render_template_string(HTML_TEMPLATE, languages=SUPPORTED_LANGUAGES)


@app.route('/api/translate', methods=['POST'])
def api_translate():
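    # silent=True returns None for a missing or malformed JSON body,
    # so the check below can answer with our own 400 message
    data = request.get_json(silent=True)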
    
    if not data or 'text' not in data or 'target_lang' not in data:
        return jsonify({'error': 'Missing required fields: text, target_lang'}), 400
    
    result = translator.translate(
        text=data['text'],
        target_lang=data['target_lang'],
        source_lang=data.get('source_lang', 'auto'),
        context=data.get('context', ''),
        domain=data.get('domain', 'general'),
        formality=data.get('formality', 'neutral')
    )
    
    return jsonify(result)


@app.route('/api/compare', methods=['POST'])
def api_compare():
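    # silent=True returns None for a missing or malformed JSON body,
    # so the check below can answer with our own 400 message
    data = request.get_json(silent=True)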
    
    if not data or 'text' not in data or 'target_lang' not in data:
        return jsonify({'error': 'Missing required fields: text, target_lang'}), 400
    
    result = translator.compare_translations(
        text=data['text'],
        target_lang=data['target_lang'],
        source_lang=data.get('source_lang', 'auto')
    )
    
    return jsonify(result)


HTML_TEMPLATE = """
<!DOCTYPE html>
<html>
<head>
    <title>AI Translator</title>
    <style>
        * { margin: 0; padding: 0; box-sizing: border-box; }
        body { font-family: system-ui; max-width: 900px; margin: 0 auto; padding: 20px; background: #f5f5f5; }
        h1 { margin-bottom: 20px; }
        .container { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; }
        textarea { width: 100%; height: 200px; padding: 12px; border: 1px solid #ddd; border-radius: 8px; font-size: 15px; resize: vertical; }
        select, button { padding: 10px 16px; border-radius: 8px; font-size: 14px; }
        select { border: 1px solid #ddd; background: white; }
        button { background: #2563eb; color: white; border: none; cursor: pointer; }
        button:hover { background: #1d4ed8; }
        .controls { display: flex; gap: 12px; margin: 12px 0; flex-wrap: wrap; align-items: center; }
        .result { background: white; padding: 16px; border-radius: 8px; border: 1px solid #ddd; margin-top: 8px; }
        .notes { margin-top: 12px; padding: 12px; background: #f0f9ff; border-radius: 6px; font-size: 13px; }
        .note-item { margin: 4px 0; color: #555; }
    </style>
</head>
<body>
    <h1>AI Translator</h1>
    <div class="controls">
        <select id="source-lang"><option value="auto">Auto-detect</option>
        {% for lang in languages %}<option value="{{ lang }}">{{ lang }}</option>{% endfor %}
        </select>
        <span>→</span>
        <select id="target-lang">
        {% for lang in languages %}<option value="{{ lang }}" {{ 'selected' if lang == 'Spanish' }}>{{ lang }}</option>{% endfor %}
        </select>
        <select id="domain">
            <option value="general">General</option>
            <option value="medical">Medical</option>
            <option value="legal">Legal</option>
            <option value="tech">Technical</option>
            <option value="marketing">Marketing</option>
            <option value="academic">Academic</option>
        </select>
        <select id="formality">
            <option value="neutral">Neutral</option>
            <option value="formal">Formal</option>
            <option value="casual">Casual</option>
        </select>
        <button onclick="translate()">Translate</button>
    </div>
    <div class="container">
        <div>
            <textarea id="input" placeholder="Enter text to translate..."></textarea>
        </div>
        <div>
            <textarea id="output" placeholder="Translation will appear here..." readonly></textarea>
            <div id="notes-area" class="notes" style="display:none"></div>
        </div>
    </div>
    <script>
    async function translate() {
        const text = document.getElementById('input').value;
        if (!text) return;
        document.getElementById('output').value = 'Translating...';
        const res = await fetch('/api/translate', {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify({
                text: text,
                target_lang: document.getElementById('target-lang').value,
                source_lang: document.getElementById('source-lang').value,
                domain: document.getElementById('domain').value,
                formality: document.getElementById('formality').value
            })
        });
        const data = await res.json();
        document.getElementById('output').value = data.translation || data.error;
        const notesArea = document.getElementById('notes-area');
        if (data.notes && data.notes.length > 0) {
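            // Build note nodes with textContent so model output is never parsed as HTML
            notesArea.innerHTML = '<strong>Notes:</strong>';
            data.notes.forEach(n => {
                const item = document.createElement('div');
                item.className = 'note-item';
                item.textContent = '• ' + n;
                notesArea.appendChild(item);
            });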
            notesArea.style.display = 'block';
        } else { notesArea.style.display = 'none'; }
    }
    </script>
</body>
</html>
"""


if __name__ == '__main__':
    # debug=True enables the reloader and interactive debugger; local development only
    app.run(debug=True, port=5000)
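
The endpoints above only check that `text` and `target_lang` are present, and every field is then interpolated straight into the prompt. A sketch of stricter validation, assuming we want to restrict values to the options the frontend offers and cap input size; `validate_payload` and the 5000-character limit are assumptions, not part of the original app:

```python
ALLOWED_DOMAINS = ("general", "medical", "legal", "tech", "marketing", "academic")
ALLOWED_FORMALITY = ("formal", "neutral", "casual")
MAX_TEXT_CHARS = 5000  # assumed limit; tune it to your token budget


def validate_payload(data, supported_languages):
    """Return (cleaned, None) if the payload is usable, else (None, error)."""
    if not isinstance(data, dict):
        return None, "Request body must be a JSON object"

    text = data.get("text")
    if not isinstance(text, str) or not text.strip():
        return None, "Field 'text' must be a non-empty string"
    if len(text) > MAX_TEXT_CHARS:
        return None, f"Field 'text' is longer than {MAX_TEXT_CHARS} characters"

    target_lang = data.get("target_lang")
    if target_lang not in supported_languages:
        return None, "Unsupported target_lang"

    source_lang = data.get("source_lang", "auto")
    if source_lang != "auto" and source_lang not in supported_languages:
        return None, "Unsupported source_lang"

    domain = data.get("domain", "general")
    if domain not in ALLOWED_DOMAINS:
        return None, "Unsupported domain"

    formality = data.get("formality", "neutral")
    if formality not in ALLOWED_FORMALITY:
        return None, "Unsupported formality"

    cleaned = {
        "text": text,
        "target_lang": target_lang,
        "source_lang": source_lang,
        "context": str(data.get("context", "")),
        "domain": domain,
        "formality": formality,
    }
    return cleaned, None
```

The keys of `cleaned` match the parameters of `translate()`, so a route could call `translator.translate(**cleaned)` after a successful check and return the error string with a 400 status otherwise.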

Testing: Where AI Translation Wins

Idiom Handling

English: "Break a leg!"
Google Translate → Spanish: "¡Rómpete una pierna!" (literal - sounds violent)
AI Translator → Spanish: "¡Mucha mierda!" (actual Spanish theater idiom)
                Note: "Spanish theater tradition uses 'mucha mierda' 
                as the equivalent good luck expression"

Domain-Specific Terminology

English (medical): "The patient presents with acute myocardial infarction 
with ST-segment elevation in leads V1-V4."

Google Translate → Japanese: [correct medical terms but awkward phrasing]
AI Translator → Japanese: [correct terms with proper Japanese medical 
                register and standard clinical phrasing]
                Note: "Used standard Japanese cardiology terminology 
                per JCS guidelines"

Formality Levels

English: "Could you send me the report?"

AI Translator → Japanese (formal): 
  "レポートをお送りいただけますでしょうか。"
  (keigo / very polite business Japanese)

AI Translator → Japanese (casual):
  "レポート送ってくれる?"
  (casual friend-to-friend)

Google Translate → Japanese:
  "レポートを送ってもらえますか?"
  (one option, mid-formality, often inappropriate)

Cost Analysis

Per translation (average 200 words):
- Input: ~300 tokens
- Output: ~500 tokens
- Cost (Claude Sonnet): ~$0.009

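Comparison (200 English words is roughly 1,200 characters):
- Google Translate API: $20 per million characters (~$0.024 per translation)
- DeepL API: $25 per million characters (~$0.03 per translation)
- This AI translator: ~$0.009 per translation

At these list prices the LLM call costs no more than the per-character APIs for a text of this size, and it also provides: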
- Context-aware translation
- Cultural adaptation
- Domain terminology
- Formality control
- Translation notes
- Multiple alternatives
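
The per-translation figure for the LLM can be reproduced with a few lines of arithmetic. The sketch below assumes prices of $3 per million input tokens and $15 per million output tokens for Claude Sonnet; treat both as assumptions and check the current price list before budgeting. The function names are illustrative.

```python
def llm_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float = 3.0,    # assumed USD per million input tokens
    output_price_per_mtok: float = 15.0,  # assumed USD per million output tokens
) -> float:
    """Cost in USD of one LLM call, billed per token."""
    return (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000


def per_character_cost(characters: int, price_per_million_chars: float) -> float:
    """Cost in USD of one call to an API billed per character."""
    return characters * price_per_million_chars / 1_000_000


if __name__ == "__main__":
    # 300 input tokens + 500 output tokens at the assumed prices -> 0.0084
    print(f"LLM call: ${llm_cost(300, 500):.4f}")
```

Output tokens dominate the LLM cost, so trimming the `alternatives` and `notes` fields from the response is the quickest way to lower it. For the per-character APIs, pass the actual character count of your text, since that is what they bill.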

When to Use What

Scenario | Best Tool
Quick understanding of foreign text | Google Translate (free, fast)
Bulk translation of simple content | DeepL API (cost-effective, good quality)
Business documents and contracts | This AI translator (context, formality)
Marketing copy localization | This AI translator (cultural adaptation)
Medical/legal translation | This AI translator (domain expertise)
Real-time conversation | Google Translate (speed)
Website localization (1000+ pages) | DeepL + human review (cost at scale)

The AI translation app we built handles the cases where nuance matters — and those are exactly the cases where cheap, fast translation fails most dangerously. A mistranslated marketing slogan is embarrassing. A mistranslated medical instruction is dangerous. A mistranslated legal clause is expensive.

Build for the cases that matter. Leave the menu translations to Google.
