
A Practical Glossary Format for Django Developers

2026-06-12 10 min read

Meta description: Your Django app doesn't need a giant termbase. Use a practical glossary format that stays in Git, guides translations, and survives releases.

You ship a feature on Friday. In locale/fr/LC_MESSAGES/django.po, one string translates Subscription as Abonnement. Two weeks later, another PR translates Manage Subscription as Gérer le Plan. Nobody notices until support screenshots both screens side by side.

That's the core problem a glossary format has to solve. It isn't documentation polish. It determines whether your product language stays stable when multiple developers, reviewers, and translation tools touch the same app over time.

makemessages extracts strings. It does not protect terminology. String-by-string translation is where drift starts, especially when short UI labels have weak context and different people interpret them differently.

When Your Translations Become Inconsistent

A lot of teams assume gettext is enough because the extraction side works. Django finds the strings, writes the .po files, and gives you a place to review diffs. That part is fine. The weak point is term control.

If your app has billing, permissions, onboarding, and admin UI, you already have words that must stay stable. Plan, subscription, workspace, owner, seat, member. If those drift, users feel it immediately.

Practical rule: if a term appears in more than one screen and carries product meaning, it belongs in a glossary before it belongs in translation review.

The gap is wider than most teams assume. A 2025 LISA study found that 78% of localization bottlenecks stem from inconsistent terminology management, yet only 12% of technical documentation includes versioned, machine-readable glossary schemas. That's exactly the mismatch most Django teams live with. The problem is common. The tooling and habits are behind.

Where drift actually comes from

It usually isn't one bad translator. It's normal engineering work:

Teams often try to fix this with translation memory alone. That helps with repeated segments, but it doesn't replace a decision about preferred terminology. If you want the difference in plain terms, this translation memory explainer is worth reading.

What works better

You need one source of truth for terms that are easy to mistranslate and expensive to keep correcting later.

Not a giant enterprise termbase. Not a vendor portal your team never opens.

A file in your repo, reviewed like code, with just enough structure to guide humans and tools.

The Simplest Glossary Format: a Markdown File

Start with a file your team will maintain. For most Django projects, that file is TRANSLATING.md at the repo root.

[Figure: a project file structure with the TRANSLATING.md file highlighted.]

Markdown works because it has zero adoption cost. Git already versions it. Reviewers can read it in a PR. Your team doesn't need extra software to edit it.

A glossary also doesn't need to be huge to be useful. A plain-language glossary built for methodology researchers had 64 terms and still standardized language for its audience, as noted in the plain-language glossary study. That matches what works in product teams. Start with the terms that keep causing churn.

A copy-pasteable Markdown pattern

# Translating Guide

## Preferred product terms

### Subscription
- Preferred French: Abonnement
- Context: Billing UI, invoices, customer portal
- Notes: Use for the paid recurring product relationship. Do not switch to "forfait" unless the source term is "plan".

### Plan
- Preferred French: Forfait
- Context: Pricing page, upgrade flow
- Notes: Refers to pricing tier, not the billing relationship.

### Workspace
- Preferred French: Espace de travail
- Context: Team collaboration features
- Notes: Never translate as "bureau".

### Owner
- Preferred French: Propriétaire
- Context: Team roles and permissions
- Notes: Role name in access control UI.

## Strings with placeholders

### Welcome, %(name)s
- Preferred French: Bienvenue, %(name)s
- Context: Dashboard greeting
- Notes: Preserve %(name)s exactly.

### {count} seat
- Preferred French: {count} place
- Context: Billing and team size
- Notes: Preserve {count} exactly. Check plural handling in the .po entry.

## Do not translate

- TranslateBot
- Django
- DeepL
- GPT-4o-mini

Why Markdown is enough at first

A good Markdown glossary gives you a few things immediately:

Keep the first glossary boring. If your team needs a meeting to understand the file, the format is already too heavy.

What doesn't work is stuffing every possible phrase into it. A glossary should capture preferred terms, tricky strings, and forbidden translations. Leave ordinary sentences to the .po files.

Structured Glossary Formats: JSON and CSV

Markdown is good for adoption. It's weaker when you want validation, automation, or importing terms into scripts. That's when JSON or CSV starts to pay off.

The trade-off is obvious. More structure gives you better tooling, but it also raises the maintenance burden. If your team won't keep the file current, a perfect schema is useless.

JSON when you want validation

JSON is the better choice if you want CI checks, programmatic lookups, or a predictable schema.

[
  {
    "term": "Subscription",
    "locale": "fr",
    "translation": "Abonnement",
    "context": "Billing UI, invoices, customer portal",
    "notes": "Use for the recurring paid relationship.",
    "do_not_translate": false
  },
  {
    "term": "Plan",
    "locale": "fr",
    "translation": "Forfait",
    "context": "Pricing page and upgrade flow",
    "notes": "Tier name, not the subscription itself.",
    "do_not_translate": false
  },
  {
    "term": "Welcome, %(name)s",
    "locale": "fr",
    "translation": "Bienvenue, %(name)s",
    "context": "Dashboard greeting",
    "notes": "Preserve %(name)s exactly.",
    "do_not_translate": false
  }
]

JSON is strict. That's good in CI. It's less pleasant for non-technical editors.
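Here is roughly what a CI check on that schema could look like. It is a sketch under assumptions: the field list mirrors the example entries above, and `validate_glossary` and `REQUIRED_FIELDS` are hypothetical names, not part of Django or any tool.

```python
import json

# Field names and types taken from the example entries above.
REQUIRED_FIELDS = {
    "term": str,
    "locale": str,
    "translation": str,
    "context": str,
    "notes": str,
    "do_not_translate": bool,
}


def validate_glossary(raw_json):
    """Return a list of human-readable problems; an empty list means it passed."""
    try:
        entries = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(entries, list):
        return ["top-level value must be a list of entries"]

    problems = []
    seen = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected an object")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            if field not in entry:
                problems.append(f"entry {index}: missing '{field}'")
            elif not isinstance(entry[field], expected):
                problems.append(f"entry {index}: '{field}' must be {expected.__name__}")
        # The same term defined twice for one locale is a drift risk of its own.
        term, locale = entry.get("term"), entry.get("locale")
        if isinstance(term, str) and isinstance(locale, str):
            if (term, locale) in seen:
                problems.append(f"entry {index}: duplicate term/locale {(term, locale)}")
            seen.add((term, locale))
    return problems
```

In a pipeline you would read `glossary.json`, print whatever the function returns, and exit non-zero if the list is not empty.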

CSV when editors live in spreadsheets

CSV is useful when a product manager, translator, or reviewer wants to edit terms in Excel or Google Sheets.

term,locale,translation,context,notes,do_not_translate
Subscription,fr,Abonnement,"Billing UI, invoices, customer portal","Use for the recurring paid relationship.",false
Plan,fr,Forfait,"Pricing page and upgrade flow","Tier name, not the subscription itself.",false
"Welcome, %(name)s",fr,"Bienvenue, %(name)s","Dashboard greeting","Preserve %(name)s exactly.",false
TranslateBot,fr,TranslateBot,"Brand name","Never translate.",true

CSV is easy to share. It's also easy to break with commas, quotes, and line endings if nobody validates it.
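A small loader catches most of that breakage before it reaches a translator. The sketch below assumes the exact header from the example above; `load_glossary_csv` is a hypothetical helper built on the standard `csv` module.

```python
import csv
import io

EXPECTED_HEADER = ["term", "locale", "translation", "context", "notes", "do_not_translate"]


def load_glossary_csv(text):
    """Parse glossary CSV text into a list of dicts, failing loudly on bad rows."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    entries = []
    for row in reader:
        # DictReader stores extra cells under the None key and fills
        # missing cells with None. Both usually mean an unquoted comma
        # or a truncated line.
        if None in row or any(value is None for value in row.values()):
            raise ValueError(f"line {reader.line_num}: wrong number of columns")
        flag = row["do_not_translate"].strip().lower()
        if flag not in ("true", "false"):
            raise ValueError(f"line {reader.line_num}: do_not_translate must be true or false")
        row["do_not_translate"] = flag == "true"
        entries.append(row)
    return entries
```

The most common failure is a context value like Billing UI, invoices saved without quotes. That shifts every later column, and the column-count check is what flags it.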

Comparison of Glossary File Formats

| Format | Best For | Pros | Cons |
| --- | --- | --- | --- |
| Markdown | Small teams, early adoption | Readable in Git, low friction, good for notes | Harder to validate automatically |
| JSON | Engineering-led workflows | Strong schema, easy CI validation, good for tooling | Less friendly for manual editing |
| CSV | Spreadsheet-heavy collaboration | Easy bulk editing, import/export friendly | Weak structure, easier to corrupt |

If you're deciding between them, it helps to think in terms of writing constraints and audience. Documind's guide to writing formats is useful here because it frames format as a function of use, not preference. That's exactly how you should pick a glossary file.

Don't choose the most advanced format. Choose the most advanced format your team will still update during a release week.

Anatomy of a Glossary Entry

File format matters less than entry format. Most broken glossaries fail because each term is written differently, with missing context and no ownership.

[Figure: the anatomy of a glossary entry, showing term, definition, category, slug, and source.]

The entry should be standardized. That's how you make it reviewable and auditable over time. The business glossary guidance from Decube recommends capturing metadata like definition source, last update date, related terms, ownership, and usage examples. For product localization, that's the right instinct even if your file stays lightweight.

Fields worth keeping

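At minimum, each entry should have the fields the examples above already use: the source term, the target locale, the preferred translation, the context where the term appears, and usage notes.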

After that, add metadata that helps review, such as an owner, a last-updated date, related terms, and the source of the definition.

For a broader terminology reference your team can align on, keep a product-facing companion like a glossary of terms for localization work.
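One way to keep entries uniform is to write the shape down once in code. The dataclass below is only a sketch: the field names follow the examples in this post, while the class name `GlossaryEntry` and the specific rules in `__post_init__` are assumptions you should adapt.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class GlossaryEntry:
    # Core fields: without these the entry cannot guide a translation.
    term: str
    locale: str
    translation: str
    context: str
    notes: str = ""
    do_not_translate: bool = False
    # Review metadata: optional, but useful once several teams edit the file.
    owner: Optional[str] = None
    last_updated: Optional[date] = None
    related_terms: tuple = ()

    def __post_init__(self):
        if not self.term.strip():
            raise ValueError("term must not be empty")
        if not self.do_not_translate and not self.translation.strip():
            raise ValueError(f"{self.term!r} needs a translation")
        if not self.context.strip():
            # Short labels are ambiguous without context, so require it.
            raise ValueError(f"{self.term!r} needs context")
```

Making context mandatory is a deliberate choice here. It forces whoever adds Owner to say which screen it belongs to.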

Placeholder-safe entries

Glossary entries should preserve placeholders exactly. If your glossary can't represent that rule clearly, it's not ready for Django strings.

#: billing/views.py:42
#, python-format
msgid "Welcome, %(name)s"
msgstr "Bienvenue, %(name)s"

#: teams/templates/teams/summary.html:18
msgid "{0} seats available"
msgstr "{0} places disponibles"

A few rules matter here:

Short labels are where context earns its keep. Owner in billing might map differently than owner in object storage or document metadata. The term alone isn't enough.
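The placeholder rule is easy to check mechanically. The sketch below compares the placeholders in a source string with those in its translation. It is a heuristic, not a full format-string parser: it only recognizes `%(name)s`-style placeholders with the `s`, `d`, `i`, and `f` conversions, plus brace placeholders like `{0}` or `{count}`. The function names are hypothetical.

```python
import re
from collections import Counter

PLACEHOLDER = re.compile(
    r"%\([A-Za-z_][A-Za-z0-9_]*\)[sdif]"  # %(name)s style
    r"|\{[A-Za-z0-9_]*\}"                  # {0}, {count}, {} style
)


def placeholders(text):
    """Count each placeholder occurrence in the text."""
    return Counter(PLACEHOLDER.findall(text))


def placeholder_mismatch(source, translation):
    """Return (missing, unexpected) placeholders in the translation."""
    src, dst = placeholders(source), placeholders(translation)
    missing = sorted((src - dst).elements())
    unexpected = sorted((dst - src).elements())
    return missing, unexpected
```

A translated or renamed placeholder such as `%(nom)s` shows up as one missing and one unexpected item, which is exactly the mistake that breaks a page at render time.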

Versioning Your Glossary with Git and CI

If the glossary affects shipped text, it belongs in the same lifecycle as code. That means Git, pull requests, validation, and release checks.

[Figure: a six-step process for versioning a glossary using Git and CI workflows.]

Teams that treat glossary edits as side notes get side-note quality. Teams that review terminology changes in PRs catch drift before it lands in production.

The logic isn't new. The OECD glossary was created to support consistent data collection and to highlight inconsistencies between standards, according to the PARIS21 glossary reference. Different domain, same lesson. Formal glossaries exist to keep shared language stable at scale.

What a workable flow looks like

A practical flow looks like this:

  1. Add or update terms in TRANSLATING.md, glossary.json, or glossary.csv.
  2. Commit glossary changes in the same branch as feature strings.
  3. Open a PR that shows both code and terminology diffs.
  4. Run CI checks on glossary structure.
  5. Regenerate or review translations before deploy.

For teams already thinking about prompt and automation governance, Prompt Builder's piece on streamlining prompt development is a good parallel. Prompt files and glossary files have the same maintenance problem. They drift unless versioning is part of the workflow.

Here's the kind of validation step that pays off fast:

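# Re-extract strings so the French catalog matches the code
python manage.py makemessages --locale=fr
# Fail fast if glossary.json is not valid JSON
python -m json.tool glossary.json > /dev/null
# Compile catalogs; malformed .po entries surface here
python manage.py compilemessages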

That won't catch semantic mistakes, but it does catch broken structure before someone merges garbage into main.
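You can get partway toward semantic checks with a simple heuristic: if a source string contains a glossary term, the translation should contain the preferred target term. The sketch below works on plain `(msgid, msgstr)` pairs. `find_term_drift` is a hypothetical helper, and substring matching on the target side is a deliberate simplification that tolerates simple plurals but will miss other inflections.

```python
import re


def find_term_drift(po_entries, glossary):
    """Return (msgid, msgstr, term, expected) for each suspicious entry.

    po_entries: iterable of (msgid, msgstr) pairs.
    glossary: mapping of source term -> preferred translation for one locale.
    """
    drift = []
    for msgid, msgstr in po_entries:
        if not msgstr:
            continue  # untranslated entries are a different problem
        for term, expected in glossary.items():
            # Whole-word match so "Plan" does not fire on "Planning".
            pattern = r"\b" + re.escape(term) + r"\b"
            term_in_source = re.search(pattern, msgid, re.IGNORECASE)
            if term_in_source and expected.casefold() not in msgstr.casefold():
                drift.append((msgid, msgstr, term, expected))
    return drift
```

With a glossary of `{"Subscription": "Abonnement"}`, the pair `("Manage Subscription", "Gérer le Plan")` from the opening example gets flagged. To feed it real data, one option is the third-party polib package, whose entries expose `msgid` and `msgstr`. Treat hits as review prompts, not hard failures.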

A short explainer on what i18n means in real app development is useful if you need to align the team on why this belongs in the release process at all.

Review rules that actually help

Review glossary diffs like API changes. A renamed term can have user-facing impact in five screens and three emails.

Use a few hard rules:

Later, if you automate translation in CI, the glossary becomes an active input instead of a passive note.
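What "active input" means depends on your tooling. As one illustration, the sketch below turns glossary entries into plain-text instructions you could hand to a translator or an automated translation step. The function name and the wording of the output are assumptions, not the interface of any particular tool.

```python
def glossary_instructions(entries, locale):
    """Render glossary entries for one locale as plain-text terminology rules."""
    lines = [f"Terminology rules for locale '{locale}':"]
    for entry in entries:
        if entry["locale"] != locale:
            continue  # skip terms defined for other languages
        if entry["do_not_translate"]:
            lines.append(f'- Keep "{entry["term"]}" unchanged.')
        else:
            lines.append(
                f'- Translate "{entry["term"]}" as "{entry["translation"]}" '
                f'({entry["context"]}).'
            )
    return "\n".join(lines)
```

The entries here are dicts with the same keys as the JSON example, so the output of a JSON or CSV loader can be passed in directly.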


Your Next Step: a Ready-to-Use Glossary Template

If you've got no glossary today, don't overdesign it. Add one file. Put it in Git. Start with the terms that already cause review comments.

Use this as your first TRANSLATING.md:

# Translating Guide

## Product terms

### Subscription
- Preferred translation (fr): Abonnement
- Context: Billing, invoices, renewal emails
- Notes: The recurring customer relationship. Not the pricing tier.
- Owner: Billing team
- Last updated: 2026-06-12

### Plan
- Preferred translation (fr): Forfait
- Context: Pricing page, upgrade modal
- Notes: Pricing tier only. Do not use as a synonym for Subscription.
- Owner: Growth team
- Last updated: 2026-06-12

### Workspace
- Preferred translation (fr): Espace de travail
- Context: Team collaboration UI
- Notes: Never translate as bureau.
- Owner: Core product
- Last updated: 2026-06-12

## Placeholder-sensitive strings

### Welcome, %(name)s
- Preferred translation (fr): Bienvenue, %(name)s
- Context: Dashboard greeting
- Notes: Preserve %(name)s exactly.

### {0} seats available
- Preferred translation (fr): {0} places disponibles
- Context: Team billing summary
- Notes: Preserve {0} exactly.

## Do not translate

- Django
- TranslateBot
- DeepL
- GPT-4o-mini
- LocaleMiddleware

Then wire it into the workflow your team already uses. If your translation command supports a glossary file, the interface should look like this:

python manage.py translate --glossary TRANSLATING.md --locale fr

That's enough to stop a lot of avoidable drift. You can move to JSON later if CI needs stricter validation.


If you want a developer-native way to use a versioned TRANSLATING.md file with Django .po files, TranslateBot is built for that workflow. It keeps translations in your repo, preserves placeholders and HTML, and fits next to makemessages and compilemessages instead of sending your team into a separate TMS.
