A Practical Glossary Format for Django Developers

Meta description: Your Django app doesn't need a giant termbase. Use a practical glossary format that stays in Git, guides translations, and survives releases.

You ship a feature on Friday. In locale/fr/LC_MESSAGES/django.po, one string translates Subscription as Abonnement. Two weeks later, another PR translates Manage Subscription as Gérer le Plan. Nobody notices until support screenshots both screens side by side.

That's the core problem a glossary format has to solve. It isn't documentation polish. It's whether your product language stays stable when multiple developers, reviewers, and translation tools touch the same app over time.

makemessages extracts strings. It does not protect terminology. String-by-string translation is where drift starts, especially when short UI labels have weak context and different people interpret them differently.

When Your Translations Become Inconsistent

A lot of teams assume gettext is enough because the extraction side works. Django finds the strings, writes the .po files, and gives you a place to review diffs. That part is fine. The weak point is term control.

If your app has billing, permissions, onboarding, and admin UI, you already have words that must stay stable. Plan, subscription, workspace, owner, seat, member. If those drift, users feel it immediately.

Practical rule: if a term appears in more than one screen and carries product meaning, it belongs in a glossary before it belongs in translation review.

The gap is wider than most teams assume. A 2025 study attributed to LISA reports that 78% of localization bottlenecks stem from inconsistent terminology management, yet only 12% of technical documentation includes versioned, machine-readable glossary schemas. That's exactly the mismatch most Django teams live with. The problem is common. The tooling and habits are behind.

Where drift actually comes from

It usually isn't one bad translator. It's normal engineering work:

New feature labels: a developer writes _("Plan") in one view and _("Subscription") in another.
Missing context: short strings don't explain whether archive is a noun or a verb.
PR isolation: reviewers read one diff, not the whole product vocabulary.
AI output variance: the model picks a plausible translation that doesn't match prior usage.
Contractor handoff: outside translators don't know your product semantics.

Teams often try to fix this with translation memory alone. That helps with repeated segments, but it doesn't replace a decision about preferred terminology. If you want the difference in plain terms, this translation memory explainer is worth reading.

What works better

You need one source of truth for terms that are easy to mistranslate and expensive to keep correcting later.

Not a giant enterprise termbase. Not a vendor portal your team never opens.

A file in your repo, reviewed like code, with just enough structure to guide humans and tools.

The Simplest Glossary Format: a Markdown File

Start with a file your team will maintain. For most Django projects, that file is TRANSLATING.md at the repo root.

[Figure: a project file structure with the TRANSLATING.md file highlighted.]

Markdown works because it has zero adoption cost. Git already versions it. Reviewers can read it in a PR. Your team doesn't need extra software to edit it.

A glossary also doesn't need to be huge to be useful. A plain-language glossary built for methodology researchers had 64 terms and still standardized language for its audience, as noted in the plain-language glossary study. That matches what works in product teams. Start with the terms that keep causing churn.

A copy-pasteable Markdown pattern

# Translating Guide

## Preferred product terms

### Subscription
- Preferred French: Abonnement
- Context: Billing UI, invoices, customer portal
- Notes: Use for the paid recurring product relationship. Do not switch to "forfait" unless the source term is "plan".

### Plan
- Preferred French: Forfait
- Context: Pricing page, upgrade flow
- Notes: Refers to pricing tier, not the billing relationship.

### Workspace
- Preferred French: Espace de travail
- Context: Team collaboration features
- Notes: Never translate as "bureau".

### Owner
- Preferred French: Propriétaire
- Context: Team roles and permissions
- Notes: Role name in access control UI.

## Strings with placeholders

### Welcome, %(name)s
- Preferred French: Bienvenue, %(name)s
- Context: Dashboard greeting
- Notes: Preserve %(name)s exactly.

### {count} seat
- Preferred French: {count} place
- Context: Billing and team size
- Notes: Preserve {count} exactly. Check plural handling in the .po entry.

## Do not translate

- TranslateBot
- Django
- DeepL
- GPT-4o-mini

Why Markdown is enough at first

A good Markdown glossary gives you a few things immediately:

Readable in review: developers can scan it without opening another tool.
Versioned by default: every terminology change gets a commit history.
Close to code: it lives next to locale/, not in a separate portal.
Usable by humans first: that matters more than schema purity early on.

Keep the first glossary boring. If your team needs a meeting to understand the file, the format is already too heavy.

What doesn't work is stuffing every possible phrase into it. A glossary should capture preferred terms, tricky strings, and forbidden translations. Leave ordinary sentences to the .po files.
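
Even a plain Markdown glossary can be read by a script if the layout stays regular. The following is a minimal sketch, assuming entries follow the pattern above: a `### Term` heading followed by `- Key: value` bullets. The function name `parse_glossary` and the regular expressions are illustrative, not part of Django or any particular tool.

```python
import re

# "### Subscription" style headings introduce one glossary entry each.
HEADING = re.compile(r"^###\s+(?P<term>.+?)\s*$")
# "- Context: Billing UI" style bullets carry the entry's fields.
FIELD = re.compile(r"^-\s+(?P<key>[^:]+):\s*(?P<value>.*)$")


def parse_glossary(markdown_text):
    """Return {term: {field: value}} for every '### Term' entry in the text."""
    entries = {}
    current = None
    for line in markdown_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = entries.setdefault(heading.group("term"), {})
            continue
        if line.startswith("## ") or line.startswith("# "):
            # A higher-level heading ends the current entry, so bullets in
            # sections like "Do not translate" are not attached to a term.
            current = None
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group("key").strip()] = field.group("value").strip()
    return entries


SAMPLE = """\
# Translating Guide

## Preferred product terms

### Subscription
- Preferred French: Abonnement
- Context: Billing UI, invoices, customer portal

### Plan
- Preferred French: Forfait
- Context: Pricing page, upgrade flow

## Do not translate

- Django
"""

glossary = parse_glossary(SAMPLE)
print(glossary["Subscription"]["Preferred French"])  # Abonnement
```

The parser is deliberately forgiving: anything it doesn't recognize is ignored, so free-form notes elsewhere in the file stay harmless.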

Structured Glossary Formats: JSON and CSV

Markdown is good for adoption. It's weaker when you want validation, automation, or importing terms into scripts. That's when JSON or CSV starts to pay off.

The trade-off is obvious. More structure gives you better tooling, but it also raises the maintenance burden. If your team won't keep the file current, a perfect schema is useless.

JSON when you want validation

JSON is the better choice if you want CI checks, programmatic lookups, or a predictable schema.

[
  {
    "term": "Subscription",
    "locale": "fr",
    "translation": "Abonnement",
    "context": "Billing UI, invoices, customer portal",
    "notes": "Use for the recurring paid relationship.",
    "do_not_translate": false
  },
  {
    "term": "Plan",
    "locale": "fr",
    "translation": "Forfait",
    "context": "Pricing page and upgrade flow",
    "notes": "Tier name, not the subscription itself.",
    "do_not_translate": false
  },
  {
    "term": "Welcome, %(name)s",
    "locale": "fr",
    "translation": "Bienvenue, %(name)s",
    "context": "Dashboard greeting",
    "notes": "Preserve %(name)s exactly.",
    "do_not_translate": false
  }
]

JSON is strict. That's good in CI. It's less pleasant for non-technical editors.
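
That strictness can be turned into a CI check with the standard library alone. Here is a rough sketch, assuming the field names from the example above; the `validate_glossary` function and its error messages are hypothetical, and a real project might prefer a JSON Schema validator instead.

```python
import json

# Field names taken from the example entries; adjust to your own schema.
REQUIRED_FIELDS = ("term", "locale", "translation", "context", "notes", "do_not_translate")


def validate_glossary(raw_json):
    """Return a list of problems found; an empty list means the text passed."""
    try:
        entries = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(entries, list):
        return ["top-level value must be a list of entries"]

    problems = []
    seen = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not an object")
            continue
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"entry {index}: missing '{field}'")
        if not isinstance(entry.get("do_not_translate", False), bool):
            problems.append(f"entry {index}: 'do_not_translate' must be true or false")
        # The same term should not be defined twice for one locale.
        key = (entry.get("term"), entry.get("locale"))
        if key in seen:
            problems.append(f"entry {index}: duplicate term/locale {key}")
        seen.add(key)
    return problems


# An entry with three required fields left out.
sample = '[{"term": "Plan", "locale": "fr", "translation": "Forfait"}]'
for problem in validate_glossary(sample):
    print(problem)
```

In a pipeline, you would read glossary.json from disk, call the function, and exit non-zero if the returned list is not empty.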

CSV when editors live in spreadsheets

CSV is useful when a product manager, translator, or reviewer wants to edit terms in Excel or Google Sheets.

term,locale,translation,context,notes,do_not_translate
Subscription,fr,Abonnement,"Billing UI, invoices, customer portal","Use for the recurring paid relationship.",false
Plan,fr,Forfait,"Pricing page and upgrade flow","Tier name, not the subscription itself.",false
"Welcome, %(name)s",fr,"Bienvenue, %(name)s","Dashboard greeting","Preserve %(name)s exactly.",false
TranslateBot,fr,TranslateBot,"Brand name","Never translate.",true

CSV is easy to share. It's also easy to break with commas, quotes, and line endings if nobody validates it.
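
A small loader can catch most of that breakage before it reaches translators. The sketch below assumes the six-column header shown above; `load_csv_glossary` is an illustrative name, and the checks are a starting point rather than a complete validator.

```python
import csv
import io

EXPECTED_HEADER = ["term", "locale", "translation", "context", "notes", "do_not_translate"]


def load_csv_glossary(csv_text):
    """Parse glossary CSV text into a list of dicts; raise ValueError if it is malformed."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    if reader.fieldnames != EXPECTED_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    entries = []
    for row in reader:
        # DictReader files surplus cells under the key None and fills
        # missing cells with None, so both cases are detectable here.
        if None in row or any(value is None for value in row.values()):
            raise ValueError(f"line {reader.line_num}: wrong number of columns")
        flag = row["do_not_translate"].strip().lower()
        if flag not in ("true", "false"):
            raise ValueError(f"line {reader.line_num}: do_not_translate must be true or false")
        row["do_not_translate"] = flag == "true"
        entries.append(row)
    return entries


SAMPLE_CSV = """\
term,locale,translation,context,notes,do_not_translate
Subscription,fr,Abonnement,"Billing UI, invoices, customer portal","Use for the recurring paid relationship.",false
"Welcome, %(name)s",fr,"Bienvenue, %(name)s","Dashboard greeting","Preserve %(name)s exactly.",false
TranslateBot,fr,TranslateBot,"Brand name","Never translate.",true
"""

entries = load_csv_glossary(SAMPLE_CSV)
print(entries[1]["term"])  # Welcome, %(name)s
```

An unquoted comma, the most common spreadsheet export accident, shows up as a wrong column count instead of silently shifting every field one cell to the right.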

Comparison of Glossary File Formats

Format	Best For	Pros	Cons
Markdown	Small teams, early adoption	Readable in Git, low friction, good for notes	Harder to validate automatically
JSON	Engineering-led workflows	Strong schema, easy CI validation, good for tooling	Less friendly for manual editing
CSV	Spreadsheet-heavy collaboration	Easy bulk editing, import/export friendly	Weak structure, easier to corrupt

If you're deciding between them, it helps to think in terms of writing constraints and audience. Documind's guide to writing formats is useful here because it frames format as a function of use, not preference. That's exactly how you should pick a glossary file.

Don't choose the most advanced format. Choose the most advanced format your team will still update during a release week.

Anatomy of a Glossary Entry

File format matters less than entry format. Most broken glossaries fail because each term is written differently, with missing context and no ownership.

[Figure: the anatomy of a glossary entry, including term, definition, category, slug, and source.]

The entry should be standardized. That's how you make it reviewable and auditable over time. The business glossary guidance from Decube recommends capturing metadata like definition source, last update date, related terms, ownership, and usage examples. For product localization, that's the right instinct even if your file stays lightweight.

Fields worth keeping

At minimum, each entry should have:

Term: the source phrase as it appears in code or .po.
Translation: the preferred target-language rendering.
Context: where the string appears in the app.
Notes: what distinction the translator must preserve.

After that, add metadata that helps review:

Source: who decided the term, or where it came from.
Last updated: useful when product language changes.
Owner: the person or team responsible for the decision.
Related terms: useful for pairs like plan and subscription.
Flags: for rules like do-not-translate.
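
One way to keep entries uniform is to give them a single shape in code. This is a sketch only: the `GlossaryEntry` class, its attribute names, and the default locale are assumptions made for illustration, mirroring the fields listed above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class GlossaryEntry:
    # Minimum fields every entry should carry.
    term: str
    translation: str
    context: str
    notes: str
    # Review metadata; all optional so a lightweight file stays valid.
    locale: str = "fr"
    source: str = ""
    owner: str = ""
    last_updated: Optional[date] = None
    related_terms: Tuple[str, ...] = ()
    do_not_translate: bool = False

    def __post_init__(self):
        if not self.term.strip():
            raise ValueError("term must not be empty")
        if not self.context.strip():
            raise ValueError(f"{self.term!r}: context is required")
        # Do-not-translate entries may leave the translation blank.
        if not self.do_not_translate and not self.translation.strip():
            raise ValueError(f"{self.term!r}: translation is required")


subscription = GlossaryEntry(
    term="Subscription",
    translation="Abonnement",
    context="Billing UI, invoices, customer portal",
    notes="Use for the recurring paid relationship.",
    owner="Billing team",
    last_updated=date(2026, 6, 12),
    related_terms=("Plan",),
)
print(subscription.related_terms)  # ('Plan',)
```

Making the class frozen means an entry can't be changed after loading, which fits the idea that terminology changes happen in the file and go through review.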

For a broader terminology reference your team can align on, keep a product-facing companion like a glossary of terms for localization work.

Placeholder-safe entries

Glossary entries should preserve placeholders exactly. If your glossary can't represent that rule clearly, it's not ready for Django strings.

#: billing/views.py:42
#, python-format
msgid "Welcome, %(name)s"
msgstr "Bienvenue, %(name)s"

#: teams/templates/teams/summary.html:18
msgid "{0} seats available"
msgstr "{0} places disponibles"

A few rules matter here:

Keep placeholder tokens unchanged: %(name)s, %s, {0}, {count}.
Record grammatical notes: some languages need a different sentence shape around the token.
Mark non-translatable tokens clearly: product names, code identifiers, and CLI commands.
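
The first rule is easy to check mechanically. The following is a simplified sketch, assuming only the placeholder styles listed above; the `PLACEHOLDER` pattern and the helper names are illustrative and do not cover every format specifier Python accepts (width, precision, and literal `%%` are ignored).

```python
import re
from collections import Counter

PLACEHOLDER = re.compile(
    r"%\([A-Za-z_][A-Za-z0-9_]*\)[sdifr]"  # named percent style: %(name)s
    r"|%[sdifr]"                            # positional percent style: %s, %d
    r"|\{[A-Za-z0-9_]*\}"                   # brace style: {0}, {count}, {}
)


def placeholders(text):
    """Count each placeholder token in the text."""
    return Counter(PLACEHOLDER.findall(text))


def placeholders_match(source, translation):
    """True when both strings contain the same tokens, in any order."""
    return placeholders(source) == placeholders(translation)


print(placeholders_match("Welcome, %(name)s", "Bienvenue, %(name)s"))  # True
print(placeholders_match("Welcome, %(name)s", "Bienvenue, %(nom)s"))   # False
```

Comparing counts rather than positions lets a translation reorder tokens, which some languages need, while still flagging a renamed or dropped one.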

Short labels are where context earns its keep. Owner in billing might map differently than owner in object storage or document metadata. The term alone isn't enough.

Versioning Your Glossary with Git and CI

If the glossary affects shipped text, it belongs in the same lifecycle as code. That means Git, pull requests, validation, and release checks.

[Figure: a six-step process for versioning a glossary with Git and CI workflows.]

Teams that treat glossary edits as side notes get side-note quality. Teams that review terminology changes in PRs catch drift before it lands in production.

The logic isn't new. The OECD glossary was created to support consistent data collection and to highlight inconsistencies between standards, according to the PARIS21 glossary reference. Different domain, same lesson. Formal glossaries exist to keep shared language stable at scale.

What a workable flow looks like

A practical flow looks like this:

Add or update terms in TRANSLATING.md, glossary.json, or glossary.csv.
Commit glossary changes in the same branch as feature strings.
Open a PR that shows both code and terminology diffs.
Run CI checks on glossary structure.
Regenerate or review translations before deploy.

For teams already thinking about prompt and automation governance, Prompt Builder's piece on streamlining prompt development is a good parallel. Prompt files and glossary files have the same maintenance problem. They drift unless versioning is part of the workflow.

Here's the kind of validation step that pays off fast:

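# Re-extract strings so the French catalog reflects the current code
python manage.py makemessages --locale=fr
# Fail the build if glossary.json is not valid JSON
python -m json.tool glossary.json > /dev/null
# Compile catalogs; malformed .po entries surface here
python manage.py compilemessages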

That won't catch semantic mistakes, but it does catch broken structure before someone merges garbage into main.
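
A slightly deeper check compares the .po file against the glossary itself. The sketch below is an assumption-heavy illustration: `read_po_pairs` is a deliberately minimal reader that skips plural entries and does not unescape sequences like `\n`, and `find_term_violations` uses a plain substring match, so inflected forms can trigger false positives. A dedicated .po library would be more robust.

```python
import re

KEYWORD = re.compile(r'^(msgid|msgstr)\s+"(.*)"\s*$')
CONTINUATION = re.compile(r'^"(.*)"\s*$')


def read_po_pairs(po_text):
    """Return (msgid, msgstr) pairs for simple, non-plural entries."""
    entries = []
    current = None
    target = None  # which field a continuation line extends
    for line in po_text.splitlines():
        keyword_match = KEYWORD.match(line)
        continuation = CONTINUATION.match(line)
        if keyword_match:
            keyword, value = keyword_match.groups()
            if keyword == "msgid":
                current = {"msgid": value}
                entries.append(current)
            elif current is not None:
                current["msgstr"] = value
            target = keyword
        elif continuation and current is not None and target in current:
            current[target] += continuation.group(1)
        else:
            # Comments, blank lines, and plural keywords end the current field.
            target = None
    # Drop the header entry (empty msgid) and anything without a msgstr.
    return [(e["msgid"], e["msgstr"]) for e in entries if e["msgid"] and "msgstr" in e]


def find_term_violations(pairs, glossary):
    """glossary maps a source term to its preferred translation."""
    violations = []
    for msgid, msgstr in pairs:
        if not msgstr:
            continue  # untranslated, nothing to compare yet
        for term, preferred in glossary.items():
            in_source = re.search(rf"\b{re.escape(term)}\b", msgid, re.IGNORECASE)
            if in_source and preferred.lower() not in msgstr.lower():
                violations.append((msgid, msgstr, term, preferred))
    return violations


PO_SAMPLE = '''
msgid "Subscription"
msgstr "Abonnement"

msgid "Manage Subscription"
msgstr "Gérer le Plan"
'''

pairs = read_po_pairs(PO_SAMPLE)
for msgid, msgstr, term, preferred in find_term_violations(pairs, {"Subscription": "Abonnement"}):
    print(f"{msgid!r} -> {msgstr!r}: expected {preferred!r} for {term!r}")
```

Run against the example from the opening of this article, it flags Gérer le Plan as the string that departs from the agreed term.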

A short explainer on what i18n means in real app development is useful if you need to align the team on why this belongs in the release process at all.

Review rules that actually help

Review glossary diffs like API changes. A renamed term can have user-facing impact in five screens and three emails.

Use a few hard rules:

One PR, one terminology decision: don't bury term changes inside unrelated copy edits.
Require context in every new entry: no raw term without screen or domain notes.
Reject silent synonyms: if plan and subscription are distinct in product logic, the glossary must say so.
Keep machine checks cheap: schema validation should run fast enough that nobody disables it.
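
The silent-synonym rule can get a cheap machine check too. As a sketch, assuming entries are dicts with the `term`, `locale`, `translation`, and `do_not_translate` keys used in the JSON example, the hypothetical helper below reports cases where different source terms collapse into one translation. A hit is a prompt for a reviewer, not proof of a mistake.

```python
from collections import defaultdict


def find_silent_synonyms(entries):
    """Return {(locale, translation): [terms]} where several terms share one translation."""
    by_translation = defaultdict(set)
    for entry in entries:
        if entry.get("do_not_translate"):
            continue  # brand names legitimately map to themselves
        key = (entry["locale"], entry["translation"].strip().lower())
        by_translation[key].add(entry["term"])
    return {key: sorted(terms) for key, terms in by_translation.items() if len(terms) > 1}


entries = [
    {"term": "Subscription", "locale": "fr", "translation": "Abonnement"},
    {"term": "Plan", "locale": "fr", "translation": "abonnement"},
    {"term": "Workspace", "locale": "fr", "translation": "Espace de travail"},
]
print(find_silent_synonyms(entries))  # {('fr', 'abonnement'): ['Plan', 'Subscription']}
```

The check runs in a single pass over the file, so it stays fast enough to keep enabled.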

Later, if you automate translation in CI, the glossary becomes an active input instead of a passive note.

Your Next Step: a Ready-to-Use Glossary Template

If you've got no glossary today, don't overdesign it. Add one file. Put it in Git. Start with the terms that already cause review comments.

Use this as your first TRANSLATING.md:

# Translating Guide

## Product terms

### Subscription
- Preferred translation (fr): Abonnement
- Context: Billing, invoices, renewal emails
- Notes: The recurring customer relationship. Not the pricing tier.
- Owner: Billing team
- Last updated: 2026-06-12

### Plan
- Preferred translation (fr): Forfait
- Context: Pricing page, upgrade modal
- Notes: Pricing tier only. Do not use as a synonym for Subscription.
- Owner: Growth team
- Last updated: 2026-06-12

### Workspace
- Preferred translation (fr): Espace de travail
- Context: Team collaboration UI
- Notes: Never translate as bureau.
- Owner: Core product
- Last updated: 2026-06-12

## Placeholder-sensitive strings

### Welcome, %(name)s
- Preferred translation (fr): Bienvenue, %(name)s
- Context: Dashboard greeting
- Notes: Preserve %(name)s exactly.

### {0} seats available
- Preferred translation (fr): {0} places disponibles
- Context: Team billing summary
- Notes: Preserve {0} exactly.

## Do not translate

- Django
- TranslateBot
- DeepL
- GPT-4o-mini
- LocaleMiddleware

Then wire it into the workflow your team already uses. If your translation command supports a glossary file, the invocation would look something like this:

python manage.py translate --glossary TRANSLATING.md --locale fr

That's enough to stop a lot of avoidable drift. You can move to JSON later if CI needs stricter validation.

If you want a developer-native way to use a versioned TRANSLATING.md file with Django .po files, TranslateBot is built for that workflow. It keeps translations in your repo, preserves placeholders and HTML, and fits next to makemessages and compilemessages instead of sending your team into a separate TMS.