Translate README files in reStructuredText with Sphinx

reStructuredText (reST) is a markup language popular in the Python developer community. It is the standard markup language for docutils, the Sphinx documentation generator, and the Python Package Index (PyPI). However, reST is still not popular enough: most translation platforms, including Crowdin, which I’m currently using for EH Forwarder Bot, do not support reST documents.

Sphinx, together with its companion tool sphinx-intl, can extract strings from reST documents into GNU gettext message catalog template (.pot) files and build new documents with translated strings in other languages. GNU gettext formats are widely accepted by translation platforms, which makes our life much easier. This works out of the box if you are generating HTML or PDF documentation, but it is not so simple if you want reST output.

To make use of what sphinx-intl offers, what we need instead is a document writer for Sphinx that outputs reST. A plug-in called restbuilder serves this very purpose, but it has not been updated for over a year and is currently looking for maintainers. Unfortunately, I don’t really have the time to maintain such a complex project. What I did was simply fork the project, include some fixes from existing PRs, and fix a few more things myself.

Extract strings

Since sphinx-build generally works with directories, we need to create a temporary directory to isolate README.rst. We don’t want to write a conf.py file just for the translation either, as everything can be specified on the command line.

mkdir .build_readme
cp README.rst .build_readme/README.rst
sphinx-build -b gettext -C -D master_doc=README -D gettext_additional_targets=literal-block,image .build_readme ./readme_translations/locale/ .build_readme/README.rst
rm -rf .build_readme

Flags used on line 3:

  • -b gettext: Use gettext string extractor as builder
  • -C: Use no config file at all, only -D options
  • -D master_doc=README: Set the front page name as README (that’s the only file we have here)
  • -D gettext_additional_targets=literal-block,image: Include code blocks and images (and captions) into the message catalog.
  • .build_readme: Source directory
  • ./readme_translations/locale/: Destination directory
  • .build_readme/README.rst: File to build

This will extract strings into ./readme_translations/locale/README.pot. This file can then be uploaded to any translation platform or passed directly to translators.
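
For those who prefer to drive this step from Python, the four shell commands above can be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names gettext_command and extract_strings are hypothetical, and tempfile.TemporaryDirectory is swapped in for the manual mkdir / rm -rf pair. The sphinx-build arguments are the same ones listed above.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def gettext_command(source_dir, output_dir, master_doc="README"):
    """Build the sphinx-build argument list for string extraction.

    Hypothetical helper: it only assembles the flags described above.
    """
    source_dir = Path(source_dir)
    return [
        "sphinx-build", "-b", "gettext", "-C",
        "-D", f"master_doc={master_doc}",
        "-D", "gettext_additional_targets=literal-block,image",
        str(source_dir), str(output_dir),
        str(source_dir / f"{master_doc}.rst"),
    ]


def extract_strings(readme="README.rst",
                    output_dir="./readme_translations/locale/"):
    """Copy the README into a throwaway directory and run the gettext builder."""
    with tempfile.TemporaryDirectory() as tmp:
        # The temporary directory plays the role of .build_readme.
        shutil.copy(readme, Path(tmp) / "README.rst")
        subprocess.run(gettext_command(tmp, output_dir), check=True)
```

Calling extract_strings() from the project root should have the same effect as the shell version, with the temporary directory cleaned up even if sphinx-build fails.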

Build translated README files

Once you have received the translated strings from your translators, you need to arrange them into a folder structure that sphinx-intl can recognize, namely {language_code}/LC_MESSAGES/README.(po|mo). In my case, they are arranged in readme_translations/locale/{language_code}/LC_MESSAGES/.
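
To make the layout concrete, here is a small sketch that computes where a catalog is expected to live and reports which languages are still missing one. The function names catalog_path and missing_catalogs are made up for illustration; only the {language_code}/LC_MESSAGES/README.po pattern comes from the description above.

```python
from pathlib import Path


def catalog_path(locale_root, language, domain="README", suffix=".po"):
    """Return the expected catalog location for one language."""
    return Path(locale_root) / language / "LC_MESSAGES" / f"{domain}{suffix}"


def missing_catalogs(locale_root, languages, domain="README"):
    """List the languages whose .po file is not in the expected place."""
    return [
        lang for lang in languages
        if not catalog_path(locale_root, lang, domain).is_file()
    ]
```

For example, catalog_path("readme_translations/locale", "zh_CN") points at readme_translations/locale/zh_CN/LC_MESSAGES/README.po. A check like missing_catalogs can be handy before building, since a misplaced catalog is easy to overlook.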

To build translated README files, we need to install the restbuilder plug-in for Sphinx. Here I’ll use my fork as an example.

git clone https://github.com/blueset/restbuilder.git
cd restbuilder
python3 setup.py install

Then, write a script that iterates through the locale folder to get the list of available languages, and uses it to build a list of commands to run.

import glob
from pathlib import Path

# My language code is defined in a POSIX-like style. E.g. en_US
languages = [i[i.rfind('/')+1:] for i in glob.glob("./readme_translations/locale/*_*")]

# Compile .po files to .mo
sources = glob.glob("./**/*.po", recursive=True)
dests = [i[:-3] + ".mo" for i in sources]
actions = [["msgfmt", src, "-o", dest] for src, dest in zip(sources, dests)]

# Build translated README files
# -p also creates the parent .build_readme directory if it does not exist yet
actions.append(["mkdir", "-p", "./.build_readme/source"])
actions.append(["cp", "README.rst", "./.build_readme/source/README.rst"])

locale_dirs = (Path('.') / "readme_translations" / "locale").absolute()
for i in languages:
    actions.append(["sphinx-build", "-b", "rst", "-C",
                    "-D", f"language={i}", "-D", f"locale_dirs={locale_dirs}",
                    "-D", "extensions=sphinxcontrib.restbuilder",
                    "-D", "master_doc=README", "./.build_readme/source", f"./.build_readme/{i}"])
    actions.append(["mv", f"./.build_readme/{i}/README.rst", f"./readme_translations/{i}.rst"])
    actions.append(["rm", "-rf", f"./.build_readme/{i}"])
actions.append(["rm", "-rf", "./.build_readme/source"])
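
The script above only collects the commands in actions; it does not execute them. If you are not using a task runner, one minimal way to run them in order might look like the sketch below. The run_actions helper and its dry_run parameter are hypothetical names, not part of the original script.

```python
import subprocess


def run_actions(actions, dry_run=False):
    """Run each command in order, stopping at the first failure.

    With dry_run=True the commands are only printed, not executed.
    """
    executed = []
    for command in actions:
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
        executed.append(command)
    return executed
```

Calling run_actions(actions, dry_run=True) first is a cheap way to review the generated commands before letting them touch the working tree.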

More flags used:

  • -b rst: Use reST as output format
  • -D language={language}: Indicate language code to use
  • -D extensions=sphinxcontrib.restbuilder: Load the restbuilder extension installed

With the commands above, you can build translated README files automatically with Sphinx and GNU gettext message catalogs. For the full sample code with doit automation, see the script in the EFB Telegram Master Channel repository and look for the task_gettext and task_msgfmt methods.

