Welcome to PyG3T’s documentation!
PyG3T (short for Python GetText Translation Toolkit) is a set of pure Python tools to work with gettext catalogs (.po-files) as a translator.
The toolkit consists of:
- gtcat: write a catalog in normalized format, or change encoding
- gtcheckargs: parse translations of command line options in a catalog, checking for errors (designed for GNU coreutils and similar)
- gtcompare: compare two catalogs qualitatively
- gtgrep: perform string searches within catalogs
- gtmerge: combine two or more catalogs in different ways
- gtprevmsgdiff: show a word-wise diff which compares old msgids in a catalog with current ones
- gtwdiff: show an ordinary podiff as a word-wise podiff
- poabc: check for common translation errors, such as missing punctuation
- podiff: generate diffs of po-files, such that each differing entry is printed completely
- popatch: apply a podiff to an old catalog to obtain the new catalog
- gtxml: check XML in translations
Contents:
Getting Started
TODO
Examples
gtcat
Write catalog to stdout with syntax coloring:
gtcat -c file.po
Change encoding of file to UTF-8:
gtcat --encode utf-8 file.po > file.utf8.po
podiff
The main application of podiff is to generate a diff where the translator has added and changed some strings or comments. Generate a diff of two catalogs:
podiff old.po new.po > differences.podiff
If the two catalogs have different English strings, the above will cause an error, because we expect only the translator to have been at work. To override this and produce a diff with all changes, including addition and removal of English strings, use:
podiff old.po new.po --full > differences.full.podiff
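The difference between the default mode and --full can be illustrated with a minimal Python sketch. This is not podiff's actual implementation: it assumes each catalog has already been reduced to a plain dict mapping msgid to msgstr, and the function name entry_diff and its full parameter are hypothetical.

```python
def entry_diff(old, new, full=False):
    """Return (msgid, old_msgstr, new_msgstr) for every differing entry.

    old and new are hypothetical dicts mapping msgid -> msgstr.
    An entry missing from one side is reported as None.
    """
    if not full and set(old) != set(new):
        # Default mode: only the translator is expected to have been at work.
        raise ValueError("catalogs have different msgids; use full=True")
    changes = []
    # Follow the order of the new catalog, then append removed msgids.
    msgids = list(new) + [m for m in old if m not in new]
    for msgid in msgids:
        before, after = old.get(msgid), new.get(msgid)
        if before != after:
            changes.append((msgid, before, after))
    return changes


old = {"Open": "Åbn", "Close": ""}
new = {"Open": "Åbn", "Close": "Luk"}
print(entry_diff(old, new))  # [('Close', '', 'Luk')]
```

Calling entry_diff(old, {"Open": "Åbn"}) raises ValueError because a msgid was removed, while passing full=True reports the removal as ('Close', '', None).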
gtcompare
Suppose you have one file from Launchpad and another from Debian for the same translation, but with different contents. Compare them with:
gtcompare file1.po file2.po
This will print something like:
Template of second file is more recent
Translations in first file were revised more recently
Total number of messages increased by 1849 from 193 to 2042.
49 msgids removed [u: 0, f: 0, t: 49].
1898 msgids added [u: 176, f: 307, t:1415].
144 msgids in common.
0 messages remain untranslated.
0 untranslated messages changed to fuzzy.
0 untranslated messages changed to translated.
0 fuzzy messages changed to untranslated.
0 messages remain fuzzy.
0 fuzzy messages changed to translated.
53 translated messages changed to untranslated.
87 translated messages changed to fuzzy.
4 messages remain translated.
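The kind of bookkeeping behind such a report can be sketched in Python. This is an illustration only, not gtcompare's implementation: it assumes that u, f and t in the output stand for untranslated, fuzzy and translated, and that each catalog is a hypothetical dict mapping msgid to a dict with the keys "msgstr" and "fuzzy".

```python
from collections import Counter


def state(msg):
    """Classify a message as 'u' (untranslated), 'f' (fuzzy) or 't' (translated)."""
    if not msg["msgstr"]:
        return "u"
    return "f" if msg["fuzzy"] else "t"


def compare(first, second):
    """Count removed, added and common msgids, plus state transitions."""
    removed = Counter(state(m) for k, m in first.items() if k not in second)
    added = Counter(state(m) for k, m in second.items() if k not in first)
    common = [k for k in first if k in second]
    # Each key is an (old state, new state) pair, e.g. ('t', 'f').
    transitions = Counter((state(first[k]), state(second[k])) for k in common)
    return removed, added, len(common), transitions


first = {
    "Open": {"msgstr": "Åbn", "fuzzy": False},
    "Quit": {"msgstr": "Afslut", "fuzzy": False},
}
second = {
    "Open": {"msgstr": "Åbn", "fuzzy": True},
    "Save": {"msgstr": "", "fuzzy": False},
}
removed, added, n_common, transitions = compare(first, second)
print(n_common, transitions[("t", "f")])  # 1 1
```

In this toy input one translated msgid is removed, one untranslated msgid is added, and the single common message changes from translated to fuzzy.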
Miscellaneous
Update an outdated podiff (program.podiff) to match the msgids of a newer version (master.po), and regenerate the diff:
popatch program.podiff --new > new.po
gtmerge new.po master.po > merged.po
podiff master.po merged.po > program.new.podiff
The diff will only include those messages that are still present in the new template. It is often a good idea to use msgmerge instead of gtmerge in order to preserve fuzzy messages.
This command does the same, but in one line:
popatch program.podiff --new | gtmerge - master.po | podiff master.po - > program.new.podiff
API documentation
Autogenerated API documentation for the pyg3t modules.
Glossary
- chunk
Chunk refers to the chunk of text in a .po file that corresponds to a message. The whole file consists of such chunks separated by a blank line. See the entry for message for an example of what a chunk looks like.
- fuzzy
Fuzzy is a flag that can be set programmatically when extracting strings for translation or manually by the translator. It is used to indicate that the translation may no longer be valid. Once the translator has reviewed (and possibly corrected) the string, the flag is removed.
See the gettext reference documentation for details.
- gettext catalog
A gettext catalog is a collection of gettext messages and a header with metadata. It is most often contained in a .po file. The header is the message with an empty msgid and is usually the first message of the file. In pyg3t a gettext catalog is represented by the pyg3t.gtparse.Catalog class.
See the gettext reference documentation for details.
- message
A message refers to all data and metadata pertaining to a single translatable string in a gettext catalog. The data consists of the msgid (original string) and the msgstr (translated string), both of which can optionally have plural forms. The metadata consists of references to source code lines for the msgids and optionally a comment, context and a previous version of the msgid. In pyg3t a message is represented by the classes pyg3t.gtparse.Message and pyg3t.gtparse.ObsoleteMessage.
A typical message could look like this (inspired by the Danish translation of nautilus) (FIXME better example):
# Ahh, nautilus can run programs
#: ../data/nautilus-autorun-software.desktop.in.in.h:1
msgid "Run Software"
msgstr "Kør programmer"
See the gettext reference documentation for details.
- msg
See message.
- msgid
The original string (usually in English), which is to be translated.
See the gettext reference documentation for details.
- msgstr
The translated string.
See the gettext reference documentation for details.
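The glossary terms chunk, fuzzy and header can be made concrete with a small Python sketch that uses only the standard library. It deliberately does not use pyg3t.gtparse, and the helper names split_chunks, is_fuzzy and is_header are hypothetical. It is a simplification that ignores obsolete messages and other corner cases of the .po format.

```python
def split_chunks(text):
    """Split the text of a .po file into chunks separated by blank lines."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def is_fuzzy(chunk):
    """True if a flag comment line ('#,') lists the fuzzy flag."""
    for line in chunk:
        if line.startswith("#,"):
            flags = [flag.strip() for flag in line[2:].split(",")]
            if "fuzzy" in flags:
                return True
    return False


def is_header(chunk):
    """True if the chunk has an empty msgid, which marks the catalog header."""
    for i, line in enumerate(chunk):
        if line == 'msgid ""':
            # A multi-line msgid also starts with 'msgid ""', but is then
            # followed by continuation strings rather than by msgstr.
            following = chunk[i + 1] if i + 1 < len(chunk) else ""
            return following.startswith("msgstr")
    return False


sample = '''msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#: ../data/nautilus-autorun-software.desktop.in.in.h:1
#, fuzzy
msgid "Run Software"
msgstr "Kør programmer"
'''

chunks = split_chunks(sample)
print([is_header(c) for c in chunks])  # [True, False]
print([is_fuzzy(c) for c in chunks])   # [False, True]
```

For real work the pyg3t.gtparse classes mentioned above are the better choice, since they handle the full syntax.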