About parallel corpora

Parallel corpora: what are they?

A parallel corpus is a special case of a linguistic corpus, one of the main tools used by linguists in the 21st century. Like most linguistic corpora, a parallel corpus is usually supplied with so-called metainformation (information about each text: when it was created, by whom, how long it is, and so on) as well as markup (each word is assigned its initial form, grammatical information, and so on).
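
As a rough illustration of the metainformation and markup described above, one text in a corpus could be modeled as follows. This is only a minimal sketch: the class names, field names, and tag strings are hypothetical and do not reflect the storage format of any particular corpus.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One word of the text with its markup (hypothetical fields)."""
    form: str     # the word as it appears in the text
    lemma: str    # its initial (dictionary) form
    grammar: str  # grammatical information, here a free-form tag string


@dataclass
class CorpusText:
    """One text with its metainformation (hypothetical fields)."""
    title: str
    author: str
    year: int
    tokens: list[Token] = field(default_factory=list)

    @property
    def volume(self) -> int:
        """Volume of the text, measured in tokens."""
        return len(self.tokens)


# Invented sample data, for illustration only.
text = CorpusText(
    title="Example text",
    author="Unknown",
    year=2001,
    tokens=[
        Token("cats", "cat", "NOUN,plural"),
        Token("slept", "sleep", "VERB,past"),
    ],
)

# Markup makes it possible to search by initial form instead of surface form.
lemmas = [t.lemma for t in text.tokens]
```

Keeping the markup on each token, separate from the text-level metainformation, means a query can filter texts by author or year first and then match words by initial form or grammatical tag.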

A parallel corpus is a collection of texts in two languages at once. An important element of the markup of parallel corpora is alignment: each sentence (or at least each paragraph) in language X is matched with the corresponding sentence in language Y. Thanks to alignment, a parallel corpus becomes a useful tool for several categories of users. These are:

Here are the best-known examples of parallel corpora: