Parallel corpora in English, German and Spanish will be uploaded. The corpora will be aligned using structure in the source data.
1. Create three corpora, one in each language.
2. If each corpus consists of multiple files, make sure the alphabetical order of the corresponding files is the same in all corpora, i.e. the first English file must correspond to the first German file and the first Spanish file, the second English file to the second German and Spanish files, etc. It may be practical to prefix the file names with a number so that files are not paired incorrectly and the wrong segments aligned.
3. Make sure the source data contain the align structure to mark segments. No segment can be omitted, and the order of segments must be the same in all aligned corpora. The structure must be added to the files before uploading them.
You can also use alignment software such as hunalign. Manual correction of the output might be necessary.
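Before uploading, the requirements in steps 2 and 3 can be checked locally. The following is a minimal sketch, not part of any official tooling. It assumes each corpus sits in its own directory, that every segment opens with an `<align>` tag (the exact markup is an assumption here), and that sorting by file name matches the alphabetical order used on upload. The names `count_segments` and `check_corpora` are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumption: each segment opens with an <align> tag, optionally with attributes.
ALIGN_OPEN = re.compile(r"<align[\s>]")


def count_segments(text: str) -> int:
    """Count opening <align> tags in the text of one file."""
    return len(ALIGN_OPEN.findall(text))


def check_corpora(dirs: Dict[str, Path]) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed.

    `dirs` maps a language label to the directory holding that corpus.
    """
    problems: List[str] = []
    # Sort by file name so the n-th file of each corpus is paired together.
    files = {
        lang: sorted((p for p in d.iterdir() if p.is_file()), key=lambda p: p.name)
        for lang, d in dirs.items()
    }
    file_counts = {lang: len(fs) for lang, fs in files.items()}
    if len(set(file_counts.values())) > 1:
        problems.append(f"file counts differ: {file_counts}")
        return problems
    # Compare segment counts across the files paired by position.
    for group in zip(*files.values()):
        seg_counts = {
            str(p): count_segments(p.read_text(encoding="utf-8")) for p in group
        }
        if len(set(seg_counts.values())) > 1:
            problems.append(f"segment counts differ: {seg_counts}")
    return problems
```

A call such as `check_corpora({"en": Path("corpus_en"), "de": Path("corpus_de"), "es": Path("corpus_es")})` (directory names are placeholders) returns one message per mismatch. Equal segment counts do not prove that the segments correspond in content, so this only catches omitted or extra segments, not reordered ones.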