4.2.2. Logical structure of documents
A printed document is composed of two elements: content, i.e. the character strings associated with their logical structure, and presentation.
Presentation, which falls into the category of typography, is finalised by the contractor in accordance with the instructions of the graphic designers. Very often the presentation of the original text, its page layout, etc. do not correspond to the final presentation of the text compiled by the contractor. So, when the manuscript is being prepared, it is pointless to try to stick too closely to the print presentation (e.g. care must be taken not to insert manual text breaks, as these have to be removed when the document is processed by the contractor).
However, it is essential that the contractor can recognise the different parts of the text. To this effect, when inputting the document, it is important to either apply a style sheet or apply a markup protocol.
In order for the contractor to correctly interpret the text, it is crucial that the different text levels be correctly marked (headings, normal text, annotations, etc.).
For this purpose, Word provides a simple solution, but it must be applied rigorously: styles. Each component of the text is differentiated by the application of a different marker (style).
Each element must have a unique style attributed to it, preferably based on a logical structure.
Do not, for example, differentiate headings manually (bold, italic, etc.)!
Allowing each author to choose styles freely can quickly lead to difficulties: each work then requires individual processing, and the resulting profusion of styles soon becomes difficult to manage. This is why standardisation of styles is desirable.
Ideally, the style sheets applied to the different documents should have the same foundation (a standard sheet). At the same time, style sheets may be accompanied by actual templates, which respond to the diversity of presentations (‘actual template’ refers to the adaptation of a unique style sheet to the specific typographical presentation of the work being prepared).
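The case for a standard sheet can be illustrated with a minimal Python sketch that audits the style names used in a manuscript against an agreed set. The style names in STANDARD_STYLES, the audit_styles function and the sample manuscript are all hypothetical and serve only as an illustration; in practice the list of paragraph styles would have to be extracted from the word-processing file first.

```python
from collections import Counter

# Hypothetical standard style sheet: these names are illustrative only,
# not the styles of any real template.
STANDARD_STYLES = {"Heading 1", "Heading 2", "Normal", "Footnote", "Reference"}


def audit_styles(paragraph_styles, standard=STANDARD_STYLES):
    """Count how often each style outside the standard sheet is used."""
    return Counter(s for s in paragraph_styles if s not in standard)


# Invented sample: one style name per paragraph of a manuscript.
manuscript = ["Heading 1", "Normal", "My Bold Title", "Normal",
              "My Bold Title", "Quote-v2"]

for name, count in sorted(audit_styles(manuscript).items()):
    print(f"{name}: used {count} time(s), not in the standard sheet")
```

Every style reported by such a check is one that the contractor would otherwise have to interpret individually.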
The rigorous use of styles, especially in differentiating headings, provides an additional benefit in Word: it allows the author to generate a table of contents automatically, which is impossible when headings are differentiated manually.
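The principle behind automatic generation can be sketched in a few lines of Python. This is not how Word itself is implemented; it is a simplified illustration which assumes that each paragraph is available as a (style, text) pair and that heading styles are named 'Heading 1', 'Heading 2', etc. The build_toc function and the sample paragraphs are hypothetical.

```python
def build_toc(paragraphs, heading_prefix="Heading "):
    """Build indented table-of-contents lines from (style, text) pairs.

    Only paragraphs whose style is the prefix followed by a level number
    are picked up; the level determines the indentation.
    """
    toc = []
    for style, text in paragraphs:
        level = style[len(heading_prefix):]
        if style.startswith(heading_prefix) and level.isdigit():
            toc.append("  " * (int(level) - 1) + text)
    return toc


# Invented sample: the last heading was formatted manually (bold) on a
# 'Normal' paragraph, so nothing identifies it as a heading.
paragraphs = [
    ("Heading 1", "Preparing the manuscript"),
    ("Normal", "A printed document is composed of two elements."),
    ("Heading 2", "Style sheets"),
    ("Normal", "Markup protocol"),
]

for line in build_toc(paragraphs):
    print(line)
```

The manually formatted heading in the last paragraph does not appear in the result, because its style carries no information about its logical level.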
Another important technique for differentiating text elements is the application of a ‘markup protocol’, which specifically indicates the logical level of all text elements (e.g. chapter heading, section heading; normal text, indented text; references). A markup protocol must be developed using a description of the said elements, markers and the required typographical presentation.
Markers currently take the form <MARKER>, e.g. <TCHAP> for a chapter heading. They are based on SGML (standard generalised markup language). There have been many developments since SGML was first implemented, and at present XML, which is itself derived from SGML, is predominant.
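As an illustration, the following Python sketch reads a text in which every element begins with a marker and converts it to XML. Only <TCHAP> comes from the text above; the other markers, the XML element names, the one-element-per-line input format and the marked_text_to_xml function are assumptions made for the example and do not describe any actual protocol.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical protocol: marker -> (description, XML element name).
# Only TCHAP is taken from the guide; everything else is invented.
PROTOCOL = {
    "TCHAP": ("chapter heading", "chapter-title"),
    "TSEC": ("section heading", "section-title"),
    "TXT": ("normal text", "p"),
}

# Assumed input format: one element per line, marker at the start.
MARKER_LINE = re.compile(r"^<([A-Z0-9]+)>(.*)$")


def marked_text_to_xml(text):
    """Convert marked-up lines to an XML string; reject unknown markers."""
    root = ET.Element("document")
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no content
        match = MARKER_LINE.match(line)
        if match is None or match.group(1) not in PROTOCOL:
            raise ValueError(f"unmarked or unknown element: {line!r}")
        marker, content = match.groups()
        _description, element_name = PROTOCOL[marker]
        ET.SubElement(root, element_name).text = content.strip()
    return ET.tostring(root, encoding="unicode")


sample = "<TCHAP>Logical structure\n<TXT>Content and presentation are distinct."
print(marked_text_to_xml(sample))
```

Rejecting unmarked lines, rather than guessing their level, reflects the requirement that the logical level of all text elements be indicated.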
The advantage of the markers used in these protocols is that they can be directly interpreted by desktop publishing programs (as well as by advanced word-processing programs), which makes the laborious process of finalising documents before printing redundant. The application of markup protocols needs to be agreed at a fairly early stage, ideally when the document is created.
In the case of a multilingual document, it is advisable to involve the institution's translation service. The translation service, which acts as a text multiplier by adding the required linguistic versions, can process the marked-up text, focusing on the content without wasting resources reproducing the presentation. It should also be pointed out that a marked-up text, which contains a minimum of formatting codes, is better suited to processing by advanced language technology tools.