I will never tire of repeating it: word-processors are among the most
overrated and misused pieces of software on the computers of far too many
ordinary users.
This is not a battle against the infamous Microsoft Word, even if,
from the standpoint of freedom, office suites like OpenOffice or
LibreOffice are a better choice. The argument holds for these programs
as well: the problem is with that kind of software, with what is
currently called word-processing.
In short, the main point is that textual, human-readable formats
are, in specific contexts, better than any binary format. It should
not be hard to agree when the content is itself textual: an article,
our next best-selling book, our thesis on political science, our
screenplay, notes of several kinds (a shopping list, names and numbers
you need to remember), a blog entry and so on. They are all primarily
made of text.
Unfortunately, people are accustomed to WYSIWYG, which is often evil
unless you are an artist doing some kind of visual elaboration of the
text that is part of the message you want to convey, or unless you have
to arrange the text together with a lot of graphics. Users should
emancipate themselves from this visual approach and learn to focus on
the content, when content is what they have to deal with.
Word-processors make users believe that they are in control of how
things will be presented, and that presentation is their responsibility.
But often it is not their duty. Often it is someone else's duty, and the
users must focus on the content and on the role of each segment of that
content: they have to mark a piece of text as, e.g., a chapter title,
and forget about how that chapter title will be rendered.
Often the way things must be presented is codified: there are rules
describing page size, font styles, font sizes, spacing and so on.
Once you understand that all you really need to deal with is the
meaning or role of the things you write, you are ready to throw away
your preferred word-processor (or to use it in a totally different way:
modern word-processors make this possible, but they don't work as well
as other tools).
So you should prefer human-readable formats: the file containing
the text you wrote does not need a specific piece of software.
A text editor is enough. You can use a language that allows you to
describe the actual content by marking it in some way. That is the
point of a markup language.
When I talk about human-readable formats and markup languages, I am
not thinking about the several eXtensible Markup Language (XML) formats
that are produced and consumed by machines: in fact, even the evil new
Microsoft Office formats, together with the OpenDocument standard for
documents (the standard you should use if you won't follow the
suggestions of this article), are XML-based formats. So you could read
the contents, but you are unlikely to benefit from this possibility, and
it would be even harder, if not impossible, to write that content by
hand. Moreover, everything is packaged into a zip archive, so those
formats appear to be binary.
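To make the point concrete, here is a minimal Python sketch. It does not open a real office document: it builds a tiny zip archive in memory with a single member named content.xml (the name OpenDocument packages use for the document body) and then recovers the text from it. The XML inside is a made-up stand-in, far simpler than what a real suite writes, and the function names are mine.

```python
import io
import re
import zipfile


def make_fake_package(xml: str) -> bytes:
    """Build an in-memory zip with one content.xml member (a toy stand-in)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("content.xml", xml)
    return buffer.getvalue()


def read_package_text(package: bytes) -> str:
    """Unzip content.xml, then crudely strip the tags to get at the text."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        xml = archive.read("content.xml").decode("utf-8")
    text = re.sub(r"<[^>]+>", " ", xml)  # naive: real XML deserves a parser
    return " ".join(text.split())


# Hypothetical, heavily simplified XML; real documents are far noisier.
SAMPLE_XML = "<doc><h>Chapter One</h><p>Some text.</p></doc>"

package = make_fake_package(SAMPLE_XML)
print(package[:2])                 # the zip signature, not our text
print(read_package_text(package))  # the text, after unzipping and stripping
```

Even in this toy case you need an unzip step and a tag-stripping step before you see a single word, which is exactly what a plain text file spares you.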
Some reasons why purely textual, simple formats are better than
binary formats or complex textual formats meant for machines:
- the contents can be understood with little or no knowledge of the
format (this does not mean that it is easy, or that the contents are
as usable as they would be if you knew the format and interpreted it
correctly);
- scripts to parse the data can easily be written by anyone with a
programming mindset. The finest art is not for everyone, but simple
grepping or data extraction should be within almost everyone's reach;
- the only program needed to view or edit the file is a text
editor, an application found on every computer, whatever the
operating system installed. The data are independent of the operating
system and of the software (not 100% true, but true enough in the
common modern computing world);
- merging and splitting can be done easily; if there is some sort
of structure, a minimal knowledge of it is required so that the
split files, or the single merged file, keep an independent
meaning;
- diffing and comparing can be done with standard, common tools;
- versioning tools can track changes more easily (without relying
on the specific tools of the specific software that can handle
the binary format).
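Two of the points above, grepping and diffing, can be sketched in a few lines of Python using only the standard library. The two versions of the notes are invented for the example, and the helper names grep and diff are mine; in everyday use the usual command-line tools do the same job.

```python
import difflib

# Two made-up versions of the same plain-text notes.
OLD_NOTES = """\
Shopping list
* bread
* milk
Alice: 555-0100
"""

NEW_NOTES = """\
Shopping list
* bread
* oat milk
Alice: 555-0100
Bob: 555-0199
"""


def grep(pattern: str, text: str) -> list[str]:
    """Return the lines of text that contain pattern (plain substring)."""
    return [line for line in text.splitlines() if pattern in line]


def diff(old: str, new: str) -> list[str]:
    """Return a unified diff between two texts, one entry per diff line."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


print(grep("555-", NEW_NOTES))          # every line holding a phone number
print("\n".join(diff(OLD_NOTES, NEW_NOTES)))
```

Nothing here knows anything about the "format" of the notes, and that is the point: lines of text are enough for generic tools to be useful.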
Simple markup languages are particularly well suited to benefit from
these features. This article was written using a subset of HTML,
the markup language of the World Wide Web, and stored in a directory
managed by Mercurial, a distributed Source Control Management tool,
which makes it possible to track the changes and the history; by
pushing to and pulling from a remote server, I could also collaborate
with myself. All without using special software to handle this
particular format.
The format was of course chosen according to a specific need: in
this example, the format is suitable for direct web publishing. Other
requirements would have led to other formats. E.g., since I am an Emacs
user, the format I choose for notes and other casual writing is often
org-mode.
By interpreting the markup language (which can be easy and within reach
for a lot of computer geeks), it is possible to transform it (into
another markup language or into anything else) and to process it in
several mechanical ways.
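As an illustration of such a transformation, here is a toy Python converter. It understands only a tiny subset of org-mode, namely headings written as one or more asterisks followed by a space, and treats every other non-empty line as a paragraph. It is a sketch of the idea, not a replacement for the real exporters, and the name org_to_html is mine.

```python
import html
import re

# An org-style heading: asterisks at the start of the line, a space, a title.
HEADING = re.compile(r"^(\*+) (.+)$")


def org_to_html(source: str) -> str:
    """Convert a tiny org-like subset (headings and paragraphs) to HTML."""
    out = []
    for raw_line in source.splitlines():
        line = raw_line.rstrip()
        if not line.strip():
            continue  # blank lines only separate blocks in this toy subset
        match = HEADING.match(line)
        if match:
            level = min(len(match.group(1)), 6)  # HTML stops at <h6>
            title = html.escape(match.group(2))
            out.append(f"<h{level}>{title}</h{level}>")
        else:
            out.append(f"<p>{html.escape(line.strip())}</p>")
    return "\n".join(out)


NOTES = "* Notes\nBuy bread & milk\n\n** Later\nCall Bob"
print(org_to_html(NOTES))
```

The writer only marked what each line is (a heading, a paragraph); how it finally looks is decided by whatever consumes the generated HTML.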
Another classic example is the stuff you want to see printed, such as
a book or an article on paper; then one of the most suitable formats is
LaTeX. Even if you need a specific application (in fact, a set of
applications and data) in order to produce the final document ready to
be printed (e.g. a PDF), the format is textual and structured, it was
designed to be written by hand (with just a text editor), and it
focuses on content and not on presentation. From a single source you
can produce, e.g., PostScript, PDF, HTML or (virtually) any kind of
document and format. And yet you can read it with a text editor. You
can even add meta-information in the guise of comments.
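Because the source is plain text, even a short script can pull useful things out of it. The Python sketch below extracts an outline and some meta-information from a small LaTeX fragment. The "% meta: key=value" comment convention is purely my assumption for the example (LaTeX itself just ignores comments), and the regular expressions handle only the simplest cases.

```python
import re

# A made-up LaTeX fragment; "% meta:" is a hypothetical private convention.
LATEX_SOURCE = r"""
% meta: status=draft
% meta: reviewer=Alice
\documentclass{article}
\begin{document}
\section{Introduction}
Text comes first.
\subsection{Scope}
\section{Conclusions}
\end{document}
"""

SECTION = re.compile(r"\\(section|subsection)\{([^}]*)\}")
META = re.compile(r"^%\s*meta:\s*(\w+)=(.*)$", re.MULTILINE)


def outline(source: str) -> list[tuple[str, str]]:
    """List (kind, title) pairs for sectioning commands, in document order."""
    return SECTION.findall(source)


def metadata(source: str) -> dict[str, str]:
    """Collect key=value pairs hidden in '% meta:' comment lines."""
    return {key: value.strip() for key, value in META.findall(source)}


print(outline(LATEX_SOURCE))
print(metadata(LATEX_SOURCE))
```

The same file still compiles as usual, since the extra information travels in comments that the typesetting tools never see.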
Of course everything is clearer if you use a modern, powerful text
editor able to highlight the syntax of such languages. But that
is not something you can't live without.
Once you learn to separate the meaning from the way it is presented,
you see the value of using anything but a word-processor for the vast
majority of the things you can imagine (when the value lies in the
content and not in the way it is shown).
To say it another way: the WYSIWYG approach is largely overrated. You
must learn to separate the actual content from the way you want it to be
seen on a screen, on paper or on other media. Learn that a lot of these
ways are codified (or should be), and that usually you are not the one
who codified them. So you must focus on the content, since the
description of how to show it sits elsewhere; maybe it is even someone
else's duty, and you need not worry about it.
Finally: before firing up a word-processor, consider other approaches
that can make your life easier, even if it doesn't seem so at first
(in this article I have left out several anecdotes about word-processor
users who were driven crazy trying to get what they wanted from
their software, losing more time on those efforts than on producing
their content). In general, if you force yourself to drop all
word-processors, you will discover how rarely they are useful and how
beneficial other workflows can be.