Summary of some TeX features


The input language

phases
macro expansion
execution (registers, boxes, etc.)
\relax
do nothing.
useful as a separator: it is unexpandable, so it survives macro expansion as a token

Characters and numbers

\char99
the character 99, which is c.
`\c
the number of character c, which is 99.
this is a number, use \number`\c to print it on the page.
\number12
\romannumeral12
Print a number on the page.
Also works for registers and numbers used internally.
^^H
The character that differs from H by 64.
'153
"6B
Number in octal and hex (only uppercase letters).
Only when TeX expects a number.
When TeX expects a number, `c is the same as `\c.
Examples: char.tex
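
The conversions above can be tried in a short plain TeX file. This is only a sketch: the scratch register \count255 is an arbitrary choice, and the expected results in the comments follow from the character codes given above.

```tex
% Plain TeX sketch; process with tex.
\char99                 % typesets the character with code 99: c
\number`\c              % typesets the digits 99
\romannumeral12         % typesets xii
\number'153             % octal 153 is 107
\number"6B              % hexadecimal 6B is also 107
\count255=`c            % a number is expected, so `c works like `\c
\number\count255        % typesets 99
\bye
```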

Tokenization and character categories

character categories, and their defaults:
category meaning default characters
0 escape \
1 begin group {
2 end group }
6 parameter #
10 space  
11 letter a-zA-Z
12 other 0-9,;@...
13 active ~
14 comment %
15 invalid
Control sequences can only contain letters, not others (no digits).
tokenization
The stream of characters is tokenized by:
\catcode`\c=12
Change the category of character c to 12.
Limited to the current group.
The actual command is \catcode99=12, since `\c is 99.
effects of category changes
A command like \catcode`\c=14 only affects the c's that are yet to be read. The ones that have already been read keep the category they had at the moment they were read. This is important when using macros.
Examples: category.tex
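
A minimal sketch of a local category change follows. The macro name \my@name is made up for the illustration; the point is that the name is tokenized while @ is a letter, and the change is undone when the group ends.

```tex
% Inside the group, @ is a letter and may appear in control sequence names.
{\catcode`\@=11
 \gdef\my@name{defined while @ was a letter}}
% Outside the group @ is category 12 again, so typing \my@name here would be
% read as \my followed by the characters @name.
% \csname builds the name from character codes, so it still reaches the macro:
\csname my@name\endcsname
\bye
```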

Control sequences

Control sequences can only contain letters, not other characters (no digits).
\csname par\endcsname
Same as \par.
The string may contain control sequences, but it is an error if their expansion contains unexpandable TeX primitives rather than being a sequence of characters.
\string\par
Same as the sequence \, p, a and r.
All four characters are assigned category 12 (other).
\meaning\cs
The meaning of a control sequence. If a macro, its definition.

Control characters and active characters

Can be used like control sequences (for macro names, etc.)
control character
Backslash + single character.
All characters can be used, including others.
active character
A character in category 13

Aliases

\let\a=t
Makes an alias \a for the single token t.
A token is either a single character (\let\a=b) or a control sequence (\let\a=\par).
The single character can be an arbitrary one, including for example {.
\futurelet \a \b t
Same as:
\let\a=t
\b
t
This way, \b can examine t through its alias \a, but t is still in the input sequence. Only works for a single token (=character or control sequence) t, which can however also be { or $.
Examples: let.tex, future.tex
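
As a sketch of how \futurelet lets a macro look ahead, the following defines two hypothetical macros, \peek and \decide. \decide examines the next token through the alias \next without removing it from the input.

```tex
% \peek and \decide are made-up names used only for this illustration.
\def\peek{\futurelet\next\decide}
\def\decide{\ifx\next[bracket follows\else no bracket\fi: }
\peek[x]\par   % typesets: bracket follows: [x]
\peek x\par    % typesets: no bracket: x
\bye
```

In both cases the token that was examined ([ or x) is still typeset afterwards, because \futurelet leaves it in the input.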

Macros

\def\a{replacement}
Defines the macro \a.
The input is read sequentially. If \a is read, it is replaced by the text replacement, then reading continues on the text replacement.
\def\a#1#2{ab #1 cd #2 ef}
A macro with arguments.
macros vs. subroutines
Macros are different from subroutines, as shown by this example:
\def\a#1{argument #1 stop}
\def\b{text ending in \a}
\b c
The definition of \b would be wrong in functional programming, as it calls \a without its mandatory argument. Instead, it works through successive replacements:
\b c 				% \b is replaced by "text ending in \a"
text ending in \a c		% now the argument to \a is c
text ending in argument c stop	% text that is shown on page
Not only is this allowed, it is the basis of iterative macros.
cycles
Define a macro with a conditional call to itself at the end.
\def\a#1{
	operate on #1
	\if(termination)
		\let\next=\relax
	\else
		\let\next=\a
	\fi
	\next
}
\edef\a{replacement}
First expand the control sequences in replacement, then associate the resulting text to \a.
With a regular \def, replacement is associated verbatim to \a. Only when \a is used and replaced by replacement are these control sequences expanded.
In the replacement text, \noexpand\b prevents the expansion of \b.
\expandafter ab
First expand token b once (one level only, not recursively), then proceed reading token a.
\def\a #1x#2_{replacement}
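Arguments of \a are identified as the text delimited by x and _, instead of being the next two tokens: #1 is the text up to the first x, #2 the text up to the following _. The space after \a is skipped during tokenization and is not a delimiter.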

Examples: expand.tex next.tex beginend.tex after.tex edef.tex noexpand.tex
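
The loop scheme, the difference between \def and \edef, and \expandafter can be sketched concretely as follows. All macro and register names (\cnt, \countdown, \x, \lazy, \eager) are hypothetical, and the termination test is just one possible choice.

```tex
% Plain TeX sketch. \cnt, \countdown, \x, \lazy and \eager are made-up names.
\newcount\cnt
\def\countdown{%
  \number\cnt\space
  \advance\cnt by -1
  \ifnum\cnt>0
    \let\next=\countdown   % one more round
  \else
    \let\next=\relax       % stop
  \fi
  \next}
\cnt=3 \countdown\par       % typesets: 3 2 1

\def\x{old}
\def\lazy{\x}     % stored verbatim; \x is expanded when \lazy is used
\edef\eager{\x}   % \x is expanded now, so \eager contains the text old
\def\x{new}
\lazy, \eager\par  % typesets: new, old

% \expandafter makes \csname build the name before \def sees it.
\expandafter\def\csname my name\endcsname{reached through csname}
\csname my name\endcsname
\bye
```

Setting \next inside the conditional and calling it after \fi keeps the conditional closed before the next round starts.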

Modifiers

\global
Makes the following command not limited to the current group.
Applies to:
\long
Allows \par in the arguments of the macro being defined.
\outer
The macro cannot be used inside another macro (neither in its definition nor in its arguments).

Conditionals

\ifx ab ... \else ... \fi
Test for tokens a and b to be the same.
\ifnum a=b ... \else ... \fi
Numeric test. Also works with < and >; there is no <= or >=.
Expects a and b to be numbers: \the is not necessary.
\ifdim, \ifodd, \ifvmode, \ifcat, \ifvoid...
Other conditionals.
\ifcase num ... \or ... \or ... \or ... \fi
Case statement, depending on the value of num, from 0 up.
skipped part
The part of the false condition is skipped.
No expansion, no check for brace balancing.
Only checked for \if...\else...\fi balancing.
balancing
The text in a conditional need not be group-balanced.
For example, it may contain only { while } is outside.
In a macro, \iffalse{\else}\fi allows for an unbalanced brace.
condition
End the condition with a space or \relax so that the true part is not read as part of the condition.
\noexpand may be useful.
Examples: conditionals.tex
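
A short sketch of the numeric conditionals, using the scratch register \count255 as an arbitrary choice. Each number in a condition is ended by a space, as recommended above.

```tex
\count255=7
\ifnum\count255<10 small\else large\fi\par          % typesets: small
\ifodd\count255 odd\else even\fi\par                % typesets: odd
\ifcase\count255 zero\or one\or two\else many\fi\par % typesets: many
% \ifnum only knows <, = and >; "at most" is the \else part of ">".
\ifnum\count255>7 \else at most seven\fi
\bye
```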

Registers

Numbered from 0 to 255.
Boxes have a different access syntax than the other registers.
\count243, \dimen83, ...
Registers for numbers, dimensions, etc.
Set value with \count243=21.
Use the value in a register with \the, as in \the\count243.
First unused counter: \newcount\x, then \x=12, etc.
\box52
A register that contains a box.
Store value with \setbox52=....
Use value as \box52.
First unused: \newbox\x; set with \setbox\x=..., use as \box\x.
peculiarities of box registers
Set by \setbox32=..., not by \count12=...
After \newbox\x, token \x is a number, not a register; therefore, it is set by \setbox\x=..., not \x=....
\box52
\copy52
place the content of box 52, emptying it or not
save the size of a box for later use
\edef\w{\wd52} does not work
why: \wd52 is unexpandable; it is evaluated in the execution phase
correct way: \dimen38=\wd52
=
optional: \count12 43 is the same as \count12=43
also for internal numbers: \baselineskip 12pt
{\count2=12}
registers are local to groups:
the new value is lost, the previous is restored
useful for storing temp values
\chardef\name=32
\mathchardef\name=23142
commonly used to store natural numbers, sparing a counter
the first is only for numbers in the range 0-255
Examples: count.tex boxregister.tex chardef.tex
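
The register operations above can be sketched as follows. The names \pages and \mybox are hypothetical; \dimen0 is used as a scratch register.

```tex
\newcount\pages          % \pages is a made-up counter name
\pages=12
\advance\pages by 1
Value: \the\pages\par    % typesets: Value: 13
\newbox\mybox            % \mybox is a made-up box name
\setbox\mybox=\hbox{sample}
\dimen0=\wd\mybox        % save the width before the box is emptied
\copy\mybox\ and \box\mybox\par  % \copy keeps the content, \box empties it
Saved width: \the\dimen0\par
{\pages=99 }Still \the\pages\par % typesets: Still 13 (the assignment was local)
\bye
```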

Error generation

\errmessage{this is an error message}
generate an error
\errhelp{the help message of the error}
the text shown if the user types 'h'
must be set before \errmessage is called

Example: error.tex
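
A minimal sketch of a macro that reports an error, with the help text set before the message. The macro name \checkpositive and both texts are invented for the illustration.

```tex
% \checkpositive is a made-up macro name.
\def\checkpositive#1{%
  \ifnum#1>0 \else
    \errhelp{The argument must be a positive number.}%
    \errmessage{Non-positive argument: #1}%
  \fi}
\checkpositive{3}   % silent
\checkpositive{0}   % stops with the error; typing h shows the help text
\bye
```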

Comments on the language

Tokenization

The input sequence is tokenized by converting each sequence \something into a single token of type "control sequence" and each other character into a token with an associated category.

Contrary to most programming languages, the input cannot be tokenized in full and only then interpreted. Each token is immediately interpreted, and this may affect the subsequent tokenization. For example, \catcode`\@=0 makes @par a single token (a control sequence) rather than the four character tokens @, p, a and r. Another example is \csname abcd\endcsname, turned into the single token \abcd.

Macro expansion

Although they may look like them, macros are not subroutines as in C, etc. They are replaced, not executed.

Defining \def\a{text} associates the uninterpreted (but tokenized) text to the control sequence \a. This means that text may contain invalid sequences: unbalanced \if or \begingroup, undefined control sequences, macros without mandatory arguments, etc.

Each time \a is encountered in the input it is replaced by text, and then reading of the input sequence continues as if it contained text instead of \a. Only at this point is text interpreted.

Another difference from subroutines is that macros are expanded before executing commands. For example, if \def\a{\count21=45} then \a 1 is replaced by \count21=451.

Types

Certain sequences of tokens are treated as a single object: groups, numbers, dimensions, etc.

Such a sequence is considered a single object only when TeX expects a group, a number, a dimension, etc. A group is not a token: for example, \let\a=b requires b to be a token; if {xxx} is used instead, \a becomes an alias for { only, and an error is issued for the unbalanced }. A group is taken as a single object only when TeX expects a group (more commonly, it expects something that can be either a token or a group). The same holds for numbers, dimensions, etc.

Note that in \def\a{text} the braces only enclose the text; they do not form a group.

group
A sequence of tokens delimited by either {...} or \begingroup...\endgroup.
A group cannot be started by { and finished by \endgroup, nor vice versa.
The characters { and } are any two characters in category 1 and 2.
The braces in \def\a{...} do not form a group. They do for boxes.
number
A sequence of tokens 0,…,9 is treated as a number, but only when a number is expected, for example after \count12=.

When a number is expected, some control sequences change meaning: while \count12 is usually the beginning of an assignment \count12 34 (same as \count12=34), when preceded by \number or \ifnum it means the value of counter twelve.

Also, when a number is expected, a character is taken as its number.

dimension
A number followed by a unit like pt or cm.
glue
A dimension plus stretchability and shrinkability (see below).

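End a number, dimension or glue with \relax if the following text is not to be taken as part of it, like in \count9=12\b or \skip12=0pt \c where \b may start with 3 and \c with plus 1fil. A space after the digits is enough to end a number, but not a glue: after the space TeX still looks for plus and minus.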
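
The effect of a missing \relax can be sketched with a macro whose expansion starts with a digit. The macro \b and its text are invented; \count255 is an arbitrary scratch register.

```tex
\def\b{3 apples}
\count255=12\b\par        % \b is expanded while the number is read: 123 is stored
[\number\count255]\par    % typesets: [123]
\count255=12\relax\b\par  % \relax ends the number: 12 is stored
[\number\count255]\par    % typesets: [12]
\bye
```

In the first assignment only the word apples reaches the page, because the 3 was absorbed into the number.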

Braces
The formal syntax has an unintuitive way of denoting the braces: { and } mean braces (characters of category 1 and 2) but also everything let equivalent to them like \bgroup and \egroup. Instead, ⟨left brace⟩ and ⟨right brace⟩ mean only the explicit braces (characters of category 1 and 2).

Some misleading features: groups are not tokens; macros are not subroutines.
Examples: misleading.tex


Output generation

Fonts

\font\internalname=externalname
loads a font
internal name
for switching to that font, like in {\internalname abcd}
external name
basename of the font file, like cmr10 for cmr10.afm, cmr10.tfm, cmr10.mf, etc.

Note: the external name is a string; do not precede it with a backslash unless the name is stored in a macro.
\font\internalname=externalname at 20pt
Default size is what is specified in the font file.
\font\internalname=externalname scaled 2000
Default scale is 1000.
Magnification by powers of 1.2 can be done by a scaling of \magstep0, \magstep1, \magstep2, like in \font\internalname=externalname scaled \magstep1. Also, \magstephalf is a magnification between \magstep0 and \magstep1 (the square root of 1.2).

Examples: font.tex
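
A sketch of the two ways of loading a font at a different size. The internal names \bigrm and \hugerm are made up; cmr10 is the standard Computer Modern roman font.

```tex
\font\bigrm=cmr10 scaled \magstep2   % cmr10 at 1.44 times its design size
\font\hugerm=cmr10 at 20pt           % cmr10 scaled to 20pt
Normal, {\bigrm bigger}, {\hugerm biggest}.
\bye
```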

Glue

10pt plus 5pt minus 3pt
A space of 10pt that can be stretched to 10+5pt or shrunk to 10-3pt.
Can be stretched more if necessary, but not shrunk more.
Stretchability and shrinkability are 0pt by default.
\hskip glue
\vskip glue
Place such a space (a glob of glue) horizontally or vertically.
\skip25=...
A register holding a glue.
10pt plus 1fil
Pseudo-unit that means infinity.
Increasing levels of infinities: fil, fill, filll.
Shorthands: \vfil is \vskip 0pt plus 1fil, etc.
division of stretching or shrinking
Whenever a number of glue globs have to be stretched, the amount of space that is added to each is proportional to its stretchability. Therefore a glue that is not stretchable is not stretched, while infinite wins over finite.
Same for shrinking.
glue vs. skip
respectively: a glue as a quantity, and a glue placed somewhere
\kern-2.3pt
like a fixed glue (no stretch or shrink)
not a valid break point (unless followed/preceded by a glue)
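
How stretching is divided among several globs of glue can be sketched with boxes of fixed width. The widths and stretch amounts are arbitrary values chosen for the illustration.

```tex
% The missing space is shared in proportion 1:3 between the two skips.
\hbox to 6cm{A\hskip 1cm plus 1cm B\hskip 1cm plus 3cm C}
% fill is a higher order of infinity than fil: only the second glue stretches.
\hbox to 6cm{A\hfil B\hskip 0pt plus 1fill C}
\bye
```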

Boxes

baseline
every character, but also every box, has a baseline
example, baseline of a \vbox:
+----+
|abcd|
|efpl|
|====|
|  | |
+----+
reference point
the leftmost point of the baseline
height, depth
distance from the baseline to the top and to the bottom of the box
\wd, \ht, \dp
width, height and depth of a box register
can be modified: change the size but not the internal spacing of a box
character/rule vs. box
a single character has width, height and depth, but is not a box
same for rules
yet, they are not boxes: some operations take only boxes
(not characters or rules)

Examples: baseline.tex

Explicit boxes

\hbox{some text}
\hbox\bgroup some text \egroup
the second is useful for begin-end constructions
group
the inside of a box is in a group
\hbox to 3cm{...}
box is always 3cm wide, but
content may be larger than 3cm, sticks out at the end
\hss to avoid warning
\rlap{...}
\llap{...}
0-width hboxes whose content sticks out to the right or to the left
\hbox spread 3cm{...}
make box 3cm wider than its natural width

Example: boxgroup.tex tospread.tex
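
The box constructions above, put together in a sketch with arbitrary texts and sizes:

```tex
\hbox to 3cm{left\hfil right}               % exactly 3cm wide
\hbox to 1cm{wider than one centimetre\hss} % content sticks out; \hss avoids the warning
\hbox spread 1cm{a\hfil b}                  % natural width plus 1cm
\hbox{O\llap{/} and \rlap{/}O}              % the slash overlaps the O before it, then the O after it
\hbox\bgroup same as with braces\egroup
\bye
```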

Box building (modes)

horizontal mode
line boxes on their baseline
restricted horizontal mode
inside a \hbox
no line breaks, all in a single line
vertical mode
line subboxes on their left edge
external: building a page
internal: in a \vbox or \vtop
\vbox vs. \vtop
baseline is that of their last (bottom) or first (top) subbox
subbox displacement
horizontal mode: \raise and \lower
vertical mode: \moveleft, \moveright (only boxes, no chars or rules)
part sticking out left is neglected from the width of a vbox or vtop

Examples: boxes.tex

Mode switching

start of page
vertical mode
a character in vertical mode
switch to unrestricted horizontal mode (for building a paragraph)
inside a vbox or hbox
regardless of current mode, switches to the mode of the box
after the end of a hbox
previous mode, not necessarily horizontal
\ifvmode
\ifhmode
\ifinner
test current mode
...

Examples: switch.tex

List manipulation

In vertical mode, TeX builds a list of boxes intended to be placed in a column. The last element of this vertical list can be inspected and removed. The same for horizontal mode.

\lastskip
\lastkern
amount of the last skip or kern, if the last item of the current list is one
\unskip
\unkern
remove the last item of the current list, if glue or kern
\lastbox
the last box in the current list, if it was a box;
using it removes the box from the current list
cannot be used in external vertical mode
\showbox21
print the content of \box21 to log file
includes glue, kerns and size of boxes
\showlists
show the lists that TeX is currently building
\setbox2=\vsplit1 to \dimen3
takes out the first \dimen3 part of \box1 and moves it to \box2
same mechanism as page breaking

Examples: list.tex
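
As a sketch of \lastbox, the following removes the last line from a vertical list while the list is still being built. The box numbers 0 and 1 and the texts are arbitrary choices.

```tex
\setbox0=\vbox{\hsize=5cm \parindent=0pt
  first line\par
  second line\par
  \global\setbox1=\lastbox}   % detach the last line box while still inside the vbox
\box0                         % now holds the first line (plus the interline glue)
\box1                         % the detached second line
\bye
```

The \lastbox is used inside the \vbox because, as noted above, it cannot be used in external vertical mode.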

Box overlapping

In vertical mode, boxes are tentatively placed like regular text, the baselines at \baselineskip from each other. If the first is deep or the second is tall, they may overlap. To avoid this, the second is moved down.
\lineskiplimit
if boxes in vertical mode are closer than this, move down the lower
\lineskip
glue introduced in between if the second box is lowered

Example: overlapping.tex crossbaseline.tex

Line breaks

\hsize
width at which lines are broken (line length)
also in a \vbox (see below)
\penalty10000
\penalty-10000
\penalty100
forbids, forces or sets the penalty of breaking at this point
break of lines or pages depending on horizontal or vertical mode
macros for line/page break/nobreak are all defined in terms of this
\vfill or \hfill before \penalty-10000 to avoid excessive stretch
\vadjust{\penalty-10000} in a line forces a page break after it
\pretolerance=1000
\tolerance=1000
accept bad paragraphs before or after trying hyphenation
\emergencystretch=10em
use this to avoid formulae overflowing the right margin
technically: add this stretchability if all else fails
abcd\-efgh
\hyphenation{ab-cd-ef xx-yy s-wdf-tr}
allowed hyphenation points: single case or everywhere
general method: \discretionary{before}{after}{unbroken}

Example: line.tex, hyphenation.tex
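
A small sketch of explicit penalties and hyphenation hints in a paragraph. The line width and the sample words are arbitrary.

```tex
\hsize=8cm
% Forced break; \hfil fills the rest of the first line.
\noindent first part\hfil\penalty-10000 second part\par
% No break is allowed between the word Figure and the number.
\noindent Figure\penalty10000\ 1 is never separated from its number.\par
\hyphenation{data-base}   % break point allowed wherever this word occurs
A data\-base can also be given a break point for a single occurrence.\par
\bye
```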

Breaks in boxes

no breaks in an \hbox
it is a single horizontal line of boxes
may contain a \vbox (where breaks are allowed)
\hsize
the length of lines
determines where lines are broken
breaks in a \vbox
done according to \hsize
the width of the box containing the vbox is irrelevant
rationale
when TeX finds characters while in vertical mode, it switches to unrestricted horizontal mode; this mode uses \hsize

Examples: breakbox.tex

Paragraphs

\parskip \parindent
vertical and horizontal movements at the beginning of a paragraph
the indent is not a skip but a hbox of width \parindent
\hangindent, \hangafter
indent lines by the amount \hangindent
only after the first \hangafter lines of the paragraph if \hangafter is zero or positive
only for the first -\hangafter lines if \hangafter is negative
both reset when paragraph ends (including a blank line)
\vadjust{abcd}
places abcd as if it were the following line in the paragraph

Example: paragraph.tex migrating.tex
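
A sketch of hanging indentation with a positive and a negative \hangafter. The macro \filler is invented to produce enough text; the widths are arbitrary.

```tex
\hsize=6cm \parindent=0pt \tolerance=10000
% \filler is a made-up macro that only supplies some text.
\def\filler{one two three four five six seven eight nine ten eleven twelve
  thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty }
\hangindent=1cm \hangafter=2   % lines 3, 4, ... are indented
\filler\filler\par
\hangindent=1cm \hangafter=-2  % only lines 1 and 2 are indented
\filler\filler\par
\filler\filler\par             % no indentation: both were reset by the previous \par
\bye
```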

Other

\prevdepth
the depth of the last line in a paragraph or vbox (simplifying)
can only be used in vertical mode
(=when paragraph is ended and the next not yet started)
depth of a vbox
zero if the last item is a kern or glue
\voidb@x
a box that is always empty

Examples: depth.tex

Rules and leaders

Contrary to what its name suggests, \vrule is a horizontal command, which means it can only be used in horizontal mode. This is because a \vrule is intended to be placed in a horizontal list, like in ab|cd. This is indeed realized by \hbox{ab\vrule cd}

In a similar way, \hrule is a vertical command, intended to separate elements placed one under the other. For the contrary (horizontal line in a horizontal list), use leaders.

\vrule
a vertical line
as tall as the horizontal box in which it is
\hrule
a horizontal line
as wide as the vertical box in which it is
\hrule width10pt height8pt depth2pt
a rule with fixed dimensions
vertical command, \vrule is horizontal
\leaders\hbox{+}\hfil
fill space with repetition of the box
a rule can be used in place of the box
any glue can be used instead of \hfil
vertical leaders are also allowed

Example: rules.tex leaders.tex
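
Rules and leaders in a sketch; the texts and the 5cm width are arbitrary.

```tex
\hbox{ab\vrule cd}                 % vertical line in a horizontal list
\hrule                             % horizontal line between two boxes of the page
\hbox to 5cm{Chapter 1\leaders\hbox to 1em{\hss.\hss}\hfil 7}  % dotted leaders
\hbox to 5cm{left\leaders\hrule\hfil right}  % a rule stretched like glue
\bye
```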

Tables

\halign
a table of equally-horizontally-aligned rows
\halign{specification\cr row\cr row\cr row\cr}
\valign
a table of equally-vertically-aligned columns
\valign{specification\cr column\cr column\cr column\cr}

Examples: align.tex
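
A minimal \halign sketch with invented content. In the specification, # marks where the content of each entry goes and & separates the columns.

```tex
% First column left-aligned, second column right-aligned.
\halign{#\hfil\quad&\hfil#\cr
name&value\cr
alpha&1\cr
beta&22\cr}
\bye
```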

Math

\mathcode`<="0042
in math mode < produces B (character 42)
first two digits are for class and family (font)
a \over b
a \atop b
a over b, with and without line
also versions ...withdelims
\underline{abcd}
\overline{abcd}
overline and underline
\mathord A
\mathbin A
\mathrel A
spaces around A: ordinary characters, binary operators and relations
increasing amount of spacing
\displaystyle {a \over b}
\textstyle{a \over b}
force formula to be shown in display and text style
the first is useful for placing formulae in tables

Examples: math.tex
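
The math commands above in a short sketch. The \mathcode change is kept inside a group so that it does not affect later formulas.

```tex
$a \over b$ and $a \atop b$\par     % with and without the fraction line
$\displaystyle{a \over b}$ versus $\textstyle{a \over b}$\par
$a \mathord{+} b$, $a \mathbin{+} b$, $a \mathrel{+} b$\par  % increasing space around +
{\mathcode`<="0042 $a<b$}\par       % class 0, family 0, character "42: B appears in place of <
\bye
```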

Page

origin
one inch from the left and top edges of the page
\hoffset
\voffset
set the origin of the page
\hsize
\vsize
length at which lines/pages are broken
\nobreak
forbids a break at this point in a vertical list
only in vertical mode: after the previous paragraph or hbox has ended
inside a line/paragraph: \vadjust{\nobreak}
forbids a page break after the current line (not paragraph)
\penalty
in vertical mode, the cost of breaking pages at this point
\nobreak is defined from this
Example: page.tex

Output routine

\box255
the current page
\shipout\box255
output the box as the next page
works for an arbitrary box
\output{some tokens}
set tokens to be inserted in the sequence of tokens when the page ends
these tokens are then scanned and interpreted
done inside a group: assignments are local unless \global
\deadcycles
\maxdeadcycles
count the calls to \output since the last \shipout; an error is raised when \deadcycles reaches \maxdeadcycles

Example: output.tex
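
A deliberately simplified output routine, as a sketch only: it replaces the one of plain TeX, ignores footnotes and other insertions, and assumes the page number is kept in \pageno as in plain TeX.

```tex
% Ship out the page followed by a centered page number, then advance the number.
\output={\shipout\vbox{\box255 \medskip \centerline{page \number\pageno}}%
  \global\advance\pageno by 1 }
Some text.
\bye
```

The \global is needed because the tokens of \output are executed inside a group.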


Some commands and macros

Examples: some.tex

Special

Used by other programs, like xdvi, dvips, dvipdfm.

tex -src ...
latex -src ...
Add source specials to the dvi file, to be used by the dvi viewer.
Same as \special{src:line file} at every paragraph.
\special{papersize=200pt,300pt}
Make pages 200pt wide and 300pt tall.
Border shown in red by xdvi.
Used by dvips and dvipdfm.
\special{pdf: outline 1 << /Title (third page) /Dest [ @thispage /FitH @ypos ] >>}
\special{pdf: outline 1 << /Title (another) /Dest [ @page2 /FitH 0 ] >>}
Add the current page or page 2 to the pdf bookmarks (dvipdfm).

How to