An important part of the m4 language is the quote characters. By default the quote characters in m4 are the characters ` and ' but ' is needed for H.T.M.L. so it is not practical to use these as quote characters. Instead I always define the quote characters to be the left square bracket [ and the right square bracket ] because they are hardly ever used in H.T.M.L., while left round bracket ( and right round bracket ) are needed by m4 and < and > are needed for H.T.M.L. tags. The rest of this document will explain how quotes are used. To change quotes to [ . . . ] characters, you need to write the following line of m4 code at the top of every m4 document:
m4_changequote([,])
The m4 language by default also has the # character as the start of a comment. For creating H.T.M.L. documents it is convenient to turn comments off, which can be achieved by the following line of m4 code:
m4_changecom
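As a rough sketch, the two settings above can be combined at the top of a source file. This assumes GNU m4 invoked with the -P switch; the file name page.m4 and the macro m4_site are made up for the illustration.

```m4
m4_dnl Hypothetical file page.m4, processed with: m4 -P page.m4
m4_dnl Use [ and ] as the quote characters.
m4_changequote([,])m4_dnl
m4_dnl Turn comments off, so that text following a # is still evaluated.
m4_changecom[]m4_dnl
m4_dnl m4_site is an illustrative macro, not a built-in.
m4_define([m4_site],[example.org])m4_dnl
<p>It's at <a href="#top">m4_site</a></p>
```

With comments turned off, this should produce `<p>It's at <a href="#top">example.org</a></p>`. If comments were left on, everything from the # to the end of the line would be copied through as is, and m4_site would not be evaluated.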
What follows is a diagram showing, in a slightly simplified way, how m4 works. In this section we will make use of the following notation: macro definitions will be indicated by the red right arrow (→), so that X → Y means that X evaluates to Y. In the following, eval( . . . ) stands for Evaluate, ieval( . . . ) stands for Inner Evaluate and mcall( . . . ) stands for Macro Call.
[Diagram of the evaluation rules for eval, ieval and mcall. Recoverable fragments: a result of X; a result of nothing, with the side effect of setting ieval(X) → ieval(Y); and a macro call on A that is expanded if A is defined, or left as A( . . . ) if A is undefined.]
Note that in the second and fifth items of the above list the value for n can be zero, which means that you are calling a macro with no arguments. Suppose that the macro X → Y is defined; then mcall(X,Z1, . . . ,Zn) evaluates to Y with every occurrence of $1 replaced with Z1, every occurrence of $2 replaced with Z2, and so on until every occurrence of $n is replaced with Zn.
Note that there is a bug in the above algorithm for quoted arguments to macros. This bug only manifests itself in § 5.5 and Question 5.7b. I also know of another bug, but it is so rare in practice that I will not mention it in this document.
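The substitution of $1, $2 and so on described above can be tried out directly. The following is only a sketch, assuming GNU m4 run with -P; the macros m4_link and m4_author are invented for the example, and m4_author shows the case where n is zero.

```m4
m4_changequote([,])m4_dnl
m4_dnl m4_link takes two arguments: $1 is the address and $2 is the link text.
m4_define([m4_link],[<a href="$1">$2</a>])m4_dnl
m4_dnl m4_author takes no arguments at all.
m4_define([m4_author],[A. N. Other])m4_dnl
m4_link(index.html,Home) by m4_author
```

This should generate `<a href="index.html">Home</a> by A. N. Other`, with index.html substituted for $1 and Home substituted for $2.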
Suppose the following definitions are in effect:
APPLE → aaa $1 aaa
SEX → DOG
which can be achieved by the following lines of code:
m4_define(APPLE,aaa $1 aaa)
m4_define(SEX,DOG)
then APPLE(SEX) evaluates to APPLE(DOG), which finally evaluates to aaa DOG aaa.
Suppose the following definitions are in effect:
APPLE → BANANA
BANANA → CARROT
CARROT → DOUGHNUT
which can be achieved by the following lines of code:
m4_define(APPLE,BANANA)
m4_define(BANANA,CARROT)
m4_define(CARROT,DOUGHNUT)
then APPLE evaluates to BANANA, which evaluates to CARROT, which finally evaluates to DOUGHNUT. In the rest of this document we will make use of the red right double arrow (⇒) in the following abbreviated notation: APPLE ⇒ BANANA ⇒ CARROT ⇒ DOUGHNUT.
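The chain APPLE ⇒ BANANA ⇒ CARROT ⇒ DOUGHNUT can be checked with a small test file. This is a minimal sketch that assumes GNU m4 run with -P; the m4_dnl at the end of each line only suppresses the blank line that the definition would otherwise leave in the output.

```m4
m4_changequote([,])m4_dnl
m4_dnl The three definitions from the text, in the same order.
m4_define(APPLE,BANANA)m4_dnl
m4_define(BANANA,CARROT)m4_dnl
m4_define(CARROT,DOUGHNUT)m4_dnl
APPLE
```

The only line of output should be DOUGHNUT, because each result is evaluated again until an undefined name is reached.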
The basic rule of thumb is that you evaluate the arguments of macros before you evaluate the enclosing macros. When evaluating a macro call, you replace it with its expansion until you get a macro call which is undefined, which you leave as is. Also remember that outer macros have their results evaluated one more time than inner macros. In practice you simply write macro calls without quotes [ and ] and add quotes until your code behaves the way that you want it to.
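Here is a short sketch of the rule of thumb above, again assuming GNU m4 run with -P. APPLE and SEX are the macros defined earlier in this section; PEAR is a made-up name that is deliberately left undefined.

```m4
m4_changequote([,])m4_dnl
m4_define(APPLE,aaa $1 aaa)m4_dnl
m4_define(SEX,DOG)m4_dnl
m4_dnl The argument SEX is evaluated to DOG before APPLE is expanded.
APPLE(SEX)
m4_dnl PEAR is undefined, so the call is left as is, but its argument is still evaluated.
PEAR(SEX)
```

The first line of output should be `aaa DOG aaa` and the second should be `PEAR(DOG)`.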
When m4 is invoked with the -P command-line switch, all built-in macros are named with the prefix m4_. All macros that you write should also use this prefix because such symbols are coloured in Emacs with a red background.
The m4_ prefix is highly unlikely to clash accidentally with any word that you would think of writing. If you don't want to or can't be bothered to use the m4_ prefix, then you should name the macro using all uppercase letters to reduce the chances that the macro will accidentally clash with a word that you will write.
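To see the kind of clash that the naming advice above guards against, consider this sketch, which assumes GNU m4 run with -P. Both macro names are invented for the illustration: title is a badly chosen name, and m4_title is the same macro with the recommended prefix.

```m4
m4_changequote([,])m4_dnl
m4_dnl A lowercase name without the prefix clashes with the H.T.M.L. tag of the same name.
m4_define([title],[My Home Page])m4_dnl
m4_define([m4_title],[My Home Page])m4_dnl
<title>m4_title</title>
```

This should generate `<My Home Page>My Home Page</My Home Page>`, because the word title inside the tags is evaluated as a macro as well. Removing the first definition leaves the tags alone.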
Quotes in m4 are used to prevent the evaluation of expressions. The first use of quotes is for constants, which are expressions that we don't want evaluated. For example, suppose the following macros are defined:
m4_define(SEX,DOG)
m4_define(APPLE,AAA $1 AAA)
m4_define(BANANA,BBB $1 BBB)
so that SEX → DOG and APPLE → AAA $1 AAA and BANANA → BBB $1 BBB. With these macros in effect, writing SEX in our m4 source file generates DOG in our H.T.M.L. output file, because line 1 above defines SEX → DOG. So the natural question to ask is, how do we write the word SEX and prevent m4 from evaluating it as a macro? The answer depends on how many levels of macro call you are currently in:
Note: There is a bug in my algorithm for the last two elements in this list so that the output from the algorithm presented in § 5.2 differs from the output of m4.
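The following sketch shows what GNU m4 itself does (run with -P) when the word SEX is protected at three different depths, using the definitions given above. The pattern of one extra pair of quotes for each enclosing macro call is my reading of m4's behaviour, so treat it as something to test rather than as a rule.

```m4
m4_changequote([,])m4_dnl
m4_define(SEX,DOG)m4_dnl
m4_define(APPLE,AAA $1 AAA)m4_dnl
m4_define(BANANA,BBB $1 BBB)m4_dnl
m4_dnl Outside any macro call, one pair of quotes is enough.
[SEX]
m4_dnl Inside one macro call, two pairs are needed.
APPLE([[SEX]])
m4_dnl Inside two nested macro calls, three pairs are needed.
BANANA(APPLE([[[SEX]]]))
```

The three lines of output should be `SEX`, `AAA SEX AAA` and `BBB AAA SEX AAA BBB`. One pair of quotes is removed each time the text is evaluated.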
Naïvely you would think that the following code defines SEX → APPLE and then defines SEX → BANANA:
m4_define(SEX,APPLE)
m4_define(SEX,BANANA)
Let's take a look at what actually happens. Line 1 defines SEX → APPLE and line 2 defines APPLE (the current value of SEX) → BANANA. Therefore SEX ⇒ APPLE ⇒ BANANA. To avoid the spurious definition of other macros, we need to quote the first arguments to m4_define like so:
m4_define([SEX],APPLE)
m4_define([SEX],BANANA)
With these definitions in force, SEX ⇒ BANANA. The golden rule is therefore that you should always quote the first argument to a m4_define statement.
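The difference between the two versions only shows up when you look at APPLE as well as SEX. The sketch below assumes GNU m4 run with -P and uses the built-in m4_undefine to start again between the two experiments; the labels unquoted: and quoted: are plain text.

```m4
m4_changequote([,])m4_dnl
m4_dnl First argument not quoted: the second line ends up defining APPLE.
m4_define(SEX,APPLE)m4_dnl
m4_define(SEX,BANANA)m4_dnl
unquoted: SEX APPLE
m4_dnl Forget both macros and try again with the first argument quoted.
m4_undefine([SEX])m4_undefine([APPLE])m4_dnl
m4_define([SEX],APPLE)m4_dnl
m4_define([SEX],BANANA)m4_dnl
quoted: SEX APPLE
```

The output should be `unquoted: BANANA BANANA` followed by `quoted: BANANA APPLE`. In the second experiment APPLE is left alone because it was never defined.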
With the following definitions in place:
m4_define([APPLE],BANANA)
m4_define([SEX],APPLE)
m4_define([APPLE],CARROT)
then SEX ⇒ BANANA, the value of APPLE at the time the SEX macro is defined. This technique is called early binding, since the binding of the name occurs when the SEX macro is defined. With the following definitions in place:
m4_define([APPLE],BANANA)
m4_define([SEX],[APPLE])
m4_define([APPLE],CARROT)
then SEX ⇒ CARROT, the value of APPLE at the time SEX is evaluated. This technique is called late binding since the binding of the name occurs when the expression SEX is evaluated. Usually late binding is preferable to early binding so the golden rule is that you should always quote the second argument to m4_define. Combining these two golden rules, you should always quote both arguments to m4_define, unless you want early binding in which case you should only quote the first argument to m4_define.
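Early and late binding can be compared side by side in one file. This is a sketch assuming GNU m4 run with -P; the macro names EARLY and LATE are invented for the comparison and stand in for the two versions of SEX above.

```m4
m4_changequote([,])m4_dnl
m4_define([APPLE],BANANA)m4_dnl
m4_dnl Second argument not quoted: APPLE is evaluated now, so EARLY is defined as BANANA.
m4_define([EARLY],APPLE)m4_dnl
m4_dnl Second argument quoted: LATE is defined as APPLE, which is evaluated when LATE is used.
m4_define([LATE],[APPLE])m4_dnl
m4_define([APPLE],CARROT)m4_dnl
EARLY LATE
```

The output should be `BANANA CARROT`.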
Consider the following pseudo code:
FUNCTION factorial(n : INTEGER) : INTEGER
BEGIN
   IF n = 0 RETURN 1;
   ELSE RETURN n * factorial(n - 1);
END
This is a self-referential definition because in line 4 the function calls itself. This function successfully computes the factorial of the argument n which can take any value in the set {0,1,2,3,...}, and whose result is equal to:
n * (n-1) * (n-2) * ... * 3 * 2 * 1 |
Self-referential definitions are okay in some cases. The above pseudo code gives an infinite loop if it is given an argument less than zero. The following self-referential function also results in an infinite loop, whatever argument is given.
FUNCTION factorial(n : INTEGER) : INTEGER
BEGIN
   RETURN factorial(n);
END
Therefore this self-referential definition should not be used. Self-referential definitions are allowed in m4 which gives you more rope to hang yourself with. Basically, self-referential definitions should only be used by advanced users of m4 because of the danger of resulting in an infinite loop. Suppose the following definition is in place:
m4_define([FROG],FROG)
If FROG → X before this line then the above line has no effect so that FROG ⇒ X which may or may not result in an infinite loop. If FROG is undefined before this line, then this line defines FROG → FROG, so that FROG ⇒ FROG ⇒ FROG and so on in an infinite loop. Suppose we quote both arguments to m4_define like so:
m4_define([FROG],[FROG])
This line defines FROG → FROG so that FROG ⇒ FROG ⇒ FROG and so on in an infinite loop. Therefore these kinds of self-referential macros should not be used. Advanced users of m4 should look at the m4 info pages for the m4_shift statement, which provides a way of writing self-referential macros that do not result in an infinite loop. The info page can be found by pressing F1 in Emacs for info and then scrolling down to the m4 info page.
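For comparison with the factorial pseudo code earlier in this section, here is one possible self-referential macro that does stop. It is a sketch that assumes GNU m4 run with -P, and it uses the built-in macros m4_ifelse, m4_eval and m4_decr rather than m4_shift. The name FACTORIAL is invented for the example.

```m4
m4_changequote([,])m4_dnl
m4_dnl If the argument is 0 the result is 1 and the macro does not call itself.
m4_dnl Otherwise the result is the argument times FACTORIAL of the argument less one.
m4_define([FACTORIAL],
  [m4_ifelse([$1],[0],[1],
     [m4_eval([$1] * FACTORIAL(m4_decr([$1])))])])m4_dnl
FACTORIAL(5)
```

The output should be 120. As with the pseudo code, a negative argument never reaches the stopping case, so the macro should only be called with an argument from the set {0,1,2,3,...}.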
Double macro infinite loops are also possible, including:
m4_define([APPLE],BANANA)
m4_define([BANANA],APPLE)
Line 1 defines APPLE → BANANA and line 2 defines BANANA → BANANA (the current value of APPLE), so that APPLE ⇒ BANANA ⇒ BANANA and so on in an infinite loop. Quoting both arguments to m4_define() gives a slightly different result:
m4_define([APPLE],[BANANA])
m4_define([BANANA],[APPLE])
Line 1 defines APPLE → BANANA and line 2 defines BANANA → APPLE, so that APPLE ⇒ BANANA ⇒ APPLE ⇒ BANANA and so on in an infinite loop. The moral of this story is that you should never write a pair of mutually-referential macros. In advanced uses of m4 however, this restriction does not apply.
Triple, quadruple, quintuple . . . (and so on) macro infinite loops are also theoretically possible.