davin.50webs.com/webdesign
a GNU world order – your home of everything that is free

Web design course tutorial 5 : advanced m4

NOTE: This tutorial can be skipped by weaker mathematics students. The intention is to provide an analysis of the m4 language that is as rigorous as possible. Such detail is only needed by those people who wish to write new m4 code, as opposed to those people who merely want to borrow other people's existing m4 code.

§ 5.1 Quotes and comments in m4

An important part of the m4 language is the quote characters. By default the quote characters in m4 are the characters ` and ' but ' is needed for H.T.M.L. so it is not practical to use these as quote characters. Instead I always define the quote characters to be the left square bracket [ and the right square bracket ] because they are hardly ever used in H.T.M.L., while left round bracket ( and right round bracket ) are needed by m4 and < and > are needed for H.T.M.L. tags. The rest of this document will explain how quotes are used. To change quotes to [ . . . ] characters, you need to write the following line of m4 code at the top of every m4 document:

m4_changequote([,])

The m4 language by default also has the # character as the start of a comment. For creating H.T.M.L. documents it is convenient to turn comments off, which can be achieved by the following line of m4 code:

m4_changecom
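To see why turning comments off matters for H.T.M.L., here is a minimal sketch, assuming GNU m4 is run with the -P switch. The macro name m4_site and the link text are made up for illustration, and m4_dnl (the built-in that discards the rest of its line) is used only to keep the output tidy.

```m4
m4_changequote([,])m4_dnl
m4_define([m4_site],[example])m4_dnl
<a href="#top">m4_site</a>
m4_changecom
<a href="#top">m4_site</a>
```

While comments are still on, everything from the # to the end of the line is copied through untouched, so the first link should keep the literal text m4_site. After m4_changecom the # is ordinary text, so the second link should come out as <a href="#top">example</a>.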

§ 5.2 How m4 works for maths nerds

What follows is a diagram showing, in a slightly simplified way, how m4 works. In this section we will make use of the following notation: macro definitions will be indicated by the red right arrow (→) so that X → Y means that X evaluates to Y. In the following, eval( . . . ) stands for Evaluate, ieval( . . . ) stands for Inner Evaluate and mcall( . . . ) stands for Macro Call.

`ieval([X])`	`= X`
`ieval(m4_define(X,Y))`	= nothing, but has the side effect of setting `ieval(X)` → `ieval(Y)`.
`ieval(A(B₁, . . . ,B_n))`	`= ieval(mcall(A,ieval(B₁), . . . ,ieval(B_n)))`	⟩— if A is defined, or
	`= A(ieval(B₁), . . . ,ieval(B_n))`	⟩— if A is undefined.
`ieval(A₁ . . . A_n)`	`= ieval(A₁) . . . ieval(A_n)`
`eval([X])`	`= X`
`eval(m4_define(X,Y))`	= nothing, but has the side effect of setting `ieval(X)` → `ieval(Y)`.
`eval(A(B₁, . . . , B_n))`	`= ieval(ieval(A(B₁, . . . , B_n)))`
`eval(A₁ . . . A_n)`	`= eval(A₁) . . . eval(A_n)`

Note that in the two macro-call rules above (those of the form A(B₁, . . . ,B_n)) the value of n can be zero, which means that you are calling a macro with no arguments. Suppose that the macro X → Y is defined; then mcall(X,Z₁, . . . ,Z_n) evaluates to Y with every occurrence of $1 replaced with Z₁, every occurrence of $2 replaced with Z₂, and so on until every occurrence of $n is replaced with Z_n.
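The $1, $2 substitution performed by mcall can be sketched with a two-argument macro. This assumes m4 is run with -P and that the quote characters have been changed to [ and ]; the macro name m4_link and its arguments are hypothetical.

```m4
m4_changequote([,])m4_dnl
m4_define([m4_link],[<a href="$1">$2</a>])m4_dnl
m4_link(index.html,Home page)
```

Here Z₁ is index.html and Z₂ is Home page, so the call should expand to <a href="index.html">Home page</a>.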

Note that there is a bug in the above algorithm for quoted arguments to macros. This bug only manifests itself in § 5.5 and Question 5.7b. I also know of another bug but it is so rare in practice that I will not mention it in this document.

Suppose the following definitions are in effect:

`APPLE`	`→ aaa $1 aaa`
`SEX`	`→ DOG`

which can be achieved by the following lines of code:

m4_define(APPLE,aaa $1 aaa)
m4_define(SEX,DOG)

then

eval(APPLE(SEX)) = ieval(ieval(APPLE(SEX)))
                 = ieval(ieval(mcall(APPLE,ieval(SEX))))
                 = ieval(ieval(mcall(APPLE,ieval(mcall(SEX)))))
                 = ieval(ieval(mcall(APPLE,ieval(DOG))))
                 = ieval(ieval(mcall(APPLE,DOG)))
                 = ieval(ieval(aaa DOG aaa))
                 = ieval(ieval(aaa) ieval(DOG) ieval(aaa))
                 = ieval(aaa DOG aaa)
                 = ieval(aaa) ieval(DOG) ieval(aaa)
                 = aaa DOG aaa.

Therefore APPLE(SEX) evaluates to APPLE(DOG) which finally evaluates to aaa DOG aaa. Suppose the following definitions are in effect:

`APPLE`	`→ BANANA`
`BANANA`	`→ CARROT`
`CARROT`	`→ DOUGHNUT`

which can be achieved by the following lines of code:

m4_define(APPLE,BANANA)
m4_define(BANANA,CARROT)
m4_define(CARROT,DOUGHNUT)

then

eval(APPLE) = ieval(ieval(APPLE))
            = ieval(ieval(mcall(APPLE)))
            = ieval(ieval(BANANA))
            = ieval(ieval(mcall(BANANA)))
            = ieval(ieval(CARROT))
            = ieval(ieval(mcall(CARROT)))
            = ieval(ieval(DOUGHNUT))
            = ieval(DOUGHNUT)
            = DOUGHNUT.

Therefore APPLE evaluates to BANANA, which evaluates to CARROT, which finally evaluates to DOUGHNUT. In the rest of this document we will make use of the red right double arrow (⇒) in the following abbreviated notation: APPLE ⇒ BANANA ⇒ CARROT ⇒ DOUGHNUT.

§ 5.3 How m4 works for the rest of us

The basic rule of thumb is that you evaluate the arguments of macros before you evaluate the enclosing macros. When evaluating a macro call, you replace it with its expansion until you get a macro call which is undefined, which you leave as is. Also remember that outer macros have their results evaluated one more time than inner macros. In practice you simply write macro calls without quotes [ and ] and add quotes until your code behaves the way that you want it to.
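The rule of thumb can be tried out with a small sketch, assuming m4 is run with -P. APPLE and SEX are defined as in § 5.2, while PEAR is a hypothetical name that is deliberately left undefined.

```m4
m4_changequote([,])m4_dnl
m4_define([APPLE],[aaa $1 aaa])m4_dnl
m4_define([SEX],[DOG])m4_dnl
APPLE(SEX)
PEAR(SEX)
```

The first call should give aaa DOG aaa, because the argument SEX is evaluated to DOG before APPLE is expanded. The second should give PEAR(DOG): the argument is still evaluated, but the undefined PEAR is left as is.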

§ 5.4 Macro naming convention

When m4 is invoked with the -P command-line switch, all built-in macros are named with the prefix m4_. All macros that you write should also use this prefix because such symbols are coloured in Emacs with a red background.

The m4_ prefix is highly unlikely to accidentally clash with any word that you would think of writing. If you don't want to, or can't be bothered to, use the m4_ prefix, then you should name the macro using all uppercase letters to reduce the chance that the macro will accidentally clash with a word that you write.
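The kind of accidental clash that the naming convention guards against can be sketched as follows, assuming m4 is run with -P. Both macro names (cat and m4_cat) and the text are invented for this example.

```m4
m4_changequote([,])m4_dnl
m4_define([cat],[<b>Scruff</b>])m4_dnl
m4_define([m4_cat],[<b>Scruff</b>])m4_dnl
My cat is called m4_cat.
```

Because cat is an ordinary English word, m4 expands it wherever it appears, so the sentence should come out as My <b>Scruff</b> is called <b>Scruff</b>. rather than keeping the word cat. A name such as m4_cat (or CAT) is much less likely to turn up in running text by accident.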

§ 5.5 Constants

Quotes in m4 are used to prevent the evaluation of expressions. The first use of quotes is for constants, which are expressions that we don't want evaluated. For example, suppose the following macros are defined:

m4_define(SEX,DOG)
m4_define(APPLE,AAA $1 AAA)
m4_define(BANANA,BBB $1 BBB)

so that SEX → DOG and APPLE → AAA $1 AAA and BANANA → BBB $1 BBB. With these macros in effect, writing SEX in our m4 source file generates DOG in our H.T.M.L. output file, because line 1 above defines SEX → DOG. So the natural question to ask is, how do we write the word SEX and prevent m4 from evaluating it as a macro? The answer depends on how many levels of macro call you are currently in:

To write literal SEX outside of any macros, we write: [SEX]. This evaluates to SEX. Note that one level of quotes is required to quote the constant.
To write literal SEX inside the APPLE macro, we write: APPLE([[SEX]]). This evaluates to AAA SEX AAA. Note that two levels of quotes are required to quote the constant.
To write literal SEX inside the double macro invocation APPLE(BANANA(...)) we write: APPLE(BANANA([[[SEX]]])). This evaluates to AAA BBB SEX BBB AAA. Note that three levels of quotes are required to quote the constant.

Note: There is a bug in my algorithm for the last two elements in this list so that the output from the algorithm presented in § 5.2 differs from the output of m4.
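The three quoting levels can be checked against m4 itself with the sketch below, which assumes m4 is run with -P. The definitions are the ones from this section with both arguments quoted (which gives the same macros); the last line is an extra, deliberately under-quoted call added for contrast.

```m4
m4_changequote([,])m4_dnl
m4_define([SEX],[DOG])m4_dnl
m4_define([APPLE],[AAA $1 AAA])m4_dnl
m4_define([BANANA],[BBB $1 BBB])m4_dnl
[SEX]
APPLE([[SEX]])
APPLE(BANANA([[[SEX]]]))
APPLE([SEX])
```

The first three lines should produce SEX, AAA SEX AAA and AAA BBB SEX BBB AAA respectively. The last line has one level of quotes too few: the quotes are stripped when the argument is collected, the result of APPLE is evaluated once more, and the output should be AAA DOG AAA.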

§ 5.6 Naïve use of m4

Naïvely you would think that the following code defines SEX → APPLE and then defines SEX → BANANA:

m4_define(SEX,APPLE)
m4_define(SEX,BANANA)

Let's take a look at what actually happens. Line 1 defines SEX → APPLE and line 2 defines APPLE (the current value of SEX) → BANANA. Therefore SEX ⇒ APPLE ⇒ BANANA. To avoid the spurious definition of other macros, we need to quote the first arguments to m4_define like so:

m4_define([SEX],APPLE)
m4_define([SEX],BANANA)

With these definitions in force, SEX ⇒ BANANA. The golden rule is therefore that you should always quote the first argument to a m4_define statement.
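One way to see which macros the naïve code really defines is to print the stored definitions without evaluating them. The sketch below assumes m4 is run with -P and uses the built-in m4_defn, which is not otherwise covered in this tutorial; it returns the quoted definition of the named macro.

```m4
m4_changequote([,])m4_dnl
m4_define(SEX,APPLE)m4_dnl
m4_define(SEX,BANANA)m4_dnl
[SEX] -> m4_defn([SEX])
[APPLE] -> m4_defn([APPLE])
```

This should print SEX -> APPLE and APPLE -> BANANA, confirming that the second unquoted m4_define created a spurious APPLE macro instead of redefining SEX.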

§ 5.7 Early binding versus late binding

With the following definitions in place:

m4_define([APPLE],BANANA)
m4_define([SEX],APPLE)
m4_define([APPLE],CARROT)

then SEX ⇒ BANANA, the value of APPLE at the time the SEX macro is defined. This technique is called early binding, since the binding of the name occurs when the SEX macro is defined. With the following definitions in place:

m4_define([APPLE],BANANA)
m4_define([SEX],[APPLE])
m4_define([APPLE],CARROT)

then SEX ⇒ CARROT, the value of APPLE at the time SEX is evaluated. This technique is called late binding since the binding of the name occurs when the expression SEX is evaluated. Usually late binding is preferable to early binding so the golden rule is that you should always quote the second argument to m4_define. Combining these two golden rules, you should always quote both arguments to m4_define, unless you want early binding in which case you should only quote the first argument to m4_define.
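Early and late binding can be compared side by side in one file. This is a minimal sketch assuming m4 is run with -P; the macro names EARLY and LATE are hypothetical and stand in for the two versions of SEX above.

```m4
m4_changequote([,])m4_dnl
m4_define([APPLE],[BANANA])m4_dnl
m4_define([EARLY],APPLE)m4_dnl
m4_define([LATE],[APPLE])m4_dnl
m4_define([APPLE],[CARROT])m4_dnl
EARLY
LATE
```

EARLY should give BANANA, because the unquoted APPLE was evaluated when EARLY was defined. LATE should give CARROT, because the quoted APPLE is only evaluated when LATE is used, by which time APPLE has been redefined.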

§ 5.8 Single macro infinite loops

Consider the following pseudo code:

   FUNCTION factorial(n : INTEGER) : INTEGER
   BEGIN
      IF n = 0 RETURN 1;
      ELSE RETURN n * factorial(n - 1);
   END

This is a self-referential definition because in line 4 the function calls itself. This function successfully computes the factorial of the argument n which can take any value in the set {0,1,2,3,...}, and whose result is equal to:

 n * (n-1) * (n-2) * ... * 3 * 2 * 1

Self-referential definitions are okay in some cases. The above pseudo code gives an infinite loop if it is given an argument less than zero. The following self-referential function also results in an infinite loop, whatever argument is given.

   FUNCTION factorial(n : INTEGER) : INTEGER
   BEGIN
      RETURN factorial(n);
   END

Therefore this self-referential definition should not be used. Self-referential definitions are allowed in m4 which gives you more rope to hang yourself with. Basically, self-referential definitions should only be used by advanced users of m4 because of the danger of resulting in an infinite loop. Suppose the following definition is in place:

m4_define([FROG],FROG)

If FROG → X before this line then the above line has no effect so that FROG ⇒ X which may or may not result in an infinite loop. If FROG is undefined before this line, then this line defines FROG → FROG, so that FROG ⇒ FROG ⇒ FROG and so on in an infinite loop. Suppose we quote both arguments to m4_define like so:

m4_define([FROG],[FROG])

This line defines FROG → FROG so that FROG ⇒ FROG ⇒ FROG and so on in an infinite loop. Therefore these kinds of self-referential macros should not be used. Advanced users of m4 should look at the m4 info pages for the m4_shift statement, which provides a way of writing self-referential macros that do not result in an infinite loop. The info page can be found by pressing F1 in Emacs for info and then scrolling down to the m4 info page.
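For advanced users, here is a sketch of two self-referential macros that do terminate, assuming m4 is run with -P. Both macro names are hypothetical. m4_factorial mirrors the pseudo code at the start of this section, using the built-ins m4_ifelse, m4_eval and m4_decr; m4_items uses m4_shift to drop its first argument on each call.

```m4
m4_changequote([,])m4_dnl
m4_dnl m4_factorial: the comparison of $1 with 0 is the base case.
m4_define([m4_factorial],
  [m4_ifelse([$1],[0],[1],[m4_eval($1 * m4_factorial(m4_decr($1)))])])m4_dnl
m4_dnl m4_items: one list item per argument, stopping at an empty argument.
m4_define([m4_items],
  [m4_ifelse([$1],[],[],[<li>$1</li>m4_items(m4_shift($@))])])m4_dnl
m4_factorial(5)
<ul>m4_items(apple,banana,carrot)</ul>
```

The first call should print 120 and the second <ul><li>apple</li><li>banana</li><li>carrot</li></ul>. In both macros the self-reference sits inside the quoted last argument of m4_ifelse, so it is only evaluated when the base case has not yet been reached. As with the pseudo code, m4_factorial still loops forever for an argument less than zero, and m4_items stops early if one of the arguments is empty.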

§ 5.9 Double macro infinite loops

Double macro infinite loops are also possible, including:

m4_define([APPLE],BANANA)
m4_define([BANANA],APPLE)

Line 1 defines APPLE → BANANA and line 2 defines BANANA → BANANA (the current value of APPLE), so that APPLE ⇒ BANANA ⇒ BANANA and so on in an infinite loop. Quoting both arguments to m4_define() gives a slightly different result:

m4_define([APPLE],[BANANA])
m4_define([BANANA],[APPLE])

Line 1 defines APPLE → BANANA and line 2 defines BANANA → APPLE, so that APPLE ⇒ BANANA ⇒ APPLE ⇒ BANANA and so on in an infinite loop. The moral of this story is that you should never write a pair of mutually-referential macros. In advanced uses of m4 however, this restriction does not apply.

§ 5.10 Other kinds of infinite loops

Triple, quadruple, quintuple . . . (and so on) macro infinite loops are also theoretically possible.

§ 5.11 Questions

You can view the answers to this tutorial. A password is required.
