Chapter 9. Programming in Makefiles

Table of Contents

9.1. Makefile variables
9.1.1. Naming conventions
9.2. Code snippets
9.2.1. Adding things to a list
9.2.2. Converting an internal list into an external list
9.2.3. Passing variables to a shell command
9.2.4. Quoting guideline
9.2.5. Workaround for a bug in BSD Make

Pkgsrc consists of many Makefile fragments, each of which forms a well-defined part of the pkgsrc system. Using the make(1) system as a programming language for a big system like pkgsrc requires some discipline to keep the code correct and understandable.

The basic ingredients for Makefile programming are variables (which are actually macros) and shell commands. These shell commands may include more complex ones, such as awk(1) programs. To make sure that every shell command runs as intended, it is necessary to quote all variables correctly when they are used.

This chapter describes some patterns that appear quite often in Makefiles, including the pitfalls that come along with them.

9.1. Makefile variables

Makefile variables contain strings that can be processed using the five operators ``='', ``+='', ``?='', ``:='', and ``!='', which are described in the make(1) man page.

When a variable's value is parsed from a Makefile, the hash character ``#'' and the backslash character ``\'' are handled specially. If a backslash is followed by a newline, any whitespace immediately in front of the backslash, the backslash, the newline, and any whitespace immediately behind the newline are replaced with a single space. A backslash character and an immediately following hash character are replaced with a single hash character. Otherwise, the backslash is passed as is. In a variable assignment, any hash character that is not preceded by a backslash starts a comment that continues up to the end of the logical line.

Note: Because of this parsing algorithm, the only way to create a variable consisting of a single backslash is to use the ``!='' operator, for example: BACKSLASH!=echo "\\".

So much for defining variables. The other thing you can do with variables is evaluate them. A variable is evaluated when it is part of the right side of the ``:='' or the ``!='' operator, or directly before executing a shell command which the variable is part of. In all other cases, make(1) performs lazy evaluation, that is, variables are not evaluated until there is no other way. The ``modifiers'' mentioned in the man page also evaluate the variable.

Some of the modifiers split the string into words and then operate on the words; others operate on the string as a whole. When a string is split into words, it is split the way sh(1) would split it, that is, honoring quotes and backslashes.

There is one exception to this rule: the .for loop does not follow the shell quoting rules but splits at sequences of whitespace.
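The difference between the two splitting rules can be illustrated outside of make(1). The following sh(1) sketch is only an analogy, not a description of how make(1) is implemented: the shell's own parser (via eval) stands in for the shell-like splitting done by the modifiers, and an unquoted expansion stands in for the whitespace splitting of the .for loop. The variable names are made up for this illustration.

```shell
# A value containing one quoted element with embedded whitespace.
value='one "two three" four'

# Shell-style splitting: let the shell re-parse the string, honoring quotes.
# (eval is acceptable here only because the value is a trusted literal.)
eval "set -- $value"
shell_words=$#

# Whitespace splitting: an unquoted expansion splits at whitespace only;
# quote characters are treated as ordinary data. set -f disables globbing.
set -f
set -- $value
plain_words=$#
set +f

echo "shell-style words: $shell_words"
echo "whitespace words:  $plain_words"
```

The quoted element counts as one word under shell-style splitting, but is torn into two words (each keeping one of the quote characters) when splitting at whitespace.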

There are several types of variables that should be handled differently: strings and two types of lists.

  • Strings can contain arbitrary characters. Nevertheless, you should restrict yourself to only using printable characters. Examples are PREFIX and COMMENT.

  • Internal lists are lists that are never exported to any shell command. Their elements are separated by whitespace. Therefore, the elements themselves cannot have embedded whitespace. Any other characters are allowed. Internal lists can be used in .for loops. Examples are DEPENDS and BUILD_DEPENDS.

  • External lists are lists that may be exported to a shell command. Their elements can contain any characters, including whitespace. That's why they cannot be used in .for loops. Examples are DISTFILES and MASTER_SITES.

9.1.1. Naming conventions

  • All variable names starting with an underscore are reserved for use by the pkgsrc infrastructure. They shall not be used by package Makefiles.

  • In .for loops you should use lowercase variable names for the iteration variables.

  • All list variables should have a ``plural'' name, e.g. PKG_OPTIONS or DISTFILES.

9.2. Code snippets

This section presents some code snippets that you should use in your own code. If you do not find anything appropriate here, test your own code and then add it here.

9.2.1. Adding things to a list

    STRING=                 foo * bar `date`
    INT_LIST=               # empty
    ANOTHER_INT_LIST=       apache-[0-9]*:../../www/apache
    EXT_LIST=               # empty
    ANOTHER_EXT_LIST=       a=b c=d

    INT_LIST+=              ${STRING}               # 1
    INT_LIST+=              ${ANOTHER_INT_LIST}     # 2
    EXT_LIST+=              ${STRING:Q}             # 3
    EXT_LIST+=              ${ANOTHER_EXT_LIST}     # 4

When you add a string to an external list (example 3), it must be quoted. In all other cases, you must not add a quoting level. You must not merge internal and external lists, unless you are sure that all entries are correctly interpreted in both lists.
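To see why example 3 needs a quoting level, it can help to look at what the shell does with the finished list. The following sh(1) sketch mimics the idea; shquote is a hypothetical helper that only approximates what the :Q modifier does (it uses single quotes, whereas make(1) may choose a different but equivalent form), and all variable names are assumptions made for this illustration.

```shell
# shquote (hypothetical helper): wrap the argument in single quotes and
# escape embedded single quotes, so the shell reads it back as one word.
# Limitation: trailing newlines in the argument are lost.
shquote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

string='foo * bar `date`'
ext_list=''

# Adding a string: one quoting level is needed (compare example 3).
ext_list="$ext_list $(shquote "$string")"

# Adding elements that are already quoted: no further quoting (example 4).
ext_list="$ext_list a=b c=d"

# Let the shell parse the list, as it would on a command line.
eval "set -- $ext_list"
count=$#
first=$1

echo "number of words: $count"
printf 'first word: %s\n' "$first"
```

The string survives as a single word, with the asterisk and the backquotes intact, while the elements that were already list entries remain separate words.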

9.2.2. Converting an internal list into an external list

    EXT_LIST=       # empty
    .for i in ${INT_LIST}
    EXT_LIST+=      ${i:Q}""
    .endfor

This code converts the internal list INT_LIST into the external list EXT_LIST. As the elements of an internal list are unquoted, they must be quoted here. The reason for appending "" is explained in Section 9.2.4, "Quoting guideline".
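The same conversion can be sketched in plain sh(1) to show what the quoting buys. This is an analogy under stated assumptions: shquote is a hypothetical stand-in for the :Q modifier, the list values are illustrative, and the shell's whitespace splitting plays the role of the .for loop.

```shell
# shquote (hypothetical helper): single-quote one word for the shell.
shquote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# An "internal list": elements are separated by whitespace and unquoted.
# The values are examples only.
int_list='apache-[0-9]*:../../www/apache perl>=5.0:../../lang/perl5'

# Convert it element by element, adding one quoting level to each.
ext_list=''
set -f                      # keep glob characters literal while splitting
for i in $int_list; do
    ext_list="$ext_list $(shquote "$i")"
done
set +f

# The result can now be parsed by the shell without side effects.
eval "set -- $ext_list"
count=$#
second=$2

echo "number of words: $count"
printf 'second word: %s\n' "$second"
```

Without the quoting step, the ``>'' in the second element would be taken as a redirection and the glob pattern in the first element could be expanded by the shell. Because this shquote always emits a pair of quotes, the sketch has no counterpart to the appended "" in the make(1) version.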

9.2.3. Passing variables to a shell command

    STRING=         foo bar <    > * `date` $$HOME ' "
    EXT_LIST=       string=${STRING:Q} x=second\ item

    all:
            echo ${STRING}                  # 1
            echo "${STRING}"                # 2
            echo "${STRING:Q}"              # 3
            echo ${STRING:Q}                # 4
            echo x${STRING:Q} | sed 1s,.,,  # 5
            env ${EXT_LIST} /bin/sh -c 'echo "$$string"; echo "$$x"'

Example 1 leads to a syntax error in the shell, as the characters are just copied.

Example 2 leads to a syntax error too, and if you leave out the last " character from ${STRING}, date(1) will be executed. The $HOME shell variable would be evaluated, too.

Example 3 outputs each space character preceded by a backslash (or not), depending on the implementation of the echo(1) command.

Example 4 correctly handles every string that does not start with a dash. If the string does start with a dash, the result depends on the implementation of the echo(1) command. As long as you can guarantee that your input does not start with a dash, this form is appropriate.

Example 5 handles even the case of a leading dash correctly.
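The trick used in example 5 can be tried directly in sh(1). The sketch below uses a made-up variable named value; the quoted shell variable takes the place of ${STRING:Q}.

```shell
value='-n'

# Pattern from example 4: the outcome depends on the echo(1) implementation,
# which may treat -n as an option instead of printing it.
naive=$(echo "$value")

# Pattern from example 5: prefix the value with x so that echo(1) never
# sees a leading dash, then remove the first character of the first line.
safe=$(echo x"$value" | sed '1s,.,,')

printf 'naive: [%s]\n' "$naive"
printf 'safe:  [%s]\n' "$safe"
```

Only the second form is guaranteed to reproduce the value; the first one may print nothing at all.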

The EXT_LIST does not need to be quoted because the quoting has already been done when adding elements to the list.
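As a rough illustration of this point, the following sh(1) sketch pastes an already-quoted list into a command line, with eval playing the role of make(1)'s textual substitution. The list contents and variable names are assumptions chosen for the example.

```shell
# An already-quoted external list (illustrative values). Each element is
# written the way it should appear on a shell command line.
ext_list='string="foo  bar *" x=second\ item'

# make(1) pastes ${EXT_LIST} textually into the command line; eval plays
# that role here. No extra quotes are added around $ext_list.
out=$(eval "env $ext_list /bin/sh -c 'echo \"\$string\"; echo \"\$x\"'")

printf '%s\n' "$out"
```

Each element arrives in the environment of the inner shell as one variable, with its embedded whitespace preserved. Adding another quoting level around the whole list would instead turn it into a single word.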

As internal lists shall not be passed to the shell, there is no example for them.

9.2.4. Quoting guideline

There are many possible sources of wrongly quoted variables. This section lists some of the commonly known ones.

  • Whenever you use the value of a list, think about what happens to leading or trailing whitespace. If the list is a well-formed shell expression, you can apply the :M* modifier to strip leading and trailing whitespace from it. The :M operator first splits its argument according to the rules of the shell, and then creates a new list consisting of all words that match the shell glob expression *, that is, all of them. One class of situations where this is needed is when adding a variable like CPPFLAGS to CONFIGURE_ARGS. If the configure script invokes other configure scripts, it strips the leading and trailing whitespace from the variable and then passes it to the other configure scripts. But these configure scripts expect the (child) CPPFLAGS variable to be the same as the parent CPPFLAGS. That is why it is better to pass the CPPFLAGS value properly trimmed. Here is how to do it:

        CPPFLAGS=               # empty
        CPPFLAGS+=              -Wundef -DPREFIX=\"${PREFIX:Q}\"
        CPPFLAGS+=              ${MY_CPPFLAGS}
    
        CONFIGURE_ARGS+=        CPPFLAGS=${CPPFLAGS:M*:Q}
    
        all:
                echo x${CPPFLAGS:Q}x            # leading and trailing whitespace
                echo x${CONFIGURE_ARGS}x        # properly trimmed
    
  • The example above contains one bug: The ${PREFIX} is a properly quoted shell expression, but there is the C compiler after it, which also expects a properly quoted string (this time in C syntax). The version above is therefore only correct if ${PREFIX} does not have embedded backslashes or double quotes. If you want to allow these, you have to add another layer of quoting to each variable that is used as a C string literal. You cannot use the :Q operator for it, as this operator only works for the shell.

  • Whenever a variable can be empty, the :Q operator can have surprising results. Here are two completely different cases which can be solved with the same trick.

        EMPTY=                  # empty
        empty_test:
                for i in a ${EMPTY:Q} c; do \
                        echo "$$i"; \
                done
    
        for_test:
        .for i in a:\ a:\test.txt
                echo ${i:Q}
                echo "foo"
        .endfor
    

    The first example will only print two of the three lines we might have expected. This is because ${EMPTY:Q} expands to the empty string, which the shell cannot see. The workaround is to write ${EMPTY:Q}"". The faulty pattern can often be found as ${TEST} -z ${VAR:Q} or as ${TEST} -f ${FNAME:Q} (both of these are wrong).

    The second example will only print three lines instead of four. The first line looks like a:\ echo foo. This is because the backslash of the value a:\ is interpreted as a line-continuation by make(1), which makes the second line the arguments of the echo(1) command from the first line. To avoid this, write ${i:Q}"".
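The first of these two cases can be reproduced in plain sh(1). In the sketch below, eval stands in for make(1) pasting the expansion into the command line, empty_q holds what ${EMPTY:Q} expands to for an empty variable, and count_words is a hypothetical helper; none of these names come from pkgsrc.

```shell
# count_words (hypothetical helper): report how many arguments it received.
count_words() { echo "$#"; }

empty_q=''      # what ${EMPTY:Q} expands to when EMPTY is empty

# Without the guard, the empty expansion vanishes from the command line.
without_guard=$(eval "count_words a $empty_q c")

# With "" appended, the shell sees an explicit empty word.
with_guard=$(eval "count_words a $empty_q\"\" c")

# The same effect makes the unguarded test(1) patterns wrong: with the
# operand missing, test sees the single argument -f, a non-empty string.
if eval "test -f $empty_q"; then unguarded=true; else unguarded=false; fi
if eval "test -f $empty_q\"\""; then guarded=true; else guarded=false; fi

echo "words without guard: $without_guard"
echo "words with guard:    $with_guard"
echo "test -f without guard succeeds: $unguarded"
echo "test -f with guard succeeds:    $guarded"
```

With the guard in place, the empty value is passed as a real (empty) argument, so the loop sees three words and test(1) correctly reports that no such file exists.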

9.2.5. Workaround for a bug in BSD Make

The pkgsrc bmake program does not handle the following assignment correctly. If _othervar_ contains a ``-'' character, one of the closing braces is included in ${VAR} after this code executes.

    VAR:=   ${VAR:N${_othervar_:C/-//}}

For a more complex code snippet and a workaround, see the package regress/make-quoting, testcase bug1.