How GNU Make's patsubst Function Really Works
$(patsubst) is a GNU Make built-in function for text processing, such
as transforming file names. Although the idea behind it is very
simple, the peculiar way it is implemented confuses novice Make users.
The function never returns an error or signals a warning, and it uses
its own wildcard mechanism that bears no resemblance to the usual glob
or regexp patterns.
For example, why doesn’t this transformation work?
$(patsubst src/%.js, build/%.js, ./src/foo.js)
We expect ./src/foo.js to be converted to build/foo.js, but patsubst
leaves the file name untouched.
Extract method
Before we begin, we need a quick way of inspecting the results of
patsubst evaluations. GNU Make doesn’t have a REPL. There are
primitive hacks around it, like ims:
$ rlwrap -S 'ims> ' ./ims
ims> . $(words will cost ten thousand lives this day)
7
Such hacks let you play with Make functions interactively, but they
won’t help you examine Make’s internals: there is no way to view the
source code of a particular function the way you can in irb with the
method_source gem, for example.
I’ve extracted the patsubst function from Make 4.2.90 into a separate
command, gmake-patsubst. After you compile it, just run it from the
shell as:
$ ./gmake-patsubst src/%.js build/%.js ./src/foo.js
./src/foo.js
providing exactly 3 arguments as you would do in makefiles, only using
the shell quoting/splitting rules instead of Make’s (i.e., use a space
as an argument separator instead of a comma).
(A side note about the extract: it’s ≈ 520 lines of imperative
code! This is what you get when you program in C.)
If you want to read the algorithm itself, start from
patsubst_expand_pat().
patsubst explained
Let’s first recap what patsubst does.
$(patsubst PATTERN, REPLACEMENT, TEXT)
Its main use is to transform a list of file names. It operates like
map() on an iterable in JavaScript:
TEXT
.split(/\s+/)
.map( (file) => magic_transform(PATTERN, REPLACEMENT, file) )
.join(' ')
It’s a pure function that returns a new result, leaving its arguments
untouched. It treats the file names supplied in TEXT as plain strings;
it doesn’t do any IO.
The first thing to remember is that it splits TEXT into chunks before
doing any further work. All transformation is done by applying PATTERN
to each chunk individually.
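To make the map() analogy concrete, here is a minimal runnable sketch of just the split/map/join shape. The name mapChunks is hypothetical, and the per-chunk transform is deliberately left as a parameter; this is an illustration, not a transcription of Make’s C code.

```javascript
// Hypothetical helper: applies `transformChunk` to every whitespace-separated
// chunk of `text` and joins the results with a single space.
function mapChunks(text, transformChunk) {
  return text
    .split(/\s+/)
    .filter((chunk) => chunk !== "") // drop empties caused by leading/trailing blanks
    .map(transformChunk)
    .join(" ");
}

const input = "  foo.jsx   bar.jsx ";
console.log(mapChunks(input, (chunk) => chunk.toUpperCase())); // FOO.JSX BAR.JSX
console.log(JSON.stringify(input)); // the argument itself is left untouched
```

Note that the transform never sees more than one chunk at a time, which is why a pattern cannot match across spaces.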
For example, we have a list of .jsx files that we want to transform
into a list of .js files. You may think that the simplest way of doing
it with patsubst would look like this:
$ ./gmake-patsubst .jsx .js "foo.jsx bar.jsx"
foo.jsx bar.jsx
Well, that didn’t work!
The problem is that in this case patsubst checks whether each chunk
matches PATTERN exactly, as a full word, byte for byte. In regex terms
this would look like ^\.jsx$. To prove this, we modify our pattern to
be exactly foo.jsx:
$ ./gmake-patsubst foo.jsx .js "foo.jsx bar.jsx"
.js bar.jsx
This works as described, but isn’t much help in real makefiles.
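The exact-match behaviour can be sketched in a few lines of JavaScript. The function name substituteExact is hypothetical; the sketch only covers the case of a PATTERN without a wildcard and assumes a non-empty REPLACEMENT.

```javascript
// Sketch of the no-wildcard case: a chunk is replaced only when it equals
// the pattern exactly (the regex analogy ^\.jsx$ from the text).
function substituteExact(pattern, replacement, text) {
  return text
    .split(/\s+/)
    .filter((chunk) => chunk !== "")
    .map((chunk) => (chunk === pattern ? replacement : chunk))
    .join(" ");
}

console.log(substituteExact(".jsx", ".js", "foo.jsx bar.jsx"));    // foo.jsx bar.jsx
console.log(substituteExact("foo.jsx", ".js", "foo.jsx bar.jsx")); // .js bar.jsx
```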
Thus patsubst has wildcard support. Its wildcard is the character %,
similar to the % in Make pattern rules, where it matches any non-empty
string (the GNU Make manual describes the patsubst wildcard as
matching any number of any characters within a word). For example, %
in the %.jsx pattern matches foo in the text foo.jsx. The substring
that % matches (foo in the example) is called a stem.
There can be only one wildcard % in a pattern. If you have several of
them, only the first one is the wildcard; all the others are treated
as regular characters.
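A small sketch of the “only the first % counts” rule follows. The helper name splitAtFirstPercent is hypothetical, and the sketch ignores backslash-escaped \% sequences, which GNU Make also supports.

```javascript
// Splits a pattern at its first "%". Returns null when there is no wildcard.
// Any later "%" stays in the suffix as an ordinary character.
function splitAtFirstPercent(pattern) {
  const at = pattern.indexOf("%");
  if (at === -1) return null;
  return { prefix: pattern.slice(0, at), suffix: pattern.slice(at + 1) };
}

console.log(JSON.stringify(splitAtFirstPercent("%.jsx")));      // {"prefix":"","suffix":".jsx"}
console.log(JSON.stringify(splitAtFirstPercent("src/%.%.js"))); // {"prefix":"src/","suffix":".%.js"}
console.log(JSON.stringify(splitAtFirstPercent("foo.jsx")));    // null
```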
To return to our example with .jsx files, using % in both the PATTERN
& REPLACEMENT arguments yields the desired result:
$ ./gmake-patsubst %.jsx %.js "foo.jsx bar.jsx"
foo.js bar.js
When REPLACEMENT contains a % character, it is replaced by the stem
that matched the % in PATTERN.
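The stem substitution on the REPLACEMENT side can be sketched separately. The function name expandReplacement is hypothetical; as with PATTERN, the GNU Make manual states that only the first % in the replacement is treated specially.

```javascript
// Expands a replacement for a given stem: only the first "%" is substituted,
// and a replacement without "%" is used verbatim.
function expandReplacement(replacement, stem) {
  const at = replacement.indexOf("%");
  if (at === -1) return replacement;
  return replacement.slice(0, at) + stem + replacement.slice(at + 1);
}

console.log(expandReplacement("%.js", "foo"));       // foo.js
console.log(expandReplacement("build/%.js", "baz")); // build/baz.js
console.log(expandReplacement("%-%.js", "foo"));     // foo-%.js
```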
Using the character % only in PATTERN is rarely useful, unless you
want to replicate Make’s $(filter-out) function:
$ ./gmake-patsubst %.jsx "" "foo.jsx bar.js"
bar.js
This is equivalent to
$(filter-out %.jsx, foo.jsx bar.js)
If there is no % in PATTERN but there is a % in REPLACEMENT, patsubst
falls back to the simple, exact substitution that we saw before:
$ ./gmake-patsubst foo.jsx % "foo.jsx bar.jsx"
% bar.jsx
Now, let’s return to the first example from the introduction:
$(patsubst src/%.js, build/%.js, ./src/foo.js)
Why didn’t it work out?
Putting together all we’ve learned so far, here is the high-level
algorithm of what patsubst does:
1. It searches for the % in PATTERN & REPLACEMENT. If found, it splits
each of them at the %. Let’s call the parts before the %
pattern-prefix (src/) & replacement-prefix (build/). The parts after
the % are .js & (again) .js respectively. Let’s call those parts
pattern-suffix & replacement-suffix.
2. It splits TEXT into chunks. In our case there is nothing to split,
for we have only 1 file name (a string w/o spaces): ./src/foo.js.
3. If there is no % in PATTERN, it does a simple substitution for
each chunk & returns the result.
4. If there is a % in PATTERN, it (for each chunk):
4.1. (a) Makes sure that the chunk starts with pattern-prefix. In
JavaScript it would look like:
CHUNK.startsWith(PATTERN_PREFIX)
It’s false in our example, for ./src/foo.js starts with ./sr, not
with src/.
(b) Makes sure that the chunk ends with pattern-suffix. In
JavaScript it would look like:
CHUNK.endsWith(PATTERN_SUFFIX)
It’s true in our example, for .js == .js.
4.2. If either check in subitem #4.1 is false (our case!), it just
returns an unmodified copy of the original chunk.
4.3. Iff both (a) & (b) in subitem #4.1 are true, it cuts
pattern-prefix & pattern-suffix off the chunk, leaving the stem.
4.4. Concatenates replacement-prefix + stem + replacement-suffix.
5. Joins all the chunks (modified or unmodified) with a space &
returns the result.
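The steps above can be sketched as a single JavaScript function. This is a minimal illustration under stated assumptions, not Make’s implementation: the names cutAtPercent and patsubstSketch are hypothetical, escaped \% sequences are ignored, and two details are added that the list does not spell out (a length check so that prefix and suffix cannot overlap inside a chunk, and dropping empty results so that an empty REPLACEMENT behaves like the filter-out example).

```javascript
// Splits `s` at its first "%"; `found` is false when there is no wildcard.
function cutAtPercent(s) {
  const at = s.indexOf("%");
  return at === -1
    ? { found: false, prefix: s, suffix: "" }
    : { found: true, prefix: s.slice(0, at), suffix: s.slice(at + 1) };
}

function patsubstSketch(pattern, replacement, text) {
  const pat = cutAtPercent(pattern);     // step 1
  const rep = cutAtPercent(replacement);
  const chunks = text.split(/\s+/).filter((c) => c !== ""); // step 2

  const out = chunks.map((chunk) => {
    if (!pat.found) {
      // step 3: no wildcard, so only an exact match is substituted
      return chunk === pattern ? replacement : chunk;
    }
    // step 4.1: prefix and suffix checks; the length check (an assumption of
    // this sketch) keeps the two from overlapping inside a short chunk
    const matches =
      chunk.length >= pat.prefix.length + pat.suffix.length &&
      chunk.startsWith(pat.prefix) &&
      chunk.endsWith(pat.suffix);
    if (!matches) return chunk;          // step 4.2
    // step 4.3: what is left between prefix and suffix is the stem
    const stem = chunk.slice(pat.prefix.length, chunk.length - pat.suffix.length);
    // step 4.4: a replacement without "%" is used verbatim
    return rep.found ? rep.prefix + stem + rep.suffix : replacement;
  });

  // step 5: empty results are dropped so no stray spaces are produced
  return out.filter((c) => c !== "").join(" ");
}

console.log(patsubstSketch("src/%.js", "build/%.js", "./src/foo.js")); // ./src/foo.js
console.log(patsubstSketch("%.jsx", "%.js", "foo.jsx bar.jsx"));       // foo.js bar.js
console.log(patsubstSketch("%.jsx", "", "foo.jsx bar.js"));            // bar.js
```

Changing the pattern to ./src/%.js makes the prefix check pass for the first call, which is all the original example was missing.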
As you can see, the algorithm is simple enough, but it is probably not
quite what you imagined after reading the Make documentation.
In conclusion, hopefully now you can explain the result of the
patsubst evaluation below (why only src/baz.js was transformed):
$ ./gmake-patsubst src/%.js build/%.js "./src/foo.js src/bar.jsx src/baz.js"
./src/foo.js src/bar.jsx build/baz.js
The Node.js version of patsubst can be found here. Note that it’s a
simple example & must not be treated as a reference implementation.
-
(For non-English speakers like yours truly) The noun stem
means several things: 1) (in linguistics) a form of a word after
all affixes are removed; 2) (in botany) a slender structure that
supports a plant.
Tags: ойті
Authors: ag