summaryrefslogtreecommitdiffstats
path: root/games/mame/README_ccache.txt
diff options
context:
space:
mode:
Diffstat (limited to 'games/mame/README_ccache.txt')
-rw-r--r--games/mame/README_ccache.txt130
1 files changed, 130 insertions, 0 deletions
diff --git a/games/mame/README_ccache.txt b/games/mame/README_ccache.txt
new file mode 100644
index 0000000000..e58f94b279
--- /dev/null
+++ b/games/mame/README_ccache.txt
@@ -0,0 +1,130 @@
+Notes about building modern mame with ccache...
+
+TL;DR version: I had to use clang and clang++ (not gcc and g++) to get
+ccache to work with mame, and add timestamps to the GroovyMAME patch
+to make it work with ccache.
+
+mame uses a precompiled header (PCH), called emu.h.gch. It gets
+created by precompiling emu.h, which is just a list of #include
+directives that include every header in mame's emulation core.
+
+This speeds up the build approximately 50% (cuts the build time in
+half)... you can test this by passing PRECOMPILE=0 to mame's make
+command. On my test box, it takes approximately 1h45m to build with
+gcc using the precompiled headers, and about twice as long (3h30m)
+without. So far so good.
+
+If you build with ccache and PCH (aka without PRECOMPILE=0), either
+with "ccache g++" or via symlinks, the 2nd build takes just as long as
+the first. ccache wasn't able to save any time... why not?
+
+To get ccache to even *try* to work with precompiled headers, you
+have to 'export CCACHE_SLOPPINESS=time_macros'. However, it doesn't
+actually fix the problem if you're using gcc.
+
+One "fix" would be to disable use of emu.cgh (PRECOMPILE=0). But
+that makes the first build take 2x as long (though later builds would
+indeed benefit from ccache). But can we get the best of both worlds?
+
+When gcc compiles a header into a .gch file, it "bakes in" the address
+the kernel gave it when it allocated memory... Since Slackware's
+kernels (and pretty much all other distros) have ASLR (address space
+layout randomization) enabled, this means that given identical input,
+gcc will generate a slightly different .gch file as output. You
+can easily test this yourself:
+
+ echo 'extern int foo(int bar);' > a.h
+ g++ -o a.gch a.h
+ md5sum a.gch
+
+Now repeat the last 2 commands:
+ g++ -o a.gch a.h
+ md5sum a.gch
+
+The md5sum will be different... it's possible to disable ASLR on a
+running x86_64 kernel (echo 0 > /proc/sys/kernel/randomize_va_space),
+which will cause gcc to generate an identical .gch every time...
+but last I checked, this doesn't work on 32-bit x86. Plus, ASLR is
+there for a reason: it's a security mechanism.
+
+ccache dutifully caches the .gch file, indexed by its hash... but
+on the next attempt to build mame with ccache, the .gch file it
+generates (early in the build) will have different contents, thus
+ccache computes a different hash, and correctly finds no cache entry
+for that hash. So, every C++ source file that uses emu.gch (which is
+all of them) will be a cache miss and get recompiled. Whoops.
+
+So what to do? Well, it turns out that clang++ doesn't have this
+problem: given identical input, it produces the same .gch every time,
+regardless of whether ASLR is enabled or not. Note: "identical input"
+includes not only the contents of emu.h and the headers it includes,
+but the timestamps on those headers as well!
+
+Unfortunately, on my test box at least, clang/clang++ takes 10%
+longer to build mame than gcc/g++. I've seen various blog/forum posts
+claiming that clang builds mame faster than gcc, but these were all
+from a few years ago. To me, this 10% penalty is worth it: whenever
+there's a new mame release, I have to test any changes I made to the
+SlackBuild, which involves running it repeatedly. The first run will
+take about 2 hours, but the 2nd and later ones will take <10 minutes.
+
+There's a bit more to the story than just using clang++: I
+noticed that with the GroovyMAME patch applied, ccache would still
+fail... this turns out to be because, even with CCACHE_SLOPPINESS set
+to include_file_mtime, ccache still rejects the cached emu.gch. The
+is because emu.gch is made from emu.h, which contains a bunch of
+"#include <whatever>"... and the timestamps of those included files
+get baked into the .gch file that clang++ generates.
+
+Try another test:
+
+ echo '#include "b.h"' > a.h
+ echo 'extern int foo(int bar);' > b.h
+ clang++ -o a.gch a.h
+ md5sum a.gch
+
+Now repeat the last 2 commands:
+
+ clang++ -o a.gch a.h
+ md5sum a.gch
+
+Same md5sum, right? Now:
+
+ sleep 2
+ touch b.h
+ clang++ -o a.gch a.h
+ md5sum a.gch
+
+Different md5sum. clang++ produced a different a.gch because b.h's
+timestamp changed. Notice we aren't using ccache for the test code:
+this is a feature (or misfeature?) of clang++ itself.
+
+The reason this was a problem with mame + the groovy patch is that the
+groovy patch as sent to us by github.com's REST API doesn't include
+timestamps (normally they appear on the diff lines that begin with
+'--- filename' and '+++ filename'). This isn't github's fault: the
+regular 'git diff' command doesn't put timestamps in its diffs, and
+the API is just running that for us, remotely.
+
+So on every run of the SlackBuild, ccache runs clang++ to generate
+emu.gch, and the emu.gch contents include the timestamps of the
+patched includes... which are set to the time the patch happened to be
+applied! This means that any file that includes emu.gch (pretty much
+all of them!) would be a cache miss and have to be recompiled. So on
+every run, ccache would refuse to use *any* cached result from the
+previous run!
+
+CCACHE_SLOPPINESS=include_file_mtime seems like it should fix the
+problem, but it doesn't. It would have worked if we weren't using
+PCH...
+
+The solution was to add timestamps to the context headers in the .diff
+(which is done in mkgroovy.sh) and use "patch -f -Z" to force patch to
+apply the timestamps.
+
+If the diff had been made by the regular diff command, it would have
+had timestamps already, and ccache would have Just Worked.
+
+If mame didn't use precompiled headers, the lack of timestamps
+wouldn't have caused a problem (or if they had, CCACHE_SLOPPINESS
+could have worked around it).