December 2024
I'm a strong advocate of unit tests, I can confidently say that it has saved me from introducing regressions countless times. Today I want to share one of the hidden gems of OCaml and their testing story with dune, cram tests.
Cram tests, in essence, are snapshots of interactive shell sessions. They provide a versatile framework for testing OCaml projects, but not only OCaml, they can be used for any CLI.
I’m surprised why this pattern is not more common outside of (the small) world of OCaml, since they come from Python (bitheap.org/cram) and there are implementations in Go (https://github.com/mgeisler/cram) and other languages. Probably because it didn’t reach to JavaScript world (??) and they are a bit old.
Regardless, let’s dive in
Those test files contain both the execution of the CLI invocations followed by the expected output either from stdout or stderr. The test runner, in this case dune, will run those invocations and diff with the expected output.
Let’s check the syntax with an example:
Here there is just a coment about this test, since there is no space at the start
it's treated as a comment.
$ echo "This command runs as part of the test" > data.txt
We want to see the content of "data.txt"
$ cat data.txt
This command runs as part of the test
The command `cat data.txt` gets diffed against the next line ` This command runs as part of the test
` from this file, and that's the first beauty of cram tests.
Here there is just a coment about this test, since there is no space at the start
it's treated as a comment.
$ echo "This command runs as part of the test" > data.txt
We want to see the content of "data.txt"
$ cat data.txt
This command runs as part of the test
The command `cat data.txt` gets diffed against the next line ` This command runs as part of the test
` from this file, and that's the first beauty of cram tests.
To define the syntax as rules:
>
contains the expected output (either stdout or stderr)
This is a screenshot of my editor, showcasing the syntax highlighting and a real .t
file
dune supports cram tests out of the box and integrates nicely with all your dune test workflow.
In previous versions of dune, you need to enable them manually. For dune < 3 add (cram enabled)
in dune-project (might need to create a dune-project file at the root of your project if you don't have one already). If you are using dune > 3, nothing to do, they are already enabled
If you want to experience the flow of it before using it in a real-world scenario, open your dune project and follow it as a tutorial.
Let’s start with the simplest possible test.
simple.t
with: $ echo 'lola'
$ echo 'lola'
dune runtest
dune build @simple —watch
(all cram tests are automatically dune aliases)Output of the execution will be:
File "tests/simple.t", line 1, characters 0-0:
diff --git a/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t b/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t.corrected
index de6f8c28..ef3231ef 100644
--- a/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t
+++ b/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t.corrected
@@ -1 +1,2 @@
$ echo 'lola'
+ lola
File "tests/simple.t", line 1, characters 0-0:
diff --git a/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t b/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t.corrected
index de6f8c28..ef3231ef 100644
--- a/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t
+++ b/_build/.sandbox/e72f0d9cb592f23848ccd62f4a630862/tests/simple.t.corrected
@@ -1 +1,2 @@
$ echo 'lola'
+ lola
Running dune build @simple —auto-promote
which updates simple.t
file for you.
To ensure everything works, try running the tests again, which in this case should pass. Run dune build @simple
and the output should be empty. Empty in dune, means very good
Directories with .t
with a run.t
file inside, are also valid cram tests.
Keep all your tests as .t
files can get out of hand pretty quickly, since some test suites could contain hundreds of tests, and some tests might contain fixtures or any config files
You might write a perfect test, only to be greeted with [ERROR] Program "your_binary" not found!
. This happens because dune doesn't automatically know which executables should be accessible during cram test execution.
There are four ways to expose your executable to cram tests:
public_name
in your executable stanza like (public_name amazing-binary)
, which makes the executable available to the test suite as amazing-binary.exe
, and when users install your package, the executable will be under my-opam-package.amazing-binary
.(install (section bin) (files amazing-binary.exe))
. Do not confuse dune installation with opam installation, the dune installation offers a way to "copying freshly built artifacts from the workspace to the system", while opam installs opam packages in your opam switch (downloading from opam repositories, building from sources, etc).(env _ (binaries ../bin/amazing-binary.exe))
to set an executable to be available as a binary: amazing-binary
will be available to the test suite, and can choose to change which environment will have it. This is useful if you don't want to rely on the exact location of the executable, and don't want to expose the executable to the user. (Thanks to sim642 for the tip).(deps %{exe:../bin/amazing-binary.exe})
. More info in about the cram
stanza in the dune documentation.If you're not sure which to choose, I recommend starting with setting a public_name
in your executable stanza. It's the most explicit approach and if you are working on a CLI, you will still need a public_name
to expose it on the opam package.
Writing effective cram tests requires more than just copying command output, let's see some key principles that will help you write robust and maintainable snapshot tests
Your tests should be hermetic - meaning they're self-contained and don't depend on external state. Each test should clean up after itself and running it multiple times should produce the same result. For example:
$ mkdir -p tmp
$ cd tmp
$ echo "hello" > test.txt
$ cat test.txt
hello
$ cd ..
$ rm -rf tmp
$ mkdir -p tmp
$ cd tmp
$ echo "hello" > test.txt
$ cat test.txt
hello
$ cd ..
$ rm -rf tmp
One common pitfall is hardcoding absolute paths in your tests. This makes tests brittle and non-portable. Dune provides several environment variables to help write path-independent tests:
# DON'T do this
$ cat /home/user/project/test.txt
# DO this instead
$ cat $DUNE_ROOT/test.txt
# DON'T do this
$ cat /home/user/project/test.txt
# DO this instead
$ cat $DUNE_ROOT/test.txt
Some useful environment variables include:
$DUNE_ROOT
: Points to your project root$INSIDE_DUNE
: Set when running inside dune's test suite$PWD
: Current working directory$DUNE_BUILD_DIR
: Path to dune's build directoryHere's a more complete example showing these variables in action:
$ echo $DUNE_ROOT | grep -o '[^/]*$'
my-project
# Create a file relative to project root
$ cat > $DUNE_ROOT/input.ml <<EOF
> let greeting = "Hello, World!"
> let () = print_endline greeting
> EOF
# Test the compiled output
$ dune exec ./bin/main.exe
Hello, World!
$ echo $DUNE_ROOT | grep -o '[^/]*$'
my-project
# Create a file relative to project root
$ cat > $DUNE_ROOT/input.ml <<EOF
> let greeting = "Hello, World!"
> let () = print_endline greeting
> EOF
# Test the compiled output
$ dune exec ./bin/main.exe
Hello, World!
This approach ensures your tests will run consistently across different environments and developer machines. It's especially important if you plan to run these tests in CI or in other developer local envs.
A good cram test should explain its purpose and any non-obvious setup, making it easier for others (or yourself in 6 months) to understand what's being tested and why. The comments should focus on the intent behind the test, not just describe the commands being run.
The true testament to cram tests' effectiveness is their adoption by prominent projects in the OCaml ecosystem.
Dune itself, uses cram tests extensively throughout its codebase - with over 800 tests covering everything from basic builds to complex workspace configurations. They also used cram tests for bug reporting. When users encounter issues, they're encouraged to submit a minimal reproduction case as a cram test. This practice has been so successful that many bug reports in their issue tracker include a .t
file demonstrating the problem.
Melange is another great other example of it, where ensures the JavaScript compilation to remain consistent with blackbox tests https://github.com/melange-re/melange/tree/main/test/blackbox-tests.
As a useful reference, there’s some of my work migrating to cram tests in here:
TLDR: cram tests are nice, actually.
Thanks for reaching the end. Let me know if you have any feedback, corrections or questions. Always happy to chat about any topic mentioned in this post, feel free to reach out.