Initialize repository

author: Rutger Broekhoff 2025-07-07 21:52:08 +0200
committer: Rutger Broekhoff 2025-07-07 21:52:08 +0200
commit: ba61dfd69504ec6263a9dee9931d93adeb6f3142 (patch)
tree: d6c9b78e50eeab24e0c1c09ab45909a6ae3fd5db /README.md
download: verified-dyn-lang-interp-ba61dfd69504ec6263a9dee9931d93adeb6f3142.tar.gz
verified-dyn-lang-interp-ba61dfd69504ec6263a9dee9931d93adeb6f3142.zip
1 files changed, 383 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ad7b1b7
--- /dev/null
+++ b/README.md
@@ -0,0 +1,383 @@
+Artifact for "Verified Interpreters for Dynamic Languages with Applications to the Nix Expression Language"
+===========================================================================================================
+This is the accompanying artifact for the following paper:
+> Rutger Broekhoff and Robbert Krebbers. 2025. Verified Interpreters for Dynamic Languages with Applications to the Nix Expression Language. _Proc. ACM Program. Lang._ 9, ICFP, Article 268 (August 2025), 29 pages. https://doi.org/10.1145/3747537
+This artifact primarily consists of two components:
+1. A Rocq formalization of the languages presented in the paper (LambdaLang; § 2, DynLang; § 3, EvalLang; § 3.4, NixLang; § 4).
+   This includes operational semantics and properties, an interpreter and correctness proof, plus some tests and examples.
+2. An OCaml front-end for NixLang (§ 5), which elaborates Nix source files into NixLang.
+   These are then evaluated using the NixLang interpreter, which is derived from the Rocq sources using Rocq's program extraction functionality.
+> N.B.: If you are using the VM image that we provide for evaluating the artifact, then all necessary dependencies will already be installed using `opam`.
+> You can directly proceed with using the [Dune project and Makefile](#dune-project-and-makefile) or [using the CLI](#usage-of-the-cli).
+For the purposes of evaluating the artifact, we suggest the following steps:
+1. [Building](#building) (this must certainly be done first).
+2. Comparing that the mechanized semantics and interpreters of the defined languages match those presented in the paper.
+   The mechanization of all languages follows a [common structure](#structure-of-the-mechanization), but we also have a [table](#index-of-definitions-and-theories) that points you directly to the relevant definitions.
+3. Comparing that the mechanized theorems correspond to those presented in the paper.
+   We have a [table](#index-of-definitions-and-theories) that lists all theorems in the paper and their corresponding version in the formalization.
+   The paper mentions some properties/proofs in passing, we list the corresponding mechanized versions for these as well (but it may be less relevant to verify these, as they are primarily used as stepping stones for the major theorems presented in the paper).
+4. Exercising the Nix test suite on our interpreter and verifying that this gives the same results as presented in the paper (§ 5).
+5. Trying out some Nix programs in our REPL.
+   The paper (§ 1, § 4.1, § 5) contains some examples that might be interesting to try out.
+You may also be interested in the [axioms used](#use-of-axioms) by the mechanization, or in reproducing the interpreter coverage number (91.77%, § 5, p. 22) and Table 1 (§ 5 p. 22).
+How to do this depends on the build method that you are using, see the followings section.
+> N.B.: We have only verified this artifact to function on Linux (Arch Linux, Debian) and macOS. It does not work on Windows.
+## Building
+As mentioned at the start, this project consists of two parts: the Rocq mechanization and the front-end of the Nix interpreter.
+The front-end of the Nix interpreter, which uses the makes use of the NixLang interpreter (written in Rocq, extracted to OCaml), is built using [Dune](https://dune.build/).
+The Dune project also takes care of checking all of the parts of the Rocq mechanization, and extracting the NixLang interpreter to OCaml.
+However, for convenience, we have a separate (classic) `Makefile` which checks only the Rocq sources (this may be more familiar to some, and will explicitly print the files being checked).
+(Do note that Dune will refuse to build if the `.vo` files resulting from the checking process are not cleaned up; this can be done using `make clean`.)
+*Both parts require dependencies.*
+We have two main ways of managing the dependencies of the project: either using Nix or `opam`.
+If you are familiar with Nix and have it available to you, it will likely be most easy to use.
+If you are just interested in using Nix to build the project (so not directly interacting with the Dune project / Makefile), look [here](#building-with-nix).
+If you are instead interested in interacting with `dune`, Makefile and/or other scripts in this repository, you should probably use the Nix devshell instead.
+For that, look [here](#managing-dependencies-nix-devshell).
+With the devshell set up, you can then proceed with using the [Dune project and Makefile](#dune-project-and-makefile).
+If you cannot or do not want to use Nix, there is always still `opam`.
+Look [here](#managing-dependencies-opam) to see how to install project dependencies with `opam`.
+With the dependencies installed using `opam`, you can then proceed with using the [Dune project and Makefile](#dune-project-and-makefile).
+### Building with Nix
+> N.B.: Only do this if you are solely interested in building/testing and/or using the artifact. If you are interested in being able to manually use the `Makefile` and Dune project, you should use a Nix devshell instead; look [here](#nix-devshell) for instructions.
+Run `nix-build` (or equivalently `nix-build ./default.nix`) to build and test the artifact. This will take a few minutes.
+The resulting CLI should then be available as `./result/bin/mininix`.
+See how to use the CLI [below](#usage-of-the-cli).
+For some more details on the test suite, look [here](#use-of-the-nix-language-tests).
+Run `nix-build ./axioms.nix` to generate the list of axioms used by the formalization.
+This is equivalent to running `make validate`.
+The output of `coqchk` should be printed directly, but will also be stored at `./result/coqchk-output`.
+See [below](#use-of-axioms) for more information on the axioms used.
+In the paper, we report a 91.77% coverage for the interpreter code extracted from Rocq.
+To generate the coverage report, run `nix-build ./coverage.nix`.
+The coverage report should then be available as `./result/coverage/report-plain` (a text file, look for `lib/extraction/interp.ml` for the interpreter coverage).
+A HTML version is also generated, the report for the interpreter should then be available under `./result/coverage/html/lib/extraction/interp.ml.html`.
+To generate the line counts for the Rocq development (for comparison with Table 1 in the article), run `nix-build ./cloc.nix`.
+The resulting report should be available as a text file `./result/formalization-loc-report`.
+### Managing dependencies: Nix devshell
+The Nix devshell gives you a shell where all required dependencies are installed for you, so you do not have to fiddle with `opam`, but can still use the Dune project and Makefile that we provide.
+There are a few ways that you can use the devshell.
+Enter the devshell by running `nix-shell`.
+You can then proceed with using the [Dune project and Makefile](#dune-project-and-makefile) *inside of the Nix devshell*.
+### Managing dependencies: opam
+First, ensure that your `opam` repositories are up-to-date:
+```sh
+opam update
+```
+For maximum flexibility and reproducibility, we recommend running the following
+command to create a new opam switch in which the pinned dependencies are
+installed (this includes the required Rocq, Flocq, Rocq-std++ versions etc.):
+```sh
+# Leave out the --locked flag if not on Linux!
+opam switch create ./ --repos default,rocq-released=https://rocq-prover.org/opam/released --deps-only --locked
+```
+It may be necessary to activate the newly set up opam switch as follows (if
+you are not sure and opam instructs you to run this after creating the switch:
+_do_ run it):
+```sh
+eval $(opam env)
+```
+The [Dune project `Makefile`](#dune-project-and-makefile) can then be used as usual.
+### Dune project and `Makefile`
+#### The Makefile (just the formalization)
+If you are solely interested in the Rocq sources, you may make use of the `Makefile` to check and build Rocq sources and dependencies, provided that you have the appropriate [Rocq](https://rocq-prover.org/), [Flocq](https://flocq.gitlabpages.inria.fr/) and [Rocq-std++](https://gitlab.mpi-sws.org/iris/stdpp) versions available (8.20.1, 4.2.0/4.2.1 and 1.11.0 resp.).
+When using a Nix devshell or an `opam` switch with the required dependencies installed (instructions above), this should work automatically.
+Run `make` to check all files in the formalization.
+Run `make validate` to print the axioms used.
+See [below](#use-of-axioms) for more information on the axioms used.
+#### The Dune project (formalization + front-end)
+> N.B.: Dune will produce errors if `.vo` files generated by the `Makefile` are present.
+> Remove these by running `make clean`.
+> The Dune project will then build.
+This step assumes that you have dependencies installed and available, either using `opam` or a Nix devshell.
+See the relevant sections above.
+To check all proofs and build the CLI, use `dune build -p mininix`.
+To then run the CLI, use `./_build/default/bin/main.exe` (see how to use it [below](#usage-of-the-cli)).
+To execute the test suite, use `dune test` (this may take a few minutes).
+For some more details on the test suite, look [here](#use-of-the-nix-language-tests).
+To generate the coverage report, run `./coverage.sh` (this will take a few minutes).
+The report will then be printed directly (look for `lib/extraction/interp.ml`).
+A detailed HTML report should also be available under `_coverage/html/lib/extraction/interp.ml.html`.
+To generate the line counts for the Rocq development (for comparison with Table 1 in the article), make sure that `cloc` and `jq` are installed (if not using the provided Nix shell).
+Then run `./cloc-rocq.sh` to print the report.
+## Usage of the CLI
+Depending on how you built the CLI, it will be available to you as `./_build/default/bin/main.exe` (Dune) or `./result/bin/mininix` (Nix).
+In these examples, replace `<cli>` with the actual path of the CLI binary.
+- As a Nix REPL: `<cli> repl`.
+  Various meta-commands are available: `:quit`, `:run`, `:set` and `:settings`.
+  Autocomplete for these meta-commands is available; it is possible to cycle through suggestions using <kbd>TAB</kbd>.
+  Input can be canceled using <kbd>Ctrl+C</kbd> and the REPL can be quit using `:quit` or <kbd>Ctrl+D</kbd>.
+  The REPL evaluates in deep mode by default.
+  This can be changed by using `:set eval_strategy shallow`.
+  The REPL does not handle newlines in user input (a newline is processed as a request to process input).
+  The expected format of user input is the Nix language (although, as mentioned in the paper, e.g. derivations are not supported).
+  The paper (§ 1, § 4.1, § 5) contains some examples that might be interesting to try out.
+- For evaluating single files: `<cli> eval FILENAME`.
+  This command evaluates in shallow mode by default.
+  You can change to deep mode by using the flag `-strict`.
+  An import tree definition file can be passed using the `-importsdef` flag.
+  See [below](#import-tree-definitions).
+## The Rocq mechanization
+See how to check/build the mechanization [here](#build).
+This section describes the different parts of the mechanization, and how it relates to the different parts of the article.
+### Use of axioms
+See how to list the axioms used [above](#build) (look for the dependency management / build method that you are using).
+(In case that you are using the Makefile (so dependencies are managed using `opam` / the Nix devshell): run `make validate`. If you are using Nix but not the devshell: use `nix-build ./axioms.nix` (the axioms used will also be written to `result/coqchk-output`).)
+Use of four axioms will be reported, namely:
+```
+Coq.Logic.FunctionalExtensionality.functional_extensionality_dep
+Coq.Reals.ClassicalDedekindReals.sig_not_dec
+Coq.Reals.ClassicalDedekindReals.sig_forall_dec
+Coq.Logic.Classical_Prop.classic
+```
+*These are not axioms that we directly make use of.*
+Instead, they are marked as used because we import [Flocq](https://flocq.gitlabpages.inria.fr/) (so NixLang can support IEEE 754 floating point numbers, see `theories/nix/floats.v`), which imports the classical `Reals` module from the standard library.
+Running `coqchk` on the `Reals` module from the standard library gives the same list of axioms as shown above.
+### Structure of the mechanization
+There are two files that are shared by all four languages:
+- `theories/res.v`: contains the `res` monad. See also Fig 2. on p. 6.
+- `theories/utils.v`: contains some generic lemmas that are useful for all languages.
+  A decent amount of these are about finite maps, which we use heavily.
+There is a general structure for all four languages that we formalize:
+- `operational.v`: the definition of the operational semantics.
+  + Contains an inductive type `expr` for expressions.
+  + Contains a `step` relation that describes the small-step op. sem.
+  + Contains a `subst` function that gives parallel substitution (as used by `step`).
+- `operational_props.v`: properties of the operational semantics.
+  + Contains a lemma `step_det`, proving that the small-step op. sem. is deterministic.
+- `interp.v`: the definition of the interpreter.
+  + Contains the definition of values `val`, thunks `thunk` and environments `env`.
+  + Contains an `interp` function that takes some expression, environment and amount of fuel, and returns a result (timeout, fail or some value).
+        (In NixLang, we end up not using `interp` directly, but instead use `interp'` which wraps `interp` and allows specifying whether to evaluate in shallow/deep mode, cf. `⟦e⟧^{δ,E}_μ` (§ 4.4, p. 19).)
+- `interp_proofs.v`: soundness and completeness of interpreter w.r.t. operational semantics, in three main theorems:
+  + `interp_sound_complete_ret_string` (or `interp_sound_complete_ret_lit` for NixLang) proving the soundness and completeness for programs that reduce to strings (or literals in general in NixLang).
+  + `interp_sound_complete_fail` proving the soundness and completeness for programs that fault.
+  + `interp_sound_complete_no_fuel` proving the soundness and completeness for programs that loop.
+  For LambdaLang, an extra condition here is that the programs that we are considering must be closed.
+  
+  There is also always a generalized version of `interp_sound_complete_ret_string/lit`, namely `interp_sound_complete_ret`, which states that the interpreter is sound and complete w.r.t. the operational semantics for (in case of LambdaLang closed) programs that reduce to a value.
+Specific to the different languages are the following files:
+- In `theories/dynlang` (for DynLang, § 3):
+  + `equiv.v`, the equivalence of LambdaLang and DynLang for closed LambdaLang terms.
+- In `theories/evallang` (for EvalLang, § 3.4):
+  + `tests.v`, some tests of the EvalLang expression parser and interpreter.
+- In `theories/nix` (for NixLang, § 4):
+  + `floats.v`: our Flocq instantiation with some helper functions.
+  + `notations.v`: some notations to make writing NixLang programs in Rocq easier.
+  + `tests.v`: some example NixLang programs to test the functionality of the interpreter with.
+  + `wp.v`: the definition of our proof-of-concept weakest precondition-based program logic, derived rules (see § 5).
+  + `wp_examples.v`: examples of use of our WP-based program logic (see § 5).
+### Index of definitions and theories
+**Relevant definitions.**
+| In paper                                           | File and line number                  | Name                           |
+|----------------------------------------------------|---------------------------------------|--------------------------------|
+| Shared result monad (p. 6)                         | `theories/res.v:4`                    | `res`                          |
+| LambdaLang syntax (p. 6)                           | `theories/lambda/operational.v:6`     | `expr`                         |
+| LambdaLang operational semantics (p. 6)            | `theories/lambda/operational.v:21`    | `step`                         |
+| LambdaLang final expressions (p. 6)                | `theories/lambda/operational.v:28`    | `final`                        |
+| LambdaLang parallel substitution (p. 6)            | `theories/lambda/operational.v:12`    | `subst`                        |
+| LambdaLang interpreter (p. 6)                      | `theories/lambda/interp.v:34`         | `interp`                       |
+| LambdaLang interpreter data structures (p. 6)      | `theories/lambda/interp.v:7-14`       | `thunk`, `env`, `val`          |
+| DynLang syntax (p. 9)                              | `theories/dynlang/operational.v:6`    | `expr`                         |
+| DynLang operational semantics (p. 9)               | `theories/dynlang/operational.v:21`   | `step`                         |
+| DynLang final expressions (p. 9)                   | `theories/dynlang/operational.v:31`   | `final`                        |
+| DynLang parallel substitution (p. 9)               | `theories/dynlang/operational.v:12`   | `subst`                        |
+| DynLang interpreter (p. 9)                         | `theories/dynlang/interp.v:39`        | `interp`                       |
+| DynLang interpreter data structures (p. 9)         | `theories/dynlang/interp.v:7-13`      | `thunk`, `env`, `val`          |
+| EvalLang syntax (p. 11)                            | `theories/evallang/operational.v:7`   | `expr`                         |
+| EvalLang expression parser (p. 11)                 | `theories/evallang/operational.v:103` | `parse`                        |
+| EvalLang operational semantics (p. 11)             | `theories/evallang/operational.v:119` | `step`                         |
+| EvalLang final expressions (p. 11)                 | `theories/evallang/operational.v:130` | `final`                        |
+| EvalLang interpreter (p. 11)                       | `theories/evallang/interp.v:42`       | `interp`                       |
+| EvalLang interpreter data structures (p. 11)       | `theories/evallang/interp.v:7-13`     | `thunk`, `env`, `val`          |
+| NixLang syntax (p. 14)                             | `theories/nix/operational.v:67`       | `expr`                         |
+| NixLang operational semantics (p. 14)              | `theories/nix/operational.v:444`      | `step`                         |
+| NixLang evaluation contexts (p. 14)                | `theories/nix/operational.v:418`      | `ctx1`                         |
+| NixLang final expressions (p. 14)                  | `theories/nix/operational.v:139`      | `final`                        |
+| NixLang binary operator semantics (p. 15)          | `theories/nix/operational.v:297`      | `sem_bin_op`                   |
+| NixLang argument matching (p. 15)                  | `theories/nix/operational.v:398`      | `matches`                      |
+| NixLang substitution (p. 15)                       | `theories/nix/operational.v:112`      | `subst`                        |
+| NixLang interpreter (p. 18)                        | `theories/nix/interp.v:329`           | `interp`                       |
+| NixLang interpreter data structures (p. 18)        | `theories/nix/interp.v:5-21`          | `val`, `thunk`, `tattr`, `env` |
+| NixLang interpreter variant with mode arg. (p. 19) | `theories/nix/interp.v:348`           | `interp'`                      |
+**Corresponding theorems.**
+| Theorem in paper                | File and line number                  | Name                               |
+|---------------------------------|---------------------------------------|------------------------------------|
+| Theorem 2.1, Item 1 (p. 7)      | `theories/lambda/interp_proofs.v:575` | `interp_sound_complete_ret_string` |
+| Theorem 2.1, Item 2 (p. 7)      | `theories/lambda/interp_proofs.v:585` | `interp_sound_complete_fail`       |
+| Theorem 2.1, Item 3 (p. 7)      | `theories/lambda/interp_proofs.v:594` | `interp_sound_complete_no_fuel`    |
+| Lemma 2.2 (p. 8)                | `theories/lambda/interp_proofs.v:516` | `interp_sound_open`                |
+| Lemma 2.3 (p. 8)                | `theories/lambda/interp_proofs.v:402` | `interp_step`                      |
+| Lemma 2.4 (p. 8)                | `theories/lambda/interp_proofs.v:297` | `interp_proper`                    |
+| Theorem 3.1, Item 1 (p. 11)     | `theories/dynlang/equiv.v:120`        | `interp_equiv_ret_string`          |
+| Theorem 3.1, Item 2 (p. 11)     | `theories/dynlang/equiv.v:131`        | `interp_equiv_fail`                |
+| Theorem 3.1, Item 3 (p. 11)     | `theories/dynlang/equiv.v:142`        | `interp_equiv_no_fuel`             |
+| Theorem 4.1, Item 1 (p. 20) (*) | `theories/nix/interp_proofs.v:2655`   | `interp_sound_complete_ret_lit`    |
+| Theorem 4.1, Item 2 (p. 20)     | `theories/nix/interp_proofs.v:2665`   | `interp_sound_complete_fail`       |
+| Theorem 4.1, Item 3 (p. 20)     | `theories/nix/interp_proofs.v:2673`   | `interp_sound_complete_no_fuel`    |
+| Lemma 4.2 (p. 20)               | `theories/nix/interp_proofs.v:2619`   | `interp_sound_open'`               |
+| Lemma 4.3 (p. 20)               | `theories/nix/interp_proofs.v:2029`   | `interp_step'`                     |
+(*): The theorem in the formalization is stronger than the theorem presented in the paper. The latter is a trivial specialization of the former.
+**Other claims in the paper.**
+- § 2, LambdaLang:
+  + p. 7: Theorem 2.1, Item 1 generalizes to any final value.
+        See `theories/lambda/interp_proofs.v:563`, `interp_sound_complete_ret`.
+        This also holds for DynLang, EvalLang and NixLang.
+        These languages have a theory `interp_sound_complete_ret` in their respective `interp_proofs.v` files as well.
+  + p. 7: LambdaLang has a deterministic operational semantics. See `theories/lambda/operational_props.v:19`, `step_det`.
+        This also holds for DynLang, EvalLang and NixLang.
+        These languages have a theory `step_det` in their respective `operational_props.v` files as well. 
+- § 3, DynLang:
+  + p. 10: A variant of the main theorem of DynLang (Theorem 2.1) also holds for DynLang.
+        The primary difference here is that the closedness conditions are not needed for DynLang.
+        For a variant of Theorem 2.1, Item 1, see `theories/dynlang/interp_proofs.v:391`, `interp_sound_complete_ret_string`.
+        For a variant of Theorem 2.1, Item 2, see `theories/dynlang/interp_proofs.v:400`, `interp_sound_complete_fail`.
+        For a variant of Theorem 2.1, Item 3, see `theories/dynlang/interp_proofs.v:408`, `interp_sound_complete_no_fuel`.
+        A variant of the generalized version of Theorem 2.1, Item 1 also holds for DynLang.
+        See `theories/dynlang/interp_proofs.v:381`, `interp_sound_complete_ret`.
+- § 4, NixLang:
+  + p. 19/20: Theorem 4.1, Item 1 generalizes to any final value.
+  + p. 20: Lemma 4.2 follows from mutual induction on four properties.
+        For these four properties and their corresponding proof, see `theories/nix/interp_proofs.v:2267-2282` (`interp_sound_open`, `interp_thunk_sound`, `interp_app_sound`, `force_deep_sound`).
+  + p. 20: Lemma 4.3 follows from mutual induction on two properties.
+        For these two properties and their corresponding proof, see `theories/nix/interp_proofs.v:1519`, `interp_step`.
+- § 5, Evaluation (Program logic, p. 22):
+  + We can derive structural rules for WP (the application rule is presented).
+        See `theories/nix/wp.v:55`, `App_wp` for the presented application rule.
+        In general, `theories/nix/wp.v` contains all derived rules for WP.
+  + We prove the total correctness of three variants of the recursive program from § 4.1:
+        * See `theories/nix/wp_examples.v:84`, `even_rec_attr_wp'` for
+          ```nix
+          rec { f = x: if x = 0 then true else !(f (x - 1)); }.f n
+          ```
+        * See `theories/nix/wp_examples.v:132`, `even_rec_functor_wp'` for
+          ```nix
+          { "__functor " = r: x: if x == 0 then true else !(r (x - 1)); } n
+          ```
+        * See `theories/nix/wp_examples.v:157`, `even_rec_default_wp'` for
+          ```nix
+          ({ f ? (x: if x == 0 then true else !(f (x - 1))) }: f) {} n
+          ```
+  + We prove the total correctness of the following program for any non-recursive attribute set `e`:
+        ```nix
+        let x = 1; in with e; with { y = 2; }; x == y
+        ```
+        See `theories/nix/wp_examples.v:11`, `test_wp`.
+## Import tree definitions
+We do not claim to have (proper) support the `import` feature of Nix, but we do support it in a bare-bones fashion in order to be able to run the test suite.
+In the test suite, there is a file `lib.nix`, which a few test cases load by using `with import ./lib.nix; ...` (or something similar).
+To support this, we use a so-called import tree definition file, which pre-declares all files that may be imported.
+In the test suite, for example, we have created a `test/testdata/importdef.sexp` file, which looks like this:
+```
+(deps ./lib.nix)
+```
+This means that the file `lib.nix` should be available to be imported for all files in the `test/testdata` folder.
+Such `importdef.sexp` files are discovered automatically by walking up the directory tree from the path of the file to be evaluated (or in case of the REPL, from the current working directory).
+The general syntax (`<forest>`) is as follows: `(deps <tree>)`, where `<tree>` is either `<filename>` or `(<filename> <forest>)`
+In general, the primitive imports mechanism works as follows:
+- All path expressions (_e.g.,_ `./lib.nix`) in the Nix source file are converted to strings representing their absolute path.
+- We load all dependencies and evaluate them to NixLang values.
+- We put all of these values in a NixLang attribute set (`VAttr`, see `theories/nix/interp.v:14`), where we associate the NixLang value of each imported file with the original filename (as an absolute path).
+- We then create a NixLang function `import <filename>` (using `VClo`, see `theories/nix/interp.v:9`) that takes a string and returns the value for that file (if indeed forward-declared in the import tree definition).
+- This function is then inserted into the global environment of the file that we want to evaluate.
+This process is performed recursively, hence recursive imports are also supported.
+## Use of the Nix language tests
+When exercising the official Nix language tests on our extracted interpreter, we do not exactly consider all tests in the Nix test suite.
+Since we are only interested in tests for the interpreter, and not so much in tests for the parser, we only consider the `eval-*` test files in the `test/testdata` folder.
+There are two types of tests that we have here: tests where evaluation should succeed, and tests where evaluation should fail.
+This is simply marked in the filename; Nix files that should fail to evaluate are named `eval-fail-*.nix` and files that should evaluate successfully are named `eval-okay-*.nix`.
+For the former, there are expected error outputs that the interpreter should produce, but we are not interested in reproducing these (nor would this be realistic with our setup).
+For the latter, there are also expected output expressions, which can be found in matching `eval-okay-*.exp`.
+However, there are two `eval-okay-*.nix` files for which no such matching `eval-okay-*.exp` file exists:
+- `eval-okay-tail-call-1.nix`: a file `eval-okay-tail-call-1.exp-disabled` exists; we hence do not consider this test.
+- `eval-okay-xml.nix`: a file `eval-okay-xml.exp.xml` exists, but we are not interested in converting the resulting term to XML to validate the result of this test.
+The total amount of `eval-fail-*.nix` files and pairs of `eval-okay-*.nix` and `eval-okay-*.exp` files informs our total count of interpreter tests that we consider to begin with, namely 182 (§ 5, p. 21).
+For these tests again, we explicitly list 74 tests that should be ignored in `./test/test_mininix.ml`, since they cover functionality of Nix that is out of scope for our paper.
+We end up with 108 tests that we exercise our interpreter on.
+Our interpreter passes 103 of these tests, and times out for five tests due to its call-by-name nature.
+Some more details can be found in the paper, in § 5.
+## General project structure
+- The `theories` folder contains the Rocq formalization. See the [structure of the mechanization](#structure-of-the-mechanization).
+- The `bin` folder contains the entrypoint for the command-line interface.
+- The `lib` folder contains the Nix parser, elaborator and interpreter extraction
+  + The `nix` subfolder contains an adapted version of the parser and pretty-printer from nixformat (licensed under ISC) by Denis Korzunov.
+    See https://github.com/d2km/nixformat.
+    Some improvements were made to the parser and pretty-printer.
+    The file `elaborator.ml`, not part of nixformat, in this folder is concerned with desugaring attribute paths among other things; you may consider this a 'normalizing' stage, which allows the elaborator to our core language to be more concise.
+  + The `mininix` subfolder is the most important here. It contains, among
+    other files:
+    - `nix2mininix.ml`, elaboration of Nix into our core language
+    - `builtins.nix`, our implementation of builtins in Nix, in our core Nix language, where we liberally use 'core' builtins (available as binary operations) that are specific to the core language
+    - `conv.ml`, conversion between numeric types in OCaml and Rocq/Flocq
+    - `import.ml`, very bare-bones support for imports, used in the test suite
+  + The `extraction` folder contains a file that extracts the core language interpreter and some auxiliary functions to OCaml
+- The `explorer` folder contains some utilities for converting the Rocq Mechanization to a static site.
author	Rutger Broekhoff	2025-07-07 21:52:08 +0200
committer	Rutger Broekhoff	2025-07-07 21:52:08 +0200
commit	ba61dfd69504ec6263a9dee9931d93adeb6f3142 (patch)
tree	d6c9b78e50eeab24e0c1c09ab45909a6ae3fd5db /README.md
download	verified-dyn-lang-interp-ba61dfd69504ec6263a9dee9931d93adeb6f3142.tar.gz verified-dyn-lang-interp-ba61dfd69504ec6263a9dee9931d93adeb6f3142.zip