Generate OCaml parsers based on tree-sitter grammars, for semgrep.
Related ocaml-tree-sitter repositories:
- ocaml-tree-sitter-core: provides the code generator that takes a tree-sitter grammar and produces an OCaml library from it.
- ocaml-tree-sitter-languages: community repository that has scripts for building and publishing OCaml libraries for parsing a variety of programming languages.
- ocaml-tree-sitter-semgrep: this repo; same as ocaml-tree-sitter-languages but extends each language with constructs specific to semgrep patterns.
- Make sure you have at least 6 GiB of free memory. More will be needed for some of the grammars.
- Install the following tools:
  - git
  - GNU make
  - pkg-config: manages the installation of tree-sitter's runtime library
  - Node.js: JavaScript interpreter used to translate a grammar to JSON
  - cargo: Rust package manager and build tool, used to build tree-sitter
  - opam: OCaml package manager
- Run `opam init`, then `opam switch create 4.12.0` to install a recent version of OCaml.
- Install OCaml dev tools for your favorite editor: typically `opam install merlin` plus some plugin for your editor.
- Install pre-commit with `pip3 install pre-commit` and run `pre-commit install` to set up the pre-commit hook. This will re-indent code in a consistent fashion each time you call `git commit`.
- Check out the extra instructions for MacOS.
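
Before running the build, it can help to confirm that the tools listed above are on your `PATH`. The following is a minimal sketch, not part of this repository: the helper name `missing_tools` is hypothetical, and the command names are assumptions (for example, Node.js may be installed as `nodejs` rather than `node` on some systems).

```shell
#!/bin/sh
# Hypothetical helper: print the name of each given command
# that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names below are assumptions based on the tool list above.
missing=$(missing_tools git make pkg-config node cargo opam)
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All listed tools were found."
fi
```

The script only checks that each command exists; it does not verify versions or the amount of free memory.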
See the Makefile for the available targets. Get started with:
make update
make setup
Then build and install the OCaml code generator (core):
make && make install
Say you want to build and test support for kotlin, you would run this:
$ cd lang
$ ./test-lang kotlin
For details, see How to upgrade the grammar for a language.
See How to add support for a new language.
We have limited documentation which is mostly targeted at early contributors. It's growing organically based on demand, so don't hesitate to file an issue explaining what you're trying to do.
ocaml-tree-sitter is free software with contributors from multiple organizations. The project is driven by r2c.
- OCaml code developed specifically for this project is distributed under the terms of the GNU GPL v3.
- The OCaml bindings to tree-sitter's C API were created by Bryan Phelps as part of the reason-tree-sitter project.
- The tree-sitter grammars for major programming languages are external projects. Each comes with its own license.