blogworktalksabout

Learning OCaml in 2023

Blog

If you’re new to OCaml you may find the ecosystem confusing. There are a few reasons for this:

  • OCaml has been born in academia and historically been focused in type-theory and functional programming
  • The community has a fame of being small
  • It was created long time ago and it carries some of the old-fashion methods from the past
  • It's a functional programming language, which it might be different paradigm than the one you're used to
  • Really well suited to create a programming language but got into more general purpose programming

Why should you listen to me?

I’m not a teacher, I don’t have 10+ years of experience in OCaml, I don’t have a PhD and more terribly I didn’t finish my Computer Science degree.

But often it’s very useful to learn from someone who is one step above you, and not someone how’s 10 steps above you.

I worked professionally in OCaml and ReasonML (a dialect of OCaml) for almost 3 years (and counting) and have been creating and maintaining some useful projects meanwhile. Some of them are:

  • styled-ppx: Typed styled components for ReScript
  • query-json: Faster, simpler and more portable implementation of jq in Reason
  • server-reason-react: Server rendering Reason React components natively
  • jsoo-react: js_of_ocaml bindings for ReactJS. Based on ReasonReact
  • jsoo-css: CSS Typed functional interface in jsoo, bindings to inline styles and emotion
  • reason: Simple, fast & type safe code that leverages the JavaScript & OCaml ecosystems

How to look at the OCaml ecosystem

The OCaml ecosystem seems overwhelming because it’s always explained in abstract terms, minimum explanations and probably the wrong order.

Additionally, there are a few topics that are often mentioned as "more advanced". The topics below are interesting, but they're difficult to understand, are far less popular than the above topics and aren't required for most apps but they can became immensely helpful when some specific problem appears.


Learning the basics of the language

ocaml.org/docs/up-and-running

Similarly on the first-hour from the OCaml docs, here's a quick overview of the core of the language to get you up and running.

Types

OCaml is a statically typed functional programming language. Unless other languages, OCaml has inference which means it's not necessary to specify the type of every variable/argument/return value.

let fn x = x + 1
(* fn: int -> int *)
let fn x = x + 1
(* fn: int -> int *)

Note: fn is a function that takes an int and returns an int, any other type will make the compiler angry. The compiler will infer the type as int since the + operator is just for ints. To sum floats, the operator is +..

The most important built-in types are records, tuples and variants.

  • records are used to define a data structure with a fixed set of fields. They are used to hold values together and define a domain model.
  type person = { first_name : string; surname : string; age : int; }
  type person = { first_name : string; surname : string; age : int; }
  • tuples are a type with a fixed set of values, without specify their names. They're often used to assosiacte values without creating a more formal data structure (like a record).
  let t = (1, "one", '1') in
  (*      int * string * char *)
  let t = (1, "one", '1') in
  (*      int * string * char *)
  • variants are used to define a sum type (also known as ADTs: Algebraic Data Types). They can be used to represent a value that can be one of a few different kindš.
  type color =
    | Red
    | Green
    | Blue
    | Yellow
    | RGB of int * int * int
  type color =
    | Red
    | Green
    | Blue
    | Yellow
    | RGB of int * int * int

Pattern match

Pattern matching is the greatest feature of OCaml, it allows to match a value against a pattern and execute code on each branch. Works amazingly well with variants but also great with records, tuples and lists.

match color with
| Red -> "red"
| Green -> "green"
| Blue -> "blue"
| RGB (r, g, b) -> "rgb(" ^ string_of_int r ^ "," ^ string_of_int g ^ "," ^ string_of_int b ^ ")"
match color with
| Red -> "red"
| Green -> "green"
| Blue -> "blue"
| RGB (r, g, b) -> "rgb(" ^ string_of_int r ^ "," ^ string_of_int g ^ "," ^ string_of_int b ^ ")"

The compiler enforces that you pattern matching defines all possible cases. If I forgot a case, such as | Yellow the compiler will say:

Warning number 8
 
You forgot to handle a possible case here, for example:
**Yellow**
Warning number 8
 
You forgot to handle a possible case here, for example:
**Yellow**

Pattern match it will be one of your most (ab)used features of OCaml and can be combined with almost any feature of the language and became very advance topic. Here's a more extensive example: Mathematical expressions

Balance of styles

OCaml has the right balance between Functional language and imperative styles. Allows you to write code in both, it's often preferred to stick to functional and ocassionaly opt-out to imperative.

By default, all values are immutable. List.map will return a new list, instead of mutating the original one.

let data = [1, 2, 3] in
let data_plus_one = List.map (fun x => x + 1) data in
(* `data_plus_one` is a new list and `data` is still available in scope *)
let data = [1, 2, 3] in
let data_plus_one = List.map (fun x => x + 1) data in
(* `data_plus_one` is a new list and `data` is still available in scope *)

But it's possible to mutate values using ref and mutable keyword (inside records). Extense explanation here

Similarly on the above, you can debug values by printing to the stdout with print_endline or Printf. The Printf module comes with a lot of useful functions to format your output, and similarly to C with the printing notation: %s for strings, %d for integers, %f for floats, etc.

print_endline "Hello world";
(* Hello world *)
print_endline "Hello world";
(* Hello world *)

Modules

Modules are the way to organize your code in OCaml. Each file is an implicit module that uses the name of the file. You can create modules inside with module Whatever = { ... }.

Note: All modules need to start by an uppercase letter.

You will find modules often have a type t which representes the "main" type of the module. For example, the List module has a type t which is 'a list (a list of any type).

Modules can import other modules using include makes all functions/types from the included module available in the current module.

You can open modules as well, removing the need to prefix all values/functions/types by the module.

Learning opam

opam.ocaml.org/

opam is the OCaml's package manager. Read this post to feel familiar with the basics. It can create a "switch" (a set of packages for each project attached to a version of the compiler). Download packages and install

Learning dune

dune.readthedocs.io/en/stable/overview.html

Dune is the most common build system for OCaml projects. It's power comes from the fact that the user can define a build system in a declarative way and dune takes care of most low-level details of OCaml compilations, creation of libraries/executables.

I recommend creating a dummy project following this guide: Building a Hello World Program From Scratch and try to understand the dune file and how modules are organized.

Configure Editor

github.com/ocaml/ocaml-lsp

Merlin and ocaml-lsp-server (OCaml's Language Server Protocol) are the tools to enhance editors (like Visual Studio Code, Vim, or Emacs) by providing many useful features such as "jump to definition", "type on hover", "refactor symbol", "autocomplete", "expand switch statement", "create an interface file from an implementation", etc...

Setup your preferred IDE by following this.

Learning about standard library replacements

In addition to the OCaml standard library, there are several popular third-party libraries that are widely used in the OCaml community. Some of these libraries provide alternative implementations of certain features, while others offer additional functionality that is not available.

Even thought most of these might be handy, I recommend to start with the Standard library and explore what's missing and try to reach for one of these when the time comes. Knowing that they exist and they are used in some online materials is good enough to move forward.

Learning about ppx

ocaml.org/docs/metaprogramming

What are these [@deriving yojson], [@react.component], [%...], [@@ ...], let%lwt annotations? Those annotations are called ppx or preprocessing extension and they do a lot of stuff for you. They are used to generate code, either by extending the language or by generating boilerplate code.

This is often called meta-programmig and it's a very powerful tool and also very easy to abuse it. In the official documentation there's an extensive explanation of Preprocessors in OCaml.

Learning about Compilation modes: native, byte, js

One of the strengths of OCaml is that it can be compiled to run on a wide variety of platforms: native code, bytecode and JavaScript.

OCaml by itself comprises two compilers:

One generates bytecode which is then interpreted by a C program. This compiler runs quickly, generates compact code with moderate memory requirements, and is portable to essentially any 32 or 64 bit Unix platform. Performance of generated programs is quite good for a bytecoded implementation. This compiler can be used either as a standalone, batch-oriented compiler that produces standalone programs, or as an interactive, toplevel-based system.

The other compiler generates high-performance native code for a number of processors. Compilation takes longer and generates bigger code, but the generated programs deliver excellent performance, while retaining the moderate memory requirements of the bytecode compiler.

There are also two compilers that target JavaScript: js_of_ocaml and Melange.

js_of_ocaml tries to work closely with the OCaml ecosystem of libraries, allowing to compile to JavaScript any OCaml library.

Melange integrates closer with the JavaScript/npm ecosystems.

Learning Functors

dev.realworldocaml.org/functors.html

Functors are a way to parameterize modules over other modules. They are used in modules from the Standard Library such as Map and Set.

module Map_with_strings = Map.Make(String)
(*                                 ^^^^^^ `String` is a module
  `Map.Make` can have access to all the interface from String and
any module with the same interfaced can be passed:
 
  type t = 'a
  val compare : t -> t -> int
 
*)
 
module Custom_map = Map.Make(struct type t = int let compare a b = a - b end)
(*                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                             Here's the module without a name *)
module Map_with_strings = Map.Make(String)
(*                                 ^^^^^^ `String` is a module
  `Map.Make` can have access to all the interface from String and
any module with the same interfaced can be passed:
 
  type t = 'a
  val compare : t -> t -> int
 
*)
 
module Custom_map = Map.Make(struct type t = int let compare a b = a - b end)
(*                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                             Here's the module without a name *)

Here's a simple implementation of the Map functor:

module Map = struct
  (* The module type this functor recieves *)
  module type OrderedType = sig
    type t
    val compare : t -> t -> int
  end
 
  (* The interface from the module created *)
  module type S = sig
    type key
    type 'a t
 
    val empty : 'a t
    val add : key -> 'a -> 'a t -> 'a t
    val find : key -> 'a t -> 'a
  end
 
  (* The implementation of the functor itself *)
  module Make (Ord : OrderedType) : S with type key = Ord.t = struct
    type key = Ord.t
    type 'a t = (key * 'a) list
 
    let empty = []
    let add key value map = (key, value) :: map
    let rec find key map =
      match map with
      | [] -> raise Not_found
      | (current_key, value) :: rest -> if Ord.compare key current_key = 0 then value else find k rest
  end
end
module Map = struct
  (* The module type this functor recieves *)
  module type OrderedType = sig
    type t
    val compare : t -> t -> int
  end
 
  (* The interface from the module created *)
  module type S = sig
    type key
    type 'a t
 
    val empty : 'a t
    val add : key -> 'a -> 'a t -> 'a t
    val find : key -> 'a t -> 'a
  end
 
  (* The implementation of the functor itself *)
  module Make (Ord : OrderedType) : S with type key = Ord.t = struct
    type key = Ord.t
    type 'a t = (key * 'a) list
 
    let empty = []
    let add key value map = (key, value) :: map
    let rec find key map =
      match map with
      | [] -> raise Not_found
      | (current_key, value) :: rest -> if Ord.compare key current_key = 0 then value else find k rest
  end
end

An specific example of a Functor

Learning about memory management

OCaml provides a garbage collector so that you don't need to explicitly allocate and free memory as in C/C++. The OCaml garbage collector is a modern hybrid generational/incremental collector which outperforms hand-allocation in most cases.

Learning GADTs

ocaml.org/manual/gadts-tutorial.html

GADTs (Generalized Algebraic Datatypes) are a different kind of variants (ADTs). They enable a few use cases for the typechcker that are not possible with normal variants to express polymorphism (in the sense of a function being able to return different types depending on the input).

It's very sophisticated and it's not something you'll need to use often, but it's good to know, for example the Printf implementation or

The best resource I read to understand the basics though is blog.mads-hartmann.com/ocaml/2015/01/05/gadt-ocaml.html.


References


Based on https://github.com/petehunt/react-howto and https://github.com/petehunt/webpack-howto


Thanks for reaching the end. Let me know if you have any feedback, corrections or questions. Always happy to chat.