Erlang

Organisation: Copyright (C) 2021-2022 Olivier Boudeville
Contact: about (dash) howtos (at) esperide (dot) com
Creation date: Saturday, November 20, 2021
Last updated: Sunday, January 9, 2022

Overview

Erlang is a concurrent, functional programming language available as free software; see its official website for more details.

Erlang is dynamically typed, and is executed by the BEAM virtual machine (VM). This VM operates on bytecode and can perform Just-In-Time compilation. It also powers other related languages, such as Elixir and LFE.

Let's Start with some Shameless Advertisement for Erlang and the BEAM VM

Taken from this presentation:

Hint

What makes Elixir StackOverflow’s #4 most-loved language?

What makes Erlang and Elixir StackOverflow’s #3 and #4 best-paid languages?

How did WhatsApp scale to billions of users with just dozens of Erlang engineers?

What’s so special about Erlang that it powers CouchDB and RabbitMQ?

Why are multi-billion-dollar corporations like Bet365 and Klarna built on Erlang?

Why do PepsiCo, Cars.com, Change.org, Boston’s MBTA, and Discord all rely on Elixir?

Why was Elixir chosen to power a bank?

Why does Cisco ship 2 million Erlang devices each year? Why is Erlang used to control 90% of Internet traffic?

Installation

Erlang can be installed thanks to the various options listed in these guidelines.

Building Erlang from the sources of its latest stable version is certainly the best approach; for more control we prefer relying on our custom procedure.

For a development activity, we recommend also specifying the following options to our conf/install-erlang.sh script:

Run ./install-erlang.sh --help for more information.

Once installed, ensure that ~/Software/Erlang/Erlang-current-install/bin/ is in your PATH (e.g. by extending your ~/.bashrc accordingly), so that you can run erl (the Erlang interpreter) from any location, resulting in a prompt like:

$ erl
Erlang/OTP 24 [erts-12.1.5] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Eshell V12.1.5  (abort with ^G)
1>

Then enter CTRL-C twice in order to come back to the (UNIX) shell.

Congratulations, you have a functional Erlang now!

Ceylan's Language Use

Ceylan users shall note that most of our related developments (namely Myriad, WOOPER, Traces, LEEC, Seaplus, Mobile, US-Common, US-Web and US-Main) depart significantly from the general conventions observed by most Erlang applications:

Using the Shell

Although simply running erl is just as easy, we prefer, with Ceylan settings, running make shell in order to benefit from a well-initialized VM (notably with the full code path of the current layer and the ones below).

Refer then to the shell commands, notably for:

1> rr(code:lib_dir(xmerl) ++ "/include/xmerl.hrl").

See also the JCL mode (for Job Control Language) to connect and interact with other Erlang nodes.

About Security

More Advanced Topics

Metaprogramming

Metaprogramming is to be done in Erlang through parse transforms, which are user-defined modules that transform an AST (Abstract Syntax Tree, an Erlang term that represents actual code; see the Abstract Format for more details) into another AST, which is afterwards fed to the compiler.
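As a minimal sketch of this AST-to-AST idea (not an actual parse transform, which would export parse_transform/2 and receive the forms of a whole module), the hypothetical ast_sketch module below parses a single expression with the standard erl_scan and erl_parse modules, rewrites each binary + into a *, and evaluates the result with erl_eval:

```erlang
-module(ast_sketch).
-export([parse/1, swap_plus/1, eval/1]).

%% Turns the text of a single expression (ending with a dot) into its AST.
parse(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

%% Rewrites the AST: each binary '+' becomes a '*'. Nodes that are not
%% binary operators are returned unchanged (no deeper traversal is done).
swap_plus({op, Anno, '+', Left, Right}) ->
    {op, Anno, '*', swap_plus(Left), swap_plus(Right)};
swap_plus({op, Anno, Op, Left, Right}) ->
    {op, Anno, Op, swap_plus(Left), swap_plus(Right)};
swap_plus(Other) ->
    Other.

%% Evaluates an AST that has no free variables.
eval(Expr) ->
    {value, Value, _Bindings} = erl_eval:expr(Expr, erl_eval:new_bindings()),
    Value.
```

With these definitions, ast_sketch:eval(ast_sketch:parse("2 + 3.")) returns 5, whereas evaluating the rewritten tree, ast_sketch:eval(ast_sketch:swap_plus(ast_sketch:parse("2 + 3."))), returns 6.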

See also:

Improper Lists

A proper list is created from the empty one ([], also known as "nil") by prepending (with the | operator, a.k.a. "cons") elements in turn; for example [1,2] is actually [1 | [2 | []]].

However, instead of building a list from the empty one, one can start from any term other than [], for example my_atom. Then, instead of [2|[]], [2|my_atom] may be specified and will indeed be a list, albeit an improper one.

Many recursive functions expect proper lists, and will fail (typically with a function_clause error) if given an improper list to process (e.g. lists:flatten/1).
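The difference can be checked with a short sketch; the improper_sketch module and its two functions are hypothetical names introduced only for this illustration, and they rely solely on pattern matching and on the length/1 built-in (which raises badarg on an improper list):

```erlang
-module(improper_sketch).
-export([is_proper/1, safe_length/1]).

%% A list is proper if following its tails eventually reaches [].
%% Any other ending (or a non-list term) yields false.
is_proper([]) -> true;
is_proper([_Head | Tail]) -> is_proper(Tail);
is_proper(_Other) -> false.

%% length/1 raises badarg on an improper list; report this as an atom.
safe_length(List) ->
    try length(List)
    catch error:badarg -> improper
    end.
```

Here improper_sketch:is_proper([1,2]) returns true and improper_sketch:is_proper([2|my_atom]) returns false; likewise improper_sketch:safe_length([2|my_atom]) returns improper.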

So why not ban such a construct? Why do even standard modules like digraph rely on improper lists?

The reason is that improper lists are a way to reduce the memory footprint of some data structures, by storing a value of interest instead of the empty list.

Indeed, as explained in this post, a (proper) list of 2 elements will consume:

  • 1 list cell (2 words of memory) to store the first element and a pointer to the second cell
  • 1 list cell (2 more words) to store the second element and the empty list

for a total of 4 words of memory (i.e. 32 bytes on a 64-bit architecture).

As for an improper list of 2 elements, only 1 list cell (2 words of memory) will be consumed to store the first element and then the second one.

Such a solution is even more compact than a pair (a 2-element tuple), which consumes 2+1 = 3 words. Accessing the elements of an improper list is also faster (only one handle has to be inspected, whereas a tuple also requires inspecting a header).

Finally, for sizes expressed in bytes:

1> system_utils:get_size([2,my_atom]).
40

2> system_utils:get_size({2,my_atom}).
32

3> system_utils:get_size([2|my_atom]).
24
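The same comparison can be sketched without Myriad's system_utils, using erts_debug:flat_size/1, which reports a size in words. Note that erts_debug is a debugging-oriented module, so this is for experiments only; the size_sketch module is a hypothetical name, and the extra word added in bytes/1 is an assumption made to match the byte figures listed above:

```erlang
-module(size_sketch).
-export([words/1, bytes/1]).

%% Number of heap words taken by the term itself.
words(Term) ->
    erts_debug:flat_size(Term).

%% Same size in bytes, plus one word for the reference to the term
%% (an assumption, chosen to be consistent with the figures above).
bytes(Term) ->
    (words(Term) + 1) * erlang:system_info(wordsize).
```

size_sketch:words/1 returns 4 for [2,my_atom], 3 for {2,my_atom} and 2 for [2|my_atom], in line with the word counts discussed in this section.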

See also the 1, 2 pointers for more information.

Everyone shall decide whether relying on improper lists is a trick, a hack or a technique to prohibit.

Post-Mortem Investigations

Erlang programs may fail in two ways. Most often this results in a mere (Erlang-level) crash: the VM detects an error and reports information about it, possibly in the form of a crash dump. Much less frequently, it results in a more brutal, lower-level core dump: the VM crashes as a whole, like any faulty program run by the operating system; this typically happens when relying on faulty NIFs.

Erlang Crash Dumps

If experiencing "only" an Erlang-level crash, an erl_crash.dump file is produced in the directory from which the executable (generally erl) was launched. The best way to study it is to use the cdv tool (refer to the crashdump viewer), available, from the Erlang installation, as lib/erlang/cdv [3].

[3] Hence, according to the Ceylan-Myriad conventions, in ~/Software/Erlang/Erlang-current-install/lib/erlang/cdv.

Using this debug tool is as easy as:

$ cdv erl_crash.dump

Then, through the wx-based interface, a rather large amount of Erlang-level information is available (processes, ports, ETS tables, nodes, modules, memory, etc.) to better understand the context of this crash and hopefully diagnose its root cause.
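To obtain a crash dump to experiment with, one possibility is to halt a throwaway VM with a string slogan, since erlang:halt/1 then writes a crash dump before exiting. The sketch below wraps this call, together with crashdump_viewer:start/1 (the Erlang-side entry point of the viewer, part of the observer application); the crash_sketch module and its function names are hypothetical:

```erlang
-module(crash_sketch).
-export([crash/1, view/1]).

%% Halts the VM with a string slogan, which makes it write a crash dump.
%% Warning: this really terminates the current Erlang node.
crash(Slogan) when is_list(Slogan) ->
    erlang:halt(Slogan).

%% Opens a crash dump from a running Erlang shell; this needs the
%% observer and wx applications to be available.
view(DumpFile) ->
    crashdump_viewer:start(DumpFile).
```

Calling crash_sketch:crash("testing cdv") from a scratch shell should leave a dump file behind, which can then be opened either with cdv or with crash_sketch:view("erl_crash.dump") from another shell.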

Core Dumps

In the worst cases, the VM will crash like any other OS-level process, and generic (non Erlang-specific) tools will have to be used. Do not expect to be pointed to line numbers in Erlang source files anymore!

Refer to our general section dedicated to core dumps for that.

Language Bindings

The two main approaches in order to integrate third-party code to Erlang are to:

Language Implementation

Message-Passing: Copying vs Sharing

Knowing that, in functional languages such as Erlang, terms ("variables") are immutable, why could they not be shared between local processes when sent through messages, instead of being copied into the heap of each of them, as is actually the case with the Erlang VM?

The reason lies in the fact that, beyond the constness of these terms, their life-cycle has also to be managed. If they are copied, each process can very easily perform its (concurrent, autonomous) garbage collections. On the contrary, if terms were shared, then reference counting would be needed to deallocate them properly (neither too soon nor never at all), which, in a concurrent context, is bound to require locks.

So a trade-off between memory (due to data duplication) and processing (due to lock contention) has to be found; at least for most terms (except larger binaries), the sweet spot consists in sacrificing a bit of memory in favour of a lower CPU load. Solutions like persistent_term may address situations where more specific needs arise.
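As a minimal sketch of the persistent_term approach, the hypothetical pt_sketch module below stores a term once and reads it back from another process; persistent_term:get/1 does not copy the term onto the heap of the calling process, which is what makes it suited to rarely-updated, widely-read data. The key and the settings map are made up for the example:

```erlang
-module(pt_sketch).
-export([demo/0]).

%% Stores some settings once, then reads one field from another process.
demo() ->
    Key = {pt_sketch, settings},  % hypothetical key
    persistent_term:put(Key, #{retries => 3, hosts => ["a", "b"]}),
    Parent = self(),
    spawn(fun() ->
              % The reader accesses the stored term in place (no copy).
              Settings = persistent_term:get(Key),
              Parent ! {retries, maps:get(retries, Settings)}
          end),
    receive
        {retries, Count} -> Count
    after 1000 ->
        timeout
    end.
```

pt_sketch:demo() returns 3. Updating or erasing a persistent term is, on the other hand, a costly operation, so this mechanism is best kept for data that changes very seldom.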

Just-in-Time Compilation

This long-awaited feature, named BeamAsm, whose rationale and history have been detailed in these articles, was introduced in Erlang 24 and shall transparently lead to increased performance for most applications.
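Besides the [jit] tag shown in the erl banner, whether a given VM actually runs with the JIT can be checked programmatically; the small sketch below (jit_sketch is a hypothetical module name) relies on erlang:system_info(emu_flavor), available from Erlang/OTP 24:

```erlang
-module(jit_sketch).
-export([flavor/0]).

%% Returns jit when BeamAsm is in use, emu when the VM relies on the
%% bytecode interpreter instead.
flavor() ->
    erlang:system_info(emu_flavor).
```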

Static Typing

Static type checking can be performed on Erlang code; the usual course of action is to use Dialyzer - albeit other solutions like Gradualizer exist.
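Such tools can exploit the optional -type and -spec annotations of the language. The following sketch (the spec_sketch module and its shape() type are hypothetical) shows what annotated code may look like; one would expect a checker such as Dialyzer to report a call like spec_sketch:area(square), which can never succeed:

```erlang
-module(spec_sketch).
-export([area/1]).
-export_type([shape/0]).

%% The only shapes that area/1 is meant to handle.
-type shape() :: {circle, Radius :: number()}
               | {rectangle, Width :: number(), Height :: number()}.

-spec area(shape()) -> float().
area({circle, Radius}) ->
    math:pi() * Radius * Radius;
area({rectangle, Width, Height}) ->
    float(Width * Height).
```

These annotations have no effect at runtime: spec_sketch:area({rectangle, 2, 3}) simply returns 6.0.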

A few statically-typed languages can operate on top of the Erlang VM, even if none has yet reached the popularity of Erlang or Elixir (which are dynamically typed).

Beyond the increased type safety that statically-typed languages permit (possibly applying to sequential code but also to inter-process messages), it is unclear whether such extra static awareness may also lead to better performance (especially now that the standard VM supports JIT compilation).

Intermediate Languages

To better discover the inner workings of the Erlang compilation, one may look at the eplaypen online demo (whose project is here) and/or at the Compiler Explorer (which supports the Erlang language among others).

Both allow reading the intermediate representations involved when compiling Erlang code (BEAM stage, erl_scan, preprocessed sources, abstract code, Core Erlang, Static Single Assignment form, BEAM VM assembler opcodes, x86-64 assembler generated by the JIT, etc.).
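Some of these representations can also be obtained locally, as the standard compiler offers listing options ('P', 'E', to_core and 'S'). The sketch below is only an illustration under a few assumptions: ir_sketch and ir_demo are hypothetical module names, and the listing files are assumed to be written in the directory given through the outdir option:

```erlang
-module(ir_sketch).
-export([dump/1]).

%% Writes a tiny module (ir_demo) in Dir, then asks the compiler for one
%% listing per intermediate representation. Returns the paths of the
%% listing files that are expected to have been written.
dump(Dir) ->
    Source = filename:join(Dir, "ir_demo.erl"),
    Code = <<"-module(ir_demo).\n"
             "-export([double/1]).\n"
             "double(X) -> 2 * X.\n">>,
    ok = file:write_file(Source, Code),
    Listings = [{'P', ".P"},          % after preprocessing
                {'E', ".E"},          % after all source transformations
                {to_core, ".core"},   % Core Erlang
                {'S', ".S"}],         % BEAM assembler
    [begin
         Result = compile:file(Source, [Option, {outdir, Dir}]),
         ok = element(1, Result),
         filename:join(Dir, "ir_demo" ++ Extension)
     end || {Option, Extension} <- Listings].
```

For instance ir_sketch:dump(".") should leave ir_demo.P, ir_demo.E, ir_demo.core and ir_demo.S in the current directory, ready to be compared with what the online tools display.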

Erlang Resources