About 3D

Organisation:Copyright (C) 2021-2022 Olivier Boudeville
Contact:about (dash) howtos (at) esperide (dot) com
Creation date:Saturday, November 20, 2021
Lastly updated:Saturday, January 15, 2022

As usual, these information pertain to a GNU/Linux perspective.

Cross-Platform Game Engines

The big three are Godot, Unreal Engine and Unity3D.


Godot is our personal favorite engine, notably because it is free software (released under the very permissive MIT license).

See its official website and its asset library.

Godot (version 3.4.1) will not be able to load FBX files that reference formats like PSD or TIF and/or of older versions (ex: FBX 6.1). See for that our section regarding format conversions.


On Arch Linux: pacman -Sy godot.


Godot logs are stored per-project; ex: ~/.local/share/godot/app_userdata/my-test-project/logs/godot.log; past log files are kept once timestamped. They tend not to have interesting content.

A configuration tree lies in .config/godot, a cache tree in ~/.cache/godot.

Unreal Engine

Another contender is the Unreal Engine, a C++ game engine developed by Epic Games; we have not used it yet.

Its licence is meant to induce costs only when making large-enough profits.

See its official website and its marketplace.


Purchased assets may be used in one's own shipped products (source) and apparently at least usually no restrictive terms apply.

Assets not created by Epic Games can be used in other engines unless otherwise specified (source).


Unity is most probably the cross-platform game engine that is the most popular.

Regarding the licensing of the engine, various plans apply, depending notably on whether one subscribes as an individual or a team, and on one's profile, revenue and funding.

See its official website and its asset store.

Unity may be installed at least in order to access its asset store, knowing that apparently an asset purchased in this store may be used with any game engine of choice. Indeed, for the standard licence, it is stipulated in the EULA legal terms that:

Licensor grants to the END-USER a non-exclusive, worldwide, and perpetual license to the Asset to integrate Assets only as incorporated and embedded components of electronic games and interactive media and distribute such electronic game and interactive media.

So, in legal terms, an asset could be bought in the Unity Asset Store and used in Godot, for example - provided that its content can be used there technically without too much effort/constraints (this may happen with prefabs, specific animations, materials or shaders, conventions in use, etc.).


Unity shall now be obtained thanks to the Unity Hub.

On Arch Linux it is available through the AUR, as an AppImage; one may thus use: yay -Sy unityhub.

Then, when running (as a non-priviledged user) unityhub, a Unity account will be needed, then a licence, then a Unity release will have to be added in order to have it downloaded and installed for good, covering the selected target platforms (ex: Linux and Windows "Build Supports").

We rely here on the Unity version 2021.2.7f1.

Additional information: Unity3D on Arch.


Configuring Unity so that its interface (mouse, keyboard bindings) behave like, for example, the one of Blender, is not natively supported.

Running Unity

Just execute unityhub, which requires signing up and activating a licence.


The log files are stored in ~/.config/unity3d:

  • Unity Editor: Editor.log (the most interesting one)
  • Unity Package Manager: upm.log
  • Unity Licensing client: Unity.Licensing.Client.log

If the editor is stuck (ex: when importing an asset), one may use as a last resort kill-unity3d.sh.

In term of persistent state, beyond the project trees themselves, there are:

  • ~/.config/UnityHub/ and ~/.local/share/UnityHub/
  • ~/.config/unity3d/ and ~/.local/share/unity3d/

(nothing in ~/.cache apparently)

Unity Assets

Once ordered through the Unity Asset Store, assets can be downloaded through the Window -> Package Manager menu, by replacing, in the top Packages drop-down, the In Project entry by the My Assets one. After having selected an asset, use the Download button at the bottom-right of the screen.

Then, to gain access to such downloaded assets, of course the simplest approach is to use the Unity editor; this is done by creating a project (ex: MyProject), selecting the aforementioned menu option (just above), then clicking on Import and selecting the relevant content that will end up in clear form in your project, i.e. in the UNIX filesystem with their actual name and content, for example in MyProject/Assets/CorrespondingAssetProvider/AssetName. We experienced reproducible freezes when importing resources.

Yet such Unity packages, once downloaded (whether or not they have been imported in projects afterwards) are files stored typically in the ~/.local/share/unity3d/Asset Store-5.x directory and whose extension is .unitypackage.

Such files are actually .tar.gz archives, and thus their content can be listed thanks to:

$ tar tvzf Foobar.unitypackage

Inside such archives, each individual package resource is located in a directory whose name is probably akin to the checksum of this resource (ex: 167e85f3d750117459ff6199b79166fd) [1]; such directory generally contains at least 3 files:

  • asset: the resource itself, renamed to that unique checksum name, yet containing its exact original content (ex: the one of a Targa image)
  • asset.meta: the metadata about that asset (file format, identifier, timestamp, type-specific settings, etc.), as an ASCII, YAML-like, text
  • pathname: the path of that asset in the package "virtual" tree (ex: Assets/Foo/Textures/baz.tga)

When applicable, a preview.png file may also exist.

[1]Yet no checksum tool among md5sum, sha1sum, sha256sum, sha512sum, shasum, sha224sum, sha384sum seems to correspond; it must a be a different, possibly custom, checksum.

Some types of content are Unity-specific and thus may not transpose (at least directly) to another game engine. This is the case for example for materials or prefabs (whose file format is relatively simple, based on YAML 1.1).

Tools like AssetStudio (probably Windows-only) strive to automate most of the process of exploring, extracting and exporting Unity assets.

Meshes are typically in the FBX (proprietary) file format, that can nevertheless be imported in Blender and converted to other file formats (ex: gltF 2.0); see blender import and blender convert for that.

3D Data

File Formats

They are designed to store 3D content (scenes, nodes, vertices, normals, meshes, textures, materials, animations, skins, cameras, lights, etc.).


We prefer to rely on the open, well-specified, modern glTF 2.0 format in order to perform import/export operations.

It comes in two forms:

  • either as *.gltf when JSON-based, possibly embedding the actual data (vertices, normals, textures, etc.) as ASCII base64-encoded content, or referencing external files
  • or as *.glb when binary; this is the most compact form, and the one that we recommend especially

See also the glTF 2.0 quick reference guide, the related section of Godot and this standard viewer of predefined glTF samples.

This (generic) online glTF viewer proved lightweight and convenient notably because it displays errors, warnings and information regarding the glTF data that it decodes.


The second best choice that we see is Collada (*.dae files), an XML-based counterpart (open as well, yet older and with less validating facilities) to glTF.

FBX, OBJ, etc.

Often, assets can be found as FBX of OBJ files and thus may have to be converted (typically to glTF), which is never a riskless task. FBX comes in two flavours: text-based (ASCII) or binary, see this retro-specification for more information.

In General

Refer to blender import in order to handle the most common 3D file formats, and the next section about conversions.

The file command is able to report the version of at least some formats; for example:

# Means FBX 7.3:
$ file foobar.fbx
foobar.fbx: Kaydara FBX model, version 7300

Too often, some tool will not be able to load a file and will fail to properly report why. When suspecting that a binary file (ex: a FBX one) references external content either missing or in an unsupported format (ex: PSD or TIFF?), one may peek at their content without any dedicated tool, directly from a terminal, like in:

$ strings my_asset.fbx | sort | uniq | grep '\.'

This should list, among other elements, the paths that such a binary file is embedding.


Due to the larger number of 3D file formats and the role of commercial software, interoperability regarding 3D content is poor and depends on many versions (of tools and formats).

Workaround #1: Using Autodesk FBX Converter

The simpler approach seems to download the (free) Autodesk FBX Converter and to use wine to run it on GNU/Linux. Just install then this converter with: wine fbx20133_converter_win_x64.exe.

A convenient alias (based on default settings, typically to be put in one's ~/.bashrc) can then be defined to run it:

$ alias fbx-converter-ui="$HOME/.wine/drive_c/Program\ Files/Autodesk/FBX/FBX\ Converter/2013.3/FBXConverterUI.exe 2>/dev/null &"

Conversions may take place from, for example, FBX 6.1 (also: 3DS, DAE, DXF, OBJ) to a FBX version in: 2006, 2009, 2010, 2011, 2013 (i.e. 7.3 - of course the most interesting one here), but also DXF, OBJ and Collada, with various settings (embedded media, binary/ASCII mode, etc.).

An even better option is to use directly the command-line tool bin/FbxConverter.exe, which the previous user interface actually executes. Use its /? option to get help, with interesting information.

For example, to update a file in a presumably older FBX into a 7.3 one (that Blender can import):

$ cd ~/.wine/drive_c/Program\ Files/Autodesk/FBX/FBX\ Converter/2013.3/bin
$ FbxConverter.exe My-legacy.FBX newer.fbx /v /sffFBX /dffFBX /e /f201300

We devised the update-fbx.sh script to automate such an in-place FBX update.

Unfortunately, at least on one FBX sample taken from a Unity package, if the mesh could be imported in Blender, textures and materials were not (having checked Embed media in the converter or not).

Workaround #2: Relying on Unity

Here the principle is to import a content in Unity (the same could probably be done with Godot), and to export it from there.

Unity does not allow to export for example FBX natively, however a package for that is provided. It shall be installed first, once per project.

One shall select in the menu Window -> Package Manager, ensure that the entry Packages: points to Unity Registry, and search for FBX Exporter, then install it (bottom right button).

Afterwards, in the GameObject menu, an Export to FBX option will be available. Select the Binary export format (not ASCII) if wanting to be compliant with Blender.


Here are a few samples of 3D content (useful for testing):

Asset Providers

Usually, for one's creation, much multimedia artwork has to be secured: typically graphical assets (ex: 2D/3D geometries, animations, textures) and/or audio ones (ex: music, sounds, speech syntheses, special effects).

Instead of creating such content by oneself (not enough time/interest/skill?), it may be more relevant to rely on specialised third-parties.

Hiring a professional or a freelance is then an option. This is of course relatively expensive, involves more efforts (to define requirements and review the results), longer, but it is to provide exactly the artwork that one would like.

Another option is to rely on specialised third-party providers that sell non-exclusive licences for the content they offer.

These providers can be either direct content producers (companies with staffs of modellers), or asset aggregators (marketplaces which federate the offers of many producers of any size) that are often created in link to a given multimedia engine. An interesting point is that assets purchased in these stores generally can be used in any technical context, hence are not meant to be bound to the corresponding engine.

Nowadays, much content is available, in terms of theme/setting (ex: Medieval, Science-Fiction, etc.), of nature (ex: characters, environments, vehicles, etc.), etc. and the overall quality/price ratio is rather good.

The main advantages of these marketplaces is that:

  • they favor the competition between content providers: the clients can easily compare assets and share their opinion about them
  • they generalised simple, standard, unobtrusive licensing terms; ex: royalty free, allowing content to be used as they are or in a modified form, not limited by types of usage, number of distributed copies, duration of use, number of countries addressed, etc.; the general rule is that much freedom is left to the asset purchasers provided that they use for their own projects (rather than for example selling the artwork as they are)

The main content aggregators that we spotted are (roughly by decreasing order of interest, based on our limited experience):

  • the Unity Asset Store, already discussed in the Unity Assets section; websites like this one allow to track the significant discounts that are regularly made on assets
  • the UE Marketplace, i.e. the store associated to the Unreal Engine; in terms of licensing and uses:
    • this article states that When customers purchase Marketplace products, they get a non-exclusive, worldwide, perpetual license to download, use, copy, post, modify, promote, license, sell, publicly perform, publicly display, digitally perform, distribute, or transmit your product’s content for personal, promotional, and/or commercial purposes. Distribution of products via the Marketplace is not a sale of the content but the granting of digital rights to the customer.
    • this one states that Any Marketplace products that have not been created by Epic Games can be used in other engines unless otherwise specified.
    • this one states that All products sold on the Marketplace are licensed to the customer (who may be either an individual or company) for the lifetime right to use the content in developing an unlimited number of products and in shipping those products. The customer is also licensed to make the content available to employees and contractors for the sole purpose of contributing to products controlled by the customer.
  • itch.io
  • Turbosquid
  • Free3D
  • CGtrader
  • ArtStation
  • Sketchfab
  • 3DRT
  • Reallusion
  • Arteria3D
  • the GameDev Market (GDM)
  • the Game Creator Store

Many asset providers organise interesting discount offers (at least -50% on a selection of assets, sometimes even more for limited quantities) for the Black Friday (hence end of November) or for Christmas (hence mid-December till the first days of January).

Modelling Software


Blender is a very powerful open-source 3D toolset.

Blender (version 3.0.0) can import FBX files of version at least 7.1 ("7100"). See for that our section regarding format conversions.

We recommend the use our Blender scripts in order to:

  • import conveniently various file formats in Blender, with blender-import.sh
  • convert directly on the command-line various file formats (still thanks to a non-interactive Blender), with blender-convert.sh


Wings3D is a nice, Erlang-based, free software subdivision modeler.

It can be installed on Arch Linux, from the AUR, as wings3d.

Other Tools


Draco is an open-source library for compressing and decompressing 3D geometric meshes and point clouds.

It is intended to improve the storage and transmission of 3D graphics; it can be used with glTF, with Blender, with Compressonator, or separately.

A draco AUR package exists, and results notably in creating the /usr/lib/libdraco.so shared library file.

Even once this package is installed, when Blender exports a mesh, a message like the following is displayed:

'/usr/bin/3.0/python/lib/python3.10/site-packages/libextern_draco.so' does
not exist, draco mesh compression not available, please add it or create
environment variable BLENDER_EXTERN_DRACO_LIBRARY_PATH pointing to the folder

Setting the environment prior to running Blender is necessary (and done by our blender-*.sh scripts:


but not sufficient, as the built library does not bear the expected name.

So, as root, one shall fix that once for all:

$ cd /usr/lib
$ ln -s libdraco.so libextern_draco.so

Then the log message will become:

'/usr/lib/libextern_draco.so' exists, draco mesh compression is available


f3d (installable from the AUR) is a fast and minimalist VTK-based 3D viewer.

Such a viewer is especially interesting to investigate whether a tool failed to properly export a content or whether it is the next tool that actually failed to properly import, and to gain another chance to have relevant error messages.

OpenGL Corner


Code snippets will corresponds to the OpenGL/GLU APIs as they are exposed in Erlang. These translate easily in the vanilla C GL/GLU implementations for example.

Referenced tests will be Ceylan-Myriad ones, typically located here.


  • OpenGL is a software interface to graphics hardware, i.e. an API of around 150 functions (version 1.1)
  • OpenGL concentrates on hardware-independent 2D/3D rendering; no commands for performing window-related tasks or obtaining user input are included; for example frame buffer configuration is done outside of OpenGL, in conjunction with the windowing system
  • OpenGL offers only low-level primitives organised through a pipeline in which vertices are assembled into primitives, then to fragments, and finally to pixels in the frame buffer; as such OpenGL is a building-block for higher-level engines (ex: like Godot)
  • OpenGL is a procedural (function-based, not object-oriented) state machine comprising a larger number of variables defined within a given OpenGL state (named OpenGL context; comprising vertex coordinates, textures, frame buffer, etc.); said otherwise, relatively to an OpenGL context (which is often implicit), all OpenGL state variables behave like global variables; when a parameter is set, it applies and lasts as long as it is not modified; the effect of an OpenGL command may vary depending on whether certain modes are enabled (i.e. whether some state variables are set)
  • so the currently processed element (ex: a vertex) inherits (implicitly) the current settings of the context (ex: color, normal, texture coordinate, etc.); this is the only reasonable mode of operation, knowing that a host of parameters apply when performing a rendering operation (specifying all these parameters would not be a realistic option); as a result, any specific parameter shall be set first (prior to triggering such an operation), and is to last afterwards (being "implicitly inherited"), until possibly being reassigned in some future
  • OpenGL respects a client/server execution model: an application (a specific client, running on a CPU) issues commands to a rendering server (on the same host or not - see GLX; generally the server can be seen as running on a local graphic card), that executes them sequentially and in-order; as such, most of the calls performed by user programs are asynchronous: through OpenGL they are triggered by the program and return almost immediately, whereas they have not been executed yet; they have just be queued; indeed OpenGL implementations are almost always pipelined, so the rendering must be thought as primarily taking place in a background process; additional facilities like Display Lists allow to pipeline operations (as opposed to the default immediate mode), which are accumulated for processing at a later time
  • state variables are mostly server-side, yet some of them are client-side; in both cases, they can be gathered in attribute groups, which can be pushed on, and popped from, their respective server or client attribute stacks
  • OpenGL manages two types of data, handled by mostly different paths of its rendering pipeline yet that are ultimately integrated in the framebuffer through fragment-yielding rasterization:
    • geometric data (vertices, lines, and polygons)
    • pixel data (pixels, images, and bitmaps)
  • vertices and normals are transformed by the modelview and projection matrices (that can be each set and transformed on a stack of their own), before being used to produce an image in the frame buffer; texture coordinates are transformed by the texture matrix
  • textures may reside in the main, general-purpose, client, CPU-side memory (large and slow to access for the rendering) and/or in any auxiliary, dedicated, server-side GPU memory (more constrained, hence prioritized thanks to texture objects; and high-performance, rendering-wise)
  • OpenGL can operate on three mutually exclusive modes:
    • rendering: the default, most common one, discussed here
    • selection: determines, based on stacks of user-specified "names", which primitives would be drawn into some region of a window
    • feedback: allows to captures the primitives generated by the vertex processing step(s) in order to resubmit this data multiple times

Mini OpenGL Glossary

Terms that are more or less specific to OpenGL:

  • Accumulation buffer: a buffer that may be used for scene antialiasing; the scene is rendered several times, each time jittered less than one pixel, and the images are accumulated and then averaged
  • Context: a rendering context corresponds to the OpenGL state and the connection between OpenGL and the system; in order to perform rendering, a suitable context must be current (i.e. bound, active for the OpenGL commands); it is possible to have multiple rendering contexts share buffer data and textures, which is specially useful when the application use multiple threads for updating data into the memory of the graphics card
  • Display list: a group of OpenGL commands that has been stored for subsequent execution, so that it can be sent and processed more efficiently by the graphic card
  • (pixel) fragment: two-dimensional description of elements (point, line segment, or polygon) produced by the rasterization step, before being stored as pixels in the frame buffer; also defined as: "a point and its associated information"
  • Evaluator: the part of the pipeline to perform polynomial mapping (basis functions) and transform higher-level primitives (such as NURBS) into actual ones (vertices, normals, texture coordinates and colors)
  • Frame buffer: the "server-side" pixel buffer, filled, after rasterization took place, by combinations (notably blending) of the selected fragments; it is actually made of a set of logical buffers of bitplanes: the color (itself comprising multiple buffers), depth (for hidden-surface removal), accumulation, and stencil buffers
  • GL: Graphics Library (also a shorthand for OpenGL)
  • GLU: OpenGL Utility Library, a standard part of every OpenGL implementation, providing auxiliary features (ex:image scaling, automatic mipmapping, setting up matrices for specific viewing orientations and projections, performing polygon tessellation, rendering surfaces, supporting quadrics routines that create spheres, cylinders, cones, etc.); see this page for more information
  • GLUT, OpenGL Utility Toolkit, a window system-independent toolkit hiding the complexities of differing window system APIs and more complicated three-dimensional objects such as a sphere, a torus, and a teapot; its main interest was when learning OpenGL, nowadays is less used
  • GLX: the X extension of the OpenGL interface, i.e. a solution to integrate OpenGL to X servers; see this page for more information
  • Pixel: Picture Element
  • Primitive: points, lines, polygons, images, and bitmaps
  • (geometric) Primitives: they are (exactly) points, lines, and polygons
  • Rasterization: the process by which a primitive is converted to a two-dimensional image
  • Scissor Test: an arbitrary screen-aligned rectangle outside of which fragments will be discarded.
  • Stencil Test: conditionally discards a fragment based on the outcome of a selected comparison between the value in the stencil buffer and a reference value
  • Texel: Texture Element
  • Vertex Array: these in-memory client-side arrays may aggregate 6 types of data (vertex coordinates, RGBA colors, color indices, surface normals, texture coordinates, polygon edge flags), possibly interleaved; such arrays allow to reduce the number of calls to OpenGL functions, and also to share elements (ex: vertices pertaining to multiple faces should preferably be defined only once); in a non-networked setting, the GPU just dereferences the corresponding pointers

Refer to the description of the pipeline for further details.


In 2D

A popular convention, for example detailed in this section of the Red book, is to consider that the ordinates increase when going from the bottom of the viewport to its top; then for example the on-screen lower-left corner of the OpenGL canvas is (0,0), and its upper-right corner is (Width,Height).

As for us, we prefer the MyriadGUI 2D conventions, in which ordinates increase when going from the top of the viewport to its bottom, as depicted in the following figure:

Such a setting can be obtained thanks to:

gl:matrixMode( ?GL_PROJECTION ),

% Like glu:ortho2D/4:
gl:ortho( _Left=0.0, _Right=float( CanvasWidth ),
  _Bottom=float( CanvasHeight ), _Top=0.0, _Near=-1.0, _Far=1.0 )

In this case, the viewport can be addressed like a usual (2D) framebuffer (like provided by any classical 2D backend such as SDL) obeying the coordinate system just described: if the width of the OpenGL canvas is 800 pixels and its height is 600 pixels, then its top-left on-screen corner is (0,0) and its bottom-right one is (799,599), and any pixel-level operation can be directly performed there "as usual". One may refer to gui_opengl_2D_test.erl for a full example thereof, in which line-based letters are drawn to demonstrate these conventions.

Each time the OpenGL canvas is resized, this projection matrix will have to be updated, with the same procedure yet based on the new dimensions.

Another option - still with axes respecting the MyriadGUI 2D conventions - is to operate this time based on normalised, definition-independent coordinates, ranging in [0.0, 1.0], like in:

gl:matrixMode( ?GL_PROJECTION ),

gl:ortho( _Left=0.0, _Right=1.0, _Bottom=1.0, _Top=0.0, _Near=-1.0,
  _Far=1.0 )

Using "stable", device-independent floats instead of integers directly accounting for pixels may be more convenient. For example a resizing of the viewport will then not require an update of the projection matrix. One may refer to gui_opengl_minimal_test.erl for a full example thereof.

In 3D

We will rely here as well on the MyriadGUI conventions, this time for 3D (not taking specifically time here):

These are thus Z-up conventions (the Z axis being vertical and designating altitudes), like modelling software such as Blender.

A Tree of Referentials

In the general case, either in 2D or (more often of interest here) in 3D, a given scene (a model) is made of a set of elements (ex: the model of a street may comprise a car, two bikes, a few people) that will have to be rendered from a given viewpoint (ex: a window on the second floor of a given building) onto the (flat) user screen (with suitable clipping, perspective division and projection on the viewport). Let's start from the intended result and unwind the process.

The rendering objective requires to have ultimately one's scene transformed as a whole in eyes coordinates (to obtain coordinates along the aforementioned 2D screen referential, along the X and Y axes - the Z one serving to sort out depth, as per our conventions).

For that, a prerequisite is to have the target scene correctly composed, with all its elements defined in the same (scene-global) space, in their respective position and orientation (then only the viewpoint, i.e. the virtual camera, can take into account the scene as a whole, to transform it to eye coordinates).

As each individual type of model (ex: a bike model) is natively defined in an abstract, local referential (an orthonormal basis) of its own, each actual model instance (ex: the first bike, the second bike) has to be specifically placed in the referential of the overall scene. This placement is either directly defined in that target space (ex: bike A is at this absolute position and orientation in the scene global referential) or relatively to a series of parent referentials (ex: this character rides bike B - and thus is defined relatively to it, knowing that the bike is placed relatively to the car, and that the car itself is placed relatively to the scene).

So in the general case, referentials are nested (recursively defined relatively to their parent) and form a tree [2] whose root corresponds to the referential of the overall scene, like in:


This is actually named a scene graph rather than a scene tree, as if we consider the leaves of that "tree" to contain actual geometries (ex: of an abstract bike), as soon as a given geometry is instantiated more than once (ex: if having 2 of such bikes in the scene), this geometry will have multiple parents and thus the corresponding scene will be a graph.

As for us, we consider referential trees (no geometry involved) - a given 3D object being possibly associated to (1) a referential and (2) a geometry (independently).

A series of model transformations has thus to be operated in order to express all models in the scene referential:

(local referential of model Rf) -> (parent referential Rd) -> (...) -> (Ra) -> (scene referential Rs)

For example the hand of a character may be defined in \(R_h\), itself defined relatively to its associated forearm in \(R_f\) up to the overall referential \(R_a\) of that character, defined relatively to the referential of the whole scene, \(R_s\). This referential may have no explicit parent defined, meaning implicitly that it is defined in the canonical, global referential.

Once the model is expressed as a whole in the scene-global referential, the next transformations have to be conducted : view and projection. The view transformation involves at least an extra referential, the one of the camera in charge of the rendering, which is \(R_c\), possibly defined relatively to \(R_s\).

So a geometry (ex: a part of the hand, defined in \(R_f\)) has been transformed upward in the referential tree in order to be expressed in the common, "global" scene referential \(R_s\), before being transformed last in the camera one, \(R_c\).

In practice, all these operations can be done thanks to the multiplication of homogeneous 4x4 matrices, each able to express any combination of rotations, scalings/reflections/shearings, translations, which thus include the transformation of one referential into another. Their product can be computed once, and then applying a vector (ex: corresponding to a vertex) to the resulting matrix allows to perform in one go the full composition thereof, encoding all modelview transformations and even the projection as well.

Noting \(P_{a{\rightarrow}b}\) the transition matrix transforming a vector \(\vec{V_a}\) expressed in \(R_a\) into its representation \(\vec{V_b}\) in \(R_b\), we have:

\begin{equation*} \vec{V_b} = P_{a{\rightarrow}b}.\vec{V_a} \end{equation*}

Thus, to express the geometry of said hand (natively defined in \(R_h\)) in camera space (hence in \(R_c\)), the following composition of referential changes [3] shall be applied:

\begin{equation*} P_{h{\rightarrow}c} = P_{s{\rightarrow}c}.P_{a{\rightarrow}s}.P_{f{\rightarrow}a}.P_{h{\rightarrow}f}. \end{equation*}
[3]Thus transformation matrices, knowing that the product of such matrices is in turn a transformation matrix.

So a whole series of transformations can be done by applying a single matrix - whose coordinates are now to be determined.

Computing Transition Matrices

For that, let's consider an homogeneous 4x4 matrix is in the form of:

\begin{equation*} M = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \end{equation*}

It can be interpreted as a matrix comprising two blocks of interest, \(R\) and \(\vec{T}\):

\begin{equation*} P_{1\rightarrow2} = \begin{bmatrix} R & \vec{T} \\ 0 & 1 \\ \end{bmatrix} \end{equation*}


  • \(\matrix{R}\), which accounts for a 3D rotation submatrix:
\begin{equation*} R = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \\ \end{bmatrix} \end{equation*}
  • \(\vec{T}\), which accounts for a 3D translation vector:

\(\vec{T} = \begin{bmatrix} t1 \\ t2 \\ t3 \end{bmatrix}\)

Applying a (4x4 homogeneous) point \(P = \begin{Bmatrix} x \\ y \\ z \\ 1 \end{Bmatrix}\) to \(M\) yields \(P' = M.P\) where \(P'\) corresponds to P once it has been (1) rotated by \(\matrix{R}\) and then (2) translated by \(\vec{T}\) (order matters).

Let's consider now:

  • two referentials (defined as orthonormal bases), \(R_1\) and \(R_2\); \(R_2\) may for example be defined relatively to \(R_1\); for a given point or vector \(U\), \(U_1\) will designate its coordinates in \(R_1\) (and \(U_2\) its coordinates in \(R_2\))
  • \(P_{2\rightarrow1}\) the (homogeneous 4x4) transition matrix from \(R_2\) to \(R_1\), specified first by blocks then by coordinates as:
\begin{equation*} P_{2\rightarrow1} = \begin{bmatrix} R & \vec{T} \\ 0 & 1 \\ \end{bmatrix} \end{equation*}
\begin{equation*} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \end{equation*}
  • any (4D) point \(P\), whose coordinates are \(P_1\) in \(R_1\), and \(P_2\) in \(R_2\)

The objective is to determine \(P_{2\rightarrow1}\), i.e. \(R\) and \(\vec{T}\).

By definition of a transition matrix, for any point \(P\), we have: \(P_1 = P_{2\rightarrow1}.P_2 \qquad (1)\)

Let's study \(P_{2\rightarrow1}\) by first choosing a point \(P\) equal to the origin of \(R_2\) (shown as Ob in the figure).

By design, in homogeneous coordinates, \(P_2 = Ob_2 = \begin{Bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{Bmatrix}\) and applying it on \((1)\) gives us: \(P_1 = Ob_1 = \begin{Bmatrix} t1 \\ t2 \\ t3 \\ 1 \end{Bmatrix}\).

So if \(Ob_1 = \begin{Bmatrix} XOb_1 \\ YOb_1 \\ ZOb_1 \\ 1 \end{Bmatrix}\), we have: \(\vec{T} = \vec{T_{2\rightarrow1}} = \begin{bmatrix} XOb_1 \\ YOb_1 \\ ZOb_1 \end{bmatrix}\).

Let's now determine the \(r_{xy}\) coordinates.

Let \(R_{2\rightarrow1}\) be the (3x3) rotation matrix transforming any vector expressed in \(R_2\) in its representation in \(R_1\): for any (3D) vector \(\vec{V}\), we have \(\vec{V_1} = R_{2\rightarrow1}.\vec{V_2} \qquad (2)\)

(we are dealing with vectors, not points, hence the origins are not involved here).

By choosing \(\vec{V}\) equal to the \(\vec{Ib}\) (abscissa) axis of \(R_2\) (shown as Ib in the figure), we have \(\vec{Ib_1} = R_{2\rightarrow1}.\vec{Ib_2}\)

Knowing that by design \(\vec{Ib_2} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\), \((2)\) gives us:

\begin{equation*} \vec{Ib_1} = \begin{bmatrix} r_{11} \\ r_{21} \\ r_{31} \end{bmatrix} = \begin{bmatrix} XIb_{1} \\ YIb_{1} \\ ZIb_{1} \end{bmatrix} \end{equation*}

So the first column of the \(R\) matrix is \(\vec{Ib_1}\) , i.e. the first axis of \(R_2\) as expressed in \(R_1\).

Using in the same way the two other axes of \(R_2\) (shown as Jb and Kb in the figure), we see that:

\begin{equation*} R = R_{2\rightarrow1} \end{equation*}
\begin{equation*} = \begin{bmatrix} XIb_{1} & XJb_{1} & XKb_{1} \\ YIb_{1} & YJb_{1} & YKb_{1} \\ ZIb_{1} & ZJb_{1} & ZKb_{1} \\ \end{bmatrix} \end{equation*}


So finally the transition matrix from \(R_2\) to \(R_1\) is:

\begin{equation*} P_{2\rightarrow1} = \begin{bmatrix} R_{2\rightarrow1} & \vec{T_{2\rightarrow1}} \\ 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} XIb_1 & XJb_1 & XKb_1 & XOb_1 \\ YIb_1 & YJb_1 & YKb_1 & YOb_1 \\ ZIb_1 & ZJb_1 & ZKb_1 & ZOb_1 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \end{equation*}


  • \(R_{2\rightarrow1}\) is the 3x3 rotation matrix converting vectors of \(R_2\) in \(R_1\), i.e. whose columns are the axes of \(R_2\) expressed in \(R_1\)
  • \(\vec{T_{1\rightarrow2}} = Ob_1\) is the 3D vector of the coordinates of the origin of \(R_2\) as expressed in \(R_1\)

This also corresponds to a matrix obtained by describing the \(R_2\) referential in \(R_1\), by listing first the three (4D) vector axes of \(R_2\) then its (4D) origin, i.e. \(P_{2\rightarrow1} = \begin{bmatrix} \vec{Ib_1} && \vec{Jb_1} && \vec{Kb_1} && Ob_1 \end{bmatrix}\).

As a result, from the definition of a tree of referentials, we are able to compute the transition matrix transforming the representation of a vector expressed in any of them to its representation in any of the other referentials.

For that, like in the case of the scene-to-camera transformation, transition matrices may have to be inversed, knowing that \((P_{2\rightarrow1})^{-1} = P_{1\rightarrow2}\) (since by definition \(P_{2\rightarrow1}.P_{1\rightarrow2} = Id\)).

A special case of interest is, for the sake of rendering, to transform, through that tree, a local referential in which a geometry is defined into the one of the camera, defining where it is positioned and aimed [4]; in OpenGL parlance, this corresponds to the modelview matrix (for "modelling and viewing transformations") that we designate here as \(M_{mv}\) and which corresponds to \(P_{local{\rightarrow}camera}\).

[4]gluLookAt can define such a viewing transformation matrix, when given (1) the position of the camera, (2) a point at which it shall look, and (3) a vector specifying its up direction (i.e. where is the upward direction for the camera - as otherwise all directions orthogonal to its line of sight defined by (1) and (2) could be chosen).

Taking into account the last rendering step, the projection (comprising clipping, projection division and viewport transformation), which can be implemented as well thanks to a 4x4 matrix designated here as \(M_p\), we see that a single combined overall matrix \(M_o = M_p.M_{mv}\) is sufficient [5] to convey in one go all transformations that shall be applied to a given geometry for its rendering.

[5]In practice, for more flexibility, in OpenGL the management of viewport, or the projection and the model-view transformations are managed separately (for example, respectively, with: glViewport, glMatrixMode(GL_MODELVIEW) and glMatrixMode(GL_PROJECTION)).

More Advanced Topics


Determining the shadow of an arbitrary object on an arbitrary plane (representing typically the ground - or other objects) from an arbitrary light source (possibly at infinity) corresponds to performing a specific projection. For that, a relevant 4x4 (based on homogeneous coordinates) matrix (singular, i.e. non-invertible matrix) can be defined.

This matrix can be multiplied with the top of the modelview matrix stack, before drawing the object of interest in the shadow color (a shade of black generally).

Refer to this page for more information.

Sources of Information

Two very well-written books, strongly recommended, that are still perfectly relevant despite their old age (circa 1996):

Other elements of interest:

Operating System Support for 3D

Benefiting from a proper 2D/3D hardware acceleration on GNU/Linux is unfortunately not always straightforward, and sometimes brittle.


First, one may check whether such acceleration is already available by running, from the command-line, the glxinfo executable (to be obtained on Arch Linux thanks to the mesa-utils package), and hope to see, among the many displayed lines, direct rendering: Yes.

One may also run our display-opengl-information.sh script to report relevant information.

A final validation might be to run the glxgears executable (still obtained through the mesa-utils package), and to ensure that a window appears, showing three gears properly rotating.


If it is not the case (no direct rendering, or a GLX error being returned - typically involving any X Error of failed request:  BadValue for a X_GLXCreateNewContext), one should investigate one's configuration (with lspci | grep VGA, lsmod, etc.), update one's video driver on par with the current kernel, reboot, sacrifice a chicken, etc.

If using a NVidia graphic card, consider reading this Arch Linux wiki page first.

In our case, installation could be done with pacman -Sy nvidia nvidia-utils but requested a reboot.

Despite package dependencies and a not-so-successful attempt of using DKMS in order to link kernel updates with graphic controller updates, too often a proper 3D support was lost, either from the boot or afterwards. Refer to our software update section for hints in order to secure the durable use of proper drivers.