GUI Under Linux
1. Overview
In this article, we’ll go through the graphics stack used in Linux-based operating systems. We’ll see the different technologies that make graphical applications possible and how they interact with one another. We’ll start from the ground up and work our way to the high-level GUI toolkits.
Finally, we’ll discuss how these technologies fit together to form a fully-fledged graphical experience.
2. Linux at Its Core
The name “Linux” merely refers to the Linux kernel. It’s not a complete operating system that contains everything out of the box, but rather a kernel around which everything is set up. The kernel is the interface between actual hardware and processes.
If we build and install a Linux kernel on a machine alongside helper tools and utilities, we can get very primitive graphics through Kernel Mode Setting in the virtual terminal but not complex graphics such as program windows, visual effects, and images with fancy gradients. To make Linux work with complex graphics, we’ll need a complete graphical stack including graphics drivers, graphics API wrappers, a window system, a compositor, and more.
Most desktop-oriented Linux distributions like Ubuntu, Debian, and openSUSE ship this graphical stack preconfigured, so we have access to the graphical environment out of the box. However, if we were to use a do-it-yourself distribution like Arch, LFS, Gentoo, or Alpine, we’d need to install and configure the graphics stack manually to get a graphical environment.
3. Graphics API
There are several popular graphics APIs such as OpenGL, OpenGL ES, Metal, Direct3D, and Vulkan. However, most of the graphics stack on Linux makes heavy use of OpenGL because it’s free and cross-platform.
3.1. The OpenGL API
To be precise, OpenGL is a specification rather than a library, and each vendor has to implement that specification to produce an OpenGL library. Therefore, on Linux-based distributions, the libGL.so library file will be different for each vendor. In addition, there are multiple implementations of the OpenGL specification, including both proprietary and open-source implementations.
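To see which OpenGL library file a given machine would pick up, we can ask the dynamic linker. The following is only a minimal sketch: the function names are made up for illustration, and the result tells us the file name, not which vendor supplied it.

```python
import ctypes.util


def find_opengl_library():
    """Return the file name of the OpenGL library the linker finds, or None."""
    # find_library searches the usual linker locations for libGL.so.*
    return ctypes.util.find_library("GL")


def describe_opengl_library():
    """Return a short human-readable summary of the lookup."""
    name = find_opengl_library()
    if name is None:
        return "no libGL found on this system"
    return "OpenGL library: " + name
```

Calling `print(describe_opengl_library())` on a desktop system typically reports a versioned libGL file name, while a headless server may report that nothing was found.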
3.2. Mesa
Mesa is an open-source implementation of the OpenGL and Vulkan APIs. It uses card-specific drivers to translate the API calls into a hardware-specific form. In addition, Mesa supports the Gallium3D architecture for building 3D graphics drivers, which allows portability to all major operating systems.
Modern display servers and compositors, such as the X.Org Server and Wayland compositors, use OpenGL internally, so on systems with open-source drivers, their graphics go through Mesa.

3.3. GLES
3.4. GLX and WGL
Similar to GLX, WGL is the interface between OpenGL and the native window system of Microsoft Windows.
3.5. EGL and GLUT
Unlike EGL, GLUT is a wrapper around GLX and WGL, enabling us to write portable graphics applications.
3.6. fglrx and Catalyst
Catalyst is AMD’s OpenGL driver for X.Org, which previously went by the name fglrx. It’s a proprietary driver for the X.Org Server and has its own implementation of the OpenGL specification.
4. DRM and DRI
On Linux, we have libdrm, a user-space library that makes it easy to access the Direct Rendering Manager (DRM), the kernel subsystem that manages the video card. DRM uses a set of generic ioctls to allocate memory for graphical objects and to submit the commands and textures the video card needs. An ioctl is a special type of system call that deals with device-specific input and output operations. In this case, it deals with the input and output operations of a video card.
So, when we run a graphical application, it loads the OpenGL driver — for example, Mesa. The driver, in turn, loads libdrm, which enables talking directly to the kernel through ioctl.
So, this process goes on as long as the graphical application is running. However, we need a way to let the window system, such as the X Server, know what’s happening so it can synchronize and update itself. This synchronization process is known as Direct Rendering Infrastructure (DRI).
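To make the ioctl conversation with DRM more concrete, here’s a minimal sketch that asks the kernel driver for its version without going through libdrm. It rests on several assumptions: the device node is /dev/dri/card0, the structure layout mirrors struct drm_version from the kernel’s DRM header, and the request encoding matches the generic _IOWR macro used on x86 and ARM. The helper names are ours, not part of any library.

```python
import ctypes
import fcntl
import os


class DrmVersion(ctypes.Structure):
    """Assumed layout of struct drm_version from the kernel's DRM header."""

    _fields_ = [
        ("version_major", ctypes.c_int),
        ("version_minor", ctypes.c_int),
        ("version_patchlevel", ctypes.c_int),
        ("name_len", ctypes.c_size_t),
        ("name", ctypes.c_void_p),
        ("date_len", ctypes.c_size_t),
        ("date", ctypes.c_void_p),
        ("desc_len", ctypes.c_size_t),
        ("desc", ctypes.c_void_p),
    ]


def iowr(ioctl_type, number, size):
    """Encode a read/write ioctl request like the generic _IOWR macro."""
    direction = 3  # read | write; this encoding is an x86/ARM assumption
    return (direction << 30) | (size << 16) | (ioctl_type << 8) | number


def drm_version_request():
    # DRM ioctls use the type character 'd'; VERSION is request number 0.
    return iowr(ord("d"), 0x00, ctypes.sizeof(DrmVersion))


def query_drm_version(path="/dev/dri/card0"):
    """Return (major, minor, patch) of the DRM driver, or None if unavailable."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # no such device, or no permission to open it
    try:
        version = DrmVersion()  # zero-filled: we pass no string buffers
        fcntl.ioctl(fd, drm_version_request(), version)
        return (
            version.version_major,
            version.version_minor,
            version.version_patchlevel,
        )
    except OSError:
        return None
    finally:
        os.close(fd)
```

On a machine with a DRM driver and sufficient permissions, `query_drm_version()` returns a three-element tuple. Everywhere else it returns None. A real application would normally leave this to libdrm rather than encode requests by hand.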

4.1. KMS
Graphical applications work great when we have a running X Server or a compositor. So, what about the graphics that run outside of the X Server, like the virtual terminal and the loading splash screen? This is where the Kernel Mode Setting (KMS) subsystem comes in.
KMS is a subsystem in the Linux kernel and libdrm that enables us to directly configure the actual hardware through ioctls. For that reason, we don’t have to rely on the X Server. However, we should note that KMS is a very low-level subsystem and should only be used when a graphics server or a compositor cannot be run.
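Before touching any mode-setting ioctls, we can inspect what the kernel already knows about our display outputs. The sketch below assumes that the DRM/KMS driver exposes its connectors under /sys/class/drm, with one directory per connector containing a status file. The function name is hypothetical.

```python
import os


def list_connectors(sysfs_root="/sys/class/drm"):
    """Return {connector_name: status} for every connector found in sysfs."""
    connectors = {}
    if not os.path.isdir(sysfs_root):
        return connectors
    for entry in sorted(os.listdir(sysfs_root)):
        status_file = os.path.join(sysfs_root, entry, "status")
        # Only connector directories such as card0-HDMI-A-1 hold a status file.
        if not os.path.isfile(status_file):
            continue
        with open(status_file) as handle:
            connectors[entry] = handle.read().strip()
    return connectors
```

Calling `list_connectors()` on a laptop would typically report the internal panel as connected and any unused external ports as disconnected. Since it only reads files, it works without an X Server or a compositor.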
5. The X Window System
The X Window System is an open-source windowing system that is used by most Linux-based distributions. It’s based on the client-server architecture, which provides a network-transparent way to interact with windows that can also be used in remote environments.
Not only does it provide the fundamental framework for GUI environments, but it also carries out event handling and visual decorations.
5.1. X11
Since the X Window System is based on a client-server architecture, the client and the server needn’t be on the same machine. For that reason, we need a protocol that carries the messages between the client and the server. The X11 protocol is responsible for message delivery. When the client and the server are on the same machine, the messages are exchanged through UNIX sockets.
Apart from that, X11 is extensible. So, it’s easy to add new features without creating a new protocol or breaking the existing clients. One of the most useful extensions is XRender, which adds support for anti-aliased drawings.
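To get a feel for what travels over that UNIX socket, here’s a rough sketch that builds two X11 messages by hand: the initial connection setup and a query asking whether an extension such as XRender (registered under the name "RENDER") is present. The function names are ours, the socket location /tmp/.X11-unix is the conventional one rather than a guarantee, and the sketch sends no authorization data and doesn’t parse replies.

```python
import os
import struct


def x11_socket_path(display=None):
    """Map a local DISPLAY value such as ':0' or ':1.0' to its socket path."""
    if display is None:
        display = os.environ.get("DISPLAY", ":0")
    host, _, rest = display.rpartition(":")
    if host not in ("", "unix"):
        raise ValueError("not a local display: " + display)
    number = rest.split(".")[0]  # drop the optional screen suffix
    return "/tmp/.X11-unix/X" + number


def connection_setup_request():
    """Build the 12-byte message a client sends first (no authorization)."""
    # 'l' = little-endian, pad, protocol 11.0, two zero auth lengths, pad
    return struct.pack("<BxHHHHxx", ord("l"), 11, 0, 0, 0)


def query_extension_request(name):
    """Build a core QueryExtension request (opcode 98) for an extension."""
    encoded = name.encode("ascii")
    padding = (-len(encoded)) % 4  # requests are padded to 4-byte units
    length_in_words = 2 + (len(encoded) + padding) // 4
    header = struct.pack("<BxHHxx", 98, length_in_words, len(encoded))
    return header + encoded + b"\x00" * padding
```

A client would connect a `socket.AF_UNIX` stream socket to `x11_socket_path()`, send `connection_setup_request()`, read the server’s reply, and then send `query_extension_request("RENDER")`. In practice, Xlib or XCB handles all of this for us.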
5.2. Xlib and XCB
The X Library, or Xlib, is the client-side implementation of the X11 protocol. This library, in turn, is used by graphical toolkits like GTK+ and Qt to create the graphical front-end for the software application.
XCB, or X C-language Binding, is also a client-side implementation of the X11 protocol. However, it operates at a much lower level than Xlib, and parts of Xlib use XCB for some features.
5.3. X.Org Server
X.Org is the server-side implementation of the X Window System. It’s the most commonly used display server on Unix-like systems. The X.Org Server is typically started by a display manager or manually from the virtual terminal.

6. Cairo
While we can use Cairo directly, we primarily use it in drawing toolkits like GTK+. It also has support for rendering through OpenGL.
6.1. Pixman
The X server and Cairo each had their own implementation of pixel-level manipulation, which resulted in duplicated code. To resolve this issue, Pixman was developed. Pixman is a library shared by the X server and Cairo that provides rasterization algorithms, gradient support, and more.
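As an illustration of the kind of pixel-level operation such a library provides, here’s a sketch of the classic "over" compositing operator on a single 32-bit pixel. It’s written from scratch in plain Python and isn’t Pixman’s API. It assumes pixels are stored as premultiplied ARGB with 8 bits per channel.

```python
def over(src, dst):
    """Composite one premultiplied ARGB32 pixel over another."""
    inverse_alpha = 255 - (src >> 24)
    result = 0
    for shift in (24, 16, 8, 0):  # alpha, red, green, blue
        s = (src >> shift) & 0xFF
        d = (dst >> shift) & 0xFF
        # result = src + dst * (1 - src_alpha), rounded to 8 bits
        channel = s + (d * inverse_alpha + 127) // 255
        result |= min(channel, 255) << shift
    return result
```

An opaque source pixel replaces the destination entirely, a fully transparent one leaves it untouched, and anything in between blends the two. A production library performs the same arithmetic on whole scanlines with SIMD instructions.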
7. Compositor
Each frame from each running window goes through the compositor. The compositor grabs the pixmap of the windows from the X server and renders it onto the OpenGL scene.
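Conceptually, the compositor’s job for each frame can be sketched as painting every window’s pixmap onto one shared scene in stacking order. The toy version below uses nested lists instead of OpenGL textures. The dictionary keys x, y, and pixmap are assumptions made purely for illustration.

```python
def compose(width, height, windows, background=0):
    """Paint window pixmaps onto one scene, bottom-most window first."""
    scene = [[background] * width for _ in range(height)]
    for window in windows:  # list order is the stacking order
        for row, line in enumerate(window["pixmap"]):
            for col, pixel in enumerate(line):
                x, y = window["x"] + col, window["y"] + row
                if 0 <= x < width and 0 <= y < height:  # clip to the screen
                    scene[y][x] = pixel
    return scene
```

Windows later in the list overwrite earlier ones where they overlap, which is exactly what we perceive as one window sitting on top of another. A real compositor does the same with textured quads and can apply transparency, shadows, or animations along the way.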

8. Wayland
Wayland is a newer protocol intended as the replacement for X. Wayland doesn’t rely on a separate display server. Instead, the compositor itself takes on the roles of display server and window manager for graphical applications: it handles input events through evdev and displays windows using the same stack we discussed. Wayland’s protocol is also based around UNIX sockets.
The Wayland client obtains a buffer that it shares with the compositor and draws into it using OpenGL, Cairo, or any other rendering module. The compositor can easily manipulate the buffer for visual effects before presenting it on the screen. So, in a sense, the compositor is both the display server and the compositor.
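To illustrate the UNIX-socket side of Wayland, here’s a small sketch that locates the compositor’s socket and encodes the first request most clients send, asking the display object for its registry of global objects. It assumes the conventional environment variables XDG_RUNTIME_DIR and WAYLAND_DISPLAY, with wayland-0 as the fallback name. The function names are ours.

```python
import os
import struct


def wayland_socket_path(environ=None):
    """Return the path of the compositor's socket for this session."""
    environ = os.environ if environ is None else environ
    name = environ.get("WAYLAND_DISPLAY", "wayland-0")
    if os.path.isabs(name):
        return name
    runtime_dir = environ.get("XDG_RUNTIME_DIR")
    if not runtime_dir:
        raise RuntimeError("XDG_RUNTIME_DIR is not set")
    return os.path.join(runtime_dir, name)


def get_registry_request(new_id=2):
    """Encode wl_display.get_registry: object 1, opcode 1, one new_id."""
    object_id, opcode, size = 1, 1, 12  # 8-byte header + 4-byte argument
    # Header: object id, then message size (upper 16 bits) and opcode (lower)
    return struct.pack("=III", object_id, (size << 16) | opcode, new_id)
```

A client would connect a stream socket to `wayland_socket_path()` and write `get_registry_request()`. The compositor then answers with events that announce its global objects. Real applications delegate this to libwayland-client or to a toolkit.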

8.1. XWayland
XWayland provides an X Server that runs under Wayland. Therefore, it’s a compatibility package for X applications during the transition to Wayland. However, it adds an extra layer between the X client and the screen, since the XWayland server is itself a client of the Wayland compositor.
9. GUI Toolkits
A GUI toolkit or GUI library contains the required functionality needed to create graphical interfaces and elements such as widgets, scenes, and event handlers. Some GUI toolkits are fully-featured frameworks that provide widgets, a graphical designer, and a development environment.
The GUI toolkit library is usually a wrapper around a low-level library such as Xlib or XCB. Therefore, it provides an easier way for us to develop graphical applications with additional styling and behavior.
Moreover, most mature GUI toolkits have an opinionated design. In other words, they implement their own markup languages, event systems, and state machines. The state machine is responsible for the management of complex programs that are reactive in nature.
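To see how an event system and a state machine work together, consider the following toy widget. It isn’t taken from any real toolkit. The class, state names, and event names are invented for illustration.

```python
class Button:
    """A toy widget whose behaviour is a state machine driven by events."""

    TRANSITIONS = {
        ("idle", "enter"): "hover",
        ("hover", "leave"): "idle",
        ("hover", "press"): "pressed",
        ("pressed", "release"): "hover",
        ("pressed", "leave"): "idle",
    }

    def __init__(self):
        self.state = "idle"
        self.handlers = []

    def connect(self, handler):
        """Register a callback to run when the button is clicked."""
        self.handlers.append(handler)

    def dispatch(self, event):
        """Feed one input event into the state machine."""
        next_state = self.TRANSITIONS.get((self.state, event), self.state)
        clicked = self.state == "pressed" and event == "release"
        self.state = next_state
        if clicked:
            for handler in self.handlers:
                handler()
```

Feeding the events enter, press, and release into `dispatch` triggers the connected handlers once. If the pointer leaves the button before the release, nothing fires, which mirrors how buttons behave in mature toolkits.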
There are a variety of GUI toolkits, each serving its own purpose. The most popular in use today are GTK+, Qt, wxWidgets, FLTK, and imgui. Let’s take a closer look at a couple of them.
9.1. GTK+
GTK+ or GIMP Toolkit is the toolkit of choice for Unix-like operating systems that use the X Window System and Wayland. It’s a stable GUI toolkit for building cross-platform and modern GUI applications. Some of the most popular GUI programs are developed with GTK, including GIMP, Mozilla Firefox, GNOME Desktop Environment, Inkscape, and Pidgin.
The GTK toolkit has subsystems for other backends as well, such as GDI for Windows and Quartz for macOS, which means that we can also develop programs for other platforms. Moreover, there are independent projects that provide additional programs to ease the development of GTK programs. One such project is Glade. Glade provides a graphical interface to easily design the program front-end. Some projects like Firefox have their own customized fork of GTK.
Although it was originally written in the C language, there are also bindings for other languages such as gtkmm for C++, PyGObject for Python, and gotk3 for Golang.
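As a quick taste of the Python binding, here’s a minimal sketch of a window with a single label. It assumes that PyGObject and GTK 3 are installed. The window title and label text are arbitrary.

```python
def main():
    """Open a small GTK window and run the main loop until it is closed."""
    # Imported lazily so that loading this file does not require GTK.
    import gi

    gi.require_version("Gtk", "3.0")
    from gi.repository import Gtk

    window = Gtk.Window(title="Hello from PyGObject")
    window.set_default_size(320, 120)
    window.add(Gtk.Label(label="Drawn by GTK"))
    window.connect("destroy", Gtk.main_quit)  # leave the main loop on close
    window.show_all()
    Gtk.main()
```

Calling `main()` inside a running X or Wayland session opens the window. Without a display, GTK can’t initialize.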
9.2. Qt
Qt is a cross-platform application development framework that provides a widget library and a complete set of additional functionality. Unlike GTK, Qt has its own designer called Qt Designer and an integrated development environment called Qt Creator.
Additionally, it defines its own implementation of networks, web sockets, multimedia, SQL, XML, and a web engine. We can easily port Qt programs to other platforms with little to no changes in the source code. Therefore, it’s the toolkit of choice for software that targets both embedded and desktop systems.
The Qt framework is implemented in the C++ language. However, there are bindings for other languages as well. It defines its own declarative markup language, QML, whose syntax resembles JSON with embedded JavaScript expressions. Together with Qt Style Sheets, which are modeled on CSS, it gives us a great deal of power to customize the widgets to our liking.
Some popular applications written in Qt include Autodesk Maya, Autodesk 3ds Max, Crytek CryEngine, DaVinci Resolve, and the K Desktop Environment (KDE).
10. The Role of OpenGL
As we saw, there’s no official native toolkit or a GUI library that is a silver bullet for developing a GUI application under Linux. We saw that each module in the graphics stack serves a unique purpose.
For the most part, in the graphics stack, we saw that all the graphics instructions pass through OpenGL. Therefore, it’s safe to say that we can develop a standalone application solely based on OpenGL. So, one can consider OpenGL to be kind of a native tool for developing graphical applications on Linux.
As an example, we can take a look at Blender, which is a cross-platform 3D modeling and animation tool. Blender doesn’t rely on other GUI toolkits and low-level client libraries, but it has its own widget library that is based entirely on OpenGL.
11. Conclusion
In this article, we discussed the graphics stack used on Unix-like operating systems. We started with the low-level components and worked our way to the high-level GUI toolkits that use the entire stack to render the graphics.
Finally, we briefly discussed the role of OpenGL as a native tool that is responsible for rendering graphical applications.