8.3 Structure of the VFS

Now that we are familiar with the basic structure of the VFS and the interface to users, we turn our
attention to the implementation details. A large number of sometimes very lengthy data structures are
involved in the implementation of the VFS interface. It is therefore best to sketch out a rough overview
of the components and how they are interlinked.

 

8.3.1 Structural Overview
The VFS consists of two components — files and filesystems — that need to be managed and abstracted.

File Representation
As noted above, inodes are the means of choice for representing file contents and associated metadata. In
theory, only one (albeit very long) data structure with all the requisite data would be needed to implement
this concept. In practice, the data load is spread over a series of smaller, clearly laid out structures
whose interplay is illustrated in Figure 8-3.
No fixed functions are used to abstract access to the underlying filesystems. Instead, function pointers
are required. These are held in two structures that group together related functions.

1. Inode Operations — Create links, rename files, generate new file entries in a directory, and
delete files.
2. File Operations — Act on the data contents of a file. They include obvious operations such
as read and write, but also operations such as setting file pointers and creating memory mappings.
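The grouping of function pointers into two tables can be illustrated with a small user-space sketch. None of the names below are the kernel's own: demo_file_ops, demo_inode_ops, the toyfs_* functions, and the demo_vfs_* wrappers are hypothetical stand-ins, and the real kernel structures contain many more members. The point is only that the generic layer calls through whatever pointers a filesystem has installed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_file;
struct demo_dir;

/* File operations: act on the data contents of a file. */
struct demo_file_ops {
    long (*read)(struct demo_file *f, char *buf, size_t len);
    long (*write)(struct demo_file *f, const char *buf, size_t len);
};

/* Inode operations: act on directory entries. */
struct demo_inode_ops {
    int (*create)(struct demo_dir *dir, const char *name);
    int (*unlink)(struct demo_dir *dir, const char *name);
};

struct demo_file {
    const struct demo_file_ops *ops;
    char data[64];
    size_t size;   /* bytes of valid data */
    size_t pos;    /* current file pointer */
};

struct demo_dir {
    const struct demo_inode_ops *ops;
    int nr_entries;
};

/* Implementations supplied by one hypothetical in-memory filesystem. */
static long toyfs_read(struct demo_file *f, char *buf, size_t len)
{
    size_t avail = f->size - f->pos;
    if (len > avail)
        len = avail;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return (long)len;
}

static long toyfs_write(struct demo_file *f, const char *buf, size_t len)
{
    size_t room = sizeof(f->data) - f->pos;
    if (len > room)
        len = room;
    memcpy(f->data + f->pos, buf, len);
    f->pos += len;
    if (f->pos > f->size)
        f->size = f->pos;
    return (long)len;
}

static int toyfs_create(struct demo_dir *dir, const char *name)
{
    (void)name;            /* the sketch only counts entries */
    dir->nr_entries++;
    return 0;
}

static int toyfs_unlink(struct demo_dir *dir, const char *name)
{
    (void)name;
    if (dir->nr_entries == 0)
        return -1;
    dir->nr_entries--;
    return 0;
}

static const struct demo_file_ops toyfs_file_ops = {
    .read = toyfs_read, .write = toyfs_write,
};
static const struct demo_inode_ops toyfs_inode_ops = {
    .create = toyfs_create, .unlink = toyfs_unlink,
};

/* Attach the filesystem's tables to a fresh file and directory. */
void demo_toyfs_init(struct demo_file *f, struct demo_dir *d)
{
    memset(f, 0, sizeof(*f));
    f->ops = &toyfs_file_ops;
    d->ops = &toyfs_inode_ops;
    d->nr_entries = 0;
}

/* Generic layer: knows nothing about toyfs, only about the tables. */
long demo_vfs_read(struct demo_file *f, char *buf, size_t len)
{
    assert(f && f->ops);
    return f->ops->read ? f->ops->read(f, buf, len) : -1;
}

long demo_vfs_write(struct demo_file *f, const char *buf, size_t len)
{
    assert(f && f->ops);
    return f->ops->write ? f->ops->write(f, buf, len) : -1;
}

int demo_vfs_create(struct demo_dir *d, const char *name)
{
    assert(d && d->ops);
    return d->ops->create ? d->ops->create(d, name) : -1;
}

int demo_vfs_unlink(struct demo_dir *d, const char *name)
{
    assert(d && d->ops);
    return d->ops->unlink ? d->ops->unlink(d, name) : -1;
}
```

A second filesystem would supply its own two tables. The demo_vfs_* wrappers would remain unchanged, which is the benefit of not hard-wiring fixed functions.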

Other structures in addition to the ones above are needed to hold the information associated with an
inode. Of particular significance is the data field that is linked with each inode and stores either the
contents of the file or a table of directory entries. Each inode also includes a pointer to the superblock
object of the underlying filesystem, which is used to perform operations such as the manipulation of the inodes
themselves (these operations are also implemented by a structure of function pointers, as we will see shortly).
The superblock can also supply information on filesystem features and limits.

Because opened files are always assigned to a specific system process, the kernel must store the connection
between the file and the process in its data structures. As discussed briefly in Chapter 2, the task
structure includes an element in which all opened files are held (via a roundabout route). This element is
an array that is accessed using the file descriptor as an index. The objects it contains are not only linked
with the inode of the corresponding file, but also have a pointer to an element of the dentry cache used to
speed lookup operations.
The individual filesystem implementations are also able to store their own data (that is not manipulated
by the VFS layer) in the VFS inode.
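The route from a process to an inode can be sketched as a chain of pointers. This is a simplified user-space model under stated assumptions: all demo_* types and functions are hypothetical, the descriptor table has a fixed size, and the file object reaches its inode through the dentry rather than holding a separate inode link. The fs_private member stands for the filesystem-specific data that the VFS layer does not interpret.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FDS 8

struct demo_inode {
    unsigned long ino;
    void *fs_private;            /* filesystem-specific data, opaque to the VFS */
};

struct demo_dentry {
    const char *name;            /* cached name used to speed up lookups */
    struct demo_inode *inode;
};

struct demo_file {
    struct demo_dentry *dentry;  /* opened file -> dentry -> inode */
    long pos;
};

struct demo_files {
    struct demo_file *fd_array[DEMO_MAX_FDS];  /* indexed by file descriptor */
};

struct demo_task {
    struct demo_files *files;    /* the "roundabout route" to the open files */
};

/* Store f in the lowest free slot and return its descriptor, or -1 if full. */
int demo_fd_install(struct demo_task *t, struct demo_file *f)
{
    assert(t && t->files);
    for (int fd = 0; fd < DEMO_MAX_FDS; fd++) {
        if (t->files->fd_array[fd] == NULL) {
            t->files->fd_array[fd] = f;
            return fd;
        }
    }
    return -1;
}

/* Follow task -> descriptor table -> file -> dentry -> inode. */
struct demo_inode *demo_fd_to_inode(const struct demo_task *t, int fd)
{
    assert(t && t->files);
    if (fd < 0 || fd >= DEMO_MAX_FDS)
        return NULL;
    struct demo_file *f = t->files->fd_array[fd];
    if (f == NULL || f->dentry == NULL)
        return NULL;
    return f->dentry->inode;
}
```

Because the descriptor is only an index into a per-task array, the same number can refer to different files in different processes.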

Filesystem and Superblock Information

The supported filesystem types are linked by means of a special kernel object that features a method of
reading the superblock. As well as key information on the filesystem (block size, maximum file size, etc.),
the superblock contains function pointers to read, write, and manipulate inodes.
The kernel also creates a list of the superblock instances of all active filesystems. I use the term active
instead of mounted because, in certain circumstances, it is possible to use a single superblock for several
mount points.

Whereas each filesystem appears just once in file_system_type, there may be
several instances of a superblock for the same filesystem type in the list of
superblock instances because several filesystems of the same type can be stored on
various block devices or partitions. Most systems have, for example, both a root and
a home filesystem, which may reside on different partitions of the hard disk but
normally use the same filesystem type. Only one occurrence of the filesystem type
need appear in file_system_type, but the two mounts have different superblocks
even though the same filesystem type is used in both cases.
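The relationship between the two lists can be modeled in a few lines. The sketch below is not kernel code: demo_fs_type, demo_super_block, demo_register_fs, demo_mount, and the toyfs filesystem are hypothetical names, and the block size of 4096 is an arbitrary value chosen for illustration. It shows one registration per filesystem type and one superblock instance per mounted device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_super_block;

struct demo_fs_type {
    const char *name;
    /* Method that reads (here: fills in) the superblock for a device. */
    int (*read_super)(struct demo_super_block *sb, const char *dev);
    struct demo_fs_type *next;
};

struct demo_super_block {
    const struct demo_fs_type *type;
    const char *dev;
    unsigned long block_size;
    struct demo_super_block *next;
};

static struct demo_fs_type *fs_types;          /* one entry per type */
static struct demo_super_block *super_blocks;  /* one entry per instance */

struct demo_fs_type *demo_find_fs(const char *name)
{
    for (struct demo_fs_type *t = fs_types; t; t = t->next)
        if (strcmp(t->name, name) == 0)
            return t;
    return NULL;
}

/* Register a type; a second registration under the same name is refused. */
int demo_register_fs(struct demo_fs_type *t)
{
    assert(t && t->name);
    if (demo_find_fs(t->name))
        return -1;
    t->next = fs_types;
    fs_types = t;
    return 0;
}

/* Create a superblock instance of the named type for one device. */
int demo_mount(const char *fs_name, const char *dev,
               struct demo_super_block *sb)
{
    struct demo_fs_type *t = demo_find_fs(fs_name);
    if (!t || t->read_super(sb, dev) != 0)
        return -1;
    sb->type = t;
    sb->dev = dev;
    sb->next = super_blocks;
    super_blocks = sb;
    return 0;
}

int demo_count_types(void)
{
    int n = 0;
    for (struct demo_fs_type *t = fs_types; t; t = t->next)
        n++;
    return n;
}

int demo_count_supers(const char *fs_name)
{
    int n = 0;
    for (struct demo_super_block *sb = super_blocks; sb; sb = sb->next)
        if (strcmp(sb->type->name, fs_name) == 0)
            n++;
    return n;
}

/* A hypothetical filesystem type; 4096 is an illustrative block size. */
static int toyfs_read_super(struct demo_super_block *sb, const char *dev)
{
    (void)dev;
    sb->block_size = 4096;
    return 0;
}

struct demo_fs_type demo_toyfs = {
    .name = "toyfs", .read_super = toyfs_read_super, .next = NULL,
};
```

Mounting a root and a home partition with this model leaves one entry in the type list and two entries in the superblock list, which mirrors the situation described in the note above.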

An important element of the superblock structure is a list with all modified inodes of the relevant filesystem
(the kernel refers to these rather disrespectfully as dirty inodes). Files and directories that have been
modified are easily identified by reference to this list so that they can be written back to the storage
medium. Writeback must be coordinated and kept to a necessary minimum because it is a very time-consuming
operation (hard disks, floppy disk drives, and other media are very slow compared to
other system components). On the other hand, it is fatal to write back modified data too infrequently
because a system crash (or, more likely in the case of Linux, a power outage) results in irrecoverable data
loss. The kernel scans the list of dirty inodes at periodic intervals and transfers changes to the underlying
hardware.
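The bookkeeping behind dirty inodes can be sketched as a per-superblock linked list that a periodic pass drains. This is a minimal user-space model, not the kernel's writeback code: the demo_* names are hypothetical, the actual write is passed in as a callback, and the timer that would trigger the pass at intervals is left out.

```c
#include <assert.h>
#include <stddef.h>

struct demo_inode {
    unsigned long ino;
    int dirty;                       /* nonzero while queued for writeback */
    struct demo_inode *next_dirty;
};

struct demo_super_block {
    struct demo_inode *dirty_list;   /* all modified inodes of this filesystem */
};

/* Queue an inode for writeback; marking it twice queues it only once. */
void demo_mark_dirty(struct demo_super_block *sb, struct demo_inode *inode)
{
    assert(sb && inode);
    if (inode->dirty)
        return;
    inode->dirty = 1;
    inode->next_dirty = sb->dirty_list;
    sb->dirty_list = inode;
}

/*
 * One writeback pass: try to write every queued inode. Inodes written
 * successfully (callback returns 0) leave the list; the others stay
 * queued for the next pass. Returns the number of inodes written.
 */
int demo_writeback(struct demo_super_block *sb,
                   int (*write_inode)(struct demo_inode *))
{
    assert(sb && write_inode);
    int written = 0;
    struct demo_inode **link = &sb->dirty_list;

    while (*link) {
        struct demo_inode *inode = *link;
        if (write_inode(inode) == 0) {
            *link = inode->next_dirty;   /* unlink from the dirty list */
            inode->next_dirty = NULL;
            inode->dirty = 0;
            written++;
        } else {
            link = &inode->next_dirty;   /* keep it and move on */
        }
    }
    return written;
}

/* Stand-in for a real write to the storage medium; always succeeds. */
int demo_write_inode_ok(struct demo_inode *inode)
{
    (void)inode;
    return 0;
}

/* Stand-in for a failing device; the inode remains dirty. */
int demo_write_inode_fail(struct demo_inode *inode)
{
    (void)inode;
    return -1;
}
```

Keeping the list on the superblock means a writeback pass touches only inodes that actually changed, instead of scanning every inode of the filesystem.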

posted @ 2014-09-23 09:47  诺记老周