Linux · August 25, 2025

Mastering Linux Filesystems: A Technical Guide to Inodes and ext4 Structure

Introduction

Linux’s “everything is a file” philosophy means that the filesystem interface covers far more than regular files: character devices, block devices, pipes, inter-process communication, and networking are all reached through it. This article introduces the fundamentals of Linux filesystems, focusing on the inode structure and the on-disk architecture of ext4. It is a technical breakdown for system administrators and developers who want to understand filesystem mechanics.

Filesystem Fundamentals

Filesystems are designed to meet essential requirements:

  • Ease of Use: Enable intuitive file reading/writing while preventing naming conflicts.
  • Organization: Facilitate efficient file lookup and categorization.
  • Management: Track file usage across processes.

To address these needs, Linux filesystems incorporate:

  • Tree Structure: Organizes files in a hierarchical directory structure.
  • Caching: Accelerates access to frequently used files.
  • Indexing: Enhances file lookup efficiency.
  • Metadata Tracking: Maintains data structures to monitor file usage.

These principles form the foundation of Linux’s robust filesystem architecture.

The Inode Structure and ext4 Filesystem

1. Block Storage and Inodes

The inode (index node) is the cornerstone of Linux filesystems, representing metadata for files or directories stored on disk. It encapsulates critical information such as permissions, ownership, and block locations.

Inode Structure

The ext4_inode structure, defined in the Linux kernel, includes:

Field          Description
i_mode         File type and permissions
i_uid          Owner user ID (low 16 bits)
i_gid          Group ID (low 16 bits)
i_size_lo      File size in bytes (low 32 bits)
i_blocks_lo    Allocated block count (low 32 bits), normally counted in 512-byte units
i_atime        Last access time
i_mtime        Last modification time
i_ctime        Last inode change time
i_dtime        Deletion time (if applicable)
i_block        Array of block pointers (EXT4_N_BLOCKS entries)

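  • Source: include/linux/ext4_fs.h in early ext4 kernels; current kernels define the structure in fs/ext4/ext4.h.
  • Key Field: i_block stores pointers to data blocks and has EXT4_N_BLOCKS (15) entries. In ext2/ext3, the first 12 entries (EXT4_NDIR_BLOCKS) directly reference data blocks, typically 4KB each. For larger files, the remaining three entries point to a single-indirect block (EXT4_IND_BLOCK), a double-indirect block (EXT4_DIND_BLOCK), and a triple-indirect block (EXT4_TIND_BLOCK), giving hierarchical block addressing.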

Limitation: Indirect block addressing requires multiple disk accesses for large files, slowing performance.
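
The cost described above can be made concrete with a small sketch. The function below is illustrative only (the name indirect_levels and the NDIR_BLOCKS macro are made up for this example, not kernel symbols); it reports how many indirect blocks must be read before the data block for a given logical block number can be found, assuming 4-byte block pointers.

```c
#include <assert.h>
#include <stdint.h>

#define NDIR_BLOCKS 12  /* direct pointers in i_block, mirrors EXT4_NDIR_BLOCKS */

/*
 * Number of indirect blocks that must be read to locate logical block
 * `lblock`: 0 for a direct pointer, up to 3 for triple-indirect.
 * Returns -1 if the block lies beyond what the scheme can address.
 * `ptrs_per_block` is block_size / 4, e.g. 1024 for 4KB blocks.
 */
int indirect_levels(uint64_t lblock, uint64_t ptrs_per_block)
{
    uint64_t single, dbl, triple;

    assert(ptrs_per_block > 0);
    single = ptrs_per_block;
    dbl = single * ptrs_per_block;
    triple = dbl * ptrs_per_block;

    if (lblock < NDIR_BLOCKS)
        return 0;
    lblock -= NDIR_BLOCKS;
    if (lblock < single)
        return 1;
    lblock -= single;
    if (lblock < dbl)
        return 2;
    lblock -= dbl;
    if (lblock < triple)
        return 3;
    return -1;
}
```

With 4KB blocks, indirect_levels(11, 1024) is 0 while indirect_levels(12, 1024) is already 1, so every block past the first 48KB of a file costs at least one extra read when the indirect block is not cached.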

Extents in ext4

To optimize large file access, ext4 introduces extents, a tree-based structure for contiguous block storage. The key components include:

  • ext4_extent_header: Metadata for extent trees.
    • eh_entries: Number of valid entries.
    • eh_depth: Depth of the tree below this node (0 means the entries are leaf extents that point directly at data blocks, as in small files).
  • ext4_extent: Leaf node pointing to contiguous disk blocks.
  • ext4_extent_idx: Index node pointing to lower-level nodes.

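For small files, the 60-byte i_block area holds the header and up to four extents. With 4KB blocks, each extent covers at most 32,768 blocks, or 128MB. When a file needs more extents than fit in the inode, ext4 builds a tree: i_block then holds index entries (ext4_extent_idx) that point to blocks full of extents. A single 4KB leaf block holds 340 extents and can therefore map up to 42.5GB. Deeper trees handle even larger files.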
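
The figures in this paragraph follow from simple arithmetic, sketched below. The function and macro names are hypothetical; the sizes assumed are a 12-byte header, 12-byte entries, and a maximum of 32,768 blocks per initialized extent.

```c
#include <assert.h>
#include <stdint.h>

#define EXTENT_HEADER_SIZE 12u     /* size of ext4_extent_header */
#define EXTENT_ENTRY_SIZE  12u     /* size of ext4_extent and ext4_extent_idx */
#define MAX_EXTENT_BLOCKS  32768u  /* 2^15 blocks per initialized extent */

/* How many extent entries fit after the header in a region of `bytes` bytes. */
uint32_t extent_entries_in(uint32_t bytes)
{
    assert(bytes >= EXTENT_HEADER_SIZE);
    return (bytes - EXTENT_HEADER_SIZE) / EXTENT_ENTRY_SIZE;
}

/* Largest byte range a single extent can map for a given block size. */
uint64_t max_extent_bytes(uint32_t block_size)
{
    return (uint64_t)MAX_EXTENT_BLOCKS * block_size;
}

/* Largest byte range one full leaf block of extents can map. */
uint64_t leaf_block_capacity(uint32_t block_size)
{
    return extent_entries_in(block_size) * max_extent_bytes(block_size);
}
```

For the 60-byte i_block area, extent_entries_in(60) gives 4. For a 4KB block, extent_entries_in(4096) gives 340, and 340 extents of 128MB each come to 42.5GB.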

Source: include/linux/ext4_fs_extents.h in early ext4 kernels; fs/ext4/ext4_extents.h in current kernels.

Example:

struct ext4_extent_header {
    __le16 eh_magic;    /* Format identifier */
    __le16 eh_entries;  /* Number of valid entries */
    __le16 eh_max;      /* Maximum entries capacity */
    __le16 eh_depth;    /* Tree depth */
    __le32 eh_generation; /* Tree generation */
};

struct ext4_extent {
    __le32 ee_block;    /* First logical block */
    __le16 ee_len;      /* Number of blocks */
    __le16 ee_start_hi; /* High 16 bits of physical block */
    __le32 ee_start_lo; /* Low 32 bits of physical block */
};

struct ext4_extent_idx {
    __le32 ei_block;    /* Logical block index */
    __le32 ei_leaf_lo;  /* Pointer to next level */
    __le16 ei_leaf_hi;  /* High 16 bits of pointer */
    __u16 ei_unused;
};
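
As a rough illustration of how these fields are used, the sketch below combines ee_start_hi and ee_start_lo into a 48-bit physical block number and maps a logical block through one extent. It works on a host-endian copy of the structure (struct extent and both function names are invented for this example) and assumes an initialized extent; real code would first convert the little-endian on-disk fields.

```c
#include <assert.h>
#include <stdint.h>

/* Host-endian copy of the ext4_extent fields (same order and widths). */
struct extent {
    uint32_t ee_block;     /* first logical block covered */
    uint16_t ee_len;       /* number of blocks covered */
    uint16_t ee_start_hi;  /* high 16 bits of physical block */
    uint32_t ee_start_lo;  /* low 32 bits of physical block */
};

/* Physical block number where the extent starts (48 bits wide). */
uint64_t extent_start(const struct extent *ex)
{
    assert(ex != 0);
    return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start_lo;
}

/*
 * Translate logical block `lblock` through one extent.
 * Returns 0 and stores the physical block in *pblock on success,
 * or -1 if the extent does not cover `lblock`.
 */
int extent_map(const struct extent *ex, uint32_t lblock, uint64_t *pblock)
{
    assert(ex != 0 && pblock != 0);
    if (lblock < ex->ee_block || lblock - ex->ee_block >= ex->ee_len)
        return -1;
    *pblock = extent_start(ex) + (lblock - ex->ee_block);
    return 0;
}
```

Because the mapping is a single addition, one 12-byte extent replaces what would otherwise be one pointer per block under indirect addressing.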

Inode Allocation

Each block group keeps an inode bitmap in which a set bit marks an allocated inode and a clear bit marks a free one. The ext4_new_inode function reads the bitmap of a candidate group and searches it for the next zero bit:

struct inode *ext4_new_inode(...) {
    ...
    inode_bitmap_bh = ext4_read_inode_bitmap(sb, group);
    ino = ext4_find_next_zero_bit((unsigned long *)inode_bitmap_bh->b_data,
                                  EXT4_INODES_PER_GROUP(sb), ino);
    ...
}
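
The bitmap search itself is easy to reproduce in userspace. The following is a minimal sketch, not the kernel implementation: find_next_zero_bit and alloc_inode_slot are written here for illustration, and bit i is assumed to live in byte i / 8 at bit position i % 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the first clear bit in [start, nbits), or nbits if all are set. */
size_t find_next_zero_bit(const uint8_t *bitmap, size_t nbits, size_t start)
{
    assert(bitmap != NULL);
    for (size_t i = start; i < nbits; i++)
        if (!(bitmap[i / 8] & (1u << (i % 8))))
            return i;
    return nbits;
}

/*
 * Claim the first free slot in a group's inode bitmap.
 * Returns the bit index, or nbits if the group is full.
 * The inode number would be group * inodes_per_group + index + 1.
 */
size_t alloc_inode_slot(uint8_t *bitmap, size_t nbits)
{
    size_t bit = find_next_zero_bit(bitmap, nbits, 0);

    if (bit < nbits)
        bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
    return bit;
}
```

The kernel version additionally has to lock the group, journal the bitmap change, and retry in another group when the current one is full.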

2. Filesystem Structure

The ext4 filesystem is built upon inodes and blocks, organized into higher-level structures:

  • Block Group: A unit of storage described by ext4_group_desc, which records where the group’s metadata lives:
    • bg_block_bitmap_lo: Location of the block bitmap.
    • bg_inode_bitmap_lo: Location of the inode bitmap.
    • bg_inode_table_lo: Location of the inode table.
  • Block Group Descriptor Table: Aggregates descriptors for multiple block groups.
  • Superblock: Stores global filesystem metadata (ext4_super_block), including:
    • Total inodes (s_inodes_count).
    • Total blocks (s_blocks_count_lo).
    • Inodes per group (s_inodes_per_group).
    • Blocks per group (s_blocks_per_group).
  • Boot Block: The first 1KB of block group 0 is reserved for bootloader data, so the superblock begins at byte offset 1024.
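
The superblock counts listed above are enough to find any inode on disk. The sketch below shows the usual arithmetic under a few assumptions: inode numbers start at 1, the helper names are hypothetical, and the first_data_block and inode_size parameters stand for the superblock fields s_first_data_block and s_inode_size, which are not covered in this article.

```c
#include <assert.h>
#include <stdint.h>

struct inode_location {
    uint32_t group;        /* block group that holds the inode */
    uint32_t index;        /* slot within that group's inode table */
    uint64_t byte_offset;  /* offset of the slot from the start of the table */
};

/* Locate inode `ino` given s_inodes_per_group and the on-disk inode size. */
struct inode_location locate_inode(uint32_t ino, uint32_t inodes_per_group,
                                   uint16_t inode_size)
{
    struct inode_location loc;

    assert(ino >= 1 && inodes_per_group > 0);
    loc.group = (ino - 1) / inodes_per_group;
    loc.index = (ino - 1) % inodes_per_group;
    loc.byte_offset = (uint64_t)loc.index * inode_size;
    return loc;
}

/* Number of block groups: total blocks divided by blocks per group, rounded up. */
uint32_t group_count(uint64_t blocks_count, uint32_t first_data_block,
                     uint32_t blocks_per_group)
{
    assert(blocks_per_group > 0 && blocks_count >= first_data_block);
    return (uint32_t)((blocks_count - first_data_block + blocks_per_group - 1)
                      / blocks_per_group);
}
```

For example, with 8,192 inodes per group and 256-byte inodes, inode 12 sits in group 0 at slot 11, which is 2,816 bytes into that group’s inode table. The table’s starting block comes from bg_inode_table_lo in the group descriptor.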

Backup Strategies:

  • Default: Superblock and descriptor table backups in every block group.
  • Sparse Super: Backups only in block groups 0 and 1 and in groups whose number is a power of 3, 5, or 7 (3, 5, 7, 9, 25, 27, 49, ...).
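  • Meta Block Groups: Groups block groups into sets of 64 (the number of 64-byte descriptors that fit in one 4KB block), each with a descriptor table covering only its own groups, which saves space and removes the need for one large descriptor table at the start of the filesystem.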
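
The sparse super rule can be expressed as a short predicate. This is a sketch of the rule as commonly documented (groups 0 and 1, plus powers of 3, 5, and 7); the function names are chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if n is base^k for some k >= 0. */
static bool is_power_of(uint32_t n, uint32_t base)
{
    assert(base > 1);
    while (n > 1) {
        if (n % base != 0)
            return false;
        n /= base;
    }
    return n == 1;
}

/* True if block group `group` carries a superblock copy under sparse super. */
bool group_has_superblock(uint32_t group)
{
    if (group <= 1)
        return true;  /* group 0 holds the primary copy, group 1 a backup */
    return is_power_of(group, 3) || is_power_of(group, 5) ||
           is_power_of(group, 7);
}
```

On a filesystem with 100 block groups this leaves copies in groups 0, 1, 3, 5, 7, 9, 25, 27, 49, and 81 instead of all 100.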

3. Directory Storage

Directories are files with inodes pointing to blocks containing file metadata (ext4_dir_entry_2):

Field        Description
inode        Inode number of the file
rec_len      Entry length
name_len     File name length
file_type    File type (e.g., regular, directory)
name         File name (up to EXT4_NAME_LEN)

  • Linear Storage: Entries are stored as a list, with . (current directory) and .. (parent directory) as the first entries.
  • Indexed Storage: For large directories, setting the EXT4_INDEX_FL flag enables a hash-based index tree (dx_root):
    • dx_root_info: Tracks index levels (indirect_levels).
    • dx_entry: Maps filename hashes to data blocks.
    • Leaf nodes contain ext4_dir_entry_2 lists, enabling faster lookups.

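Source: include/linux/ext4_fs.h in early ext4 kernels; current kernels define ext4_dir_entry_2 in fs/ext4/ext4.h and the dx_* index structures in fs/ext4/namei.c.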
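
A linear directory lookup amounts to hopping from entry to entry by rec_len. The sketch below walks one directory block held in memory; struct dirent_hdr, dir_block_lookup, and dir_block_add are names invented for this example, and the fields are read in host byte order, which matches the on-disk layout only on a little-endian machine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fixed 8-byte header that precedes each name, mirroring ext4_dir_entry_2. */
struct dirent_hdr {
    uint32_t inode;
    uint16_t rec_len;
    uint8_t  name_len;
    uint8_t  file_type;
};

/* Scan one directory block for `name`; return its inode number, or 0. */
uint32_t dir_block_lookup(const uint8_t *block, size_t block_size,
                          const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + sizeof(struct dirent_hdr) <= block_size) {
        struct dirent_hdr h;

        memcpy(&h, block + off, sizeof h);
        if (h.rec_len < sizeof h || off + h.rec_len > block_size)
            return 0;  /* malformed entry: stop rather than run off the block */
        if (h.inode != 0 && h.name_len == want &&
            sizeof h + h.name_len <= h.rec_len &&
            memcmp(block + off + sizeof h, name, want) == 0)
            return h.inode;
        off += h.rec_len;  /* rec_len leads to the next entry */
    }
    return 0;
}

/* Demo helper: write one entry at `off` and return the offset after it. */
size_t dir_block_add(uint8_t *block, size_t off, uint32_t inode,
                     uint8_t file_type, const char *name, uint16_t rec_len)
{
    struct dirent_hdr h = { inode, rec_len, (uint8_t)strlen(name), file_type };

    assert(rec_len % 4 == 0 && rec_len >= sizeof h + h.name_len);
    memcpy(block + off, &h, sizeof h);
    memcpy(block + off + sizeof h, name, h.name_len);
    return off + rec_len;
}
```

Every lookup in this form is a scan of all entries before the match, which is why large directories benefit from the hash index described above.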

4. Soft and Hard Links

  • Hard Links:
    • Share the same inode as the original file.
    • Restricted to the same filesystem (inodes are filesystem-specific).
    • Created with ln source target.
  • Soft Links:
    • Independent files with their own inodes, pointing to another file’s path.
    • Support cross-filesystem linking.
    • Continue to exist if the target is deleted, but become dangling: the link no longer resolves to a file.
    • Created with ln -s source target.
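
The difference between the two link types can be observed from a program with the POSIX calls link() and symlink(), which correspond to ln and ln -s. The function below is a sketch under some assumptions: link_demo is a made-up name, /tmp is taken to be a writable directory on a filesystem that supports both link types, and error handling is kept minimal.

```c
#define _XOPEN_SOURCE 700  /* expose mkdtemp, lstat, and symlink */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if hard and soft links behave as described above, -1 otherwise. */
int link_demo(void)
{
    char dir[] = "/tmp/linkdemoXXXXXX";
    char orig[64], hard[64], soft[64];
    struct stat so, sh, sl;
    int fd, n, ok;

    if (mkdtemp(dir) == NULL)
        return -1;
    n = snprintf(orig, sizeof orig, "%s/orig", dir);
    assert(n > 0 && (size_t)n < sizeof orig);
    snprintf(hard, sizeof hard, "%s/hard", dir);
    snprintf(soft, sizeof soft, "%s/soft", dir);

    fd = open(orig, O_CREAT | O_WRONLY | O_EXCL, 0600);
    if (fd < 0) {
        rmdir(dir);
        return -1;
    }
    close(fd);

    ok = link(orig, hard) == 0                /* like: ln orig hard */
      && symlink(orig, soft) == 0             /* like: ln -s orig soft */
      && stat(orig, &so) == 0 && stat(hard, &sh) == 0
      && lstat(soft, &sl) == 0
      && so.st_ino == sh.st_ino               /* hard link shares the inode */
      && so.st_nlink == 2                     /* two names, one inode */
      && S_ISLNK(sl.st_mode)
      && sl.st_ino != so.st_ino               /* symlink has its own inode */
      && unlink(orig) == 0                    /* remove the original name */
      && stat(hard, &sh) == 0
      && sh.st_nlink == 1                     /* file survives via hard link */
      && stat(soft, &so) != 0;                /* symlink is now dangling */

    /* Best-effort cleanup; some of these may already be gone. */
    unlink(hard);
    unlink(soft);
    unlink(orig);
    rmdir(dir);
    return ok ? 0 : -1;
}
```

Note the use of lstat() for the symlink: stat() follows the link and would report the target’s inode instead of the link’s own.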

Conclusion

The Linux filesystem, exemplified by ext4, is built around inodes. Inodes hold file metadata and block pointers, and ext4 improves large-file performance by replacing indirect block pointers with extents. Higher-level structures such as block groups, the superblock, and the group descriptor table organize storage and keep redundant copies of critical metadata. Directories use linear storage by default and a hash-indexed tree for fast lookups in large directories, while hard and soft links provide flexible ways to reference files. Understanding this layout is useful to developers and administrators alike.