A Demystifying Introduction to Initrd and Initramfs

 

On Linux systems, an initrd (initial RAM disk) or an initramfs (initial RAM file system) loads a temporary root file system into memory as part of the startup process. Both are commonly used to make preparations before the real root file system can be mounted. An initramfs contains the same set of directories that you would find on a normal root file system, bundled into a single cpio archive and compressed with one of several compression algorithms.

Why Initrd and Initramfs came into existence

  • Many Linux distributions create a single, generic Linux kernel image to boot on a wide variety of hardware.
  • The device drivers for this generic kernel image are included as loadable kernel modules, because statically compiling many drivers into one kernel makes the kernel image much larger, possibly too large to boot on computers with limited memory, and sometimes even causes boot crashes.
  • Often the root file system is on an LVM volume, on NFS (for diskless workstations), or on an encrypted partition. All of these require special preparation before they can be mounted.
  • To avoid hardcoding the handling of so many special cases into the kernel, a temporary root file system is used during the initial boot stage, known as early user space. This temporary root file system contains the user-space tools that perform hardware detection, module loading, and device discovery, which are needed to mount the actual root file system.
  • So in a broader sense, the Initramfs or Initrd has essentially one purpose: locating and mounting the real root file system so that the boot process can transition to it.
  • Take mounting an encrypted partition as an example. A configuration that uses crypto devices needs a user-space utility to ask the user for a passphrase and then tell the kernel to set up the device. This creates a chicken-and-egg problem: the root file system contains the user-space utilities, but the root file system cannot be mounted until those utilities have run. The Initramfs acts as a mediator here, providing a temporary root file system that holds the user-space utilities needed to mount the real one.

Linux boot process with Initrd or Initramfs

  • The image of this initial root file system (along with the kernel image) must be stored somewhere accessible to the Linux bootloader or the boot firmware of the computer.
  • The image can be stored on the root file system itself, in a boot image on an optical disc, on a small partition on a local disk (usually using ext4 or FAT file systems), or on a TFTP server (for systems that can boot from Ethernet).
  • The bootloader loads the kernel and the initial root file system image into memory and then starts the kernel, passing it the memory address of the image.
    • U-Boot Example: bootz ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr}
  • Near the end of its own boot sequence, the kernel tries to determine the format of the image, which can be either an Initrd or an Initramfs, from its first few blocks of data.
     
  • In case of Initrd:
    • The initrd image is a file system image (optionally compressed) that is made available on a special block device (/dev/ram), which is then mounted as the initial root file system.
    • Note: The device driver for this file system must be compiled statically into the kernel.
    • Many distributions use compressed ext2 file system images, while some (such as older Debian releases) use cramfs (Compressed ROM/RAM file system) in order to boot on memory-constrained systems, since a cramfs image can be mounted in place without requiring extra space for decompression.
    • Once the initial root file system is up, the kernel executes /linuxrc as its first process.
      • /linuxrc must be located in the root directory of the Initrd and must be executable; the kernel runs it with root permissions.
      • If /linuxrc is dynamically linked, all the shared libraries it requires must also be available in /lib on the Initrd.
      • /linuxrc can also be a shell script, in which case a shell must exist in /bin.
      • In SUSE Linux, a statically-linked /linuxrc is used to keep initrd as small as possible.
    • When the /linuxrc process exits, the kernel assumes that the real root file system has been mounted and executes /sbin/init to begin the normal user-space boot process.
    • On an Initrd, the new root is mounted at a temporary mount point and rotated into place with pivot_root(8), which changes the root file system. This leaves the initial root file system at a mount point (such as /initrd), where normal boot scripts can later unmount it to free the memory held by the Initrd.
       
  • For Initramfs:
    • Initramfs has been available since Linux kernel 2.6.13.
    • The Initramfs image is a cpio archive (optionally compressed).
    • The archive is unpacked by the kernel into a special instance of a tmpfs that becomes the initial root file system.
    • Note: The Initramfs has the advantage over the Initrd of not requiring an intermediate file system or block device drivers to be compiled into the kernel.
    • Red Hat-based distributions use the dracut package to create an initramfs image.
    • With Initramfs, the kernel executes /init as its first process, and that process is not expected to exit. The /init program is typically a shell script.
    • For some applications, such as Ubuntu live CDs, the Initramfs can use the casper utility to create a writable environment, using unionfs to overlay a persistence layer on a read-only root file system image. For example, overlay data can be stored on a USB flash drive, while a compressed SquashFS read-only image stored on a live CD acts as the root file system.
    • Note: On an Initramfs, the initial root file system cannot be rotated away with pivot_root(8) as it can on an Initrd. Instead, the Initramfs is simply emptied and the final root file system is mounted over the top.
       
  • The kernel can unpack Initrd or Initramfs images compressed with gzip, bzip2, LZMA, XZ, LZO, LZ4 and zstd.
  • Initrd and Initramfs typically implement /linuxrc or /init as a shell script.
  • Both include a minimal shell (usually /bin/ash) along with some essential user-space utilities (usually BusyBox toolkit).
  • To save further space, the shell, utilities and their supporting libraries are typically compiled with space optimizations enabled (such as the gcc -Os flag) and linked against klibc, a minimal version of the libc library written specifically for this purpose.
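
The format detection described above can be approximated in user space by checking the leading bytes of an image. The following is a minimal sketch, not the kernel's implementation: the function name detect_image_format and the SIGNATURES table are hypothetical, the signatures are the commonly documented magic numbers of each format, and the LZMA entry is only a heuristic because that format has no true magic number.

```python
import gzip

# Leading bytes that identify each format. These are the commonly documented
# magic numbers; the two cpio entries cover an uncompressed initramfs archive.
SIGNATURES = [
    (bytes.fromhex("1f8b"), "gzip"),
    (b"BZh", "bzip2"),
    (bytes.fromhex("5d00"), "lzma"),          # heuristic: typical header bytes
    (bytes.fromhex("fd377a585a00"), "xz"),
    (bytes.fromhex("894c5a4f"), "lzo"),       # lzop container
    (bytes.fromhex("02214c18"), "lz4"),       # legacy LZ4 frame
    (bytes.fromhex("28b52ffd"), "zstd"),
    (b"070701", "cpio"),                      # newc
    (b"070702", "cpio"),                      # newc with checksum
]


def detect_image_format(head: bytes) -> str:
    """Guess the format of an initrd/initramfs image from its first bytes."""
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    # Anything else may be a raw file system image (for example ext2).
    return "unknown"


if __name__ == "__main__":
    print(detect_image_format(gzip.compress(b"demo")))
```

To inspect a real image, read the first few bytes of the file in binary mode and pass them to the function.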

A few more details about:

Initial RAM Disk (Initrd):

  • Initrd lets the bootloader load a RAM disk image, which the kernel then mounts as the root file system so that user-space programs can be run from it.
  • Afterwards, a new larger Root File System can be mounted from a different device. The previous root (from initrd) is then moved to a directory and can be subsequently unmounted.
  • Initrd is mainly designed to allow system startup to occur in two phases, where the kernel comes up with a minimum set of compiled-in drivers, and where additional modules are loaded from initrd.
     
  • IMPORTANT: The Initrd mechanism is deprecated. The kernel logs a warning when it is used, and that warning named 2021 as the planned removal date, so new setups should use Initramfs instead.

Init RAM File System (Initramfs):

  • From Linux kernel version 2.6 onwards, every kernel image contains a gzipped cpio-format archive, which is extracted into ROOTFS when the kernel boots up.
  • An Initramfs archive is a complete self-contained ROOTFS for Linux.
  • The Linux Kernel version 2.6 build process always creates a gzipped cpio format initramfs archive and links it into the resulting kernel binary. By default, this archive is empty (consuming 134 bytes on x86).
  • After extracting the archive, the kernel checks whether ROOTFS contains a file /init, and if so executes it as PID=1. This init process is responsible for bringing the rest of the system up, including locating and mounting the real root device (if any).
  • If ROOTFS does not contain an init program after the embedded cpio archive is extracted into it, the kernel will try to locate and mount a ROOT partition, then exec some variant of /sbin/init out of that.
  • Using External initramfs images:
    • If the kernel has initrd support enabled, an external cpio.gz archive can also be passed into a v2.6 kernel in place of an Initrd. The kernel will autodetect the type as Initramfs, and extract the external cpio archive as a ROOTFS before trying to run /init.
    • Packaging the Initramfs separately has the advantage that you can run non-GPL code from within the Initramfs without conflating it with the GPL-licensed Linux kernel binary.
    • External Initramfs can supplement the kernel’s built-in initramfs image. The external archive will overwrite any conflicting files in the built-in initramfs archive.
  • An Initramfs usually contains either klibc or uClibc, which are C libraries designed for statically linking early user-space code, along with some related utilities. It often also contains BusyBox.
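
To make the "gzipped cpio archive" concrete, here is a minimal sketch that writes one by hand in the cpio "newc" format. The helper names _newc_entry and build_initramfs are hypothetical, every member is stored as a root-owned regular file with mode 0755, and directories and device nodes are left out, so the result illustrates the container format rather than a bootable image. Real images are normally produced with tools such as cpio or dracut.

```python
import gzip
import stat


def _newc_entry(name: str, data: bytes, mode: int, ino: int) -> bytes:
    """Encode one archive member in the cpio "newc" (magic 070701) format."""
    name_bytes = name.encode() + b"\0"
    fields = [
        ino,              # inode number
        mode,             # file type and permission bits
        0, 0,             # uid, gid (root)
        1,                # link count
        0,                # modification time
        len(data),        # file size
        0, 0, 0, 0,       # dev major/minor, rdev major/minor
        len(name_bytes),  # name size, including the trailing NUL
        0,                # checksum, unused for magic 070701
    ]
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\0" * (-len(out) % 4)   # header + name padded to 4 bytes
    out += data
    out += b"\0" * (-len(out) % 4)   # file data padded to 4 bytes
    return out


def build_initramfs(files: dict) -> bytes:
    """Return a gzip-compressed newc archive for {archive path: content}."""
    archive = b""
    for ino, (name, data) in enumerate(sorted(files.items()), start=1):
        archive += _newc_entry(name, data, stat.S_IFREG | 0o755, ino)
    # The archive ends with a special member named TRAILER!!!
    archive += _newc_entry("TRAILER!!!", b"", 0, 0)
    return gzip.compress(archive)


if __name__ == "__main__":
    image = build_initramfs({"init": b"#!/bin/sh\nexec /bin/sh\n"})
    print(len(image), "bytes")
```

Member names are stored without a leading slash, so the key "init" corresponds to /init once the kernel unpacks the archive into ROOTFS.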

Why the kernel replaced Initrd with Initramfs

  • RAMdisk vs RAMfs
    • A RAMdisk (like Initrd) is a RAM based Block device, which means it’s a fixed size of memory that can be formatted and mounted like a disk.
    • The contents of a RAMdisk have to be formatted and prepared with special tools (such as mke2fs and losetup), and like all block devices it requires a file system driver to interpret the data at runtime.
    • Kernel modules normally live on a file system, but no real file system is mounted at this boot stage, so the block device and file system drivers needed to read the RAMdisk must be compiled into the kernel itself.
    • The fixed size of a RAMdisk is a limitation: it either wastes the excess memory or is too small for the purpose. You can't expand or shrink it without formatting it again.
    • A RAMdisk also wastes memory through caching: Linux caches everything read from or written to block devices, so data in the RAMdisk is held in memory twice. This is the downside of the RAMdisk being treated as a block device.
    • Initramfs, by contrast, is an instance of tmpfs, which automatically grows or shrinks to fit the size of the data it contains. Adding files to an Initramfs (or extending existing files) automatically allocates more memory, and deleting or truncating files frees that memory.
    • With Initramfs, as there’s no block device, there’s no duplication of data between block device and cache.
    • Initramfs doesn’t need any file system driver built in the Kernel.
       
  • A few other design limitations of Initrd overcome by Initramfs:
    • /linuxrc was expected to probe the available devices for the real ROOTFS and then return the identified device number to the kernel, so that the kernel could mount the real root device and execute the real init program.
    • Initrd assumes that the real ROOTFS will always be a block device rather than, for example, a network share.
    • Initrd also assumes that it will never itself be the real ROOTFS, which is a common setup on memory-constrained embedded devices.
    • With Initrd, /linuxrc is not run as a real init program with PID=1, so it lacks the special properties reserved for init, such as being immune to kill -9.
    • In Initramfs, the Kernel doesn’t care where the real ROOTFS is (it’s Initramfs until init program from real ROOTFS is executed).
    • In Initramfs, the init program of Initramfs is always run as a real init, with PID=1.
    • When the Initramfs init needs to hand that special process ID off to another program, it can use the exec() system call, just like everybody else.
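
The hand-off through exec() works because exec replaces the program running in a process without creating a new process, so the process ID stays the same. The sketch below demonstrates this on a POSIX system with an ordinary child process rather than a real PID 1; the names STAGE1 and pids_before_and_after_exec are hypothetical.

```python
import subprocess
import sys

# Script run in a child process: print our PID, then replace the process
# image with a fresh interpreter that prints its PID again.
STAGE1 = (
    "import os, sys\n"
    "print(os.getpid(), flush=True)\n"
    "os.execv(sys.executable, "
    "[sys.executable, '-c', 'import os; print(os.getpid())'])\n"
)


def pids_before_and_after_exec():
    """Return the PID reported before exec and the PID reported after it."""
    result = subprocess.run(
        [sys.executable, "-c", STAGE1],
        capture_output=True, text=True, check=True,
    )
    before, after = (int(line) for line in result.stdout.split())
    return before, after


if __name__ == "__main__":
    before, after = pids_before_and_after_exec()
    print("same PID across exec:", before == after)
```

An Initramfs /init does the same thing at the end of its work: after mounting the real root it execs the real init, which then continues as PID 1.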

Use cases for Initramfs or Initrd

1. Minimizing Linux Kernel size

  • The idea here is that many initialization tasks done in the Linux kernel could just as easily be done in user space.
  • Many modules required for mounting different types of ROOTFS, such as NFS or LVM, can be offloaded to the Initramfs or Initrd instead of being compiled into the kernel, since no single system needs all of them to mount its ROOTFS.

2. Mounting Real ROOTFS

  • Some Linux distributions, like Debian, generate a customized Initrd image that contains only what is necessary to boot a particular computer, such as ATA, SCSI and file system kernel modules. Such images typically embed the location and type of the root file system.
  • Other Linux distributions, like Ubuntu, generate a generic Initrd image. These images start with only the device name of the root file system (or its UUID) and discover everything else at boot time. In this case, the software must perform a complex cascade of tasks to get the root file system mounted. A few examples of what may have to be done before the ROOTFS can be mounted:
    • Load all storage device drivers that are required by the boot process. For this, kernel modules of common storage devices can be added to Initrd image and then an event-driven hotplug agent like udev can be used to load the modules matching the computer’s detected hardware.
    • If a boot splash screen is required, the video hardware must be initialized and a user-space utility started to paint the splash screen animation during the boot process.
    • If the root file system is on NFS, the Initrd must bring up the primary network interface, invoke a DHCP client to obtain a lease, extract the name of the NFS share and the address of the NFS server from the lease, and mount the NFS share.
    • If the root file system is on a logical volume, the LVM utilities must be invoked to scan for and activate the volume group containing it.
    • If root file system is on an encrypted block device, the Initrd needs to invoke a utility script to prompt the user to type in a passphrase and/or insert a hardware token (such as a smart card or a USB security dongle), and then create a decryption target with the device mapper.
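
A generic image has to begin by working out what kind of root file system it was asked to mount. The sketch below classifies the root= argument of a kernel command line so that an init script could pick the matching preparation steps. The function name parse_root_spec and the returned "kind" labels are hypothetical, and the classification is an illustration rather than the logic of any particular distribution.

```python
def parse_root_spec(cmdline: str) -> dict:
    """Classify the root= argument of a kernel command line."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value   # later occurrences override earlier ones

    root = params.get("root")
    if root is None:
        return {"kind": "unspecified"}

    # Identifier-based forms: the device has to be discovered at boot time.
    for prefix, kind in (("PARTUUID=", "partuuid"),
                         ("UUID=", "uuid"),
                         ("LABEL=", "label")):
        if root.startswith(prefix):
            return {"kind": kind, "value": root[len(prefix):]}

    if root == "/dev/nfs":
        # Network root: the share is named by the nfsroot= parameter.
        return {"kind": "nfs", "value": params.get("nfsroot", "")}
    if root.startswith("/dev/mapper/"):
        # Device-mapper target, for example an LVM or encrypted volume.
        return {"kind": "device-mapper", "value": root}
    if root.startswith("/dev/"):
        return {"kind": "device", "value": root}
    return {"kind": "other", "value": root}


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        print(parse_root_spec(f.read()))
```

For example, a result of kind "nfs" would lead to the network and DHCP steps, while "device-mapper" would lead to the LVM or decryption steps listed above.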

3. Maintenance Tasks on ROOTFS before mounting

  • When the root file system finally becomes visible, any maintenance tasks that cannot be run on a mounted root file system are carried out, the root file system is mounted read-only, and any processes that must continue running (such as the splash screen utility) are hoisted into the newly mounted root file system.

4. Recovery disks creation

  • Initrd or Initramfs can be used to create a Recovery Disk Image or Backup a Partition.
  • Because the real ROOTFS does not have to be mounted when running from an Initrd or Initramfs, a backup of the ROOTFS can be created and saved to an external device such as a USB drive or CD-ROM.
  • The system loaded from initrd can invoke a user-friendly dialog or it can also perform some form of auto-detection for available backup media.

5. Linux Installers

  • Linux distribution Installers typically run entirely from an Initramfs, as they must be able to host the installer interface and supporting tools before any persistent storage has been set up.
  • CD-ROM distributors may use Initrd for better installation from CD, either by bootstrapping a bigger RAM disk from the CD via initrd, or by booting through a loader such as loadlin or directly from the CD-ROM.
  • Tiny Core Linux (TCL) and Puppy Linux can run entirely from Initrd.

References