Make the most of large drives with GPT and Linux
Preparing for future disk storage with the GUID Partition Table
Before embarking on a quest to replace your hard disk’s partitioning scheme, it’s helpful to review the limitations that are forcing this change. Understanding these limits—and the proposed tools for overcoming them—will enable you to judge how quickly you should jump ship from the master boot record (MBR) to the GUID Partition Table (GPT), particularly if you’re considering adopting GPT before new disk purchases force your hand. GPT offers advantages over MBR even on smaller disks, but you must balance those advantages against the difficulties of a switch.
Understanding the MBR’s limits
The MBR partitioning system is a hodge-podge of data structure patches applied to overcome earlier limits. The MBR itself resides entirely on the first sector (512 bytes) of a hard disk. The first 440 bytes of the MBR are devoted to code: the boot loader. The basic input/output system (BIOS) reads this code and executes it when the computer boots.
Following the code area, the MBR stores data about four partitions, known as primary partitions. Each partition is described in two ways: using cylinder/head/sector (CHS) notation and using logical block addressing (LBA) notation. The CHS notation is almost a historical footnote today, because it’s a 24-bit number. This means that it’s limited to describing areas of about 8GB in size. The 32-bit LBA values permit 2TiB sizes, assuming a sector size of 512 bytes. This 2TiB ceiling is not easily overcome; there simply aren’t any unallocated fields left in the MBR that could be used to add more bits to the LBA addresses.
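The layout just described can be sketched in a few lines of Python. This is only an illustration: the offsets used here (partition table at byte 446, four 16-byte entries, a 0x55AA signature in the last two bytes) follow the conventional MBR layout rather than anything stated in this article, and the function and field names are made up for the example.

```python
import struct

SECTOR_SIZE = 512      # bytes per sector, as assumed in the text
TABLE_OFFSET = 446     # conventional start of the four partition entries
ENTRY_SIZE = 16        # bytes per primary partition entry


def parse_mbr(sector):
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, CHS start (3 bytes), type, CHS end (3 bytes),
        # then the 32-bit little-endian LBA start and sector count
        status, _chs_start, ptype, _chs_end, lba_start, count = struct.unpack(
            "<B3sB3sII", raw)
        if ptype != 0:  # type 0x00 marks an unused slot
            entries.append({"slot": i + 1, "bootable": status == 0x80,
                            "type": ptype, "lba_start": lba_start,
                            "sectors": count})
    return entries


# The field widths translate directly into the size ceilings.
MAX_CHS_BYTES = 2**24 * SECTOR_SIZE   # 8 GiB for a 24-bit CHS value
MAX_LBA_BYTES = 2**32 * SECTOR_SIZE   # 2 TiB for a 32-bit LBA value

if __name__ == "__main__":
    sector = bytearray(SECTOR_SIZE)
    sector[510:512] = b"\x55\xaa"
    # One made-up bootable Linux (0x83) partition starting at sector 2048
    sector[446:462] = struct.pack("<B3sB3sII", 0x80, b"\0\0\0", 0x83,
                                  b"\0\0\0", 2048, 2 * 1024 * 1024)
    print(parse_mbr(bytes(sector)))
    print(MAX_LBA_BYTES // 2**40, "TiB")
```

Because the start and length fields are fixed at 32 bits each, no amount of clever parsing can describe a partition that begins or ends beyond the 2TiB mark.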
In addition to the looming 2TiB problem, the MBR presents other difficulties. Chief among these is the limitation of four primary partitions. To work around this limitation, it’s possible to set aside one primary partition as a placeholder (known as an extended partition) to hold an arbitrary number of additional partitions, known as logical partitions. This is, however, an ugly workaround that creates its own problems, such as difficulties installing multiple operating systems when too many of them want too many primary partitions to themselves.
The MBR has data-integrity problems, as well. It is a single data structure that’s vulnerable to damage by carelessness or hardware failure. In addition, because logical partitions are defined in a linked-list structure, damage to one of them can block access to the remaining logical partitions. None of these data structures includes any form of error-detection capability, so damage can be difficult to spot.
Intel® created the GPT definition as part of its Extensible Firmware Interface (EFI) specification for a BIOS replacement. Despite the fact that GPT is part of a standard that’s meant to replace the legacy BIOS, it’s possible to use GPT even on BIOS-based systems. If your computer uses EFI, this fact is another plus to GPT adoption. Whether your computer uses a legacy BIOS or an EFI, GPT fixes many of the MBR’s limitations:
- GPT uses LBA exclusively, so CHS headaches are gone.
- Disk pointers are 64 bits in size, meaning that GPT can handle disks of up to 512 × 2^64 bytes (8 zebibytes, or 8.6 billion TiB), assuming 512-byte sectors.
- GPT data structures are stored twice on the disk: once at the start and again at the end. This duplication improves the odds of successful recovery in case of damage from an accident or a bad sector.
- Cyclic redundancy check values are computed for critical data structures, improving the odds of detection of data corruption.
- GPT stores all partitions in a single partition table (with backup), so there’s no need for extended or logical partitions. By default, 128 partitions are supported, although you can change the partition table size if the partitioning software supports such changes.
- Whereas MBR provides a 1-byte partition type code, GPT uses a 16-byte globally unique identifier (GUID) value to identify partition types. This makes partition-type collisions less likely.
- GPT enables storing a human-readable partition name. You can use this field to name your Linux® /home, /usr, /var, and other partitions for easier identification within partitioning software.
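To make the 64-bit pointers and the checksums in this list concrete, here is a minimal Python sketch that builds a primary GPT header and verifies its CRC. The 92-byte field layout and the rule that the CRC is computed with its own field zeroed are taken from my reading of the UEFI specification, not from this article; the helper names are hypothetical, and the sketch leaves the partition entry array (and its CRC) empty.

```python
import struct
import uuid
import zlib

HEADER_FORMAT = "<8sIIIIQQQQ16sQIII"   # 92 bytes, little-endian, no padding
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
CRC_OFFSET = 16                        # position of the header CRC32 field
MAX_GPT_BYTES = 2**64 * 512            # 8 ZiB with 512-byte sectors


def build_header(disk_sectors, disk_guid, entries_crc=0):
    """Build a primary GPT header for a disk of 512-byte sectors."""
    table_sectors = 128 * 128 // 512       # 128 entries of 128 bytes each
    fields = [
        b"EFI PART", 0x00010000, HEADER_SIZE, 0, 0,
        1,                                 # this header lives at LBA 1
        disk_sectors - 1,                  # backup header in the last sector
        2 + table_sectors,                 # first usable LBA
        disk_sectors - 2 - table_sectors,  # last usable LBA
        disk_guid.bytes_le,
        2,                                 # partition entries start at LBA 2
        128, 128, entries_crc,
    ]
    header = bytearray(struct.pack(HEADER_FORMAT, *fields))
    # The CRC is computed with its own field zeroed, then stored in place.
    struct.pack_into("<I", header, CRC_OFFSET, zlib.crc32(bytes(header)))
    return bytes(header)


def header_is_intact(header):
    """Recompute the CRC32 over the header with the CRC field zeroed."""
    stored = struct.unpack_from("<I", header, CRC_OFFSET)[0]
    scratch = bytearray(header)
    struct.pack_into("<I", scratch, CRC_OFFSET, 0)
    return zlib.crc32(bytes(scratch)) == stored


if __name__ == "__main__":
    header = build_header(2**31, uuid.uuid4())   # a made-up 1 TiB disk
    print(header_is_intact(header))
    damaged = bytearray(header)
    damaged[40] ^= 0x01                          # flip one bit in an LBA field
    print(header_is_intact(bytes(damaged)))
```

A single flipped bit is enough to make the check fail, which is the kind of damage that would go unnoticed in an MBR.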
The first sector of the disk is reserved for a protective MBR, which is a legal MBR data structure that defines a single partition of type 0xEE (EFI GPT). On sub-2TiB disks, this partition should span the entire disk; on larger disks, it should be 2TiB in size. The idea is to protect the GPT disk from damage by GPT-unaware disk utilities. If such tools look at the disk, they’ll see an MBR disk with no free space. (Some disk utilities can create a hybrid MBR, which defines up to three MBR partitions in addition to the EFI GPT partition. The idea is to enable a GPT-unaware operating system, such as most pre-Windows Vista® versions of Windows®, to coexist on a disk along with GPT partitions. This configuration is decidedly non-standard and kludgy, though.)
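As a rough sketch of what a protective MBR contains, the following Python function assembles one. The byte offsets are the conventional MBR ones, the CHS values are simple placeholders rather than values computed from a drive geometry, and `protective_mbr` is a hypothetical name used only for this example.

```python
import struct


def protective_mbr(disk_sectors):
    """Return a 512-byte protective MBR for a disk of 512-byte sectors."""
    # The 0xEE partition starts at sector 1 and covers the rest of the
    # disk, capped at what a 32-bit sector count can express (about 2 TiB).
    size = min(disk_sectors - 1, 0xFFFFFFFF)
    entry = struct.pack(
        "<B3sB3sII",
        0x00,              # not flagged as bootable
        b"\x00\x02\x00",   # placeholder CHS start
        0xEE,              # EFI GPT partition type
        b"\xff\xff\xff",   # placeholder CHS end
        1,                 # first LBA after the MBR itself
        size,
    )
    sector = bytearray(512)        # boot code area left empty
    sector[446:462] = entry        # first of the four partition slots
    sector[510:512] = b"\x55\xaa"  # MBR signature
    return bytes(sector)


if __name__ == "__main__":
    small = protective_mbr(1_000_000)
    print(struct.unpack_from("<II", small, 454))
    large = protective_mbr(2**33)  # a made-up 4 TiB disk
    print(struct.unpack_from("<II", large, 454))
```

The other three partition slots stay zeroed, so a GPT-unaware tool sees one partition and no free space.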
Because GPT incorporates a protective MBR, a BIOS-based computer can boot from a GPT disk using a boot loader stored in the protective MBR’s code area, but the boot loader and operating system must both be GPT aware. (Some buggy BIOSes have problems booting from GPT disks, though.) EFI provides its own boot methods, so you can boot from a GPT disk on an EFI-based system.
The main problem with GPT is one of compatibility: Low-level disk utilities and operating systems must all support GPT. Such support is fairly common for Linux, although you may need to attend to some of these details and change some of the tools you use for low-level disk maintenance. If you multi-boot a computer, you’ll have to look into GPT support for all of your operating systems.
If you administer many Linux systems, or if you anticipate adding an over-2TiB disk in the not-too-distant future, you may want to consider doing a test installation with GPT. Doing so before you’re forced to do it will give you first-hand experience with GPT’s features as well as with the quirks of some of the GPT-aware Linux utilities.
It’s possible to run a system with a mixture of MBR and GPT disks. For instance, you can boot from an MBR disk but still use GPT for a data disk. Such a configuration is most useful for Windows on BIOS-based systems, because Windows can’t boot from GPT using BIOS, but Windows Vista and later Microsoft operating systems can use a GPT data disk.
Three main classes of software all require GPT support: the kernel, the boot loader, and low-level disk utilities. If you’re using GPT because you’re setting up a large redundant array of independent disks (RAID) array, you may also need to look into file system support for extra-large disks.
Note: If you’re installing Linux from scratch and want to use GPT, your installer must provide GPT support in all three of these categories. In 2012, this support is present in all the major Linux distributions.
The Linux kernel must provide GPT support to provide access to data on the disk’s partitions. Fortunately, this support has long been present in Linux. If you compile your own kernel, be sure to select EFI GUID Partition Support in the Partition Types area of the Enable the Block Layer configuration area, as shown in Figure 1. (This item used to be located under File Systems, so look there if you’ve got an older kernel.)
Figure 1. The Linux kernel provides GPT support, but it must be enabled when you compile a new kernel
If you don’t compile your own kernel, you’re at the mercy of your distribution provider to enable this support. Fortunately, most do so. If you’re in doubt, you can use a GPT-aware partitioning tool to set up GPT partitions on a test disk. If Linux recognizes the partitions, then your kernel is properly configured.
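If your distribution ships a copy of the kernel configuration, you can also check for the option directly. The sketch below assumes the EFI GUID Partition Support item corresponds to the `CONFIG_EFI_PARTITION` symbol and that the configuration is stored in `/boot/config-<release>`; that path is distribution dependent and may not exist on your system.

```python
import os
import platform


def kernel_has_gpt_support(config_text):
    """Return True if the kernel config text enables GPT partition support."""
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_... is not set" and won't match.
        if line.strip() == "CONFIG_EFI_PARTITION=y":
            return True
    return False


if __name__ == "__main__":
    # Assumed location; some systems expose /proc/config.gz instead.
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        with open(path) as handle:
            print(path, "->", kernel_has_gpt_support(handle.read()))
    else:
        print("no kernel config found at", path)
```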
Boot loader support
Boot loader support for GPT is variable and depends on your computer’s firmware type. Under BIOS, only the Grand Unified Bootloader (GRUB) 2 officially supports GPT. Most Linux distributions today use GRUB 2 as the default boot loader, but some continue to use the older GRUB Legacy. GRUB Legacy doesn’t officially support GPT, but patched versions with GPT support are readily available. The still-older Linux Loader (LILO) doesn’t explicitly support GPT, but its disk-addressing methods are based on sector locations, so it often does work (in practice).
If you use GRUB 2 on a BIOS-based computer, be sure to create a BIOS Boot Partition, which holds GRUB’s second-stage code. (This partition is identified as having its bios_grub flag set under GNU Parted or as being of type EF02 under gdisk.) The BIOS Boot Partition can be as small as 32KiB in some configurations, although it must sometimes be a bit larger. Given modern partition alignment policies, a size of 1MiB is common.
If your computer uses EFI, any EFI-capable boot loader will work with GPT; but EFI boot loader selection for Linux is tricky. As of mid-2012, some boot loaders remain unreliable or have system-specific quirks. In my experience, the Linux kernel’s EFI stub loader (introduced with the 3.3.0 kernel) is the most reliable, followed by the EFI LILO (ELILO), a heavily patched version of GRUB Legacy used by Fedora, and finally GRUB 2. In addition to the boot loader, you might need a separate boot manager to enable operating system selection, particularly if you dual-boot and use the kernel’s EFI stub loader or ELILO to boot Linux. Two common choices for this task are rEFIt and rEFInd, the latter being a more up-to-date fork of the former. (Note that I maintain rEFInd.)
EFI requires the presence of an EFI System Partition (ESP) to boot. (Macs are a partial exception to this rule, although they ship with ESPs defined.) The ESP should contain a FAT32 file system. The EFI standard doesn’t specify a size, but something between 100MiB and 500MiB usually works well. If you use the Linux kernel’s EFI stub loader or ELILO, you may need to store your kernel on the ESP, so creating an ESP on the large end of the scale is advisable.
The third area of GPT support is system utilities. Linux provides three main families of partitioning tools, with varying support for GPT:
- The fdisk family. These programs (fdisk, cfdisk, and sfdisk) are text-mode tools that can handle MBR and some more exotic partition tables, but they can’t handle GPT.
- GNU Parted (libparted). The GNU Parted project provides a library (libparted) and a text-mode utility (parted) for partitioning. Several graphical user interface (GUI) utilities are built atop libparted, as well. The libparted library can handle MBR, GPT, and several other partition table types.
- GPT fdisk. This family (gdisk, cgdisk, and sgdisk) is modelled after the fdisk family but works on GPT disks. (Note that I’m the author of GPT fdisk.)
As a general rule, tools based on GNU Parted (particularly GUI tools such as GParted or the Palimpsest Disk Utility) are the easiest to use; however, GPT fdisk (and particularly gdisk) provides access to more GPT features. Thus, you might want to use GParted or other GUI tools to set up your disks but use GPT fdisk to fine-tune your configuration or repair damage to a GPT disk.
If you want to create fresh GPT partitions on a disk using GNU Parted, you should launch the program, then use its mklabel command, as in Listing 1.
Listing 1. Using GNU Parted to create GPT disk partitions
    # parted /dev/sdd
    GNU Parted 3.1
    Using /dev/sdd
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel
    New disk label type? gpt
    (parted)
At this point, you can begin creating partitions using GNU Parted’s mkpart command or otherwise manipulate partitions. The process is similar to that of managing MBR partitions with parted, with a few twists. For instance, there’s no need to specify a partition type as primary or logical; but you can enter a name for the partition.
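If you would rather script these steps than type them interactively, the same operations can be expressed as non-interactive parted invocations. The following sketch only builds and prints the command lines; it does not run them. It assumes that parted’s `--script` mode accepts `mkpart NAME [FS-TYPE] START END` on GPT disks, and the partition names, sizes, and the `parted_commands` helper are invented for illustration.

```python
import shlex


def parted_commands(device, data_name="home"):
    """Build (but do not run) parted command lines for a fresh GPT disk."""
    base = ["parted", "--script", device]
    return [
        base + ["mklabel", "gpt"],                      # destroys existing partitions
        base + ["mkpart", "grub", "1MiB", "2MiB"],      # small partition for GRUB 2
        base + ["set", "1", "bios_grub", "on"],         # mark it as the BIOS Boot Partition
        base + ["mkpart", data_name, "ext4", "2MiB", "100%"],
    ]


if __name__ == "__main__":
    for command in parted_commands("/dev/sdd"):
        print(" ".join(shlex.quote(arg) for arg in command))
```

Printing the commands first makes it easy to review them before running anything against a real disk, since mklabel is destructive.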
Using gdisk is similar to using fdisk. Launching the program on a blank disk creates a new GPT, as in Listing 2.
Listing 2. Using gdisk to create GPT disk partitions
    # gdisk /dev/sdd
    GPT fdisk (gdisk) version 0.8.4

    Partition table scan:
      MBR: not present
      BSD: not present
      APM: not present
      GPT: not present

    Creating new GPT entries.

    Command (? for help):
The gdisk commands for creating and manipulating partitions are similar to those used in fdisk, such as n to create a partition. As with parted on GPT disks, there’s no need to specify a partition as primary, extended, or logical. Type codes in gdisk are based on MBR type codes but multiplied by 0x100; for instance, a Linux swap partition is of type 0x82 in MBR and 8200 in gdisk. You can set a partition’s name with the c command or perform more advanced operations as described in gdisk’s man page and online documentation.
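The multiply-by-0x100 rule is easy to express in code. This small Python sketch covers only the codes that gdisk derives from MBR types; codes with no MBR equivalent don’t follow the pattern, and the function name is hypothetical.

```python
def gdisk_type_code(mbr_type):
    """Map a 1-byte MBR type code to the matching 4-digit gdisk code."""
    if not 0 <= mbr_type <= 0xFF:
        raise ValueError("MBR type codes are a single byte")
    return format(mbr_type * 0x100, "04X")


if __name__ == "__main__":
    for mbr_type in (0x07, 0x82, 0x83):
        print(hex(mbr_type), "->", gdisk_type_code(mbr_type))
```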
Whether you use parted or gdisk, when you’re done, you can use the normal Linux file system management tools, such as mkfs, to create file systems on your disk. You can also create logical volume management and RAID configurations much as you would on MBR disks.
If you prefer a GUI tool, the Gnome Partition Editor (GParted) will do the job. Click Device > Create Partition Table to create a new GPT data structure. Click Advanced, then select gpt from the Select new partition table type list, as in Figure 2. Click Apply to create your new GPT data structures. You can then create new partitions in the same way you would if you were manipulating an MBR disk.
Figure 2. You must explicitly set the GPT partition table type to create GPT partitions in the Gnome Partition Editor
The GPT creation tools of both GNU Parted and GParted are inherently destructive: If you’ve got an MBR disk, the only way to turn it into a GPT disk with these tools is to destroy your existing MBR partitions. If you want to convert an MBR disk in place, GPT fdisk does so automatically when you launch it. Be aware, though, that this conversion renders a BIOS boot disk unbootable until the boot loader is re-installed.
Linux employs a handful of MBR partition type codes, such as 0x83, to identify its MBR partitions. Similar GUID codes exist to identify Linux GPT partitions. One important caveat is that Linux has traditionally used the same GUID code as Windows for its data partitions. Thus, it’s impossible to differentiate Linux partitions and NTFS file system or FAT partitions from their partition table GUIDs alone. This is unimportant on a Linux-only system, but if you dual-boot Windows and Linux on an EFI-based computer or if you create Linux partitions on a removable disk and use it in Windows, the result is that your Linux partitions appear to be uninitialized partitions in Windows, and Windows may ask whether you want to format the partitions if you try to access them. You can correct this problem in gdisk by giving your Linux partitions a gdisk type code of 8300. This new type code should be supported by libparted in the future, but it hadn’t been implemented as of libparted version 3.1.
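To show what these type codes look like on disk, the sketch below maps a few gdisk codes to their type GUIDs and decodes the 16-byte on-disk form. The GUID values are the ones I know for these partition types and should be checked against the gdisk type list before you rely on them; the mixed-endian storage (first three fields little-endian) matches Python’s `bytes_le` encoding, and the helper name is made up.

```python
import uuid

# gdisk code -> partition type GUID (a small, hand-picked subset)
TYPE_GUIDS = {
    "0700": uuid.UUID("EBD0A0A2-B9E5-4433-87C0-68B6B72699C7"),  # Windows data (NTFS/FAT)
    "8300": uuid.UUID("0FC63DAF-8483-4772-8E79-3D69D8477DE4"),  # Linux file system
    "EF00": uuid.UUID("C12A7328-F81F-11D2-BA4B-00A0C93EC93B"),  # EFI System Partition
    "EF02": uuid.UUID("21686148-6449-6E6F-744E-656564454649"),  # BIOS Boot Partition
}


def code_for(raw_type_guid):
    """Return the gdisk code for a 16-byte on-disk type GUID, or None."""
    guid = uuid.UUID(bytes_le=raw_type_guid)
    for code, known in TYPE_GUIDS.items():
        if guid == known:
            return code
    return None


if __name__ == "__main__":
    on_disk = TYPE_GUIDS["8300"].bytes_le
    print(on_disk.hex(), "->", code_for(on_disk))
```

A Linux partition created with the older convention decodes as 0700 here, which is exactly why Windows can’t tell it apart from one of its own data partitions.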
Large file systems support
If you’re switching to GPT because you’re using a large RAID configuration, you may need to investigate support for large file system sizes in the file systems you deploy. Table 1 summarizes these limits. (Note that some values vary with partitioning options.) Some of these values are quite large and use suffixes that may be unfamiliar: 1TiB is 1024GiB, 1 pebibyte (PiB) is 1024TiB, 1 exbibyte (EiB) is 1024PiB, and 1 zebibyte (ZiB) is 1024EiB.
Table 1. File system volume and size limits
| File system | Maximum volume size | Maximum file size |
|---|---|---|
| Second extended file system (ext2) and third extended file system (ext3) | 16TiB | 2TiB |
| Fourth extended file system (ext4) | 1EiB | 16TiB |
| Journaled file system (JFS) | 32PiB | 4PiB |
| B-tree file system (Btrfs, under development) | 16EiB | 16EiB |
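As a quick way to apply Table 1, the following Python sketch encodes the table’s values and lists the file systems whose limits accommodate a planned volume. The numbers are copied from the table and inherit its caveat that some limits vary with formatting options; the `candidates` function is a hypothetical helper.

```python
TiB = 2**40
PiB = 1024 * TiB
EiB = 1024 * PiB

# file system -> (maximum volume size, maximum file size), from Table 1
LIMITS = {
    "ext2/ext3": (16 * TiB, 2 * TiB),
    "ext4": (1 * EiB, 16 * TiB),
    "JFS": (32 * PiB, 4 * PiB),
    "Btrfs": (16 * EiB, 16 * EiB),
}


def candidates(volume_bytes, largest_file_bytes=0):
    """Return the file systems able to hold the volume and its largest file."""
    return [name for name, (max_volume, max_file) in LIMITS.items()
            if volume_bytes <= max_volume and largest_file_bytes <= max_file]


if __name__ == "__main__":
    print(candidates(20 * TiB))             # a 20 TiB RAID array
    print(candidates(20 * TiB, 20 * TiB))   # ...holding one 20 TiB file
```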
Beyond the file- and volume-size limits, there are file system performance differences. This topic is extremely complex, so you may need to consult with others who run setups similar to the one you’re planning.
GPT partitioning advice
Some special concerns crop up for GPT partitioning, particularly if your computer uses EFI or you run in a multi-boot environment:
- EFI requires an ESP, as noted earlier, on any boot disk.
- Also as noted earlier, you should create a BIOS Boot Partition if you plan to boot from GPT on a BIOS-based computer.
- Many GPT partitioning tools create gaps of about 128MiB after each partition (the ESP is an exception to this rule). The intention is that disk utilities can use this space to help with their jobs.
- On Mac OS X systems, partitions are created in sizes that are multiples of 4KiB (typically, eight sectors). This feature relates to limitations of the HFS Plus file system that most modern Macs use.
You can follow these partitioning rules or ignore them as you see fit. Linux is flexible enough that it won’t be bothered by a disregard for these rules, unless your computer requires an ESP or BIOS Boot Partition to boot.
One other rule isn’t GPT specific but is important on most large disks produced since early 2010: These disks use 4KiB physical sectors but 512-byte logical sectors. This discrepancy creates potentially severe performance issues if partitions aren’t aligned on physical sector boundaries. Partitioning tools released since late 2010 generally handle this well, but if you’re using older tools, be sure to create properly aligned partitions.
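Checking alignment by hand is simple arithmetic, as this Python sketch shows. It assumes the 512-byte logical and 4KiB physical sector sizes described above and uses a 1MiB alignment boundary, which is a common default rather than a requirement; the function names are illustrative.

```python
def is_aligned(start_lba, logical=512, physical=4096):
    """True if a partition's first byte falls on a physical sector boundary."""
    return (start_lba * logical) % physical == 0


def align_up(start_lba, boundary_bytes=2**20, logical=512):
    """Round a starting sector up to the next multiple of boundary_bytes."""
    sectors_per_boundary = boundary_bytes // logical
    # Ceiling division without floating point
    return -(-start_lba // sectors_per_boundary) * sectors_per_boundary


if __name__ == "__main__":
    for start in (63, 64, 2048):
        print(start, is_aligned(start), align_up(start))
```

A start sector of 63, which older partitioning tools often chose, fails the check; rounding up to sector 2048 (1MiB) satisfies it.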
GPT is becoming the standard for hard disk partitioning because of the size limitations of the older MBR. Fortunately, Linux is well prepared for this transition. Although Linux users may have to give up certain tools (such as fdisk), other tools are available to take their place (libparted and GPT fdisk, for instance). Understanding the requirements will help you make the transition easily when the time comes to do so. You’ll need to attend to your kernel configuration, your boot loader configuration, and the utilities you use to create and manage partitions.