Learn Linux, 101: Find and place system files
Where things go in the Filesystem Hierarchy Standard
In this tutorial, learn about the Filesystem Hierarchy Standard (FHS). Learn to:
- Recognize where to place files under the FHS
- Find files and commands on your Linux system
- Find other important files and directories defined in the FHS and understand their purposes
This tutorial helps you prepare for Objective 104.7 in Topic 104 of the Linux Server Professional (LPIC-1) exam 101. The objective has a weight of 2.
Bringing organization to a multitude of Linux distributions
With well over 200 Linux distributions available and several UNIX® variants, all having a lot of things in common, it is desirable to have a standard way to place files in the system. This helps new users feel more comfortable with a new system, helps experienced users manage a system, and eases the task of package distributors. The Filesystem Hierarchy Standard is a document that specifies a common layout of directories on a Linux or other UNIX-like system. Version 3.0 was announced on June 3, 2015, replacing the earlier Version 2.3 announced on January 29, 2004.
To get the most from the tutorials in this series, you should have a basic knowledge of Linux and a working Linux system on which you can practice the commands covered in this tutorial. Unless otherwise noted, the examples in this tutorial use Fedora with a 4.1.3 kernel.
Sometimes different versions of a program will format output differently, so your results may not always look exactly like the listings and figures shown here. In particular, much of the output we show is highly dependent on the packages that are already installed on our systems. Your own output may be quite different, although you should be able to recognize the important commonalities.
Filesystem Hierarchy Standard
By placing files in the same general place across Linux distributions, the FHS simplifies distribution-independent software development. The FHS is also used in the Linux Standard Base (see Resources). The FHS allows both users and software to predict the location of installed files and directories. An FHS-compliant filesystem assumes that the operating system supports the basic security features found in most UNIX filesystems.
The two independent FHS categories
At the core of the FHS are two independent characteristics of files:
Shareable vs. unshareable
Shareable files can be located on one system and used on another, while unshareable files must reside on the system on which they are used.
Static vs. variable
Static files change only through system administrator intervention, such as installing or upgrading a package, and include documentation, libraries, and binaries. Variable files are all other files, such as logs, spool files, databases, and user data, which are subject to change by users and by system processes.
These distinctions allow files with different sets of characteristics to be stored on different filesystems. Table 1 is an example from the FHS document showing a layout that would be FHS-compliant.
Table 1. FHS example
Linux systems often contain hundreds of thousands of files. A 64-bit Fedora 22 system that I recently installed has over 170,000 files in the /usr hierarchy alone.
[root@atticf22 ~]# find /usr -type f | wc -l
174182
Most of my other installations have over 100,000 files and often 200,000 files or more. The next few sections look at tools to help you find files, particularly programs, in this vast sea of data.
Your PATH to executable files
If you have used several Linux systems, you may have noticed that if you log in as root, you are able to execute some commands which you cannot execute if you are a user. And some commands can be executed by users but not root. When you run a program at the command line, the bash (or other) shell searches through a list of directories to find the program you requested. The list of directories is specified in your PATH environment variable. On older systems, it was common for root’s path to include /sbin, while non-root user paths did not. Listing 1 shows user and root path examples from three different distributions.
Listing 1. Some PATH examples
[ian@attic4-cent ~]$ #CentOS 6
[ian@attic4-cent ~]$ echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/ian/bin
[ian@attic4-cent ~]$ su -
Password:
[root@attic4-cent ~]# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
[ian@atticf22 ~]$ #echo Fedora 22
[ian@atticf22 ~]$ echo $PATH
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin
[ian@atticf22 ~]$ su -
Password:
[root@atticf22 ~]# echo $PATH
/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
[root@atticf22 ~]# ls -l /sbin
lrwxrwxrwx. 1 root root 8 Aug 16 2014 /sbin -> usr/sbin
ian@attic-mint17 ~ $ #Linux Mint 17.2
ian@attic-mint17 ~ $ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
ian@attic-mint17 ~ $ sudo -s
attic-mint17 ~ # echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
attic-mint17 ~ # fortune
Command 'fortune' is available in '/usr/games/fortune'
The command could not be located because '/usr/games' is not included in the PATH environment variable.
fortune: command not found
As you can see, the PATH variable is just a list of directory names, separated by colons. Note the differences between the user path and root path order for CentOS and the two extra games directories in the Linux Mint user path. Since the fortune game command is actually located in /usr/games/fortune, the Mint root user will have to stick to work instead of playing games unless he or she provides a fully qualified path name. On systems where the /sbin directory was not included in user paths, it was common for programs such as tune2fs to be found only by the root user.
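Because PATH is a single colon-separated string, it can be easier to read one directory per line, in search order. The following is a minimal sketch; the function name path_dirs is made up for this illustration and is not a standard command.

```shell
# path_dirs: print each directory of a PATH-style string on its own line,
# in the order the shell searches them.
# (Hypothetical helper name used only for this example.)
path_dirs() {
    local path_string="${1:-$PATH}"
    printf '%s\n' "$path_string" | tr ':' '\n'
}

# Show the search order for the current shell.
path_dirs
```

Passing a string explicitly, as in path_dirs "$PATH", lets you inspect a saved or proposed path before you export it.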
Usually, your path is set in an initialization file such as .bash_profile or .bashrc. You can change it for the current bash process by specifying a new path. Remember to export the PATH variable if you want the new value to be available to other processes that you start from this one. Suppose I am developing a “Hello World” application in my test-hello directory. Listing 2 shows how to add that directory to my path and run my hello executable without having to give a fully qualified path each time.
Listing 2. Changing your PATH
[ian@atticf22 ~]$ hello
bash: hello: command not found...
[ian@atticf22 ~]$ test-hello/hello
Hello World!
[ian@atticf22 ~]$ export PATH=~/test-hello:$PATH
[ian@atticf22 ~]$ hello
Hello World!
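If you put a PATH change like this into an initialization file, the same directory can end up listed several times as nested shells re-read the file. One way to avoid that is sketched below. The function name path_prepend is hypothetical, and the test-hello directory is simply the example directory used in this tutorial.

```shell
# path_prepend: put a directory at the front of PATH unless it is already listed.
# (Hypothetical helper name used only for this example.)
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;      # not present, add it at the front
    esac
    export PATH                    # make the value visible to child processes
}

path_prepend "$HOME/test-hello"
```

Wrapping PATH in colons before matching means the first and last entries are compared the same way as entries in the middle.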
The which, type, and whereis commands
In the previous section, you saw why the hello command might not be available if you attempted to run it. Several other useful commands can help you find which command would run if you typed a command name.
The which command
You can use the which command to search your path and find out which command will be executed (if any) when you type a command. Listing 3 shows an example of finding the hello command before and after we update the path.
Listing 3. Using which
[ian@atticf22 ~]$ echo $PATH
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin
[ian@atticf22 ~]$ which hello
/usr/bin/which: no hello in (/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin)
[ian@atticf22 ~]$ export PATH=~/test-hello:$PATH
[ian@atticf22 ~]$ echo $PATH
/home/ian/test-hello:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin
[ian@atticf22 ~]$ which hello
~/test-hello/hello
The which command shows you the first occurrence of a command in your path. If you want to know if there are multiple occurrences, then add the -a option as shown in Listing 4.
Listing 4. Using which to find multiple occurrences
[ian@atticf22 ~]$ which awk
/usr/bin/awk
[ian@atticf22 ~]$ which -a awk
/usr/bin/awk
/bin/awk
Here we find the awk command in two places: in /usr/bin (which is the main directory for commands on the system), and in /bin (which contains commands that may be used both by the system administrator and by users, but which are required when no other filesystems are mounted).
Another tutorial in this series, “Learn Linux 101: Create and change hard and symbolic links,” shows you how to check that these different paths all eventually represent the same underlying gawk command, as shown in Listing 5.
Listing 5. Awk commands lead to gawk
[ian@atticf22 ~]$ ls -l $(which -a awk)
lrwxrwxrwx. 1 root root 4 Jan 2 2015 /bin/awk -> gawk
lrwxrwxrwx. 1 root root 4 Jan 2 2015 /usr/bin/awk -> gawk
[ian@atticf22 ~]$ ls -ld /bin /usr/bin
lrwxrwxrwx. 1 root root 7 Aug 16 2014 /bin -> usr/bin
dr-xr-xr-x. 2 root root 69632 Aug 10 10:40 /usr/bin
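On this Fedora 22 system, /bin is actually a symbolic link to /usr/bin, which may surprise you. The two awk paths therefore name the same symbolic link, reached through two different directory names.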
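Rather than following each symbolic link by eye, you can ask the system to resolve the whole chain. The sketch below assumes the GNU coreutils readlink command with its -f option. The function name resolve_cmd is hypothetical, and the result is only meaningful for commands that are files on your path, not for builtins or aliases.

```shell
# resolve_cmd: print the real file behind the first PATH match for a command.
# (Hypothetical helper name; relies on readlink -f from GNU coreutils.)
resolve_cmd() {
    local found
    found=$(command -v "$1") || return 1   # first match on PATH, if any
    readlink -f "$found"                   # follow every symbolic link in the path
}

resolve_cmd awk || echo "awk was not found on PATH"
```

On a system laid out like the one in Listing 5, the result would be a path ending in gawk. The exact path depends on how your distribution links these files.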
The type command
There are some commands that the which command will not find, such as shell builtins. The type command is a builtin that will tell you how a given command string will be evaluated for execution. Listing 6 uses type to show that the type command is not an executable found on your path, but is a shell builtin.
Listing 6. Using type
[ian@atticf22 ~]$ which type
/usr/bin/which: no type in (/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin)
[ian@atticf22 ~]$ type type
type is a shell builtin
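In bash, the type builtin also accepts a -t option that prints a single word (alias, keyword, function, builtin, or file) describing how a name would be interpreted. The sketch below uses it to classify several names at once. It assumes bash, since other shells may not support -t, and the function name describe_cmd is hypothetical.

```shell
#!/bin/bash
# describe_cmd: report how bash would interpret each name given.
# (Hypothetical helper name; type -t is a bash feature.)
describe_cmd() {
    local name kind
    for name in "$@"; do
        # type -t prints nothing and fails when the name is unknown.
        kind=$(type -t "$name") || kind="not found"
        printf '%s: %s\n' "$name" "$kind"
    done
}

describe_cmd type cd if describe_cmd
```

In bash this reports type and cd as builtins, if as a keyword, and describe_cmd itself as a function.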
The whereis command
If you want more information than just the location of a program, you can use the whereis command. For example, you can find the man pages or other information, as shown in Listing 7.
Listing 7. Using whereis to find man pages
[ian@atticf22 ~]$ whereis awk
awk: /usr/bin/awk /usr/libexec/awk /usr/share/awk /usr/share/man/man1p/awk.1p.gz /usr/share/man/man1/awk.1.gz
Note that the copy of awk in /bin was not found by whereis. The directories used by whereis are fixed, so the command may not always find what you are looking for. The whereis command can also search for source files, specify alternate search paths, and search for unusual entries. Consult the man pages to see how to override this behavior or change the fixed paths used by whereis.
The find command
In an earlier tutorial in this series, “Learn Linux 101: File and directory management,” you learned how to find files based on name (including wildcards), path, size, or timestamp. In another earlier tutorial in this series, “Learn Linux 101: Create and change hard and symbolic links,” you learned how to find the links to a particular file or inode.
The find command is the Swiss Army knife of file-searching tools on Linux systems. Two other capabilities that you may find useful are its ability to find files based on user or group name and its ability to find files based on permissions.
Suppose you want to see what files a user has in the /tmp hierarchy. Listing 8 shows how the root user could find all the files for user ian in /tmp.
Listing 8. Finding files by user and group
[root@atticf22 ~]# find /tmp -user ian
/tmp/jna-104022
...
/tmp/hsperfdata_ian
/tmp/hsperfdata_ian/2390
/tmp/tracker-extract-files.1000
/tmp/.esd-1000
/tmp/.esd-1000/socket
/tmp/.X11-unix/X0
/tmp/.ICE-unix/1668
You can also find files by group using the -group test. And you can find files that do not belong to any user or group on the system using the -nouser and -nogroup options. As with other tests, you can negate the test using !. I usually set my user number to 1000, as that is the default on many systems. I also create a group called ian with 1000 as the group number if the system does not automatically do that. Other systems still start at 500, or put new users in the group ‘users’ by default. Some of my older research material that was archived from a Red Hat 6.2 system still has user 500.
Listing 9 shows how to find some directories that are not owned by my current user group. The research/rh62/involution directory is owned by user 500 and group 4. User 500 does not exist on my current system, so ls shows the number instead of a name. To find files or directories by numeric user id or group id, use the -uid and -gid tests.
Listing 9. Finding directories not owned by ian
[ian@atticf22 ~]$ find -L research -maxdepth 2 -type d ! -group ian
research/rh62/involution
research/rh62/programs
research/lost+found
find: ‘research/lost+found’: Permission denied
[ian@atticf22 ~]$ ls -ld research/rh62/involution
drwxr-xr-x. 2 500 adm 4096 Nov 10 1999 research/rh62/involution
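The ownership tests can be wrapped in small helpers when you use them often. The sketch below selects by numeric user ID and looks for entries whose owner or group no longer corresponds to an account, such as material restored from another system. The function names owned_by_uid and orphaned are hypothetical, and the -uid test is assumed to be available, as it is in GNU find.

```shell
# owned_by_uid: list everything under a directory owned by a numeric user ID.
# (Hypothetical helper name; -uid is supported by GNU find.)
owned_by_uid() {
    find "$1" -uid "$2"
}

# orphaned: list everything under a directory whose owner or group
# has no matching account on this system.
orphaned() {
    find "$1" \( -nouser -o -nogroup \)
}

# Example usage (commented out because the results depend on your system):
#   owned_by_uid /tmp "$(id -u)"
#   orphaned ~/research
```

The parentheses are escaped so that the shell passes them to find, which uses them to group the two tests joined by -o (or).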
To find files by permission, you can use the -perm test along with symbolic expressions similar to those used with the chmod or umask commands. You can search for exact permissions, but it is often more useful to prefix the permission expression with a hyphen to indicate that you want files with those permissions set, but that you don’t care about other permissions. Listing 10 illustrates how to find files that are executable by user, group, and everyone, and two different ways of finding files that are not readable by others.
Listing 10. Finding files by permission
[ian@atticf22 ~]$ find . -maxdepth 1 -type f -perm -uga=x
./hello
[ian@atticf22 ~]$ find . -maxdepth 1 ! -perm -o=r
.
./.ssh
./.ICEauthority
./.bash_history
./.config
...
./.kde
./.cache
[ian@atticf22 ~]$ find . -maxdepth 1 ! -perm -0004
.
./.ssh
./.ICEauthority
./.bash_history
./.config
...
./.kde
./.cache
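To see the difference between an exact match and a hyphen-prefixed match without touching your real files, you can experiment in a scratch directory. The sketch below creates three files with known modes and runs the three kinds of test against them. The helper names and file names are made up for this illustration.

```shell
# exact_mode: regular files whose permission bits are exactly the given mode.
exact_mode() { find "$1" -type f -perm "$2"; }

# exec_by_all: regular files with all three execute bits set,
# whatever the other permission bits are.
exec_by_all() { find "$1" -type f -perm -111; }

# unreadable_by_others: regular files that lack the "other read" bit.
unreadable_by_others() { find "$1" -type f ! -perm -o=r; }

# Try the helpers on a scratch directory with three known modes.
demo=$(mktemp -d)
touch "$demo/private" "$demo/public" "$demo/script"
chmod 600 "$demo/private"    # rw-------
chmod 644 "$demo/public"     # rw-r--r--
chmod 755 "$demo/script"     # rwxr-xr-x

exact_mode "$demo" 644          # lists only the file named public
exec_by_all "$demo"             # lists only the file named script
unreadable_by_others "$demo"    # lists only the file named private
rm -rf "$demo"
```

Note that the file with mode 755 is not matched by the exact test for 644, even though it has all of the 644 bits set. A test of -perm -644 would match both.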
We have covered several major types of search that you can do with the find command. To further narrow your output, you can combine multiple expressions, and you can add regular expressions to the mix. To learn more about this versatile command, use the man page, or better, use info find if you have the info system installed.
Listing 11 shows a final example of searching with find. This example does a cd to /usr/include to keep the listing length manageable, then finds all files containing packet in their path name without regard to case. The second example further restricts this output to files that are not directories and that are at least 1500 bytes in size. Actual output on your system may differ depending on which packages you have installed.
Listing 11. A final example of find
[ian@atticf22 ~]$ cd /usr/include
[ian@atticf22 include]$ find . -iregex ".*packet.*"
./net/if_packet.h
./linux/packet_diag.h
./linux/if_packet.h
./netpacket
./netpacket/packet.h
[ian@atticf22 include]$ find . -iregex ".*packet.*" ! -type d -size +1500c
./linux/packet_diag.h
./linux/if_packet.h
./netpacket/packet.h
Note that the regular expression must match the full path returned by find, and remember the difference between regular expressions and wildcards.
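The difference between the two kinds of pattern is easiest to see side by side. In the sketch below, a wildcard test (-iname) is compared with a regular expression test (-iregex) on the same scratch tree. The helper names and file names are made up for this illustration, and -iregex is assumed to be available, as it is in GNU find.

```shell
# by_glob: case-insensitive wildcard match against the file name only.
by_glob() { find "$1" -iname "$2"; }

# by_regex: case-insensitive regular expression match against the whole path.
by_regex() { find "$1" -iregex "$2"; }

demo=$(mktemp -d)
mkdir "$demo/netpacket"
touch "$demo/netpacket/types.h" "$demo/if_PACKET.h"

# The wildcard sees only the last path component, so types.h is not listed.
by_glob "$demo" '*packet*'

# The regular expression sees the whole path, so types.h is listed
# because its directory name contains "packet".
by_regex "$demo" '.*packet.*'

# Without the leading and trailing .* the expression cannot match
# a whole path, so nothing is listed.
by_regex "$demo" 'packet'
rm -rf "$demo"
```

In a wildcard, * alone means any run of characters. In a regular expression, * means zero or more of the preceding item, which is why it is paired with a dot.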
The locate and updatedb commands
The find command searches all the directories you specify, every time you run it. To speed things up, you can use another command, locate, which uses a database of stored path information rather than searching the filesystem every time.
The locate command
The locate command searches for matching files in a database that is usually updated daily by a cron job. The locate command matches against any part of a path name, not just the file name. Put the file name in single quotes and include at least one globbing character to match more precisely. Listing 12 shows how to find paths containing the string bin/ls, and shows two examples of using globbing characters to restrict the output.
Listing 12. Using locate to find paths and restrict output
[ian@attic4-cent ~]$ #CentOS 6
[ian@attic4-cent ~]$ locate /bin/ls
/bin/ls
/bin/lsblk
/bin/lscgroup
/bin/lssubsys
/usr/bin/lsattr
/usr/bin/lsb_release
/usr/bin/lscpu
/usr/bin/lsdiff
/usr/bin/lshal
/usr/bin/lslogins
/usr/bin/lsusb
/usr/bin/lsusb.py
[ian@attic4-cent ~]$ locate '\/bin/ls'
/bin/ls
[ian@attic4-cent ~]$ locate '/bin/ls*'
/bin/ls
/bin/lsblk
/bin/lscgroup
/bin/lssubsys
The updatedb command
The default database used by locate is stored in the /var filesystem, in a location such as /var/lib/locatedb. This may be different on systems that use slocate or mlocate packages to provide additional security or speed. You can find statistics on your locate database using locate -S as shown in Listing 13.
Listing 13. Locatedb statistics
[ian@attic4-cent ~]$ #CentOS 6
[ian@attic4-cent ~]$ locate -S
Database /var/lib/mloc/mlocate.db:
77,143 directories
776,460 files
56,872,710 bytes in file names
21,074,461 bytes used to store database
The database is created or updated using the updatedb command. This is usually run daily as a cron job. The file /etc/updatedb.conf, or sometimes /etc/sysconfig/locate, is the configuration file for updatedb. To enable daily updates, the root user needs to edit /etc/updatedb.conf and set DAILY_UPDATE=yes. To create the database immediately, run the updatedb command as root.
Other considerations for using locate include security considerations and network file I/O considerations for daily builds of the locate database by updatedb. Check the man pages and updatedb configuration files for more details.
FHS directories in the root filesystem
The FHS goal is to keep the root filesystem as small as possible. However, it must contain all the files necessary to boot, restore, recover, or repair the system, including the utilities that an experienced administrator would need for these tasks. Note that booting a system requires that enough files be on the root filesystem to permit mounting of other filesystems.
Table 2 shows the purpose of the directories that the FHS requires in the root (or /) filesystem. Either the directory or a symbolic link to it must be present, except for those marked as optional, which are required only if the corresponding subsystem is present.
Table 2. FHS root filesystem
|bin||Essential command binaries|
|boot||Static files of the boot loader|
|etc||Host-specific system configuration|
|lib||Essential shared libraries and kernel modules|
|media||Mount point for removable media|
|mnt||Mount point for mounting a filesystem temporarily|
|opt||Add-on application software packages|
|sbin||Essential system binaries|
|srv||Data for services provided by this system|
|home||User home directories (optional)|
|lib<qual>||Alternate format essential shared libraries (optional)|
|root||Home directory for the root user (optional)|
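To see how a particular system measures up, you can check for the required directories listed in Table 2. The following is a minimal sketch, not a compliance test. The function name check_fhs_dirs is hypothetical, and the list contains only the non-optional names from the table above.

```shell
# check_fhs_dirs: report which of the required top-level directories from
# Table 2 exist under a given root (default /). A symbolic link that points
# to a directory counts as present, as the FHS allows.
# (Hypothetical helper name used only for this example.)
check_fhs_dirs() {
    local root="${1:-/}" name
    for name in bin boot etc lib media mnt opt sbin srv; do
        if [ -d "${root%/}/$name" ]; then
            printf '%-6s present\n' "$name"
        else
            printf '%-6s missing\n' "$name"
        fi
    done
}

check_fhs_dirs
```

Passing a different root, as in check_fhs_dirs /mnt/sysimage, lets you inspect a mounted system image instead of the running system. The path shown is only an example.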
/usr and /var hierarchies
The /usr and /var hierarchies are complex enough to have complete sections of the FHS devoted to them. The /usr filesystem is the second major section of the filesystem, containing shareable, read-only data. It can be shared between systems, although present practice does not often do this.
The /var filesystem contains variable data files, including spool directories and files, administrative and logging data, and transient and temporary files. Some portions of /var are not shareable between different systems, but others, such as /var/mail, /var/cache/man, /var/cache/fonts, and /var/spool/news, may be shared.
Note: Version 3.0 of the FHS moved the /var/run hierarchy to a separate /run hierarchy. The /run directory contains system information describing the system since it was booted. Files under this directory must be removed or truncated at boot time. Programs may continue to use /var/run for the purposes of backwards compatibility but should migrate to using /run.
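Many distributions keep /var/run working by making it a symbolic link to /run, although that is a distribution choice and not something you can assume. The sketch below checks whether two paths resolve to the same directory. The function name same_dir is hypothetical, and readlink -f from GNU coreutils is assumed.

```shell
# same_dir: succeed if both paths are directories that resolve to the same place.
# (Hypothetical helper name; relies on readlink -f from GNU coreutils.)
same_dir() {
    [ -d "$1" ] && [ -d "$2" ] &&
        [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

if same_dir /var/run /run; then
    echo "/var/run resolves to /run"
else
    echo "/var/run and /run are separate, or one of them is missing"
fi
```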
To fully understand the standard, read the FHS document (see Resources).
This completes your introduction to file placement on Linux.