Prepare for LPIC-1 exam 1 - topic 104.7: Find and place system files

Overview

In this tutorial, learn about the Filesystem Hierarchy Standard (FHS). Learn to:

Recognize where to place files under the FHS
Find files and commands on your Linux system
Find other important files and directories defined in the FHS and understand their purposes

This tutorial helps you prepare for Objective 104.7 in Topic 104 of the Linux Server Professional (LPIC-1) exam 101. The objective has a weight of 2.

Bringing organization to a multitude of Linux distributions

With well over 200 Linux distributions available and several UNIX® variants, all having a lot of things in common, it is desirable to have a standard way to place files in the system. This helps new users feel more comfortable with a new system, helps experienced users manage a system, and eases the task of package distributors. The Filesystem Hierarchy Standard is a document that specifies a common layout of directories on a Linux or other UNIX-like system. Version 3.0 was announced on June 3, 2015, replacing the earlier Version 2.3 announced on January 29, 2004.

Prerequisites

To get the most from the tutorials in this series, you should have a basic knowledge of Linux and a working Linux system on which you can practice the commands covered in this tutorial. Unless otherwise noted, the examples in this tutorial use Fedora with a 4.1.3 kernel.

Sometimes different versions of a program will format output differently, so your results may not always look exactly like the listings and figures shown here. In particular, much of the output we show is highly dependent on the packages that are already installed on our systems. Your own output may be quite different, although you should be able to recognize the important commonalities.

Filesystem Hierarchy Standard

By placing files in the same general place across Linux distributions, the FHS simplifies distribution-independent software development. The FHS is also used in the Linux Standard Base. The FHS allows both users and software to predict the location of installed files and directories. An FHS-compliant filesystem assumes that the operating system supports the basic security features found in most UNIX filesystems.

The two independent FHS categories

At the core of the FHS are two independent characteristics of files:

Shareable vs. unshareable
Shareable files can be located on one system and used on another, while unshareable files must reside on the system on which they are used.

Static vs. variable
Static files change only through system administrator intervention, such as installing or upgrading a package, and include documentation, libraries, and binaries. Variable files are all other files, such as logs, spool files, databases, and user data, which are subject to change by users and by system processes.

These distinctions allow files with different sets of characteristics to be stored on different filesystems. Table 1 is an example from the FHS document showing a layout that would be FHS-compliant.

Table 1. FHS example

	Shareable	Unshareable
Static	/usr /opt	/etc /boot
Variable	/var/mail /var/spool/news	/var/run /var/lock

Linux systems often contain hundreds of thousands of files. A 64-bit Fedora 22 system that I recently installed has over 170,000 files in the /usr hierarchy alone.


[root@atticf22 ~]# find /usr -type f | wc -l
174182

Most of my other installations have over 100,000 files and often 200,000 files or more. The next few sections look at tools to help you find files, particularly programs, in this vast sea of data.

Your PATH to executable files

If you have used several Linux systems, you may have noticed that when you log in as root, you can execute some commands that you cannot execute as an ordinary user, and that some commands can be executed by ordinary users but not by root. When you run a program at the command line, the bash (or other) shell searches through a list of directories to find the program you requested. The list of directories is specified in your PATH environment variable. On older systems, it was common for root's path to include /sbin, while non-root user paths did not. Listing 1 shows user and root path examples from three different distributions.

Listing 1. Some PATH examples


[ian@attic4-cent ~]$ # CentOS 6
[ian@attic4-cent ~]$ echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/ian/bin
[ian@attic4-cent ~]$ su -
Password: 
[root@attic4-cent ~]# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin

[ian@atticf22 ~]$ # Fedora 22
[ian@atticf22 ~]$ echo $PATH
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin
[ian@atticf22 ~]$ su -
Password: 
[root@atticf22 ~]# echo $PATH
/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
[root@atticf22 ~]# ls -l /sbin
lrwxrwxrwx. 1 root root 8 Aug 16  2014 /sbin -> usr/sbin

ian@attic-mint17 ~ $ # Linux Mint 17.2
ian@attic-mint17 ~ $ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
ian@attic-mint17 ~ $ sudo -s
attic-mint17 ~ # echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
attic-mint17 ~ # fortune
Command 'fortune' is available in '/usr/games/fortune'
The command could not be located because '/usr/games' is not included in the PATH environment variable.
fortune: command not found

As you can see, the PATH variable is just a list of directory names, separated by colons. Note the differences between the user path and root path order for CentOS and the two extra games directories in the Linux Mint user path. Since the fortune game command is actually located in /usr/games/fortune, the Mint root user will have to stick to work instead of playing games unless he or she provides a fully qualified path name. On systems where the /sbin directory was not included in user paths, it was common for programs such as tune2fs to only be found by the root user.
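
Because the path is a colon-separated list, it can be easier to read the search order with one directory per line. The following is a minimal sketch; the sample_path value is a made-up example, and you would substitute "$PATH" to inspect your own setting.

```shell
# Split a colon-separated path list into one directory per line.
# sample_path is a hypothetical value; use "$PATH" to inspect your own path.
sample_path="/usr/local/bin:/usr/bin:/bin:/usr/games"

# tr replaces each colon with a newline, preserving the search order.
path_lines=$(printf '%s\n' "$sample_path" | tr ':' '\n')
printf '%s\n' "$path_lines"
```

The shell searches these directories from top to bottom and runs the first match it finds.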

Usually, your path is set in an initialization file such as .bash_profile or .bashrc. You can change it for the current bash process by specifying a new path. Remember to export the PATH variable if you want the new value to be available to other processes that you start from this one. Suppose I am developing a "Hello World" application in my test-hello directory. Listing 2 shows how to add it to my path and run my hello executable without having to give a fully qualified path each time.

Listing 2. Changing your PATH


[ian@atticf22 ~]$ hello
bash: hello: command not found...
[ian@atticf22 ~]$ test-hello/hello
Hello World!
[ian@atticf22 ~]$ export PATH=~/test-hello:$PATH
[ian@atticf22 ~]$ hello
Hello World!
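
If you put a line like this in an initialization file, the same directory can be added again each time the file is read. One way to avoid that is to check before prepending. This is only a sketch: path_prepend is a hypothetical helper name, not a standard command.

```shell
# Prepend a directory to PATH only if it is not already listed.
# path_prepend is a hypothetical helper name used for this sketch.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;       # otherwise put it at the front
    esac
    export PATH
}

path_prepend "$HOME/test-hello"
path_prepend "$HOME/test-hello"     # the second call changes nothing
echo "$PATH"
```

Wrapping PATH in colons before matching means the first and last entries are compared in the same way as entries in the middle.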

The which, type, and whereis commands

In the previous section, you saw why the hello command might not be available if you attempted to run it. However, there are several other useful commands that can help you find which command would run if you typed a command name.

The which command

You can use the which command to search your path and find out which command will be executed (if any) when you type a command. Listing 3 shows an example of finding the hello command before and after we update the path.

Listing 3. Using which


[ian@atticf22 ~]$ echo $PATH
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/ian/.local/bin:/home/ian/bin
[ian@atticf22 ~]$ which hello
/usr/bin/which: no hello in (/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:
/home/ian/.local/bin:/home/ian/bin)
[ian@atticf22 ~]$ export PATH=~/test-hello:$PATH
[ian@atticf22 ~]$ echo $PATH
/home/ian/test-hello:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:
/home/ian/.local/bin:/home/ian/bin
[ian@atticf22 ~]$ which hello
~/test-hello/hello

The which command shows you the first occurrence of a command in your path. If you want to know if there are multiple occurrences, then add the -a option as shown in Listing 4.

Listing 4. Using which to find multiple occurrences


[ian@atticf22 ~]$ which awk
/usr/bin/awk
[ian@atticf22 ~]$ which -a awk
/usr/bin/awk
/bin/awk

Here we find the awk command in two places: in /usr/bin (which is the main directory for commands on the system), and in /bin (which contains commands that may be used both by the system administrator and by users, but which are required when no other filesystems are mounted).

Another tutorial in this series, "Learn Linux 101: Create and change hard and symbolic links," shows you how to check that these two files both eventually represent the same underlying gawk command, as shown in Listing 5.

Listing 5. Awk commands lead to gawk


[ian@atticf22 ~]$ ls -l $(which -a awk)
lrwxrwxrwx. 1 root root 4 Jan  2  2015 /bin/awk -> gawk
lrwxrwxrwx. 1 root root 4 Jan  2  2015 /usr/bin/awk -> gawk
[ian@atticf22 ~]$ ls -ld /bin /usr/bin
lrwxrwxrwx. 1 root root     7 Aug 16  2014 /bin -> usr/bin
dr-xr-xr-x. 2 root root 69632 Aug 10 10:40 /usr/bin

On this Fedora 22 system, /bin is actually a symbolic link to /usr/bin, which may surprise you. This is why which -a reports awk under both directories.

The type command

There are some commands that the which command will not find, such as shell builtins. The type command is a builtin that will tell you how a given command string will be evaluated for execution. Listing 6 uses which and type to show that the type command is not an executable found on your path, but is a shell builtin.

Listing 6. Using type


[ian@atticf22 ~]$ which type
/usr/bin/which: no type in (/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:
/home/ian/.local/bin:/home/ian/bin)
[ian@atticf22 ~]$ type type
type is a shell builtin
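
In scripts, the POSIX command -v builtin answers a similar question in a form that is easy to test. The sketch below is one possible approach, not the only one; describe_cmd is a hypothetical helper name, and the example assumes that sh is installed as an executable somewhere on your path.

```shell
# Classify a command name using the POSIX "command -v" builtin.
# describe_cmd is a hypothetical helper name used only for this sketch.
describe_cmd() {
    # command -v prints how the shell would resolve the name, or fails.
    resolved=$(command -v "$1") || { echo "$1: not found"; return 1; }
    case "$resolved" in
        /*) echo "$1: executable file $resolved" ;;
        *)  echo "$1: handled by the shell itself (builtin, function, alias, or keyword)" ;;
    esac
}

describe_cmd cd
describe_cmd sh
describe_cmd no-such-command-here || true
```

A result that starts with a slash is a file found through your path; anything else is resolved inside the shell, which is why which cannot find it.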

The whereis command

If you want more information than just the location of a program, you can use the whereis command. For example, you can find the man pages or other information, as shown in Listing 7.

Listing 7. Using whereis to find man pages


[ian@atticf22 ~]$ whereis awk
awk: /usr/bin/awk /usr/libexec/awk /usr/share/awk /usr/share/man/man1p/awk.1p.gz 
/usr/share/man/man1/awk.1.gz

Note that the copy of awk in /bin was not found by whereis. The directories used by whereis are fixed, so the command may not always find what you are looking for. The whereis command can also search for source files, specify alternate search paths, and search for unusual entries. Consult the man pages to see how to override this behavior or change the fixed paths used by whereis.

The find command

In an earlier tutorial in this series, "Learn Linux 101: File and directory management," you learned how to find files based on name (including wildcards), path, size, or timestamp. In another earlier tutorial in this series, "Learn Linux 101: Create and change hard and symbolic links," you learned how to find the links to a particular file or inode.

The find command is the Swiss Army knife of file-searching tools on Linux systems. Two other capabilities that you may find useful are its ability to find files based on user or group name and its ability to find files based on permissions.

Suppose you want to see what files a user has in the /tmp hierarchy. Listing 8 shows how the root user could find all the files for user ian in /tmp.

Listing 8. Finding files by user and group


[root@atticf22 ~]# find /tmp -user ian
/tmp/jna-104022
...
/tmp/hsperfdata_ian
/tmp/hsperfdata_ian/2390
/tmp/tracker-extract-files.1000
/tmp/.esd-1000
/tmp/.esd-1000/socket
/tmp/.X11-unix/X0
/tmp/.ICE-unix/1668

You can also find files by group using the -group test. And you can find files that do not belong to any user or group on the system using the -nouser and -nogroup options. As with other tests, you can negate the test using !. I usually set my user number to 1000, as that is the default on many systems. I also create a group called ian with 1000 as the group number if the system does not automatically do that. Other systems still start at 500, or put new users in the group 'users' by default. Some of my older research material that was archived from a Red Hat 6.2 system still has user 500.

Listing 9 shows how to find some directories that are not owned by my current group. The research/rh62/involution directory is owned by user 500, which does not exist on my current system, and by group 4, which this system displays as adm. To find files or directories by numeric user ID or group ID, use the -uid or -gid tests.

Listing 9. Finding directories not owned by ian


[ian@atticf22 ~]$ find -L research -maxdepth 2 -type d ! -group ian
research/rh62/involution
research/rh62/programs
research/lost+found
find: ‘research/lost+found’: Permission denied
[ian@atticf22 ~]$ ls -ld research/rh62/involution
drwxr-xr-x. 2 500 adm 4096 Nov 10  1999 research/rh62/involution
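
You can try the numeric -uid test safely in a scratch directory. The following is a small sketch under the assumption that mktemp and id are available; the file names are made up for illustration.

```shell
# Build a scratch directory and search it by numeric owner.
# The file names here are hypothetical.
scratch=$(mktemp -d)
touch "$scratch/notes.txt" "$scratch/report.txt"

# Files owned by the current numeric user ID.
mine=$(find "$scratch" -type f -uid "$(id -u)" | sort)

# Negated test: files NOT owned by that ID. This is empty here,
# because the current user created everything in the directory.
not_mine=$(find "$scratch" -type f ! -uid "$(id -u)")

printf 'owned by me:\n%s\n' "$mine"
printf 'owned by others: [%s]\n' "$not_mine"
rm -rf "$scratch"
```

The same pattern works with -gid and "$(id -g)" if you want to search by numeric group instead.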

To find files by permission, you can use the -perm test along with symbolic expressions similar to those used with the chmod or umask commands. You can search for exact permissions, but it is often more useful to prefix the permission expression with a hyphen to indicate that you want files with those permissions set, but that you don't care about other permissions. Listing 10 illustrates how to find files that are executable by user, group, and everyone, and two different ways of finding files that are not readable by others.

Listing 10. Finding files by permission


[ian@atticf22 ~]$ find . -maxdepth 1 -type f -perm -uga=x
./hello
[ian@atticf22 ~]$ find . -maxdepth 1 ! -perm -o=r
.
./.ssh
./.ICEauthority
./.bash_history
./.config
...
./.kde
./.cache
[ian@atticf22 ~]$ find . -maxdepth 1 ! -perm -0004
.
./.ssh
./.ICEauthority
./.bash_history
./.config
...
./.kde
./.cache
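
To see the difference between an exact permission match and the hyphen-prefixed form, it can help to experiment on files whose modes you set yourself. This is a minimal sketch using a scratch directory; the file names and modes are chosen only for illustration.

```shell
# Create three files with known modes in a scratch directory.
# The file names are hypothetical.
scratch=$(mktemp -d)
touch "$scratch/script.sh" "$scratch/private.txt" "$scratch/public.txt"
chmod 755 "$scratch/script.sh"
chmod 600 "$scratch/private.txt"
chmod 644 "$scratch/public.txt"

# Exact match: the mode must be exactly 644.
exact=$(find "$scratch" -type f -perm 644)

# Hyphen prefix: every listed bit must be set; other bits are ignored.
exec_all=$(find "$scratch" -type f -perm -a=x)

# Negated test: files that others cannot read.
no_other_read=$(find "$scratch" -type f ! -perm -o=r)

printf 'exact 644:       %s\n' "${exact##*/}"
printf 'executable all:  %s\n' "${exec_all##*/}"
printf 'no read (other): %s\n' "${no_other_read##*/}"
rm -rf "$scratch"
```

Here only public.txt matches the exact mode, only script.sh is executable by everyone, and only private.txt is unreadable by others.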

We have covered several major types of search that you can do with the find command. To further narrow your output, you can combine multiple expressions, and you can add regular expressions to the mix. To learn more about this versatile command, use the man page, or better, use info find if you have the info system installed.

Listing 11 shows a final example of searching with find. This example does a cd to /usr/include to keep the listing length manageable, then finds all files containing packet in their path name without regard to case. The second example further restricts this output to files that are not directories and that are at least 1500 bytes in size. Actual output on your system may differ depending on which packages you have installed.

Listing 11. A final example of find


[ian@atticf22 ~]$ cd /usr/include
[ian@atticf22 include]$ find . -iregex ".*packet.*"
./net/if_packet.h
./linux/packet_diag.h
./linux/if_packet.h
./netpacket
./netpacket/packet.h
[ian@atticf22 include]$ find . -iregex ".*packet.*" ! -type d -size +1500c
./linux/packet_diag.h
./linux/if_packet.h
./netpacket/packet.h

Note that the regular expression must match the full path returned by find, and remember the difference between regular expressions and wildcards.
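
The difference is easiest to see side by side. The sketch below assumes a find that supports the -iname and -iregex tests, as GNU find does; the directory and file names are invented for the example.

```shell
# Compare a wildcard test with a regular expression test.
# The directory and file names are hypothetical.
scratch=$(mktemp -d)
mkdir "$scratch/netpacket"
touch "$scratch/netpacket/packet.h" "$scratch/netpacket/types.h" \
      "$scratch/if_packet.h" "$scratch/socket.h"

# Wildcard: compared with the base name only; * means "any characters".
by_name=$(cd "$scratch" && find . -iname '*packet*' | sort)

# Regular expression: must match the whole path; .* means "any characters".
# This also finds types.h, because its path contains "netpacket".
by_regex=$(cd "$scratch" && find . -iregex '.*packet.*' | sort)

# Without the surrounding .* nothing matches: no full path is just "packet".
bare=$(cd "$scratch" && find . -iregex 'packet')

printf 'wildcard:\n%s\n\nregex:\n%s\n\nbare regex: [%s]\n' "$by_name" "$by_regex" "$bare"
rm -rf "$scratch"
```

The wildcard search returns three paths and the regular expression search returns four, while the bare expression returns nothing.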

The locate and updatedb commands

The find command searches all the directories you specify, every time you run it. To speed things up, you can use another command, locate, which uses a database of stored path information rather than searching the filesystem every time.

The locate command

The locate command searches for matching files in a database that is usually updated daily by a cron job.

The locate command matches against any part of a path name, not just the file name. Put the file name in single quotes and include at least one globbing character to match more precisely. Listing 12 shows how to find paths containing the string /bin/ls, and shows two examples of using globbing characters to restrict the output.

Listing 12. Using locate to find paths and restrict output


[ian@attic4-cent ~]$ # CentOS 6
[ian@attic4-cent ~]$ locate /bin/ls
/bin/ls
/bin/lsblk
/bin/lscgroup
/bin/lssubsys
/usr/bin/lsattr
/usr/bin/lsb_release
/usr/bin/lscpu
/usr/bin/lsdiff
/usr/bin/lshal
/usr/bin/lslogins
/usr/bin/lsusb
/usr/bin/lsusb.py
[ian@attic4-cent ~]$ locate '\/bin/ls'
/bin/ls
[ian@attic4-cent ~]$ locate '/bin/ls*'
/bin/ls
/bin/lsblk
/bin/lscgroup
/bin/lssubsys

The updatedb command

The default database used by locate is stored in the /var filesystem, in a location such as /var/lib/locatedb. This may be different on systems that use slocate or mlocate packages to provide additional security or speed. You can find statistics on your locate database using locate -S as shown in Listing 13.

Listing 13. Locatedb statistics


[ian@attic4-cent ~]$ # CentOS 6
[ian@attic4-cent ~]$ locate -S
Database /var/lib/mloc/mlocate.db:
 77,143 directories
 776,460 files
 56,872,710 bytes in file names
 21,074,461 bytes used to store database

The database is created or updated using the updatedb command. This is usually run daily as a cron job. The file /etc/updatedb.conf, or sometimes /etc/sysconfig/locate, is the configuration file for updatedb. On some systems, the root user needs to edit /etc/updatedb.conf and set DAILY_UPDATE=yes to enable daily updates. To create the database immediately, run the updatedb command as root.

Other considerations for using locate include security and the network file I/O generated by daily rebuilds of the locate database. Check the man pages and updatedb configuration files for more details.

FHS directories in the root filesystem

The FHS goal is to keep the root filesystem as small as possible. However, it must contain all the files necessary to boot, restore, recover, or repair the system, including the utilities that an experienced administrator would need for these tasks. Note that booting a system requires that enough files be on the root filesystem to permit mounting of other filesystems.

Table 2 shows the purpose of the directories that the FHS requires in the root (or /) filesystem. Either the directory or a symbolic link to it must be present, except for those marked as optional, which are required only if the corresponding subsystem is present.

Table 2. FHS root filesystem

Directory	Purpose
bin	Essential command binaries
boot	Static files of the boot loader
dev	Device files
etc	Host-specific system configuration
lib	Essential shared libraries and kernel modules
media	Mount point for removable media
mnt	Mount point for mounting a filesystem temporarily
opt	Add-on application software packages
run	Data relevant to running processes
sbin	Essential system binaries
srv	Data for services provided by this system
tmp	Temporary files
usr	Secondary hierarchy
var	Variable data
home	User home directories (optional)
lib<qual>	Alternate format essential shared libraries (optional)
root	Home directory for the root user (optional)

/usr and /var hierarchies

The /usr and /var hierarchies are complex enough to have complete sections of the FHS devoted to them. The /usr filesystem is the second major section of the filesystem, containing shareable, read-only data. It can be shared between systems, although present practice does not often do this.

The /var filesystem contains variable data files, including spool directories and files, administrative and logging data, and transient and temporary files. Some portions of /var are not shareable between different systems, but others, such as /var/mail, /var/cache/man, /var/cache/fonts, and /var/spool/news, may be shared.

Note: Version 3.0 of the FHS moved the /var/run hierarchy to a separate /run hierarchy. The /run directory contains system information describing the system since it was booted. Files under this directory must be removed or truncated at boot time. Programs may continue to use /var/run for backwards compatibility but should migrate to using /run.
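
A script that has to run on both older and newer systems can check which directory is present before using it. This is only a sketch of one possible approach; runtime_dir is a hypothetical helper name.

```shell
# Pick a runtime-state directory, preferring /run over the older /var/run.
# runtime_dir is a hypothetical helper name used for this sketch.
runtime_dir() {
    if [ -d /run ]; then
        echo /run
    elif [ -d /var/run ]; then
        echo /var/run
    else
        return 1        # neither directory exists on this system
    fi
}

runtime_dir
ls -ld /run /var/run 2>/dev/null
```

On many current systems the ls output shows /var/run as a symbolic link to /run, so both names lead to the same files.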

To fully understand the standard, read the FHS document.

This completes your introduction to file placement on Linux.

27 January 2016
