Almost Everything is Represented by a File

May 25, 2011

Device Files

Besides the usual files for documents, executables, music, databases, configuration and so on, Unix-based operating systems represent many other things as files. For example, most hardware devices are exposed through what appear to be ordinary files.

In a system with a single SATA hard drive, the drive might be represented by the file /dev/sda. Each of its partitions gets a device file of the same name with the partition number added as a suffix, so the first three primary partitions will be /dev/sda1, /dev/sda2 and /dev/sda3.

When accessing these files you are effectively accessing the raw data on the drive.
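As a small illustration of raw access, the sketch below checks whether the first 512-byte sector of a disk ends with the boot signature bytes 55 aa, which is what MBR-partitioned disks carry. The helper name has_boot_signature is made up for this example, and reading a real device such as /dev/sda normally requires root.

```shell
# has_boot_signature FILE: succeed if bytes 510-511 of FILE are 55 aa.
# Hypothetical helper; FILE can be a device such as /dev/sda or a disk image.
has_boot_signature() {
    # Read exactly two bytes at offset 510 and print them as hex digits.
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

It could be used as: sudo sh -c '. ./helpers.sh; has_boot_signature /dev/sda && echo "boot signature found"', where helpers.sh is a placeholder for wherever the function is saved.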

If you have a CD-ROM drive, there will be a device file for it as well. Most of the time a symlink to this device file is created at /dev/cdrom, to make access to the drive more generic. The same goes for /dev/dvd, /dev/dvdrw and /dev/cdrw, the last two being for writing to DVDs or CDs. In my case the actual CD/DVD device is /dev/sr0, which is where all of these links point. If I wanted to create an ISO image of the disc in the drive, I would simply need to copy whatever data is available in the “file” at /dev/cdrom, since that is all an ISO image really is (a byte-for-byte copy of the data on the disc).

So to create this ISO, I can run the following command:

dd if=/dev/cdrom of=mycd.iso

This command will read every byte of data from the CDROM disc via the device file at /dev/cdrom, and write it into the filesystem file mycd.iso in the current directory.
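By default dd copies in 512-byte blocks, so a larger block size usually speeds things up, and comparing the result against the source afterwards is a cheap sanity check. A minimal sketch of both, wrapped in a helper whose name (copy_image) is my own invention:

```shell
# copy_image SOURCE DEST: copy SOURCE into DEST in 64 KiB blocks, then
# compare the two byte for byte. Succeeds only if both steps succeed.
copy_image() {
    dd if="$1" of="$2" bs=64k 2>/dev/null && cmp "$1" "$2"
}
```

For the disc above this would be called as copy_image /dev/cdrom mycd.iso.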

When it’s done I can mount the ISO image as follows:

mkdir /tmp/isomount ; sudo mount -o loop mycd.iso /tmp/isomount

Proc Filesystems

On practically every Linux distribution you will find the directory /proc, a virtual filesystem created and managed by the procfs filesystem driver. Some kernel modules and drivers also expose virtual files through it. The purpose is to provide an interface into parts of the kernel and these modules/drivers without having to use a complicated API, so that scripts and programs can access them with little effort.

For instance, every running process has a directory in /proc named after its PID. These are the directories in /proc whose names consist only of digits; the highest possible PID is set by /proc/sys/kernel/pid_max, which has traditionally defaulted to 32768. To see this in action, we execute the ps command and select an arbitrary process. For this example we’ll pick the process with PID 16623, which looks like:
16623 ? Sl 1:55 /usr/lib/chromium-browser/chromium-browser
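The same list of PIDs that ps works from can be collected straight from the directory names. A rough sketch, assuming that every entry in /proc whose name starts with a digit is a process directory (the function name list_pids is just for illustration):

```shell
# list_pids: print the PID of every process visible in /proc, one per line.
list_pids() {
    for dir in /proc/[0-9]*; do
        # Skips the unexpanded pattern if nothing matched, and any non-directory.
        if [ -d "$dir" ]; then
            printf '%s\n' "${dir#/proc/}"
        fi
    done
}
```

Running list_pids | wc -l then gives the number of processes currently visible.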

So, when listing the contents of the directory at /proc/16623 we see many virtual files.
quintin@quintin-VAIO:~$ cd /proc/16623
quintin@quintin-VAIO:/proc/16623$ ls -l
total 0
dr-xr-xr-x 2 quintin quintin 0 2011-05-21 19:48 attr
-r-------- 1 quintin quintin 0 2011-05-21 19:48 auxv
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 cgroup
--w------- 1 quintin quintin 0 2011-05-21 19:48 clear_refs
-r--r--r-- 1 quintin quintin 0 2011-05-21 18:05 cmdline
-rw-r--r-- 1 quintin quintin 0 2011-05-21 19:48 coredump_filter
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 cpuset
lrwxrwxrwx 1 quintin quintin 0 2011-05-21 19:48 cwd -> /home/quintin
-r-------- 1 quintin quintin 0 2011-05-21 19:48 environ
lrwxrwxrwx 1 quintin quintin 0 2011-05-21 19:48 exe -> /usr/lib/chromium-browser/chromium-browser
dr-x------ 2 quintin quintin 0 2011-05-21 16:30 fd
dr-x------ 2 quintin quintin 0 2011-05-21 19:48 fdinfo
-r--r--r-- 1 quintin quintin 0 2011-05-21 16:36 io
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 latency
-r-------- 1 quintin quintin 0 2011-05-21 19:48 limits
-rw-r--r-- 1 quintin quintin 0 2011-05-21 19:48 loginuid
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 maps
-rw------- 1 quintin quintin 0 2011-05-21 19:48 mem
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 mountinfo
-r--r--r-- 1 quintin quintin 0 2011-05-21 16:30 mounts
-r-------- 1 quintin quintin 0 2011-05-21 19:48 mountstats
dr-xr-xr-x 6 quintin quintin 0 2011-05-21 19:48 net
-rw-r--r-- 1 quintin quintin 0 2011-05-21 19:48 oom_adj
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 oom_score
-r-------- 1 quintin quintin 0 2011-05-21 19:48 pagemap
-r-------- 1 quintin quintin 0 2011-05-21 19:48 personality
lrwxrwxrwx 1 quintin quintin 0 2011-05-21 19:48 root -> /
-rw-r--r-- 1 quintin quintin 0 2011-05-21 19:48 sched
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 schedstat
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 sessionid
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 smaps
-r-------- 1 quintin quintin 0 2011-05-21 19:48 stack
-r--r--r-- 1 quintin quintin 0 2011-05-21 16:30 stat
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 statm
-r--r--r-- 1 quintin quintin 0 2011-05-21 18:05 status
-r-------- 1 quintin quintin 0 2011-05-21 19:48 syscall
dr-xr-xr-x 22 quintin quintin 0 2011-05-21 19:48 task
-r--r--r-- 1 quintin quintin 0 2011-05-21 19:48 wchan

From this file listing I can immediately see the owner of the process is the user quintin, since all the files are owned by this user.

I can also determine that the executable being run is /usr/lib/chromium-browser/chromium-browser since that is what the exe symlink points to.
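Both observations can be scripted instead of read off a listing. The sketch below assumes a stat that understands -c (GNU coreutils or BusyBox); the name describe_process is hypothetical. Reading the exe link of another user's process generally needs root.

```shell
# describe_process PID: print the owner and executable of a process.
describe_process() {
    pid=$1
    # The owner of the /proc/PID directory is the user the process runs as.
    owner=$(stat -c %U "/proc/$pid") || return 1
    # exe is a symlink to the program file being executed.
    exe=$(readlink "/proc/$pid/exe") || return 1
    printf 'owner=%s exe=%s\n' "$owner" "$exe"
}
```

Calling describe_process "$$" describes the shell that is running the script.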

If I want to see the command line, I can view the contents of the cmdline file, for example:

quintin@quintin-VAIO:/proc/16623$ cat cmdline
/usr/lib/chromium-browser/chromium-browser --enable-extension-timeline-api
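One detail that plain cat hides: the arguments in cmdline are separated by NUL bytes rather than spaces. A script that wants a readable line can translate them, as in this small sketch (show_cmdline is a made-up name). Note that after the translation an argument containing spaces can no longer be told apart from two separate arguments.

```shell
# show_cmdline PID: print a process's arguments separated by spaces.
show_cmdline() {
    # Replace every NUL separator with a space.
    tr '\000' ' ' < "/proc/$1/cmdline" || return 1
    # cmdline has no trailing newline, so add one.
    echo
}
```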

More Proc Magic

To give another example of /proc files, consider the netstat command. If I wanted to see all open IPv4 TCP sockets, I would run:

netstat -antp4

Though, if I am writing a program that needs this information, I can read the raw data from the file at /proc/net/tcp.
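The raw data takes a little decoding: each line holds the local and remote address as hexadecimal IP:PORT pairs, followed by a hexadecimal state code, where 0A means LISTEN. Here is a sketch that prints the listening sockets. It assumes a little-endian machine such as x86, where the four address bytes appear in reverse order, and the function names are my own.

```shell
# decode_addr HEXIP:HEXPORT: turn e.g. 0100007F:1F90 into 127.0.0.1:8080.
# Assumes a little-endian machine, where the address bytes appear reversed.
decode_addr() {
    ip=${1%:*}
    port=${1#*:}
    b1=$(printf '%s' "$ip" | cut -c7-8)
    b2=$(printf '%s' "$ip" | cut -c5-6)
    b3=$(printf '%s' "$ip" | cut -c3-4)
    b4=$(printf '%s' "$ip" | cut -c1-2)
    # printf converts the 0x-prefixed hex strings to decimal.
    printf '%d.%d.%d.%d:%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4" "0x$port"
}

# listening_sockets [FILE]: print the local address of every socket in
# state 0A (LISTEN). FILE defaults to /proc/net/tcp.
listening_sockets() {
    while read -r slot local_addr remote_addr state rest; do
        # The header line has "st" in this column, so it never matches.
        if [ "$state" = "0A" ]; then
            decode_addr "$local_addr"
        fi
    done < "${1:-/proc/net/tcp}"
}
```

The optional FILE argument is only there so the function can be tried against a saved copy of the file.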

What if I need to get details on the system’s CPU and its capabilities? I can read /proc/cpuinfo.

Or if you need the load average information in a script/program, you can read it from /proc/loadavg. The same file also contains the PID of the most recently created process.
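/proc/loadavg is a single line of five fields: the 1, 5 and 15 minute load averages, the number of running versus total scheduling entities, and the most recently assigned PID. That makes it easy to pick apart with read, as in this sketch (load_summary is a hypothetical name):

```shell
# load_summary [FILE]: print the 1-minute load average and the most recent PID.
# FILE defaults to /proc/loadavg.
load_summary() {
    read -r one five fifteen tasks last_pid < "${1:-/proc/loadavg}" || return 1
    printf '1-minute load: %s, last PID: %s\n' "$one" "$last_pid"
}
```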

Those who have messed around with Linux networking would probably recognize the sysctl command, which allows you to view and set some kernel parameters. All of these parameters are also accessible via the proc filesystem, under /proc/sys. For example, if you want to view the state of IPv4 forwarding, you can do it with sysctl as follows:

sysctl net.ipv4.ip_forward

Alternatively, you can read the contents of the file at /proc/sys/net/ipv4/ip_forward:

cat /proc/sys/net/ipv4/ip_forward
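The mapping between the two is mechanical: the dots in the sysctl name become slashes under /proc/sys. A sketch of that translation follows; sysctl_path is a name I made up, and it assumes the parameter name itself contains no dots inside a component.

```shell
# sysctl_path NAME: map a dotted sysctl name to its file under /proc/sys.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Reading:  cat "$(sysctl_path net.ipv4.ip_forward)"
# Writing (as root) has the same effect as "sysctl -w net.ipv4.ip_forward=1":
#   echo 1 | sudo tee "$(sysctl_path net.ipv4.ip_forward)"
```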

Sys Filesystem

Similar to the proc filesystem is the sys filesystem. It is much more organized and, as I understand it, is intended to supersede /proc and hopefully one day replace it completely. For the time being, though, we have both.

Since the two work in very much the same way, I’ll just give some interesting examples found in /sys.

To read the MAC address of your eth0 network device, see /sys/class/net/eth0/address.
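Since the interface is not always called eth0, a script may prefer to walk every entry in /sys/class/net. A minimal sketch, with list_macs being a hypothetical name and the optional directory argument existing only to make the function easy to try out:

```shell
# list_macs [DIR]: print "interface address" for every network interface.
# DIR defaults to /sys/class/net.
list_macs() {
    for dev in "${1:-/sys/class/net}"/*; do
        if [ -r "$dev/address" ]; then
            printf '%s %s\n' "${dev##*/}" "$(cat "$dev/address")"
        fi
    done
}
```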

To read the size of the second partition of your /dev/sda block device, see /sys/class/block/sda2/size.
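Converting that value to bytes is a single multiplication, since the kernel reports it in units of 512 bytes regardless of the device's actual sector size. A sketch (partition_bytes and its optional second argument are my own additions):

```shell
# partition_bytes NAME [DIR]: print the size of block device NAME in bytes.
# DIR defaults to /sys/class/block.
partition_bytes() {
    sectors=$(cat "${2:-/sys/class/block}/$1/size") || return 1
    # The size file counts 512-byte units.
    echo $(( $sectors * 512 ))
}
```

For the partition above this would be partition_bytes sda2.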

All devices plugged into the system have a directory somewhere in /sys/class. For example, my Logitech wireless mouse is at /sys/class/input/mouse1/device, as can be seen with this command:

quintin@quintin-VAIO:~$ cat /sys/class/input/mouse1/device/name
Logitech USB Receiver

Network Socket Files

Some distributions have shipped Bash with this feature disabled, though it remains a very cool virtual file example. These are virtual files for network connections. They are not real files provided by the kernel: Bash itself interprets these paths when they are used in a redirection.

You can, for instance, redirect some data into /dev/tcp/10.0.0.1/80, which establishes a connection to 10.0.0.1 on port 80 and transmits the data written to the file over this socket. This can be used to give basic networking capabilities to Bash scripts, which otherwise have none.

The same goes for UDP sockets via /dev/udp.
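To make this concrete, here is a rough sketch of a function that sends a bare HTTP HEAD request over such a redirection and prints whatever comes back. It only works in a Bash built with network redirections enabled; the name fetch_head and the choice of request are mine.

```shell
# fetch_head HOST PORT: send a minimal HTTP/1.0 HEAD request and print the reply.
# Requires Bash with /dev/tcp support; fails if the connection cannot be made.
fetch_head() {
    local host=$1 port=$2
    {
        # Descriptor 3 is the socket: write the request, then read the response.
        printf 'HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n' "$host" >&3
        cat <&3
    } 3<>"/dev/tcp/$host/$port"
}
```

With the address from the example above it would be called as fetch_head 10.0.0.1 80. HTTP/1.0 is used so that the server closes the connection after replying, which is what lets cat finish.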

Standard IN, OUT and ERROR

Even the widely known STDIN, STDOUT and STDERR streams, available in practically every operating system and programming language, are represented by files in Linux. If, for instance, you want to write data to STDERR, you can simply open the file /dev/stderr and write to it. Here is an example:

echo I is error 2>/tmp/err.out >/dev/stderr

After running this you will see that the file /tmp/err.out contains “I is error”, proving that writing the message to /dev/stderr resulted in it going to the STDERR stream.

The same goes for reading from /dev/stdin or writing to /dev/stdout.
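Two small uses of these files, as a sketch: a warn helper (a name I picked) that writes to standard error by file name, and a program reading piped data through /dev/stdin as if it were an ordinary file argument.

```shell
# warn MESSAGE...: print a message on standard error via its file name.
warn() {
    printf '%s\n' "$*" > /dev/stderr
}

# A program given a file name argument can still read from a pipe.
printf 'b\na\n' | sort /dev/stdin
```

One caveat on Linux: if standard error has been redirected to a regular file, > /dev/stderr opens that file again and truncates it, whereas the usual >&2 simply reuses the already open descriptor. For appending to a log, >&2 (or >> /dev/stderr) is the safer choice.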

Conclusion

So Why Love Linux? Because the file representation for almost everything makes interacting with many parts of the system much easier. The alternative for many of these would have been to implement some complicated API.