Posts Tagged ‘automation’

Pipes and FIFOs

June 8, 2011

Overview

The basic design of the Unix command line world is what makes it so powerful, and what many people swear by: the pipes and FIFOs design. The idea is that you have many simple commands, each taking an input and producing an output, and you string them together into something that gives the desired effect.

Standard IO Streams or Pipes

To start out, let me give a rough explanation of the standard IO streams. These are Standard Input, Standard Output and Standard Error. Standard Output and Standard Error are both output streams, writing out whatever the application puts into them. Standard Input is an input stream, giving the application whatever is written into it. Every program, when run, has these 3 streams available to it by default.

From this point forward I’ll refer to the 3 streams as STDOUT for Standard Output, STDERR for Standard Error and STDIN for Standard Input. These are the general short form or constant names for these streams.
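As a quick illustration that STDOUT and STDERR really are separate streams (running slightly ahead of the redirection syntax covered below, and using a hypothetical missing path):

ls /nonexistent-file > listing.txt

The error message still appears on the terminal, because > only captures STDOUT and ls writes its complaint to STDERR.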

Take for example the echo and cat commands. The echo command takes all text supplied as arguments on its command line and writes it out to STDOUT. For example, the following command will print the text “Hi There” to the STDOUT stream, which by default is linked to the terminal’s output.

echo Hi There

Then, in its simplest form the cat command takes all data it reads from its STDIN stream and writes it back out to STDOUT exactly as it was received. You can also instruct cat to read in the contents of one or more files and write them back out to STDOUT. For example, to read in the contents of a file named namelist, and write it to STDOUT (the terminal), you can do:

cat namelist

To see cat in its purest form, simply run it without arguments, as:

cat

Each line of input you type in will be duplicated. This is because the input you type is sent to STDIN, where it is received by cat, which writes it back to STDOUT. The end of your input can be indicated by pressing Ctrl+D, the EOF or End of File key. Pressing Ctrl+D closes the STDIN stream, which the program handles the same as if it were reading a file and came to the end of that file.

Pipes and Redirects

Now, all command line terminals allow you to do some powerful things with these IO pipes. Each type of shell has its own syntax, so I will be explaining these using the syntax for the Bash shell.

You could for instance redirect the output from a command into a file using the greater than or > operator. For example, to redirect the STDOUT of the echo command into a file called message, you would do:

echo Hi There > message

You could also read this file back into a command using the less than or < operator. This will take the contents of the file and write it to the command’s STDIN stream. For example, reading the above file into the cat program would have it written back to STDOUT. So this has the same effect as supplying the filename as an argument to cat, but instead uses the IO pipes to supply the data.

cat < message

Where things really get powerful is when you start stringing together commands. You can take the STDOUT of one command and pipe it into the STDIN of another command, with as many commands as you want. For example, the following command pipes the message “Pipes are very useful” into the cut command, instructing it to give us the 4th word of the line. This will result in the text “useful” being printed to the terminal.

echo Pipes are very useful | cut -f 4 -d " "

As you can see, commands are strung together with the pipe or | operator. The pipe operator by itself makes many powerful things possible.

Using the pipe (|) operator, let’s look at a more complex example. Let’s say we want to get the PID and user name of all running processes, sorted by PID and separated by a comma. We can do something like this:

ps -ef | tail -n+2 | awk '{print $2 " " $1}' | sort -n | sed "s/ /,/"

To give an idea of what happens here, let me explain the purpose of each of these commands with the output each one produces (which becomes the input of the command that follows it).

Command: ps -ef
Gives us a list of processes with many columns of data; of these, the 1st column is the user and the 2nd the PID.

Output:

UID        PID  PPID  C STIME TTY          TIME CMD
root      4222   443  0 20:14 ?        00:00:00 udevd
quintin   3922  2488  0 20:14 pts/2    00:00:00 /bin/bash
quintin   4107  2496  0 20:18 pts/0    00:00:00 vi TODO

Command: tail -n+2
Takes the output of ps and gives us all the lines from line 2 onwards, effectively stripping the header.

Output:

root      4222   443  0 20:14 ?        00:00:00 udevd
quintin   3922  2488  0 20:14 pts/2    00:00:00 /bin/bash
quintin   4107  2496  0 20:18 pts/0    00:00:00 vi TODO

Command: awk '{print $2 " " $1}'
Takes the output of tail and prints the PID first, then a space, then the user name. The rest of the data is discarded here.

Output:

4222 root
3922 quintin
4107 quintin

Command: sort -n
Sorts the lines received from awk numerically.

Output:

3922 quintin
4107 quintin
4222 root

Command: sed "s/ /,/"
Replaces the space separating the PID and user name with a comma.

Output:

3922,quintin
4107,quintin
4222,root

Some Useful Example Commands

The above should give you a basic idea of what it’s all about. If you feel like experimenting, here are a bunch of useful commands to mess around with.

I’ll be describing the commands from the perspective of the standard IO streams. So even though I don’t mention it, some of these commands also support reading input from files specified as command line arguments.

To get more details about the usage of these commands, see the manual page for the given command by running:

man [command]


Command   Description
echo      Writes to STDOUT the text supplied as command line arguments.
cat       Writes to STDOUT the input from STDIN.
sort      Sorts all lines of input from STDIN.
uniq      Strips adjacent duplicate lines. The input needs to be sorted first, so the same basic effect can be achieved with just sort -u.
cut       Cuts each line on a specified character and returns the requested parts.
grep      Searches for a specified pattern or string in the data supplied via STDIN.
gzip      Compresses the input from STDIN and writes the result to STDOUT, using gzip compression.
gunzip    Uncompresses the gzip input from STDIN and writes the result to STDOUT. Basically the reverse of gzip.
sed       Stream editor applying basic processing and filtering operations to STDIN, writing the result to STDOUT.
awk       Pattern scanning and processing language. Powerful script-like processing of lines/words from input.
column    Takes the input from STDIN and formats it into columns, writing the result to STDOUT. Useful for displaying data.
md5sum    Takes the input from STDIN and produces an MD5 sum of the data.
sha1sum   Takes the input from STDIN and produces a SHA-1 sum of the data.
base64    Takes the input from STDIN and base64 encodes or decodes it.
xargs     Takes input from STDIN and uses it as arguments to a specified command.
wc        Counts the number of lines, words or characters read from input.
tee       Reads input and writes it to both STDOUT and a specified file.
tr        Translates or deletes characters read from input.
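To tie a few of these together, here is one more pipeline in the same spirit as the ps example above (a sketch, not the only way to do it), which counts how many processes each user is running, sorted from most to least:

ps -ef | tail -n+2 | awk '{print $1}' | sort | uniq -c | sort -rn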

Conclusion

I would recommend that anyone get comfortable with these aspects of the Linux terminal, as well as Bash scripting. Without knowing this, you might not even realize how many of your common tasks could be automated or simplified by it. Also remember that automation not only completes your tasks quicker, but also reduces the chance of the errors and mistakes that come from doing repetitive tasks by hand.

So Why Love Linux? Because the pipes and FIFOs pattern gives you a lot of power for building complex instructions.

Knowing the Moment a Port Opens

June 5, 2011

Automated Attempts

Sometimes when a server is rebooted, whether a clean soft reboot or a hard reboot after a crash, I need to perform a task on it as quickly as possible. This can be for many reasons, from ensuring all services are started to making a quick change. Sometimes I just need to know the moment a certain service has started so I can notify everyone of this fact. The point is that every second counts.

When the server starts up and joins the network, you can start receiving ping responses from it. At this point not all the services have started up yet (on most configurations at least), so I can’t necessarily log into the server or access the specific service yet. Attempting to do so, I would get a connection refused or port closed error.

What I usually do in cases where I urgently need to log back into the server is ping the IP address and wait for the first response packet. When I receive this packet I know the server is almost finished booting up. Now I just need to wait for the remote access service to start up. For Linux boxes this is SSH and for Windows boxes it’s RDP (remote desktop protocol).
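Even that first step can be automated with a small loop. A minimal sketch, assuming the example address 10.0.0.221 used later in this post (ping’s -c sends a single packet and -W sets a one second timeout on Linux):

until ping -c 1 -W 1 10.0.0.221 > /dev/null 2>&1; do sleep 1; done; echo "host is up"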

I could try to repeatedly connect to it, but this is unnecessarily manual, and when every second counts probably less than optimal. Depending on what I’m trying to do I have different methods of automating this.

If I just needed to know that a certain service is started and available again, I would put a netcat session in a loop, which would repeatedly attempt a connection. As long as the service isn’t ready (the port is closed), the netcat command will fail and exit. The loop will then wait for 1 second and try again. As soon as the port opens the connection will succeed and netcat will print a message stating the connection is established and then wait for input (meaning the loop will stop iterating). At this point I can just cancel the whole command and notify everyone that it’s up and running. The command for doing this is as follows:

while true; do nc -v 10.0.0.221 80; sleep 1; done

If I needed remote access to the server, I would use a command similar to the one above, but with the remote access command instead, and add a break statement to quit the loop once the command succeeds. For example, for an SSH session I would use the ssh command, and for a remote desktop session the rdesktop command. A typical SSH command looks like:

while true; do ssh 10.0.0.221 && break; sleep 1; done

This will simply keep trying the ssh command until a connection has been established. As soon as a connection is successful I will receive a shell which, when exited, will break the loop and return me to my local command prompt.

Automatically Running a Command

If you had to run some command the moment you are able to do so, you could use the above SSH command with some minor modifications.

Let’s say you wanted to remove the file /opt/repository.lock as soon as possible. To keep it simple, we’re assuming the user you log in as has permission to do so.

The basic idea is that each time you fail to connect, SSH will return a non-zero status. As soon as you connect and run the command you will break out of the loop. In order to do so, we need a zero exit status to distinguish between a failed and successful connect.

The exit status during a successful connect, however, will depend on the command being run on the other end of the connection. If it fails for some reason, you don’t want SSH to repeatedly try and fail, effectively ending up in a loop that won’t exit by itself. So you need to ensure its exit status is 0, whether the remote command fails or not. You can handle any failure manually.

This can be achieved by executing the true command after the rm command. All the true command does is to immediately exit with a zero (success) exit status. It’s the same command we use to create an infinite while loop in all these examples.
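A quick way to see this exit status behaviour on the local shell (the $? variable holds the exit status of the previous command):

false; echo $?
true; echo $?
false; true; echo $?

The first prints 1, while the second and third print 0, which is exactly why appending "; true" to the remote command guarantees a successful status.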

The resulting command is as follows:

while true; do \
  ssh 10.0.0.221 "rm -f /opt/repository.lock ; true" && break; \
  sleep 1; \
done

This will create an infinite while loop and execute the ssh and sleep commands. As soon as an SSH connection is established, it will remove the /opt/repository.lock file and run the true command, which will return a 0 status. The SSH instance will exit with a success status, which will cause a break from the while loop and end the command, returning you to the command prompt. As with all the previous examples, when the connection fails the loop will pause for a second and then try again.

Conclusion

By using these commands instead of repeatedly trying to connect yourself, there is at most a 1 second delay from the time the service starts until you’re connected. This can be very useful in emergency situations, where every second the problem persists could cost you money or reputation.

The Linux terminal is a powerful place and I sometimes wonder if those who designed the Unix terminal knew what they were creating and how powerful it would become.

So Why Love Linux? Because the Linux terminal allows you to optimize your tasks beyond human capability.


The Traveling Network Manager

June 3, 2011

Overview

Networks are such a big part of our lives these days that when you’re at a place without some form of computer network, it feels like something’s off or missing, or like the place wasn’t done well. You notice this especially when you travel around with a device capable of joining WiFi networks, like a smartphone, tablet or laptop. And even more so when you depend on these to get internet access.

Ubuntu (and I assume most modern desktop distributions) comes with a utility called NetworkManager. It’s this utility’s job to join you to networks and manage these connections. It was designed to make a best attempt at configuring a network for you automatically, with as little user interaction as possible. Even the GUI components, input fields and configuration UIs were designed to make managing your networks as painless as possible, keeping the average user’s abilities in mind. All complicated setup options were removed, so you can’t configure things like multiple IP addresses or select the WiFi channel, etc.

NetworkManager is mostly used through an icon in the system tray. Clicking this icon brings up a list of all available networks. If you select a network, NetworkManager will attempt to connect to it and configure your device via DHCP. If it needs any more information from you (like a WiFi passphrase or SIM card PIN code), it will prompt you. If this connection becomes available again in the future it will automatically try to connect to it. For WiFi connections it’s the user’s job to make that first connection from the menu; for ethernet networks NetworkManager will connect automatically even the first time.

These automatic actions NetworkManager takes are there to make things more comfortable for the end user. The more advanced user can always go and disable or fine-tune them as needed, for example to disable automatically connecting to a certain network, or to set a static IP address on a connection.

Roaming Profiles

If you travel around a lot you end up with many different network “profiles”. Each location where you join a network will have its own setup. If all these locations have DHCP you rarely need to perform any manual configuration to join the network. You do get the odd location, though, where you need some specific configuration like a static IP address. NetworkManager makes this kind of roaming very easy and natural, and seamlessly manages each “profile” for you.

You would do this by first joining the network. Once connected, and whether or not you were given an IP address, you would open the NetworkManager connections dialog and locate the connection for the network you just joined. From there you would edit it, set your static IP address (or some other configuration option) and save the connection.

By doing this you have effectively created your roaming profile for this network. None of your other connections will be affected, so whenever you join any of your other networks they will still work as they did previously, and the new network will have its own specific configuration.

This was never really intended to be a roaming profile manager, so other options related to roaming (like proxy servers) will not be configured automatically. I’m sure with a few scripts and a bit of hacking you should be able to automate setting up these configurations depending on the network you’re joining.
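For example, NetworkManager can run scripts placed in /etc/NetworkManager/dispatcher.d/ whenever a connection goes up or down, passing the interface name and the action as arguments. Here is a minimal sketch of such a script; the connection name, proxy host and the set-proxy.sh helper are all hypothetical placeholders for whatever your locations actually need:

#!/bin/bash
# /etc/NetworkManager/dispatcher.d/99-roaming-profile (sketch)
# $1 = interface name, $2 = action (up, down, ...)
action="$2"

if [ "$action" = "up" ]; then
    # CONNECTION_ID is exported by newer NetworkManager versions; older
    # releases may only provide CONNECTION_UUID, so adjust as needed.
    case "$CONNECTION_ID" in
        "Office LAN")
            /usr/local/bin/set-proxy.sh proxy.example.local 3128   # hypothetical helper
            ;;
        *)
            /usr/local/bin/set-proxy.sh off                        # hypothetical helper
            ;;
    esac
fi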

Conclusion

NetworkManager is maybe not the advanced user’s favorite tool. But if you don’t need any of these advanced features I would certainly recommend it.

So Why Love Linux? Because NetworkManager does a brilliant job of making networking comfortable in a very natural way.

Within the Blue Proximity

June 2, 2011

Overview

I read about the awesome little program called Blue Proximity. It’s a Python script that repeatedly measures the signal strength from a selected Bluetooth device. It then uses this knowledge to lock your computer if you are further away from it, and unlock it or keep it unlocked when you are close to it.

It’s very simple to set up. It has a little GUI from which you select the device you want to use for this, and then specify the distance values at which to lock/unlock your computer, as well as the time delays for the lock/unlock process. The distance isn’t measured in meters/feet, but in a generic unit: an 8-bit signed scale based on the signal strength measured from the device, and it isn’t terribly accurate. It’s not a perfect science and a lot of factors affect the reading.

So the general idea is that you get your environment as close to normal as you would usually have it, and try different values for the lock/unlock distances until you get a configuration that works best for you. There are a few more advanced parameters to play with as well, especially the very useful ring buffer size, which lets you average the distance value over the last few readings instead of using the raw value each time. It’s certainly worth playing around with these values until you find what gives you the best result.
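To give an idea of what the readings and the ring buffer look like, here is a rough shell sketch of the same idea, assuming the BlueZ hcitool utility and a hypothetical, already-connected device address (Blue Proximity itself does this in Python):

addr="AA:BB:CC:DD:EE:FF"   # replace with your device's address
size=10                    # how many readings to average over
readings=()
while true; do
    # hcitool prints something like "RSSI return value: -4"; keep only the number
    rssi=$(hcitool rssi "$addr" 2>/dev/null | awk '{print $NF}')
    if [ -n "$rssi" ]; then
        readings+=("$rssi")
        # drop the oldest reading once the buffer is full
        [ "${#readings[@]}" -gt "$size" ] && readings=("${readings[@]:1}")
    fi
    if [ "${#readings[@]}" -gt 0 ]; then
        avg=$(printf '%s\n' "${readings[@]}" | awk '{ s += $1 } END { print s / NR }')
    else
        avg="n/a"
    fi
    echo "reading: ${rssi:-none}  average of last ${#readings[@]}: $avg"
    sleep 1
done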

You can even go as far as specifying the commands to be executed for locking/unlocking the screen. The default is probably sufficient for most purposes, but it’s definitely available for those that want to run other commands.

Beyond just locking/unlocking there is also a proximity command feature, which will ensure that the computer doesn’t lock from inactivity as long as you’re close to it. This is very useful for times where you’re watching a movie or presentation and don’t want the screen to keep locking just because you didn’t move the mouse or type on the keyboard.

My Setup

Before I had this program I would have my computer lock after a 10 minute idle period. Then when I returned it would be almost automatic for me to start typing my password. The Gnome lock screen is cleverly optimized, in that you can simply start typing your password even if the password dialog isn’t displayed yet. It will recognize the first key press in a locked state as an indication of your intent to unlock the screen, as well as use it as the first character of your password.

After I configured and hacked Blue Proximity to my liking, the screen locks as soon as I’m about 3 meters away from the computer, and unlocks when I’m right in front of it. I configured a 10 second ring buffer to average the reading over the readings from the past 10 seconds. I also made values of 0 or higher (the closest readings to the computer) count as double entries, meaning that when 0 values are being read the average drops to 0 twice as fast. This allows it to be more stable when I move around, but to unlock very quickly when I’m standing right next to the machine. It all works very well.

It’s been a few days now, and still when I get to the computer and it unlocks by itself I’m amused. Sometimes I even start getting ready to enter my unlock password when the screen is automatically unlocked. Very amusing.

It’s not perfect, and sometimes the screen will lock while I’m busy using the computer and then immediately unlock again. This is to be expected given the nature of wireless technologies, though I’m sure a bit more hacking and tuning will get it at least as close to perfect as it can be.

Conclusion

It’s typical of the software world to always produce amusing and fun utilities like this one, and it’s definitely one of my favorites.

So Why Love Linux? Because there are tons of free and open source programs and utilities of all kinds.

Forty One SQL Queries in 60 Seconds

May 21, 2011

I needed to capture some data from a database into CSV files. MySQL supports this easily, so this in itself isn’t a problem. Part of the requirement was that the data had to be divided into separate files, one for each month since 2008. Being May 2011, that gives us 41 months.

The only way I could do this was to run 41 separate queries, one for each month. Each query differs from the others in 3 places:

  1. The start date
  2. The end date
  3. The CSV filename

If I had to create 41 queries by hand it would take me quite a while, and there would be a big possibility of making a mistake and messing up the result.

But my friends Linux and Bash are around, which allow me to do it very quickly and with virtually zero probability of errors or mistakes. I planned to generate an SQL script containing the queries.

The command I ended up with was as follows (I split it into multiple lines for clarity):
year=2008; for from in {1..12} {1..12} {1..12} {1..5}; do 
[ $from -eq 12 ] \
&& { tyear=$((year + 1)); to=1; } \
|| { tyear=$year; to=$((from + 1)); }; 
echo "SELECT columns FROM table INFO OUTFILE '$year-$from.csv' " \
"WHERE time>='$year-$from-01' AND time<'$tyear-$to-01';"; 
year=$tyear;
done >> dbscript.sql

After executing this I had a file dbscript.sql which I could run on the database to give me the files I needed. It took me less than a minute to type up the command and use it to generate the full script.

The command itself might seem complex, but it’s really very simple.

  1. The first line loops through the numbers 1 to 12 three times, and then through the numbers 1 to 5.
  2. The next 3 lines inside the loop calculate the “next” month’s year and month number. For example, if the current iteration’s month and year are 9 and 2008, then the next month is 10 and 2008; for month 12 and 2008 it is 1 and 2009.
  3. The 5th and 6th lines echo the generated SQL query.
  4. The 7th line sets the base year to use for the next iteration’s month, as calculated in step 2 above.
  5. The resulting output of the whole loop is appended to the file called dbscript.sql.

If you had to run the command regularly, you could always make a bash script from it and execute it whenever you need it, as sketched below. This will save you having to type it out every time. Bash scripts are nothing more than what you can type on the command line, and are very powerful.
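A minimal sketch of doing exactly that, assuming a hypothetical script name gen-monthly-queries.sh (and using > rather than >> so re-running it starts a fresh file):

#!/bin/bash
# gen-monthly-queries.sh - regenerate dbscript.sql with the 41 monthly export queries
year=2008
for from in {1..12} {1..12} {1..12} {1..5}; do
    # work out the "next" month's year and month number
    [ $from -eq 12 ] && { tyear=$((year + 1)); to=1; } || { tyear=$year; to=$((from + 1)); }
    echo "SELECT columns FROM table WHERE time>='$year-$from-01' AND time<'$tyear-$to-01' INTO OUTFILE '$year-$from.csv';"
    year=$tyear
done > dbscript.sql

Make it executable with chmod +x gen-monthly-queries.sh and run ./gen-monthly-queries.sh whenever you need a fresh dbscript.sql.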

So Why Love Linux? Because the command line allows you to write “smart” commands.