Script writing | Mark Hamzy's weblog

For work, I was asked to help port a project to Linux. This involved
working on a computer behind a firewall. I am sure that everyone is
familiar with ssh. For those of you who are not, ssh is a program that
gives you a login session on a remote computer, much like telnet but with
all of the traffic encrypted. So, to access the computer, all I should
have needed to do was ssh into it. Unfortunately, it was not that easy.

The first problem was the firewall, which exists to provide some measure
of security. The computer that I wanted to access (bob) was behind the
firewall and as a result was not directly accessible. To get past the
firewall, I was provided with an account on the firewall machine (bill).
Once I logged into bill, I could then turn around and access bob.

ssh hamzy@bill

and then from within the ssh session

ssh hamzy@bob

What if I wanted to run the second ssh on my own machine (localhost)?
Fortunately, ssh provides a solution called port forwarding. Port
forwarding essentially acts as a small program that reads input from one
port and sends it to another and at the same time reads output from that
port and sends it back to the first port. Ssh uses port 22. If I were to
open a port on my own machine (7777) and have the ssh session to bill
forward it to port 22 on bob (7777 <-> 22), then I could ssh from my
machine as if bob were directly accessible.

ssh -L 7777:bob:22 hamzy@bill

and then from my machine

ssh -p 7777 hamzy@localhost

It is still easy, right? Unfortunately, it wasn’t. The second problem
was that my connection was unreliable. Whatever the cause, some machine
along the chain from my computer to the remote computer would have a
temporary problem and I would lose the connection. Whatever work I was in
the middle of performing would be lost and I would have to start over. To
solve this I turned to another powerful unix program called screen. Screen
essentially allows one session to control many child sessions. While
doing this, screen has the ability to run in the background and not mind
if you exit the session (logout), which would normally end everything
running in it. So, on my local machine, I would normally start screen as
follows

screen

Ssh allows any program to be run in place of the default login shell by
naming it after the host:

ssh -p 7777 hamzy@localhost screen

Every time you start screen a new screen process is created. To access a
previously running screen process, I would need to provide some command
line arguments

screen -DR

This tells screen to reattach to an existing screen process, detaching it
first if it is still attached somewhere else, and to create a new one if
none exists. All I have to do is add a -t option to ssh to provide a
terminal for screen to use. So, now I will not lose my work if the
connection is lost. All I need to do is put the commands into a loop that
sleeps for a little bit (to be nice) and retries the connection process.
So, was everything solved? No. A dropped connection went unnoticed until
I went back to my session and pressed a key. Only then would the program
wake up and realize that the connection was lost, and after waiting for a
little while, I would automatically reconnect. I could improve this. The
version of ssh that I was using was an old one. When I used the latest
version, I found that I could create a file called ~/.ssh/config and add
the following:

ServerAliveInterval 15
ServerAliveCountMax 4
TCPKeepAlive yes

This tells ssh to send a keepalive message to the remote machine whenever
the connection has been silent for 15 seconds and to terminate the
connection if four of them in a row go unanswered (about a minute). The
connection would then be dropped promptly and my loop would reconnect.
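
The loop itself is not shown in this post, so here is a minimal sketch of
what it could look like. The host, port, and user are the ones used above.
The function names and the RETRY_DELAY and MAX_TRIES variables are made up
for illustration, and stopping once the command exits cleanly is an
assumption rather than a description of the original script.

```shell
#!/bin/sh
# Default command: attach to (or create) a screen session on bob through
# the forwarded port. -t gives screen a terminal to draw on.
connect() {
    ssh -t -p 7777 hamzy@localhost screen -DR
}

# reconnect_loop [COMMAND [ARGS...]]
# Run COMMAND (default: connect) until it exits with status 0, pausing
# between attempts. MAX_TRIES=0 (the default) means retry forever.
reconnect_loop() {
    [ "$#" -gt 0 ] || set -- connect
    tries=0
    while :; do
        "$@" && return 0            # clean exit: stop retrying
        tries=$((tries + 1))
        if [ "${MAX_TRIES:-0}" -gt 0 ] && [ "$tries" -ge "${MAX_TRIES:-0}" ]; then
            return 1                # gave up after MAX_TRIES failures
        fi
        sleep "${RETRY_DELAY:-5}"   # be nice: do not hammer the firewall
    done
}
```

Calling reconnect_loop with no arguments retries the ssh command above
every five seconds. Passing a different command as arguments makes it easy
to try the loop without a network.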

This all runs automatically, but every time I log in to a machine it
wants a username and a password. I provide the username on the command
line since I do not care who knows that fact, but I do not want to provide
the password on the command line. How can I solve this? Ssh can
authenticate with a key pair instead of a password. The private key is
normally protected by a passphrase, and if I do not want to be asked for
one, then I can tell it to use an empty passphrase. I do this by the
following:

cd ~/.ssh
ssh-keygen -t dsa

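Ssh-keygen will ask me what passphrase I want and generate a DSA key
pair: a private key in ~/.ssh/id_dsa and a public key in a plain text file
called ~/.ssh/id_dsa.pub . (Current OpenSSH releases no longer accept DSA
keys by default. With ssh-keygen -t ed25519 the public key file is
id_ed25519.pub instead.) I can notify ssh of this on the remote machine,
bob, by appending the contents of the public key file to another file
there called ~/.ssh/authorized_keys . A key without a passphrase is of
course not as secure as asking for a password every time. But the
tradeoff is that it will allow you to automate the connection.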
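
The append step can be sketched as a small shell function. The name
append_key and its two arguments are hypothetical. The function works on
whatever paths it is given, so it can be tried on scratch files before
touching a real ~/.ssh directory.

```shell
#!/bin/sh
# append_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Append the public key to the authorized_keys file unless that exact
# line is already there, and keep the permissions private, since sshd
# may ignore an authorized_keys file that others can write to.
append_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    chmod 700 "$(dirname "$auth")"
    touch "$auth"
    chmod 600 "$auth"
    # -q quiet, -x whole line, -F fixed string (no regex), -e pattern
    grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
}
```

The function has to run where the authorized_keys file lives. From my
machine, the same append (without the duplicate check) can be done through
the forwarded port by feeding the public key file to
ssh -p 7777 hamzy@localhost 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
on its standard input.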

Now, I can run commands on the remote machine and port the program. To
test the program, I need to run it locally. I will build the program on
the remote machine and copy the data to my local machine. This is called
mirroring. How can I copy the files? I could run tar on bob to bundle
files up and copy that bundle to a tar running on localhost that would
unbundle those files. The steps to accomplish this are 1) run tar to
create a bundle, 2) copy the bundle, and 3) run tar to unbundle the files.
This is too many separate steps. I can improve this.

Unix allows you to chain programs together by taking program A’s output
and forwarding it to program B. Ssh follows this philosophy. So, using
ssh, I can connect task 1 and task 3 together without performing task 2.
Now, I do the following:

cd location_of_files_on_localhost
ssh -p 7777 hamzy@localhost "cd location_of_files_on_bob && tar cf - *" | tar xvf -
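
The shape of that pipeline can be tried without any network by replacing
the ssh half with a subshell. This is only an illustration. The function
name copy_tree is made up, and it archives "." rather than "*" so that
hidden files come along too.

```shell
#!/bin/sh
# copy_tree SRC DST
# Bundle everything under SRC with one tar, stream it through a pipe,
# and unbundle it under DST with a second tar. No bundle file is ever
# written to disk, which is the point of chaining the two programs.
copy_tree() {
    mkdir -p "$2" &&
    (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}
```

Here "f -" tells each tar to use standard output or standard input as the
archive, which is what lets the pipe sit between them.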

However, there are more efficient ways to mirror files. If the data has
not changed, then nothing needs to be copied. Also, to speed the copying
process, if only small differences exist between two files, then only
those differences need to be copied. The program that does this in Unix
is rsync. The people who wrote rsync knew about ssh. Connecting the
dots, I would do the following:

rsync -a --rsh="ssh -l hamzy -p 7777" localhost:location_of_files_on_bob location_of_files_on_localhost

But what if I only wanted to copy files? I first need to set up the port
forwarding but that creates a login session. This session needs to be
closed manually by logging out. After I copy the files, I would leave
this session running. This is not very clean. To solve this I could
instead run some other program which would end automatically. I chose to
run sleep, which runs for an amount of time that I arbitrarily set to be
long enough (one day).

ssh -L 7777:bob:22 hamzy@bill "sleep 1d"
# run rsync

One day is of course not a perfect fit, but I am not looking for perfect,
just “good enough.” At least it consumes a minimal amount of resources on
bill, where the sleep runs. And I can find it programmatically and kill
it (unix terminology).

# kill the sleep session
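
The two placeholder comments above can be filled in along these lines. This
is a sketch under assumptions: the names open_tunnel, with_tunnel, and
TUNNEL_WAIT are hypothetical, the five second pause is a crude guess at how
long ssh needs to set up the forward, and killing the local ssh process is
only one way to end the session.

```shell
#!/bin/sh
# Open the forwarding session. "sleep 1d" keeps it alive without a login
# shell. exec makes the background process ssh itself, so the process ID
# saved below is the one to kill.
open_tunnel() {
    exec ssh -L 7777:bob:22 hamzy@bill "sleep 1d"
}

# with_tunnel COMMAND [ARGS...]
# Start the tunnel in the background, run COMMAND, then kill the tunnel.
# Returns the exit status of COMMAND.
with_tunnel() {
    open_tunnel &
    tunnel_pid=$!
    sleep "${TUNNEL_WAIT:-5}"               # give ssh time to connect
    status=0
    "$@" || status=$?
    kill "$tunnel_pid" 2>/dev/null || true  # find it and kill it
    wait "$tunnel_pid" 2>/dev/null || true
    return "$status"
}
```

With this in place, the copy becomes a single command, for example
with_tunnel followed by the rsync command line shown earlier. The sleep on
bill may linger until its day runs out even after the local ssh is gone,
which is harmless.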

Now, I am happy. I can edit files and copy files. But inspiration hit me
and I thought I could use what I had just learned and do something else
with it.
Screen is a pretty powerful program. You can name screen sessions. You
can list running screen sessions. Screen even allows programmatic
manipulation.

I can start a named screen session that is initially detached:

screen -dm -S AR

I can check to see if that session is running:

screen -r AR -ls -q

I can close a named screen session:

screen -dr AR -X quit

I can add a program Y to a named screen session:

screen -dr AR -X screen Y
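
These pieces can be combined into a small helper. The sketch below is an
illustration, not the script I actually used. The function names and the
SCREEN variable are hypothetical, it checks for a session by reading the
"screen -ls" listing (where sessions appear as PID.NAME) instead of
relying on exit codes, and it selects the session with -S instead of -dr.

```shell
#!/bin/sh
SCREEN=${SCREEN:-screen}   # overridable, so the logic can be tried with a stub

# session_exists NAME
# Succeed if "screen -ls" lists a session called NAME. Matching ".NAME"
# followed by whitespace keeps AR from matching a session named ARX.
session_exists() {
    "$SCREEN" -ls 2>/dev/null | grep -q "[0-9][0-9]*\.$1[[:space:]]"
}

# run_in_session NAME COMMAND [ARGS...]
# Create the detached session if it is missing, then start COMMAND in a
# new window inside it.
run_in_session() {
    name=$1
    shift
    session_exists "$name" || "$SCREEN" -dm -S "$name"
    "$SCREEN" -S "$name" -X screen "$@"
}
```

For example, run_in_session AR Y starts program Y inside the AR session
whether or not that session was already running.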

One thing that I do is use a program called bittorrent. Of course,
bittorrent is despised by Hollywood since it allows people to digitally
copy media. And I am not going to go into that can of worms. It does
have its legal uses to copy programs that allow themselves to be freely
copied like Linux. This downloading takes a lot of time. So I want to
start the downloading process simply and walk away.

Now I have all the tools that I need to accomplish that. I can remotely
control a machine. I can remotely start a standalone session (screen).
And I can remotely start bittorrent with screen. So what did I do? I
wrote a shell script to do all of that. And how did I start that process?
Email! Huh? Unix mail systems can be set up to run a program when a
message arrives. When I send a specially formatted email to a machine,
that machine will turn around and run my script. My script will then
start the downloading. Mission accomplished!
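
The script and the email format are not reproduced here, so the following
is only a sketch of the receiving end under loud assumptions. It supposes
the mail system pipes each message to the script on standard input (a
.forward pipe or a procmail recipe are two common ways to arrange that),
and that a request is a header line of the made-up form
"Subject: torrent URL". The function names are hypothetical too.

```shell
#!/bin/sh
# extract_request
# Read one mail message on stdin and print the URL from a header line of
# the hypothetical form "Subject: torrent URL". Only the headers (the
# lines before the first empty line) are examined. Fails if nothing is
# found.
extract_request() {
    sed -n -e '/^$/q' -e 's/^Subject: torrent \(.*\)$/\1/p' | grep .
}

# handle_mail COMMAND [ARGS...]
# If the message on stdin carries a request, run COMMAND ARGS... URL.
# The URL is passed as a single argument and never evaluated as shell
# code, and anything that does not look like a web address is refused.
handle_mail() {
    url=$(extract_request) || return 1
    case $url in
        http://*|https://*) "$@" "$url" ;;
        *) return 1 ;;
    esac
}
```

A delivery rule could then pipe each message to something like
handle_mail screen -S AR -X screen followed by the name of whatever
bittorrent client is installed. Since anyone can send email, checking the
sender or a shared secret as well would be wise.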
