This is a short tutorial on very basic UNIX script editing. To run such a script you will need some sort of access to a UNIX computer, but that is not as hard to get as it might sound. See Getting Access for more details.
In this tutorial, we will edit and change my doctypeify script. Quickly read up on that before proceeding.
In order to make any type of change to a script one must first have a fairly clear idea as to what it does and how it works. So let's dissect doctypeify.
    #! /bin/sh
    # This script is designed to peer into the provided directory
    # and check shtml files for DOCTYPES. If a .shtml file is
    # found without a DOCTYPE, then one will be added.
    # (C)2005 Nic Reveles
This is the first chunk of the script. It doesn't do too much. With one exception, anything following a # sign is a comment. Though you could change that part of the script or even delete it and have the script still work, it is polite and proper not to. It is customary to put a short explanation as to what the script actually does at the top of the script. If we make a change to someone else's script, we should add comments about the changes above the initial comments, and not over them.
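So what is the one exception to that comment rule? The very top line (#! /bin/sh). This line tells the computer which program should interpret the script: here, the Bourne shell (sh), located at /bin/sh. On many systems /bin/sh is provided by bash (the Bourne Again Shell) or another compatible shell. Without this line, the script is run by whatever shell the calling program falls back to, and it may not work as intended.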
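As a small illustration of the comment rule, here is a made-up script that is not part of doctypeify (the variable name greeting is purely hypothetical). The first line is read as the interpreter line, and every other # starts a comment:

```shell
#!/bin/sh
# A full-line comment: the shell ignores everything after the # sign.
greeting="hello"   # A comment can also follow a command on the same line.
echo "$greeting"
```

Deleting either comment changes nothing about what the script does, which is why comments are safe to add when we document our own changes.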
Next we have the following:
    # First, make sure we have a valid use of doctypeify
    if [ $# -gt 1 ]; then
        echo "Improper use: doctypeify /some/HTML/directory"
        exit 0
    fi

    if [ $# -eq 0 ]; then
        dir=`pwd`
    else
        dir=$1
    fi
The above might look a bit confusing, but it's really not. The comment says that we are checking to make sure this is a valid use of the script (i.e. the user supplies the right number of arguments when calling the script). If the number of arguments ($#) is greater than (-gt) one (1), then the script prints an error message and exits. This is because the script is not designed to be called with multiple directories (e.g. "doctypeify directory1 directory2" would not work).
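To see $# and -gt in action without running doctypeify itself, we can try the same test inside a small function. This is only a sketch: check_args is a hypothetical helper, and it prints a word instead of exiting so that it is safe to experiment with.

```shell
#!/bin/sh
# check_args is a hypothetical helper, not part of doctypeify.
# Inside a function, $# counts the arguments passed to that function,
# just as it counts the script's arguments at the top level.
check_args() {
    if [ $# -gt 1 ]; then
        echo "too many"
    else
        echo "ok"
    fi
}

check_args                # prints "ok" (zero arguments)
check_args /some/dir      # prints "ok" (one argument)
check_args dir1 dir2      # prints "too many" (two arguments)
```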
The second part looks very similar, and it is! It just checks to see if the user called 'doctypeify' without any arguments (i.e. "doctypeify"), and if that is the case it decides that the directory is implied, and is the directory we are currently in. So, the script sets a variable dir to the current directory, as reported by the pwd command ("print working directory"). We could rename this variable to almost anything we like, subject to the shell's naming rules, but we would have to make the change at every reference to the variable.
Otherwise, if the user did supply a directory argument, the script sets our dir variable to the first (and only!) argument, $1.
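The choice between the current directory and the supplied one can be tried out on its own. The sketch below wraps the same if/else in a hypothetical function called pick_dir, and then shows a shorter spelling (pick_dir_short, also hypothetical) that uses the shell's default-value expansion. Neither function appears in doctypeify.

```shell
#!/bin/sh
# pick_dir is a hypothetical helper that mirrors the script's if/else.
pick_dir() {
    if [ $# -eq 0 ]; then
        pwd              # no argument: use the current directory
    else
        echo "$1"        # otherwise: use the first argument
    fi
}

# The same idea using the shell's default-value expansion.
# Note: ${1:-...} also falls back when $1 is present but empty.
pick_dir_short() {
    echo "${1:-$(pwd)}"
}

pick_dir /some/HTML/directory        # prints /some/HTML/directory
pick_dir_short /some/HTML/directory  # prints /some/HTML/directory
pick_dir                             # prints the current directory
```

The longer form is easier to read when you are starting out; the shorter one is the kind of change that makes a script more compact once the logic is familiar.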
    # Save any files that already have doctypes in a temporary file
    grep -H "DOCTYPE HTML PUBLIC" $dir/*.shtml > $HOME/.temp_doctype
    cut -d : -f 1 $HOME/.temp_doctype > $HOME/.temp_doctype2
In the next part, the script searches any file ending in .shtml for a line that has "DOCTYPE HTML PUBLIC" inside it, and then saves the matches to .temp_doctype in the user's home directory. If we wanted the script to search another type of file (such as html, php, htm, etc.), then we could simply change *.shtml to, for example, *.html.
The cut command keeps only the part of each line of grep output that comes before the first colon (:), which is the file name, and saves the result to .temp_doctype2. This could have been performed in the same step as the grep command, but that's more advanced.
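For the curious, here is one way the grep and cut steps above might be combined. It is only a sketch, run against a throwaway directory created with mktemp -d (assumed to be available, as it is on most current systems); the file names good.shtml and bad.shtml are made up for the example.

```shell
#!/bin/sh
# Build a throwaway directory with two sample files (hypothetical names).
dir=$(mktemp -d)
printf '%s\n' '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">' '<html></html>' > "$dir/good.shtml"
printf '%s\n' '<html></html>' > "$dir/bad.shtml"

# Two steps, as in the script: grep -H prints "filename:matching line",
# and cut keeps only the part before the first colon.
two_step=$(grep -H "DOCTYPE HTML PUBLIC" "$dir"/*.shtml | cut -d : -f 1)

# One step: grep -l prints only the names of files that contain a match.
one_step=$(grep -l "DOCTYPE HTML PUBLIC" "$dir"/*.shtml)

echo "$two_step"
echo "$one_step"

# Remove the throwaway directory again.
rm -r "$dir"
```

Both commands print the path of good.shtml only. The -l form also avoids a surprise with file names that themselves contain a colon, which would confuse cut.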
    # Save _ALL_ applicable files in another temp file
    ls $dir/*.shtml > $HOME/.temp_doctype

    # Now get anyfile which does _not_ have a doctype
    diff -b $HOME/.temp_doctype $HOME/.temp_doctype2 > $HOME/.temp_doctype3
The ls command is used to list every file ending in .shtml, so if we wanted the script to work on another file extension (such as html, php, htm, etc.), we would need to change the extension here as well. The result is saved in .temp_doctype, replacing the grep output that was stored there earlier.
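Because the extension appears in more than one place, one possible tidy-up is to store it in a variable and change it once. The sketch below assumes a hypothetical variable named ext, which the original script does not have, and uses a throwaway directory with made-up file names.

```shell
#!/bin/sh
# Throwaway directory with a few empty sample files (hypothetical names).
dir=$(mktemp -d)
touch "$dir/a.html" "$dir/b.html" "$dir/c.shtml"

# ext is a hypothetical variable; changing it here changes every
# command below that uses it.
ext=html

# List every file with the chosen extension, one per line.
all_files=$(ls "$dir"/*."$ext")
echo "$all_files"

rm -r "$dir"
```

With ext set to html, only a.html and b.html are listed; c.shtml is skipped.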
The 'diff' command looks for any difference between the two lists we have. Any file that appears in the full list but is missing from the list of files that already have a DOCTYPE needs a DOCTYPE Declaration added.
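To get a feel for what diff reports, we can compare two small lists by hand. This sketch uses made-up file names in a throwaway directory. It assumes the default output format of GNU or BSD diff, in which a line that exists only in the first file is prefixed with "< ".

```shell
#!/bin/sh
work=$(mktemp -d)

# The full list of files, and the list of files that already have a DOCTYPE.
printf '%s\n' a.shtml b.shtml c.shtml > "$work/all"
printf '%s\n' a.shtml c.shtml > "$work/with_doctype"

# diff exits with status 1 when the files differ; "|| true" keeps that
# from stopping a script that is run with "set -e".
result=$(diff -b "$work/all" "$work/with_doctype" || true)
echo "$result"

# Keep only the lines found in the first list but not the second
# (assumes the default "< name" format described above).
missing=$(echo "$result" | sed -n 's/^< //p')
echo "$missing"

rm -r "$work"
```

Here b.shtml is the only name that diff reports, because it is the only entry in the full list that is absent from the second list.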
    # If there are no files to be fixed, be done
    if [ $? -eq 0 ]; then
        echo "All *.shtml files were correct"
        rm $HOME/.temp_doctype $HOME/.temp_doctype2 $HOME/.temp_doctype3
        exit 0
    fi
When a program or script exits, it should leave an exit code behind (which is why we 'exit 0'). The shell keeps the exit code of the most recent command in $?. In this case, when the diff command is finished, it exits with a 0 if the two files were the same. So, the script checks to see if the two lists were the same (which would mean that all files had proper DOCTYPEs), and if this is true it removes the temporary files and exits.
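Exit codes are easy to inspect for ourselves. The sketch below runs diff on identical and on different lists and records what $? contained each time. The helper name report and the file names are hypothetical.

```shell
#!/bin/sh
work=$(mktemp -d)
printf '%s\n' a.shtml b.shtml > "$work/all"
printf '%s\n' a.shtml b.shtml > "$work/same"
printf '%s\n' a.shtml > "$work/fewer"

# report is a hypothetical helper: it runs diff quietly and prints the
# exit code that diff left behind in $?.
report() {
    status=0
    diff -b "$1" "$2" > /dev/null || status=$?
    echo "$status"
}

identical=$(report "$work/all" "$work/same")
different=$(report "$work/all" "$work/fewer")

echo "identical lists: $identical"   # prints "identical lists: 0"
echo "different lists: $different"   # prints "different lists: 1"

rm -r "$work"
```

Note that $? is overwritten by every command, so it has to be read immediately after the command we care about.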
Next the script simply saves the DOCTYPE Declaration we want to a temporary file. In this case it is 4.01 Transitional, but if we do not like that then we could change it to whatever we like. For that matter, we could change it to "THIS IS GIBBERISH THAT HAS BEEN ADDED TO THE TOP OF MY FILE!".
After that is complete, the script then loops through all of the files missing DOCTYPEs and adds whatever is in that temporary file to the top of each one. Once that is done the script outputs the list of all the files it changed, and then deletes all of those temporary files. And that's it!
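The final part of the script is not reproduced here, so the following is only a guess at how such a loop could look, not the author's actual code. It assumes a list file with one file name per line and a file holding the standard HTML 4.01 Transitional declaration; every file name in it is made up.

```shell
#!/bin/sh
work=$(mktemp -d)

# A sample page with no DOCTYPE, and a list naming the files to fix.
printf '%s\n' '<html>' '</html>' > "$work/page.shtml"
printf '%s\n' "$work/page.shtml" > "$work/missing_list"

# The declaration we want at the top of each file.
printf '%s\n' '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">' > "$work/doctype"

# For each listed file: write the declaration followed by the original
# contents to a new file, then replace the original with it.
while IFS= read -r file; do
    cat "$work/doctype" "$file" > "$file.new"
    mv "$file.new" "$file"
done < "$work/missing_list"

first_line=$(head -n 1 "$work/page.shtml")
second_line=$(sed -n 2p "$work/page.shtml")
echo "$first_line"

rm -r "$work"
```

Writing to a new file and then moving it into place matters: redirecting cat straight back into the file it is reading would empty that file before it could be read.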
The best way to get the hang of making changes to scripts is to write your own. Once you can do that, it would be very simple to make the changes we discussed. You could even make this script considerably shorter by combining many of the commands into single commands.
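As one example of combining commands, the grep, cut, ls and diff steps could perhaps be replaced by a single grep -L, which lists the files that do not contain a match. This is a sketch only: -L is supported by GNU and BSD grep but is not part of the POSIX standard, and the file names are made up.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf '%s\n' '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">' '<html></html>' > "$dir/good.shtml"
printf '%s\n' '<html></html>' > "$dir/bad.shtml"

# grep -L prints the names of files with NO matching line, which is
# exactly the set of files that still need a DOCTYPE.
missing=$(grep -L "DOCTYPE HTML PUBLIC" "$dir"/*.shtml)
echo "$missing"

rm -r "$dir"
```

Only the path of bad.shtml is printed, with no temporary files needed along the way.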
Good luck, and have fun!
(c) 2005 Nic Reveles