Shell

*NIX operating systems are made up of three parts: the kernel, the shell and the programs.

 

 

The Kernel is the core of the operating system: it interfaces with and manages the hardware it runs upon, allocates resources (time and memory) to programs, and handles the file system and communications, in response to system calls.

 

The Shell acts as an interface between the kernel, the user and other programs. When a user logs in, the login program checks the username and password, and then starts another program called the shell. The shell is a command line interpreter (CLI). It interprets the commands the user types in and arranges for them to be carried out. The commands are themselves programs: when they terminate, the shell gives the user another prompt.

 

Shell Scripting is not really a language but rather a collection of shell commands carried out in sequence in a special file, a script, to perform the user's program.

 

 

*useful    http://www.tldp.org/LDP/abs/html/index.html

Standard Streams / File Descriptors

In computer programming, standard streams refer to the pre-connected I/O (input/output) communication channels between a program (/process) and its environment. Originally, I/O was carried out via physically connected system console / terminal with input supplied by the keyboard and output sent to the monitor, but standard streams abstract this.

 

 

There are always three default files open, stdin (the keyboard), stdout (the screen), and stderr (error messages output to the screen), each with a specific number:

 

  • 0 - stdin    - default: keyboard
  • 1 - stdout  - default: screen
  • 2 - stderr  - default: screen

*note: std = standard

 

You may have heard: "Everything in *nix is a file". By convention in UNIX and Linux, data streams and peripherals (device files) are treated as files, in a fashion analogous to ordinary files.

 

A file descriptor is simply a number that the operating system assigns to an open file to keep track of it.

 

Moreover, a file descriptor (FD or, less frequently, fildes) is an abstract indicator (handle) used to access a file or other input/output resource, such as a pipe or network socket.

Exit codes

An Exit Code or Exit Status is an unsigned 8-bit integer returned by a command that indicates how its execution went.

 

An Exit Code of 0 indicates the command (program) was successful at what it was supposed to do. I think of it as zero means the program exited successfully with zero errors.

 

Any other Exit Code indicates that something went wrong.

 

Applications can choose for themselves what number indicates what went wrong; so refer to the manual of the application to find out what the application's Exit Code means.

 

The $? shell parameter provides the exit status (/ return value) of the most recently executed foreground pipeline.

 

echo $?

 

 

 

Developers are free to specify their own exit codes, but must not interfere with the Reserved Exit Codes:

 

Exit Code Number Meaning
0 Success
1 Catchall for general errors
2 Misuse of shell builtins (according to Bash documentation)
126 Command invoked cannot execute
127 "command not found"
128 Invalid argument to exit
128+n Fatal error signal "n"
130 Script terminated by Control-C
255* Exit status out of range

 

Exit codes can be viewed using echo $?
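
As a rough illustration of the above, the following sketch captures a few exit statuses with $? and stores them in variables. The is_even function and the no_such_command_xyz name are made up purely for this example.

```shell
#!/bin/bash
# Capture the exit status of a few commands using $?

true
ok_status=$?                     # 0: success

false_status=0
false || false_status=$?         # 1: general failure

# A hypothetical function that chooses its own exit code
is_even() {
    (( $1 % 2 == 0 )) && return 0
    return 3
}
odd_status=0
is_even 7 || odd_status=$?       # 3: the code chosen by is_even

missing_status=0
no_such_command_xyz 2>/dev/null || missing_status=$?   # 127: command not found

echo "true=$ok_status false=$false_status is_even=$odd_status missing=$missing_status"
```

The "command || status=$?" form is used here so that the failing status is recorded on the same line as the command that produced it.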

Hello world!

The customary first program!

 

Create a new file and save as .sh file 'hello.sh'

 

Use vi as it's always available, otherwise use a good text editor that will preserve line-endings.

  1. First line consists of #! (aka shebang) and path to bash /bin/bash
  2. Followed by echo "and text to be echoed to the terminal".
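
Putting the two steps together, a minimal hello.sh could look like this (assuming Bash lives at /bin/bash on your system):

```shell
#!/bin/bash
# hello.sh: the customary first program
echo "Hello, World!"
```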

 

Next, make it executable:

chmod +x hello.sh

 

So, it can now be run:

./hello.sh

 

Sample output:

Hello, World!

Shebang #!

The #! at the beginning of a shell script is commonly referred to as the Shebang.

 

The Shebang #! is usually followed by the path to the command interpreter e.g. /bin/bash or /bin/sh and an optional argument.

 

Any intervening white space is ignored / optional. Accordingly the following are considered equivalent:

  • #!/bin/bash
  • #! /bin/bash

 

*notice the subtle difference of the ignored space between the Shebang #! and the path: /bin/bash

./ and $PATH

When using a standard shell command it's usually typed straight on to the command line after the prompt and executed by hitting enter. But when a shell script is run it is often preceded by a dot and a forward slash ./

 

In the case of the former standard commands, Bash tries to find the typed command from a series of directories that are stored in a variable called $PATH, which can be seen by typing echo $PATH which returns an output such as:

 

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

 

You can see that each directory is separated by a colon :

 

Since the $PATH variable is individual to each user on a system, it can be set to suit each specific user.

 

Accordingly, for convenience, you could add a specific directory (e.g. ~/myscripts) for your custom shell scripts to $PATH, so that they no longer require the preceding ./

 

Temporary path addition:

 

export PATH=$PATH:~/myscripts

 

Permanent path addition - add the above to the bottom of your ~/.bashrc file:

 

export PATH=$PATH:~/myscripts

 

Now, if you were to save all your shell scripts in ~/myscripts, you'd no longer need to precede them with ./ to run them.

Execution

To run / execute a script it must have its execute permission set.

 

This can be achieved for all users by:

chmod +x myscript.sh

 

Alternatively, the permission can be set for the user, group or others specifically by preceding the +x with the appropriate letter:

  • chmod u+x myscript.sh where u stands for user
  • chmod g+x myscript.sh where g stands for group
  • chmod o+x myscript.sh where o stands for others
  • chmod a+x myscript.sh where a stands for all
    • omitting the letter, chmod +x myscript.sh as per above, has much the same effect, except that permission bits masked out by the umask are not set

The script may now be executed:

./myscript.sh

 

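Alternatively, you may also use sh before the script name to run it. This does not need the execute permission, but it ignores the shebang, so Bash-specific syntax may fail where sh is not Bash: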

sh myscript.sh

Comments

Using a # at the start of a line (without the ! of a shebang) turns the line into a comment.

 

Comment Frequently!!!

 

#    This is a comment, to remind me of what follows!

 

 

Alternatively, the HEREDOC /  HERE DOCUMENT notation <<IDENTIFIER can be used to create comments over multiple lines and is terminated by stating the same IDENTIFIER without the <<
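
The original comments.sh listing is not shown here, so the following is only a sketch of the idea. It pairs the heredoc with the : (no-op) command and quotes the identifier so that nothing inside the block is expanded; COMMENT is an arbitrary identifier chosen for this example.

```shell
#!/bin/bash
# comments.sh (hypothetical reconstruction)

echo "Example of a block of comment lines:"

: <<'COMMENT'
Everything between the two COMMENT markers is fed to the no-op
command : and therefore never runs, not even this assignment:
skipped=yes
COMMENT

echo "Notice how the commented heredoc section is ignored."
```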

 

 

Invoke:

sh comments.sh

 

Output:

Example of a block of comment lines:
Notice how the commented heredoc section is ignored.

Echo

As we've already seen in the first example above, echo is used to display a line of text/string to the standard output (stdout), usually the terminal, or a file.

 

echo "Hello, World!"

 

If you specify the -e option it enables interpretation of the following backslash escaped characters:

Sequence Description
\\ A literal backslash character ("\")
\a An alert (The BELL character)
\b Backspace
\c Produce no further output after this
\e Escape character; equivalent to pressing the escape key
\f Form feed
\n Newline
\r Carriage return
\t Horizontal tab
\v Vertical tab
\0NNN Byte with octal value NNN (1 to 3 digits)
\xHH Byte with hexadecimal value HH (1 or 2 digits)

 

-E Disables the interpretation of backslash escape sequences. This is the default.
-n Suppresses the trailing newline that echo normally adds.
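
A short sketch of these options using Bash's builtin echo (other shells' echo may behave differently, so treat this as Bash-specific):

```shell
#!/bin/bash
# -e turns on escape interpretation; -n drops the trailing newline
echo -e "Make:\tDucati\nModel:\tDesmosedici"
echo -n "No newline after this"
echo " ...so this continues on the same line"

# Captured for comparison: with -E (the default) the backslash stays literal
with_e=$(echo -e 'a\tb')
without_e=$(echo -E 'a\tb')
```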

Variables

A variable is a temporary store of alphanumeric information that can be set or retrieved within a script / program and manipulated to produce an effect on that or another variable.

 

Variables are, usually, given a user friendly name which then has a value assigned to it.

 

In shell scripting, variables are declared by simply assigning a value to the variable name, stated on the left e.g. myvar=17

 

The shell script can then read this variable by prefixing its name with a $ symbol e.g. $myvar

 

Variable names can be alphanumeric, UPPER or lower case, or a mixture of both, and must begin with a letter or an underscore character (_), followed by zero or more alphanumeric or underscore characters.

 

Also, variable names are case-sensitive, just like filenames, i.e. all the following are valid, different variable names:

  • MYVAR
  • myvar
  • mYvAr

 

Variables are declared by use of an equals symbol with no spaces either side:

  • myvar=20
  • vegetable=potato

 

Do not put spaces on either side of the equal sign when assigning a value to variable, or it will result in an error.

 

If you need to use a space within a variable value it must be encapsulated within quotes. Single quotes will interpret the value literally, whilst double quotes will allow any encapsulated variable to be interpreted within the value being defined:

  • myvar='This will be interpreted exactly as it is written'
    • When echo'd would produce: This will be interpreted exactly as it is written
  • bike="Ducati Desmosedici"
  • motorbikes="My $bike is fun to ride!"
    • When $motorbikes is echo'd it would produce: My Ducati Desmosedici is fun to ride!

 

You can also define a NULL variable as follows (NULL variable is variable which has no value at the time of definition):

  • myvar=""
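
The original listing is missing at this point; a minimal sketch that uses the quoting rules above (the variable names are my own choice) might be:

```shell
#!/bin/bash
greeting="Hello"                 # double quotes: variables inside would be expanded
target='World'                   # single quotes: taken literally
empty=""                         # a NULL (empty) variable
message="$greeting, $target!"    # both variables are expanded here
echo "$message"
```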

 

 

Output:

Hello, World!

 

 

You may also sometimes see a variable being used within curly braces {}, which is particularly useful to avoid any ambiguity when referring to a specific variable:
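
The listing that belongs here is not shown, so this is a guessed reconstruction based on the notes that follow: $there holds only "Wor", and the braces mark where the variable name ends.

```shell
#!/bin/bash
hello="Hello"
there="Wor"
# Without the braces the shell would look for a variable named thereld
message="$hello, ${there}ld!"
echo "$message"
```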

 

Output:

Hello, World!

 

*notice:

  • the $there variable just being used for "Wor"; the first half of World!
  • the variables are expanded within the double quotes

Read only variables

The Shell provides a way to mark variables as read-only by using the readonly command. After a variable is marked readonly, its value cannot be changed.

 

For example, the following script generates an error while trying to change the value of the bike variable:
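
The ro.sh listing itself is missing here. The sketch below shows the same idea but is not that script: the reassignment is wrapped in a subshell purely so that the rest of the sketch keeps running, whereas a bare assignment on its own line produces the "readonly variable" error in the same way.

```shell
#!/bin/bash
bike="Ducati Desmosedici"
readonly bike
echo "$bike"

# The shell reports "bike: readonly variable" and the assignment fails
( bike="Aprilia RSV4" ) || echo "Reassignment failed: bike is still $bike"
```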

 

 

Invoke:

sh ro.sh

 

Output:

Ducati Desmosedici
ro.sh: line 6: bike: readonly variable

Unset / delete a variable

Unsetting or deleting a variable directs the shell to remove the variable from the list of variables that it tracks. Once you unset a variable, you cannot access the stored value in the variable.

 

Following is the syntax to unset a defined variable using the unset command:

 

unset variable_name

 

The above command unsets the value of a defined variable.
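
The unset.sh listing is not shown; a minimal sketch consistent with the output below could be:

```shell
#!/bin/bash
# unset.sh (hypothetical reconstruction)
bike="Ducati Desmosedici"
echo "My favourite motorbike is a $bike"

unset bike    # note: the name is given without the $
echo "The bike variable has now been unset and will not display here: $bike"
```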

 

 

Invoke:

sh unset.sh

 

Output:

My favourite motorbike is a Ducati Desmosedici
The bike variable has now been unset and will not display here:

 

You cannot use the unset command to unset variables that are marked readonly.

Default shell variables

Default shell variables can be set using the := syntax.

 

If a variable has not been set, it expands to an empty string.

Consider, echo $myvar

If $myvar has not been set, only an empty line will be displayed.

 

However, it can be set as follows:

 

echo ${myvar:=SuperSport}

 

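Similarly, a dash/minus can be used instead of the = in which case the default is substituted for this one expansion only, without being assigned to the variable: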

 

echo ${myvar:-SuperSport}

 

Because := only assigns when the variable is unset or empty, once the variable has a value a later := has no effect unless the variable is unset first.
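
The listing that produced the output below is missing, so this is only one possible script that prints the same three lines. The kept variable is an addition of mine to record the intermediate value.

```shell
#!/bin/bash
unset myvar
echo "${myvar:-First}"     # prints First; myvar is still unset
echo "${myvar:=Second}"    # prints Second and assigns it to myvar
: "${myvar:=Third}"        # no effect: myvar already has a value
kept="$myvar"              # still Second
unset myvar
echo "${myvar:=Fourth}"    # prints Fourth: myvar was unset, so := assigns again
```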

 

 

Output:

First
Second
Fourth

Special Variables

The locations at the command prompt of the arguments as well as the location of the command, or the script itself, are also stored in corresponding variables, along with a number of other special system set variables, as follows:

 

  • $0 - The name of the Bash script
  • $1 - $9 - The first 9 arguments to the Bash script
  • $# - How many arguments were passed to the Bash script
  • $* - All arguments supplied to the Bash script
    • when double-quoted ("$*"), takes the entire list as one argument with spaces between
  • $@ - All arguments supplied to the Bash script
    • when double-quoted ("$@"), takes the entire list and separates it into separate arguments
  • $? - Exit status (/return value) of the most recently run process
  • $$ - PID (Process ID) of the current script
  • $USER - Username of the user running the script
  • $HOSTNAME - Hostname of the machine the script is running on
  • $SECONDS - Number of seconds since the script was started
  • $RANDOM - Returns a different random number each time it is referred to
  • $LINENO - Returns the current line number in the Bash script

 

Note: when double-quoted, the $* special parameter takes the entire list as one argument with spaces between and the $@ special parameter takes the entire list and separates it into separate arguments. Unquoted, both are split into words in the same way.
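
The specialVars.sh listing is not shown, so the following is a guessed sketch rather than the original script. The count_args helper is hypothetical and exists only to make the difference between "$*" and "$@" visible.

```shell
#!/bin/bash
# specialVars.sh (hypothetical reconstruction)

# Helper: reports how many arguments it received
count_args() { echo "$#"; }

echo "The name of this script is: $0"
echo "$# arguments were used"
echo "The user running this script is: $USER"
echo "The hostname of this computer is: $HOSTNAME"
echo "It has been $SECONDS seconds since this script was started"
echo "The current line giving this output is $LINENO"
echo "Here's a random number $RANDOM"
echo "The PID of this script is $$"
echo "Quoted \$* is passed on as $(count_args "$*") argument(s)"
echo "Quoted \$@ is passed on as $(count_args "$@") argument(s)"
echo "The exit code of the last command is $?"
```

Run with three arguments, "$*" is passed on as one argument and "$@" as three. Several of these variables ($HOSTNAME, $SECONDS, $RANDOM) are Bash features and may be empty under a plain sh.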

 

 

Invoke:

sh specialVars.sh Derrick Ducati Electronics

 

Output:

The name of this script is: specialVars.sh
3 arguments were used
The passed in arguments were: Derrick and Ducati and Electronics
The user running this script is: derrick
The hostname of this computer is: xps13
It has been 6 seconds since this script was started
The current line giving this output is 10
Here's a random number 4130
All the arguments were: Derrick Ducati Electronics
And again using a different special variable: Derrick Ducati Electronics
The PID of this script is 31360
The exit code of this script is 0

Command line arguments

You're probably already aware of providing command line arguments when working with standard command lines, such as:

ls -la /var/www/

which is supplying two command line arguments (the first being) -la along with the path to use (as the second argument) /var/www/ which lists the contents of /var/www/ in long format with all files and directories.

 

Similarly, we can use the same approach to provide arguments into our shell scripts, with $1 being used within the shell script to represent the first argument and $2 used to represent the second argument and so on. These are automatically set by the system when we run our script so all we need to do is refer to them.
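
The arguments.sh listing is missing; a minimal sketch that would produce the output shown further down is:

```shell
#!/bin/bash
# arguments.sh (hypothetical reconstruction)
echo "The first argument is: $1"
echo "The second argument is: $2"
```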

 

 

The arguments are passed into the script on the actual command line, by specifying the script name followed by the arguments separated by a space:

 

./arguments.sh Hello, World!
The first argument is: Hello,
The second argument is: World!

Bracket types

The following has already been expertly written (by Giles on Stack Overflow) Kudos!

 

  • (…) parentheses indicate a subshell. What's inside them isn't an expression like in many other languages. It's a list of commands (just like outside parentheses). These commands are executed in a separate subprocess, so any redirection, assignment, etc. performed inside the parentheses has no effect outside the parentheses.
    • With a leading dollar sign, $(…) is a command substitution: there is a command inside the parentheses, and the output from the command is used as part of the command line (after extra expansions unless the substitution is between double quotes, but that's another story).
  • {} braces are like parentheses in that they group commands, but they only influence parsing, not grouping. The program x=2; { x=4; }; echo $x prints 4, whereas x=2; (x=4); echo $x prints 2. (Also braces require spaces around them and a semicolon before closing, whereas parentheses don't. That's just a syntax quirk.)
    • With a leading dollar sign, ${VAR} is a parameter expansion, expanding to the value of a variable, with possible extra transformations.
  • (()) double parentheses surround an arithmetic instruction, that is, a computation on integers, with a syntax resembling other programming languages. This syntax is mostly used for assignments and in conditionals.
    • The same syntax is used in arithmetic expressions $((…)), which expand to the integer value of the expression.
  • [[]] double brackets surround conditional expressions. Conditional expressions are mostly built on operators such as -n $variable to test if a variable is non-empty and -e $file to test if a file exists. There are also string equality operators: "$string1" = "$string2" (beware that the right-hand side is a pattern, e.g. [[ $foo = a* ]] tests if $foo starts with a while [[ $foo = "a*" ]] tests if $foo is exactly a*), and the familiar !, && and || operators for negation, conjunction and disjunction as well as parentheses for grouping. Note that you need a space around each operator (e.g. [[ "$x" = "$y" ]], not [[ "$x"="$y" ]]), and a space or a character like ; both inside and outside the brackets (e.g. [[ -n $foo ]], not [[-n $foo]]).
  • [] single brackets are an alternate form of conditional expressions with more quirks (but older and more portable). Don't write any for now; start worrying about them when you find scripts that contain them.

 

This is the idiomatic way to write your test in bash:

 

If you need portability to other shells, this would be the way (note the additional quoting and the separate sets of brackets around each individual test):
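
Both listings are missing from this section. As a hedged sketch of the two styles described (the variables varA, varB and varC and their values are invented for illustration, not taken from the original answer):

```shell
#!/bin/bash
varA=1; varB="t1"; varC="t3"    # hypothetical values

# Bash: a single [[ ]] with && and ||, grouped by parentheses
if [[ $varA == 1 && ( $varB == "t1" || $varC == "t2" ) ]]; then
    bash_result="match"
fi

# Portable: separate [ ] tests, every expansion quoted, grouped with { }
if [ "$varA" = 1 ] && { [ "$varB" = "t1" ] || [ "$varC" = "t2" ]; }; then
    portable_result="match"
fi

echo "bash: $bash_result, portable: $portable_result"
```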

Command substitution

Command substitution allows us to take the output of one command and save it to a variable. This is achieved by placing the command within parentheses which is preceded by a dollar sign $

 

var=$(command)

 

*note: ls ~ (list contents of home directory  ~) is being piped into wc -l (word count, number of lines)
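
The substitution.sh listing is not shown; going by the note above, it was probably close to this sketch (file_count is my own variable name):

```shell
#!/bin/bash
# substitution.sh (hypothetical reconstruction)
file_count=$(ls ~ | wc -l)
echo "You have $file_count files in your home directory."
```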

 

Invoke:

./substitution.sh

 

Output:

You have 5 files in your home directory.

 

 

Command substitution was traditionally carried out using `backticks`, but these have now been superseded by the above parentheses method.

Test

test is frequently used in shell scripting, but is more likely to be seen in its alternative form, using the left square bracket symbol [ as the command name

 

Accordingly, just like other shell programs, such as ls, it is actually a program and must therefore be surrounded by spaces:

 

test condition

becomes

[ condition ]

 

The result returned by test is an exit status: 0 for TRUE or non-zero for FALSE

 

Accordingly, the following examples, using the pseudo ternary construct, will perform the command after the && if the result of the test is successful, otherwise they will perform the command after the ||
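
The examples themselves are missing here, so these are stand-ins of my own, assuming /tmp exists on your system:

```shell
#!/bin/bash
# Both spellings run the same test
[ -d /tmp ] && echo "/tmp is a directory" || echo "/tmp is not a directory"
test 5 -gt 3 && result="greater" || result="not greater"
[ -z "$result" ] && echo "result is empty" || echo "result is $result"
```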

 

Operators

Various operations can be performed on the data within your shell scripts, falling under the following categories:

  • Arithmetic Operators
  • Relational Operators
  • Boolean (/Logical) Operators
  • Bitwise Operators
  • String Operators
  • File Test Operators

 

Further, there are a number of methods that can be carried out to perform these operations:

  • Arithmetic Expansion
    • echo $(( 6 * 7 ))
  • expr
    • expr 6 + 7
  • let
    • let "a = 6 * 7"; echo $a
  • bc
    • echo "6*7" | bc

Arithmetic Operators

+ Addition $(( a + b ))
- Subtraction $(( a - b ))
* Multiplication $(( a * b ))
/ Division $(( a / b ))
% Modulus (remainder) $(( a % b ))
** Exponent $(( a ** b ))
++ Pre-increment / post-increment $(( ++a )) / $(( a++ ))
-- Pre-decrement / post-decrement $(( --b )) / $(( b-- ))

+= N Increment by constant N $(( a += b ))
-= N Decrement by constant N $(( a -= b ))
*= N Multiply by constant N $(( a *= b ))
/= N Divide by constant N $(( a /= b ))
%= N Remainder of Dividing by constant N $(( a %= b ))

 

*note: pre and post operations can be quite confusing and you may not get the result you expect

  • pre will Increment/Decrement the variable before the operation is carried out
  • post will Increment/Decrement the variable after the operation is carried out
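
A small sketch of arithmetic expansion, including the pre/post difference described above (values chosen arbitrarily):

```shell
#!/bin/bash
a=6; b=7
echo "a * b  = $(( a * b ))"     # 42
echo "b % a  = $(( b % a ))"     # 1
echo "a ** 2 = $(( a ** 2 ))"    # 36

pre=$(( ++a ))     # a becomes 7 first, then pre receives 7
post=$(( b++ ))    # post receives 7 first, then b becomes 8
echo "pre=$pre a=$a post=$post b=$b"
```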

 

Relational Operators

== Equal $(( a == b ))
!= Not equal $(( a != b ))
< Less than $(( a < b ))
<= Less than or equal $(( a <= b ))
> Greater than $(( a > b ))
>= Greater than or equal $(( a >= b ))

Returned results will be True 1 or False 0

Boolean (/Logical) Operators

&& Logical AND $(( a && b ))
|| Logical OR $(( a || b ))
! Negation (inversion) $(( !a ))

 

  • The right side of && will only be executed if the EXIT STATUS of the left is zero (i.e. no errors).
  • || is the opposite and will only evaluate the right side if the left side exit status is nonzero.
  • If the EXIT STATUS evaluates to TRUE, it returns a zero. If FALSE it returns a nonzero.

*remember, it is the result of the EXIT STATUS that is being evaluated

 

Bitwise Operators

Bitwise operators are carried out at the binary level of the operands, bit by bit.

& Bitwise AND $(( a & b ))
^ Bitwise Exclusive OR $(( a ^ b ))
| Bitwise OR $(( a | b ))
~ Bitwise Complement $(( ~a ))
<< Left Shift $(( a << b ))
>> Right Shift $(( a >> b ))

 

Each bit is operated upon according to its position within the number.

 

To demonstrate we need to be able to think of our denary (base 10) numbers in binary form:

 

  •   5 == 0101
  •   8 == 1000
  • 10 == 1010
  • 15 == 1111

 

Remember that negative numbers are stored as the two's complement of the positive counterpart. As an example, here's the representation of -2 in two's complement (8 bits):

 

1111 1110

 

The way you get this is by taking the binary representation of a number, taking its complement (inverting all the bits) and adding one. Two starts as 0000 0010, and by inverting the bits we get 1111 1101. Adding one gets us the result above. The first bit is the sign bit, implying a negative.

 

So let's take a look at how we get ~2 = -3:

Here's two again:

 

0000 0010

 

Simply flip all the bits and we get:

 

1111 1101

 

 

What does -3 look like in two's complement?
Start with positive 3: 0000 0011
Flip (invert) all the bits to 1111 1100
Add one to become negative value (-3), 1111 1101

So if you simply invert the bits in 2, you get the two's complement representation of -3
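
The binary values listed earlier can be checked directly with arithmetic expansion. This is only a sketch; the variable names are mine.

```shell
#!/bin/bash
and_result=$(( 5 & 10 ))      # 0101 & 1010 = 0000
or_result=$(( 5 | 10 ))       # 0101 | 1010 = 1111
xor_result=$(( 5 ^ 15 ))      # 0101 ^ 1111 = 1010
left_result=$(( 5 << 1 ))     # 0101 shifted left once  = 1010
right_result=$(( 8 >> 2 ))    # 1000 shifted right twice = 0010
not_result=$(( ~2 ))          # every bit flipped

echo "AND=$and_result OR=$or_result XOR=$xor_result"
echo "LEFT=$left_result RIGHT=$right_result NOT=$not_result"
```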

String Operators

# Length ${#string}
= Equal [[ $a = $b ]]
!= Not equal [[ $a != $b ]]
> Greater than [[ $a > $b ]]
< Less than [[ $a < $b ]]
-z is Zero [[ -z $a ]]
-n is non Zero [[ -n $a ]]
str Assigned [[ $a ]]

*note: above is using pseudo ternary one liner with && and || replacing normal ? and :
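
The example the note refers to is not shown; a sketch of the string operators in the same one-liner style (sample strings are my own) could be:

```shell
#!/bin/bash
a="apple"; b="banana"; empty=""

length=${#a}
echo "Length of a: $length"
[[ $a = "$b" ]] && echo "equal" || echo "not equal"
[[ $a < $b ]] && order="a sorts first" || order="b sorts first"
[[ -z $empty ]] && echo "empty has zero length"
[[ -n $a ]] && echo "a is not empty"
echo "$order"
```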

 

 

Substrings

 

The following tests whether a string contains substring:
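
The listing is missing here; one common way to do this in Bash, offered as a sketch, is a pattern match with wildcards either side of the quoted substring:

```shell
#!/bin/bash
string="Ducati Desmosedici"
substring="Desmo"

# The quotes keep $substring literal; the * either side match anything
if [[ $string == *"$substring"* ]]; then
    found="yes"
else
    found="no"
fi
echo "Contains '$substring': $found"
```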

 

Substring extraction

 

Extract from a position and from a position with length

 

${string:position}

${string:position:length}

 

 

Substring removal

 

Remove shortest starting match.

If the start of string matches substring (a glob pattern), delete the shortest match and return the rest:

${string#substring}

Remove longest starting match.

If the start of string matches substring, delete the longest match and return the rest:

${string##substring}

Remove shortest ending match.

If the end of string matches substring, delete the shortest match from the end of string and return the rest:

${string%substring}

Remove longest ending match.

If the end of string matches substring, delete the longest match from the end of string and return the rest:

${string%%substring}

Shortest and longest only differ when substring contains a wildcard such as *, e.g. ${string#*substring} or ${string%substring*}
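
A typical use of substring removal is picking a path apart. The path below is made up for the example.

```shell
#!/bin/bash
path="/home/derrick/backups/site.tar.gz"    # hypothetical path

no_lead=${path#*/}     # shortest match of */ from the front
name=${path##*/}       # longest match of */ from the front
no_ext=${path%.*}      # shortest match of .* from the end
stem=${path%%.*}       # longest match of .* from the end

echo "$no_lead"    # home/derrick/backups/site.tar.gz
echo "$name"       # site.tar.gz
echo "$no_ext"     # /home/derrick/backups/site.tar
echo "$stem"       # /home/derrick/backups/site
```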

 

 

 

Substring replacement

 

Replace first occurrence of substring with replacement.

 

If replacement is null, substring is deleted from string.

${string/substring/replacement}

 

If substring matches the front end of string, substitute replacement for substring.

${string/#substring/replacement}

 

If substring matches the back end of string, substitute replacement for substring.

${string/%substring/replacement}

 

Replace all occurrences of substring.

${string//substring/replacement}

 

If replacement is null, all occurrences of substring are deleted from string.
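
A sketch of each replacement form on a sample string of my own:

```shell
#!/bin/bash
string="one fish two fish"

first=${string/fish/cat}      # first occurrence only
all=${string//fish/cat}       # every occurrence
front=${string/#one/1}        # only if the string starts with "one"
back=${string/%fish/cat}      # only if the string ends with "fish"
deleted=${string// fish/}     # empty replacement deletes the matches

echo "$first"      # one cat two fish
echo "$all"        # one cat two cat
echo "$front"      # 1 fish two fish
echo "$back"       # one fish two cat
echo "$deleted"    # one two
```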

 

 

 

Substring extraction

 

Extract substring from string at position:

${string:position}

 

Extract length characters of substring from string at position:

${string:position:length}
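
For example (positions count from 0; the negative offset in the last line is a Bash feature that needs the space before the minus sign):

```shell
#!/bin/bash
string="Ducati Desmosedici"

model=${string:7}      # from position 7 to the end
make=${string:0:6}     # 6 characters starting at position 0
tail=${string: -4}     # the last 4 characters

echo "$model"    # Desmosedici
echo "$make"     # Ducati
echo "$tail"     # dici
```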

 

File test operators

-b file Checks if file is a block special file; if yes, then the condition becomes true. [ -b $file ] is false.
-c file Checks if file is a character special file; if yes, then the condition becomes true. [ -c $file ] is false.
-d file Checks if file is a directory; if yes, then the condition becomes true. [ -d $file ] is not true.
-e file Checks if file exists; is true even if file is a directory but exists. [ -e $file ] is true.
-f file Checks if file is an ordinary file as opposed to a directory or special file; if yes, then the condition becomes true. [ -f $file ] is true.
-G file Checks if file exists and is owned by effective group ID
-g file Checks if file has its set group ID (SGID) bit set; if yes, then the condition becomes true. [ -g $file ] is false.
-k file Checks if file has its sticky bit set; if yes, then the condition becomes true. [ -k $file ] is false.
-L file Checks if file is a symbolic link
-O file True if file exists and is owned by the effective user id
-p file Checks if file is a named pipe; if yes, then the condition becomes true. [ -p $file ] is false.
-r file Checks if file is readable; if yes, then the condition becomes true. [ -r $file ] is true.
-S file Checks if file is a socket
-s file Checks if file has size greater than 0; if yes, then condition becomes true. [ -s $file ] is true.
-t file Checks if file descriptor is open and associated with a terminal; if yes, then the condition becomes true. [ -t $file ] is false.
-u file Checks if file has its Set User ID (SUID) bit set; if yes, then the condition becomes true. [ -u $file ] is false.
-w file Checks if file is writable; if yes, then the condition becomes true. [ -w $file ] is true.
-x file Checks if file is executable; if yes, then the condition becomes true. [ -x $file ] is true.
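
A quick way to try a few of these operators is on a temporary file and directory. This sketch assumes mktemp is available, as it is on most Linux systems.

```shell
#!/bin/bash
file=$(mktemp)       # an empty temporary regular file
dir=$(mktemp -d)     # a temporary directory

[ -f "$file" ] && file_kind="regular file" || file_kind="something else"
[ -d "$dir" ]  && dir_kind="directory"     || dir_kind="something else"
[ -s "$file" ] && size_before="non-empty"  || size_before="empty"
echo "data" > "$file"
[ -s "$file" ] && size_after="non-empty"   || size_after="empty"
[ -x "$file" ] && exec_state="executable"  || exec_state="not executable"

echo "$file_kind, $dir_kind, $size_before then $size_after, $exec_state"
rm -r "$file" "$dir"
```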

 

Upper/lower case conversion

  • ${var^} capitalizes the first letter of var
  • ${var^[aeiou]} capitalizes the first letter of var if it is a vowel
  • ${var^^} capitalizes all the letters in var
  • ${var,} lower-cases the first letter of var
  • ${var,[abc]} lower-cases the first letter of var if it is a, b or c
  • ${var,,} lower-cases all the letters in var
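
A brief sketch of these expansions, which need Bash 4 or later:

```shell
#!/bin/bash
var="ducati desmosedici"
upper="APRILIA"

cap_first=${var^}       # Ducati desmosedici
cap_all=${var^^}        # DUCATI DESMOSEDICI
low_first=${upper,}     # aPRILIA
low_all=${upper,,}      # aprilia

echo "$cap_first / $cap_all / $low_first / $low_all"
```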

POSIX Character Classes [:class:]

Alternate method of specifying a range of characters to match:

 

  • [:alnum:] matches alphabetic or numeric characters. This is equivalent to A-Za-z0-9
  • [:alpha:] matches alphabetic characters. This is equivalent to A-Za-z
  • [:blank:] matches a space or a tab
  • [:cntrl:] matches control characters
  • [:digit:] matches (decimal) digits. This is equivalent to 0-9
  • [:graph:] (graphic printable characters). Matches characters in the range of ASCII 33 - 126. This is the same as [:print:], below, but excluding the space character
  • [:lower:] matches lowercase alphabetic characters. This is equivalent to a-z
  • [:print:] (printable characters). Matches characters in the range of ASCII 32 - 126. This is the same as [:graph:], above, but adding the space character
  • [:space:] matches whitespace characters (space, horizontal tab, newline, vertical tab, form feed and carriage return)
  • [:upper:] matches uppercase alphabetic characters. This is equivalent to A-Z
  • [:xdigit:] matches hexadecimal digits. This is equivalent to 0-9A-Fa-f

POSIX character classes must be placed inside a bracket expression, e.g. [[:digit:]], and generally require quoting or double brackets ([[ ]]).
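
As a sketch, here is a character class used inside a bracket expression, first in a regular expression and then in a glob-style pattern. The is_number function is a hypothetical helper.

```shell
#!/bin/bash
# Succeeds only if the whole argument consists of digits
is_number() { [[ $1 =~ ^[[:digit:]]+$ ]]; }

is_number 12345 && digits="yes" || digits="no"
is_number 12a45 && mixed="yes"  || mixed="no"

# Glob-style pattern match: does the string start with an uppercase letter?
[[ "Ducati" == [[:upper:]]* ]] && capital="yes" || capital="no"

echo "12345 numeric: $digits, 12a45 numeric: $mixed, capitalised: $capital"
```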

Conditionals

Two main types of conditionals:

  • if
    • note, terminated by fi
  • case
    • note, terminated by esac

Also a pseudo Ternary using && and ||

  • command && next command || or this command

 

if conditionals

Basic if test construction:

if [ condition ]
then
    command
fi

if, then, else construction

if [ condition ]
then
    command
else
    command
fi

if, then, elif construction

if [ condition ]
then
    command
elif [ condition ]
then
    command
else
    command
fi
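
A concrete instance of the if, elif, else construction, with an arbitrary value and thresholds:

```shell
#!/bin/bash
count=7

if [ "$count" -gt 10 ]; then
    size="large"
elif [ "$count" -gt 5 ]; then
    size="medium"
else
    size="small"
fi

echo "$count is $size"
```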

 

 

 

Case conditionals

 

Case tests several conditions at a time, carrying out the command of the first condition met. If no condition is met, the default case will be carried out as indicated by the final *)
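
The case listing is missing; a minimal sketch with made-up values looks like this. Each pattern ends with ) and each branch with ;;

```shell
#!/bin/bash
bike="Ducati"

case "$bike" in
    Ducati|Aprilia) origin="Italy" ;;
    Triumph)        origin="United Kingdom" ;;
    Yamaha|Honda)   origin="Japan" ;;
    *)              origin="unknown" ;;     # default case
esac

echo "$bike is from $origin"
```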

 

 

Pseudo Ternary conditionals

The ternary conditional can be emulated using a one liner with && and || replacing normal ? and :
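
A sketch of the one liner, together with a caveat worth knowing: unlike a real ternary, the command after || also runs when the command after && fails.

```shell
#!/bin/bash
count=3

# condition && value-if-true || value-if-false
[ "$count" -gt 5 ] && label="many" || label="few"
echo "$label"

# Caveat: the test succeeds, but false fails, so the || branch runs as well
[ "$count" -lt 5 ] && false || fallback="ran anyway"
echo "$fallback"
```

For anything where the middle command might fail, a full if/else is the safer choice.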

Loops

Loops are used for automating repetitive tasks and continue to run until a specific condition is met.

 

The main loop structures in shell scripting are:

 

  • for
  • while

 

Syntax of a for loop:

 

for value in source
do
    command
done

 

*where source can be:

  • a number of separately listed items
  • a variable
  • an array
  • a substituted command e.g. $(command)
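
One sketch per kind of source listed above (the names and values are arbitrary, and seq is assumed to be installed):

```shell
#!/bin/bash
# Separately listed items
for bike in Ducati Aprilia Triumph; do echo "$bike"; done

# A variable: left unquoted so that it is split into words
list="one two three"
for word in $list; do echo "$word"; done

# An array
colours=(red green blue)
for colour in "${colours[@]}"; do echo "$colour"; done

# A substituted command
total=0
for n in $(seq 1 3); do total=$(( total + n )); done
echo "Total: $total"
```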

 

 

Syntax of a while loop:

 

while condition
do
    command
done

 

Basically, while the condition is true, do the following command.
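
For instance, a simple counter (a sketch; the limit of 3 is arbitrary):

```shell
#!/bin/bash
counter=1
while [ "$counter" -le 3 ]
do
    echo "Iteration $counter"
    counter=$(( counter + 1 ))
done
```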

 

 

Another example, this time using a colon : to indicate an infinite loop. true could also have been used instead of the :
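
The listing is missing; a sketch of such a loop follows. Since the condition never becomes false, it relies on break (covered under Loop Controls) to get out.

```shell
#!/bin/bash
n=0
while :
do
    n=$(( n + 1 ))
    echo "Pass $n"
    if [ "$n" -ge 3 ]; then
        break    # the only way out of an infinite loop
    fi
done
```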

Loop Controls

A loop can be stopped or skipped using either:

 

  • break
  • continue

 

The break command will stop the loop when a specific condition is met.

 

Alternatively, the loop can be made to continue past a specific condition, using the continue command.
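
Both commands in one short sketch (the numbers are arbitrary):

```shell
#!/bin/bash
seen=""
for n in 1 2 3 4 5 6
do
    if [ "$n" -eq 2 ]; then continue; fi    # skip 2 and carry on with 3
    if [ "$n" -eq 5 ]; then break; fi       # stop the loop entirely at 5
    seen="$seen$n "
done
echo "Processed: $seen"
```

Only 1, 3 and 4 are processed: 2 is skipped by continue, and break ends the loop before 5 and 6 are reached.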

 

Read

To capture input from the user, the read command is used to assign the characters entered on the command line to a variable:

 

Multiple values may also be read:

 

Use the -a option to instruct read to store the words in to an indexed array.
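
The listings for this section are missing. The sketch below feeds read from a here-string (<<<) so that it runs without waiting for the keyboard; interactively you would simply type the values, for example after read -r -p "Name: " name.

```shell
#!/bin/bash
# A single value
read -r name <<< "Derrick"
echo "Hello, $name"

# Multiple values: one word per variable
read -r first last <<< "Freddie Mercury"
echo "First: $first, Last: $last"

# -a stores the words in an indexed array
read -r -a words <<< "Ducati Aprilia Triumph"
echo "Second word: ${words[1]}, total: ${#words[@]}"
```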

Select

select is used to acquire user input in the form of a numbered menu.

 

The PS3 variable can be set for select's command prompt, to give a more meaningful prompt than the default #?
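
The select listing is missing, so here is a sketch of a small menu. The choose_bike function is my own wrapper, and the answer "2" is supplied via a here-string only so that the example runs unattended; normally the user would type the number.

```shell
#!/bin/bash
choose_bike() {
    local PS3="Pick a bike (number): "
    local bike
    select bike in Ducati Aprilia Triumph
    do
        if [[ -n $bike ]]; then
            echo "$bike"    # a valid number was entered
            break
        fi
        echo "Invalid choice, try again" >&2
    done
}

choice=$(choose_bike <<< "2")
echo "You chose: $choice"
```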

 

Globbing

The glob command, short for global, originates in the earliest versions of Bell Labs' Unix.

 

Globbing is the process of carrying out filename expansion (i.e. expanding filename patterns) using special characters, namely the wildcards:

  • ? representing any single character
  • * representing zero or more characters

*

Matches any string, of any length

 

foo*

Matches any string beginning with foo

 

*x*

Matches any string containing an x (beginning, middle or end)

foo?

Matches foot or food but not fools

 

Filename expansion can also match dotfiles, but only if the pattern explicitly includes the leading dot as a literal character e.g. .bash*

Accordingly, a file starting with a dot will not be matched with a wildcard.

 

*.tar.gz

Matches any string ending with .tar.gz

 

Square brackets specify a set or range of characters, matching any one of the characters inside [...]

 

  • [abc]
    • matches either a, b or c, but not the string abc
    • *.[ch] matches anything ending with .c or .h
  • [a-c]
    • matches any character between (inclusive) a and c
    • equivalent to the above
  • [!a-c] or [^a-c]
    • The ! or ^ at the beginning tells Bash to invert the match
    • Therefore matches any other character that is not a, b or c

 

Curly braces specify terms separated by a comma. Spaces are not allowed after the commas, or anywhere else

 

This example will move anything ending with .txt or .doc to the user's home directory:

  • mv {*.txt,*.doc} ~
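
To try these patterns safely, the sketch below creates some empty files (invented names) in a temporary directory and expands a few patterns against them:

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch foot food fools notes.txt report.doc main.c util.h .hidden

four=$(echo foo?)             # food foot (fools is too long)
sources=$(echo *.[ch])        # main.c util.h
docs=$(echo {*.txt,*.doc})    # notes.txt report.doc
all=$(echo *)                 # everything except .hidden

echo "$four | $sources | $docs"
echo "$all"

cd - > /dev/null
rm -r "$workdir"
```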

Arrays

An array is a variable containing multiple values.

 

There is no maximum limit to the size of an array, nor any requirement that member variables be indexed or assigned contiguously.

 

Indexed arrays are zero-based with the first element starting from the number 0.

 

 

Creating Arrays

 

The easiest way to create a simple array with data is by using the =() syntax, separating elements with whitespace:

  • bikes=(Ducati Aprilia Triumph Yamaha Honda)

 

Elements containing whitespace should be encapsulated within quotes:

  • names=("Freddie Mercury" Bono 'Brian May' $bandmember)

 

Elements may also be indexed (remember starting from 0), with gaps in the numbering if required:

  • cars=([0]=Probe [1]=Saab [2]=Lotus [16]=BMW [17]=DeLorean)

An array with holes in it is called a sparse array and can be quite useful.

 

Similarly, specific elements of an indexed array can be defined using their index value:

  • bikes[0]=Ducati
  • bikes[3]=Aprilia
  • bikes[4]=Triumph

 

Or, as an associative array:

  • declare -A transport=([bike]=Ducati [car]=probe [jet]=F35)

*note: capital A to signify an associative array. Also, if already declared as an indexed array, you will not be able to convert indexed to associative array and will therefore have to unset it first.

 

Globs can also be used to fill an array with filenames:

  • scripts=(~/"My Scripts"/*.sh)

Or adding to an array:

files+=(*.txt)

 

 

readarray

 

The readarray command is used to populate (read lines into) an array.

 

*readarray is a synonym for mapfile
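
The listing is missing; as a sketch, readarray can fill an array from a file, one line per element. The file here is a temporary one created just for the example.

```shell
#!/bin/bash
listfile=$(mktemp)
printf '%s\n' Ducati Aprilia Triumph > "$listfile"

# -t strips the trailing newline from each line read
readarray -t bikes < "$listfile"

echo "Read ${#bikes[@]} lines; the second is ${bikes[1]}"
rm "$listfile"
```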

 

 

Reading Arrays

 

@ or * are used to refer to all elements of an array.

 

To get the number of elements:

  • echo "${#myvar[@]}"

 

A single element:

  • echo "${myvar[7]}"

 

All elements (at once):

  • echo "${myvar[@]}"

 

Iterate through all elements:

  • for value in "${myvar[@]}"; do echo "$value"; done
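The choice between @ and * matters when looping, because a quoted * joins everything into one word. A short sketch, assuming bash, that counts the loop iterations for each form:

```shell
names=("Freddie Mercury" Bono "Brian May")

# Quoted @ yields one word per element, even when elements contain spaces.
at_count=0
for value in "${names[@]}"; do at_count=$((at_count + 1)); done

# Quoted * yields a single word containing all the elements.
star_count=0
for value in "${names[*]}"; do star_count=$((star_count + 1)); done

echo "@: $at_count  *: $star_count"   # @: 3  *: 1
```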

 

 

Deleting Arrays / Elements

 

Use unset on the array:

  • unset myvar
    • pass the name without the $, otherwise the shell expands the variable before unset sees it

 

Use unset on a specific array element:

  • unset 'myvar[2]'
    • the quotes stop the shell treating [2] as a glob pattern
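As a rough illustration (assuming bash), removing one element leaves a hole rather than renumbering the remaining elements:

```shell
bikes=(Ducati Aprilia Triumph)

unset 'bikes[1]'            # remove only the second element
remaining="${bikes[*]}"     # Ducati Triumph
indices="${!bikes[*]}"      # 0 2 (index 1 is now a hole)

unset bikes                 # remove the whole array
after=${#bikes[@]}          # 0
```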

 

 

A real world example

 

I had two sets of IP addresses that I wanted to check against each other.

 

The first set, about 60 IPs from uptimerobot, contains the addresses that I want to whitelist.

 

However, I thought fail2ban had already caught some of the above IPs and added rules to iptables (which had ~2000 rules!), so I needed to check one list against the other.

 

I therefore created the following shell script... First, I assigned my uptimerobot IPs (in an external file) to a variable $myvar and then used readarray to assign the contents into an array: $MYIPs. I then used readarray again to assign all the IPs in the external banned.txt file into another array: $ADDRESSES.

 

The astute reader might ask why I didn't use the second method for both files. I did it this way to show two different ways of getting external data into an array.

 

Next, I iterated through both arrays, echoing the IP from uptimerobot.txt whenever it matched an IP from banned.txt.
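The steps described above could be sketched roughly as follows. This is a reconstruction under assumptions, not the original script: the sample addresses are placeholders from the documentation ranges, the files are created in a temporary directory, and the matches array is an extra added so the result can be inspected afterwards. Only the names myvar, MYIPs, ADDRESSES, uptimerobot.txt and banned.txt come from the description.

```shell
# Sample data standing in for the real files (placeholder addresses).
workdir=$(mktemp -d)
printf '%s\n' 192.0.2.10 192.0.2.20 192.0.2.30 > "$workdir/uptimerobot.txt"
printf '%s\n' 198.51.100.7 192.0.2.20 203.0.113.9 > "$workdir/banned.txt"

# Method 1: file contents into a variable, then into an array via a here-string.
myvar=$(cat "$workdir/uptimerobot.txt")
readarray -t MYIPs <<< "$myvar"

# Method 2: read the file straight into an array.
readarray -t ADDRESSES < "$workdir/banned.txt"

# Compare every whitelisted IP against every banned IP.
matches=()
for ip in "${MYIPs[@]}"; do
  for banned in "${ADDRESSES[@]}"; do
    if [ "$ip" = "$banned" ]; then
      echo "$ip"
      matches+=("$ip")
    fi
  done
done

rm -rf "$workdir"
```

With the sample data above, only 192.0.2.20 appears in both lists.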

 

 

After that, all I had to do was delete the rule for each matching IP from the iptables chain, and all was good.

Redirection

As mentioned above, there are always three default files open, stdin (the keyboard), stdout (the screen) and stderr (error messages output to the screen).

 

These, and any other open files, can be redirected.

 

The redirection operators are:

 

  • command > filename
    • Create file, if it doesn't already exist, and send output to it
    • If file does exist, its contents will be truncated (overwritten)
  • command >> filename
    • Append output to file
    • Create file, if it doesn't already exist
  • command < filename
    • Read input from file

 

Redirection simply means capturing output from a file, command, program, script, or even code block within a script and sending it as input to another file, command, program, or script.

 

Send output from previous shell script into text file:

 

Append output from previous shell script to text file:

 

Read text file into cat command:
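The three basic operators can be tried out as below. This is only a sketch: echo stands in for the script mentioned above, and the output file is a temporary one created for the example.

```shell
outfile=$(mktemp)

echo "first run"  > "$outfile"    # create (or truncate) the file and write to it
echo "second run" >> "$outfile"   # append to the same file

contents=$(cat < "$outfile")      # cat reads its input from the file
echo "$contents"

rm -f "$outfile"
```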

 

Similarly, redirection can be performed on:

  1. stdout to a file
  2. stderr to a file
  3. stdout to stderr
  4. stderr to stdout
  5. stderr and stdout to a file
  6. stderr and stdout to stdout
  7. stderr and stdout to stderr
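Several of the combinations listed above can be sketched using file descriptor numbers (1 for stdout, 2 for stderr). The emit function is a hypothetical helper that writes one line to each stream, and the files are temporary:

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit () { echo "to stdout"; echo "to stderr" >&2; }

out=$(mktemp); err=$(mktemp); both=$(mktemp)

emit > "$out" 2> "$err"     # stdout to one file, stderr to another
emit > "$both" 2>&1         # stderr and stdout to the same file

captured=$(emit 2>&1)                  # stderr joins stdout, so both are captured
errors_only=$(emit 2>&1 > /dev/null)   # order matters: only stderr is captured here

out_content=$(cat "$out")
err_content=$(cat "$err")
both_content=$(cat "$both")

rm -f "$out" "$err" "$both"
```

Redirections are processed left to right, which is why 2>&1 has to come after > file when both streams should end up in the file.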

 

 

 

The redirection operators can also be combined:
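For instance, a single command can take its input from one file, write its output to a second and append any errors to a third. A sketch using temporary files with made-up contents:

```shell
infile=$(mktemp); outfile=$(mktemp); errfile=$(mktemp)
printf '%s\n' Yamaha Ducati Honda > "$infile"

# Input from infile, output to outfile, errors appended to errfile.
sort < "$infile" > "$outfile" 2>> "$errfile"

sorted=$(cat "$outfile")
echo "$sorted"

rm -f "$infile" "$outfile" "$errfile"
```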

Pipes

Pipes are used to direct the output of the preceding command (on the left) into the input of the following command (on the right).

 

Uses the | operator:

 

  • command 1 | command 2
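Pipes can be chained, with each command reading what the previous one wrote. A small sketch with made-up input:

```shell
# printf produces four lines, sort orders them, uniq drops the repeated
# line and head keeps only the first line of what is left.
first=$(printf '%s\n' Honda Ducati Honda Yamaha | sort | uniq | head -n 1)
echo "$first"   # Ducati
```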

Linking commands

Following on from pipes, we also have the ability to link commands in other ways:

 

  • ;
    • command 1 ; command 2 ; command 3
    • Do command 1, followed by command 2, followed by command 3
  • &
    • command &
    • Do command in a background subshell
    • Current shell does not wait for the command to finish, thus allowing the user to continue in the current shell
    • Use the jobs command to see which processes are running in the background
  • &&
    • command 1 && command 2
    • Do command 1 and then proceed to command 2, only if command 1 exited successfully i.e. zero exit status
    • akin to logical AND
  • ||
    • command 1 || command 2
    • Do command 1 and then proceed to command 2, only if command 1 failed i.e. non-zero exit status
    • akin to logical OR

 

The above should be fairly self-explanatory.

 

We have also already seen the pseudo-ternary construct delivered using:

  • [ condition ] && command1 || command2
    • note: command2 also runs if command1 itself fails, so this is not a strict if/else
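The behaviour of && and || can be checked with the true and false commands, which do nothing except exit with a zero and a non-zero status respectively. A brief sketch:

```shell
result=$(true && echo "ran" || echo "skipped")      # ran
fallback=$(false && echo "ran" || echo "skipped")   # skipped

# The condition is true here, but the middle command fails,
# so the command after || runs as well.
pitfall=$([ 1 -eq 1 ] && false || echo "fallback ran anyway")

echo "$result / $fallback / $pitfall"
```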

Filters

A filter is generally referred to as a program that reads standard input, performs an operation upon it and writes the results to standard output.

 

The following provides examples against some of the more common command line tools.

*example data

 

 

cut

Extracts the specified bytes, characters or fields from each line of a file, based on the following options:

  • -b    byte position
  • -c    character position
  • -f     field position
    • -d    specifies a delimiter (default: tab)
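A short sketch of cut in use. The colon-separated record is made up for the example:

```shell
line="rossi:46:Yamaha"

rider=$(echo "$line" | cut -d: -f1)         # first field:  rossi
team=$(echo "$line" | cut -d: -f3)          # third field:  Yamaha
first_three=$(echo "$line" | cut -c1-3)     # characters 1 to 3: ros
```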

 

 

grep

Extracts lines from files matching a specified pattern. Syntax: grep 'pattern' files-to-be-matched

  • grep 'rossi' *

*useful examples

 

 

head

Extracts the first 10 lines of a file by default. Use -n to specify the number of lines:

  • head targetFile
  • head -n 6 targetFile

 

 

nl

Number each line within a file:

  • nl targetFile

 

 

paste

Merges multiple files into a single multi-column file:

  • paste targetFile somethingElse

 

 

sort

Sorts input. Default: alphabetically

  • sort targetFile

 

 

split

Splits a file into separate parts:

  • split targetFile
    • default size is 1000 lines

 

 

tac

Reverse of cat: prints the lines of a file in reverse order, last line first:

  • tac targetFile

 

 

tail

Extracts the last 10 lines of a file by default. Use -n to specify the number of lines:

  • tail targetFile
  • tail -n 6 targetFile

 

 

tee

Used in conjunction with the pipe operator |

Copies stdin to the specified file(s) and also to stdout:

  • command | tee targetFile
  • command | tee targetFile1 targetFile2 targetFile3

or, to append:

  • command | tee -a targetFile

 

 

tr

Translates (or deletes) characters read from stdin:

  • tr 'a-z' 'A-Z'
    • above translates lower to upper case
    • -d deletes the characters in the given set
    • -c uses the complement of the given set
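A few quick examples of tr, which only ever reads from stdin. The input strings are made up:

```shell
upper=$(echo "ducati" | tr 'a-z' 'A-Z')                # DUCATI
no_vowels=$(echo "ducati" | tr -d 'aeiou')             # dct

# -c with -d deletes everything that is NOT in the set (including the newline).
digits_only=$(echo "item 46, lap 7" | tr -cd '0-9')    # 467
```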

 

 

uniq

Removes adjacent duplicate lines from targetFile, so the input is usually sorted first:

  • uniq targetFile
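Because uniq only compares neighbouring lines, the effect of sorting first can be seen in a small sketch like this (made-up input):

```shell
input=$(printf '%s\n' Honda Ducati Honda Honda)

# Only the two neighbouring Honda lines are collapsed.
adjacent_only=$(echo "$input" | uniq)

# Sorting first brings all duplicates together.
fully_unique=$(echo "$input" | sort | uniq)
```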

 

 

wc

Counts the lines, words and bytes in targetFile (all three are printed by default):

  • wc targetFile
    • -c byte count
    • -m character count
    • -l line count
    • -w word count

$IFS Internal Field Separator

Reserved / special variable that determines how the shell recognises field or word boundaries (i.e. the characters that separate fields or words) when it splits the results of expansions or the input given to read.

 

$IFS defaults to whitespace (space, tab or newline) but can be changed, for example, to parse a csv file:

  • IFS=','
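Rather than changing IFS for the whole script, it can be set for a single command. A sketch assuming bash (for the <<< here-string), with a made-up record:

```shell
record="rossi,46,Yamaha"

# IFS is changed only for this read command; the shell's own IFS is untouched.
IFS=',' read -r rider number team <<< "$record"

echo "$rider rides number $number for $team"   # rossi rides number 46 for Yamaha
```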

Functions

Functions provide a method of grouping frequently used (sets of) commands together using an easily referenced user defined name. This allows pieces of code to be grouped in a more logical way, that can be called upon or repeated, as required.

 

A function is created by placing parentheses after the function name, or by using the function keyword followed by the function name:

 

function_name () {
  <commands>
}

or

function function_name {
  <commands>
}

 

Either of the above is valid, but unlike other programming languages the parentheses can not be used to pass arguments into the function.

 

To use a function, simply specify the function name:
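A minimal sketch, with greet as a made-up function name:

```shell
# Define the function first...
greet () {
  echo "Hello from a function"
}

# ...then call it by name, just like any other command.
greet
greeting=$(greet)
```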

 

Arguments can be passed into a function in a similar method as passing in from the command line:
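Inside the function the arguments are available as $1, $2 and so on, and $# holds their count. A sketch with a hypothetical describe function:

```shell
describe () {
  echo "$1 is made by $2 ($# arguments)"
}

# Arguments follow the function name, separated by spaces.
describe Panigale Ducati
description=$(describe Panigale Ducati)
```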

 

A numerical status (an integer from 0 to 255) can be returned from the function by using the keyword return followed by the value, which can then be accessed via the special variable $?

*Typically a return status of 0 indicates that everything went successfully. A non-zero value indicates an error occurred.
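A small sketch of return and $?, using a hypothetical is_even function:

```shell
is_even () {
  if [ $(( $1 % 2 )) -eq 0 ]; then
    return 0    # success
  fi
  return 1      # failure
}

is_even 4
even_status=$?                # 0

odd_status=0
is_even 7 || odd_status=$?    # $? still holds is_even's status here: 1

echo "$even_status $odd_status"
```

Because the status follows the usual convention, the function can also be used directly in a test, e.g. if is_even 4; then ...; fi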

 

Command substitution can be used to return the result of a calculation:

The target file is passed in as an argument when running the script:

  • myScript.sh targetFile.txt

The result is that the passed in file is used as the argument when the function is called and its returned value is assigned to the calling variable, ultimately counting and echoing the number of lines in the file.
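A script along those lines might look like the sketch below. It is not the original script: count_lines is a made-up function name, and a temporary sample file stands in for the file that would normally be passed in as "$1".

```shell
# The function prints its result; command substitution captures it.
count_lines () {
  wc -l < "$1"
}

# Sample file used here in place of the script argument "$1".
sample=$(mktemp)
printf 'a\nb\nc\n' > "$sample"
target="$sample"

line_count=$(count_lines "$target")
echo "$target has $line_count lines"

rm -f "$sample"
```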

sed

sed, the Stream EDitor, is a non-interactive text editor that performs editing operations on text coming from standard input or a file.

 

sed reads its input line by line, applies the given commands to each line and outputs the resulting text before repeating the process on the next line.

 

Basic usage:

  • sed [options] command [targetFile]

 

sed sends its results to stdout (the screen) by default and can therefore be used to display a file's contents:

  • sed '' targetFile.txt
    • in the above example there's no command being sent in the single quotes, so it simply printed each line it received to stdout
  • cat targetFile.txt | sed ''
    • sed can use stdin by piping the output of cat for the same result

 

 

Printing lines

sed's explicit "print" command is specified by the 'p' character within single quotes:

  • sed 'p' targetFile.txt
    • notice each line is now printed twice: once automatically, and once because we've told sed to print explicitly with the 'p' command

The -n option can be used to suppress (the default) printing:

  • sed -n 'p' targetFile.txt
    • so it now only prints each line once
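The doubling is easy to confirm by counting output lines. A sketch using a temporary three-line file:

```shell
sample=$(mktemp)
printf '%s\n' one two three > "$sample"

doubled=$(sed 'p' "$sample" | wc -l)      # 6: every line printed twice
single=$(sed -n 'p' "$sample" | wc -l)    # 3: automatic printing suppressed

rm -f "$sample"
```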

 

 

Address Ranges

Print just the first line:

  • sed -n '1p' targetFile.txt

 

Print five lines:

  • sed -n '1,5p' targetFile.txt
    • this is known as an address range
  • sed -n '1,+4p' targetFile.txt
    • alternative syntax (a GNU sed extension): line 1 plus the 4 lines that follow

Print every other line, starting from line 1 (the first~step form is a GNU sed extension):

  • sed -n '1~2p' targetFile.txt
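The address forms can be compared side by side on a short numbered input. This sketch assumes GNU sed, since the addr,+N and first~step forms are not available in every sed:

```shell
numbers=$(printf '%s\n' 1 2 3 4 5 6)

range=$(echo "$numbers" | sed -n '2,4p')         # lines 2 to 4
relative=$(echo "$numbers" | sed -n '2,+1p')     # line 2 and the 1 line after it
odd_lines=$(echo "$numbers" | sed -n '1~2p')     # every 2nd line, starting at line 1
```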

 

 

Deleting text

Delete every other line:

  • sed '1~2d' targetFile.txt

Send output to a file:

  • sed '1~2d' targetFile.txt > newFile.txt

 

By default, sed doesn't edit the source file. This can be overridden with the -i option, which performs the edits in place on the source file:

  • sed -i '1~2d' targetFile.txt

This can be dangerous, but sed allows a backup to be created prior to editing:

  • sed -i.bak '1~2d' targetFile.txt
    • above creates a backup file with the .bak extension
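A safe way to try in-place editing is on a throwaway copy. In this sketch the file lives in a temporary directory and only the second line is deleted:

```shell
workdir=$(mktemp -d)
printf '%s\n' one two three four > "$workdir/targetFile.txt"

# Delete line 2 in place, keeping the original as targetFile.txt.bak
sed -i.bak '2d' "$workdir/targetFile.txt"

edited=$(cat "$workdir/targetFile.txt")
backup=$(cat "$workdir/targetFile.txt.bak")

rm -rf "$workdir"
```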

 

 

Substituting text

  • sed s/search/replace/
    • the initial s specifies the action to be performed: substitute
    • the pattern being searched for is placed between the first and second /
    • the replacement pattern is between the second and third /

 

Other characters can also be used as delimiters, which come in handy if / is within the search or replace patterns:

  • s_search_replace_
  • echo "https://www.example.com/index.html" | sed 's_com/index_org/home_'
    • https://www.example.org/home.html

 

sed substitutes the first instance it finds of the pattern, line by line! To replace all occurrences, use the g flag after the substitution set:

  • sed s/search/replace/g
    • the g specifies a global replacement, otherwise it would only replace the first pattern match

 

If we only wanted to change the second instance of the search pattern that sed finds on each line, then we could use the number "2" instead of the "g":

  • sed s/search/replace/2
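The three forms of substitution can be compared on a single made-up line:

```shell
line="one two one two one"

first=$(echo "$line" | sed 's/one/1/')      # 1 two one two one
all=$(echo "$line" | sed 's/one/1/g')       # 1 two 1 two 1
second=$(echo "$line" | sed 's/one/1/2')    # one two 1 two one
```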

 

More complex patterns can be matched by using regular expressions. For instance, if we want to match from the beginning of the line to "at" we can use the expression:

  • sed 's/^.*at/REPLACED/' targetFile.txt
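One thing worth knowing about this pattern is that .* is greedy, so the match extends to the last "at" on the line rather than the first. A sketch with a made-up sentence:

```shell
# "at" occurs in cat, sat and mat; the match runs up to the last one.
greedy=$(echo "the cat sat on the mat today" | sed 's/^.*at/REPLACED/')
echo "$greedy"   # REPLACED today
```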

 

 

some further examples