Filters

A filter is a program that reads standard input, performs an operation on it, and writes the result to standard output.
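As a small illustration of the idea, the sketch below chains several filters with pipes, so that each one reads the previous command's standard output as its own standard input. The rider names are made-up sample data, not taken from the example file.

```shell
#! /bin/bash

#sample input (hypothetical data), one name per line
riders=$(printf '%s\n' rossi marquez rossi dovizioso marquez rossi)

#sort groups identical lines together, uniq -c counts each group,
#sort -rn puts the highest count first
ranked=$(printf '%s\n' "$riders" | sort | uniq -c | sort -rn)
echo "$ranked"
```

The first line of output is the count 3 followed by rossi; the amount of padding before the count varies between implementations.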

 

The following sections give examples for some of the more common command-line filters.

*example data

 

 

cut

Extracts the specified columns from each line of a file, selected with one of the following options:

  • -b    byte position
  • -c    character position
  • -f    field position
    • -d    specifies the field delimiter (default: tab)
#! /bin/bash

#a few cut byte position examples
cut -b 2 2018motogp.txt
cut -b 2-15 2018motogp.txt
cut -b 2,7 2018motogp.txt

#a few cut character position examples
cut -c 2 2018motogp.txt
cut -c 2-15 2018motogp.txt
cut -c 2,7 2018motogp.txt

#a few cut field position examples
cut -f 2 2018motogp.txt
cut -f 2-4 2018motogp.txt
cut -f 2,5 2018motogp.txt
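The -d option listed above is not used in the script, so here is a minimal sketch of it. The comma-separated line is hypothetical sample data, not the layout of the example file.

```shell
#! /bin/bash

#hypothetical comma-separated line: position,rider,team,points
line='1,riderA,teamX,100'

#-d ',' sets the delimiter, -f picks the fields
rider=$(echo "$line" | cut -d ',' -f 2)
riderAndPoints=$(echo "$line" | cut -d ',' -f 2,4)

echo "$rider"            #riderA
echo "$riderAndPoints"   #riderA,100
```

When several fields are selected, cut joins them with the same delimiter it split on.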

 

 

grep

Extracts lines from files that match a specified pattern. The syntax is grep 'pattern' files-to-be-matched, for example:

  • grep 'rossi' *
#! /bin/bash

#simple grep
grep ros *

#case insensitive
grep -i ros *

#full word
grep -iw rossi *
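To make the difference between the three variants visible without depending on the files in the current directory, the sketch below runs them against a few made-up lines read from standard input.

```shell
#! /bin/bash

#hypothetical sample lines
sample=$(printf '%s\n' 'Valentino Rossi' 'across the line' 'ROSSI wins')

#plain match is case sensitive and matches inside words
substring=$(echo "$sample" | grep ros)
#-i ignores case
anyCase=$(echo "$sample" | grep -i ros)
#-w only matches whole words
wholeWord=$(echo "$sample" | grep -iw rossi)

echo "$substring"   #across the line
echo "$anyCase"     #all three lines
echo "$wholeWord"   #Valentino Rossi and ROSSI wins
```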

*useful examples

 

 

head

Extracts the first 10 lines of a file by default. Use -n to specify the number of lines:

  • head targetFile
  • head -n 6 targetFile
#! /bin/bash
#default first 10 lines
head 2018motogp.txt
#specified first 6 lines
head -n 6 2018motogp.txt

 

 

nl

Numbers each line of a file (by default, blank lines are not numbered):

  • nl targetFile
#! /bin/bash

nl 2018motogp.txt

 

 

paste

Merges multiple files into a single multi-column file:

  • paste targetFile somethingElse
#! /bin/bash

#create first.txt with a few lines of text
cat << hereFirst > first.txt
one
two
three
hereFirst

#create second.txt with a few lines of text
cat << hereSecond >second.txt
banana
potatoes
figs
hereSecond

#paste them together
paste first.txt second.txt > third.txt

#have a look
cat third.txt

 

 

sort

Sorts the lines of its input. Default order: alphabetical

  • sort targetFile
#! /bin/bash

sort 2018motogp.txt
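Alphabetical order treats digits as text, which can surprise when the input is numeric. The sketch below contrasts the default with the -n (numeric) and -r (reverse) options on a few sample numbers.

```shell
#! /bin/bash

numbers=$(printf '%s\n' 10 9 100 1)

#default: compared as text
alpha=$(echo "$numbers" | sort)
#-n: compared as numbers
numeric=$(echo "$numbers" | sort -n)
#-rn: numeric, largest first
reversed=$(echo "$numbers" | sort -rn)

echo $alpha      #1 10 100 9
echo $numeric    #1 9 10 100
echo $reversed   #100 10 9 1
```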

 

 

split

Splits a file into separate parts:

  • split targetFile
    • default size is 1000 lines
#! /bin/bash
#create an empty file
: > '2000lines.txt'
i=1
#populate the file with 2000 lines
while [ $i -le 2000 ] ; do
	echo "line: $i" >> '2000lines.txt'
	((i++))
done
#split file
split '2000lines.txt'
#clear screen
clear
#observe two new files have been created xaa and xab
ls -ltr
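The piece size and the output names can also be chosen explicitly. A minimal sketch, assuming seq is available to generate the test lines; the file name lines.txt and the prefix part_ are made up for this example.

```shell
#! /bin/bash

#work in a scratch directory so no existing files are touched
workDir=$(mktemp -d)
cd "$workDir" || exit 1

seq 1 2000 > lines.txt

#-l sets the number of lines per piece, part_ replaces the default x prefix
split -l 500 lines.txt part_

ls part_*          #part_aa part_ab part_ac part_ad
pieces=$(ls part_* | wc -l)
firstPieceLines=$(wc -l < part_aa)
```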

 

 

tac

The reverse of cat: prints the lines of a file in reverse order, last line first:

  • tac targetFile
#! /bin/bash

tac '2018motogp.txt'

#observe output is sent to stdout in reverse order

 

 

tail

Extracts the last 10 lines of a file by default. Use -n to specify the number of lines:

  • tail targetFile
  • tail -n 6 targetFile
#! /bin/bash
echo default last 10 lines
tail 2018motogp.txt
echo specified last 6 lines
tail -n 6 2018motogp.txt
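head and tail can be combined to pull a range of lines out of the middle of the input. A short sketch, assuming seq is available to generate numbered sample lines:

```shell
#! /bin/bash

#keep the first 8 lines, then the last 3 of those: lines 6 to 8
middle=$(seq 1 20 | head -n 8 | tail -n 3)
echo "$middle"
```

This prints 6, 7 and 8 on separate lines.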

 

 

tee

Used in conjunction with the pipe operator |

Copies the stdin to the specified file(s) and stdout:

  • command | tee targetFile
  • command | tee targetFile1 targetFile2 targetFile3

or, to append:

  • command | tee -a targetFile
#! /bin/bash
clear
#print targetFile and also write a copy of it to a new file
cat 2018motogp.txt | tee myNewFile.txt
echo -----------------------

#same as above, but appended
cat 2018motogp.txt | tee -a myNewFile.txt
echo -----------------------

#observe appended text
cat myNewFile.txt
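tee is most useful in the middle of a pipeline, where it saves an intermediate result while the data carries on to the next filter. A small sketch with made-up input and a scratch file from mktemp:

```shell
#! /bin/bash

#scratch file for the intermediate copy
sortedCopy=$(mktemp)

#tee writes the sorted lines to the file and also passes them on to head
first=$(printf '%s\n' cherry apple banana | sort | tee "$sortedCopy" | head -n 1)

echo "$first"        #apple
cat "$sortedCopy"    #apple, banana and cherry on separate lines
```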

 

 

tr

Translates (or deletes) characters:

  • tr "[a-z]" "[A-Z]"
    • above translates lower to upper case
    • -d deletes the characters in the given set
    • -c complements the set (every character not listed)
#! /bin/bash
clear

#translate text file to all upper case
tr "[a-z]" "[A-Z]" < '2018motogp.txt' > upper.txt
cat upper.txt

#translate text file to all lower case
tr "[A-Z]" "[a-z]" < '2018motogp.txt' > lower.txt
cat lower.txt

#delete all numerals from file
tr -d 0-9 < '2018motogp.txt' > alphasOnly.txt
cat alphasOnly.txt

echo "Nothing to be seen here" | tr -d nothing

echo abcdefghijk | tr -d d-h

echo abcdefghijk | tr -c -d d-h
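The last three commands are easier to follow with their results next to them. The sketch below repeats them and captures the output in variables; note that tr works on sets of characters, not on words.

```shell
#! /bin/bash

#"nothing" is a set of characters: every n, o, t, h, i and g is deleted
noSet=$(echo "Nothing to be seen here" | tr -d nothing)
echo "$noSet"     #N  be see ere

#delete d through h
deleted=$(echo abcdefghijk | tr -d d-h)
echo "$deleted"   #abcijk

#-c complements the set: delete everything except d through h
kept=$(echo abcdefghijk | tr -c -d d-h)
echo "$kept"      #defgh
```

The capital N survives because the set only contains lower-case letters, and with -c -d the trailing newline is deleted as well, since it is not in the d-h range.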

 

 

uniq

Removes adjacent duplicate lines from targetFile (sort the input first to remove all duplicates):

  • uniq targetFile
#! /bin/bash

#test file to pull out unique lines
cat << heredoc > uniqTest.txt
First repeated line
First repeated line
First repeated line
First repeated line
First repeated line
First repeated line
Second repeated line
Second repeated line
Second repeated line
Second repeated line
Third repeated line
Third repeated line
Third repeated line
Third repeated line
heredoc

uniq uniqTest.txt
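uniq only compares neighbouring lines, so duplicates that are spread through the input survive unless it is sorted first. A sketch with made-up lines, including the -c option that prefixes each line with its count:

```shell
#! /bin/bash

#duplicates that are not adjacent (hypothetical sample)
mixed=$(printf '%s\n' apple banana apple banana apple)

#no two neighbouring lines match, so nothing is removed
alone=$(echo "$mixed" | uniq)
#sorting first brings the duplicates together
sorted=$(echo "$mixed" | sort | uniq)
#-c adds a count in front of each remaining line
counted=$(echo "$mixed" | sort | uniq -c)

echo "$alone"     #same five lines as the input
echo "$sorted"    #apple and banana
echo "$counted"   #3 apple and 2 banana
```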

 

 

wc

Counts the lines, words and bytes in a targetFile; the options below select a single count:

  • wc targetFile
    • -c byte count
    • -m character count
    • -l line count
    • -w word count
#! /bin/bash

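#wc prints "count filename"; the ( ) array splits that output, so $a expands to its first element, the count
a=($( wc -c '2018motogp.txt' ))
echo "There are $a bytes in the file"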

b=($( wc -m '2018motogp.txt' ))
echo "There are $b characters in the file"

c=($( wc -l '2018motogp.txt' ))
echo "There are $c lines in the file"

d=($( wc -w '2018motogp.txt' ))
echo "There are $d words in the file"
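When wc reads from standard input it has no file name to print, so only the count appears. A short sketch using that as a filter at the end of a pipe; the input strings are made up.

```shell
#! /bin/bash

#reading from a pipe, wc prints only the count
words=$(echo "one two three" | wc -w)
lines=$(printf 'a\nb\n' | wc -l)

#$(( )) drops the padding some wc versions put in front of the number
echo "words: $((words)) lines: $((lines))"   #words: 3 lines: 2
```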