Thursday, December 11, 2008

Unix command examples


man get manual page on a UNIX command


example: man uniq




cut extract columns of data

example: cut -f -3,5,7-9 -d ' ' infile1 > outfile1
example: ls -l | awk '{print $8}' | cut -d '.' -f -2


-f 2,4-6 field
-c 35-44 character
-d ':' delimiter (default is a tab)
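
A minimal sketch of the -f, -c, and -d options above. The file users.txt and its contents are made up for illustration and are not from the original examples.

```shell
# Hypothetical sample data: colon-delimited records.
tmpdir=$(mktemp -d)
printf 'alice:x:1001:staff\nbob:x:1002:admin\n' > "$tmpdir/users.txt"

# Fields 1 and 3, using ':' as the delimiter instead of the default tab.
cut_fields=$(cut -d ':' -f 1,3 "$tmpdir/users.txt")

# Characters 1 through 3 of each line.
cut_chars=$(cut -c 1-3 "$tmpdir/users.txt")

echo "$cut_fields"
echo "$cut_chars"
rm -r "$tmpdir"
```

Note that cut keeps the delimiter between the selected fields in its output.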



sort sort lines of a file (Warning: default delimiter is white space/character transition)

example: sort -nr infile1 | more
Linux example: ls -l | sort -nr -k 5,5
Linux example: df -k '/b01/oradata/' | sort -nr -k 2,2
(older sort versions also accept the zero-based forms +4 -5 and +1 -2)

-n numeric sort
-r reverse sort
-k 3,5 sort key (starts at field 3, ends at field 5)
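
A small sketch of how -n and -k change the result. The file sizes.txt is a made-up example, and LC_ALL=C is set only as an assumption to keep the ordering the same on every system.

```shell
# Hypothetical sample data: a name and a number per line.
tmpdir=$(mktemp -d)
printf 'a 30\nb 4\nc 200\nd 5\n' > "$tmpdir/sizes.txt"

# Text sort on field 2: compared character by character,
# so "200" sorts before "30", which sorts before "4".
text_sorted=$(LC_ALL=C sort -k 2,2 "$tmpdir/sizes.txt")

# Numeric sort on field 2, largest value first.
numeric_desc=$(LC_ALL=C sort -nr -k 2,2 "$tmpdir/sizes.txt")

echo "$text_sorted"
echo "$numeric_desc"
rm -r "$tmpdir"
```

Without -n the numbers are ordered as text, which is rarely what is wanted for sizes or counts.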


wc count lines, words, and characters in a file

example: wc -l infile1

-l count lines
-w count words
-c count characters


paste reattach columns of data

example: paste infile1 infile2 > outfile2


cat concatenate files together

example: cat infile1 infile2 > outfile2

-n number lines
-vet show non-printing characters (good
for finding problems)


uniq remove duplicate lines (normally from a sorted file)

example: sort infile1 | uniq -c > outfile2

-c show count of lines
-d only show duplicate lines
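
A sketch of why the input is sorted first: uniq only compares adjacent lines. The fruit names are invented sample data, and the awk step is there only to normalize the padding that uniq -c puts in front of each count.

```shell
# Hypothetical sample data with repeated lines in no particular order.
tmpdir=$(mktemp -d)
printf 'pear\napple\npear\napple\npear\nfig\n' > "$tmpdir/fruit.txt"

# Count how often each line occurs; awk strips the leading padding.
counts=$(LC_ALL=C sort "$tmpdir/fruit.txt" | uniq -c | awk '{print $1, $2}')

# Show only the lines that occur more than once.
dupes=$(LC_ALL=C sort "$tmpdir/fruit.txt" | uniq -d)

echo "$counts"
echo "$dupes"
rm -r "$tmpdir"
```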


join perform a relational join on two files

example: join -1 1 -2 3 infile1 infile2 > outfile1

-1 FIELD join field of infile1
-2 FIELD join field of infile2
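
A sketch of the join example on two tiny made-up files (names.txt and regions.txt are hypothetical). Both files have to be sorted on their join field for join to pair the lines correctly.

```shell
# Hypothetical sample data, already sorted on the join field.
tmpdir=$(mktemp -d)
printf '1 alpha\n2 beta\n3 gamma\n' > "$tmpdir/names.txt"       # key in field 1
printf 'north 10 1\nsouth 20 2\n' > "$tmpdir/regions.txt"       # key in field 3

# Join field 1 of the first file with field 3 of the second file.
# The key is printed first, then the remaining fields of each file.
joined=$(join -1 1 -2 3 "$tmpdir/names.txt" "$tmpdir/regions.txt")

echo "$joined"
rm -r "$tmpdir"
```

The line with key 3 has no partner in the second file, so it does not appear in the output.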


cmp compare two files

example: cmp infile1 infile2


diff or diff3 compare 2 or 3 files - show differences

example: diff infile1 infile2 | more
example: diff3 infile1 infile2 infile3 > outfile1


head extract lines from a file counting from the beginning

example: head -100 infile1 > outfile1


tail extract lines from a file counting from the end

example: tail -n +2 infile1 > outfile1

-n N output the last N lines (N is an integer)
-n +N output from line N through the end of the file (older systems: tail +N)
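
A short sketch of head and tail on a made-up five-line file, using the -n spelling of the options, which current systems accept.

```shell
# Hypothetical sample data: five numbered lines.
tmpdir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n' > "$tmpdir/lines.txt"

first_two=$(head -n 2 "$tmpdir/lines.txt")      # lines 1-2
last_two=$(tail -n 2 "$tmpdir/lines.txt")       # lines 4-5
skip_header=$(tail -n +2 "$tmpdir/lines.txt")   # everything from line 2 on

echo "$first_two"
echo "$last_two"
echo "$skip_header"
rm -r "$tmpdir"
```

The last form is a common way to drop a one-line header before further processing.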


dos2unix convert dos-based characters to UNIX format (the file is
overwritten).

example: dos2unix infile1


tr translate characters - example shows replacement of spaces
with newline character

example: tr " " "[\012*]" < infile > outfile
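
A sketch of the same space-to-newline replacement on a made-up input string. It uses the '\n' escape rather than the octal form, on the assumption that the local tr understands it (common implementations do).

```shell
# Replace every space with a newline, giving one word per line.
words=$(printf 'one two three\n' | tr ' ' '\n')

# Counting the resulting lines gives a word count; tr -d strips wc padding.
word_count=$(printf 'one two three\n' | tr ' ' '\n' | wc -l | tr -d ' ')

echo "$words"
echo "$word_count"
```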


grep extract lines from a file based on search strings and
regular expressions

example: grep 'Basin1' infile1 > outfile2
example: grep -E '15:20|15:01' infile1 | more
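
A sketch of a plain search next to an extended-regex search with alternation. The file log.txt and its timestamps are invented for illustration.

```shell
# Hypothetical sample data: a time and a label per line.
tmpdir=$(mktemp -d)
printf '15:20 start\n15:01 stop\n16:00 idle\n' > "$tmpdir/log.txt"

# Plain search string: lines containing 16:00.
fixed_match=$(grep '16:00' "$tmpdir/log.txt")

# Extended regular expression: lines containing either of two times.
either_match=$(grep -E '15:20|15:01' "$tmpdir/log.txt")

echo "$fixed_match"
echo "$either_match"
rm -r "$tmpdir"
```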


sed search and replace parts of a file based on regular
expressions

example: sed -e 's/450/45/g' infile1 > outfile3


Regular Expressions

Regular expressions can be used with many programs including grep, sed,
vi, emacs, perl, etc. The filename patterns that the shell expands for ls
look similar but follow different rules. Be aware that each program has
variations on usage.

ls examples:

ls Data*.txt
ls Data4[5-9].ps list ps files beginning with Data numbered 45-49


sed examples: (these are the regex part of the sed command only)

s/450/45/g search for '450' replace with '45' everywhere
s/99/-9999\.00/g search for all '99' replace with '-9999.00'
s/Basin[0-9]//g remove the word Basin followed by a single digit
s/^12/12XX/ search for '12' at the beginning of a line,
insert XX
s/Basin$// remove the word Basin if it is at the end of
the line.
s/^Basin$// remove the word Basin if it is the only word on
the line.
s/[cC]/100/g search for 'c' or 'C' replace with 100

45,$s/\([0-9][0-9]\)\.\([0-9][0-9]\)/\2\.\1/g
on lines 45 to the end of file, search for two digits
followed by a '.' followed by two digits. replace
with the digit pairs reversed.

2,$s/,\([^,]*\),/,\"\1\",/
on all lines except the first, search for the first
comma, followed by any text, followed by a comma.
replace with the found text surrounded by double quotes.

s/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9][0-9][0-9]\)/Year = \3, Month = \2, Day = \1/
search for 2 digits, followed by a colon, followed by 2 digits,
followed by a colon, followed by 4 digits. replace with
text plus values in a different order.
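
A sketch that runs a few of the expressions above on made-up input strings, so the effect of the \( \) groups and the \1, \2, \3 references can be seen directly.

```shell
# Swap two digit pairs around a '.'.
swapped=$(echo '12.34' | sed -e 's/\([0-9][0-9]\)\.\([0-9][0-9]\)/\2.\1/g')

# Put double quotes around the first value that sits between two commas.
quoted=$(echo 'a,b,c' | sed -e 's/,\([^,]*\),/,"\1",/')

# Reorder day:month:year into labelled text.
dated=$(echo '11:12:2008' | sed -e 's/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9][0-9][0-9]\)/Year = \3, Month = \2, Day = \1/')

# ^ and $ anchor the match to the whole line; 'Basin1' is left alone.
anchored=$(printf 'Basin\nBasin1\n' | sed -e 's/^Basin$//')

echo "$swapped"
echo "$quoted"
echo "$dated"
echo "$anchored"
```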


Pipes, standard input, standard output:

Redirecting standard output with ">" places the results of a command into
the file named after the ">". A new file will be written (an existing file
with the same name is overwritten). To append to an existing file, use ">>".
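
A small sketch of the difference between ">" and ">>", using a scratch file in a temporary directory (the path is created only for this illustration).

```shell
tmpdir=$(mktemp -d)
out="$tmpdir/outfile1"

echo 'first' > "$out"     # creates the file
echo 'second' > "$out"    # replaces the previous contents
after_overwrite=$(cat "$out")

echo 'third' >> "$out"    # appends to the existing contents
after_append=$(cat "$out")

echo "$after_overwrite"
echo "$after_append"
rm -r "$tmpdir"
```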

Pipes allow you to connect multiple commands together to form a data stream.
For example, to count the number of times the string "Nile" occurs in the
3rd column of a file run this:

cut -f 3 infile1 | sort | uniq -c | grep 'Nile'

or do this:

cut -f 3 infile1 | grep 'Nile' | wc -l
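
A sketch of both counting approaches on a tiny made-up tab-delimited file (rivers.txt is hypothetical). The awk and tr steps only strip the padding that uniq -c and wc may add.

```shell
# Hypothetical sample data: tab-delimited, river name in the 3rd column.
tmpdir=$(mktemp -d)
printf '1\tAfrica\tNile\n2\tAfrica\tCongo\n3\tAfrica\tNile\n' > "$tmpdir/rivers.txt"

# Approach 1: count every distinct value, then pick out the Nile line.
count_uniq=$(cut -f 3 "$tmpdir/rivers.txt" | sort | uniq -c | grep 'Nile' | awk '{print $1}')

# Approach 2: keep only the matching lines and count them.
count_wc=$(cut -f 3 "$tmpdir/rivers.txt" | grep 'Nile' | wc -l | tr -d ' ')

echo "$count_uniq"
echo "$count_wc"
rm -r "$tmpdir"
```

Both approaches count matching lines, not individual occurrences within a line.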


From a global STN Attributes data set (tab delimited):

- extract all North American basins draining into the Atlantic Ocean
- select only columns 2,3,4,5,11,12,13, and 17
- replace all missing data values (either -99 or -999) with -9999.0
- remove duplicate lines
- sort by the first column
- number all lines sequentially
- save to a new file

grep 'North America' STNAttributes.txt | grep 'Atlantic Ocean' | \
cut -f 2-5,11-13,17 | sed -e 's/-999\|-99/-9999\.0/g' | \
sort | uniq | cat -n > NewSTNAttributes.txt
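
The steps of this pipeline can be tried on a small stand-in file. Everything below is an assumption made for illustration: the file basins.txt, its five columns, and its values are invented and do not reflect the real STN Attributes layout. The missing-value step is written as two anchored sed expressions, a swapped-in variant that relies on the value being the last field on the line.

```shell
# Hypothetical stand-in data (tab delimited): id, name, continent, ocean, value.
tmpdir=$(mktemp -d)
{
  printf '1\tBasinB\tNorth America\tAtlantic Ocean\t-99\n'
  printf '2\tBasinA\tNorth America\tAtlantic Ocean\t12.5\n'
  printf '3\tBasinC\tNorth America\tPacific Ocean\t7.0\n'
  printf '4\tBasinA\tNorth America\tAtlantic Ocean\t12.5\n'
  printf '5\tBasinD\tAfrica\tAtlantic Ocean\t-999\n'
} > "$tmpdir/basins.txt"

# Filter rows, keep columns 2 and 5, replace missing values,
# sort, drop duplicate lines, and number the result.
grep 'North America' "$tmpdir/basins.txt" | grep 'Atlantic Ocean' | \
  cut -f 2,5 | sed -e 's/-999$/-9999.0/' -e 's/-99$/-9999.0/' | \
  sort | uniq | cat -n > "$tmpdir/new_basins.txt"

# Normalize the padding that cat -n adds so the result is easy to read.
numbered=$(awk '{print $1, $2, $3}' "$tmpdir/new_basins.txt")
echo "$numbered"
rm -r "$tmpdir"
```

Building the pipeline one stage at a time, and checking the output after each stage, is usually the easiest way to debug it.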