man get the manual page for a UNIX command
example: man uniq
cut extract columns of data
example: cut -f -3,5,7-9 -d ' ' infile1 > outfile1
example: ls -l | awk '{print $8}' | cut -d '.' -f -2
-f 2,4-6 field
-c 35-44 character
-d ':' delimiter (default is a tab)
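A quick way to see cut at work on a real file (a sketch, assuming the
standard colon-delimited /etc/passwd):
   example: cut -d ':' -f 1 /etc/passwd   # print one username per line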
sort sort lines of a file (Warning: the default field separator is the transition from non-blank to blank characters, not a fixed delimiter)
example: sort -nr infile1 | more
Linux example: ls -l | sort -nr +4 -5
Linux example: df -k '/b01/oradata/' | sort -nr +1 -2
-n numeric sort
-r reverse sort
-k 3,5 sort key spanning fields 3 through 5
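Note: the '+4 -5' key syntax in the Linux examples above is the old,
obsolete form; current GNU sort expects the -k form. A sketch of the
equivalent command:
   example: ls -l | sort -nr -k 5,5   # sort by field 5 (file size), largest first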
wc count lines, words, and characters in a file
example: wc -l infile1
-l count lines
-w count words
-c count characters
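wc is most often used at the end of a pipe; a sketch that counts the
entries in the current directory:
   example: ls | wc -l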
paste reattach columns of data
example: paste infile1 infile2 > outfile2
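paste can also set the output delimiter with -d; a sketch that merges two
single-column files into one comma-separated file:
   example: paste -d ',' infile1 infile2 > outfile2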
cat concatenate files together
example: cat infile1 infile2 > outfile2
-n number lines
-vet show non-printing characters (good for finding problems)
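The -vet options are handy for spotting DOS line endings before reaching
for dos2unix (below); such lines end in ^M$ in the output. A sketch:
   example: cat -vet infile1 | head -5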
uniq remove duplicate lines (normally from a sorted file)
example: sort infile1 | uniq -c > outfile2
-c show count of lines
-d only show duplicate lines
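A common idiom pairs sort and uniq -c to build a frequency table; a sketch
that lists the most repeated lines first:
   example: sort infile1 | uniq -c | sort -nr | more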
join perform a relational join on two files
example: join -1 1 -2 3 infile1 infile2 > outfile1
-1 FIELD join field of infile1
-2 FIELD join field of infile2
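join requires both inputs to be sorted on their join fields, so sort first
(a sketch; tmp1 and tmp2 are scratch file names):
   example: sort -k 1,1 infile1 > tmp1
            sort -k 3,3 infile2 > tmp2
            join -1 1 -2 3 tmp1 tmp2 > outfile1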
cmp compare two files
example: cmp infile1 infile2
diff or diff3 compare 2 or 3 files - show differences
example: diff infile1 infile2 | more
example: diff3 infile1 infile2 infile3 > outfile1
head extract lines from a file counting from the beginning
example: head -100 infile1 > outfile1
tail extract lines from a file counting from the end
example: tail +2 infile1 > outfile1
-n count from end of file (n is an integer)
+n count from beginning of file (n is an integer)
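On current GNU systems the '+n' form must be spelled 'tail -n +2'; head
and tail also combine to pull out one specific line. A sketch:
   example: tail -n +2 infile1 > outfile1   # modern spelling of 'tail +2'
   example: head -100 infile1 | tail -1     # print only line 100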
dos2unix convert DOS line endings (CR/LF) to UNIX format (LF); the file is overwritten in place
example: dos2unix infile1
tr translate characters - example shows replacement of spaces
with newline character
example: tr " " "[\012*]" < infile1 > outfile
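With GNU tr the same substitution can be written with an escape instead of
the octal set, and tr -d makes a quick stand-in for dos2unix (a sketch;
dosfile and unixfile are placeholder names):
   example: tr ' ' '\n' < infile1 > outfile1   # spaces to newlines
   example: tr -d '\r' < dosfile > unixfile    # strip carriage returns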
grep extract lines from a file based on search strings and
regular expressions
example: grep 'Basin1' infile1 > outfile2
example: grep -E '15:20|15:01' infile1 | more
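Two everyday grep options worth remembering: -v inverts the match and -c
counts matching lines (a sketch):
   example: grep -v 'Basin1' infile1 > outfile2   # lines NOT containing Basin1
   example: grep -c 'Basin1' infile1              # count of matching lines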
sed search and replace parts of a file based on regular
expressions
example: sed -e 's/450/45/g' infile1 > outfile3
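sed also reads standard input, which makes it easy to test an expression
before running it on a file (a sketch):
   example: echo '450 450' | sed -e 's/450/45/g'   # prints: 45 45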
Regular Expressions
Regular expressions can be used with many programs including ls, grep, sed,
vi, emacs, perl, etc. Be aware that each program has variations on usage
(the ls patterns below are actually shell glob patterns, expanded by the
shell before ls runs, but the bracket syntax is similar).
ls examples:
ls Data*.txt list txt files whose names begin with Data
ls Data4[5-9].ps list ps files beginning with Data numbered 45-49
sed examples: (these show only the substitution expression passed to sed)
s/450/45/g search for '450' replace with '45' everywhere
s/99/-9999\.00/g search for all '99' replace with '-9999.00'
s/Basin[0-9]//g remove the word Basin followed by a single digit
s/^12/12XX/ search for '12' at the beginning of a line,
insert XX
s/Basin$// remove the word Basin if it is at the end of
the line.
s/^Basin$// remove the word Basin if it is the only word on
the line.
s/[cC]/100/g search for 'c' or 'C' replace with 100
45,$s/\([0-9][0-9]\)\.\([0-9][0-9]\)/\2\.\1/g
on lines 45 to the end of file, search for two digits
followed by a '.' followed by two digits. replace
with the digit pairs reversed.
2,$s/,\([^,]*\),/,\"\1\",/
on all lines except the first, search for a comma,
followed by any text containing no commas, followed
by a comma. replace with the found text surrounded
by double quotes (the commas are kept).
s/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9][0-9][0-9]\)/Year = \3, Month = \2, Day = \1/
search for 2 digits, followed by a colon, followed by 2 digits,
followed by a colon, followed by 4 digits. replace with
text plus values in a different order.
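The date-reordering expression above can be tested the same way (a sketch;
the sample date is made up):
   example: echo '25:12:1999' | sed 's/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9][0-9][0-9]\)/Year = \3, Month = \2, Day = \1/'
   (prints: Year = 1999, Month = 12, Day = 25)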
Pipes, standard input, standard output:
Output redirection, ">", places the results of a command into the file named
after the ">". A new file is written (an existing file with the same name
is overwritten). In order to append to an existing file use ">>".
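A sketch of the difference between ">" and ">>" (counts.txt is a
placeholder name):
   example: wc -l infile1 > counts.txt    # creates or overwrites counts.txt
   example: wc -l infile2 >> counts.txt   # appends a second line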
Pipes allow you to connect multiple commands together to form a data stream.
For example, to count the number of times the string "Nile" occurs in the
3rd column of a file run this:
cut -f 3 infile1 | sort | uniq -c | grep 'Nile'
or do this:
cut -f 3 infile1 | grep 'Nile' | wc -l
From a global STN Attributes data set (tab delimited):
- extract all North American basins draining into the Atlantic Ocean
- select only columns 2,3,4,5,11,12,13, and 17
- replace all missing data values (either -99 or -999) with -9999.0
- remove duplicate lines
- sort by the first column
- number all lines sequentially
- save to a new file
grep 'North America' STNAttributes.txt | grep 'Atlantic Ocean' | \
cut -f 2-5,11-13,17 | sed -e 's/-999\|-99/-9999\.0/g' | \
sort | uniq | cat -n > NewSTNAttributes.txt
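A quick sanity check on the result (a sketch; filenames as in the example
above):
   example: head -3 NewSTNAttributes.txt   # inspect the first numbered lines
   example: wc -l NewSTNAttributes.txt     # how many basins survived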