The Definitive Guide to Bash Analytics

cat sales.csv | less -S
cat sales.csv |  column -t -s "," | less -S

Let the Data Science begin

The first thing you’d want to look at is aggregations. The most basic aggregation is a count. In this specific file, we can aggregate by item type, region, country and more. Let’s say we wanted to see how many sales items there are by region. First, we look at the first column just to make sure we chose the right one:

cat sales.csv |  cut -f 1 -d "," | head
cat sales.csv |  cut -f 1 -d "," | sort | uniq -c
cat sales.csv |  cut -f 1 -d "," | sort | uniq -c | sort -n
cat sales.csv |  cut -f 1 -d "," | tail -n+2 | sort | uniq -c | sort -n

Conditional Filtering

Another super useful snippet is to filter only a set of rows, given a certain condition. For example, only output columns 3 and 4 if column 5 is equal to “Asia”.

cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $9 }'
cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $0 }'
cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $0 }' | wc -l
cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $0 }' | awk '{s+=$1} END {print s}'

Joining

Surprisingly, Unix also has a simple join tool called join which can be used to join two CSVs fairly easily.

curl -O https://raw.githubusercontent.com/cristiroma/countries/master/data/csv/countries.csv
cat countries.csv | tr -d '"' > countries_clean.csv
join -t "," -1 2 -2 1 <(sort -k 2 -t "," sales.csv) <(sort -k 1 -t "," countries_clean.csv) 

Histograms

Sometimes the command line is not enough for what we want to do, so we need to add a little extra Python.

cat sales.csv | cut -f 10 -d "," | tail -n+2 | python -c "import sys, collections; values = [float(x) for x in sys.stdin]; min_v = min(values); max_v = max(values); norm = [int(10*(x-min_v)/(max_v-min_v)) for x in values]; print '\n'.join(['%12.4f - %12.4f: (%8d) %s' % (x[0]*((max_v-min_v)/10)+min_v, (x[0]+1)*((max_v-min_v)/10)+min_v, x[1], '*' * (100 * x[1] / len(values))) for x in sorted(collections.Counter(norm).items())])"

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store