The Definitive Guide to Bash Analytics

cat sales.csv | less -S
cat sales.csv |  column -t -s "," | less -S

Let the Data Science begin

cat sales.csv |  cut -f 1 -d "," | head
cat sales.csv |  cut -f 1 -d "," | sort | uniq -c
cat sales.csv |  cut -f 1 -d "," | sort | uniq -c | sort -n
cat sales.csv |  cut -f 1 -d "," | tail -n+2 | sort | uniq -c | sort -n
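The same count can also be done in one pass with awk, with no sort before the counting step. This is only a sketch: sample.csv and its three columns are made up here so the snippet runs on its own, and they are not the real sales.csv.

```shell
# Hypothetical sample data standing in for sales.csv (assumption, not the real file)
tmp=$(mktemp -d)
cat > "$tmp/sample.csv" <<'EOF'
Region,Country,Units
Asia,Japan,10
Europe,France,5
Asia,India,7
EOF

# NR > 1 skips the header row; count[] is keyed by the first column
region_counts=$(awk -F ',' 'NR > 1 { count[$1]++ } END { for (r in count) print count[r], r }' "$tmp/sample.csv" | sort -rn)
echo "$region_counts"
```

The final sort -rn puts the most frequent value first, which is usually the order you want when scanning the result.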

Conditional Filtering

cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $9 }'
cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $0 }'
cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $0 }' | wc -l
cat sales.csv | awk -F ',' '{ if ($1 == "Asia") print $9 }' | awk '{s+=$1} END {print s}'
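Filtering, counting and summing can be folded into a single awk call. A minimal sketch, assuming a made-up sample.csv whose third column is numeric (the column layout is an assumption, not the layout of sales.csv):

```shell
# Hypothetical sample data (assumption): column 1 is the region, column 3 is numeric
tmp=$(mktemp -d)
cat > "$tmp/sample.csv" <<'EOF'
Region,Country,Units
Asia,Japan,10
Europe,France,5
Asia,India,7
EOF

# The pattern before the braces selects rows; END prints the row count and the sum
asia_summary=$(awk -F ',' '$1 == "Asia" { rows++; total += $3 } END { print rows, total }' "$tmp/sample.csv")
echo "$asia_summary"
```

Writing the condition as a pattern instead of an if statement is the more common awk style, and it keeps the header out of the sum because the header never matches "Asia".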

Joining

curl -O https://raw.githubusercontent.com/cristiroma/countries/master/data/csv/countries.csv
cat countries.csv | tr -d '"' > countries_clean.csv
join -t "," -1 2 -2 1 <(sort -t "," -k 2,2 sales.csv) <(sort -t "," -k 1,1 countries_clean.csv)
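To see what join prints, it helps to try it on two tiny files first. The file names and contents below are invented for illustration, and the sorted inputs are written to temporary files rather than passed with <( ) so the snippet also runs in plain sh.

```shell
# Hypothetical sample files (assumption): no header rows, comma separated
tmp=$(mktemp -d)
printf '%s\n' 'Asia,Japan,10' 'Europe,France,5' 'Asia,India,7' > "$tmp/sales_sample.csv"
printf '%s\n' 'Japan,JP' 'France,FR' 'India,IN' > "$tmp/countries_sample.csv"

# join expects both inputs sorted on the join field only, in the same collation
LC_ALL=C sort -t ',' -k 2,2 "$tmp/sales_sample.csv" > "$tmp/sales_sorted.csv"
LC_ALL=C sort -t ',' -k 1,1 "$tmp/countries_sample.csv" > "$tmp/countries_sorted.csv"

# Field 2 of the first file is matched against field 1 of the second
joined=$(LC_ALL=C join -t ',' -1 2 -2 1 "$tmp/sales_sorted.csv" "$tmp/countries_sorted.csv")
echo "$joined"
```

The join field comes first in each output line, followed by the remaining fields of the first file and then those of the second. Setting LC_ALL=C on both sort and join avoids "not sorted" complaints caused by locale differences.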

Histograms

cat sales.csv | cut -f 10 -d "," | tail -n+2 | python3 -c "import sys, collections; values = [float(x) for x in sys.stdin]; min_v = min(values); max_v = max(values); norm = [int(10*(x-min_v)/(max_v-min_v)) for x in values]; print('\n'.join(['%12.4f - %12.4f: (%8d) %s' % (x[0]*((max_v-min_v)/10)+min_v, (x[0]+1)*((max_v-min_v)/10)+min_v, x[1], '*' * (100 * x[1] // len(values))) for x in sorted(collections.Counter(norm).items())]))"
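If Python is not available, a rough histogram can be built with awk alone. This is a sketch rather than a port of the one-liner above: the input values, the bins variable and the output format are all assumptions, and the maximum value is placed in the last bucket instead of a bucket of its own.

```shell
# Hypothetical input (assumption): one number per line
tmp=$(mktemp -d)
printf '%s\n' 1 2 2 3 9 10 > "$tmp/values.txt"

histogram=$(awk -v bins=3 '
  { v[NR] = $1                                  # keep every value for the second pass
    if (NR == 1 || $1 < min) min = $1
    if (NR == 1 || $1 > max) max = $1 }
  END {
    width = (max - min) / bins
    for (i = 1; i <= NR; i++) {
      b = (width > 0) ? int((v[i] - min) / width) : 0
      if (b >= bins) b = bins - 1               # put the maximum in the last bucket
      count[b]++
    }
    for (b = 0; b < bins; b++) {
      bar = ""
      for (j = 0; j < count[b]; j++) bar = bar "*"   # one star per value
      printf "%.2f - %.2f: %d %s\n", min + b * width, min + (b + 1) * width, count[b], bar
    }
  }' "$tmp/values.txt")
echo "$histogram"
```

To feed it a real column, replace the file argument with a pipe such as the cut and tail steps used above. With many rows, one star per value gets wide quickly, so scaling the bar length is a sensible next step.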

Ron Reiter

An entrepreneur, and a web expert.
