Bash Basics

Lesson 6 of 8

Field Surgery on Text

Concept:

sed (Stream Editor) — performs text substitution on a stream.
sed 's/old/new/' replaces the first occurrence on each line. Add the g flag to replace all occurrences: sed 's/old/new/g'.
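A quick way to see what the g flag changes, sketched with echo so there is nothing to set up:

```shell
# Without g, sed substitutes only the first match on each line
echo "red red red" | sed 's/red/blue/'
# blue red red

# With g, every match on the line is replaced
echo "red red red" | sed 's/red/blue/g'
# blue blue blue
```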

awk — processes columnar data. It splits each line into fields ($1, $2, etc.) by whitespace.
awk '{print $1}' prints the first column. Note the single quotes: inside double quotes the shell would expand $1 itself before awk ever ran.
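A minimal sketch of awk's field variables on a made-up line; $NF (the last field) and the comma separator are standard awk features:

```shell
# awk splits each line on runs of whitespace
echo "alice 42 admin" | awk '{print $2}'      # 42
echo "alice 42 admin" | awk '{print $NF}'     # admin (NF = number of fields)
echo "alice 42 admin" | awk '{print $3, $1}'  # admin alice (comma prints a space)
```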

Together with pipes, sed and awk let you transform any text data on the fly.
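That claim is easy to see end to end. A sketch chaining both tools in one pipeline; the inventory lines here are invented for illustration:

```shell
# Rewrite the currency with sed, then pull out the name
# and currency columns with awk
printf '%s\n' "apple 3 USD" "pear 5 USD" \
  | sed 's/USD/EUR/' \
  | awk '{print $1, $3}'
# apple EUR
# pear EUR
```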

Terminal: You can find data, filter it, sort it. But sometimes you need to change it — perform surgery. Meet 'sed', the Stream Editor. It operates on text as it flows through.
You: Surgery?
Terminal: 'echo "dirty water" | sed "s/dirty/clean/"' — the 's' means substitute. It finds 'dirty' and replaces it with 'clean'. The text passes through the pipe, sed operates, clean result comes out. Like purifying water.
You: What about data in columns? Like a table?
Terminal: That's 'awk' — your field specialist. It sees every line as columns separated by whitespace. $1 is the first column, $2 the second, and so on. awk '{print $1}' extracts just the first column from every line. Mind the single quotes — in double quotes the shell would swallow $1 before awk ever saw it.
You: So if I have a log file with timestamps, IPs, and messages...
Terminal: awk '{print $1}' grabs just the timestamps. Pipe that through sort and uniq -c, and you know exactly when things happened and how often. There's an access log from the old outpost. Extract the IPs and count which ones connected most.
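The counting pattern the Terminal describes can be sketched on inline sample data; the IP lines below are invented, and a real run would read from the log file instead of printf:

```shell
# Extract column 1, group identical values, count them, busiest first
printf '%s\n' "10.0.0.1 GET /a" "10.0.0.2 GET /b" "10.0.0.1 GET /c" \
  | awk '{print $1}' \
  | sort \
  | uniq -c \
  | sort -rn
# e.g.:
#   2 10.0.0.1
#   1 10.0.0.2
```

sort must come before uniq -c, because uniq only collapses adjacent duplicate lines.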
Example Code:
# sed: search and replace
echo "Hello World" | sed 's/World/Bash/'
# Hello Bash

# sed on a file (no cat needed; sed reads files itself)
sed 's/USD/EUR/g' prices.csv

# awk: extract columns
ls -l | awk '{print $5, $9}'
# 4096 documents
# 123 notes.txt

Your Assignment

Use awk to extract only the IP addresses (first column) from 'access.log', then pipe through sort and uniq -c to count requests per IP.

Bash Console
bash>