Lesson 2: The Text Alchemist (Advanced Text Processing)
You've learned to redirect output and use pipes. Now it's time to master the tools that transform text. These are the crown jewels of Linux — the tools that turn raw data into gold.
cut — The Column Extractor
Extracts specific columns from structured text:
cut -d: -f1 /etc/passwd # Get usernames (field 1, colon-delimited)
cut -d, -f2,3 data.csv # Get columns 2 & 3 from a CSV
-d sets the delimiter (the character that separates columns).
-f selects the field number(s) to extract.
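To see both options working together, here is a small sketch that pipes a made-up, passwd-style sample into cut instead of reading a real file. The variable names (sample, names, name_uid) and the sample rows are invented for illustration.

```shell
# Made-up sample in /etc/passwd style (not read from the real file)
sample='alice:x:1000:1000
bob:x:1001:1001'

# Field 1 only, splitting on ":"
names=$(printf '%s\n' "$sample" | cut -d: -f1)
echo "$names"

# Fields 1 and 3; cut keeps the delimiter between the selected fields
name_uid=$(printf '%s\n' "$sample" | cut -d: -f1,3)
echo "$name_uid"
```

Note that cut splits on every occurrence of the delimiter, so a CSV with quoted commas inside a field will not be split the way a real CSV parser would split it.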
sed — The Stream Editor
sed performs find-and-replace (and other line edits) on text streams, without opening the file in an editor:
sed 's/old/new/g' file.txt # Replace all "old" with "new"
sed -i 's/DEBUG/INFO/g' app.log # Edit the file in-place (-i)
sed '5d' file.txt # Delete line 5
s/ is the substitute command.
/g is the global flag (replace ALL occurrences on each line, not just the first).
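The difference the g flag makes is easiest to see side by side. The sketch below feeds a made-up line to sed through a pipe, so no file is touched; the variable names are illustrative only.

```shell
# Made-up input line with three occurrences of "old"
line='old old old'

# Without g: only the first match on each line is replaced
first_only=$(printf '%s\n' "$line" | sed 's/old/new/')
echo "$first_only"   # new old old

# With g: every match on each line is replaced
all=$(printf '%s\n' "$line" | sed 's/old/new/g')
echo "$all"          # new new new

# Deleting by line number: drop line 2 of a three-line stream
without_second=$(printf 'a\nb\nc\n' | sed '2d')
echo "$without_second"
```

Because these commands write to standard output, the input is left unchanged; only -i rewrites a file on disk.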
awk — The Pattern Processor
awk is a mini programming language for text processing. It reads input line by line and splits each line into fields ($1, $2, ...), using whitespace as the separator unless -F sets a different one.
awk '{print $1}' file.txt # Print first column
awk -F: '$3 >= 1000 {print $1}' /etc/passwd # Users with UID >= 1000
awk '{sum += $1} END {print sum}' numbers.txt # Sum a column
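As a rough sketch of how the filter and the END block behave, the example below runs both patterns over a made-up, colon-delimited sample rather than the real /etc/passwd. The sample rows and variable names are assumptions for illustration.

```shell
# Made-up rows: name:password:UID
data='alice:x:1000
daemon:x:1
bob:x:1001'

# The condition before the braces filters lines; the action runs only on matches
regular=$(printf '%s\n' "$data" | awk -F: '$3 >= 1000 {print $1}')
echo "$regular"

# The main block runs on every line; END runs once after the last line
total=$(printf '%s\n' "$data" | awk -F: '{sum += $3} END {print sum}')
echo "$total"   # 2002
```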
sort & uniq — Organize and Deduplicate
sort access.log # Sort lines alphabetically
sort -n numbers.txt # Sort numerically
sort access.log | uniq -c | sort -rn # Count unique lines, most frequent first
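The pipeline sorts before calling uniq because uniq only collapses duplicate lines that are adjacent. A minimal sketch with made-up input shows the difference; the trailing awk step is an addition that only normalizes the padding uniq -c puts before each count, which varies between implementations.

```shell
# Made-up input with no two identical lines next to each other
words='apple
banana
apple
cherry
apple
banana'

# uniq alone removes nothing here: all 6 lines survive
unsorted_count=$(printf '%s\n' "$words" | uniq | awk 'END {print NR}')
echo "$unsorted_count"

# Sort first, count each distinct line, then order by count (highest first)
ranked=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | awk '{print $1, $2}')
echo "$ranked"

# Plain sort compares text, so "10" lands before "2"; -n compares numbers
as_text=$(printf '10\n9\n2\n' | sort)
as_numbers=$(printf '10\n9\n2\n' | sort -n)
echo "$as_text"
echo "$as_numbers"
```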
Mission Objective
Process data like a pro:
- Extract: Run cut -d: -f1 /etc/passwd to pull out just the usernames.
- Transform: Use sed 's/ERROR/FIXED/g' app.log to replace error markers.
- Analyze: Use awk -F: '$3 >= 1000 {print $1}' /etc/passwd to find regular (non-system) users.