Linux 'grep'

Preview

grep filters lines that match a pattern across files, directories, or streams, and it is usually the quickest tool to reach for. This guide focuses on GNU grep, with examples meant for day-to-day production use.


TL;DR (Cheatsheet)


Core Patterns You’ll Actually Use

# Recursive search with file filters
grep -Rni --include='*.log' --exclude-dir='.git' 'ERROR' .

# Count matching lines per file (quick signal for CI); -c counts lines, not individual matches
grep -Rc --include='*.py' 'TODO' src/

# Show only the matched portion (e.g., HTTP status in journal)
journalctl -u nginx | grep -oE 'status=[0-9]{3}'

# Whole-word search (avoid over-matching); -w already implies the \< \> boundaries
grep -rw --include='*.c' 'init' src/

# Literal text (fast, ignores regex metacharacters)
grep -F '[INFO]' app.log

# Safe with odd filenames (spaces/newlines); -H keeps the filename even if xargs passes a single file
find . -name '*.md' -print0 | xargs -0 grep -Hn 'architecture'
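
To make the difference between literal, regex, and whole-word matching concrete, here is a minimal sketch on made-up data. The temp directory, file name, and log lines are assumptions for illustration only.

```shell
# Hypothetical sample data in a temp directory; names are illustrative only.
demo_dir=$(mktemp -d)
printf '%s\n' '[INFO] init done' '[WARN] reinit requested' 'INFO without brackets' \
  > "$demo_dir/app.log"

# -F treats [INFO] as literal text, so only the bracketed line matches.
literal_hits=$(grep -cF '[INFO]' "$demo_dir/app.log")

# Without -F, [INFO] is a bracket expression: any line containing I, N, F, or O.
regex_hits=$(grep -c '[INFO]' "$demo_dir/app.log")

# -w matches "init" only as a whole word, so "reinit" is skipped.
word_hits=$(grep -cw 'init' "$demo_dir/app.log")

echo "literal=$literal_hits regex=$regex_hits word=$word_hits"
rm -rf "$demo_dir"
```

On this sample the unescaped bracket expression matches every line, which is the kind of over-matching that -F avoids.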

Performance That Matters
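
As a rough sketch of options that commonly affect grep speed, the snippet below compares a regex search with a fixed-string search in the C locale and shows an early exit with -m. The generated file and its contents are assumptions, and any speedup depends on the input and the grep build, so measure on your own data (for example with `time`).

```shell
demo_file=$(mktemp)
# Hypothetical input: 20000 lines, every 100th carries the literal "[ERROR]".
awk 'BEGIN {
  for (i = 1; i <= 20000; i++) {
    tag = (i % 100 == 0) ? "[ERROR] " : "[INFO] "
    print tag "request " i
  }
}' > "$demo_file"

# Baseline: regex matching in the current locale; brackets must be escaped.
regex_count=$(grep -c '\[ERROR\]' "$demo_file")

# Candidate: fixed-string matching in the C locale (byte-wise comparison).
fixed_count=$(LC_ALL=C grep -cF '[ERROR]' "$demo_file")

# -m 1 stops reading after the first matching line, handy for existence checks.
first_hit=$(grep -m 1 -nF '[ERROR]' "$demo_file")

echo "regex=$regex_count fixed=$fixed_count first=$first_hit"
rm -f "$demo_file"
```

Both searches must report the same count; if they differ, the pattern was not a true literal and -F is not a safe substitution.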


Regex Primer (Just Enough)
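
As a small illustration of the two regex flavors used in this guide, the sketch below contrasts basic (default) and extended (-E) syntax. The sample lines are made up.

```shell
# Hypothetical sample input, one record per line.
sample=$(printf '%s\n' 'status=200' 'status=404' 'status=20' 'code=500')

# ERE (-E): {3} is an interval and | is alternation, no backslashes needed.
ere_hits=$(printf '%s\n' "$sample" | grep -cE '^(status|code)=[0-9]{3}$')

# BRE (default): the same interval needs backslashes: \{3\}.
bre_hits=$(printf '%s\n' "$sample" | grep -c '^status=[0-9]\{3\}$')

# In BRE an unescaped { is a literal character, so nothing matches here.
# grep exits 1 on no match; "|| true" keeps the assignment from failing.
literal_brace_hits=$(printf '%s\n' "$sample" | grep -c 'status=[0-9]{3}' || true)

echo "ere=$ere_hits bre=$bre_hits literal_brace=$literal_brace_hits"
```

The last case is why the examples in this guide pass -E whenever they use `{n}` intervals.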


Context & Reporting

# Show surrounding lines (2 before/after)
grep -nC2 'panic' kernel.log

# Only filenames that contain matches
grep -Rl 'TODO' src/

# Suppress filenames (multi-file, lines only)
grep -h 'pattern' file1 file2
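
The reporting flags are easiest to compare side by side. This is a minimal sketch on two made-up files (`a.txt` and `b.txt` are hypothetical names) showing what -l, -c, and -h each print.

```shell
# Hypothetical files in a temp directory.
demo_dir=$(mktemp -d)
printf '%s\n' 'TODO: refactor' 'done' 'TODO: test' > "$demo_dir/a.txt"
printf '%s\n' 'nothing here' > "$demo_dir/b.txt"

# -l: only the names of files that contain at least one match.
files_with_match=$(cd "$demo_dir" && grep -l 'TODO' a.txt b.txt)

# -c: matching-line count per file, including files with zero matches.
counts=$(cd "$demo_dir" && grep -c 'TODO' a.txt b.txt)

# -h: matching lines only, without the filename prefix.
lines_only=$(cd "$demo_dir" && grep -h 'TODO' a.txt b.txt)

printf '%s\n' "-l:" "$files_with_match" "-c:" "$counts" "-h:" "$lines_only"
rm -rf "$demo_dir"
```

Note that -c lists `b.txt:0` as well, so filter out zero counts if only files with hits are of interest.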

Real-World Mini Cookbook

# 1) Show every CRITICAL hit with 2 lines of context, grouped by file
#    (-l lists matching files, -Z NUL-terminates names so odd filenames are safe)
grep -RliZ --include='*.log' 'CRITICAL' logs/ \
 | xargs -0 -I{} sh -c 'printf "%s\n" "--- $1 ---"; grep -inC2 "CRITICAL" "$1"' sh {}

# 2) Extract JSON values (simple cases)
grep -Po '"user_id"\s*:\s*"\K[^"]+' events.jsonl | sort | uniq -c | sort -nr | head

# 3) Find lines NOT matching (e.g., exclude health checks); -v inverts the match
grep 'GET /' access.log | grep -v '/healthz'
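
To see what the third recipe keeps and drops, here is a small sketch on made-up access-log lines. The log format and paths are assumptions, not taken from a real server.

```shell
# Hypothetical access log with one health check and one non-GET request.
demo_log=$(mktemp)
printf '%s\n' \
  'GET /healthz 200' \
  'GET /api/users 200' \
  'POST /api/users 201' \
  'GET /api/orders 500' > "$demo_log"

# Keep GET requests, then drop the health checks with -v.
real_gets=$(grep 'GET /' "$demo_log" | grep -v '/healthz')

# Same filter, but -c on the last stage reports how many lines survive.
real_get_count=$(grep 'GET /' "$demo_log" | grep -cv '/healthz')

printf '%s\n' "$real_gets"
echo "count=$real_get_count"
rm -f "$demo_log"
```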

Cross-Platform Notes
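
One portability concern with the cookbook above is -P: it is a GNU extension, and the BSD grep shipped with macOS does not provide it. A possible fallback for the `user_id` extraction, assuming the same simple one-object-per-line JSON, uses -oE plus sed. The sample records are made up.

```shell
# Hypothetical JSON lines, with and without spaces around the colon.
demo_file=$(mktemp)
printf '%s\n' \
  '{"user_id": "u1", "event": "login"}' \
  '{"user_id":"u2","event":"login"}' \
  '{"user_id": "u1", "event": "logout"}' > "$demo_file"

# Step 1: -oE prints only the "user_id": "..." fragment (POSIX character classes, no \s or \K).
# Step 2: sed keeps just the text inside the final pair of quotes.
user_ids=$(grep -oE '"user_id"[[:space:]]*:[[:space:]]*"[^"]+"' "$demo_file" \
  | sed -E 's/.*"([^"]+)"$/\1/')

# Most frequent id, mirroring the sort | uniq -c | sort -nr | head pipeline.
top_user=$(printf '%s\n' "$user_ids" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}')

printf '%s\n' "$user_ids"
echo "top=$top_user"
rm -f "$demo_file"
```

Like the -P version, this only handles simple cases; values containing escaped quotes need a real JSON parser.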


Common Pitfalls (and Fixes)
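
Two pitfalls that come up often in scripts are patterns that begin with a dash and grep's non-zero exit status when nothing matches. The sketch below shows one way to handle each; the file contents are made up.

```shell
# Hypothetical input file.
demo_file=$(mktemp)
printf '%s\n' 'value --verbose' 'plain line' > "$demo_file"

# Pitfall: a pattern starting with "-" is parsed as an option.
# Fix: mark it explicitly as the pattern with -e (or end options with --).
dash_hits=$(grep -c -e '--verbose' "$demo_file")

# Pitfall: grep exits 1 when nothing matches, which aborts scripts under `set -e`.
# Fix: test the status explicitly so "no match" is handled as a normal case.
if grep -q 'no-such-text' "$demo_file"; then
  found=yes
else
  found=no
fi

echo "dash_hits=$dash_hits found=$found"
rm -f "$demo_file"
```

Exit status 2 still signals a real error (for example an unreadable file), so avoid blanket `|| true` where that distinction matters.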