Search notes:

Shell command: grep

In its basic use, grep prints lines of a text file that match a basic regular expression.
grep understands three regular expression dialects: BRE (basic regex), ERE (extended regex) and PCRE (Perl compatible regex), but GNU grep does not differentiate between BRE and ERE.
The dialect can be chosen with the command line options -G, -E or -P
The name grep originates in g/re/p which was a command in qed and ed where a regular expression (re) is globally (g) searched for and matched text is printed (p).
grep [option…] patterns [file…]
grep [option…] -e patterns … [file…]
grep [option…] -f patern-file … [file…]

Command line options

Options related to pattern

--extended-regexp -E
--fixed-strings -F pattern is not interpreted as a regular expression, but as a fixed text instead.
--basic-regexp -G
--perl-regexp -P pattern is interpreted as PCRE (Perl compatible regex)

Matching control

--regexp=… -e … Specifies the regular expression for which grep searches. Multiple regular expressions can be stated (of which at least one must match in order for a line to be printed)
--file FILE -f FILE Read patterns, being separated by end of line, from FILE.
--ignore-case -i -y is an obsolete synonym for -i.
--no-ingore-case This is the default.
--invert-match -v Print lines where the regular expression does not match; don't print lines where it matches.
--word-regexp -w
--line-regexp -x The regular expression must match the entire line (i. e. as though the regexp were enclosed between ^…$)

General output control

--count -c Count the number of lines that matched (or didn't match if combined with -v)
--color=$W $W = Use escape sequences to color interesting parts of the output. $W is never, always or auto
--files-without-match -L Instead of printing matched lines, print the names of the files where no match was found.
--files-with-matches -l Similar to -L but print name of files where at least one match was found.
--max-count=N -m N Stop processing a file after encountering N matches.
--only-matching -o Print only the portion that was matched by the regular expression.
--silent -q Print nothing to stdout (for example used to query process's exit status: if grep -q xyz /path/to/file; then … ; fi)
--no-messages -s Don't print to stderr(?) if indicated file is inexistent or not readable.

Output line prefix control

--byte-offset -b print 0-based byte offset. Compare with -n
--with-filename -H Also print filename where regular expression matched / default when grepping multiple files
--no-filename -h Opposite of -H
--label=L
--line-number -n Also print matched line's 1-based line number. Compare with -b
--initial-tab -T Preceed matched line with a tabulator, typically used in combination with -H, -n or -b
--unix-byte-offsets -u DOS/Windows end of lines (which consists of two bytes) are counted as one. No effect when used with -b
--null -Z add a NUL byte after file names to make reported file names unambiguous (for example if they contain new lines). Used with commands like find -print0, perl -0, sort -z and xargs -0 to process arbitrary file names.

Context line control

--after-context N -A N Print N lines following matched line
--before-context N -B N Print N lines preceding matched line
--context N -C N

File and directory selection

--text -a Process binary file as if it were a text file. Same as --binary-files=text
--binary-files
--devices=ACTION -D ACTION
--directories=ACTION -d ACTION ACTION = read, skip or recurse. -d recurse is equivalent to -r
--exclude=GLOB
--exclude-from
--exclude-dir
-I
--include=GLOB specify a glob (see man 7 glob) to limit the searched files. For example grep -rl --include '*.c' 'int main' should (hopefully) find a C program's main function by searching all C sources below the current directory.
--recursive -r Process all files under a given directory(-tree). Equivalent to -d recurse. See also combining find with grep to prevent Permission denied error messages. By default, using -r prepends the matched lines with the filename where it was found - Use -h to suppress this.
--dereference-recursive -R

Other options

--line-buffered
--binary -U
--null-data -z
--group-separator
--no-group-separator
--version -V Print version and exit
--help

Recursive grep

Only search *.txt files:
grep -R --include=*.txt search-pattern
In PowerShell, a recursive grep might be achieved with a combination of select-string and get-childItem.

-P

With the -P flag, the pattern is interpreted as a Perl-compatible regular expression (PCRE).
The default is to interpret is as basic regular expression (BRE).
For example, to match a digit with BRE, you'd use [0-9], with PCRE, [0-9] as well as \d is possible.
To match three consecutive numbers, the curly braces must be escaped with backslashes: [0-9]\{3\}. With PCRE, they don't: \d{3}.

Print matched part, not entire line

With -o (or --only-matching), only the matched part of a line is printed (as opposed to the entire line that contains the matched text).
If the mattern matches multiple times per line, each match is printed
So, using the regex \w+ (together with -P, Perl commands regular expressions) prints the words in a (text-)file:
$ cat <<EOT | grep -P -o '\w+'
> foo bar baz
> one two three
> EOT
foo
bar
baz
one
two
three
When using -P, the pattern can contain a (positive) lookbehind assertions which prevents portions of the matched text not to be printed:
$ cat <<EOT | grep -P -o '(?<=value=)\d+'
This number 99 does not match
because is is not prepended by "value=", but
this one is: value=42. The -o option
combined with (?<=…) prevents the value=
to be returned by grep.
EOT
42
A combination of grep -o and the shell commands sort and uniq allows to find all KCONFIG_* symbols in the Linux kernel sources. (The -h option turns off printing file names which are turned on by -r):
$ grep -rhoP '\bKCONFIG_\w+' | sort | uniq

Print lines around matched line

-An prints the matched line and the n lines following it.
-Bn prints the matched line and the n lines preceding it.
-Cn prints the matched line and the n lines preceding and following it.

Searching multiple patterns at once

$ grep -e PatternOne -e PatternTwo -e PatternThree file
Find lines that match any of PatternOne, PatternTwo or PatternThree.

\b vs \< and \>

\b matches at a word boundary.
\< only matches at the beginning of a word, \> only at the end of a word.
Thus, \b matches whenever either \< or \> matches.
The following regexp matches the input …
echo 'abcd  efgh' | grep '\b..\b'
… but this one does not:
echo 'abcd  efgh' | grep  '\<..\>'
Note: with -P, \< and \> don't have any meaning.

Searching for hexadecimal values

With ansi c quoting ($'...'), it is possible to search for hexadecimal values. This is not a feature of grep, but of the shell from which it is invoked.
The following example searches the utf 8 representation of é:
$ grep $'\xc3\xa9' some-file

Binary file ... matches

Apparently, grep considers a file with a NUL character a binary file and won't by default print matched text for such files.
In order to still print matches, one can use -a or the equivalent --text or still --binary=text.

See also

ack, a grep like tool, optimized for programmers.
Shell commands
Perl function: grep
git grep
Bash: regular expressions
The R function grep.
find.exe, findstr.exe
The PowerShell equivalent for grep seems to be the PowerShell cmdLet select-string.
grep might be considered a filter operation as known in functional programming.
pdfgrep is a commandline utility to search text in PDF documents. This utility tries to be compatible with grep.

Index