[sf-lug] regex: how to match any one to four character word in a file
Charles-Henri Gros
chgros at coverity.com
Wed Dec 10 10:07:48 PST 2008
jim wrote:
> i've given up on the online tutorials.
>
> i have a text file with over 100000 words (and lines,
> one word per line). i wanna grep out all words that
> are from one to four characters, e.g. 'a' or 'and'
> or "fact" but not "apple" or "zounds".
>
> $ grep '[.]{4}' words.txt
> got me a newline.
>
> i've got lots of other variations in my .bash_history
> if anyone wants a good laugh.
>
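So, here goes:
- you need to anchor your regex (with ^ and $), or it will do partial
matches and so also match longer words
- in grep's default (basic) syntax, '{' and '}' need to be escaped with \
(with grep -E they don't)
- "{4}" will match exactly 4 characters, so use {1,4}
- [.] will match a literal '.', not any character; use a bare . for that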
Result:
grep '^.\{1,4\}$' words.txt
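To check this without the full word list, here is a minimal sketch. The file name sample_words.txt and its five words are made up for illustration (taken from the examples in the question); the -E and -x variants are alternatives I'd expect to behave the same way, not something the original command requires.

```shell
# Hypothetical sample data standing in for words.txt, one word per line.
printf '%s\n' a and fact apple zounds > sample_words.txt

# Basic regex (grep's default): escaped braces, anchored to the whole line.
short_bre=$(grep '^.\{1,4\}$' sample_words.txt)

# Extended regex (-E): same pattern, but braces need no backslash.
short_ere=$(grep -E '^.{1,4}$' sample_words.txt)

# -x only accepts matches that cover the whole line, so ^ and $ can go.
short_x=$(grep -x '.\{1,4\}' sample_words.txt)

# Without anchors the pattern matches part of every word,
# so all five lines are counted.
unanchored_count=$(grep -c '.\{1,4\}' sample_words.txt)

printf '%s\n' "$short_bre"
```

The three anchored forms should print only a, and, fact, while the unanchored one also lets apple and zounds through.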
--
Charles-Henri