[sf-lug] regex: how to match any one to four character word in a file

Charles-Henri Gros chgros at coverity.com
Wed Dec 10 10:07:48 PST 2008


jim wrote:
> i've given up on the online tutorials. 
>
> i have a text file with over 100000 words (and lines, 
> one word per line). i wanna grep out all words that 
> are from one to four characters, e.g. 'a' or 'and' 
> or "fact" but not "apple" or "zounds". 
>
> $  grep '[.]{4}' words.txt
> got me a newline. 
>
> i've got lots of other variations in my .bash_history 
> if anyone wants a good laugh. 
>   

So, here goes:
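- you need to anchor your regex (with ^ and $), or it will match any 
line that merely contains 1 to 4 characters, which includes longer words
- without -E, grep uses basic regular expressions, where '{' and '}' 
need to be escaped with \ to act as a repetition count
- "{4}" will match exactly 4 repetitions, so use {1,4} for one to four
- [.] will match a literal '.'; use a bare . to match any character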

Result:
grep '^.\{1,4\}$' words.txt
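
To check the points above without the full word list, here is a minimal 
sketch. It assumes a POSIX shell with grep and mktemp available; the 
five sample words and the variable names are made up for illustration 
and stand in for words.txt. It runs the command from the result line, 
the same pattern written for grep -E (where the braces are not escaped), 
and an unanchored variant to show why ^ and $ matter.

```shell
# Hypothetical sample standing in for words.txt: one word per line.
words=$(mktemp)
printf '%s\n' a and fact apple zounds > "$words"

# Basic regex: braces are escaped, pattern is anchored at both ends.
bre_out=$(grep '^.\{1,4\}$' "$words")

# Extended regex (-E): same pattern, braces written without backslashes.
ere_out=$(grep -E '^.{1,4}$' "$words")

# Unanchored: every non-empty line contains 1 to 4 characters somewhere,
# so all five lines match.
unanchored_count=$(grep -c '.\{1,4\}' "$words")

printf '%s\n' "$bre_out"                    # prints a, and, fact (one per line)
printf 'unanchored matches: %s\n' "$unanchored_count"   # prints 5

rm -f "$words"
```

Both anchored forms should give the same lines; which one to use is 
mostly a matter of taste.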


-- 
Charles-Henri




