[sf-lug] regex: how to match any one to four character word in a file
    Charles-Henri Gros 
    chgros at coverity.com
       
    Wed Dec 10 10:07:48 PST 2008
    
    
  
jim wrote:
> i've given up on the online tutorials. 
>
> i have a text file with over 100000 words (and lines, 
> one word per line). i wanna grep out all words that 
> are from one to four characters, e.g. 'a' or 'and' 
> or "fact" but not "apple" or "zounds". 
>
> $  grep '[.]{4}' words.txt
> got me a newline. 
>
> i've got lots of other variations in my .bash_history 
> if anyone wants a good laugh. 
>   
So, here goes:
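- you need to anchor your regex (with ^ and $), or it will do partial 
matches and match longer words too
- in grep's default (basic) regex syntax, '{' and '}' need to be escaped 
with \ to act as a repetition count; unescaped, they match literal braces 
(with grep -E they are written without the backslashes)
- "{4}" will match exactly 4 repetitions, so use {1,4} for one to four
- [.] will match a literal '.'; to match any character, use a bare . 
outside the brackets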
Result:
grep '^.\{1,4\}$' words.txt
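
To check this without the real words.txt, here is a small sketch. The 
sample word list is made up from the words in the question, and the 
variable names are only for illustration. It runs the basic-regex form 
above, the equivalent extended-regex form with grep -E, and the original 
attempt for comparison.

```shell
# Hypothetical sample list, one word per line, taken from the question.
words=$(mktemp)
printf '%s\n' a and fact apple zounds > "$words"

# Basic regex (grep default): the interval braces are backslash-escaped.
short_bre=$(grep '^.\{1,4\}$' "$words")

# Extended regex (grep -E): same interval, no backslashes needed.
short_ere=$(grep -E '^.{1,4}$' "$words")

# The original attempt: [.] is a literal dot and, in a basic regex,
# the unescaped braces are literal too, so nothing in the list matches.
literal=$(grep '[.]{4}' "$words" || true)

# Prints a, and, fact (one per line).
printf '%s\n' "$short_bre"

rm -f "$words"
```

Note that . matches any character, so entries like "it's" or "42" would 
also be kept. If only letters should count, [[:alpha:]] could replace the 
dot, but that is an assumption about what counts as a word here.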
-- 
Charles-Henri
    
    