[sf-lug] [on-list] site up, http[s] down: Re: Wierd problems trying to access linuxmafia.com

Rick Moen rick at linuxmafia.com
Tue Dec 11 04:40:31 PST 2018


I wrote:

> Seems like maybe my lowest-hanging fruit is to spank 46.229.168.0 and
> 46.229.161.0 (the semrush.com IPs), 37.9.87.228 (the Yandex IP), and
> -- especially -- the worst-by-far offenders, who don't have reverse DNS
> at all (64.62.252.163, 64.62.252.174, 66.160.140.183, 64.62.252.176,
> 66.160.140.188) via iptables banishment.  

_Much_ improvement now achieved via this stuff in /var/www/.htaccess :

SetEnvIfNoCase User-Agent .*AdIdxBot.* bad_bot
SetEnvIfNoCase User-Agent .*AhrefsBot.* bad_bot
SetEnvIfNoCase User-Agent .*ajSitemap.* bad_bot
SetEnvIfNoCase User-Agent .*aolbuild.* bad_bot
SetEnvIfNoCase User-Agent .*Applebot.* bad_bot
SetEnvIfNoCase User-Agent ".*Ask Jeeves.*" bad_bot
SetEnvIfNoCase User-Agent .*baidu.* bad_bot
SetEnvIfNoCase User-Agent .*BLEXBot.* bad_bot
SetEnvIfNoCase User-Agent .*boitho.com-dc.* bad_bot
SetEnvIfNoCase User-Agent .*bot/1.0.* bad_bot
SetEnvIfNoCase User-Agent .*CCBot.* bad_bot
SetEnvIfNoCase User-Agent .*Charlotte/1.0b.* bad_bot
SetEnvIfNoCase User-Agent .*Cliqzbot.* bad_bot
SetEnvIfNoCase User-Agent .*coccocbot.* bad_bot
SetEnvIfNoCase User-Agent .*crawler4j.* bad_bot
SetEnvIfNoCase User-Agent .*Daum.* bad_bot
SetEnvIfNoCase User-Agent .*Exabot.* bad_bot
SetEnvIfNoCase User-Agent .*ExtLinksBot.* bad_bot
SetEnvIfNoCase User-Agent .*facebookexternalhit.* bad_bot
SetEnvIfNoCase User-Agent .*Freshbot.* bad_bot
SetEnvIfNoCase User-Agent .*GarlikCrawler.* bad_bot
SetEnvIfNoCase User-Agent .*Gigabot.* bad_bot
SetEnvIfNoCase User-Agent .*GiHoBBy.* bad_bot
SetEnvIfNoCase User-Agent .*Gogolbot.* bad_bot
SetEnvIfNoCase User-Agent .*Go-http-client.* bad_bot
SetEnvIfNoCase User-Agent ".*HubSpot Links Crawler.*" bad_bot
SetEnvIfNoCase User-Agent .*ia_archiver.* bad_bot
SetEnvIfNoCase User-Agent ".*IABTechLab Ads.txt Crawler.*" bad_bot
SetEnvIfNoCase User-Agent .*istellabot.* bad_bot
SetEnvIfNoCase User-Agent .*Java/1.4.1_04.* bad_bot
SetEnvIfNoCase User-Agent .*Java/1.5.0_11.* bad_bot
SetEnvIfNoCase User-Agent .*Jeeves.* bad_bot
SetEnvIfNoCase User-Agent .*linkdexbot.* bad_bot
SetEnvIfNoCase User-Agent .*ltx71.* bad_bot
SetEnvIfNoCase User-Agent .*Mail.RU_Bot.* bad_bot
SetEnvIfNoCase User-Agent .*MauiBot.* bad_bot
SetEnvIfNoCase User-Agent .*MBCrawler.* bad_bot
SetEnvIfNoCase User-Agent .*MJ12bot.* bad_bot
SetEnvIfNoCase User-Agent .*MojeekBot.* bad_bot
SetEnvIfNoCase User-Agent .*netseer.* bad_bot
SetEnvIfNoCase User-Agent .*panscient.com.* bad_bot
SetEnvIfNoCase User-Agent .*psbot.* bad_bot
SetEnvIfNoCase User-Agent .*Qwantify.* bad_bot
SetEnvIfNoCase User-Agent .*Scrapy.* bad_bot
SetEnvIfNoCase User-Agent ".*Screaming Frog SEO Spider.*" bad_bot
SetEnvIfNoCase User-Agent .*SemrushBot.* bad_bot
SetEnvIfNoCase User-Agent .*SEOkicks.* bad_bot
SetEnvIfNoCase User-Agent .*SeznamBot.* bad_bot
SetEnvIfNoCase User-Agent ".*Sogou web spider.*" bad_bot
SetEnvIfNoCase User-Agent .*Steeler.* bad_bot
SetEnvIfNoCase User-Agent .*Teoma.* bad_bot
SetEnvIfNoCase User-Agent ".*The Knowledge AI.*" bad_bot
SetEnvIfNoCase User-Agent .*tracemyfil.* bad_bot
SetEnvIfNoCase User-Agent .*Twiceler.* bad_bot
SetEnvIfNoCase User-Agent .*Twitterbot.* bad_bot
SetEnvIfNoCase User-Agent .*Uptimebot.* bad_bot
SetEnvIfNoCase User-Agent .*Vagabondo.* bad_bot
SetEnvIfNoCase User-Agent ".*VoilaBot BETA 1.2.*" bad_bot
SetEnvIfNoCase User-Agent .*WebDataCentreBot/1.0.* bad_bot
SetEnvIfNoCase User-Agent .*WikiDo.* bad_bot
SetEnvIfNoCase User-Agent .*yandex.* bad_bot
SetEnvIfNoCase User-Agent .*YisouSpider.* bad_bot
SetEnvIfNoCase User-Agent .*zooms.* bad_bot


order allow,deny
deny from env=bad_bot
allow from all
deny from 64.62.252.163
deny from 64.62.252.174
deny from 66.160.140.183
deny from 64.62.252.176
deny from 66.160.140.188
deny from 37.9.87.228
deny from 46.229.168.135
deny from 46.229.168.154
deny from 46.229.168.147
deny from 46.229.168.131
deny from 46.229.168.130
deny from 46.229.168.136
deny from 46.229.168.134
deny from 46.229.168.132
deny from 46.229.168.137
deny from 46.229.168.133
deny from 46.229.168.138
deny from 46.229.168.152
deny from 46.229.168.146
deny from 46.229.168.129
deny from 46.229.168.153
deny from 46.229.168.150
deny from 46.229.168.145
deny from 46.229.168.149
deny from 46.229.168.151
deny from 46.229.168.141
deny from 46.229.168.148
deny from 46.229.168.140
deny from 46.229.168.143
deny from 46.229.168.142
deny from 46.229.168.144
deny from 46.229.168.139
deny from 46.229.161.131
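
A possible tidy-up, offered as a sketch rather than anything tested on
this server: the 46.229.168.x entries above form one contiguous run
(.129 through .154), so they can be expressed as a handful of CIDR
blocks, which "deny from" also accepts.  The Python below uses only the
standard-library ipaddress module; the names SEMRUSH_HOSTS and collapse
are made up for this illustration.

```python
import ipaddress

# Last octets of the 46.229.168.x addresses in the deny list above,
# in the order they appear there.
SEMRUSH_HOSTS = [
    135, 154, 147, 131, 130, 136, 134, 132, 137, 133, 138, 152, 146,
    129, 153, 150, 145, 149, 151, 141, 148, 140, 143, 142, 144, 139,
]


def collapse(addresses):
    """Return the fewest CIDR networks that cover exactly these addresses."""
    networks = [ipaddress.ip_network(a) for a in addresses]  # each becomes a /32
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


addrs = [f"46.229.168.{h}" for h in SEMRUSH_HOSTS]

# Print replacement lines in the same form as the .htaccess entries.
for cidr in collapse(addrs):
    print(f"deny from {cidr}")
```

This covers exactly the 26 listed addresses and nothing else.  A coarser
alternative would be the single block 46.229.168.128/27, which also
takes in six addresses not listed above (.128 and .155 through .159);
whether that is acceptable is a judgment call.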



I expect I'll revisit that after a while -- and I also need to find a
way to do some more-global throttling without having to name and spank
User-Agent strings and IP addresses.
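
To make "more-global throttling" concrete, here is a minimal sketch of
the general idea: a per-client sliding-window limit that needs no list
of names or addresses maintained by hand.  It is an illustration only,
not something Apache does out of the box; where such logic would
actually run (a log-watching script, a proxy, a firewall helper) is left
open, and the class name, method name, and limits are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock               # injectable so the logic can be tested
        self.hits = defaultdict(deque)   # client IP -> timestamps of recent hits

    def allow(self, client_ip):
        """Record a request; return False if the client is over its limit."""
        now = self.clock()
        recent = self.hits[client_ip]
        # Forget hits that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    # Hypothetical limit: 30 requests per client per 60 seconds.
    limiter = SlidingWindowLimiter(max_requests=30, window_seconds=60)
    verdicts = [limiter.allow("64.62.252.163") for _ in range(40)]
    print(verdicts.count(True), "allowed,", verdicts.count(False), "refused")
```

Each client is judged only on its own request rate, so a well-behaved
crawler is untouched while a runaway one is refused regardless of what
it calls itself.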



