Blog Crawler / Spammer?

Random stuff about serendipity. Discussion, Questions, Paraphernalia.
nick
Regular
Posts: 21
Joined: Wed Oct 26, 2005 10:34 pm

Blog Crawler / Spammer?

Post by nick »

After implementing a bad-bot trap in my s9y installation,
I've caught quite a lot of bots trespassing into it.

It seems the majority send a Java User-Agent string.
Does anyone know what kind of browser that is?
Are they really spambots, or just more misbehaving crawlers?

Here are some:

216.255.189.226 - - [2006-02-17 (Fri) 23:56:04] "GET /trap/index.html HTTP/1.1" Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
65.110.3.251 - - [2006-02-28 (Tue) 01:29:37] "GET /trap/index.html HTTP/1.1" Java/1.4.2_04
68.97.56.15 - - [2006-03-03 (Fri) 22:16:48] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
66.90.118.184 - - [2006-03-07 (Tue) 11:22:36] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
70.32.236.42 - - [2006-03-07 (Tue) 18:45:49] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
64.255.161.43 - - [2006-03-07 (Tue) 18:45:49] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
69.42.82.164 - - [2006-03-07 (Tue) 18:45:49] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
67.52.52.202 - - [2006-03-12 (Sun) 16:45:12] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
70.21.250.131 - - [2006-03-14 (Tue) 20:48:55] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
62.68.160.38 - - [2006-03-14 (Tue) 20:56:48] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
70.27.135.194 - - [2006-03-15 (Wed) 00:34:30] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
82.77.91.88 - - [2006-03-15 (Wed) 18:34:27] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
70.21.251.103 - - [2006-03-15 (Wed) 22:48:07] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
209.10.165.50 - - [2006-03-16 (Thu) 19:58:04] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
24.239.153.192 - - [2006-03-16 (Thu) 19:58:04] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
67.52.194.74 - - [2006-03-18 (Sat) 16:26:36] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
207.226.172.226 - - [2006-03-18 (Sat) 16:26:36] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
203.92.89.47 - - [2006-03-21 (Tue) 00:38:43] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
217.174.253.232 - - [2006-03-29 (Wed) 13:57:50] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
24.73.233.122 - - [2006-04-02 (Sun) 19:35:42] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
84.178.156.208 - - [2006-04-03 (Mon) 10:50:13] "GET /trap/index.html HTTP/1.1" Java/1.4.2_03
65.26.239.117 - - [2006-04-07 (Fri) 09:59:22] "GET /trap/index.html HTTP/1.1" Java/1.4.2_06
64.58.224.132 - - [2006-04-08 (Sat) 17:13:31] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
69.159.128.254 - - [2006-04-08 (Sat) 18:11:12] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
220.130.12.159 - - [2006-04-09 (Sun) 08:11:50] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
69.16.199.218 - - [2006-04-10 (Mon) 03:01:17] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
67.167.192.173 - - [2006-04-14 (Fri) 17:36:24] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
82.4.26.89 - - [2006-04-15 (Sat) 14:44:57] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
217.129.156.59 - - [2006-04-16 (Sun) 08:22:19] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
217.206.92.150 - - [2006-04-17 (Mon) 15:07:52] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
84.178.159.211 - - [2006-04-26 (Wed) 11:13:22] "GET /trap/index.html HTTP/1.1" Java/1.4.2_03
216.246.22.90 - - [2006-04-27 (Thu) 18:08:47] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04
84.178.170.67 - - [2006-04-28 (Fri) 21:08:00] "GET /trap/index.html HTTP/1.1" Java/1.4.2_03
84.178.149.236 - - [2006-04-29 (Sat) 21:00:01] "GET /trap/index.html HTTP/1.1" Java/1.4.2_03
217.158.143.213 - - [2006-04-30 (Sun) 23:43:43] "GET /trap/index.html HTTP/1.1" Java/1.4.2_02
84.178.180.30 - - [2006-05-02 (Tue) 23:23:14] "GET /trap/index.html HTTP/1.1" Java/1.4.2_03
211.241.24.90 - - [2006-05-03 (Wed) 14:27:10] "GET /trap/index.html HTTP/1.1" Java/1.5.0_06
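
A quick way to see how the hits split by client is to tally the User-Agent field. The following is only a minimal sketch: it assumes every line follows the layout of the entries above (IP, two dashes, bracketed timestamp, quoted request, then the User-Agent), and the names `parse_line` and `count_agent_families` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, taken from the sample entries above:
#   IP - - [date (day) time] "request" user-agent
LINE_RE = re.compile(r'^(\S+) - - \[([^\]]+)\] "([^"]*)" (.+)$')


def parse_line(line):
    """Return (ip, timestamp, request, user_agent), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groups() if match else None


def count_agent_families(lines):
    """Count hits per User-Agent family, e.g. 'Java' for 'Java/1.4.1_04'."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip blank or malformed lines
        user_agent = parsed[3]
        family = user_agent.split("/", 1)[0]
        counts[family] += 1
    return counts


# Three entries copied from the log above.
sample = [
    '216.255.189.226 - - [2006-02-17 (Fri) 23:56:04] "GET /trap/index.html HTTP/1.1" '
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    '65.110.3.251 - - [2006-02-28 (Tue) 01:29:37] "GET /trap/index.html HTTP/1.1" Java/1.4.2_04',
    '68.97.56.15 - - [2006-03-03 (Fri) 22:16:48] "GET /trap/index.html HTTP/1.1" Java/1.4.1_04',
]

print(count_agent_families(sample))
```

Feeding the whole trap log through `count_agent_families` instead of `sample` would show how dominant the Java clients really are.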




--
--------
http://cryingwolf.net
--------
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany

Re: Blog Crawler / Spammer?

Post by garvinhicking »

Hi!

Hm, I've not heard of those bots. They do seem like some kind of harvester or spider, though. With the IPs you've logged, can you check whether they also spider all of your other pages?

It might also be some Java-based blogging application that your visitor(s) use?!
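
One way to run that check is to cross-reference the trapped IPs against the regular web server log. This is a rough sketch only: it assumes Apache-style access log lines, the helper name `paths_by_trapped_ip` is invented here, and the sample lines use documentation addresses (192.0.2.x), not real visitors.

```python
import re
from collections import defaultdict

# Assumption: the server writes Apache-style access log lines:
#   IP - - [timestamp] "METHOD /path HTTP/x.y" status size
ACCESS_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+)[^"]*"')


def paths_by_trapped_ip(access_lines, trapped_ips):
    """Map each trapped IP to the set of paths it requested."""
    hits = defaultdict(set)
    for line in access_lines:
        match = ACCESS_RE.match(line)
        if match and match.group(1) in trapped_ips:
            hits[match.group(1)].add(match.group(2))
    return dict(hits)


# Hypothetical access log lines, for illustration only.
access_log = [
    '192.0.2.10 - - [07/Mar/2006:18:45:40 +0000] "GET /index.php HTTP/1.1" 200 5120',
    '192.0.2.10 - - [07/Mar/2006:18:45:49 +0000] "GET /trap/index.html HTTP/1.1" 200 312',
    '192.0.2.20 - - [07/Mar/2006:18:46:02 +0000] "GET /index.php HTTP/1.1" 200 5120',
]
trapped = {"192.0.2.10"}

for ip, paths in paths_by_trapped_ip(access_log, trapped).items():
    print(ip, sorted(paths))
```

An IP that requests nothing but the trap and the feeds behaves differently from one that walks every page, so the set of paths per IP is a useful first hint.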

Best regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
nick
Regular
Posts: 21
Joined: Wed Oct 26, 2005 10:34 pm

Post by nick »

Well, if I'm not mistaken,
those Java apps usually just check the RDF / RSS links?

I explicitly put a Disallow on the trap in my robots.txt exclusions,
so those apps definitely don't obey it.
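
To double-check that the exclusion really covers the trap for these clients, the rules can be tested with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the robots.txt content below is hypothetical (the real file isn't shown in this thread), and `is_allowed` is just an illustrative helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the actual file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent, path):
    """Return True if the rules above permit user_agent to fetch path."""
    return parser.can_fetch(user_agent, path)


print(is_allowed("Java/1.4.1_04", "/trap/index.html"))
print(is_allowed("Java/1.4.1_04", "/index.php"))
```

If the trap path comes back as disallowed, any client that still requests it is ignoring robots.txt, whatever it calls itself.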

If they are just browsers, then I'm in jeopardy :cry:
--------
http://cryingwolf.net
--------