Internet Business and Marketing Trends

Verifying MSN Bot Activity

Keeping track of your web server stats is an important responsibility for webmasters and SEOs alike. The stats can provide a lot of valuable marketing information, and they show which search engine crawlers have hit your site and which ones haven’t. One thing they don’t show is whether the visiting bot is legitimate.

Malicious programmers have been known to disguise the true intentions of their bots, often making them appear friendly. Because of this, the Live.com (MSN Search) blog posted a guide to help webmasters identify which visitors are actual MSN Search bots. The first part of the post lists the bot names used by MSN, as well as what each is responsible for:

MSNBot - Main web crawler
MSNBot-Media - Images & all other media
MSNBot-NewsBlogs - News and blogs
MSNBot-Products - Products & shopping
MSNBot-Academic - Academic search
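
As a rough illustration, the list above can be turned into a lookup that reports which MSN bot a visitor claims to be. This is only a sketch: the function name claimed_msn_bot is hypothetical, and matching the name as a case-insensitive substring of the User-Agent header is an assumption, not something the Live Search post specifies. A match only tells you what the visitor claims, not that the claim is true.

```python
# Bot names and roles as listed in the post. How each name appears inside a
# real User-Agent header is an assumption of this sketch.
MSN_BOTS = {
    "MSNBot": "Main web crawler",
    "MSNBot-Media": "Images & all other media",
    "MSNBot-NewsBlogs": "News and blogs",
    "MSNBot-Products": "Products & shopping",
    "MSNBot-Academic": "Academic search",
}


def claimed_msn_bot(user_agent):
    """Return (name, role) for the MSN bot a User-Agent claims to be, else None."""
    ua = user_agent.lower()
    # Check longer names first so "MSNBot-Media" is not reported as plain "MSNBot".
    for name in sorted(MSN_BOTS, key=len, reverse=True):
        if name.lower() in ua:
            return name, MSN_BOTS[name]
    return None
```

Any visitor can send one of these names in its User-Agent, which is why the IP-based check described next matters.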

Not only does the blog entry supply the MSN bot names, it also provides an additional method of identification using the visitor’s IP address and a reverse DNS lookup. Once you have the host name behind an unknown bot’s IP address, you can tell whether or not MSN is responsible for it. The blog entry explains:

Once you have the host name (in this case, livebot-207-46-98-149.search.live.com), you can check that it really is coming from Live Search. The name of all live search crawlers will end with ‘search.live.com’. If the name doesn’t end with ‘search.live.com’, you know it’s not really our crawler.

Finally, you need to verify that the name is accurate. In order to do this, you can use Forward DNS to see the IP address associated with the host name. This should match the IP address you used in Step 2 – if it doesn’t, it means the name was fake.
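
The steps quoted above (reverse DNS lookup, host name suffix check, forward DNS confirmation) can be sketched in a few lines of Python using the standard socket module. This is a minimal sketch, not Microsoft’s own tooling: the function names, the injectable reverse_lookup and forward_lookup parameters, and the choice to require a leading dot before ‘search.live.com’ are all assumptions made for illustration.

```python
import socket

# Suffix quoted in the Live Search post. Requiring the leading dot is a
# stricter choice made in this sketch.
LIVE_SUFFIX = ".search.live.com"


def is_live_search_host(hostname):
    """True if the host name ends with the Live Search crawler suffix."""
    return hostname.lower().rstrip(".").endswith(LIVE_SUFFIX)


def verify_msn_bot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if `ip` passes the reverse and forward DNS checks.

    The lookup parameters are hypothetical hooks so the logic can be
    exercised without network access; by default real DNS is used.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Reverse DNS: IP address -> host name.
    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False  # no reverse record, so the visitor cannot be verified

    # The host name must belong to Live Search.
    if not is_live_search_host(hostname):
        return False

    # Forward DNS: host name -> IP addresses, which must include the original.
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    return ip in addresses
```

Calling verify_msn_bot with an IP address taken from your server logs performs live DNS queries, so results depend on your resolver. On a busy site it may be worth caching the answer per IP address rather than repeating the lookups for every request.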

Once you compile a list of the fake bots hammering your site, you can use your site’s robots.txt file to try to keep them out. Keep in mind that robots.txt is only advisory, so a malicious bot is free to ignore it.


