Internet Business and Marketing Trends

MSN Live Ripe For Duplicate Content?

Most of us understand how damaging duplicate content can be to a search marketing campaign, and most search marketers do what they can to avoid the resulting penalties, which can be pretty severe.

There have been mistakes based on session IDs and other dynamic content issues, but for the most part, duplicate content is quite easy to stay away from. That is, until someone figured out a hole in the MSN Live Search algorithm. A post at an online marketing blog with a peculiar name, BoogyBonBon.com, revealed an algorithm anomaly that could wreak some havoc with MSN’s search index, not to mention the webmasters and site owners who have fallen victim to this exploit.

Essentially, the hole works by manipulating a site’s URL with a GET query string (the parameters PHP exposes through $_GET). By adding a parameter after the “?” (for example, ?test=dupecontent) and then saving these additional URLs into an .html file that will be crawled by Live’s search bot, you can make it look like a competitor’s site has duplicate content issues.
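
To make it concrete why these extra URLs are really the same page, here is a minimal PHP sketch that reduces a URL to its form without the query string. The function name canonicalUrl and the example.com addresses are illustrative assumptions, not anything taken from the BoogyBonBon post, and the sketch only handles absolute URLs with a scheme and host.

```php
<?php
// Hypothetical helper: strip the query string and fragment from an absolute
// URL so that variants differing only in parameters compare as the same page.
function canonicalUrl(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        // Not an absolute URL we can reason about; return it unchanged.
        return $url;
    }
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    $path = $parts['path'] ?? '/';
    return $parts['scheme'] . '://' . $parts['host'] . $port . $path;
}

// Compare a parameterized variant against the plain address.
$a = canonicalUrl('http://www.example.com/page.php?test=dupecontent');
$b = canonicalUrl('http://www.example.com/page.php');
var_dump($a === $b);
```

A search engine that normalized URLs along these lines would treat both addresses as one document instead of two copies.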

There is an additional step to take before you can drop this depth charge: running the altered URLs through an HTTP status code checker to confirm they return a 200 OK response.

Under normal circumstances, the results of these steps should be seen as merely additional URLs for the same page and not duplicate content. However, there is an apparent issue with MSN’s anti-spam algorithm, which the post elaborates on:

Once MSN has managed to find the new URL’s it will start to index the site’s content. Unfortunately for your competitor; MSN’s anti-spam algo is so bad that its does not have the brains to just not count the new URL’s because of dupe content, but instead just removes the page or entire website from it’s index.

Now, I haven’t tested this theory, and I’m not about to. But if this is indeed accurate, and judging by the responses at Threadwatch (which pointed this out) there appears to be some merit, the Live.com developers need to address it quickly before it gets severely exploited.

Until Live.com’s algo is refined to ignore these attempts, there is a developer-side fix available, but you have to have PHP pages in order for it to work. BoogyBonBon has more:

In all your headers of the pages that have been attacked you will need to add the following code as well as change the part with yourdomain.com to be the page all users should be at.

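<?php
// Redirect any request that carries query-string parameters to the canonical URL.
if (!empty($_GET)) {
    ignore_user_abort(true);
    header("Pragma: no-cache");
    header("Cache-Control: no-store, no-cache, must-revalidate");
    header("HTTP/1.1 301 Moved Permanently");
    // Replace with the URL all users should land on.
    header("Location: http://www.yourdomain.com/");
    header("Connection: close");
    exit;
}
?>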
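
One possible variation, sketched here under my own assumptions and not taken from the BoogyBonBon post, is to redirect to the requested path with the query string removed, so the same snippet can sit on any page without a hard-coded address. The helper name redirectTarget is hypothetical, and the sketch sends a relative Location value, which you may prefer to expand to a full URL.

```php
<?php
// Hypothetical helper: given the raw request URI, return the path to redirect
// to, or null when the request carries no query string.
function redirectTarget(string $requestUri): ?string
{
    $queryStart = strpos($requestUri, '?');
    if ($queryStart === false) {
        return null; // Already the clean URL; nothing to do.
    }
    $path = substr($requestUri, 0, $queryStart);
    return $path === '' ? '/' : $path;
}

// Only act inside a web request, where REQUEST_URI is set.
if (isset($_SERVER['REQUEST_URI'])) {
    $target = redirectTarget($_SERVER['REQUEST_URI']);
    if ($target !== null) {
        // Send a permanent redirect to the same path without parameters.
        header('Location: ' . $target, true, 301);
        exit;
    }
}
```

Like the original fix, this strips every parameter, so it only suits pages that never rely on a query string for legitimate content.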


10 Comments »

Comment by JB
2006-11-11 03:36:19

THANKS A BUNCH FOR TELLING ME HOW I CAN RUIN MY COMPETITORS!!!

 
Comment by AC
2006-11-11 15:31:00

And thanks a bunch for telling my competitors how to ruin my site…

 
Comment by Chris Richardson
2006-11-13 09:48:48

i don’t think you guys are grasping the concept. if MSN’s spam algo wasn’t so sensitive, these extra links pointing to one page wouldn’t make a difference at all.

 
Comment by Rich Z
2006-11-13 15:25:10

Is there any new word on this situation?

 
Comment by CD
2006-11-14 05:12:50

First, “thanks a lot” to the….’person’ that did this webpage THAT INFORMED EVEN MORE PARASITES NO HOW TO RUIN PEOPLE!!!! That is TOTALLY irresponsible to post something like this WITHOUT A FIX FOR IT for HTML pages!! I’m all for telling MSN about this and working with them DIRECTLY, but NOT telling the world about it! Chris, I don’t think you’re “grasping” *that* concept. Everyone knows their new algo (just like every one of G’s new algo’s) screw things up every time for the worse and slam legit white-hat websites! The fact of the matter IS, these extra links DO hurt and make a difference and thanks to webpages like this, more slimeballs KNOW ABOUT IT and can and WILL harm us!

Rich et al, I’ve asked MSN about this and got one of their typical “replies of avoidance”, I asked again and I’ll post here what they say about it.

Someone suggested posting it here: http://blogs.msdn.com/livesearch , but of course they’ve disabled any new comments to the posts! You can try asking about it here: http://blogs.msdn.com/livesearch/contact.aspx (you have to click around for the text input boxes, but they are there).

I also suggest we contact MSN as well through their webform.
http://feedback.live.com/eform.aspx?productkey=wlsiteowner&page=wlfeedback_home_form , and: http://support.msn.com/eform.aspx?productKey=msnbot&page=support_home_options_form_byemail&ct=eformts

And I saw this:
“We may remove a website from the index if the website was reported as spam. If you suspect that your website was incorrectly identified as spam, please send an e-mail message to webspam[at]microsoft.com “

 
Comment by Elmar Rezept
2007-08-03 16:45:29

Obscuring algo flaws won’t help in the long run. MSN does a rather good job in trying to catch up with Google, so pointing their engineers into the right direction is a good thing.

 
About WebProBlog

Welcome to WebProBlog! WebProBlog is essentially the WebProNews staff community blog. Frequently, we may have ideas or observations that may not necessarily be a great fit for a full WebProNews article but would work great in a blog. As a result, you can expect to see posts here from a few WebProNews writers and staff...

