Solving Duplicate Content Issues in a WordPress Blog

Wednesday, October 15, 2008


Adding ‘noindex, follow’ tags

What can you do to avoid this problem? You can tell the search engines which URLs to index by using a ‘noindex, follow’ meta tag, robots.txt exclusions or 301 redirects. Let’s say you want Google to index your front page, posts, single pages and category pages, and to keep archives, feeds and ‘next entries’ pages (page/2, page/3, …) out of the index. To do this, add the following code to your header.php:

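The code is:
<?php if ((is_home() && ($paged < 2)) || is_single() || is_page() || is_category()) {
    // Front page (first page only), posts, standalone pages and category pages: index them
    echo '<meta name="robots" content="index,follow">';
} else {
    // Everything else (archives, 'next entries' pages, ...): follow the links but do not index
    echo '<meta name="robots" content="noindex,follow">';
} ?>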

For those not familiar with editing templates in WordPress: in your dashboard, click the Presentation menu item and, once the new page opens, click Theme Editor. In the Theme Editor choose ‘header.php’ and paste the above code into the editor form. The code can go anywhere between the <head> and </head> tags.

Here the <meta name="robots" content="index,follow"> tag is added to the home page but not to its ‘next entries’ pages (is_home() && ($paged < 2)); to your posts (is_single()); to standalone pages, like ‘About me’, if you created any (is_page()); and to category pages (is_category()). If you don’t want your categories to be indexed, just delete || is_category(). All the other pages will get <meta name="robots" content="noindex,follow">. They will not be indexed, but crawlers will still follow their outgoing links.

Adding unique meta description

For this purpose I use the Head Meta Description plugin. It can be configured to use an excerpt of your post as the meta description, which is especially useful if you have to add this tag to hundreds of existing pages. Or you can write your own description manually as a custom field, which is my personal preference.
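
To make the excerpt idea concrete, here is a minimal Python sketch of how a description could be derived from the start of a post. It is not the plugin's actual code: the function name make_meta_description and the 155-character limit are assumptions chosen for illustration.

```python
import html
import re


def make_meta_description(post_html, max_len=155):
    """Build a meta description tag from the beginning of a post (illustrative only)."""
    # Drop HTML tags, decode entities and collapse runs of whitespace.
    text = re.sub(r"<[^>]+>", " ", post_html)
    text = " ".join(html.unescape(text).split())
    if len(text) > max_len:
        # Cut at the last space before the limit so no word is split in half.
        text = text[:max_len].rsplit(" ", 1)[0] + "..."
    # Escape quotes and angle brackets so the attribute value stays valid.
    return '<meta name="description" content="%s">' % html.escape(text, quote=True)


if __name__ == "__main__":
    print(make_meta_description("<p>My <b>first</b> post about duplicate content.</p>"))
```

Because every post starts differently, each page ends up with its own description, which is the point of the exercise.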

Using the more tag

By inserting the <!--more--> tag into a post you tell WordPress to display only the part before it, typically the first few lines, on the home page and other listing pages. This greatly reduces the similarity between your home page and your articles. If you have too many existing posts to edit, you can use an ‘excerpt’ plugin, such as this one from Semiologic.
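
The effect of the tag can be modelled in a few lines of Python. This is a simplified sketch, not WordPress code: the helper name teaser is made up, and only the plain <!--more--> form of the tag is handled.

```python
MORE_TAG = "<!--more-->"


def teaser(post_content):
    """Return the part of a post that listing pages would show (simplified model)."""
    # Everything before the first more tag is the teaser;
    # a post without the tag is shown in full.
    before, _, _ = post_content.partition(MORE_TAG)
    return before.strip()


if __name__ == "__main__":
    post = "A short introduction.<!--more-->The long body of the article."
    print(teaser(post))
```

Only the teaser is repeated on the home page, so the full text of the article lives at a single URL.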

Redirect to a canonical URL

You should edit your .htaccess file to perform 301 redirects. Non-www addresses like yoursite.com should be redirected to www.yoursite.com. URLs without a trailing slash, like www.yoursite.com/category, should be rewritten to include it: www.yoursite.com/category/. This can be done by inserting the following code into your .htaccess file:

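RewriteEngine On
RewriteBase /
# Redirect every other host name to www.yoursite.com (R=301 is permanent; a bare R sends a 302)
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]
# Add the missing trailing slash to URLs that are not real files and have no file extension
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(/|\.[a-zA-Z0-9]+)$
RewriteRule ^(.*)$ http://www.yoursite.com/$1/ [R=301,L]
# Standard WordPress rules: pass everything that is not a real file or directory to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]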

For more details, read this: the process of rewriting the URL layout.
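
As a quick way to reason about which address each request should end up at, the canonicalisation described above can be sketched in Python. This only models the intended result of the redirects; it is not what Apache executes. The function name canonical_url is an assumption, www.yoursite.com is the article's placeholder host, and treating any last path segment containing a dot as a file is a simplification.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.yoursite.com"  # placeholder host used throughout the article


def canonical_url(url):
    """Return the URL a request should be 301-redirected to (illustrative model)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Add a trailing slash unless the path already has one
    # or its last segment looks like a file name (contains a dot).
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    # Always answer on the www host; keep the query string and fragment.
    return urlunsplit((parts.scheme or "http", CANONICAL_HOST, path,
                       parts.query, parts.fragment))


if __name__ == "__main__":
    for url in ("http://yoursite.com/category", "http://www.yoursite.com/category/"):
        print(url, "->", canonical_url(url))
```

A URL that is already canonical maps to itself, so a redirect is only needed when the input and output differ.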

Preventing spiders from crawling feeds and auxiliary pages

To do this, edit your robots.txt file and insert the following rules:

User-agent: *
Disallow: /wp-
Disallow: /search
Disallow: /feed
Disallow: /comments/feed
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /*/*/feed/$
Disallow: /*/*/feed/rss/$
Disallow: /*/*/trackback/$
Disallow: /*/*/*/feed/$
Disallow: /*/*/*/feed/rss/$
Disallow: /*/*/*/trackback/$
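
Before relying on rules like these, it can help to check which paths they actually block. The Python sketch below models the wildcard matching that major crawlers such as Googlebot apply: ‘*’ matches any run of characters and a trailing ‘$’ anchors the end of the path. The helper names rule_to_regex and is_blocked are made up for illustration, only Disallow lines are modelled, and the list contains just the first eight rules from above.

```python
import re

# The first eight Disallow rules from the robots.txt above.
DISALLOW = [
    "/wp-", "/search", "/feed", "/comments/feed", "/feed/$",
    "/*/feed/$", "/*/feed/rss/$", "/*/trackback/$",
]


def rule_to_regex(rule):
    """Translate one Disallow pattern into a compiled regular expression."""
    # A trailing '$' anchors the pattern to the end of the URL path.
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # '*' stands for any run of characters, slashes included.
    pattern = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path, rules=DISALLOW):
    """True if any rule matches the path, starting from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    for path in ("/2008/10/my-post/", "/2008/10/my-post/feed/", "/wp-admin/"):
        print(path, "blocked" if is_blocked(path) else "allowed")
```

In this model ‘*’ also matches slashes, so /*/feed/$ already covers feeds nested at any depth; the longer /*/*/… lines matter only for crawlers that interpret the wildcard differently.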


Web Sources: The Web Marketing Blog

Tag: duplicate content, web marketing blog, wordpress blog
