How To Help Search Engines Find Your Content | Van SEO Design


Additional Resources

Meta Robots

Canonical Tag

Robots.txt

301 Redirects


Summary

The way you structure your content plays a part in how well your content gets crawled and indexed. If you want a search engine to list one of your pages in their results, the search engine first needs to find that page. It’s important that we make it easier for spiders to find all of the pages we want indexed.

Fortunately, most of the ways you help search engines find your content also help real people find that same content. A sitemap, for example, can serve as a great backup to your main navigation and can be organized as a table of contents for your entire site. Shorter click paths mean that people, as well as spiders, can get to your content more quickly.
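For readers who haven't built one, an XML sitemap is the machine-readable counterpart to that table of contents. A minimal sketch using the standard sitemaps.org format (the URLs and dates are hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- one <url> entry per page you want crawled -->
    <loc>https://www.example.com/</loc>
    <lastmod>2012-05-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <!-- only <loc> is required; the other tags are optional hints -->
    <loc>https://www.example.com/about/</loc>
  </url>
</urlset>
```

The file is typically uploaded to the site root and announced to crawlers with a Sitemap: line in robots.txt or submitted through the search engines' webmaster tools.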

Sometimes, though, we need to understand how people and search engines see things differently. Real people won't have any problem with multiple URLs pointing to the same content; if anything, it likely makes things easier for them. Search engines, on the other hand, still get confused by “duplicate content,” and you need to be aware of that so you can make things clearer for them.
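One common way to clear up that duplicate-content confusion is the canonical tag mentioned among the resources above. A minimal sketch, assuming a hypothetical article that is reachable at several URLs:

```html
<!-- placed in the <head> of every duplicate variant of the page
     (e.g. /article?ref=twitter or /article?print=1) -->
<link rel="canonical" href="https://www.example.com/article/" />
```

Where the duplicate URLs can be retired entirely, a 301 redirect to the preferred URL achieves a similar consolidation at the server level.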

Next week we’ll look beyond crawling and indexing and talk about siloing or theming your content. The idea is to develop the structure of your content in a way to help reinforce the different keyword themes on your site and in the process help your pages rank better for keyword phrases around those themes.

via:

How To Help Search Engines Find Your Content | Van SEO Design.


SEO Manager Wanted. Bots Need Not Apply

If you were looking for an SEO manager, where would you advertise?

Even if you follow all the rules Lou Adler laid out, it would be hard to top what the Daily Mail in the UK did.

The newspaper embedded an ad in its robots.txt file, a place no human has any reason to look. This file is meant strictly for the crawlers from search engines: it tells them which pages they may crawl and which they may not. For normal humans there’s nothing of interest there, as you may already have discovered if you clicked the link.
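For readers who have never opened one, a robots.txt file is just plain text served at the site root. A minimal sketch (the paths here are hypothetical, not the Daily Mail's actual rules):

```text
# robots.txt -- must live at https://www.example.com/robots.txt
User-agent: *          # these rules apply to all crawlers
Disallow: /admin/      # don't crawl anything under /admin/
Disallow: /search      # keep internal search results out of the index
Allow: /               # everything else is fair game

Sitemap: https://www.example.com/sitemap.xml
```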

True SEO geeks, though, check those files. Sometimes the instructions to the crawlers contain interesting tidbits, such as the location where dummy editions might be found. A blogger in 2007 posted about what he found in some UK newspapers’ robots.txt files.


Daily Mail Places Stealth Job Ad In Its Robots.txt

It’s more commonly used as a way to block search crawlers from certain parts of publishers’ sites.

But the most-visited newspaper website in the UK is using its robots.txt file as a clever hiring tool, as eagle-eyed Malcolm Coles spotted: (…) Genius! You could also contact Mail Online MD James Bromley on Twitter.

For those who don’t know, the robots.txt file is how you tell search engines which pages they can and can’t crawl on your site to include in their index.

In the past it was worth occasionally checking out newspapers’ robots.txt files as they listed the URLs of stories that they’ve had to withdraw for legal reasons (or joke Polish editions). Sadly, they don’t seem to do that so much these days (and they’d get lost in the Mirror’s massive file). Plus there’s no easy way to check if they’ve been updated – Google Reader’s ability to track changing webpages doesn’t work with robots.txt files. Boo. (… http://www.malcolmcoles.co.uk/blog/seo-job-mail-robots/)
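How does a job ad fit into a file of crawler directives? Lines beginning with # are comments that crawlers simply ignore, which is what makes the trick possible. A sketch of the mechanism (the wording of the Mail's actual ad is not reproduced here):

```text
# Interested in working for us? We're hiring an SEO manager.
# If you're reading this file, you may be exactly the kind of
# person we're looking for. Get in touch.
User-agent: *
Disallow: /admin/
```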


5 Web Files That Will Improve Your Website

The amount of code that developers encounter regularly is staggering. At any one time, a single site can make use of over five different web languages (e.g. MySQL, PHP, JavaScript, CSS, HTML).

There are a number of lesser-known and underused ways to enhance your site with a few simple but powerful files. This article aims to highlight five of these unsung heroes that can assist your site. They’re pretty easy to use and understand, and thus, can be great additions to the websites you deploy or currently run.
An Overview

Which files are we going to be examining (and producing)? Deciding which files to cover was certainly not an easy task, and there are many other files (such as .htaccess, which we won’t cover) that can give your website a boost.

The files I’ll talk about here were chosen for their usefulness as well as their ease of implementation. Maximum bang for our buck.

We’re going to cover robots.txt, favicon.ico, sitemap.xml, dublin.rdf and opensearch.xml. Their purposes range from helping search engines index your site accurately to acting as usability and interoperability aids.
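Of the five, opensearch.xml is probably the least familiar: it lets browsers offer your site's own search from their search box. A minimal sketch following the OpenSearch 1.1 description format, with a hypothetical example.com search endpoint:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Site</ShortName>
  <Description>Search example.com</Description>
  <!-- {searchTerms} is replaced with whatever the user types -->
  <Url type="text/html"
       template="https://www.example.com/search?q={searchTerms}"/>
</OpenSearchDescription>
```

Browsers discover the file through a link tag in your pages' head, with rel="search" and type="application/opensearchdescription+xml" pointing at it.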

Let’s start with the most familiar one: robots.txt.


by Alexander Dawson
