It is very easy to inadvertently create duplicate content on a website. In this
context, duplication means that the same content appears under multiple URLs.
Duplicate content can decrease traffic to a web page because search engines
cannot be sure which URL is the most relevant for a particular query. Below are
some common issues and solutions:
Canonicalization
This is the most common form of duplicate content problem and is caused when the
same web page can be accessed using various URLs. Although all of these URLs
access the same web page, search engines view them as separate pages and cannot
tell which one to prefer.
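To make the problem concrete, here is a minimal Python sketch that collapses several address variants into one preferred form. The domain example.com, the choice of the www host as the preferred one, and index.html as the default document name are all assumptions made for illustration; they do not come from any particular site.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical variants of one home page; example.com is a placeholder domain.
VARIANTS = [
    "http://example.com",
    "http://example.com/",
    "http://www.example.com/",
    "http://www.example.com/index.html",
    "HTTP://WWW.EXAMPLE.COM/index.html",
]


def normalize(url, preferred_host="www.example.com", index_names=("index.html",)):
    """Collapse common variants of a URL into one preferred form (ports are ignored)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Assumption: the bare domain and the www host serve the same site.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host == bare:
        host = preferred_host
    path = parts.path or "/"
    # Drop a trailing default document name such as index.html.
    for name in index_names:
        if path.endswith("/" + name):
            path = path[: -len(name)]
    return urlunsplit((scheme, host, path, parts.query, ""))


distinct_before = set(VARIANTS)
distinct_after = {normalize(u) for u in VARIANTS}
print(len(distinct_before), "addresses collapse to", len(distinct_after))
```

A search engine sees every entry in VARIANTS as a separate address until the site itself signals which one is preferred.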
Solution
This can be resolved by using a server-side redirect to ensure that only one
page URL is served; how to carry this out will depend on the specific server
setup.
Alternatively, the preferred URL can be chosen via Google Webmaster Tools.
Specific instructions for the latter option can be found in the Webmaster Tools
help documentation.
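As a rough illustration of the redirect approach, the sketch below is a small Python WSGI application that sends any request arriving on a non-preferred host to the same path on the preferred one. The host name, the use of a 301 (permanent) status, and the plain http scheme are assumptions for the example; most sites would configure this in the web server itself rather than in application code.

```python
PREFERRED_HOST = "www.example.com"  # hypothetical preferred host


def redirect_app(environ, start_response):
    """WSGI app: permanently redirect requests that are not on the preferred host."""
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO", "") or "/"
    query = environ.get("QUERY_STRING", "")
    if host != PREFERRED_HOST:
        # Rebuild the same path and query string on the preferred host.
        location = "http://" + PREFERRED_HOST + path
        if query:
            location += "?" + query
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]
    # Already on the preferred host: serve the page as normal.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"canonical page"]
```

The function can be served locally with wsgiref.simple_server.make_server from the standard library to watch the redirect happen.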
Printable Pages
Websites which have information pages, such as news sites and RSS feeds, often
have a 'Print' feature which loads the page content in a format stripped of CSS
styling and images, making it more suitable for printing. However, the printable
format creates a new URL by adding /print and is seen as duplicate content.
Solution
There is a very simple way to overcome this type of duplication: use a
rel=canonical tag on the /print page pointing towards the original page.
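The tag itself is a single link element placed in the head of the print page. The Python sketch below builds it by stripping the /print suffix from the address; the function name and the sample URL are hypothetical.

```python
from html import escape


def canonical_tag_for_print(url, suffix="/print"):
    """Return a rel=canonical link tag pointing at the non-print version of url."""
    original = url[: -len(suffix)] if url.endswith(suffix) else url
    return '<link rel="canonical" href="%s">' % escape(original, quote=True)


# Hypothetical print URL on a placeholder domain.
print(canonical_tag_for_print("http://www.example.com/news/story-1/print"))
# prints: <link rel="canonical" href="http://www.example.com/news/story-1">
```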
Blog Pages
The problem with blogs is not the pages themselves; it is the tags that are
associated with them or the categories that they are placed under. Different
blog articles will often share the same tags and categories, so the same article
content ends up listed on several tag and category pages as well as on its own
page.
Solution
The solution to this particular problem is again fairly simple. Most bloggers
use a 'format' which features only a few different categories but numerous tags,
in which case using "noindex, nofollow" in a robots meta tag on all of the 'tag'
pages will resolve the problem. For those who do have a larger number of
categories than tags, add the code to the 'category' pages.
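A template could decide whether to emit that code with something like the following Python sketch. It assumes tag pages live under a /tag/ path, which is only a guess at a typical blog layout; the function name is likewise hypothetical.

```python
def robots_meta(path, blocked_prefixes=("/tag/",)):
    """Return a robots meta tag for archive pages, or '' for ordinary pages."""
    # Assumption: archive pages can be recognised by their path prefix.
    if path.startswith(tuple(blocked_prefixes)):
        return '<meta name="robots" content="noindex, nofollow">'
    return ""


print(robots_meta("/tag/seo/"))         # tag archive: tag is emitted
print(repr(robots_meta("/2013/post")))  # ordinary article: nothing emitted
```

For a blog with more categories than tags, passing blocked_prefixes=("/category/",) would apply the same rule to category pages instead.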
Relative Linking
Many webmasters use relative internal linking as the faster, easier option, but
when used in conjunction with HTTPS pages and subdomains especially, it can
result in duplicate pages.
Solution
Some issues can be overcome by simply using a 'self-referential' rel=canonical
tag on each page that could be affected in this way. However, there is no
guarantee that this will be successful for every search engine.
The only other way is to make all internal links 'absolute', which means using
the full URL (protocol, host name and path) in the link rather than a path that
is resolved against whichever address the page was loaded from. Each absolute
URL is unique and points directly to the file required.
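The difference is easy to see with urljoin from Python's standard library. In this sketch the domain, the /shop/ path and the helper name absolutize are all made up for illustration: the same relative link resolves to two different addresses depending on where the page was served from, while the absolute version always resolves to one.

```python
from urllib.parse import urljoin

CANONICAL_BASE = "http://www.example.com/"  # hypothetical preferred site root


def absolutize(href, page_path="/"):
    """Resolve href against the preferred base instead of the current host."""
    return urljoin(urljoin(CANONICAL_BASE, page_path), href)


# A relative link follows whatever protocol and host the page was loaded from.
for served_from in ("https://www.example.com/shop/", "http://blog.example.com/shop/"):
    print(urljoin(served_from, "contact.html"))

# An absolute link is the same wherever the page is served.
print(absolutize("contact.html", "/shop/"))
```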
Summary
There are other issues that can cause duplicated content to arise quite
innocently, but the ones mentioned above are by far the most commonplace. Most
are easily resolved with a little time and application; others can be avoided by
putting a little more thought into the design and coding of any new web pages
produced.