Page Cloaking - To Cloak or Not to Cloak.
By Sumantra Roy
Page cloaking can broadly be defined as a technique
used to deliver different web pages under
different circumstances. There are two primary
reasons that people use page cloaking:
i)
It allows them to create a separate optimized
page for each search engine and another page
which is aesthetically pleasing and designed
for their human visitors. When a search engine
spider visits a site, the page which has been
optimized for that search engine is delivered
to it. When a human visits a site, the page
which was designed for the human visitors
is shown. The primary benefit of doing this
is that the human visitors don't need to be
shown the pages which have been optimized
for the search engines, because the pages
which are meant for the search engines may
not be aesthetically pleasing, and may contain
an over-repetition of keywords.
ii)
It allows them to hide the source code of
the optimized pages that they have created,
and hence prevents their competitors from
being able to copy the source code.
Page cloaking is implemented using specialized
cloaking scripts. A cloaking script is installed
on the server, which detects whether it is
a search engine or a human being that is requesting
a page. If a search engine is requesting a
page, the cloaking script delivers the page
which has been optimized for that search engine.
If a human being is requesting the page, the
cloaking script delivers the page which has
been designed for humans.
There are two primary ways by which the cloaking
script can detect whether a search engine
or a human being is visiting a site:
i)
The first and simplest way is by checking
the User-Agent variable. Each time a client
(be it a search engine spider or a browser
being operated by a human) requests a page
from a site, it reports a User-Agent name
to the site. Generally, if a search engine
spider requests a page, the User-Agent variable
contains the name of the search engine. Hence,
if the cloaking script detects that the User-Agent
variable contains the name of a search engine,
it delivers the page which has been optimized
for that search engine. If the cloaking script
does not detect the name of a search engine
in the User-Agent variable, it assumes that
the request has been made by a human being
and delivers the page which was designed for
human beings.
However, while this is the simplest way to implement
a cloaking script, it is also the least safe.
It is pretty easy to fake the User-Agent variable,
and hence, someone who wants to see the optimized
pages that are being delivered to different
search engines can easily do so.
ii)
The second and more complicated way is to
use I.P. (Internet Protocol) based cloaking.
This involves the use of an I.P. database
which contains a list of the I.P. addresses
of all known search engine spiders. When a
visitor (a search engine or a human) requests
a page, the cloaking script checks the I.P.
address of the visitor. If the I.P. address
is present in the I.P. database, the cloaking
script knows that the visitor is a search
engine and delivers the page optimized for
that search engine. If the I.P. address is
not present in the I.P. database, the cloaking
script assumes that a human has requested
the page, and delivers the page which is meant
for human visitors.
Although more complicated than User-Agent based cloaking,
I.P. based cloaking is more reliable and safe
because it is very difficult to fake the I.P.
address from which a request is made.
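The two detection methods described above can be sketched in a few lines of Python. This is only an illustration of how the checks work, not a recommendation to use them. The spider names and the address ranges below are hypothetical placeholders (the ranges are reserved documentation addresses), not real search engine data, and the function names are assumptions made for this example.

```python
import ipaddress

# Hypothetical spider names, used only for illustration.
SPIDER_NAMES = ("examplebot", "samplespider")

# Hypothetical I.P. database. These are reserved documentation ranges,
# not addresses used by any real search engine.
SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_spider_by_user_agent(user_agent):
    """Return True if the User-Agent string contains a known spider name."""
    ua = user_agent.lower()
    return any(name in ua for name in SPIDER_NAMES)


def is_spider_by_ip(remote_addr):
    """Return True if the visitor's address is listed in the I.P. database."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in SPIDER_NETWORKS)
```

The User-Agent check trusts whatever string the visitor chooses to send, which is why it is so easy to defeat. The I.P. check depends on the address the request actually came from, and is only as good as the database behind it.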
Now that you have an idea of what cloaking is
all about and how it is implemented, the question
arises as to whether you should use page cloaking.
The one-word answer is "NO". The reason is
simple: the search engines don't like it,
and will probably ban your site from their
index if they find out that your site uses
cloaking. The reason that the search engines
don't like page cloaking is that it prevents
them from being able to spider the same page
that their visitors are going to see. And
if the search engines are prevented from doing
so, they cannot be confident of delivering
relevant results to their users. In the past,
many people have created optimized pages for
some highly popular keywords and then used
page cloaking to take people to their real
sites which had nothing to do with those keywords.
If the search engines allowed this to happen,
they would suffer because their users would
abandon them and go to another search engine
which produced more relevant results.
Of course, a question arises as to how a search
engine can detect whether or not a site uses
page cloaking. There are three ways by which
it can do so:
i)
If the site uses User-Agent cloaking, the
search engines can simply send the site a
spider which does not report the name of the
search engine in the User-Agent variable.
If the search engine sees that the page delivered
to this spider is different from the page
which is delivered to a spider which reports
the name of the search engine in the User-Agent
variable, it knows that the site has used
page cloaking.
ii)
If the site uses I.P. based cloaking, the
search engines can send a spider from a different
I.P. address than any I.P. address which it
has used previously. Since this is a new I.P.
address, the I.P. database that is used for
cloaking will not contain this address. If
the search engine detects that the page delivered
to the spider with the new I.P. address is
different from the page that is delivered
to a spider with a known I.P. address, it
knows that the site has used page cloaking.
iii)
A human representative from a search engine
may visit a site to see whether it uses cloaking.
If she sees that the page which is delivered
to her is different from the one being delivered
to the search engine spider, she knows that
the site uses cloaking.
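The first of these checks boils down to requesting the same page under two identities and comparing what comes back. Here is a minimal Python sketch, assuming a hypothetical `fetch` function that returns the page a site serves for a given User-Agent. Two simulated sites stand in for real ones so that the example runs without a network, and all names and User-Agent strings are made up for illustration. The same comparison would apply to the I.P. based check, with the two requests coming from a known and an unknown address.

```python
def looks_cloaked(fetch, spider_user_agent, browser_user_agent):
    """Return True if the two User-Agents are served different pages.

    Exact comparison is a simplification: a real page may differ
    between requests for legitimate reasons (dates, rotating ads).
    """
    return fetch(spider_user_agent) != fetch(browser_user_agent)


def cloaking_site(user_agent):
    # Simulated site that serves a different page to a hypothetical spider.
    if "examplebot" in user_agent.lower():
        return "<html>keyword keyword keyword</html>"
    return "<html>Welcome!</html>"


def honest_site(user_agent):
    # Simulated site that serves the same page to everyone.
    return "<html>Welcome!</html>"


SPIDER_UA = "ExampleBot/1.0"    # hypothetical spider name
BROWSER_UA = "Mozilla/5.0"      # generic browser-style name

cloaking_detected = looks_cloaked(cloaking_site, SPIDER_UA, BROWSER_UA)
honest_detected = looks_cloaked(honest_site, SPIDER_UA, BROWSER_UA)
```

In this sketch `cloaking_detected` ends up true and `honest_detected` false, because only the first simulated site changes its output based on who is asking.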
Hence, when it comes to page cloaking, my advice
is simple: don't even think about using it.
Article by Sumantra
Roy. Sumantra is one of the most respected
search engine positioning specialists on the
Internet. To have Sumantra's company place
your site at the top of the search engines,
go to http://www.1stSearchRanking.com/t.cgi?3761
For more advice on how you can take your web
site to the top of the search engines, subscribe
to his FREE newsletter by going to http://www.1stSearchRanking.com/t.cgi?3761&newsletter.htm