Black Hat SEO and Search Engine Spam
Any measure taken solely to improve a website's ranking in search engines, without any benefit for the visitors of the website, counts as search engine spam under the guidelines of Google and other search engines.
Definitions of Search Engine Spam
- Content that is produced specifically for search engines
- Techniques that produce more than one listing for a given keyword
- Redirects to webpages that do not provide any content for the given keyword combination
There are many different search engine spam techniques. Some are used simply because hobby webmasters misunderstand “Search Engine Optimization” and misinterpret the information they find about it.
Two different types of Search Engine Optimization
To distinguish between the different types of search engine optimization, SEOs coined the terms “White Hat SEO” and “Black Hat SEO”.
These two terms are borrowed from the “hacker” scene, where they are used in a similar way.
White Hat SEO
A white hat SEO is someone who follows the officially stated rules, e.g. the Google guidelines for webmasters, and does not violate them. A white hat SEO does not use search engine spam techniques to rank webpages, whether for customers or for his own projects.
Black Hat SEO
A Black Hat SEO typically uses one or more types of search engine spam to reach his optimization goals. The problem with Black Hat SEO is the high risk of getting banned from the search engines’ index.
General remark
If you are doing online marketing or search engine optimization, you need to decide which techniques to use. If you decide to use White Hat SEO, you can expect your webpages to stay in the index of the search engines, and your ranking should remain quite stable.
If, on the other hand, you use Black Hat SEO, you could gain some profit through good rankings, but these might last only weeks or even days, until spam filters and spam reports get your website banned by the search engines. If you offer search engine optimization services to clients, you also run the risk of having to explain an exclusion from the search engines to a client who might lose a lot of money by not being listed in the major search engines.
Search Engine Spam Techniques
There are many different search engine spam techniques, and most of them can easily be applied to a website or webpage. Some of them, however, such as Cloaking, require more technical experience and knowledge.
Keyword Stuffing
Keyword Stuffing can be seen on many webpages, especially older ones.
Keyword Stuffing is defined as the inappropriate, very obvious overuse of keywords, stuffing them into the title, the keywords Metatag or the website’s content.
The goal of Keyword Stuffing is to reach a high keyword density and thereby influence this on-page factor in a positive way.
Example
This webpage contains information about printer drivers, printer driver, and printer drivers
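Keyword density can be made concrete with a small calculation. The following is only a sketch under a stated assumption: density is taken here as the share of all words on the page that belong to occurrences of the keyword phrase. Search engines do not publish their exact formula, and the function name `keyword_density` is hypothetical.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Share of the words in `text` that belong to occurrences of `phrase`.

    Assumed definition for illustration only; real search engines may
    measure density differently.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    target = re.findall(r"[a-z0-9]+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase starts in the word list.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


sample = ("This webpage contains information about printer drivers, "
          "printer driver, and printer drivers")

print(keyword_density(sample, "printer"))  # 0.25
```

In the stuffed example sentence, every fourth word is "printer", which is far more than natural text on the topic would usually contain.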
Explanation
Webmasters who use Keyword Stuffing try to increase their keyword density without having to write longer texts or detailed information, which would cost more money. Keyword Stuffing is easy to produce in titles or page content, and free keyword lists are available, for example from the Overture Keyword Suggestion Tool.
Why Keyword Stuffing doesn’t work
Keyword Stuffing does not work because search engines analyse every word on a webpage and can detect Keyword Stuffing easily. In addition, on-page factors only make up about 10 % of the ranking algorithm, and Metatags (if the stuffing occurs in the keywords Metatag) are not considered important.
White Hat SEO Alternatives
To reach a keyword density that is positive for your webpage and not considered keyword stuffing, you should stick to producing content for your visitors, ask them for feedback, and let friends and customers read and correct your webpages.
Invisible Text
Another type of search engine spam, similar in its aim to Keyword Stuffing, is the placement of invisible text on the webpage. Webmasters put invisible text on webpages in order to include their keywords and search engine “optimized” spam content without visitors getting to see it.
Invisible Text with HTML
The method most often used by hobby webmasters, and very simple to apply, is to set the text in the same color as the website’s background.
Search engines can detect this type of spam by analysing the HTML tags and comparing the colors specified for the background and for the text.
Search engines also penalize a very small difference, e.g. between white and grey, as spam.
Example using HTML
<body bgcolor="#FFFFFF">
<p><font color="#FFFFFF">This is search engine spam, printer drivers, devices,
Problems, information, this is spam, </font></p>
The disadvantage of invisible text in HTML is that search engines can detect it easily. Visitors can also select the text with the mouse, read it, and possibly send a spam report to the search engines.
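The color comparison described above can be sketched in a few lines of Python. This is an illustration, not the method any search engine is known to use: the function names and the `threshold` value of 40 are assumptions chosen for the example.

```python
def hex_to_rgb(color: str) -> tuple:
    """Convert '#RRGGBB' or '#RGB' into a tuple of three integers (0-255)."""
    value = color.lstrip("#")
    if len(value) == 3:
        # Expand the short form, e.g. 'fff' -> 'ffffff'.
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6:
        raise ValueError(f"not a hex color: {color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def looks_invisible(text_color: str, background_color: str,
                    threshold: int = 40) -> bool:
    """True if text and background differ by at most `threshold` per channel.

    `threshold` is a hypothetical value for illustration only.
    """
    text = hex_to_rgb(text_color)
    background = hex_to_rgb(background_color)
    largest_difference = max(abs(a - b) for a, b in zip(text, background))
    return largest_difference <= threshold


print(looks_invisible("#FFFFFF", "#FFFFFF"))  # True: identical colors
print(looks_invisible("#EEEEEE", "#FFFFFF"))  # True: light grey on white
print(looks_invisible("#000000", "#FFFFFF"))  # False: black on white
```

A per-channel difference is the simplest possible measure; a real check would also have to resolve colors that are inherited or set in style sheets.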
Invisible Text using CSS
As an alternative to HTML tags for formatting a website, it is possible to use Cascading Style Sheets (CSS). Not all search engines are able to read CSS; nevertheless, Google has improved its ability to do so in the last few months, and other search engines such as Yahoo! and MSN Search are expected to follow this development.
Example using CSS
In the following example, a layer (a div element) is combined with CSS to position content outside the visible area of a webpage.
CSS
.position {position: absolute; width: 180px; height: 75px; z-index: 3; left: -220px; top: -95px; visibility: visible}
Tag
<div class="position">Keywords outside the area seen by the visitor</div>
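How a spider might recognise such off-screen positioning can be sketched as follows. The sketch assumes that all lengths are given in px and that a box is "off screen" when it ends at or before the left or top edge of the page. The helper names are hypothetical, and a real crawler would need a full CSS parser.

```python
from typing import Optional


def parse_declarations(block: str) -> dict:
    """Split 'name: value; name: value' into a dictionary of properties."""
    properties = {}
    for part in block.split(";"):
        if ":" in part:
            name, value = part.split(":", 1)
            properties[name.strip().lower()] = value.strip().lower()
    return properties


def px(value: Optional[str], default: float = 0.0) -> float:
    """Read a pixel length such as '-220px'; anything else yields `default`."""
    if value and value.endswith("px"):
        try:
            return float(value[:-2])
        except ValueError:
            return default
    return default


def is_positioned_off_screen(block: str) -> bool:
    """True if an absolutely positioned box lies left of or above the page."""
    properties = parse_declarations(block)
    if properties.get("position") not in ("absolute", "fixed"):
        return False
    right_edge = px(properties.get("left")) + px(properties.get("width"))
    bottom_edge = px(properties.get("top")) + px(properties.get("height"))
    return right_edge <= 0 or bottom_edge <= 0


rule = ("position: absolute; width: 180px; height: 75px; z-index: 3; "
        "left: -220px; top: -95px; visibility: visible")
print(is_positioned_off_screen(rule))  # True
```

With the values from the example, the box ends 40 pixels to the left of the page and 20 pixels above it, so a visitor never sees its content.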
Summary Invisible Text
Invisible text can be produced easily, using either HTML or CSS. Small text and blind text are both considered search engine spam by the engines. If you decide to use these methods anyway, you should still keep the keyword density within limits. The probability that search engine spiders detect this type of spam is high.
The safe alternative is to produce or buy content for your webpages, which costs more but provides the security of not getting banned or filtered by search engine algorithms.
Metatag Spamming
Metatag Spamming was widely used in past years, but since search engines have reacted to this type of spam and it no longer works, it is not often seen on webpages any more.
Metatag Spamming is similar to Keyword Stuffing, but it can also happen that wrong Metatag information is set unintentionally.
Metatags without contextual sense
Metatags are intended to provide information about the content of a webpage. So, if you include a title that reads, for example
<title>Information about Printer drivers</title>
you should not write about toasters and recipes. You should always cover the information and keywords mentioned in your title in your content, so that users get what they searched for.
Another typical type of Metatag Spamming is to include wrong description or keyword information. Webmasters often try to include keywords in their Metatags that are known to be searched often, such as insurance, loans, mortgages …
Often the Metatags are also identical on all webpages of an entire website, so that individual areas of information are not described separately by their own title or description tags.
Metatag Spamming does not work because Metatags are not considered important (as they can be manipulated easily), and because search engines can detect the spamming by analysing the content and checking whether it matches the given Metatag information.
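The matching of content against the given title information can be illustrated with a simple coverage measure: the fraction of the title's terms that also occur in the page text. This is only a sketch; the stop-word list and the function names are hypothetical, and real search engines certainly use more elaborate text analysis.

```python
import re

# Hypothetical stop-word list, kept deliberately tiny for the example.
STOPWORDS = {"a", "an", "and", "about", "for", "in", "of", "on", "the", "to"}


def terms(text: str) -> set:
    """Lower-cased words of `text` without the stop words."""
    return {word for word in re.findall(r"[a-z0-9]+", text.lower())
            if word not in STOPWORDS}


def title_coverage(title: str, body: str) -> float:
    """Fraction of the title's terms that also appear in the page body."""
    title_terms = terms(title)
    if not title_terms:
        return 0.0
    return len(title_terms & terms(body)) / len(title_terms)


title = "Information about Printer drivers"
matching_body = "Download printer drivers and find information on installation."
unrelated_body = "Our favourite toasters and recipes for breakfast."

print(title_coverage(title, matching_body))   # 1.0
print(title_coverage(title, unrelated_body))  # 0.0
```

A page whose title promises printer drivers but whose text never mentions them scores 0.0 here, which is the mismatch described above.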
Doorway Pages
Before talking about Doorway Pages, it is necessary to define them. Doorway Pages are webpages that are produced in large numbers in order to increase traffic to a customer’s website.
Doorway Pages are highly optimized pages that use all types of on-page search engine optimization. Most Doorway Pages include a redirect to the target webpage; this is most often done with JavaScript redirect scripts or with 301 redirects.
Search engines have improved their mechanisms for detecting Doorway Pages, but Black Hat SEOs have reacted to this by adapting their methods so that they still succeed.
At the moment, Doorway Pages are often no longer used to redirect, but to provide “search engine results”.
Pseudo search engines that are optimized for a given topic area and include Google AdSense to earn money are fairly popular at the moment; the scripts for this type of optimization can even be bought on eBay.
I would define a doorway page as any webpage that is provided only for the search engines, in order to artificially spam their index.
Doorway Page Generators
With the rise of Doorway Pages, many programs and scripts called Doorway Page Generators were sold or offered for free download on the internet. In the past year, I haven’t heard of any significant ranking resulting from the use of Doorway Pages, but they are still used by the leading black hat SEOs, who merely change their methods.
Doorway Page Generators created 100, 1,000 or even 10,000 webpages within minutes or hours, depending on their complexity and capabilities. They generated random content and filled this content spam with given keywords related to the topic.
For a visitor, the output of doorway page generators was often interesting or funny nonsense; sometimes poems were used as well.
Doorway Page Generators were very frequently sold with the promise of great search engine rankings for all keywords, but nobody could ever answer why the vendors would sell their software instead of using it themselves.
At the moment, doorway page generators have lost their market, because more and more customers and SEO companies try to follow the search engines’ guidelines.
Cloaking
Cloaking is a search engine spam technique. To explain Cloaking, it is first necessary to explain how a search engine spider retrieves documents from the web server.
Explanation
If a client requests a webpage from a server, every user agent (browser or spider) usually gets the same document. It is possible, however, to deliver special documents depending on the IP address of a search engine spider (IP delivery) or on the User Agent/client information sent with the request (Cloaking).
Necessary information for Cloaking
- User Agent and IP information of the search engine spiders (MSN Search, Yahoo, Google)
- Specially optimized webpages for every spider, updated according to changes in the ranking criteria
- A web host that supports Cloaking
How Cloaking is detected by search engine robots
Search engines detect Cloaking and IP delivery by visiting the given webpages with different User Agents, so that they get to see what a user would see. As for IP delivery, search engines use many different IP ranges to crawl the web, and not all of them are known. It is also not possible to update the corresponding lists on the server frequently enough. Nevertheless, IP delivery/Cloaking is only considered spam if there is a significant difference between the delivered versions of the webpage.
Since this is always the case when Cloaking is intended to spam the search engines, it leads to the deletion of the URL from the index.
Although Cloaking is a more sophisticated approach than the other types of search engine spam, because it takes technical knowledge and more work to carry out, it can be detected and penalized by search engines, even if it might take some weeks until the robots notice the difference.
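The comparison of the versions delivered to different User Agents can be sketched with Python's standard `difflib` module. The sketch works on two page texts that have already been retrieved, one as a browser and one as a spider. The function names and the `threshold` of 0.5 for a "significant difference" are assumptions made for the example.

```python
from difflib import SequenceMatcher


def version_similarity(browser_version: str, spider_version: str) -> float:
    """Similarity of two page texts, from 0.0 (different) to 1.0 (identical)."""
    matcher = SequenceMatcher(None, browser_version.split(), spider_version.split())
    return matcher.ratio()


def looks_cloaked(browser_version: str, spider_version: str,
                  threshold: float = 0.5) -> bool:
    """True if the two versions differ significantly.

    `threshold` is a hypothetical cut-off for illustration only.
    """
    return version_similarity(browser_version, spider_version) < threshold


visitor_page = "Welcome to our shop for office supplies and accessories"
spider_page = "printer drivers printer driver download free printer drivers"

print(looks_cloaked(visitor_page, spider_page))   # True
print(looks_cloaked(visitor_page, visitor_page))  # False
```

Small differences, such as a changed date or a rotating banner text, keep the similarity high and would not be flagged, which matches the remark that only significant changes are considered spam.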
Duplicate Content
Duplicate Content can cause problems not only for websites that apply this technique, but also for innocent webmasters whose content has been used on other webpages.
Content scrapers, as pseudo search engines are often called, take the content of other webpages and present their descriptions and titles as “results”.
Many companies think that providing their content under different URLs, e.g. www.domain.de, www.keyword-domain.de and www.keyword-company.de, will give them better results.
In fact this is not true, because search engines can identify duplicate content in most cases and filter it from their result pages and, eventually, from their index.
I also observed this with my own webpages during research.
Using different domains can increase your rankings, but only if you provide unique content on each page of your websites. This can be achieved by focusing on different aspects of a topic.
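One common way to measure how much two texts overlap is to compare their word "shingles" (short runs of consecutive words). The sketch below uses this approach as an illustration; it is not claimed to be what any particular search engine does, and the function names and the shingle size of 3 are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """All runs of `size` consecutive words in `text`, as a set of tuples."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def content_overlap(first: str, second: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 1.0 means identical text."""
    a, b = shingles(first, size), shingles(second, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


original = "We offer printer drivers for all common operating systems"
copied = "We offer printer drivers for all common operating systems"
unique = "Our guide explains how to install a new scanner at home"

print(content_overlap(original, copied))  # 1.0
print(content_overlap(original, unique))  # 0.0
```

Pages that cover different aspects of a topic in their own words produce a low overlap, while the same text published under several domains produces a value close to 1.0.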
Canonical URL problem
If you run a website, you should define a standard domain name, so that your website can be reached either with or without the www in front, but not with both; otherwise you risk duplicate content if link partners link to both versions.
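The choice of a standard domain name can be expressed as a small normalisation function, sketched here with Python's standard `urllib.parse` module. The function name `canonicalize` and the `prefer_www` parameter are hypothetical. In practice the same rule is usually enforced on the web server with a permanent redirect from the non-standard to the standard host name.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str, prefer_www: bool = True) -> str:
    """Rewrite `url` so that its host always uses the chosen standard form.

    Note: user name and password in the URL, if any, are dropped.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare_host = host[4:] if host.startswith("www.") else host
    standard_host = f"www.{bare_host}" if prefer_www else bare_host
    if parts.port:
        standard_host += f":{parts.port}"
    return urlunsplit((parts.scheme, standard_host, parts.path or "/",
                       parts.query, parts.fragment))


print(canonicalize("http://domain.de/page"))      # http://www.domain.de/page
print(canonicalize("http://www.domain.de/page"))  # http://www.domain.de/page
```

Both spellings end up at the same address, so links from partners to either version point to a single copy of the content.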
Search engines are not very good at determining where content originally appeared and who copied it. Since you hold the copyright on your content, you should use tools to prevent duplicate content problems caused by grabbers.
Link Farms and Link Spamming
The most important factor in search engine optimization is, and will most probably remain, the links that point to a webpage. As a result, many webmasters moved from on-page search engine optimization and on-page search engine spam techniques to Link Farms and Link Spamming, trying to improve the Link Popularity of their websites.
Link Popularity can be explained as the number of links pointing to your website. Your website gains more reputation if 100 links point to it than if only 10 do; the idea is based on the practice of citation in technical papers and reviews.
As a result, webmasters started to create Link Farms and link networks in order to push their own websites.
Automated systems that offered link trading to their users and auto-generated the link pages were highly popular, and they worked for many webmasters until search engines recognized the problem and started to react.
Nowadays, search engines no longer value or follow links from guestbooks or link farms. If your website only has links from link farms and other spam webpages, this can put you in a “Bad Neighbourhood”, resulting in a ban from the search engines’ index.
It is simple for search engine spiders to identify link farms by comparing the number of outgoing links, the number of incoming links (only very few link farms have incoming links, and the same goes for doorway pages), and the amount of content on a webpage.
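The three signals named above can be combined into a simple rule of thumb, sketched below. Everything specific in this sketch is an assumption made for illustration: the `PageStats` structure, the function name and both threshold values are hypothetical and do not reflect known search engine settings.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    outgoing_links: int  # links on the page that point elsewhere
    incoming_links: int  # links from other pages that point to this page
    word_count: int      # amount of text content on the page


def looks_like_link_farm(page: PageStats,
                         max_links_per_100_words: float = 20.0,
                         min_incoming_links: int = 1) -> bool:
    """Many outgoing links, little content and no incoming links.

    Both thresholds are hypothetical values for illustration only.
    """
    links_per_100_words = page.outgoing_links * 100 / max(page.word_count, 1)
    return (links_per_100_words > max_links_per_100_words
            and page.incoming_links < min_incoming_links)


farm_page = PageStats(outgoing_links=500, incoming_links=0, word_count=600)
article = PageStats(outgoing_links=15, incoming_links=40, word_count=1200)

print(looks_like_link_farm(farm_page))  # True
print(looks_like_link_farm(article))    # False
```

A normal article has few links relative to its text and is linked to by other pages, so it passes, whereas a page that consists almost entirely of outgoing links and has no incoming links is flagged.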
Summary Search Engine Spam
There are many different methods of search engine spam, and any webmaster can use them. Nevertheless, search engines can find and penalize spam methods, and they constantly improve their ability to do so. If you are not willing to risk your ranking and your listing in the search engines, you should follow the search engines’ guidelines.