How-to: De-bunk Search Engine Indexing

It's come around again. That question. How do you get a good listing in the search engines? How do the search engines index the web? Why can't I see my site? Do I need to employ one of those Search Engine Optimisation companies (short answer: NO!)?

Forget the SEO merchants: a good search ranking depends on two things, good content and a good description of that content. Will that guarantee you a spot at the top of the first page of results? No; there are no guarantees in this life and certainly not on the web. Can you push up onto that first page? Maybe. If you match precisely what users are searching for.

How search engines work

Search engines send out little programs - spiders - to crawl the web and find content. The spiders send this back home, where a set of algorithms try to work out what the content is actually about, then index it seven ways to Sunday in order to spit it out in (hopefully) relevant search results later on.
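To make the "index it, then spit it out" idea concrete, here is a toy sketch of an inverted index, the classic structure for looking up which pages contain which words. It is an illustration only and nothing like a real search engine's pipeline; the page addresses and text are made up, and the function names are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages, standing in for what a spider might bring home.
pages = {
    "example.com/grommets": "Stainless steel flange grommets for marine use",
    "example.com/about": "About our family business and our steel workshop",
}
index = build_index(pages)
print(search(index, "steel grommets"))
```

A real engine layers ranking, language analysis and much else on top; the point here is only that pages which never mention a word cannot come back for it.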

If you leave it to the search engines' best guess, you probably won't get a good ranking. If you can't describe what your site does and can't tailor that description to what your users are searching for, you certainly won't.

But that doesn't mean resorting to dirty tricks like keyword stuffing or hidden divs with long strings of keywords; that will certainly get you downgraded or suppressed in the search rankings.

Over the last couple of years, Google has made huge changes to the way it indexes websites, not just bringing back page rank but also weighting results around search terms and the proximity to the location of your IP address (not always useful) or stated site location (much more useful, especially if you are selling merchandise or pushing location-dependent content). Google has also introduced a real-time display of results as you type into the search field, revising the results as your inquiry becomes more detailed.

How do you get this to work for you? Same as you ever did.

Step One: Content is king.

It always was. Surely you don't need a reminder to produce relevant, well-edited, well-directed content focused entirely on the thing your site is about? If it's a magazine site about lots of things, then each article or post needs to be focused like a laser beam. And it better be organised using a good taxonomy of sections and topics. As many as it takes - but no more.

Keywords: of course your content needs to contain the key words that it is about. But now the search engines are parsing your content using natural language engines and have rules about the density of keywords relative to length of content. Don't think you can get away with keyword stuffing or nonsensical repetition; that's cheating and you'll get downgraded for it. Think of the context of your content: what is it and what is it immediately related to? You're looking for relevance all the time. In every page. Do not preserve 'turkey' pages with no relevance. Review old content and update or take it down if it's likely to harm the net present value of your site.
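If you want a rough feel for keyword density in your own copy, a few lines of Python will do it. This is a minimal sketch, assuming a single-word keyword and a crude definition of "word"; the search engines do not publish their thresholds, so treat the number as a sanity check on your own writing, not a target. The sample text and function name are hypothetical.

```python
import re

def keyword_density(text, keyword):
    """Return occurrences of a single-word keyword as a fraction of all words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Hypothetical sample copy: ten words, two of them "grommets".
sample = "Flange grommets in stainless steel. Our grommets fit most flanges."
print(keyword_density(sample, "grommets"))
```

If one word accounts for a large share of a page, read the page aloud; if it sounds like repetition to you, it probably looks like stuffing to a natural language engine too.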

Back to keywords: 15-30 keywords is a good guide; we'll go on to making use of these next.

Step Two: Description is queen

In two words: meta-data (or one word if you observe hyphens).

In order to get your site properly indexed by any search engine, you need to include the relevant meta data tags in your site, describing what it's about; not only in your terms - how you think it should be described - but thinking about all the ways in which the Ordinary Joe will search, using the variants of search terms that the uninitiated, the expert and the robot might search under - not just you and your buddy circle.

This is where your keywords come into play, because these are going to appear in the meta data for your site and individual pages; and those keywords better appear in your body content. Don't think that gratuitously inserting random Kardashians in your keyword meta-data is going to increase your ranking if the site is about stainless steel flange grommets. The search engines know better now.
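One way to check that your meta keywords really do appear in your body content is to scan the page yourself. The sketch below uses Python's standard html.parser module; the class name, function name and sample page are hypothetical, and the matching is a simple substring test (so 'steel' would also match 'steelworks'), which is an assumption made to keep the example short.

```python
from html.parser import HTMLParser

class PageScanner(HTMLParser):
    """Collect the keywords meta tag and the text found inside <body>."""

    def __init__(self):
        super().__init__()
        self.keywords = []
        self.body_text = []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords = [k.strip().lower() for k in content.split(",") if k.strip()]
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body_text.append(data)

def missing_keywords(html):
    """Return the meta keywords that never appear in the body text."""
    scanner = PageScanner()
    scanner.feed(html)
    body = " ".join(scanner.body_text).lower()
    return [k for k in scanner.keywords if k not in body]

# Hypothetical page with one gratuitous keyword.
page = """<html><head>
<meta name="keywords" content="flange grommets, stainless steel, Kardashians">
</head><body><p>Stainless steel flange grommets, made to order.</p></body></html>"""
print(missing_keywords(page))
```

Anything the function returns is a keyword you are claiming in the meta data but not backing up in the content: either write about it or drop it.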

Your content management system should contain plenty of slots for meta data; tags, topics, labels, 'SEO'; whatever Wordpress, Joomla, Drupal, Weebly, Wix or Modx call it, use it; often there will be multiple slots; if you're serious, use them all. Do the prep, reap the rewards.

You also need to optimise individual pages for search, because not every page is about exactly the same thing. The more relevant and specialised you can make each page, the more individual pages the search engines will index, making your site more likely to attract traffic and driving up your rankings.

Order your keywords in descending order of importance if you can, otherwise by order of appearance in the content.
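Ordering by importance is a judgment only you can make, but ordering by appearance is mechanical. A minimal sketch, assuming plain-text content and case-insensitive matching; the function name and sample data are hypothetical, and keywords that never appear are simply pushed to the end in their original order.

```python
def order_by_appearance(keywords, content):
    """Sort keywords by where each first appears in the content; absent ones go last."""
    text = content.lower()

    def position(keyword):
        found = text.find(keyword.lower())
        return found if found >= 0 else len(text)

    # sorted() is stable, so keywords with equal positions keep their given order.
    return sorted(keywords, key=position)

# Hypothetical content and keyword list.
content = "Stainless steel flange grommets for marine fittings"
keywords = ["marine", "grommets", "stainless steel", "brass"]
print(order_by_appearance(keywords, content))
```

A keyword that lands at the end because it never appears ('brass' here) is a hint that it may not belong in the list at all.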


Nothing enhances your site ranking like a network of back links from other reputable, ranking sites. We don't mean building spurious content wheels; the algorithms spot these in seconds. It's all about building (and maintaining) reputation. The kind you get by word of mouth, or in web terms, inline links to your content pages and positive reviews. Being an authoritative source, being notably entertaining or having a good product in a specific geographic area: all these things get taken into account by the search indexes. You may not own a famous brand, but your page ranking becomes your brand over the Internet. Think about the Internet as the massive network it is; if you're plugged into even a small part of that, with lots of mentions, your ranking will go up.

How do search engines see your website?

Unless you work in the algorithm department or go drinking with the programmers who do, you have no guaranteed way of knowing. As Google demonstrated, this can change over time, so even if you think you're on to a winning way to game the indexing routines, it isn't guaranteed to keep working.

The best you can do is read the guidelines given by the search engines themselves and try to stick within them; it's mostly common sense and there's no point trying to read between the lines to game the system; they employ teams of people mostly cleverer than us.

It takes between 48 hours and a week for Google to update its site listings; it then needs to re-run the algorithm and verify you're not trying to game the system. Only then does it update its indexes.

As your site begins to get traffic via searches, with visitors landing and, importantly, staying on your site, your site's value or reputation goes up and you start to accumulate ranking points which can drive you up the search results in a virtuous circle where success begets success.

Don't forget that the spiders are examining every single page of your site for relevance; you have to optimise every single page of your site. The search engines will then index as many as they can as relevant to specific topics; the aggregated relevance of all your pages then drives up the value of the site and the page rank. The search engines like to promote results they have already evaluated and ranked over wild card sites that randomly pop out of the index for a few keyword hits. It's not an ideal quality mark but given the billions of pages on the web, that's how they do it.

Why? Because the search engines are nearly all in business to monetise the web. These are mostly marketing and advertising agencies with a search engine attached. To repeat a previous article, web users are the product, not the search results; searches only make money when users click on the sponsored links; the search engines want to maximise the ratio of paid hits to general traffic. The websites in the search results are just a medium for the search engines to make money.

What the search engines don't want is for users to click on links to garbage sites, get annoyed and go somewhere else. The more searches a user makes, the more likely they are ultimately to click on a money-earning link. Search engines want content that is 'sticky' so that users keep coming back, or even set the search engine as their home page.

Back to you...

Look at your site, the pages, content and meta-data as an outsider. You know what cheats and cheesy tricks look like. So do I. So do the search engines. Don't use them.

Don't create garbage pages for the sake of it. Quantity does not make for quality.

Don't wrap everything in Adobe Flash. It's horrible, slow to load and insecure; it's opaque to the search engines and it's a dying technology. Use HTML5 if you want whizzy multi-media content. There are plenty of tools coming through to help you generate good HTML5 content.

Don't manually over-submit your site to the search engines; they will treat you as spam.

Nobody knows your content like you do. Don't hand it over to some paid SEO monkey and expect to get a quality ranking that will last. You can do better. RC

Image: Card index by klynslis via Flickr, Creative-Commons