Search Engine Optimisation: The Soon to be Impossible Dream!
Today there are a great many search engine and internet marketing services; in fact, a new industry has materialised to exploit the fear of low search rankings.
This is not a new trend. Back when simply resubmitting your website to the engines kept your site at the top of the index, there was an accompanying boom in resubmitting "companies". As we know, these were just men in back bedrooms with a host of CGI and Perl submission scripts and a timetable.
Search engine optimisation, or "SEO", is the latest incarnation of this bedroom profiteering. The important difference is that now the webmasters are not just passively involved but are being forced to adopt totally artificial and unsocial practices that ultimately serve only to damage the Internet!
SEO is supposedly the methodology and processes related to designing search engine "friendly" web content. The basic premise is something like: "If I follow all the engines' formatting and connectivity criteria, then my website will rank higher than a comparable website that does not".
All other things being equal, this seems quite positive. Given that the quality of a search engine's database (index) directly affects its output, webmasters optimising their content so that search engines can correctly categorise the Internet should logically improve the speed and quality of "the crawl".
SEO, then, should logically be good for the search providers: maintaining an efficient index should use less raw processing power, require less equipment and thus less energy. It must also be good for the users, who can quickly and intuitively find what they want from a reliable source. Sounds reasonable, right?
Well, that's the happy version. The fact is that initially this may be true and you may gain a short-term advantage, but once we have all optimised our content for analysis (and in so doing ignored our users), we will be back where we started. The search providers will just think up some even more ridiculous "laws" by which to "judge" us, and like sheep we will all follow those as well. Thus the cycle is perpetuated and the users feel abused!
Even this is a vast oversimplification; the true nature of SEO is a lot more complicated. The heart of the problem is the search providers' task, which is to strip mine the information junk yard otherwise known as the Internet. It may be full of interesting stuff, but there is also plenty of garbage, and they need to devise intelligent techniques to mine the interesting stuff!
The current "solution" is for the search engines to use their hegemonic standing to bully the webmasters into organising their work in ways that allow quick "analysis", so that the engines can categorise the website. This has the secondary effect of requiring content to be designed "for" analysis, which typically translates into highly distributed connectivity, i.e. the website being effectively divided into "micro sites", which makes the maintenance of links and content more troublesome!
This is not necessarily a bad thing; most of these imposed linking and design methodologies are positive and beneficial for a lot of subjects. My problem is that they are unilaterally enforced, and it is this type of issue that is generating all the money for the SEO boys.
However, this will soon be of no consequence. To understand the problem with this type of SEO operation, it is necessary to think about how we can approximate and simulate the human process of mining information and knowledge.
Let us assume we have set our crawlers to work, automatically indexing pages (at random, by revisiting previously indexed pages, and guided by user requests). We then format the resulting text: it is usually converted to ASCII and validated, with the search engines ignoring some tags and making use of those that help identify the content. At this point we have reduced the Internet to a corpus, i.e. the collection of all HTML documents about no particular subject.
We would then set about item normalisation, i.e. identification of tokens (words), characterisation of tokens (tagging meaning to words), and finally running stemming algorithms to remove suffixes (and/or prefixes) to derive the final database of terms. This can be efficiently and compactly represented in lower-dimensional term spaces (Google is still essentially using inverted file structures).
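The normalisation steps above, and the inverted file structure they feed, can be sketched roughly as follows. This is a minimal illustration only: the suffix list, the function names and the three sample pages are all assumptions made for the example, and a real engine would use a proper stemming algorithm rather than this crude suffix stripping.

```python
import re
from collections import defaultdict

# Illustrative suffix list only (an assumption); real stemmers are far more careful.
SUFFIXES = ("ing", "es", "ed", "s")

def tokenise(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def build_inverted_index(docs):
    """Map each stemmed term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenise(text):
            index[stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical sample pages.
docs = {
    "page1": "Growing tomatoes and potatoes",
    "page2": "Potato recipes",
    "page3": "Museum opening times",
}
index = build_inverted_index(docs)
print(index["potato"])
```

Looking up a term in such an index is a literal word match: it tells you which pages contain the term, but nothing about how the pages relate to each other.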
Imagine each document of the corpus as a point (a vector of term weights) in an N-dimensional space with one axis per term. Here the literal word-matching type of search is lost, but we acquire more of a semantic flavour, where closely related information can be grouped into clusters of documents bearing similarities. However, N-dimensional vector spaces are of no help to the users.
After applying our algorithms to the corpus, we get a term-by-document matrix, where terms and documents are represented by vectors; a query can also be represented by a vector. So we have a query and our corpus (represented as vectors with the same dimensions), and we can now match the query against all the available documents using the cosine of the angle between the query vector and each document vector.
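As a rough sketch of that matching step, the following builds a tiny term-by-document matrix from raw term counts and scores a query against each column by cosine. The sample documents, the vocabulary and the use of raw counts as weights are assumptions for illustration; real systems typically weight the entries in more elaborate ways.

```python
import math

def term_document_matrix(docs, vocabulary):
    """Rows are terms, columns are documents; entries are raw term counts."""
    return [[doc.count(term) for doc in docs] for term in vocabulary]

def column(matrix, j):
    """Extract document j as a vector over the vocabulary."""
    return [row[j] for row in matrix]

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical documents, already tokenised and stemmed.
docs = [
    ["tomato", "potato", "tomato"],
    ["potato", "pepper"],
    ["museum", "history"],
]
vocabulary = sorted({term for doc in docs for term in doc})
matrix = term_document_matrix(docs, vocabulary)

# The query "tomato potato" expressed in the same term space as the documents.
query_terms = ["tomato", "potato"]
query = [query_terms.count(term) for term in vocabulary]

scores = [cosine(query, column(matrix, j)) for j in range(len(docs))]
print(scores)
```

A cosine near 1 means the document points in almost the same direction as the query; a cosine of 0 means they share no terms at all.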
But we now have a new artificial "problem". We know the general answer to the question "which websites best match my search terms?": this information now exists in our mathematical object at a high level of abstraction, i.e. the cosines of the angles between every document vector and the query vector. This is expressed as a vector whose entries correspond to the columns of the matrix, and therefore to the documents we are after. All we need do is present this to the user, right? Well....
The issue is that a search engine needs to generate a linear index, i.e. convert the vectors corresponding to the smallest angles (the largest cosines) into a human-readable format. Until such time as someone thinks of a better way to do it, all engines output lists. Like your shopping list, a results list has a start, a middle and an end, and therein lies the problem: how to order the list!
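The simplest possible ordering is to sort by similarity score, best first. The sketch below does just that; the scores, the page names and the tie-break rule are invented for illustration and are not any engine's actual method.

```python
def rank(scores, top_n=10):
    """Return (doc_id, score) pairs, best match first; ties broken by doc id."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_n]

# Hypothetical cosine scores for four pages against one query.
scores = {"page1": 0.95, "page2": 0.50, "page3": 0.0, "page4": 0.50}
results = rank(scores, top_n=3)
print(results)
```

Note that page2 and page4 tie, so something other than the score has to decide which comes first. Here it is an arbitrary alphabetical rule, which is exactly the kind of gap the engines fill with extra ranking criteria.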
The hypothesis seems simple: order information that might look chaotic at first, using the fact that closely associated documents tend to be relevant to similar requests. However, the Internet (being a scale-free network) is so vast that it is not currently practical to pick out a feature space containing the x documents closest, by ordinary Euclidean distance, to the convergence point of a given cluster. Yet this is exactly what should be presented to the user, in a more intelligible (semantic) display.
The engines could just present the returns as produced by the matching algorithms after decomposition. A document grouped using probabilistic/fuzzy patterns taken directly from the clusters might belong to more than one class, but a strength (degree of membership) value measured on a scale, using a probability on the [0,1] interval, is quite adequate for ordering.
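One way to picture a degree of membership on the [0,1] interval is the membership formula from fuzzy c-means clustering, used here purely as a stand-in since no particular formula is specified above. The cluster centres, the document position and the fuzziness value of 2 are assumptions for the example.

```python
import math

def euclidean(p, q):
    """Ordinary Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def memberships(point, centres, fuzziness=2.0):
    """Degree of membership of `point` in each cluster; values lie in [0, 1] and sum to 1."""
    distances = [euclidean(point, centre) for centre in centres]
    # A point sitting exactly on a centre belongs wholly to that cluster.
    if 0.0 in distances:
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (fuzziness - 1.0)
    weights = [d ** -exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

# Two hypothetical cluster centres and one document in a 2-dimensional space.
centres = [(0.0, 0.0), (4.0, 0.0)]
document = (1.0, 0.0)
degrees = memberships(document, centres)
print(degrees)
```

The document belongs mostly to the nearer cluster but keeps a small membership in the other, so it could legitimately appear under both classes with different strengths.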
The reason singular value decomposition works for ordering is that when two terms (say tomato and potato) occur very frequently, this is reflected in the term-by-document matrix, which shows that only x of the n terms are used very often.
The idea is that since a term such as pepper is mentioned very little, its axis/dimension does not affect the search space much, making the space flat and relevant only in the other two dimensions.
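The tomato, potato and pepper example can be sketched numerically. Rather than a full singular value decomposition, the code below uses power iteration to find only the largest singular value and its vectors, which is enough to show how little the pepper axis contributes. The term counts are invented for the illustration.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def normalise(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dominant_singular_triple(a, iterations=200):
    """Power iteration on A^T A: returns (sigma, u, v) for the largest singular value."""
    at = transpose(a)
    v = normalise([1.0] * len(a[0]))
    for _ in range(iterations):
        v = normalise(matvec(at, matvec(a, v)))
    av = matvec(a, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    return sigma, u, v

# Hypothetical term-by-document counts: rows are terms, columns are documents.
terms = ["tomato", "potato", "pepper"]
counts = [
    [3, 2, 4],  # tomato: used often
    [2, 3, 3],  # potato: used often
    [0, 1, 0],  # pepper: hardly mentioned
]
sigma, u, v = dominant_singular_triple(counts)
for term, weight in zip(terms, u):
    print(term, round(weight, 3))
```

The weight on the pepper axis comes out close to zero, so dropping that dimension loses very little: the space is effectively flat in that direction.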
However, the engines' demonic creators can't do this because they are still essentially using an inverted file structure. They still want absolute correctness in their indexes and returned results, which means trouble, because this assumes that your index is perfect and incapable of being manipulated, and that you can somehow order the returns in a meaningful way!
So the returned results can't generally represent the documents that match semantically, and we now need to account for some subjective quantities that cannot be derived directly from the corpus. The engines attempt to deal with this using a cocktail of criteria that rank the returns in such a way that it is more likely that the "better" results are closer to the top of the list.
There are many ways of doing this. The current trend is to use inference about the quality of websites where possible, because such quantities are beyond the direct control of the content creators and the webmasters.
PageRank provides a more sophisticated form of citation counting, embodied in the concept of link analysis: it assigns each page a relative importance value based on the pages that cite (link to) it, with each citation weighted by the importance of the citing page and shared among that page's outgoing links.
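The idea of link analysis can be sketched with the textbook power-iteration form of PageRank. This is the published basic algorithm, not whatever the engines actually run today; the damping value of 0.85 is the conventional choice, and the four-page link graph is made up for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to; returns page -> rank."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its rank equally among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over every page.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical link graph.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

The page nobody links to ends up at the bottom however good its content is, which is why links became something worth manipulating.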
PageRank is currently one of the main ways to determine who gets to the top of the listings, but this will all become irrelevant when the engines stop using inverted file structures, because they will be able to use the grouping generated from probabilistic/fuzzy patterns, ordered by ordinary Euclidean distance from the convergence point of a given cluster.
When the changeover from inverted file structures occurs, there will be two direct consequences:
1) The corpora will be capable of holding vastly more representative and more detailed data than is currently possible.
2) The corpora will no longer be indexed as is currently done; they will embody semantic meaning and value, so that some subjective quantities can be derived directly from the corpora without the need for cocktails or totally artificial rules.
The effect is that corpora will be more accurate and incapable of manipulation, thus variations of SEO that involve indirect manipulation of the index will become pointless overnight.
It is worth noting that the search providers are becoming increasingly pessimistic about website promotion in all forms. They currently penalise many things that can affect the results, such as duplicated content (which can be perfectly legitimate) and satellite sites, i.e. one webmaster interlinking seemingly separate but highly relevant websites.
They may well start penalising webmasters who promote their websites through articles they submit for third-party distribution, as they already do people who post their site's information to bulletin boards!
Being banned from the top search engines can effectively destroy your business, if not directly through loss of visibility then indirectly, in that people tend to judge you on whether you are organised enough to be listed!
The criteria are continually changing as the amoral SEO boys attempt to pervert the results. These "laws" are not always clear and there are no appeals; we are all subject to the providers upending a drum and dispensing swift and hard "judgements" that can doom us at any time!
The part that irks the most is that as the indexes converge (Google's index is used directly by 2 of the 3 top engines, and 5 others use it indirectly for their rankings), a ban by any one of these engines is enforced by them all.
© I am the website administrator of the Wandle Industrial Museum (http://www.wandle.org), established in 1983 by local people determined to ensure that the history of the valley was no longer neglected, and to enhance awareness of its heritage for the use and benefit of the community.