Comments on: Is it time to abandon Google search for docs?
Over the last year, during the major rework of both the REBOL website and its documentation (hundreds of pages), we selected Google as the search engine for the site. However, Google has been doing a poor job, for reasons we haven't identified.
Let me give you an example:
Today, while working on R3 Docs, I searched for the construct function page. Google displayed four results. None of them were the Construct Function page (which has been there for more than a year, so Google's had plenty of time to index it.) Similarly, when I search for "compose" I don't see the compose function listed.
This is a problem. Many of us depend on the web site's search feature to find the pages we need. Note: I'm not trying to bad-mouth Google here; I just need it to work really well for our web pages.
Those of you with admin access to the R3 Docs know that you can use a REBOL-based search function from the admin page. It uses the same basic search code as this blog, which is by no means fancy, but it tends to give good results. Searching for "construct", I get:
- Functions: construct
- Datatypes: Char!
- Functions: make
- Guide: Basic syntax
The top result is what I was seeking.
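To illustrate why even a plain search can rank well here, below is a minimal sketch in Python. It is not the REBOL search code used on this site, which isn't shown in this post. It assumes a simple rule (a match in the page title outranks any number of matches in the body), and the `pages` list is hypothetical sample data rather than the real R3 Docs content.

```python
import re

def search(pages, query):
    """Rank pages for a one-word query: title matches first, then by body hit count."""
    term = query.lower()
    results = []
    for title, body in pages:
        title_hit = term in title.lower()
        # Count whole-word occurrences of the term in the body text.
        body_hits = len(re.findall(r"\b" + re.escape(term) + r"\b", body.lower()))
        if title_hit or body_hits:
            results.append((title_hit, body_hits, title))
    # Sort so that title matches come first, then pages with more body hits.
    results.sort(key=lambda r: (r[0], r[1]), reverse=True)
    return [title for _, _, title in results]

# Hypothetical page snippets, not the actual R3 Docs content.
pages = [
    ("Guide: Basic syntax", "Blocks can construct nested values."),
    ("Functions: make", "Use make to construct a new value. See also construct."),
    ("Functions: construct", "Creates an object without evaluating its spec."),
    ("Functions: compose", "Evaluates expressions inside parens."),
]

print(search(pages, "construct"))
```

With this sample data, the page whose title contains the search term comes out on top, followed by pages that only mention it in their text.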
So, we've got ourselves a bit of a conundrum: figure out what's wrong with Google (or perhaps the way we're presenting the pages to it), drop it as the primary search engine and find an alternative (how's RIX doing these days?), or use our own code.
Suggestions?
PS: Adding this blog with the above link may cause construct to become better indexed in Google, but you can still find various R3 functions that are mysteriously missing from Google search.
Update
Tried a few other engines to search for construct. Here's a summary of results (in alpha order):
- Alltheweb - found the R3 page and a few related pages.
- AltaVista - found the R3 page and a few related pages, same as above.
- Bing - found the R3 page and a few related pages, same as above (surprised!)
- Cuil - found the R3 page (top result) but very few related pages (derails totally by third result.)
- Gigablast - horrible.
- Google - failed to find the R3 page
- Lycos - failed to find the R3 page
- RIX (REBOL specific) - failed to find the R3 page
- Yahoo - failed to find the R3 page
Because this website is easy to modify, we'll test out the Bing search engine for a while. Let's see if it does better.
5 Comments:
Oldes 27-Apr-2010 1:44:51 |
First of all you should provide a sitemap. It's the best way to let the search engines know what is important and what is new (without need to crawl all the pages again and again).
Here is some how to:
http://www.google.com/support/webmasters/bin/topic.py?topic=8476
It's easy, it's just a simple XML file. Here is some more reading:
http://en.wikipedia.org/wiki/Sitemap
Oldes 27-Apr-2010 1:53:14 |
And if I can add one more note: I would choose reverse order in the page TITLE. For example instead of:
REBOL 3 Functions: construct
I would choose something like:
Construct / Functions / REBOL 3
The reason is, that if you have a lot of tabs opened in the browser, you see just the beginning of the title. If I have opened more REBOL related tabs, I see just REBOL many times.
Just go and check Wikipedia - notice, that they don't put word Wikipedia at the beginning of the title as well.
Oldes 27-Apr-2010 2:21:44 |
Also we have a very nice service related to SEO optimization. Check this:
http://seo-servis.cz/source-zdrojovy-kod/3379678
It's only in the Czech language, but you can see what the main problem is: there is no description, no keywords, no sitemap, and no robots meta tag in the header, for example:
Oldes 27-Apr-2010 2:35:11 |
The empty code above is:
<meta name="robots" content="index, follow">
Carl Sassenrath 27-Apr-2010 19:04:19 |
Hi Oldes, yes, there's always been a sitemap for those doc pages, see http://www.rebol.com/r3/docs/sitemap.xml -- and Google knows about it... but, like so many other myths-of-search, it really makes "no difference". Proof: Bing finds all the pages just fine, and it doesn't even know the sitemap's url.
Also, it's another myth that modern search engines use the HTML metatags. They haven't for many years. Reason: those tags are not "trustworthy" and the SE's find the actual content of pages to be more accurate. However, Google will use the description line for summary text in its results, but we only do that on the primary page.
Regarding title... perhaps we should, but then it's also reversed in the search engine summary page, which isn't a good thing. So, that's a bit of a trade off.
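As background for the sitemap discussion in the comments above: a sitemap is a short XML file listing page URLs, optionally with a last-modified date for each. The following is a minimal sketch of generating one in Python, using the namespace defined by the sitemaps.org protocol. The URLs and dates are hypothetical placeholders, and this is not how the existing sitemap.xml for the docs is produced.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML (as a string) for (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        ET.SubElement(node, "lastmod").text = lastmod  # date in YYYY-MM-DD form
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical entries; these paths and dates are placeholders.
entries = [
    ("http://www.example.com/docs/functions/construct.html", "2010-04-27"),
    ("http://www.example.com/docs/functions/compose.html", "2010-04-27"),
]

print(build_sitemap(entries))
```

Whether a search engine actually acts on such a file is a separate question, as the comment above points out.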