Newsgroups: rec.arts.books.tolkien
From: Michael@xenite.org (Michael Martinez)
Subject: OT: Alta Vista and Search Engine Research (was Re: LOTR/HOBBIT Movie Fact/Rumor Roundup updated)
Organization: The Xenite.Org Domain -- Worlds of Imagination on the Net
Distribution: world
Message-ID: <7rjh51$348_008@Org.xenite.org>
References: <7qcmlt$1vk_004@Org.xenite.org> <7rdrd8$uc_026@Org.xenite.org> <7rfntj$2u0_012@Org.xenite.org>
X-Newsreader: News Xpress 2.01
Lines: 109
Date: Mon, 13 Sep 1999 18:52:17 GMT
NNTP-Posting-Host: 207.224.149.183
X-Trace: news.uswest.net 937248845 207.224.149.183 (Mon, 13 Sep 1999 13:54:05 CDT)
NNTP-Posting-Date: Mon, 13 Sep 1999 13:54:05 CDT
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news1.sunrise.ch!news.imp.ch!uni-erlangen.de!news.belnet.be!carrier1.net!newspeer.clara.net!news.clara.net!newsfeed.icl.net!logbridge.uoregon.edu!news-out.uswest.net!news.uswest.net!Xenite

In article , s.souter@edfac.usyd.edu.au (Stephen Souter) wrote:
>In article <7rfntj$2u0_012@Org.xenite.org>, Michael@xenite.org (Michael
>Martinez) wrote:
>> >Dunno about *official* sites, but then how many official *Tolkien* sites
>> >are there out on the web? :)
>>
>> 1
>
>I suppose you're referring to www.lordoftherings.com.
>
>That may be the official LotR *movie* site, but hardly an official
>*Tolkien* site. :)

No, I wasn't referring to that at all, although I suppose that could raise
the count to 2 if we're going to allow official movie sites. Technically,
however, the only official Tolkien site I know of is Tolkien Enterprises'
site. They literally own the rights to movies and merchandising, and they
can and do control trademarks. Run afoul of their good will (and they seem
to have been highly tolerant or ignorant of the online fannish presence),
and one can get shut down quickly, I'm sure.

>> I wouldn't be surprised to find more Web sites for "Alien" and its
>> successors than for "The Fifth Element", but Alta Vista not only uses four
>> databases, its result sets have turned flaky of late. I still prefer it
>> above other search engines, but I don't trust its results the way I used
>> to.
>
>I don't understand what you mean by "four databases". Altavista is not a
>metasearch like (say) SavvySearch (which pushes your search criteria
>through multiple search engines).

First of all, I know relatively little about the internal workings (or, as
the gurus might say, the infernal workings) of Alta Vista. But I've been
lurking on Virtual Promote's search engine forums, and the people there
who have analyzed Alta Vista's behavior say they have identified four
separate databases.

I have been able to confirm their findings by performing a simple test. Go
to Alta Vista, type in a search phrase, and count the hit results. Wait a
few minutes, hit SEARCH again, and compare the hit results to the previous
search. Repeat the process every few minutes for about 15-20 minutes (I
know, that's boring for most people, but if you're interested in search
engine positioning for your Web sites, it's a valuable 15-20 minute
lesson). Basically, you'll find the hit counts keep switching among four
different values, which is to say there are four databases.

>As to the "flaky" comment, if you're referring to the way it (amongst
>other things) now changes the number of hits it retrieves midway through a
>search, that is (believe it or not) a feature!

Well, that's a part of it, but by "flaky" I mean a lot of the "top ten"
hits have absolutely nothing to do with the search phrase. Observers are
suggesting this is due to "cloaking", which I'm still not entirely clear
on. I think "cloaking" is the practice of running a CGI script that
intercepts page calls to a Web server and submits specially prepared text
when it detects a spider is making the request. If that's the case, people
like me who use virtual Web-hosting services cannot "cloak" our pages.
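
To make that guess about "cloaking" concrete, here is a minimal sketch in
Python of what such a script might do. It is only an illustration of the
idea as described above, not anybody's actual code. It assumes the spider
is recognized by the User-Agent string the server passes to a CGI script
(a real cloaking script might look at something else, such as the
requester's address), and the spider names, page text, and function names
are all made up for the example.

```python
import os

# Hypothetical substrings used to recognize a spider's User-Agent.
# The list is illustrative only.
SPIDER_SIGNATURES = ("scooter", "slurp", "crawler", "spider")

# Placeholder pages: one for human visitors, one prepared for spiders.
REAL_PAGE = "<html><body>The page human visitors see.</body></html>"
SPIDER_PAGE = "<html><body>Text specially prepared for spiders.</body></html>"


def is_spider(user_agent):
    """Return True if the User-Agent looks like a search engine spider."""
    agent = (user_agent or "").lower()
    return any(signature in agent for signature in SPIDER_SIGNATURES)


def choose_page(user_agent):
    """Pick which page to send, depending on who appears to be asking."""
    return SPIDER_PAGE if is_spider(user_agent) else REAL_PAGE


def main():
    # A CGI script receives the requester's User-Agent header in the
    # HTTP_USER_AGENT environment variable.
    print("Content-Type: text/html")
    print()
    print(choose_page(os.environ.get("HTTP_USER_AGENT")))


if __name__ == "__main__":
    main()
```

The point of the sketch is that the script has to sit in front of every
page request, so it only works where you are allowed to run your own
scripts on the server.
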
If it's not the case, then I'm kind of lost. Anyway, a lot of commercial
sites are apparently "cloaking" their real pages so as to get better
rankings on some of the more popular search engines. And any time someone
complains about unrelated results showing up on Alta Vista (at the Virtual
Promote search engine forums), the usual glib reply is, "They are probably
cloaking".

>I was startled the first time it started doing that to me. I was even more
>startled when some of the pages of hits would up and vanish right in front
>of me (especially if I left them sitting for a while while I went off to
>investigate some of the pages it had listed). So eighteen months or so ago
>I emailed a query off to Altavista Support. This was the answer I got
>back:
>
>   "the results from page to page change all the time. First it brings
>   up all documents in the search. Then it narrows it down to the exact
>   search."
>
>What was meant by "narrows...down" was not made entirely clear. Possibly
>the person was referring to the way Altavista will (now) often bring up
>multiple instances of the same page. What I have noticed, though, is that
>it will sometimes (maybe more than "sometimes", except it's hard to catch
>it in the act) winnow out pages I would judge to be quite valid hits (ie
>within the terms of the search criteria I'd given it).

They also expire result sessions. This is necessary because they cache
your search for about 2 hours (maybe less, now) so they can let you go
back and forth between the results pages. However, I usually type in very
specific search criteria. I can often get fewer than 50 hits on a search,
but not consistently.

>This appeared to me to be a case of Altavista trying to be too clever by
>half! But the (lengthy and rather annoyed) complaint I sent off to
>Altavista Support about this never got a reply. :(

Well, the guys analyzing what Alta Vista is doing say they change the
algorithm every few months.
And they have also concluded that a lot of the major search engines
performed massive purges over the summer. I'm not sure what's up with
that. Some theories have been floated, but I don't believe anyone outside
the search engine companies really knows what's going on.

-- 
\\  // Worlds of Imagination on the Web               info@xenite.org
 \\//  FREE! Watch Internet TV shows at Xenite.Org!
 //\\  [http://www.xenite.org/index.htm]
//  \\ENITE.org...............................................