~ Main search engines ~
Updated August 2005, version 1.49
    
SEARCH ESSAYS OF CHOICE
Powerbrowsing ~ Polylinguistic search ~ Inktomi's search syntax ~ effective queries ~ Search Engines Anti-Optimization ~ Fishing for troubles ~ Music searching ~ Catching the rabbit's ears ~ When your search fails ~ Follow Links in the Underground ~ Google's wild side ~ Using Fuzzy Logic ~ A Re-ranking trilogy ~ Searching scarcity ~

Back to Portal   ~    Library   ~   Bk:flange of myth  ~  
pda searches    low band searches (good for GPRS)


INTRODUCTION 
(read this)


Fravia's searching MAPA (masks and pages)   
Best s.e.
Ando (cached)
Google (cached) 
 ¤[GOOGLE]¤ (cached)
Fast  
 ¤[FAST]¤
Teoma  
 ¤[TEOMA]¤
Best/Main s.e.
Inktomi
¤[INKTOMI]¤
Hotbot
¤[HOTBOT]¤
A 9 (cached)
MSNsearch (cached)
Yahoo!
Alta!
Adva  Simple
Useful s.e.
Openfind stale :-(
Baidu (cached)
Gigablast (cached)
IceRocket (webarchive)
Furl (webarchive)
Auxil s.e.
Kart00 (graph)
Touch (graph)
Lycos
Ouverture
Looksmart
Excite (ill)
Other
Entireweb
Wayback (past)
Factbite (ency)
Wisenut
Exalead
[FTPSEARCH]
@ PHP
¤[Our searching scrolls!]¤
[600 engines for next to nothing]
@ fravia's
Targets
Local
Regional
Compound
Usenet
Accmail
@ fravia's
Live searches
Page Providers
Combing
Details
Databases
Allinones
@ fravia's
Images
Books
Laws
Files
Filez
Passwords
Completely bogus & crap
look.com
2020search
tygo.com


Quick forms
   
 Always 100 rez, safe off
    
  fastsearching for:   100 rez
   
Find this Phrase 100 rez
   
  Use the sliders! e.g. {frsh=86} {popl=13} {mtch=80}  
   
Da biggest index?



Instructions & Caveats

This is for sure one of the most important parts of searchlores, and you would be well advised to try some of the incredibly powerful search engines listed below. If you limit yourself to google (which in november 2004 suddenly doubled its index to 8 billion documents) or to Yahoo (which in July 2005 claimed to have indexed almost 20 billion documents) you'll cover less than one half of the visible web (and not even 1/100 of the hidden one).

Just copy this page onto your harddisk as c:\main.htm (or whatever), and then bookmark it there and use it (after having edited or thrown away anything you fancy) in order to perform effective searches on the web using any main search engine and starting from an unpolluted jumping off place, a page that has as few frills as possible and as many useful forms as we know of. A page that you can modify -and ameliorate- yourself (feedback, in that case, would be appreciated).

The main reason you should use more than one main search engine is that search engines' results overlap FAR less than you would think. Recent studies point out that around 3/4 of the results of a given search are UNIQUE for each search engine.
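
You can check the overlap claim yourself. Here is a minimal sketch, assuming you have already saved the result URLs that two engines returned for the same query; the function name and the sample URLs are made up for illustration and come from no real search.

```python
def unique_share(results_a, results_b):
    """Return the fraction of results_a that does not appear in results_b."""
    set_a = set(results_a)
    set_b = set(results_b)
    if not set_a:
        return 0.0
    return len(set_a - set_b) / len(set_a)


# Hypothetical result lists for the same query on two engines
engine_one = ["http://a.example/1", "http://b.example/2",
              "http://c.example/3", "http://d.example/4"]
engine_two = ["http://a.example/1", "http://x.example/9",
              "http://y.example/8", "http://z.example/7"]

share = unique_share(engine_one, engine_two)
print(f"{share:.0%} of engine one's results are unique to it")
```

Run it in both directions (swap the two lists), since the share of unique results is not symmetric when one engine returns more results than the other.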

Remember that search engines list only the first part of any BIG DOCUMENT: the size varies.
Google had a famous limit of 101K, which was abolished in January 2005; the new limit should be around 150K. These limits are very annoying when dealing with large documents (or on-line books).

Note also that just because one, a hundred, or a thousand pages from a given site have been crawled and made searchable through one of the main search engines, this does not guarantee that every page of that site has really been crawled and indexed. This shortcoming hits not only 'new' pages, which can take MONTHS to be indexed: beehives of spiders harvesting a site often MISS whole subdirectories, old and new. Useful material may be all but invisible to those who only use 'main' search tools to seek. Moreover, anyone who regularly uses google (for instance, but other search engines are not that different) will have noticed how polluted by commercial sites the results are nowadays. Should a search engine introduce a new, simple "please hide all commercial sites from your SERPs" (Search Engine Result Pages) option, or switch, or slider, it would probably become king of the hill in a couple of months.

Therefore, given the commercially oriented pollution of the web, you would be well advised to use regional engines, usenet and other specialized or targeted search tools and combing techniques, and also to rely on your own bots, when searching your various targets.

Note that you can also easily search and find targets that do not exist any more :-)


A useful tool to compare results in google and yahoo:
http://www.langreiter.com/exec/yahoo-vs-google.html?q=searchlores


SEARCH ENGINES FORMS
(Use the MAPA to navigate)



ALTAVISTA ADVANCED SEARCH [Only 400 results viewable]
AND,OR,(),NOT,NEAR,",*
link:text (search for links to 'text')
anchor:text (search for links with the description 'text')
url:text (search for given text in the url)
domain:targetdomain (search files within 'targetdomain')
host:hostname (search files on 'hostname')
title:text (search 'text' inside the title tags)
applet:text (search Java applets named 'text')
image:filename (search images with such 'filename')
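
The field operators and boolean connectors above can be combined mechanically. What follows is only a sketch of how one might assemble such a query string before pasting it into the boolean form: the helper names `field` and `boolean` are invented here, and the example terms are arbitrary. The URL-encoded version is what you would put after a `q=`-style parameter, should you script the request yourself.

```python
from urllib.parse import quote_plus

# Field operators as listed above
FIELDS = {"link", "anchor", "url", "domain", "host", "title", "applet", "image"}
# Connectors taken from the operator line above (NOT is left out of this sketch)
CONNECTORS = {"AND", "OR", "NEAR"}


def field(name, text):
    """Build a single field-restricted term, e.g. anchor:kafka."""
    if name not in FIELDS:
        raise ValueError(f"unknown field operator: {name}")
    return f"{name}:{text}"


def boolean(op, *terms):
    """Join terms with one connector and wrap them in parentheses for nesting."""
    if op not in CONNECTORS:
        raise ValueError(f"unknown connector: {op}")
    return "(" + f" {op} ".join(terms) + ")"


# Pages linked with the anchor text 'kafka', inside either of two domains
query = boolean("AND",
                field("anchor", "kafka"),
                boolean("OR", field("domain", "de"), field("domain", "cz")))
print(query)
print(quote_plus(query))  # URL-encoded form of the same query
```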

Read the Altavista in depth page!
Spammed as if there were no tomorrow & very badly commercialized.
The idiots behind altavista's marketing managed to ruin the best search engine of the middle nineties.
It is still THE ONLY search engine which is TRULY BOOLEAN, hence offering truly amazing opportunities to real seekers... once you have taken care yourself of the spam.

Altavista algos' main drawback is that they are very easy to spam, so you'll get mostly useless results in the first 20-30 positions: "hic alta, hic salta" (a seekers' proverb)... experienced searchers mostly jump directly into the middle of altavista's results lists.
Altavista is the 'dead links champion' among the 'main' search engines. Use the Simple search (which defaults to OR) ONLY if you really know what you are doing :-)


Boolean query: 

            Sort by:

        Language:          Show one result per Web site

                From:     To:   (e.g. 31/12/99)

Simple search - Graphic Version


ALTAVISTA SIMPLE SEARCH [Only 400 results viewable]
For boolean operators, and more info, use Advanced Altavista instead!

Ask AltaVista a question.  Or enter a few words in

search refine

Search - Advanced




Altavista's ad hoc strings

One of Altavista's most SPECIFIC features is the anchor: operator, which will allow patient searchers to find relevant pages through the anchor tag.
For instance: anchor:snowflakes or anchor:posette or anchor:beria or anchor:kafka will give you a series of noise reduction arrows...
of course you can extend the trick to whatever...
anchor:warez or anchor:gamez or anchor:whatever :-)


Kart00
A "Graphical" search engine, rather interesting result clusters.
Here follows the text search form, but by all means try the cartographic interface

Worldwide web   English web  
more options    To use the best of KartOO, try the cartographic interface.


Try Openfind
Staggering results... once upon a time... now stale and blocked
"MySearch" let users register the terms they were interested in. The system would automatically search the new-page database every day and notify the user whenever any of the registered queries matched. A What's new system was (once upon a time) also provided.



All Pages English Pages


Baidu
The powerful chinese Google alternative... with CACHE!
"...the world's second largest independent search engine..."



BAIDU ADVANCED
DIQU BAIDU (regional)


Looksmart ~ For instance: searchlores
Quite commercial oriented... powered by Inktomi... but uses its own databases!
Search for    

IceRocket

(a compound engine with some own and blog results)
IceRocket uses innovative metasearch technology to search the Internet's top search engines, including WiseNut, Yahoo, MSN, Teoma, Altavista, Alltheweb, Lycos, and many more.... Based in Dallas, so beware :-)

Search the Web:

Furl

(hard to say if this is useful or not)
"Save, search and share your Personal Web. Furl it"
"Furl saves a personal copy of any page on the Web and lets you to find it again instantly, from any computer. Share the sites you find, and discover useful new sites. Become a member to start building your Personal Web"

Fact is you can use some of the 'comments' this s.e. will dig.

Search for  

The Entireweb
This is -for some queries- a very useful search engine, check it!

   Advanced
 Preferences

 
 

The Wayback machine
This is not only a -powerful- search engine, but also an incredible stalking tool! Explore the Net as it was!


YAHOO [Only 677 results viewable]
",*

Yahoo recognized the tragic mistake of going commercial and went 'back to basic' in late 2002 (better late than never); it seems to be gaining momentum as part of the inktomi factories :-)
Note that yahoo recently bought the wondrous fast/alltheweb search engine (and promptly killed it :-( )
Yahoo is now one of the three "big players" (google, MSN and Yahoo) and claims to index 19 billion documents (against google's 8 billion).

Advanced Yahoo search


Note that there are some direct addresses for yahoo (see google's UF, point 14), for instance: http://216.109.117.135/search.
There is an interesting "MSN alike" slider tool you should be aware of: Yahoo Mindset, try for instance fravia
Yet, strangely enough, this does not seem to work for "caravaggio": http://mindset.research.yahoo.com/search.php?p=caravaggio. Why? :-)
EXCITE [Only 4011 results viewable]
AND,OR,(),NOT,,",
Excite is a classic example of just another 'ignoble corporate merger'. Just click on the link above and look at it! See? Idiotic & useless, obsolete (late-nineties) 'portal' approach. As a consequence it ceased to be a major player in January 2002 when Infospace killed it by injecting tons of paid search results. This applies to all mergers btw: attempts to escape the fate of all pyramid schemes, which always forebode catastrophes. Recently the Italians and Germans at Tiscali have tried to revamp this engine on the sunset boulevard. It is still full of pay-per-click crap, so no one in his right mind uses it.


 Web Search 
exclude words 
search in 

excite image search (powered by fast)

 Image Search 
Format  ALL  JPEG  GIF  BMP 
Type  ALL  COLOR  B/W  LINE ART 

Visit the ad hoc GOOGLE page
WARNING: Google has been moved to its specific page, where you will find a wealth of information. Here only a few masks:

Simple Google


        
Advanced GOOGLE
(only 3% of users take advantage of it, poor 97% zombies :-)

G. scholar  ~  G. Univ search  ~  G. Classical :-)

and a nice "GoogleRanking" bookmarklet: internet+searching


Googlette:


On 10/NOV/2004, probably as a counter to Microsoft's MSN new beta "super" search, google *doubled* its indexed pages, claiming now a total of up to 8 billion pages, which should correspond, approximately, to 1/4 of the web (around 35 billion pages according to our own data). One wonders where they hid all these billions of pages until november 2004 :-)

LYCOS [As many results viewable as you get!]
AND,OR,(),NOT,NEAR,",

"Part Man, Part Machine" ~ Open Directory & DMOZ used. Uses especially Fast's index, with updates at greater intervals than FAST. Major sin: Has closed the VERY useful Trondheim ftp-search facility.

Lycos advanced: fields    Lycos advanced: language    Lycos advanced: link referrals
Lycos help page
Gigablast
Most recent search engine, quite good, it seems. HEY! It has a cache, like Google!
Search for...
all of these words
this exact phrase
and this exact phrase
any of these words
none of these words
Sort by date
Restrict to this Site
Restrict to this URL
Pages that link to this URL
Site Clustering yes   no
Number of summary excerpts 0   1   2   3   4
Results per Page 10   20   30   40   50


TOUCHGRAPH

A graphical map of incoming and outgoing links, still in beta, uses google.
http://www.touchgraph.com/TGGoogleBrowser.html

FACTBITES

Factbites, a quite interesting Australian aggregator, more encyclopedia than search engine

Enter topic:  


INKTOMI RAW

Mighty 'raw' access to Inktomi's data

(pointed out by Shally)

Search @ http://169.207.238.189/search.cfm...



Visit the ad hoc HOTBOT section

WARNING: Hotbot has been moved to its specific page, where you will find a wealth of information. Here only the mask:

or use some...
...ADVANCED WEB FILTERS FOR HOTBOT
Language 
 
Limit results to a specific language
Domain/Site 
Include    Exclude
  
Return results in specific domain (e.g. wired.com) or top-level domains (e.g. .gov). Multiple domains/sites may be specified, separated by a comma.
Region 
Limit results to a specific continent or country.
Word Filter 
  
  
  
Limit results to pages containing/excluding the words specified.

Limit your query to specific parts of pages.
Date 
or on
Limit results to pages published within a specified period of time.
Page Content 
Audio       MS PowerPoint   Shockwave/Flash
Image       MS Word   Video
Java       PDF (Acrobat)   WinMedia
MP3       RealAudio/Video
MS Excel       Script
Specific Extension:        (e.g. .gif)
Return only pages containing the specified media types or technologies
Block Offensive Content 
Always
Sometimes (for non-ambiguous, offensive content)
Never
Prevent pages containing offensive content from being returned.


A9
The powerful Amazon Google alternative... with RESULTS + IMAGES + Books extracts!

 
Advanced Search


Visit the ad hoc FAST section
WARNING: Fast is being killed by Yahoo¡ (March 2004)

Fast knowledge has been moved to its specific page, where you will find a wealth of information. Here only the mask:
Search for:  
After having chosen the "boolean expression" option, you can use AND, OR, ANDNOT, and parentheses (!!) for nesting

FAST's [Advanced Search] is truly amazing and second to none.

FAST's [help] FAST's [news] FAST's [pictures]

ANDO

Pointed out by Nemo
AndoSearch, at Alexa, tries to have exactly the same query syntax as google's, the biggest difference is the field restriction options.
It has a wealth of parameters and a huge database (modify the stop word threshhold to 100 and count to 1000 for instance... slow but fine!)
Query :

Time-out (milliseconds) :
Max results per host :

XML :
HTML :

Count Only :

Get Context :

Fast Relevance :
Faster Relevance :
Slow Relevance : Limit
Make Phrase :
Debug :
Return DMOZ Categories :
AGrep Args
Adultfilter :
Crawl Mask :

stop word threshhold (percent) :
Shingle Threshold :
Fluff Threshold :

Start : Count : Summary : Max Per Server :
Save To Disk :

Relevance Parameters A: B: C: D:

Words : Min Occ: Max%: Max Ret: Max Tot: Max Uniq: Min Len:

AndoSearch Query Syntax. AndoSearch Parameter Help.

OUVERTURE

This used to be "Go To", the commercial clowns changed the name because this "reinforces our leadership in performance-based search", haha :-) Uses Inktomi, like Hotbot. Ranks results by how much a company is willing to pay for listings and is heavily akamai infested :-(
The mask below is relatively "clean"; you should NEVER use ouverture's own site mask to perform your searches (it has visitor tracking, sniffing and annoying logging options aplenty).
 

WISENUT [Only 300 results viewable]
default to AND     phrase searching: use ""     use - for NOT    
no truncation     use + to force stopwords

Example string:
http://www.wisenut.com/search/query.dll?q=%22advanced+searching+techniques%22
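
Such strings are easy to generate yourself. A minimal sketch, assuming only the base address and the `q` parameter visible in the example string above, plus the syntax notes (default AND, "" for phrases, - for NOT); the function name `wisenut_url` and its arguments are invented for illustration.

```python
from urllib.parse import quote_plus

# Base address and 'q' parameter are taken from the example string above
WISENUT_BASE = "http://www.wisenut.com/search/query.dll"


def wisenut_url(phrase=None, required=(), excluded=()):
    """Build a query URL: quoted phrase, AND-ed words, and -excluded words."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')            # phrase searching: use ""
    parts.extend(required)                      # words default to AND
    parts.extend(f"-{word}" for word in excluded)  # use - for NOT
    return f"{WISENUT_BASE}?q={quote_plus(' '.join(parts))}"


print(wisenut_url(phrase="advanced searching techniques"))
print(wisenut_url(required=["fravia"], excluded=["shop"]))
```

The first call reproduces the example string exactly: quote_plus turns the quotes into %22 and the spaces into +.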

WiseNut is a new "Korean/Japanese" 'main' search engine. It has good customization features and one single huge database of indexed Web pages. It lacks almost all advanced search capabilities, yet it seems useful because it gives results that you will not find elsewhere.
Search for Web pages... 
... WITH ALL of these words
... WITHOUT ANY of these words
... WITH this EXACT PHRASE


EXALEAD

For instance "advanced internet searching"

http://beta.exalead.com/search/C=0/2p=1: Exalead Advanced


ADVANCED EXALEAD SEARCH
Visit the ad hoc TEOMA section
WARNING: Teoma has been moved to its specific page, where you will find a wealth of information. Here only the mask:

 
Find this Phrase
advanced teoma!! (& advanced search tips)

Microsoft's search engine attempts

MSN new beta "super" search   and   MSNsearch



MSN new beta "super" search

Pluses: It's quicker than google (a lot) and has three interesting "sliders" (more below).
Minuses: It is smaller than google, noisier (has less relevant signal) than google, staler than google, and its 'images' search is very poor.