~ Compound ("Meta") search engines ~

Updated August 2005





[Introduction]   [Our own scrolls]
[Compound searches]   [Compound pointers]
   [Mamma]   [Dogpile]   [Metacrawler]   [Vivisimo] (very good parsing!)   [Surfwax]

    [Inquirus] (±)   [Ixquick]     [Clusty](ß)   [Mearf](ß)   [jux2](ß)  

[Snooz]     [Ilectric]   [Iboogie]  [Inference] (±)  [Metafind]   [Profusion]
   [Ithaki]   [Debriefing (ex Ixquick)]   [Metor]   [Queryserver]  

Regional Metasearches
DE: [Metager]  
[Corrections and Additions]    [Metacrawler form]   ["Metameta" form]

Best compound engines: [Dogpile]   [Vivisimo]  [Mamma]   [Surfwax]   [Ez2find]   

Introduction

"Metasearch" engines
(aka "compound" ~ "parallel" ~ "inference" engines, aka "pumps" aka "metas", aka "multi-threaded")


Metasearch engines are search tools that send a query simultaneously to several search engines and Web directories... and sometimes to the so-called Invisible (Deep) Web: online information and databases not indexed by the traditional main search engines.


Advantages: they query more search engines at the same time, which is important given the low degree of overlapping among the main search engines.
Disadvantages: 1) they are good mostly only for unique terms queries; 2) they spend a short time in each database; 3) they discard complex searching logic; 4) they try to "accommodate" all search engines they drain using always the same query (not very good); 5) they don't drain some important main search engines (which you should always consider using if you want to cover more than 1/3 ~ 1/2 of the Web).
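
The basic mechanism described above (one query, several engines at once, a short wait on each) can be sketched roughly as follows. This is only a minimal sketch: the three "engines" are hypothetical stand-ins that return canned URLs, not real engine interfaces, and the wait time is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
import time

# Hypothetical stand-ins for real engines: each takes a query string and
# returns a list of result URLs. A real bot would fetch and parse a results page.
def engine_a(query):
    return ["http://example.org/a?q=" + query, "http://example.org/shared"]

def engine_b(query):
    return ["http://example.org/shared", "http://example.org/b?q=" + query]

def slow_engine(query):
    time.sleep(1)  # answers too late to be useful
    return ["http://example.org/slow"]

def metasearch(query, engines, wait=0.3):
    """Send the same query to every engine in parallel and keep
    whatever arrives within `wait` seconds."""
    results = {}
    pool = ThreadPoolExecutor(max_workers=len(engines))
    futures = {pool.submit(fn, query): name for name, fn in engines.items()}
    try:
        for future in as_completed(futures, timeout=wait):
            results[futures[future]] = future.result()
    except FuturesTimeout:
        pass  # engines that did not answer in time are simply dropped
    pool.shutdown(wait=False)
    return results
```

Calling `metasearch("fravia", {"A": engine_a, "B": engine_b, "Slow": slow_engine})` returns the lists from A and B only: the slow engine is dropped, which is exactly disadvantage 2) in action.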


Alas! Many 'compound' (or 'meta' or 'parallel') search engines are NOT what they promise to be. They are just simple all-in-one CGI scripts that try to attract traffic to their sites by querying various main search engines, one after the other, with the SAME querymask.

In order to check how really META your metasearch engine is, you should take account of the following parameter: as you probably know, there is a huge difference between the results that the main search engines claim to have found and the results they will actually show you (see, for some examples, the [yoyo] searching technique). A good metasearch engine will always inform you about both the claimed results and the effectively found ones.
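
What such a "claimed versus effectively found" report might look like can be sketched as follows. This is a sketch only: the input format, a claimed count plus the list of URLs actually delivered, is an assumption of this example and not something any particular metasearch engine is known to use.

```python
def honesty_report(engine_reports):
    """engine_reports maps an engine name to a pair:
    (claimed_count, list_of_urls_actually_shown)."""
    lines = []
    for name, (claimed, shown) in sorted(engine_reports.items()):
        effective = len(set(shown))  # a URL repeated by the same engine counts once
        lines.append(f"{name}: claimed {claimed}, effectively found {effective}")
    return lines
```

Fed with an engine that boasts thousands of hits but delivers a few dozen distinct URLs, the report makes the gap visible at a glance.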


Our own scrolls

Internet searching is not a matter of just dropping one word inside google. In order to query at the same time more than one search engine, of course, you could use our "scrolls" - short php bots, prepared by Laurent with the help of DQ at the [PHP Lab]. The most obvious advantages -for you- would be the absolute absence of any advertisement whatsoever, the possibility to build on the source code and some fairly advanced configuration possibilities.   Give them a try!

To the scrolls room!
Major scrolls:   
Blue Scroll Of Searching    Clear Scroll of Clairvoyance
Minor scrolls:   
Indigo Scroll Of Local searchings    Purple Scroll of Effective searching
Azure Scroll Of Second Effective searching    Lavender Scroll Of most recent engines

Compound searches

Inquirus, one of the most promising compound search engines...
Alas! Seems to be DEFUNCT now: here some story. Previously you could have used the following links:
http://inquirus.nj.nec.com/
http://inquirus.nj.nec.com/?dd=2
  http://inquirus.nj.nec.com/i2/inq2.pl?SI=FALSE
Alas these links do not work any more.

There still is a working prototype, though: http://inquirus.nj.nec.com/i2/inq2.pl (see http://inquirus.nj.nec.com/) Requires java though.

" The Inquirus metasearch engine - real-time analysis of current web contents. Query-sensitive summaries and Specific Expressive Forms for question answering."


Ixquick
Searches AltaVista, Fast Search, Excite, HotBot, Infoseek, MSN, Yahoo & more.
See also the new version: Debriefing below.


Ixquick uses AltaVista-style rules. Separate search terms with spaces to find pages containing as many of the terms as possible. Use the + operator to compel a term.
AND, NEAR, parentheses, etc. can be used as well.
Additionally you can specify where certain information must appear with fields. Currently supported fields include:
Web  MP3  News  Pictures 

Clusty

Uses GigaBlast, MSN, Lycos, Looksmart, Wisenut, Open Directory & Overture
"Clusters" links... clicking on the name of a cluster will display all of the search results that it contains




Mearf

Uses Alltheweb, altavista, google & yahoo
Mearf is an experimental meta search engine using content based collection fusion methods. It sends the given query to up to 5 different search engines, and merges the results using different collection fusion strategies.
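
To give an idea of what "merging" ranked lists means, here is a deliberately simple rank-based merge. Be warned that this is NOT Mearf's content-based method, whose details are not given here. It is a plain reciprocal-rank merge swapped in for illustration, and the constant `k=60` is a conventional choice, not a Mearf parameter.

```python
from collections import defaultdict

def reciprocal_rank_merge(ranked_lists, k=60):
    """Merge several ranked URL lists into one.
    Each URL scores 1/(k + rank) per list it appears in; higher totals rank first."""
    scores = defaultdict(float)
    for urls in ranked_lists:
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / (k + rank)
    # Ties are broken alphabetically so the output is deterministic.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

A URL that several engines place near the top ends up ahead of a URL that only one engine knows about, whatever its position there.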


 

http://jux2.com/

Uses google, yahoo and AskJeeves
Jux2 is a beta meta engine intended as a comparative research tool to check (mainly) the differences between google's and yahoo's results. Jux2 claims that their three metapool search engines typically share fewer than 3.5 results among their respective top 10 results.
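
The kind of overlap figure Jux2 quotes can be checked by hand with a few lines. A minimal sketch, assuming you already hold each engine's results as an ordered list of URLs (how Jux2 itself counts and averages is not known here):

```python
def shared_top_results(*result_lists, top=10):
    """Return the set of URLs that appear in the first `top` entries of every list."""
    if not result_lists:
        return set()
    tops = [set(results[:top]) for results in result_lists]
    return set.intersection(*tops)
```

The length of the returned set is the number of results that all engines share among their respective top entries for one query; averaging it over many queries gives a figure comparable to the one claimed.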


  


THE BEST ONE!
was [Inference Find], now unfortunately defunct
http://www.infind.com/



[Mamma Metasearch]
http://www.mamma.com/psearch.html


DIRECTORIES:
Open Directory Looksmart Directory Business.com About.com Mamma's Collection

INDEXES:
Teoma Google MSN Entireweb Gigablast

Mamma supports advanced operators:
• Exact match using "quotations"
• Compelled terms are included using +plus +signs
• Undesired terms are excluded using the -minus -sign
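
How a bot might take apart a query written with these three operators can be sketched as below. This is an illustration only, not Mamma's actual parser. The function name and the output layout are inventions of this sketch.

```python
import re

# An optional +/- sign, followed by either a "quoted phrase" or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query into exact phrases, compelled (+), excluded (-) and plain terms."""
    parsed = {"phrases": [], "compelled": [], "excluded": [], "plain": []}
    for sign, phrase, word in TOKEN.findall(query):
        text = phrase or word
        if sign == "+":
            parsed["compelled"].append(text)
        elif sign == "-":
            parsed["excluded"].append(text)
        elif phrase:
            parsed["phrases"].append(text)
        else:
            parsed["plain"].append(text)
    return parsed
```

For instance `parse_query('"search tips" +lores -spam hints')` files "search tips" under phrases, "lores" under compelled, "spam" under excluded and "hints" under plain.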

Also note their "refine your search" options à la teoma

Canadian compound engine.
Here their own hype & spin: "Created in 1996 as a master's thesis, Mamma.com helped to introduce metasearch to the Internet as one of the first of its kind. Due to its quality results, and the benefits of metasearch, Mamma grew rapidly through word of mouth, and quickly became an established search engine on the Internet"



[Snooz Metasearch]
http://www.ijs.co.nz/info/snooz.htm


Snooz Metasearch is a very good metasearch engine. It also has syntax translation (e.g. it translates "&" to "AND" and "+link:" to "+linkdomain:" for HotBot/Anzwers/etc.). It allows use of Booleans and dates (e.g. in queries going to AltaVista), which are features absent from other metasearch engines. It can allow metasearching inside regions and return results that are entirely inside a specified country (or inside either one out of 2 countries). Opera-hostile?
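
Syntax translation of this kind can be sketched as a table of rewrite rules per target engine. A rough sketch only: treating the two translations mentioned above as plain string substitutions is an assumption, and the rule table name is hypothetical. Snooz's real rules are surely more careful.

```python
def translate_for_engine(query, rules):
    """Apply simple textual rewrite rules (source syntax -> target engine syntax)."""
    for old, new in rules:
        query = query.replace(old, new)
    return query

# The two rewrites mentioned in the text, written as plain substitutions
# (an assumption of this sketch).
HOTBOT_RULES = [(" & ", " AND "), ("+link:", "+linkdomain:")]
```

With these rules, `translate_for_engine("fravia & searching +link:searchlores.org", HOTBOT_RULES)` gives `fravia AND searching +linkdomain:searchlores.org`.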


[Dogpile]
http://www.dogpile.com/


Dogpile is a good metasearch tool to use if you are looking for a lot of information or something that is hard to find, need to use a more complex type of query, and don't mind getting some duplicate hits. Dogpile accesses many more search engines than any other metasearch tool. It does not rank or sort the results or eliminate duplicates. You get results from one engine after the other in the same format you would get if you visited the site and entered your query there. The results come in batches so you can deal with one group of results before going on to the next. Dogpile brings back the first 10 or 20 hits from each site. There is a button that allows you to bring back still more hits if you want them. Dogpile also lets you build your own customized search strategy. Dogpile and its companion Metafind are resources developed by the same person, and Metafind has now been swallowed by Metacrawler.

There is also an interesting 'compare search engines results' tool that will compare the first 10 results of your queries on Yahoo, MSN and Google.

Search Options
You can select where to search first: The Web, Usenet, FTP.
And Then Options
Presents these choices: STOP (default), The Web, Usenet, FTP. This allows you to start with the targets on the Web list and go on with the Usenet or FTP targets. The default (sensibly) is to STOP.
Search Engines - searched three at a time in the group you have selected
The Web: Yahoo!, Lycos' A2Z, Excite Guide, World Wide Web Worm, WWW Yellow Pages, PlanetSearch, What U Seek, Lycos, WebCrawler, InfoSeek, OpenText, AltaVista, Excite & HotBot.
Usenet: Hotbot News, Reference.com, Dejanews, Infoseek News, Altavista and Dejanews' old Database.
FTP: Filez, FTP Search and Snoopie!.(Only the first word in your query will be passed on to these search engines.)
Speed
Relatively fast for each group of targets. Dogpile does not do any reformatting or sorting, so the results will be passed on as soon as they are received by Dogpile.
Query Options
Dogpile and Metafind use the same query format. They accept Boolean queries and translate them for each search engine that is accessed. If you just type a list of words, the system treats it as if the words were joined by AND, equivalent to searching for all the words. The Boolean terms you can use are: AND, OR, NOT, NEAR. You can also use quotes to delimit phrases as in "search tips". Parentheses can be used to group terms. Some sites do not handle phrases, parentheses or some of the Boolean options. These will be deleted before the query is sent to the site. A moderately complex query: "search tips" hints NOT database is interpreted properly and translated appropriately for all the engines on the list.
Results
Results are presented without any reformatting just as received from each search engine. After the results from each engine, Dogpile adds a button allowing you to get more (if more are available).
Customized searching
The Custom Search button on the Dogpile home page takes you to a page where you can customize Dogpile by picking which search engines Dogpile will query and the order of the search. Your browser has to be capable of receiving "cookies" and cookies have to be enabled for you to use this option. Dogpile also provides an Advanced Search page that lets you select which engines will be searched but not the sequence of search.
Wait option
This option tells Dogpile how long to wait for each engine to return results. The default is 20 seconds.
Metafind option
When Dogpile is displaying search results, it provides an option to run the same query on Metafind (see description). You might pick this option if you would like to see merged results with duplicates identified or see a longer list of results.
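
The "Query Options" behaviour described above, where unsupported features are deleted before the query goes to a given site, might look roughly like this. A sketch under stated assumptions: the capability flags and the function name are hypothetical, and Dogpile's real translation tables are not known here.

```python
import re

BOOLEAN_OPERATORS = ("AND", "OR", "NOT", "NEAR")

def adapt_query(query, phrases=True, parentheses=True, operators=BOOLEAN_OPERATORS):
    """Delete the query features that a given target engine cannot handle."""
    if not phrases:
        query = query.replace('"', "")         # drop phrase delimiters
    if not parentheses:
        query = re.sub(r"[()]", " ", query)    # drop grouping
    for op in BOOLEAN_OPERATORS:
        if op not in operators:
            query = re.sub(rf"\b{op}\b", " ", query)  # drop unsupported operator
    return " ".join(query.split())             # normalise whitespace
```

An engine that handles everything receives the query untouched. One that knows neither phrases, parentheses nor NEAR receives `search tips hints NOT database` for `("search tips" NEAR hints) NOT database`.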


[about, altavista, goto (Bleah :-(, infoseek, lycos, thunderstone (Bleah :-(, yahoo]


[MetaCrawler]
http://www.metacrawler.com/index_text.html


Metacrawler is the resource to use if you want quick, accurate results. It uses all of the most effective search engines except HotBot. The standard interface allows you to click fast or complete for quick and detailed searches respectively. The complete choice seems almost as quick as fast and returns a longer list of results. Metacrawler provides a [power user interface] that provides more control and search options, including an ability to filter hits by site location.

Search Engines
AltaVista, Excite, InfoSeek, Lycos, WebCrawler, & Yahoo!
Speed
Very fast. Results are typically returned in about 15 seconds, faster than searching many of the engines directly.
Query Options
You can select one of these options: any, all, as a phrase.
Results
Metacrawler combines the results from each search engine into a single list, eliminating duplicates and noting which search engine or engines detected each item in the list.
Power Search options
Results from: Everywhere, North America, Europe, Asia, South America, Africa...
Metafind filters the list of hits from each search engine, only including those with URL domains matching the region you have chosen. When using this option, it is best to ask for 30 hits from each engine if you want to have something left after the hits have been filtered. Querying "how to search" with [Africa] produced no hits, with [North America] there were more than fifty.
Results per page: 10, 20, 30
Controls how many hits per page in the merged list.
Timeout: 5, 15, 30, 60... seconds.
This is how long Metafind will wait for information from each site.
Results per site: 10, 20, 30
This is how many results Metafind accepts from each search engine before eliminating duplicates and filtering on location.
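
The two steps described above, merging into a single list while noting which engines detected each item and then filtering on URL domain, can be sketched as follows. This is only an illustration: which top-level domains count as a given "region" is an assumption left to the caller, and comparing raw URLs as strings is the crudest possible duplicate test.

```python
from urllib.parse import urlsplit

def merge_results(per_engine):
    """Merge per-engine URL lists into one mapping: URL -> engines that found it."""
    merged = {}
    for engine, urls in per_engine.items():
        for url in urls:
            found_by = merged.setdefault(url, [])
            if engine not in found_by:
                found_by.append(engine)
    return merged  # dicts keep insertion order, so first-seen order is preserved

def filter_by_tld(merged, tlds):
    """Keep only the URLs whose host ends in one of the given top-level domains."""
    kept = {}
    for url, engines in merged.items():
        host = urlsplit(url).hostname or ""
        if host.rsplit(".", 1)[-1] in tlds:
            kept[url] = engines
    return kept
```

Filtering happens after merging, which is why asking for more hits per engine matters: a narrow region can easily empty a short list.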



[Metafind]
http://www.metafind.com/


Metafind
Metafind is the metasearch tool to use if you need to use the more complex queries that it supports, want to get more than 100 hits, and can do without document summaries and abstracts. Metafind, like Dogpile, accepts more complex queries than other metasearch tools. There is an option to sort results by domain, a capability not offered by other metasearch tools. (Metacrawler can filter on domain, which is not the same thing.) Going against six search engines, Metafind regularly returns a list of about 175 items sorted so you can identify duplicates. All it returns is the title and URL - no summaries or abstracts - so you may have to spend a fair amount of time fetching and looking at documents that you could have skipped if you had the document summary returned by other metasearch tools. Metafind and its companion Dogpile are new resources, both developed by the same person who has established a new company and is looking for commercial support. The future prospects of this resource should be considered uncertain.

Search Engines
AltaVista, Excite, HotBot, Infoseek, OpenText and Webcrawler. Metafind retrieves 10 links from AltaVista twice, 10 from Excite twice, 50 from HotBot, 25 from Infoseek, 10 from OpenText and 50 from Webcrawler.

Speed
Quite fast. Results are typically returned in about 30 seconds.

Query options
Metafind accepts Boolean queries and translates them for each search engine that is accessed. If you just type a list of words, Metafind treats it as if the words were joined by AND, equivalent to searching for all the words. The Boolean terms you can use are: AND, OR, NOT, NEAR. You can also use quotes to delimit phrases as in "search tips". Parentheses can be used to group terms. Metafind points out that some of the engines will ignore the parentheses. A moderately complex query: "search tips" hints NOT database is interpreted properly and translated accurately for all the engines on the list.

Results
Metafind combines the results from each search engine into a single list, eliminating duplicates and noting which search engine or engines detected each item in the list.

Sort options:
Sort by keyword (default) - alphabetically by keyword found in title or URL
Do not sort - not clear why you would want this option
Sort alphabetically - probably the most useful choice
Sort by domain - could be useful looking for foreign or regional information - try it!
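
A "sort by domain" option could be implemented along these lines. A minimal sketch, assuming results are (title, URL) pairs and that "domain" means the host name read from the right (top-level domain first). Metafind's exact ordering is not documented here.

```python
from urllib.parse import urlsplit

def domain_key(url):
    """Sort key: host labels reversed, so 'www.example.de' sorts under 'de', then 'example'."""
    host = (urlsplit(url).hostname or "").lower()
    return host.split(".")[::-1]

def sort_by_domain(results):
    """Sort a list of (title, url) pairs by domain."""
    return sorted(results, key=lambda item: domain_key(item[1]))
```

Sorted this way, all the .de hits sit together, then all the .org hits, and so on, which is what makes the option handy for regional information.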


Wait option
This option tells Metafind how long to wait for each engine to return results. The default is 40 seconds and results are almost always available quicker than that.


[Ithaki]
[http://www.ithaki.net/dir.html]
"we're unable to detect your country, please choose it now" :-)

A rather weak server at 24.120.30.35, with "http://24.120.30.35/cgi-bin/alicia/nph-gogol.cgi?" if you choose Russia, "http://24.120.30.35/cgi-bin/alicia/nph-metabuscador.cgi?" if you choose Argentina, "http://24.120.30.35/cgi-bin/alicia/nph-bossa.cgi?" for Brazil and so on.




[Profusion]
[http://www.profusion.com/CatNav.asp?ID=1&AGTID=1&queryterm= (advanced)]
You can choose 'fastest 3' or 'best 3'

parallel search, AND OR booleans, eliminates doubles
Example of a web search (queryterm=%22advanced+search+techniques%22)
http://www.profusion.com/searchresults.asp?
queryterm=%22advanced+search+techniques%22&AGT=Web&Category=1%2C1&CATID=1
&option=all&RPP1=10&rpe=10&totalverify=0&auto=all&Engine=349&E=349&Engine=1166
&E=1166&Engine=1144&E=1144&Engine=1146&E=1146&Engine=1175&E=1175
&Engine=1141&E=1141&Engine=1129&E=1129&Engine=1143&E=1143&Engine=354&E=354
&Engine=363&E=363&Engine=1176&E=1176&Engine=1139&E=1139
&Category=245%2C245%2C20%2Cxchg&SHW245=0&Category=6%2C6%2C20%2C
user&SHW6=0&Category=172%2C172%2C20%2Cuser&SHW172=0&Category=96
%2C96%2C20%2Cuser&SHW96=0
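
Long search strings like the one above are easier to read once decoded. A small sketch using Python's standard library: only the leading part of the string is reproduced (joined onto one line), and what the numeric Engine ids stand for cannot be told from the URL alone.

```python
from urllib.parse import parse_qs

# Leading part of the query string shown above, joined onto one line.
# The remaining Engine/E/Category parameters are left out for brevity.
QUERY_STRING = (
    "queryterm=%22advanced+search+techniques%22&AGT=Web&Category=1%2C1&CATID=1"
    "&option=all&RPP1=10&rpe=10&totalverify=0&auto=all&Engine=349&E=349"
    "&Engine=1166&E=1166"
)

# parse_qs decodes %XX escapes and '+' signs, and collects repeated
# parameters (such as Engine) into lists.
params = parse_qs(QUERY_STRING)
```

Here `params["queryterm"]` holds the quoted phrase "advanced search techniques", and `params["Engine"]` lists the engine ids in the order they were given, so removing or reordering engines by hand becomes a matter of editing a list.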




Debriefing (ex ixquick)
[debriefing (ixquick)]

Please note the search string [ http://debriefing.ixquick.com/do/metasearch.pl?
cat=web&cat=web&cmd=process_search&language=english&rl=DEBRIEFING
&query=searchlores&engine0=aol&engine1=altavista&engine2=excite
&engine3=findwhat&engine4=looksmart&engine5=lycos&engine6=msn
&engine7=netscape&engine8=dmoz&engine9=goto&engine10=sprinks_eng
&engine11=teoma&engine12=yahoo&engine13=directhit
]
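
The search string above follows a regular pattern (engine0, engine1, ...), so it can be generated for any choice of engines. A minimal sketch: the fixed parameters are copied from the string above, including the doubled cat=web, and the helper name is hypothetical. Nothing here guarantees the service still answers.

```python
from urllib.parse import urlencode

BASE = "http://debriefing.ixquick.com/do/metasearch.pl"

def debriefing_url(query, engines):
    """Build a search URL in the pattern shown above for the given engine names."""
    params = [
        ("cat", "web"),
        ("cat", "web"),  # appears twice in the original string
        ("cmd", "process_search"),
        ("language", "english"),
        ("rl", "DEBRIEFING"),
        ("query", query),
    ]
    params += [(f"engine{i}", name) for i, name in enumerate(engines)]
    return BASE + "?" + urlencode(params)
```

Building the string yourself lets you pick exactly the engines you want drained and skip the rest.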



Metor
[Metor]

Not bad. A German metaengine by Volker Carlguth: "A search and retrieval system that integrates information from hundreds of databases whose contents can not be reached by traditional search engines. Metor includes specialized databases, archives and catalogs..."





Queryserver
[Queryserver]

Quite interesting time/access data and cluster reports.



[Metager]
[http://meta.rrzn.uni-hannover.de/ (Uni Hannover)]
Also has a 'Teste Existenz' (test whether the page exists) function

"Among the metasearch engines, MetaGer is a nose ahead."
parallel search, AND OR booleans, eliminates doubles



[Surfwax]

Click the small lenses to get a quick summary (SiteSnap). Lenses with a red plus are "most relevant", little red stars represent home pages, and then there is a list of the search engines used as "source" for each given result...  (hotbot, yahoo, open directory...)


[Ez2find] (ex ew2www)

Ez2Find is a good French metasearch engine that gathers results from various main search engines, parses the results, removes the duplicates and includes links to relevant directory categories (results from the Open Directory) and to clustered results.

http://ez2find.com/meta/global/search.mpl?mode=all&per_page=20&timeout=10&depth=1&safe=&qry_str=fravia&category=Any+Language

Note the "clustered results" on ez2find's right side!
Web Metasearch
Dmoz Google MSN Yahoo WiseNut Teoma


Ez2find also offers Systran's translation (pseudo-proxy) service.


[Iboogie] ~ [Advanced Iboogie]
Metasearch for images as well
"The algorithms developed by iBoogie are using a combination of linguistic clustering and statistical clustering. They generate hierarchical clustering as opposed to a simple "flat" grouping of similar documents. This is done in real-time on a set of documents return by the search, without any predefine grouping, pre-build knowledge base, or pre-processing of all the document collections used by the search engines."

"Iboogie's clustering is computationally inexpensive and very fast, it will process 250 text snippets in about 140 milliseconds on a Pentium III, 864Mhz with 256MB of main memory"

The form below uses cookies, so it won't work if you have a good browser.
 
Web  BuyWeb  DeepWeb  Images  Video  Audio 



[Ilectric]

Nice meta with images and links together.
"With a single query, you get the most comprehensive results from Altavista, Teoma, Alltheweb, Amazon, Sprinks, DMOZ, Yahoo, and Kanoodle. For every query, ilectric metasearch sorts and ranks each hit, removes duplicates, and presents the end result with inline images"



The ILECTRIC stalking trick: clicking on the bottom URL of each queryresult you will get an automated, full, whois query!


Corrections and Additions

Dave's (March 2000)

Gheez Fravia+: I found an interesting and very helpful search technique encapsulated at www.quickbrowse.com that I didn't see referenced by URL or concept on your new pages.  Concept is simple... goes to desired engine and downloads ALL the pages of hits from your search.  No more "For the next 20 hits, click here and wait 20 seconds" stuff.  You just get a long HTML doc that has all the hits from that engine.  There is a onetime login process that takes about 30 seconds and a cookie no doubt, but after that you can use the "Quicksearch" link and the rest of the iterations of the quickbrowse implementation.
 
Makes life a lot easier for me, and thought you might want to include it in your excellent pages of lores.  I have learned SO much from you over the years, thanks for sharing your knowledge!
 
Dave Lamme (California)


Jeremy's (December 2001)

Sir, on your webpage relating to compound search I don't see the search site which I use: Queryserver
http://www.queryserver.com/web.htm
It uses 10 search engines and returns categorised results. I have no connection with this site other than as a (fairly) satisfied user.
(I was studying your site in the hope of finding something better :*)
_______________________
Jeremy L Hinton


Wanna search Metacrawler?

Note that Metacrawler has now swallowed Metasearch, yet another nice example of oligarchical development and diminishing variety, typical of the impoverished webworld of the 'commercial bastards'.
Have a go!
   
            [metacrawler.com/help/faq/]


"Metameta" search

Yup, a Metameta search. A tool for searchers that want to cast a big net over the pond and want to catch only the biggest and most visible fishes. Of course this kind of search multiplies ad libitum the known disadvantages of all metasearch engines as well: spending a very short time in each, already "shorttimed", meta-data-base...
Note also that the metameta bot below pumps from some metaengines (which are not listed above) that I would not actually recommend.
CleverSearch meta search engine

Engines to use:
MetaCrawler
ProFusion
HuskySearch
SavvySearch
MetaFind
Mamma
Dogpile
Surfy

Compound pointers

A big thank-you to our friend Iefaf


As a first try: www.metaeureka.com

More links:
Metaværktøj (metasearch in Danish)
Lists 25 Meta-search Engines (but no super-meta)
http://www.db.dk/dbi/internet/metavrkt.htm

A guide to specialized SE
http://www.searchability.com

A huge list
http://www.leidenuniv.nl/ub/biv/specials.htm

Another one (340K)
http://natur.exit.mytoday.de/koi/Special_Links_Search.htm

iRover was nice but now dead.
http://irover.linuxave.net/cgi-bin/isearch?

http://it.umary.edu/Library/research/support/search_engines.html

Vivisimo (sic): the correct spelling would be vivissimo :-)
See the essay by Shally Steckerl, that points out how Vivisimo "goes miles above and beyond simply getting content from other search services"!

A typical Vivisimo query: http://vivisimo.com/search?query=%22advanced+searching%22&v%3Asources=Web&x=35&y=16




(c) 1952-2032: [fravia+], all rights reserved