~ Hints & Tips ~
Version December 2004
+Fravia's searching tips      Others' tips      Seekers' proverbs     


Fravia's 16 quick searching tips 

I mostly use fast, teoma and google for my broad queries. But there are MANY other good 'global' search engines, hotbot for instance, that can come in handy at times.
See [elsewhere] on my site WHICH search engines or bots you should use for your specific queries. The number of retrieved pages, given in parentheses, will of course vary each time you re-run an example search. Note also how some of the examples given here are quite useful "links factories" per se.

1) USE VARIOUS RESOURCES!
If I could give only ONE piece of advice, it would be this one. It is even more important than "keeping on track" (see below). Never, never, never overestimate your search tool of choice. EACH search engine, [main], [regional] or [local], has its own quirks and its own blindness patterns ("shadows"). Every time I 're-play' a given search on some specific "free-pages depository" local search engine (à la geocities) I get amazing results... Thus you should NEVER 'stick' to a 'given' search engine 'of choice'. Learn how much they differ and - even more important - understand how much each given search engine's result set changes over time! The web is a quicksand, and the search engines' databases AND POLICIES are continuously changing as well. Altavista and ftpsearch, for instance, are now actively 'censoring' results. Try a good search there for MP3, DVD reversing, Napster, Gnutella or Infraseek and you'll quickly see what I mean.


2) KEEP ON TRACK!
Nothing is easier than losing your thread when you are working on the web. The examples used on my site double as links to interesting (I hope) searches, places and startpoints. As you'll soon realize, the examples and links offer you continuous opportunities to leave this site in order to browse to other very promising ones. This is done on purpose: the hyper bastard approach to web page building is - on most sites - to restrict click-away opportunities to a bare minimum. Even when a reference demands a link, new methods hide or reduce the visibility of that same link. Everything is done to keep a visitor 'caged' or 'trapped' in a given site. I'll do the exact contrary, since you must learn some discipline if you're going to be a good seeker. You leave my site for good while searching for a target? Good riddance. My links will offer you a lot of added knowledge AND will at the same time test your capability to keep on track :-)


3) LOWERCASE
Always enter your search terms in lower case (unless you want to limit your search). Most search engines will then find both upper and lower case occurrences of your searchstring. "pAris" (24) is NOT the same as "paris" (4110800).
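
The rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the code of any real engine; the function name matches_case_rule is made up for the example.

```python
def matches_case_rule(query: str, word: str) -> bool:
    """Assumed rule: an all-lowercase query matches any capitalisation,
    while a query containing capitals must match exactly."""
    if query == query.lower():
        return word.lower() == query
    return word == query


for candidate in ["paris", "Paris", "PARIS", "pAris"]:
    # First column: lowercase query; second column: mixed-case query.
    print(candidate,
          matches_case_rule("paris", candidate),
          matches_case_rule("pAris", candidate))
```

The lowercase query accepts all four spellings, the mixed-case one only its exact twin, which is why the second search is so much narrower.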


4) EXACT SEQUENCE [""]
Enclose terms in double quotation marks if you want to retrieve those exact terms in that exact sequence. This may be very useful in order to find a specific page. Thus "searchengines" will give you (20940) pages with the two terms 'glued' together. Similarly "saerch engine" will retrieve some (11) pages WITH THIS SAME MISSPELLING.


5) NARROW DOWN [ AND | & | + ] and ELIMINATE MERCILESSLY [ AND NOT | ! | - ]
Narrow your searches by linking your search terms with AND or &, or simply use the plus sign [+]. The search engine will find only those pages that contain all of your search terms. Similarly, exclude pages that are not relevant to your search by preceding the search term with AND NOT or !, or simply use the minus sign [-]. +"search engines" +hints +tips +techniques -tits -sex -"make money" (5200) is better than the simpler +"search engines" +hints +tips +techniques (7700)
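
If you assemble such strings often, a tiny helper saves typing. The following is a minimal sketch: build_query is a hypothetical name, and the only rule it knows is that multi-word terms get double quotes.

```python
def build_query(required, excluded=()):
    """Prefix required terms with + and unwanted terms with -."""
    def quote(term):
        # Multi-word terms must be quoted to stay together as a phrase.
        return f'"{term}"' if " " in term else term

    parts = [f"+{quote(t)}" for t in required]
    parts += [f"-{quote(t)}" for t in excluded]
    return " ".join(parts)


query = build_query(
    ["search engines", "hints", "tips", "techniques"],
    excluded=["tits", "sex", "make money"],
)
print(query)
```

This prints the same searchstring as the longer example above.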


 DOWNSIDE OF THE + & - SIGNS
With the + sign you may miss related documents that don't have the words you specify as required. For example, the search "searching tips" +searchlores would not include documents that have the words "searching tips", but not searchlores.
With the - sign it's easy to exclude too much. For example, if you were looking for information on "bots scripts" but not in javascript, the search +"bots scripts" -javascript would exclude a document that was all about bots scripts, but that had the sentence "this kind of bot would be impossible in javascript".
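
The over-exclusion trap is easy to reproduce. Here is a toy matcher, assuming plain case-insensitive substring matching (real engines work on word indexes, so take it as a sketch only); the document text is invented for the demonstration.

```python
def matches(document, required, excluded=()):
    """True if every required term occurs and no excluded term does."""
    text = document.lower()
    has_all = all(term.lower() in text for term in required)
    has_banned = any(term.lower() in text for term in excluded)
    return has_all and not has_banned


doc = ("All about bots scripts. "
       "This kind of bot would be impossible in javascript.")

print(matches(doc, ["bots scripts"]))                  # True
print(matches(doc, ["bots scripts"], ["javascript"]))  # False
```

One passing mention of the excluded word is enough to lose a perfectly good target.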


6) DOWNSIDE OF THE BOOLEAN operators
It's often difficult to specify exactly what you want to include or exclude. You can also get unexpected results if you are not careful about your use of operators and parentheses. For example, the search seeking OR searching AND finding is the same as the search seeking OR (searching AND finding). Both queries will find documents that contain both searching and finding, together with documents that contain the word seeking. However, the query (seeking OR searching) AND finding is not the same. It will find documents containing the word finding and, in the same document, either seeking or searching. Be careful with the boolean operators!
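
The precedence trap can be checked with Python sets, where & (intersection) binds more tightly than | (union), just as AND binds more tightly than OR in the queries above. The document numbers are invented for the example.

```python
# Each set holds the (made-up) ids of documents containing that word.
seeking = {1, 2}
searching = {2, 3, 4}
finding = {3, 5}

unparenthesised = seeking | searching & finding   # seeking OR searching AND finding
explicit = seeking | (searching & finding)        # seeking OR (searching AND finding)
grouped = (seeking | searching) & finding         # (seeking OR searching) AND finding

print(sorted(unparenthesised), sorted(explicit), sorted(grouped))
```

The first two expressions give the same three documents, while the grouped one keeps only document 3.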


7) "PECULIAR" strings
You should always strive to use differentiating keywords when searching the web. Words that are commonly used will not help you much on their own. Extremely common words like articles and prepositions are so worthless that search engines ignore them completely. Try to use words which underline the peculiarity of your target. Common words can still be very effective when combined with boolean qualifiers. You must identify the main concepts in your topic and determine any synonyms, alternate spellings, or variant word forms for those concepts. Remember that the more "peculiar" a word, the more useful it will be in order to sharpen your search.
+title:"search strateg*" +hints +tips
In this case we put the "search strateg*" string (which already has an elevated PEC) inside the title: keyword.


8) SPECIAL KEYWORDS
Note the use of a keyword in the previous example. Here is a short list of the main keywords (for altavista):


9) ASTERISK [*]
Note also the use of the asterisk [*] in the previous example: it MUST be used after at least 3 characters, it is valid for up to 5 characters or as an element of a phrase.
For Altavista:
  1. Asterisk (*): After 3 specified characters will search for matches in up to 5 trailing letters.
  2. Question Mark (?): After 3 specified characters will match exactly one more character.
  3. Double Asterisk (**) More flexible as it will search for matches for an unlimited number of trailing characters.
You also have the ability to use the wildcards interchangeably and more than once in the same search string.
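
To get a feeling for what each wildcard would catch, the three rules can be translated into regular expressions. This is a sketch under assumptions: the helper wildcard_to_regex is hypothetical, it handles only wildcards at the end of a word, and it treats "letters" as \w characters.

```python
import re


def wildcard_to_regex(pattern):
    """Translate trailing *, ? and ** wildcards into a compiled regex."""
    stem = pattern.rstrip("*?")
    if len(stem) < 3:
        raise ValueError("a wildcard needs at least 3 leading characters")
    suffix = pattern[len(stem):]
    out = re.escape(stem)
    i = 0
    while i < len(suffix):
        if suffix.startswith("**", i):
            out += r"\w*"        # unlimited trailing characters
            i += 2
        elif suffix[i] == "*":
            out += r"\w{0,5}"    # up to 5 trailing characters
            i += 1
        else:
            out += r"\w"         # "?": exactly one more character
            i += 1
    return re.compile(out, re.IGNORECASE)


for word in ["strategy", "strategies", "strategically"]:
    print(word,
          bool(wildcard_to_regex("strateg*").fullmatch(word)),
          bool(wildcard_to_regex("strateg?").fullmatch(word)),
          bool(wildcard_to_regex("strateg**").fullmatch(word)))
```

Note that "strategically" has six letters after the stem, so under these rules only the double asterisk reaches it.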


10) ARCHIVE
You should archive your useful queries and repeat them over time. With any search engine whose result URL contains the "cgi-bin" snippet, the query can be saved and used again later. Since the results of all queries VARY OVER TIME (when traffic is particularly heavy the search engines "cut" the results) you would be well advised, for important queries, to repeat them again and again.
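
One possible way to keep such an archive is a small JSON file that records, for every saved query URL, the date and the number of hits you saw. Everything in this sketch is an assumption made for illustration: the file layout, the archive_query helper, the example URL, the dates and the hit counts.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def archive_query(archive, url, hits, day=None):
    """Append one (date, hits) observation for a saved query URL."""
    data = json.loads(archive.read_text()) if archive.exists() else {}
    data.setdefault(url, []).append(
        {"date": day or date.today().isoformat(), "hits": hits}
    )
    archive.write_text(json.dumps(data, indent=2))
    return data


# Hypothetical saved query URL; use whatever your engine of choice produces.
SAVED = "http://search.example.com/cgi-bin/query?q=%2Bhints+%2Btips"

# A temporary directory keeps the demo from littering your disk.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "queries.json"
    archive_query(archive, SAVED, 7700, day="2004-12-01")
    history = archive_query(archive, SAVED, 5200, day="2004-12-15")[SAVED]
    print([entry["hits"] for entry in history])
```

Re-running the saved URL now and then and logging the count makes the drift of a result set visible at a glance.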


11) STOP WORDS
Stop words are words such as "and", "the" and "or" which search engines exclude from their searches to make them more effective. These terms are excluded because they are either extremely common or they are used by the search engine for performing more specialized searches. Just think about how many documents on the Web contain the word "the" and you'll understand how important a good stop words list is for all search engines.
If you really do want to search for one of these terms, there is an easy way to work around stop words. By bracketing words in quotation marks, search engines will look for every word inside the quotes, in the sequence you specify. Thus, if you wanted to look for sites with the words search the web you would use the searchstring "search the web".
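
The effect of the quotation marks can be mimicked with a toy query parser. It is only a sketch: the stop list holds just the three words named above, and effective_terms is a hypothetical helper, not the logic of a real engine.

```python
import re

STOP_WORDS = {"and", "the", "or"}  # only the three examples from the text


def effective_terms(query):
    """Return the terms that survive stop-word removal.
    Quoted phrases are kept whole, stop words included."""
    terms = []
    for token in re.findall(r'"[^"]*"|\S+', query):
        if token.startswith('"'):
            terms.append(token)
        elif token.lower() not in STOP_WORDS:
            terms.append(token)
    return terms


print(effective_terms('search the web'))    # ['search', 'web']
print(effective_terms('"search the web"'))  # ['"search the web"']
```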


12) SNOOPING BOLDLY AROUND -1
As you'll learn elsewhere on this site, there are many methods to access some 'non public' portions of the web.
A quick tip is to look for a file called ROBOTS.TXT in the main directory of your target site, entering by hand a URL with the following pattern:
http://www.targetsite.com/robots.txt
This file is used to tell search engines which directories and files they should not index on a specific site. Thus anything that has been put inside a 'robots.txt' file will not be found by your searchqueries. However, once you have seen the names, you can still type them directly into your browser in order to access the various subdirectories and pages.
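
The same routine can be sketched in Python. The sample robots.txt content below is invented, and both helper names are hypothetical; fetching the real file from your target is left to you and your browser.

```python
from urllib.parse import urljoin


def robots_url(site):
    """Build the robots.txt URL for a site's main directory."""
    return urljoin(site, "/robots.txt")


def disallowed_paths(robots_text):
    """Collect the non-empty paths named in Disallow: lines."""
    paths = []
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:
                paths.append(path)
    return paths


# Invented sample content, standing in for a downloaded robots.txt.
SAMPLE = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/notes.html  # hypothetical entries
Disallow:
"""

site = "http://www.targetsite.com/"
print(robots_url(site))
for path in disallowed_paths(SAMPLE):
    print(urljoin(site, path))   # URLs to type directly into the browser
```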


13) GO REGIONAL!
A fantastic tip: go regional, both using local search engines and using some linguistic tricks that work even for languages you do not know, and/or using ad hoc dictionaries and free translation services.
The depth you can reach on a specific long term search using a -say- Korean search engine cannot be surpassed by any google search. Go regional every time you're stuck. You won't regret it!



14) DOWNLOADING FILES FROM BUSY SERVERS
If you are trying to download some (ahem) popular files, you are probably competing with many other people for access. If you have the option, pick a server in a country where it is very early in the morning. Alternatively, schedule the download so that it runs when it is early in the morning IN THE STATES or in EUROPE (GMT 05.00 or GMT 12.00) or, MUCH MUCH better, use an automatic email downloader like downloadslave instead (see the accmail section) and spare yourself the hassle :-)


15) PROXIMITY SEARCHES... HIT PAYLOAD EVERY TIME YOU SEARCH!
Real ~S~eekers use proximity operators quite a lot (for obvious - ahem - reasons) as you'll learn in the advanced sections of my site.
Altavista uses the NEAR command in order to select keywords within 10 words of each other, useful but quite limited.
For serious work with proximity searches THERE WAS ONLY ONE SEARCH ENGINE FOR YOU: Infoseek, which allowed you to choose any of the following options... or to combine them... :-)
ADJ, ADJ/#, OADJ, OADJ/#, NEAR, NEAR/#, ONEAR, ONEAR/#, FAR, FAR/#, OFAR, OFAR/#
  • ADJ (adjacent words in any order)
  • ADJ/# (# number of words apart - exact: no more, no less)
  • OADJ (adjacent words in specified order)
  • OADJ/# (# number of words apart - exact - in specified order)
  • NEAR (within 25 words)
  • NEAR/# (within # words)
  • ONEAR (within 25 words in specified order)
  • ONEAR/# (within # words in specified order)
  • FAR (more than 25 words from each other in at least ONE instance)
  • FAR/# (more than # words from each other in at least ONE instance)
  • OFAR (more than 25 words from each other in at least ONE instance in specified order)
  • OFAR/# (more than # words from each other in at least ONE instance in specified order)
As you can imagine, being something useful this has been ELIMINATED from the web :-(
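
Since the operators are gone, here is a rough sketch of how some of them could be emulated on text you have already downloaded. It is not Infoseek's implementation: the function names are made up, "within # words" is assumed to mean a position difference of at most #, and the exact-distance variants (ADJ/#, OADJ/#) are left out because their counting rule is not clear from the list above.

```python
import re


def _gaps(text, a, b):
    """Signed position differences (b minus a) for every pair of hits."""
    tokens = re.findall(r"\w+", text.lower())
    pos_a = [i for i, t in enumerate(tokens) if t == a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == b.lower()]
    return [j - i for i in pos_a for j in pos_b]


def near(text, a, b, n=25, ordered=False):
    """NEAR/# (ONEAR/# when ordered=True): a and b within n words."""
    for gap in _gaps(text, a, b):
        distance = gap if ordered else abs(gap)
        if 0 < distance <= n:
            return True
    return False


def adj(text, a, b, ordered=False):
    """ADJ (OADJ when ordered=True): a and b are neighbours."""
    return near(text, a, b, n=1, ordered=ordered)


def far(text, a, b, n=25, ordered=False):
    """FAR/# (OFAR/# when ordered=True): more than n words apart
    in at least one instance."""
    return any((gap if ordered else abs(gap)) > n
               for gap in _gaps(text, a, b))


text = "seekers use proximity operators quite a lot"
print(near(text, "proximity", "lot", n=4))                # True
print(near(text, "lot", "proximity", n=4, ordered=True))  # False
print(adj(text, "operators", "proximity"))                # True
print(far(text, "seekers", "lot", n=5))                   # True
```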


16) THE YO-YO TECHNIQUE
The 'down yonder' problem is well known to searchers. Being easily spammed, the main search engines suffer from a terrible drawback: some interesting results may well be listed somewhere, yet the spam-cram keeps popping up in the first positions while the juicy targets you are looking for lie buried somewhere in the huge list, 'down yonder'.
There is an interesting approach that can be used with most main search engines. They are all subject to spam, the most infamous one being Altavista... Hic alta, hic salta is a well-known proverb among seekers, meaning that you should jump straight to the 6th or 7th page of results when searching with Altavista, since the pages listed in the first positions are - mostly - irrelevant, or even paid scum. This is called the Yo-yo approach.

Others' tips 


Some notes: fast vs slow searching (by Mordred)

There is a side to searching which - while actually well covered in information - is not clearly stated: WHEN do you want your results? I think that there is an important border between fast finding (FF: that's when you need a quick answer) and slow finding (SF), for example:

- things that are NOT YET answered,
- generic time-consuming research,
- searching for "all of" (instead of "any of", which more likely falls into the FF category).


Thus various search techniques can be applied to the two categories:
- klebing, luring, ask-an-expert, password breaking are slow finding (SF) methods
- webbits, keyword shortcuts, and other tricks are fast finding (FF) methods
- some, like combing & yoyo, are useful in both

For example, searches of the 'where is my car' type are obviously calls for quick answers, and if the info you need is short, you can even get it in the SERP summaries (the asterisk (*) trick is nice, and also 'asking the answer'). Yesterday I spent some time watching the metaspy that erom mentioned at the seeker's and saw with my very eyes the query [what is brittany speers shoe size], I swear! With a quick fix of the spelling and some quotes you can get ["britney spears" "shoe size *"] faster than the guy can figure out why google asks him for an alternative spelling, when his own is undoubtedly right.

Also, a useful thing to mention is what to do when you realize your search strategy is wrong - like:

If you get few or no results in a main SE, switch to metasearch.
If your keywords are too generic, try a SE that can cluster results (kartoo-like) until you learn enough specific keywords about your target.
If your target is very topic specific, find a topic-centric SE.



Seekers' proverbs 


Google alone...
you'll never be done!

(Fravia)

Lowercase
just in case

(Fravia)

one-two-three-four
and if possible even more!
(5 words searching)
(Fravia)

A wise man getting few results at night
tries new spelling, even if it's right

(Mordred)

Few results on google too?
Don't waste time and use kartoo!

(Mordred)





(c) III Millennium: [fravia+], all rights reserved and reversed