Searching Upstream
You've just found a great target web page. You've even followed
the links "downstream" from that web page to other web pages.
Now do the most important search: find the web pages that point towards
the target web page (searching "upstream").
| My Concept of Searching Upstream: |

Look at these three scenarios and see the value in these different approaches to
searching.
- You discover a valuable web page called target.html. Most people simply explore
the hyperlinks contained within target.html. Those links take you to places
suggested by the author of target.html. In short, all you are able to discover are
the pages located "downstream" of target.html.
- You discover a valuable web page called target.html. To help judge the
importance, value, and popularity of the page, it would be nice to know how many other
web pages contain hyperlinks pointing towards target.html. That number shows
how many web authors know about the target site and felt that it was "good
enough" to deserve a hyperlink from their own web site. If a web page
looks fairly anonymous, you might be able to infer something about its source
based on who else links toward it. Fortunately, some search
engines (like search.yahoo.com and AltaVista) allow you to search for all web pages which contain a
hyperlink to a specific URL. This is what I call "searching upstream" of a
web page. Here are two example searches:
link:http://navigators.com/isp.html
<-- shows pages that link specifically to my ISP Page.
link:http://www.whitehouse.gov site:ru
<-- shows pages that link to whitehouse.gov
and that also have a domain name ending in .ru.
Note: some search engines can rank their search results based on the
"popularity" of a web page, derived from how many links point toward that web
page.
- You discover two valuable web pages called target.html and other_target.html. If
you can find any web pages that link to both of these target pages, then you may discover
a great "directory" such as "joe's list of targets on the
Internet" for the subject covered by these target pages. For example:
Suppose you want to find a list of news sources for the country of Colombia.
There is a short list of Colombian newspapers in Yahoo. First I
check how many web pages link towards Elmundo and then how many link towards
Elheraldo. Once I am satisfied that these two newspapers' websites are
reasonably popular, I can then search for web pages that link to both
newspapers. The searches used are:
Colombian Newspapers: Yahoo's List
search.yahoo.com = link:http://www.elheraldo.com.co
search.yahoo.com = link:http://www.elmundo.com
search.yahoo.com = both link: terms combined in one query (links to both newspapers)
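The third scenario boils down to a set intersection: collect the pages that link to each target, then keep only the pages that appear in both lists. Here is a minimal Python sketch of that idea. The backlink sets and the helper names `inbound_count` and `common_upstream` are made up for illustration; in practice each set would be filled by hand from the results of a link: search, and no search engine API is assumed here.

```python
# Hypothetical backlink data. Each set stands in for the results of one
# "link:" search; the linking-page URLs below are invented placeholders.
backlinks = {
    "http://www.elheraldo.com.co": {
        "http://example.org/colombia-news.html",
        "http://example.net/latin-america-press.html",
        "http://example.com/page-one.html",
    },
    "http://www.elmundo.com": {
        "http://example.org/colombia-news.html",
        "http://example.net/latin-america-press.html",
        "http://example.com/page-two.html",
    },
}


def inbound_count(target):
    """Rough popularity signal: how many known pages link to the target."""
    return len(backlinks.get(target, set()))


def common_upstream(*targets):
    """Return the pages that link to every one of the given targets."""
    sets = [backlinks.get(t, set()) for t in targets]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    for target in backlinks:
        print(target, "has", inbound_count(target), "known inbound links")
    for page in sorted(common_upstream(*backlinks)):
        print("links to both:", page)
```

Pages that survive the intersection are the likely "directory" candidates, since their authors chose to link to both newspapers.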
Here are additional search examples:
| Search Terms (in search.yahoo.com) | Search Results |
| link:http://www.example.com | Web pages containing links toward example.com |
| link:http://www.example.com/pageA.html | Web pages containing links toward the specific web page |
| link:http://www.example1.com link:http://www.example2.com | Web pages which contain links towards both example sites. This is a great way to discover virtual libraries (e.g. Joe's mega-guide to example-sites) |
| link:http://www.example.com site:ru | Web pages hosted on .ru servers which contain a link towards example.com |
| site:example.com | Web pages hosted on any kind of example.com server (e.g. site:gov.ru shows pages from www.gov.ru, duma.gov.ru, economy.gov.ru, etc.) |
| link:http://www.example.com site:gov.ru | Web pages hosted on gov.ru servers which contain links to example.com |
| searchterm site:gov.ru | Web pages hosted on gov.ru servers which mention the "search term" |
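The search terms in the table all follow one pattern: optional plain search terms, then one link: term per target, then an optional site: restriction. As a rough sketch, a small Python helper can assemble such query strings. The function name `upstream_query` and its parameters are hypothetical, and the helper only builds the text to type into the search box; it does not contact any search engine.

```python
def upstream_query(targets=(), site=None, terms=()):
    """Build a query string from plain terms, link: targets and a site: filter.

    targets: URLs that result pages must link to (one link: term each)
    site:    optional domain suffix such as "ru" or "gov.ru"
    terms:   optional plain search terms
    """
    parts = list(terms)
    parts += [f"link:{url}" for url in targets]
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


if __name__ == "__main__":
    # Pages linking to a single site
    print(upstream_query(["http://www.example.com"]))
    # Pages linking to both example sites
    print(upstream_query(["http://www.example1.com", "http://www.example2.com"]))
    # Pages on .ru servers linking to example.com
    print(upstream_query(["http://www.example.com"], site="ru"))
    # Pages on gov.ru servers mentioning a search term
    print(upstream_query(site="gov.ru", terms=["searchterm"]))
```

Keeping the targets as a list makes it easy to try different pairs of target sites and compare the results.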
| Another example of searching upstream: |
In the diagram below, you will see how your "Searching upstream"
results will vary depending on which pair of target sites you use:

You may also want to try ALEXA Traffic Rankings for similar information,
based on the traffic patterns of millions of users of the Alexa Toolbar.
| The importance of Searching upstream |

Searching upstream is not only a good idea, it is actually necessary
in order to reach almost half of all web pages known to AltaVista.
Detailed research paper on this: http://www9.org/w9cdrom/160/160.html
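To see why following links downstream cannot reach every page, consider a toy link graph in Python. The graph below is invented for illustration and does not reproduce any data from the paper. The page names and the helpers `reachable` and `reverse` are assumptions of this sketch. Following hyperlinks forward from the target finds only the downstream pages; the upstream pages show up only when the links are traversed backwards, which is what a link: search does.

```python
from collections import deque

# Toy link graph (hypothetical): page -> pages it links to.
links = {
    "a": ["target"],
    "b": ["a", "target"],
    "target": ["c", "d"],
    "c": ["d"],
    "d": [],
}


def reachable(start, graph):
    """Breadth-first search: every page reachable from start, excluding start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


def reverse(graph):
    """Flip every link so that edges point from a page to the pages linking to it."""
    rev = {page: [] for page in graph}
    for page, outs in graph.items():
        for out in outs:
            rev.setdefault(out, []).append(page)
    return rev


downstream = reachable("target", links)          # found by clicking links
upstream = reachable("target", reverse(links))   # found only by searching upstream

if __name__ == "__main__":
    print("downstream:", sorted(downstream))
    print("upstream:", sorted(upstream))
```

In this toy graph, pages "a" and "b" can never be reached by clicking outward from the target, no matter how many links you follow.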