METHOD AND SYSTEM FOR CRAWLING THE WORLD WIDE WEB

United States of America Patent

APP PUB NO 20090287641A1
SERIAL NO 12119651

Abstract

A method and system for crawling the World Wide Web is described. One embodiment avoids becoming bogged down by dynamically generated Uniform Resource Locators (URLs) pointing to Web pages having the same or substantially similar content (e.g., URLs generated by a “spam poison” Web site) by browsing automatically and systematically Web pages within a first domain of the World Wide Web, each Web page having its own content; determining that the content of a currently visited Web page is the same as that of a predetermined number of other Web pages that have already been visited; and ceasing to browse the first domain and instead browsing a second domain of the World Wide Web different from the first domain in response to determining that the content of the currently visited Web page is the same as that of a predetermined number of other Web pages that have already been visited.
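
The behavior described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `crawl_domains`, the `duplicate_limit` parameter, and the injected `fetch` callable (assumed to return a page's content together with its same-domain links) are all hypothetical. Content is compared by exact SHA-256 digest, so this sketch detects only identical pages, not "substantially similar" ones.

```python
import hashlib
from collections import Counter, deque
from typing import Callable, Iterable, List, Tuple

# Hypothetical fetch signature: url -> (page content, same-domain links).
Fetch = Callable[[str], Tuple[str, List[str]]]


def crawl_domains(seeds: Iterable[str], fetch: Fetch,
                  duplicate_limit: int = 3) -> List[str]:
    """Crawl each seed's domain in turn and return the URLs visited.

    A domain is abandoned as soon as the current page's content matches
    that of `duplicate_limit` pages already visited in that domain.
    """
    visited: List[str] = []
    for seed in seeds:
        seen: Counter = Counter()  # content digest -> pages seen with it
        queue = deque([seed])
        queued = {seed}
        while queue:
            url = queue.popleft()
            content, links = fetch(url)
            visited.append(url)
            digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if seen[digest] >= duplicate_limit:
                break  # probable URL trap: cease browsing this domain
            seen[digest] += 1
            for link in links:
                if link not in queued:
                    queued.add(link)
                    queue.append(link)
    return visited


def fake_fetch(url: str) -> Tuple[str, List[str]]:
    """Stand-in for a real HTTP fetch (assumption for illustration).

    trap.example serves the same page under an endless chain of URLs;
    good.example has two distinct pages.
    """
    domain, _, page = url.partition("/")
    if domain == "trap.example":
        return "same spam page", [f"trap.example/{int(page) + 1}"]
    if url == "good.example/0":
        return "home", ["good.example/1"]
    return "about", []


if __name__ == "__main__":
    print(crawl_domains(["trap.example/0", "good.example/0"], fake_fetch))
```

With `duplicate_limit=3`, the crawler in this sketch leaves `trap.example` on the fourth identical page and continues with `good.example`, rather than following the generated URLs indefinitely.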

Patent Owner(s)

Patent Owner: WEBROOT SOFTWARE INC
Address: 2560 55TH STREET BOULDER CO 80301

Inventor(s)

Inventor Name    Address        # of Filed Patents    Total Citations
Rahm, Eric       Boulder, US    1                     54
