WEB PAGE CLASSIFICATION BASED ON NOISE REMOVAL

United States of America Patent

APP PUB NO 20180025012A1
SERIAL NO 15214245

Abstract

Systems and methods for improving the accuracy of web content classification by removing perceived noise are provided. The system receives a Uniform Resource Locator (URL) of a web page that needs to be classified and parses the web page to construct a tree containing a list of tags. Unwanted tags are removed from the list of tags to yield a tree containing only the desired tags that form part of the web page. Subsequently, a list of hyperlinks is obtained by processing the tree of desired tags; this list can include both valid hyperlinks and unwanted (undesired or invalid) hyperlinks. The unwanted hyperlinks are removed from the list, each valid hyperlink is categorized based on a list of categories, and a final category for the web page is determined based on a vector analysis of the categories assigned to the valid hyperlinks.
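
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the set of noise tags (`NOISE_TAGS`), the keyword table used to categorize a hyperlink (`CATEGORY_KEYWORDS`), the rule that only absolute http(s) links are valid, and the reading of "vector analysis" as a per-category count vector whose largest component wins. The sketch also takes the page's HTML as a string instead of fetching a URL, and it skips noise subtrees during a streaming parse rather than building and pruning an explicit tree.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: the abstract does not say which tags are "unwanted".
NOISE_TAGS = {"script", "style", "nav", "footer", "aside"}

# Assumption: a toy keyword table stands in for a real link-categorization step.
CATEGORY_KEYWORDS = {
    "sports": ("sport", "football"),
    "news": ("news", "politics"),
}


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags that are not nested inside noise tags."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0  # > 0 while inside at least one noise tag
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1
        elif tag == "a" and self.noise_depth == 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1


def is_valid_link(href):
    """Assumed validity rule: absolute http(s) link with a host."""
    parts = urlparse(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def categorize_link(href):
    """Return the first category whose keyword occurs in the link, else 'unknown'."""
    lowered = href.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "unknown"


def classify_page(html):
    """Pick the category with the most valid, categorized hyperlinks."""
    collector = LinkCollector()
    collector.feed(html)
    valid_links = [h for h in collector.links if is_valid_link(h)]
    votes = Counter(categorize_link(h) for h in valid_links)  # category count vector
    votes.pop("unknown", None)
    if not votes:
        return "unknown"
    return votes.most_common(1)[0][0]
```

Calling `classify_page` on a page whose navigation bar links mostly to news sections but whose body links to sports pages would, under these assumptions, ignore the navigation links and let the body links decide the category.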

Patent Owner(s)

Patent Owner: FORTINET INC
Address: 899 KIFER RD, SUNNYVALE, CA 94086

Inventor(s)

Inventor Name | Address | # of Filed Patents | Total Citations
Cao, Xiping | Coquitlam, CA | 3 | 22
Ma, Ye | Burnaby, CA | 11 | 37
