Web Page Segmentation Using Combined VIPS

The World Wide Web (WWW) is a huge, widely distributed global information service. A web page usually contains varied content, only some of which is relevant to the page's main topic. Retrieving the useful or relevant information from this mass of information has become the focus of information extraction research. Web page segmentation is an important technology for web-driven applications such as search engines and web browsers, but it has to cope with a large amount of irrelevant information. This paper proposes a simple web page segmentation method and a main content extraction system that combine the DOM and the visual layout of web pages.

Keywords: Web Page Segmentation, Vision-based Page Segmentation (VIPS) algorithm, DOM.