When it comes to the user's search experience, timeliness is more intuitive than coverage. For example, what would you think if you clicked a search result and the page no longer existed? Search engines try hard to avoid this, so the timeliness of the pages a spider crawls is also an important assessment criterion. Because the Internet holds so much information, a spider needs a long time to complete one round of crawling. During that time, many pages that were already indexed may have changed or been deleted, which leaves part of the data in the search results out of date.
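The staleness described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `stale_pages`, the page names, and the day numbers are all hypothetical, and a real search engine would detect changes in far more elaborate ways.

```python
def stale_pages(crawled_at, last_modified):
    """Return indexed URLs whose stored copy no longer matches the live web.

    crawled_at:    URL -> day on which the spider last crawled the page
    last_modified: URL -> day on which the live page last changed;
                   a URL missing from this dict is treated as deleted
    """
    stale = []
    for url, crawl_day in crawled_at.items():
        modified_day = last_modified.get(url)
        # Stale if the page was deleted or changed after the last crawl.
        if modified_day is None or modified_day > crawl_day:
            stale.append(url)
    return stale


# Hypothetical index: three pages crawled during one long crawl round.
crawled_at = {"news": 1, "about": 2, "old-post": 3}
# Hypothetical live web: "news" changed on day 5, "old-post" was deleted.
last_modified = {"news": 5, "about": 0}

stale = stale_pages(crawled_at, last_modified)
print(stale)  # ['news', 'old-post']
```

The longer a crawl round takes, the more pages fall into the stale list before the spider returns to them.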
Search engine spiders are the search engine's source of information. Webmasters always want their sites to be spider-friendly and hope that spiders will stay longer on their sites and crawl more pages. In fact, spiders would also like to crawl more pages and pick up more page updates, but the volume of information on the Internet is so large that spiders sometimes cannot keep up. This is why search engine spiders are assessed: they work hard every day, and their work needs evaluating too. There are three main assessment criteria: crawl coverage, crawl timeliness, and the importance of the crawled pages.
Crawl coverage is therefore a key criterion for assessing search engine spiders. It is the base on which everything else rests: it determines how many pages can later be indexed, ranked, and displayed, and it is essential to the user's search experience.
Crawl coverage is the proportion of all web pages on the Internet that the spider has crawled. Clearly, the higher the coverage, the larger the pool of pages the search engine can index and rank, the more results can compete for display, and the better the user's search experience. So, to give users more accurate and more comprehensive search results, crawl coverage is essential. Besides improving crawling methods, crawling dark web data has become an important research direction for the major search engines.
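As a rough sketch, the coverage ratio defined above can be computed as follows. The function name `crawl_coverage` and the example URLs are hypothetical, and the sketch assumes the full set of pages is known, which is never true for the real Internet, where the total can only be estimated.

```python
def crawl_coverage(crawled_urls, all_urls):
    """Return the fraction of all known pages that the spider has crawled."""
    all_pages = set(all_urls)
    if not all_pages:
        raise ValueError("all_urls must contain at least one page")
    # Count only crawled pages that actually belong to the known set.
    crawled = set(crawled_urls) & all_pages
    return len(crawled) / len(all_pages)


# Hypothetical toy web of four pages, three of which were crawled.
web = ["a.example/1", "a.example/2", "b.example/1", "b.example/2"]
crawled = ["a.example/1", "b.example/1", "b.example/2"]

coverage = crawl_coverage(crawled, web)
print(f"{coverage:.0%}")  # 75%
```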
Crawl coverage
No current search engine can crawl every web page on the Internet; each one can index only part of it. This is where the concept of the "dark web" comes in (in English this is more commonly called the deep web): it refers to Internet pages that search engine spiders find hard to crawl by conventional means. Spiders rely on the links in pages to discover new pages and then crawl them, but many pages are stored in databases. Spiders find this information difficult or impossible to crawl, so users cannot find it through a search engine either.
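The link-following behaviour described above can be sketched as a breadth-first traversal over a link graph. This is a simplified illustration, not how any particular search engine works: the function `discover`, the in-memory `links` dictionary, and the page names are all hypothetical stand-ins for fetching and parsing real pages.

```python
from collections import deque


def discover(seed, links):
    """Return every page reachable from `seed` by following links."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each key is a page, each value lists the pages it links to.
links = {
    "home": ["about", "products"],
    "about": [],
    "products": ["product-1"],
    "product-1": ["home"],
    # Database records served only through a search form: no page links to them.
    "record-1001": [],
    "record-1002": ["home"],
}

reachable = discover("home", links)
hidden = set(links) - reachable
print(sorted(hidden))  # ['record-1001', 'record-1002']
```

The two record pages exist on the site, but because no crawled page links to them, a spider that only follows links never finds them.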
Crawl timeliness
In short, the spider cannot reflect a page's changes in the page library the moment the page changes. This raises problems. First, for example, when only the content of a page has changed, the search engine cannot re-compare it in a timely manner