A speculative theory of Baidu spiders and the snapshot problem


Following this reasoning, a slow snapshot can be tackled from a few angles. The first is to make the body spiders more diligent: raise the weight of the spiders that come back with a 200, so that they carry the advantage in the vote. The second is to reduce the number of monitored regions. With too many regions, most of the spiders have nothing to do, much like an overstaffed government department, so the page must be streamlined.

Spider group theory: Baidu crawls pages with spiders every day, and there is not one spider but a group. Within the group, each spider has its own job. I do not know the full division of labor, but it is known to split into crawling new pages and crawling old pages. Spiders from 123.125.*.* mostly crawl new pages, and spiders from 61.135.*.* mostly crawl old pages. Baidu has spiders on other IP segments as well, but on the site I observe, the Fuzhou modern obstetrics and gynecology hospital site, these are the two kinds I see. The home page undoubtedly gets the fastest snapshot updates; if it does not, the site may have been penalized (K'd). That is because the home page carries the highest weight, pulls in new content, and is the page a spider is most likely to find.

As just said, the spiders are a group, so no single spider stays on one website. Although Baidu spiders visit your site every day, the whole group does not stay on your site. The spiders that are present divide the page into monitoring regions: some are responsible for head, some for root, some for body, and within body different spiders do different things.

In other words, one spider is responsible for only a small region, and it comes only at certain times of the day. If it finds that you have updated, it comes back tomorrow; if not, it gets a 304. After too many 304s it reduces its crawl frequency. I am not clear on the exact frequency, but the theoretical model should be a sine curve.
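To check this kind of observation on your own site, one option is to tally spider visits from the server access log by IP segment and status code. The following is only a minimal sketch: the two prefixes come from the observation above and are not an official list, the labels and the function name `tally` are made up for illustration, and the regular expression assumes the common/combined access log format.

```python
import re
from collections import Counter

# Prefixes taken from the observation above; the labels are illustrative only.
SEGMENTS = {"123.125.": "new-page spider", "61.135.": "old-page spider"}

# Assumes common/combined log format: client IP first, status after the quoted request.
LINE_RE = re.compile(r'^(\d+\.\d+\.\d+\.\d+) .*?"[A-Z]+ \S+ [^"]*" (\d{3})\b')

def tally(lines):
    """Count hits per (spider label, HTTP status) for the known IP segments."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        ip, status = m.group(1), m.group(2)
        for prefix, label in SEGMENTS.items():
            if ip.startswith(prefix):
                counts[(label, status)] += 1
    return counts

sample = [
    '123.125.71.12 - - [10/Oct/2012:13:55:36 +0800] "GET /new.html HTTP/1.1" 200 2326',
    '61.135.190.5 - - [10/Oct/2012:14:02:11 +0800] "GET /old.html HTTP/1.1" 304 0',
    '8.8.8.8 - - [10/Oct/2012:14:05:00 +0800] "GET / HTTP/1.1" 200 512',
]
result = tally(sample)
```

A region that keeps showing 304 for the same spider over many days would, under the theory above, be one where that spider's visits are likely to thin out.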

Many novice webmasters have the same question: why do spiders crawl the site every day while the snapshot is not updated? Today Iceberg shares a speculative theory about spiders.

Likewise, all spiders follow the same rules. The spiders watching head, root and other regions are lazy; the body spiders are more diligent, though some of them are lazy too. If your site is updated every day, the updated body region returns 200 and everything else returns 304. So should Baidu give you a fresh snapshot or not? It puts the question to a vote, because the spiders are a group. Giving every spider a vote seems fair, but there is a problem: the spiders differ. A spider in the body region works hard while one in head has it relatively easy, so the votes need to be weighted: body spiders get a high weight, root spiders a low weight, and the others their own weights. The weighted result determines whether Baidu gives the site a fresh snapshot. This is one reason a website can be updated every day without its snapshot being updated.
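The weighted vote described above can be sketched in a few lines. This is a toy model of the theory, not anything the search engine is known to do: the weight values, the 0.5 threshold, and the function name `should_refresh_snapshot` are all assumptions chosen for illustration. The text only says that body weighs high and root weighs low.

```python
# Hypothetical weights: the text only states that body is high and root is low.
WEIGHTS = {"head": 1.0, "body": 3.0, "root": 0.5}

def should_refresh_snapshot(statuses, weights=WEIGHTS, threshold=0.5):
    """Weighted vote: a region votes 'yes' if its spider saw a 200 (changed content).

    statuses maps region name -> HTTP status seen by that region's spider.
    The threshold is an assumed cutoff on the weighted share of 'yes' votes.
    """
    total = sum(weights[region] for region in statuses)
    yes = sum(weights[region] for region, status in statuses.items() if status == 200)
    return total > 0 and yes / total > threshold

# Only body changed today; head and root returned 304.
daily = {"head": 304, "body": 200, "root": 304}

weighted = should_refresh_snapshot(daily)
# With equal weights the single 'yes' vote from body is outvoted.
unweighted = should_refresh_snapshot(daily, weights={"head": 1.0, "body": 1.0, "root": 1.0})
```

With the assumed weights, body's single 200 carries 3.0 of the 4.5 total and the vote passes. With equal weights it gets only one third of the vote and fails, which matches the case of a site that is updated daily but keeps its old snapshot.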

