We recommend providing full log files if possible. Full logs let us diagnose indexing issues: for instance, if some URLs are not indexed, we can determine whether they’ve ever been crawled. We can also report data such as how long it takes for a site to be completely recrawled. If the site is load balanced, logs from all servers help us find misconfigurations on a single server that might not otherwise be evident. If only sampled logs are available, Blueprint can still provide substantial information, such as canonicalization issues, server errors, and problems with redirects (for instance, 302s used instead of 301s).
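To make the kind of check described here concrete, the following is a minimal sketch of how crawler activity can be read out of a log file. It is not Blueprint's implementation. It assumes the logs use the Apache/Nginx "combined" format and that crawler requests can be recognized by a token in the user-agent string ("Googlebot" is used as an example); the function name `summarize` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: logs are in the Apache/Nginx "combined" format.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines, crawler_token="Googlebot"):
    """Return (paths requested by the crawler, status counts for those requests).

    Hypothetical helper: crawler requests are identified only by a
    user-agent substring, which is an illustrative simplification.
    """
    crawled = set()
    statuses = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if crawler_token not in match.group("agent"):
            continue  # skip requests from other clients
        crawled.add(match.group("path"))
        statuses[match.group("status")] += 1
    return crawled, statuses


# Made-up sample lines, for illustration only.
sample = [
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2023:13:56:01 +0000] "GET /old HTTP/1.1" 302 0 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2023:13:57:12 +0000] "GET /b HTTP/1.1" 200 1024 '
    '"-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

crawled, statuses = summarize(sample)
print(sorted(crawled))   # paths the crawler requested
print(statuses["302"])   # temporary redirects served to the crawler
```

With full logs, a URL that never appears in `crawled` has never been requested by that crawler; with sampled logs, absence is inconclusive, but the status counts can still surface redirect and server-error problems.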

