Controlling Mirago robots

If you would like to prevent Mirago from indexing your site, or to limit robot activity to certain areas, here are some possible mechanisms:

Meta tags

Mirago supports the use of "noindex" and/or "nofollow" meta tags.
To activate them, just include this tag in the head section of your page:

<META NAME="robots" CONTENT="noindex,nofollow">

N.B. The Mirago robot does not index keyword and description meta tags.

Robots exclusion standard

Mirago supports the Standard for Robot Exclusion, which specifies a format for robots.txt files. When placed in a server's root directory, this text file allows a webmaster to deny access to all robots or to certain robots, and to specify which areas of the site (if any) robots can index. Mirago checks the file periodically and modifies the permissions for the site accordingly.

The robots.txt file must be located at the root of a site. It will not be read from a subdirectory.

N.B. If a robots.txt file is not present, robots assume they can index the entire domain or subdomain, on the premise that you have 'published' the site on the Internet for general access. If you also operate subdomains, a robots.txt file should be present in the root directory of each one.

You can indicate to well-behaved robots such as Mirago that certain parts of your server should not be indexed by some or all robots. The following example illustrates the possible contents of a robots.txt file:

# robots.txt file for http://mywebsite.co.uk/

The first line, starting with '#', is a comment.

The next two lines specify that the Mirago robot has nothing disallowed. This means permission is granted to go anywhere on that site. These lines are optional, as a robot will assume it has permission to access your site if it is not explicitly excluded.

The next two lines indicate that the robot called 'naughtyrobot' has all relative URLs starting with '/' disallowed. As all relative URLs on a server start with '/', this means the entire site should not be accessed by that robot.

The third paragraph indicates that all other robots should not visit URLs starting with /stay_out or /devproject. Note that the '*' is a special token meaning 'all robots' and is not a regular expression.
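The file described above can be reconstructed from the prose and checked with Python's standard `urllib.robotparser` module. This is only a minimal sketch: the user-agent token `Mirago` is an assumption (the exact token the Mirago robot matches is not shown here), `otherbot` and the test URLs are hypothetical, and Python's parser may not match Mirago's own behaviour in every detail.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the prose description above. The token "Mirago" is a
# placeholder assumption for the Mirago robot's user-agent name.
ROBOTS_TXT = """\
# robots.txt file for http://mywebsite.co.uk/

User-agent: Mirago
Disallow:

User-agent: naughtyrobot
Disallow: /

User-agent: *
Disallow: /stay_out
Disallow: /devproject
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "http://mywebsite.co.uk"  # hypothetical site from the example comment

# An empty Disallow means nothing is off limits for this robot.
print(parser.can_fetch("Mirago", base + "/devproject/index.htm"))  # True
# "Disallow: /" blocks every URL on the site for this robot.
print(parser.can_fetch("naughtyrobot", base + "/index.htm"))       # False
# Any other robot falls through to the '*' record.
print(parser.can_fetch("otherbot", base + "/stay_out/page.htm"))   # False
print(parser.can_fetch("otherbot", base + "/index.htm"))           # True
```

Running a check like this against a draft file before uploading it is a quick way to confirm that each record blocks only what was intended.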
Instead of

For more complex access restrictions, we support the use of multiple user-agents and the

For example:
User-agent: robot1

In this case robot2, robot3 and robot4 all behave identically. The

Where * can be used to identify collections of entries (e.g. /devproject/client*.htm). Multiple *'s may be included in any line.

N.B. Mirago must be specified as the

Password-protecting parts of your site

Mirago robots use similar protocols to a browser. They have no mysterious access system, so documents in an area protected by password authentication cannot be visited by Mirago.

Removing your site from the Mirago index

We hope that inclusion in the Mirago index brings more visitors to your site, but we will of course remove your site's entry upon request. To request removal, email remove@mirago.co.uk.

© 2008 - Mirago