Tutorial: Configuring the RobotsTxt file
You can control the access of visiting web robots by configuring the robots.txt file, which usually exists at the root level of your web server. Web robots are programs that crawl the web, retrieving content from the sites that they visit and indexing it so that search engines perform better. You can also specify separate rules for different robots.
Why would I want to edit Drupal's pre-existing robots.txt file?
Malicious robots might choose not to honor the robots.txt file, and by editing this file you are broadcasting which sites you do not want others to see. Therefore, you should not use this file to hide sensitive data. Instead, you might want to edit your robots.txt file to do any of the following (a sample file follows this list):
- Prevent duplicate information from being identified on your site
- Prevent internal pages from appearing in search engines
- Prevent private pages from appearing in search engines
- Prevent particular images, files, and so on, from being crawled
- Specify a crawl-delay attribute to prevent robots from overloading your server at load time
- Exclude a particular robot
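For example, a robots.txt file that applies several of these rules might look like the following sketch. The paths and the crawl-delay value are illustrative only; substitute values that match your own site.

User-agent: *
Disallow: /internal/
Disallow: /private/
Crawl-delay: 10

The User-agent: * line means that the record applies to all robots, each Disallow line names a path that robots should not crawl, and Crawl-delay asks robots to wait the specified number of seconds between requests. Excluding a particular robot is covered in the steps that follow.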
Before you begin
You must have a Developer Portal enabled, and you must have administrator access to complete this tutorial.
About this tutorial
You will edit the pre-existing robots.txt file to exclude a visiting robot called BadBot from accessing your site.
- Log in to your Developer Portal as an administrator.
- Navigate to the robots.txt configuration page in the configuration settings.
- In the Contents of robots.txt section, enter the policy to exclude access to a robot called BadBot:
  User-agent: BadBot
  Disallow: /
- Click Save Configuration to save your changes.
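If your robots.txt file already contains a record for all robots (User-agent: *), keep the BadBot record separate from it; records are separated by blank lines, and a robot follows the record whose User-agent line matches it most specifically. A combined file might look like the following sketch, where the Disallow path in the first record is illustrative only.

User-agent: *
Disallow: /admin/

User-agent: BadBot
Disallow: /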
What you did in this tutorial
You have now successfully customized the robots.txt file. Robots use this updated file to decide where they can crawl on your site, and the BadBot robot is denied access.
You can check whether your robots.txt file was changed successfully by navigating to your site and appending /robots.txt to the URL. You should see the content that you entered into the file.
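You can also verify the change programmatically. The following sketch uses Python's standard urllib.robotparser module to confirm that BadBot is blocked while other robots are still allowed. The URL https://example.com is a placeholder for your Developer Portal site, and FriendlyBot is just an arbitrary name used for comparison.

from urllib.robotparser import RobotFileParser

# Point the parser at your site's robots.txt file (placeholder URL).
parser = RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()

# BadBot should be denied everywhere; other robots remain allowed unless
# another rule in the file disallows them.
print(parser.can_fetch("BadBot", "https://example.com/"))       # expected: False
print(parser.can_fetch("FriendlyBot", "https://example.com/"))  # expected: True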
For more information on how to edit your robots.txt file, see https://www.robotstxt.org/.
What to do next
You can edit the robots.txt file at any time by navigating back to the page within the configuration settings. You might choose to duplicate this file across all of your sites, or choose different policies for different sites.