The robots.txt file is cached for a short time (I'm not sure exactly
how long, but not long).
However, the GSA only checks the robots.txt file before it crawls a
host. So in the case where it did not seem to pick up the change when
you had only one URL, it's likely that the GSA had not recrawled that
URL and therefore did not find the changes. It probably would have
picked up the changes if you had requested a recrawl of that URL.
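
If it helps to see which URLs a robots.txt change actually opens up
(and so which ones are worth recrawling), here is a rough sketch using
Python's standard urllib.robotparser. It only evaluates the rules
locally; it says nothing about how the GSA caches or schedules
anything. The host name, the rule sets, and the "gsa-crawler" user
agent string are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that the given robots.txt text permits for user_agent."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]


# Hypothetical robots.txt before the admin's change: one file allowed.
BEFORE = """\
User-agent: *
Allow: /index.html
Disallow: /
"""

# Hypothetical robots.txt after the change: one more directory allowed.
AFTER = """\
User-agent: *
Allow: /index.html
Allow: /docs/
Disallow: /
"""

# Hypothetical URLs on the test server.
URLS = [
    "http://intranet.example.com/index.html",
    "http://intranet.example.com/docs/guide.html",
    "http://intranet.example.com/private/notes.html",
]

before = allowed_urls(BEFORE, "gsa-crawler", URLS)
after = allowed_urls(AFTER, "gsa-crawler", URLS)

# URLs that only became crawlable after the robots.txt change.
newly_allowed = [u for u in after if u not in before]

if __name__ == "__main__":
    for url in newly_allowed:
        print(url)
```

Note that Python's parser applies the first matching rule in file
order, which may not match how every crawler resolves overlapping
Allow and Disallow lines.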
Brian
On Dec 19, 7:35 am, fetacheese <hors... gmail.com> wrote:
> I am conducting a test for one of our server admins and I put only
> their server on one of our GSAs. They want to see how changes to
> their robots.txt affect the files that end up in the GSA index.
> I reset the index initially and had only their server in the start and
> follow crawl URL definitions.
>
> They got the expected results (one file indexed). Then they allowed
> more directories in their robots.txt, but I never saw them come up in
> the index until I reset the index again. Then they immediately showed
> up.
>
> If a robots.txt is changed, doesn't that get read again by the GSA, or
> is that cached for an amount of time? I know another server admin was
> able to take OUT some directories using robots.txt (this was on
> 4.6.4G70), but adding directories doesn't seem to work on 5.0.
>
> Has anyone had any experience with this?
> Thanks!