Thread: Google Mini not able to log in to wiki and crawl




Google Mini not able to log in to wiki and crawl
United States
2007-09-24 17:57:57
Hi,

I am trying to set up a Google Mini to crawl our wiki, and I am having two
problems.

-----------------------------------------------------------------
Problem 1
-----------------------------------------------------------------
In Crawler Access I have entered:

For URLs Matching Pattern, Use: http://wiki.mydomain.com/mediawiki/index.php/Special:Userlogin
Username: myusername
Password: mypassword
Confirm Password: mypassword

but the Mini does not seem to be able to log in.

-----------------------------------------------------------------
Problem 2
-----------------------------------------------------------------
Some pages do get crawled, but Crawl Diagnostics shows them as:

Excluded: On page with robots nofollow meta tag

Please help. How can I get this working?


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the
Google Groups "Google Search Appliance" group.
To post to this group, send email to
Google-Search-Appliance@googlegroups.com
To unsubscribe from this group, send email to
Google-Search-Appliance-unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/Google-Search-Appliance?hl=en
-~----------~----~----~----~------~----~------~--~---


Re: Google Mini not able to log in to wiki and crawl
United States
2007-09-24 18:42:35
To solve problem 2, you will need to look for the following string in
the MediaWiki PHP source files:

<meta name="robots" content="noindex,nofollow" />

You will need to change this string to one of the following, depending
on your indexing goals:

<meta name="robots" content="index,follow" />
<meta name="robots" content="noindex,follow" />
<meta name="robots" content="index,nofollow" />
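
To check which robots directives a given wiki page actually serves, before and after editing the PHP source, something like the following may help. It is only a minimal sketch using the Python standard library; the names robots_directives and RobotsMetaParser are made up for this example, and it reads only <meta name="robots"> tags, not robots.txt or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for item in attr.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_directives(html):
    """Return the set of robots directives declared in the page source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow" /></head></html>'
    found = robots_directives(page)
    print("nofollow" in found, "noindex" in found)
```

If "nofollow" is still in the returned set for a page, the crawler can be expected to keep reporting the same exclusion for links found on that page.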

Also, here are the GSA exclusion rules (Do Not Crawl URLs with the
Following Pattern) we use for MediaWiki:

#MediaWiki exclusion rules - begin
contains:=Special:
contains:Image:
contains:redirect=no
contains:=Template:
contains:&feed=
contains:action=edit
contains:action=history
contains:printable=yes
contains:&limit=
contains:&oldid=
contains:Userlogin&
contains:Recentchangeslinked
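
To preview which wiki URLs a rule list like this would drop, the rules can be approximated offline. This is a rough sketch that assumes "contains:" means a plain, case-sensitive substring match and that lines starting with "#" are comments; the appliance's real matcher may behave differently, and parse_rules and is_excluded are hypothetical helper names. The sample URLs reuse the placeholder host from the original question.

```python
EXCLUSION_RULES = """\
#MediaWiki exclusion rules - begin
contains:=Special:
contains:Image:
contains:redirect=no
contains:=Template:
contains:&feed=
contains:action=edit
contains:action=history
contains:printable=yes
contains:&limit=
contains:&oldid=
contains:Userlogin&
contains:Recentchangeslinked
"""


def parse_rules(text):
    """Return the substrings from every 'contains:' line, skipping comments."""
    needles = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("contains:"):
            needles.append(line[len("contains:"):])
    return needles


def is_excluded(url, needles):
    """True if the URL contains any of the rule substrings."""
    return any(needle in url for needle in needles)


if __name__ == "__main__":
    needles = parse_rules(EXCLUSION_RULES)
    base = "http://wiki.mydomain.com/mediawiki/index.php"
    for url in (
        base + "/Main_Page",
        base + "?title=Main_Page&action=edit",
        base + "?title=Special:Recentchanges",
    ):
        print(url, "excluded" if is_excluded(url, needles) else "crawled")
```

Note that under this substring reading, "contains:=Special:" only matches query-string style URLs (title=Special:...), not path style ones such as index.php/Special:Userlogin.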

As for problem 1, we do not log in to the wiki to index the content.

-tom





Re: Google Mini not able to log in to wiki and crawl
United States
2007-09-25 22:57:30
The "URLs Matching Pattern" field needs to contain the URL prefix that
all pages viewable only by an authenticated user start with. It cannot
contain just the login page. Also, the Special:Userlogin page of a
MediaWiki normally uses cookie-based authentication, while the Crawler
Access page handles HTTP Basic auth and NTLM. To crawl a page protected
by cookie-based authentication you will need to use Forms Authentication
or Cookie Sites on the GSA.
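
One way to tell which case applies is to look at how the server answers an unauthenticated request for a protected page: HTTP Basic and NTLM announce themselves with a 401 status and a WWW-Authenticate header, while a cookie-based login usually answers with a normal page or a redirect to the login form. The sketch below classifies a response on that basis. The function names are invented for this example, the headers are passed in as (name, value) pairs so nothing is fetched, and it only reads the first scheme of each WWW-Authenticate header.

```python
def http_auth_schemes(status, headers):
    """Return the lowercase auth schemes offered in a 401 response.

    headers is a list of (name, value) pairs. Only the first scheme in
    each WWW-Authenticate header is read, which is a simplification.
    """
    if status != 401:
        return set()
    schemes = set()
    for name, value in headers:
        if name.lower() == "www-authenticate" and value.strip():
            schemes.add(value.split()[0].lower())
    return schemes


def crawler_access_applies(status, headers):
    """True if the response asks for HTTP Basic or NTLM credentials."""
    return bool(http_auth_schemes(status, headers) & {"basic", "ntlm"})


if __name__ == "__main__":
    # A server protected by HTTP Basic auth.
    print(crawler_access_applies(401, [("WWW-Authenticate", 'Basic realm="wiki"')]))
    # A typical cookie-based login: no 401 challenge, just a session cookie.
    print(crawler_access_applies(200, [("Set-Cookie", "wiki_session=abc123")]))
```

If this returns False for the wiki's protected pages, the credentials entered under Crawler Access will presumably never be requested by the server, which would match the behaviour described in the original question.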

Thor.




