Thread: ypbind hangs as of current from midday yesterday (kern+user)




ypbind hangs as of current from midday yesterday (kern+user)
2007-06-25 22:08:37
Hi all,

Since upgrading to yesterday's -current, my NIS server (a Xen domU) won't
start ypbind.

I have tried reinitializing the yp database and files, with no change. All
the other NIS-related processes are starting fine. NFS is working better
than ever.

What can I do to provide more details? I've not had to debug a process
before, so any information that would enable me to help would be
appreciated.

Sarton

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-06-26 11:17:26
On Tue, Jun 26, 2007 at 01:08:37PM +1000, Sarton O'Brien wrote:
> Hi all,
> 
> Since upgrading to yesterdays current my NIS server (xen domu) won't start
> ypbind.
> 
> I have tried reinitializing the yp database and files with no change. All the
> other NIS relevant processes are starting fine. NFS is working better than
> ever.
> 
> What can I do to provide more details? I've not had to debug a process before
> so any information that can enable me to help would be appreciated.

Can you ping your ypbind client when this happens?
I would first start with tcpdump on both lo0 and the network interface ...

-- 
Manuel Bouyer <bouyerantioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-06-26 18:54:22
On Wed, 27 Jun 2007 02:17:26 am Manuel Bouyer wrote:
> On Tue, Jun 26, 2007 at 01:08:37PM +1000, Sarton O'Brien wrote:
> > Since upgrading to yesterdays current my NIS server (xen domu) won't
> > start ypbind.
> >
> > I have tried reinitializing the yp database and files with no change. All
> > the other NIS relevant processes are starting fine. NFS is working better
> > than ever.
> >
> > What can I do to provide more details? I've not had to debug a process
> > before so any information that can enable me to help would be
> > appreciated.
>
> Can you ping your ypbind client when this happens ?
> I would first start with tcpdump on both lo0 and the network interface ...

It is the server I am trying to start ypbind on. When initiating ypbind via
an ssh session, I can ping the server and the ssh session I am in is fine,
but I can't ssh in from another console. As soon as I ^C ypbind, the ssh
client in the other console logs in.

Loopback is seeing:

09:44:50.049092 IP (tos 0x0, ttl  64, id 21366, offset 0, flags [none], length: 164, bad cksum 0 (->28d1)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:50.049290 IP (tos 0x0, ttl  64, id 21367, offset 0, flags [none], length: 148, bad cksum 0 (->28e0)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:50.049429 IP (tos 0x0, ttl  64, id 21368, offset 0, flags [none], length: 56, bad cksum 0 (->293b)!) localhost.1020 > localhost.exp1: [bad udp cksum b3df!] UDP, length: 28
09:44:50.049453 IP (tos 0x0, ttl  64, id 21369, offset 0, flags [none], length: 164, bad cksum 0 (->28ce)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:50.049591 IP (tos 0x0, ttl  64, id 21370, offset 0, flags [none], length: 148, bad cksum 0 (->28dd)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:50.049716 IP (tos 0x0, ttl  64, id 21371, offset 0, flags [none], length: 56, bad cksum 0 (->2938)!) localhost.1020 > localhost.exp1: [bad udp cksum 73df!] UDP, length: 28
09:44:56.099756 IP (tos 0x0, ttl  64, id 21488, offset 0, flags [none], length: 164, bad cksum 0 (->2857)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:56.099934 IP (tos 0x0, ttl  64, id 21489, offset 0, flags [none], length: 148, bad cksum 0 (->2866)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:56.100066 IP (tos 0x0, ttl  64, id 21490, offset 0, flags [none], length: 56, bad cksum 0 (->28c1)!) localhost.1020 > localhost.exp1: [bad udp cksum 33df!] UDP, length: 28
09:44:56.100089 IP (tos 0x0, ttl  64, id 21491, offset 0, flags [none], length: 164, bad cksum 0 (->2854)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:44:56.100221 IP (tos 0x0, ttl  64, id 21492, offset 0, flags [none], length: 148, bad cksum 0 (->2863)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:44:56.100320 IP (tos 0x0, ttl  64, id 21493, offset 0, flags [none], length: 56, bad cksum 0 (->28be)!) localhost.1020 > localhost.exp1: [bad udp cksum f3de!] UDP, length: 28
09:45:02.159762 IP (tos 0x0, ttl  64, id 22181, offset 0, flags [none], length: 164, bad cksum 0 (->25a2)!) localhost.65458 > localhost.sunrpc: UDP, length: 136
09:45:02.159939 IP (tos 0x0, ttl  64, id 22182, offset 0, flags [none], length: 148, bad cksum 0 (->25b1)!) localhost.exp1 > localhost.1020: UDP, length: 120
09:45:02.160071 IP (tos 0x0, ttl  64, id 22183, offset 0, flags [none], length: 56, bad cksum 0 (->260c)!) localhost.1020 > localhost.exp1: [bad udp cksum b3de!] UDP, length: 28

And the daily output is telling me this in dom0 (the daily output from the
domU is fine):

network:
netstat: kvm_read: Bad address
Name            Ipkts  Ierrs        Opkts  Oerrs  Colls

I'm getting the impression it's network card related.

A bit more info:

uname -a&&pkg_info|grep xen
NetBSD gogeta.internal 4.99.21 NetBSD 4.99.21 (XEN3_DOM0) #4: Mon Jun 25 05:04:37 EST 2007  root@spike.internal:/usr/obj/sys/arch/i386/compile/XEN3_DOM0 i386
xenkernel3-3.1.0    Xen3 Kernel
xentools3-3.1.0     Userland Tools for Xen

ifconfig -a
bge0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        capabilities=3f80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
        enabled=0
        address: 00:13:72:18:02:ad
        media: Ethernet autoselect (100baseTX full-duplex,flowcontrol,rxpause,txpause)
        status: active
        inet 192.168.210.10 netmask 0xffffff00 broadcast 192.168.210.255
        inet6 fe80::213:72ff:fe18:2ad%bge0 prefixlen 64 scopeid 0x1

I could start poking but this probably looks obvious to someone else. Should
I disable hardware csums?

Thanks,

Sarton

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-06-27 04:08:06
On Wed, Jun 27, 2007 at 09:54:22AM +1000, Sarton O'Brien wrote:
> 
> It is the server I am trying to start ypbind on. When initiating ypbind via an
> ssh session, I can ping the server and the ssh session I am in is fine but I
> can't ssh in from another console. As soon as I ^C the ssh client in the
> other console logs in.

It's probably because it's waiting on ypbind. Once ypbind is killed,
nsswitch only uses local files again.


> 
> Loopback is seeing:
> 
> 09:44:50.049092 IP (tos 0x0, ttl  64, id 21366, offset 0, flags [none],
> length: 164, bad cksum 0 (->28d1)!) localhost.65458 > localhost.sunrpc: UDP,
> [...]

It's normal on loopback, I think; checksums are not computed on loopback.
You can try:
sysctl -w net.inet.ip.do_loopback_cksum=1
sysctl -w net.inet.tcp.do_loopback_cksum=1
sysctl -w net.inet.udp.do_loopback_cksum=1
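
Before flipping these, it may be worth noting the current values so they can
be put back afterwards. A minimal sketch: show_loopback_cksum is just an
illustrative helper name (not an existing tool), and the three sysctl names
are the ones listed above.

```shell
# Print the current loopback checksum settings named in this thread.
# show_loopback_cksum is a hypothetical helper name used for illustration.
# On systems without these sysctl nodes it says so instead of failing.
show_loopback_cksum() {
    for proto in ip tcp udp; do
        name="net.inet.${proto}.do_loopback_cksum"
        sysctl "$name" 2>/dev/null || echo "$name: not available here"
    done
}

show_loopback_cksum
```

Running it once before and once after the sysctl -w commands shows whether the
change actually took effect.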

> And with daily output telling me this in dom0 (daily from domu is fine):
> 
> network:
> netstat: kvm_read: Bad address
> Name            Ipkts  Ierrs        Opkts  Oerrs  Colls
> 
> I'm getting the impression it's network card related.

Maybe a kernel/userland mismatch?

-- 
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyerlip6.fr
     NetBSD: 26 ans d'experience feront toujours la difference
--

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-06-27 20:33:51
On Wed, 27 Jun 2007 07:08:06 pm Manuel Bouyer wrote:
> > As soon as I ^C the ssh client in the other console logs in.
>
> It's probably because it's waiting on ypbind. Once ypbind is killed
> nsswitch only uses local files again.

Does the ordering matter? I have files first, then nis. I found this
behaviour strange as it _is_ the server and the auth files reside there.

> > 09:44:50.049092 IP (tos 0x0, ttl  64, id 21366, offset 0, flags [none],
> > length: 164, bad cksum 0 (->28d1)!) localhost.65458 > localhost.sunrpc:
> > UDP, [...]
> it's normal on loopback I think, checksums are not computed on loopback.
> You can try:
> sysctl -w net.inet.ip.do_loopback_cksum=1
> sysctl -w net.inet.tcp.do_loopback_cksum=1
> sysctl -w net.inet.udp.do_loopback_cksum=1

Yeah, no difference. I'm guessing this probably isn't related.

> > network:
> > netstat: kvm_read: Bad address
> > Name            Ipkts  Ierrs        Opkts  Oerrs  Colls
> >
> > I'm getting the impression it's network card related.
>
> Maybe kernel/userland mismatch ?

I blew away the obj dir and recompiled with sources from last night, and
netstat is now working, but ypbind is still hanging.

Is ypbind supposed to hang indefinitely? Does anyone actually have a current
system running NIS?

I was so stoked to see NFS working properly again; figures NIS would break
on me.

Thanks,

Sarton

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-06-27 22:49:22
On Thu, 28 Jun 2007 12:29:34 pm Christos Zoulas wrote:
> Did you post a ktrace of it?

Here goes ... first time for everything. If I've done something wrong or
could do it better or properly, let me know.

Below is the output. This repeats indefinitely:


  1447      1 ypbind   RET   read 500/0x1f4
  1447      1 ypbind   CALL  read(9,0x8068000,0x4000)
  1447      1 ypbind   GIO   fd 9 read 0 bytes
       ""
  1447      1 ypbind   RET   read 0
  1447      1 ypbind   CALL  close(9)
  1447      1 ypbind   RET   close 0
  1447      1 ypbind   CALL  sendto(6,0xbfbfd4b0,0x88,0,0xbfbfde88,0x10)
  1447      1 ypbind   GIO   fd 6 wrote 136 bytes
      
"^B^AM^F240^B
^E^ADFM^C^Y
        
M-\0^Nspike.internalb
0^B^C
        
^D^E^T^_^B
0^AM^FM-$^B
        ^Bfbinternal"
  1447      1 ypbind   RET   sendto 136/0x88
  1447      1 ypbind   CALL  read(8,0x8064000,0x4000)
  1447      1 ypbind   GIO   fd 8 read 0 bytes
       ""
  1447      1 ypbind   RET   read 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  select(8,0xbfbfdfb0,0,0,0xbfbfdfe8)
  1447      1 ypbind   RET   select 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  select(8,0xbfbfdfb0,0,0,0xbfbfdfe8)
  1447      1 ypbind   RET   select 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  select(8,0xbfbfdfb0,0,0,0xbfbfdfe8)
  1447      1 ypbind   RET   select 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  select(8,0xbfbfdfb0,0,0,0xbfbfdfe8)
  1447      1 ypbind   RET   select 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  select(8,0xbfbfdfb0,0,0,0xbfbfdfe8)
  1447      1 ypbind   RET   select 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  select(8,0xbfbfdfb0,0,0,0xbfbfdfe8)
  1447      1 ypbind   RET   select 0
  1447      1 ypbind   CALL  gettimeofday(0xbfbfcbb8,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  __sysctl(0xbfbfca48,2,0xbfbfcac7,0xbfbfca50,0,0)
  1447      1 ypbind   RET   __sysctl 0
  1447      1 ypbind   CALL  geteuid
  1447      1 ypbind   RET   geteuid 0
  1447      1 ypbind   CALL  getegid
  1447      1 ypbind   RET   getegid 0
  1447      1 ypbind   CALL  getgroups(0x10,0xbfbfca84)
  1447      1 ypbind   RET   getgroups 8
  1447      1 ypbind   CALL  gettimeofday(0xbfbfc8a0,0)
  1447      1 ypbind   RET   gettimeofday 0
  1447      1 ypbind   CALL  lseek(8,0,0,0,0)
  1447      1 ypbind   RET   lseek 0
  1447      1 ypbind   CALL  read(8,0x8064000,0x4000)
  1447      1 ypbind   GIO   fd 8 read 15 bytes
       "spike.internal
       "
  1447      1 ypbind   RET   read 15/0xf
  1447      1 ypbind   CALL  __stat30(0xbbb9d6a0,0xbfbfca68)
  1447      1 ypbind   NAMI  "/etc/nsswitch.conf"
  1447      1 ypbind   RET   __stat30 0
  1447      1 ypbind   CALL  open(0xbbb9b9e3,0,0x1b6)
  1447      1 ypbind   NAMI  "/etc/hosts"
  1447      1 ypbind   RET   open 9
  1447      1 ypbind   CALL  __fstat30(9,0xbfbfc908)
  1447      1 ypbind   RET   __fstat30 0
  1447      1 ypbind   CALL  read(9,0x8068000,0x4000)
  1447      1 ypbind   GIO   fd 9 read 500 bytes
       "::1                     localhost       localhost.internal
        127.0.0.1               localhost       localhost.internal

        # Entries to allow offline-DNS NFS mounts
        192.168.8.10          gogeta          gogeta.internal
        192.168.8.9           symbiote        symbiote.internal
        192.168.8.8           spike           spike.internal
        192.168.8.7           sammy           sammy.internal
        192.168.8.20          cre             cre.internal
        192.168.8.6           babylon         babylon.internal
        192.168.8.102         portal-wired    portal-wired.internal

        192.168.8.123         portal          portal.internal

        # NIS update hack (strips the last char)
        # 192.168.8.8         spike           spike.interna
       "
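
For a trace that repeats like this, a per-syscall count can make the loop
easier to see than scrolling through the raw dump. The following is only a
sketch: count_syscalls is a made-up helper name, and it assumes unwrapped
kdump output in the column layout shown above (pid, lwp, command, record type,
then the call).

```shell
# Count system calls in kdump output read from stdin, most frequent first.
# count_syscalls is a hypothetical helper; it assumes the record type
# (CALL, RET, GIO, ...) is the 4th whitespace-separated column.
count_syscalls() {
    awk '$4 == "CALL" {
             name = $5
             sub(/\(.*/, "", name)   # drop the argument list, if any
             n[name]++
         }
         END { for (k in n) print n[k], k }' | sort -rn
}
```

With a trace saved in ktrace.out (the file kdump reads by default), it could
be used as: kdump | count_syscalls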

I hope this helps ...

Sarton

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-06-29 12:29:56
On Thu, Jun 28, 2007 at 01:49:22PM +1000, Sarton O'Brien wrote:
> On Thu, 28 Jun 2007 12:29:34 pm Christos Zoulas wrote:
> > Did you post a ktrace of it?
> 
> Below is the output. This repeats indefinitely:
> 
>   1447      1 ypbind   RET   read 500/0x1f4
>   1447      1 ypbind   CALL  read(9,0x8068000,0x4000)
>   1447      1 ypbind   GIO   fd 9 read 0 bytes
>        ""
>   1447      1 ypbind   RET   read 0
>   1447      1 ypbind   CALL  close(9)
>   1447      1 ypbind   RET   close 0
>   1447      1 ypbind   CALL  sendto(6,0xbfbfd4b0,0x88,0,0xbfbfde88,0x10)
>   1447      1 ypbind   GIO   fd 6 wrote 136 bytes
>       "^B^AM^F240^B
> ^E^ADFM^C^Y
> M-\0^Nspike.internalb
> 0^B^C
> ^D^E^T^_^B
> 0^AM^FM-$^B
>         ^Bfbinternal"
>   1447      1 ypbind   RET   sendto 136/0x88

I did wonder if I'd managed to break sendto() - not much actually uses it.
But traceroute(8) does, and seems to work on my system running a very
recent kernel.  But I'm not inside xen....

Unfortunately ktrace doesn't get the 'sockaddr' ...

	David

-- 
David Laight: davidl8s.co.uk

Re: ypbind hangs as of current from midday yesterday (kern+user)
2007-07-01 14:48:57
On Thu, Jun 28, 2007 at 11:33:51AM +1000, Sarton O'Brien wrote:
> On Wed, 27 Jun 2007 07:08:06 pm Manuel Bouyer wrote:
> > > As soon as I ^C the ssh client in the other console logs in.
> >
> > It's probably because it's waiting on ypbind. Once ypbind is killed
> > nsswitch only uses local files again.
> 
> Does the ordering matter? I have files first then nis. I found this behaviour
> strange as it _is_ the server and the auth files reside there.

At login, the system needs to look at all the groups to find out which
groups the user belongs to. As long as NIS is enabled for groups, the system
will consult it at login time, whatever order is specified in nsswitch.conf
for group.

> Is ypbind suppose to hang indefinitely?

I believe it'll wait for a server to reply, yes

> Does anyone actually have a current 
> system running nis?

I just set up a test NIS domain on a domU. With the default files in /etc it
came up without trouble.

Do you have NIS or DNS enabled for hosts in /etc/nsswitch.conf? It could
be the cause of the hang. In particular, if NIS is enabled and the hosts
involved in NIS startup aren't listed in /etc/hosts (with the proper names),
there will be a deadlock.
Are the hosts in the ypservers map and /var/yp/binding/<domain>.ypservers
properly recorded in /etc/hosts?
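
One way to check the last point mechanically is sketched below. It is an
illustration only: missing_from_hosts is a made-up helper name, and it assumes
the ypservers file holds one host name per line.

```shell
# List names from a ypservers-style file (assumed: one host name per line)
# that do not appear as a name or alias in a hosts file.
# missing_from_hosts is a hypothetical helper for illustration.
# Usage: missing_from_hosts SERVERS_FILE [HOSTS_FILE]
missing_from_hosts() {
    servers=$1
    hosts=${2:-/etc/hosts}
    awk '
        { sub(/#.*/, "") }                    # strip comments in both files
        FILENAME == ARGV[1] {                 # first file: the hosts file
            for (i = 2; i <= NF; i++) known[$i] = 1
            next
        }
        NF > 0 && !($1 in known) { print $1 } # second file: server names
    ' "$hosts" "$servers"
}
```

Run against /var/yp/binding/<domain>.ypservers, an empty result would suggest
every listed server can be resolved from /etc/hosts alone; any name printed is
a candidate for the deadlock described above.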

-- 
Manuel Bouyer <bouyerantioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--
