
Thread: Re: ZFS lockup in "zfs" state




Re: ZFS lockup in "zfs" state
2008-06-02 16:55:14
Jeremy Chadwick wrote:
> On Mon, Jun 02, 2008 at 04:04:12PM +1000, Andrew Hill wrote:
...
>> unfortunately i couldn't get a backtrace or core dump for 'political'
>> reasons (the system was required for use by others) but i'll see if i can
>> get a panic happening after-hours to get some more info...
>
> I can't tell you what to do or how to do your job, but honestly you
> should be pulling this system out of production and replacing it with a
> different one, or a different implementation, or a different OS.  Your
> users/employees are probably getting ticked off at the crashes, and it
> probably irritates you too.  The added benefit is that you could get
> Scott access to the box.

It's a home fileserver rather than a production "work" system, so the
challenge is finding another system with an equivalent amount of
storage. As one knows, these things are often hard enough to procure
out of a company budget, let alone out of one's own pocket!

--Antony
_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Re: ZFS lockup in "zfs" state
2008-06-03 07:45:26
Hello,

just to add one more voice to the issue:

I'm experiencing the lockups with zfs too.

Environment: development test machine, amd64, 3GHz AMD, 2GB RAM,
running FreeBSD/amd64 7.0-STABLE #8, Sat Apr 26 10:10:53 CEST 2008,
with one 400GB SATA disk devoted completely to a zpool (no raid of
any kind). This disk has 5 filesystems which get rsynced on a
daily basis from different other development hosts. Some of the
filesystems are NFS-exported.

/boot/loader.conf contains:

vm.kmem_size=900M
vm.kmem_size_max=900M
vfs.zfs.arc_max=300M
vfs.zfs.prefetch_disable=1
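
For readers comparing these tunables against their own, the following is a minimal sketch that parses settings like the ones above and reports how large the ARC cap is relative to the kernel memory limit. The function names (parse_size, parse_loader_conf, arc_share) are made up for illustration, and the sketch assumes sizes are written as a plain number with an optional K, M or G suffix. It only does arithmetic on the text; it says nothing about which values are appropriate.

```python
import re

# Binary multipliers for the size suffixes handled by this sketch.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '900M' (optionally quoted) to bytes."""
    match = re.fullmatch(r'"?(\d+)([KMG]?)"?', text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]


def parse_loader_conf(text):
    """Return the name=value pairs found in loader.conf style text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def arc_share(settings):
    """Fraction of vm.kmem_size that vfs.zfs.arc_max is allowed to use."""
    arc = parse_size(settings["vfs.zfs.arc_max"])
    kmem = parse_size(settings["vm.kmem_size"])
    return arc / kmem
```

With the values quoted above, the ARC cap works out to one third of vm.kmem_size (300M of 900M).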

The disk itself has no known hw problems.

A script controlled by cron makes a daily or weekly snapshot of the
filesystems (at 2:30 AM). Before that, a "housekeeping" script
checks for available space, and if the space is getting below
a certain threshold, it destroys older snapshots (at 1:30 AM).
The rsyncs to the pool all happen a few hours later (4:30 AM).
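
The housekeeping step described above can be sketched roughly as follows. This is not the actual script, only an illustration of the "destroy the oldest snapshots until enough space is free" idea; the Snapshot type, its fields and the function name are assumptions made for the example, and treating a snapshot's "used" figure as the space its removal frees is only an approximation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    name: str     # hypothetical, e.g. "pool/fs@daily-1"
    created: int  # creation time, seconds since the epoch
    used: int     # bytes assumed to be freed if destroyed (approximation)


def snapshots_to_destroy(snapshots, free_bytes, threshold_bytes):
    """Pick the oldest snapshots until the estimated free space
    reaches the threshold. Returns them oldest first."""
    doomed = []
    for snap in sorted(snapshots, key=lambda s: s.created):
        if free_bytes >= threshold_bytes:
            break  # enough space already, keep the remaining snapshots
        doomed.append(snap)
        free_bytes += snap.used
    return doomed
```

A caller would then run zfs destroy on each returned name, one at a time. Lowering the threshold means fewer snapshots are selected per run, so fewer destroy operations are issued.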

I've seen lockups periodically, where I could not do anything else
but hard-reboot the machine to unstick it. It was possible to use
other filesystems, but any process trying to access the zpool would
hang.

Now the very first hang was about 3 months after 7.0-BETA4, which
was when I first set up the pools.

I then csupped and rebuilt world and kernel periodically, the
last time being the end of April. After that I got those lockups more
often, that is, after a maximum of 2 weeks.

I noticed that now that I lowered the threshold of the
"housekeeping" script, it hasn't locked up for about 3 weeks.

That seems to point at a problem with zfs destroy fs@snapshot -
or with something my script does, so here's a link to it:
http://lorenzo.yellowspace.net/zfs_housekeeping.sh.txt


I haven't seen any adX timeouts or any other suspicious console
messages so far.

If there is anything I can provide to help nail down zfs problems,
please ask for it and I'll do my best...


Thanks to everyone working on this great OS and on this
cute file/volume system


Regards,

Lorenzo



On 02.06.2008, at 23:55, Antony Mawer wrote:

> Jeremy Chadwick wrote:
>> On Mon, Jun 02, 2008 at 04:04:12PM +1000, Andrew Hill wrote:
> ...
>>> unfortunately i couldn't get a backtrace or core dump for 'political'
>>> reasons (the system was required for use by others) but i'll see if i can
>>> get a panic happening after-hours to get some more info...
>> I can't tell you what to do or how to do your job, but honestly you
>> should be pulling this system out of production and replacing it with a
>> different one, or a different implementation, or a different OS.  Your
>> users/employees are probably getting ticked off at the crashes, and it
>> probably irritates you too.  The added benefit is that you could get
>> Scott access to the box.
>
> It's a home fileserver rather than a production "work" system, so the
> challenge is finding another system with an equivalent amount of
> storage. As one knows, these things are often hard enough to procure
> out of a company budget, let alone out of one's own pocket!
>
> --Antony
