Thread: Problem with "xm save" x86-64 cores - crash.4.0-3.5




Problem with "xm save" x86-64 cores - crash.4.0-3.5
user name
2006-10-04 18:09:56
Tejasvi Aswathanarayana wrote:
Crash exits with an error "cannot determine pid_hash array
dimensions". Looking at the crash change log, it appears that it was
fixed in 4.0-2.24 for the 2.6.17 kernel. The core I have is of a
2.6.16.13 xenified kernel. Is the fix even relevant?

<output>
$ ./crash vmlinux-2.6.16.13-xen test.core

crash 4.0-3.5
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

please wait... (gathering task table data)
crash: cannot determine pid_hash array dimensions
</output>

Is there a way I can get crash's source at releases 4.0-2.24 and
4.0-2.23 so that I can try a fix for this kernel in case it is a
specific kernel fix?

Thanks
-Tejasvi

Interesting -- I got a private message yesterday from another
2.6.16-era xen user with the same problem.

Anyway, pid table handling in the kernel has changed several
times over the years, so to handle each variant, the crash
utility's task_init() function assigns an appropriate
function to gather all pids/tasks running on the system.

You're referring to this entry in the changelog:

4.0-2.24 - Fix for 2.6.17 kernels that do not use "pgdat_list" memory node
           list header, which would cause crash to fail during initialization
           with a "crash: cannot resolve: pgdat_list" error message.
           (anderson@redhat.com)

         - Fix for 2.6.17 kernels that have re-worked the kernel pid_hash
           handling, which would cause crash to fail during initialization
           with a "crash: cannot determine pid_hash array dimensions" error
           message.  (anderson@redhat.com)
...
 

The change above implemented yet another pid hash handler,
refresh_hlist_task_table_v2(), which, as the changelog entry
above indicates, was required for 2.6.17 kernels.  It was added
in crash version 4.0-2.24 for use in place of
refresh_hlist_task_table() on those kernels, to account
for this change:

/*
 *  2.6.17 replaced:
 *    static struct hlist_head *pid_hash[PIDTYPE_MAX];
 *  with
 *     static struct hlist_head *pid_hash;
 */
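
To make the difference between the two declarations concrete, here is a minimal standalone sketch. It is not code from crash or from the kernel: the variable names pid_hash_array and pid_hash_single, the helper functions, and the PIDTYPE_MAX value of 4 are all assumptions made for illustration. It only shows that the older form has a compile-time array dimension, while the 2.6.17 form is a single pointer with no dimension to find.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel list types. */
struct hlist_node;
struct hlist_head { struct hlist_node *first; };

#define PIDTYPE_MAX 4   /* assumed value, for illustration only */

/* Older form: an array of PIDTYPE_MAX pointers. */
static struct hlist_head *pid_hash_array[PIDTYPE_MAX];

/* 2.6.17 form: one pointer, no array dimension. */
static struct hlist_head *pid_hash_single;

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Number of elements in the older, array-style declaration. */
size_t old_style_len(void)   { return ARRAY_LEN(pid_hash_array); }

/* Storage occupied by each form of the symbol. */
size_t old_style_bytes(void) { return sizeof(pid_hash_array); }
size_t new_style_bytes(void) { return sizeof(pid_hash_single); }
```

With the older declaration, a debugger can read the dimension from the symbol's type; with the newer one there is none, which is presumably why a different handler is needed.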

So, for starters, can you display how "pid_hash" is
declared in your kernel?

Unfortunately the only 2.6.16-era x86_64 kernel dumpfile that
I have on-hand as a reference is an early kdump dumpfile,
(non-xen) which selects the older refresh_hlist_task_table():

crash> sys
      KERNEL: /usr/dumps/kdump/vmlinux (2.6.16-20-smp)
    DUMPFILE: /usr/dumps/kdump/vmcore
        CPUS: 2
        DATE: Mon Apr 24 11:02:03 2006
      UPTIME: 00:31:04
LOAD AVERAGE: 0.00, 0.00, 0.00
       TASKS: 63
    NODENAME: elm3a242
     RELEASE: 2.6.16-20-smp
     VERSION: #1 SMP Mon Apr 10 04:51:13 UTC 2006
     MACHINE: x86_64  (3000 Mhz)
      MEMORY: 4.6 GB
       PANIC: "SysRq : Trigger a crashdump"
crash> help -t
           current: 1678f60 [62]
              .pid: 3235
             .comm: "bash"
             .task: ffff810121a7d810
      .thread_info: ffff81011d826000
        .processor: 1
            .ptask: ffff810121d91790
        .mm_struct: ffff810122f9eb00
          .tc_next: 0
     context_array: 1677df0
refresh_task_table: refresh_hlist_task_table()
...

This is the code sequence in task_init() that selects
refresh_hlist_task_table() or refresh_hlist_task_table_v2():

        } else {
                tt->pidhash_addr = symbol_value("pid_hash");
                if (!get_array_length("pid_hash", NULL, sizeof(void *)) &&
                    VALID_STRUCT(pid_link))
                        tt->refresh_task_table = refresh_hlist_task_table_v2;
                else
                        tt->refresh_task_table = refresh_hlist_task_table;
        }
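
The selection above can be modelled in isolation as a small function that returns a handler pointer. This is only a sketch under stated assumptions: select_refresh(), the refresh_fn type, and the stub handlers that return their own names are hypothetical and are not part of crash. The two integer arguments stand in for the results of get_array_length("pid_hash", ...) and VALID_STRUCT(pid_link), with an array length of 0 assumed to mean that "pid_hash" is not an array.

```c
#include <assert.h>

/* Hypothetical stand-in type; the real handlers walk kernel memory. */
typedef const char *(*refresh_fn)(void);

/* Stub handlers that just report which one was chosen. */
static const char *refresh_hlist_task_table(void)
{
        return "refresh_hlist_task_table";
}

static const char *refresh_hlist_task_table_v2(void)
{
        return "refresh_hlist_task_table_v2";
}

/*
 * array_len:      stand-in for get_array_length("pid_hash", ...)
 *                 (assumed to be 0 when pid_hash is not an array)
 * pid_link_valid: stand-in for VALID_STRUCT(pid_link)
 */
static refresh_fn select_refresh(int array_len, int pid_link_valid)
{
        assert(array_len >= 0);

        /* Same condition as the task_init() excerpt. */
        if (!array_len && pid_link_valid)
                return refresh_hlist_task_table_v2;

        return refresh_hlist_task_table;
}
```

Calling select_refresh(0, 1) picks the v2 handler. Any non-zero array length, or a missing pid_link structure, falls through to the older refresh_hlist_task_table().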

Since crash is failing during initialization, can you put
some debug code in place that displays:

1. the output of the get_array_length("pid_hash",...) call, and
2. the output of VALID_STRUCT(pid_link)

Something must be slightly different between the mainline
and xen 2.6.16-era kernels.

What appears to be happening is that refresh_hlist_task_table()
is being selected, but in that function, this subsequent call
is failing:

        if (!(plen = get_array_length("pid_hash", NULL, sizeof(void *))))
                error(FATAL, "cannot determine pid_hash array dimensions\n");

Alternatively, if you want to make the vmlinux/dumpfile pair
available to me, I can take a look at it.

Thanks,
  Dave

Problem with "xm save" x86-64 cores - crash.4.0-3.5
user name
2006-10-04 18:50:41
> So, for starters, can you display how "pid_hash" is
> declared in your kernel?
>
static struct hlist_head *pid_hash[PIDTYPE_MAX];


> This is the code sequence in task_init() that selects
> refresh_hlist_task_table() or refresh_hlist_task_table_v2():
>
>       } else {
>               tt->pidhash_addr = symbol_value("pid_hash");
>               if (!get_array_length("pid_hash", NULL, sizeof(void *)) &&
>                   VALID_STRUCT(pid_link))
>                       tt->refresh_task_table = refresh_hlist_task_table_v2;
>               else
>                       tt->refresh_task_table = refresh_hlist_task_table;
>       }

Yes, refresh_hlist_task_table() is being selected, but that is
because the "if" clause itself was failing.

get_array_length("pid_hash",...)  = 1
VALID_STRUCT(pid_link)  = 0
VALID_MEMBER(pid_link_pid) = 0
VALID_MEMBER(pid_hash_chain) = 0

        if (VALID_MEMBER(pid_link_pid) && VALID_MEMBER(pid_hash_chain)) {
                get_symbol_data("pid_hash", sizeof(ulong), &tt->pidhash_addr);
                tt->refresh_task_table = refresh_pid_hash_task_table;
        } else {

> Alternatively, if you want to make the vmlinux/dumpfile pair
> available to me, I can take a look at it.
Thanks, I will see how I can get you the files.

> Another thing to check -- there are two places that print that
> "cannot determine..." error message.  Can you verify that it's
> happening in refresh_hlist_task_table()?  That's where the
> previous reporter said that it happened on his system, but I
> just want to make absolutely sure.

I confirmed that the error message "crash: cannot determine pid_hash
array dimensions" came from the refresh_hlist_task_table() function.

-- 
Thanks
-Tejasvi
