Sharyathi Nagesh wrote:
> Hi
>
> We encountered this problem with crash showing an invalid memory size
> when run on a live machine:
> ======================================
> [root venuslp11 ~]# free -m
>              total       used       free     shared    buffers     cached
> Mem:          2151       1594        557          0        583        795
>              ^^^^^^
> -/+ buffers/cache:        215       1935
> Swap:         1983          0       1983
> [root venuslp11 ~]# cat /proc/ppc64/lparcfg | grep DesMem
> DesMem=2304
> ^^^^^^
> [root venuslp11 ~]# rpm -q crash
> crash-4.0-3.7
> [root venuslp11 ~]# crash
> ...
> KERNEL: /usr/lib/debug/lib/modules/2.6.18-1.2732.el5/vmlinux
> DUMPFILE: /dev/mem
> CPUS: 2
> DATE: Fri Oct 27 07:58:54 2006
> UPTIME: 01:12:34
> LOAD AVERAGE: 1.52, 1.05, 0.48
> TASKS: 98
> NODENAME: venuslp11.upt.austin.ibm.com
> RELEASE: 2.6.18-1.2732.el5
> VERSION: #1 SMP Tue Oct 17 18:24:27 EDT 2006
> MACHINE: ppc64 (2301 Mhz)
> MEMORY: 3.2 GB
> ^^^^^^^^
> PID: 25097
> COMMAND: "crash"
> TASK: c000000000fedbe0 [THREAD_INFO: c00000000b190000]
> CPU: 0
> STATE: TASK_RUNNING (ACTIVE)
>
> crash>
> ==================================
> As I looked into the code I found:
> The difference arises because /proc and crash calculate total memory
> in different ways.
> /proc/meminfo traverses memory counting each page, and it has
> different routines to calculate the number of pages in highmem, the
> init section, bootmem, etc., which may be difficult to implement in crash.
> Instead we can look at the sysfs implementation
> (/sys/devices/system/node/node<n>/meminfo). There the total page count
> is taken not from unsigned long node_spanned_pages but from
> long node_present_pages.
> The definition of node_present_pages says 'total number of physical pages',
> while that of node_spanned_pages says 'total size of physical page range,
> including holes'.
> The discrepancy is seen because of the way Node 2 is laid out on this
> machine: its pfn (page frame number) range starts from 0, while those of
> nodes 0 and 1 start from pfns 4096 and 8192 respectively. So
> node3->spanned_pages double-counts pages that belong to node 0 and node 1.
> Hence I feel it is better to use present_pages, which counts only the
> pages in the node, excluding the holes.
>
> ======================================
> The patch to fix the problem:
> Let me know your opinion.
>
>
> ------------------------------------------------------------
> Name: node_present_page.patch
> Type: text/x-patch
> Encoding: 7bit

Hi Sharyathi,

I agree that the "MEMORY:" size output is incorrect on your
system for the reasons you have outlined.

But your proposed patch would cause other things to break,
because it would set each internal per-node "nt->size" field
to the present pages instead of the spanned pages.
Therefore, any function that uses the per-node nt->size
value -- such as dump_mem_map(), is_page_ptr(),
phys_to_page(), next_kpage(), etc. -- would end up
using an incorrect, shortchanged value.
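
To illustrate the concern, here is a minimal sketch in Python (not crash's
actual C code). The NodeEntry class, the node_for_pfn() helper, and all the
numbers are hypothetical; they only model a range check that relies on a
per-node size, the way the functions named above rely on nt->size.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeEntry:
    """Hypothetical stand-in for one per-node table entry."""
    node_id: int
    start_pfn: int  # first page frame number covered by the node
    size: int       # number of pfns assumed to belong to the node


def node_for_pfn(table: List[NodeEntry], pfn: int) -> Optional[int]:
    """Return the id of the first node whose pfn range contains pfn."""
    for entry in table:
        if entry.start_pfn <= pfn < entry.start_pfn + entry.size:
            return entry.node_id
    return None


# Assumed layout: one node spanning pfns 0..999 with a 400-page hole,
# so only 600 pages are physically present.
spanned_table = [NodeEntry(node_id=0, start_pfn=0, size=1000)]
present_table = [NodeEntry(node_id=0, start_pfn=0, size=600)]

found_with_spanned = node_for_pfn(spanned_table, 900)
found_with_present = node_for_pfn(present_table, 900)
```

With the spanned size, pfn 900 resolves to node 0. With the present size,
the same valid pfn falls outside the shortened range and is not found.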

However, the total_node_memory() function specifically
should probably use the pglist_data node_present_pages
value instead of the "nt->size" value that is based upon the
node's spanned pages.
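
As a rough sketch of the difference (Python for illustration only, not the
crash implementation): the Node class, total_memory_bytes(), the 4096-byte
page size, and the per-node page counts are assumptions. Only the starting
pfns (4096, 8192 and 0) come from the report above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Simplified model of the pglist_data fields under discussion."""
    start_pfn: int      # first page frame number covered by the node
    spanned_pages: int  # size of the pfn range, including holes
    present_pages: int  # pages physically present, excluding holes


def total_memory_bytes(nodes: List[Node], page_size: int,
                       use_present: bool = True) -> int:
    """Sum per-node page counts and convert the result to bytes."""
    pages = sum(n.present_pages if use_present else n.spanned_pages
                for n in nodes)
    return pages * page_size


PAGE_SIZE = 4096  # assumed page size, for illustration only

# Hypothetical sizes: the node starting at pfn 0 spans the ranges of
# the other two nodes, so its spanned count includes their pages.
nodes = [
    Node(start_pfn=4096, spanned_pages=4096, present_pages=4096),
    Node(start_pfn=8192, spanned_pages=4096, present_pages=4096),
    Node(start_pfn=0, spanned_pages=12288, present_pages=4096),
]

by_present = total_memory_bytes(nodes, PAGE_SIZE)
by_spanned = total_memory_bytes(nodes, PAGE_SIZE, use_present=False)
```

In this made-up layout the present-page total is 48 MB, while the
spanned-page total is 80 MB because the overlapping range is counted twice.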

I originally thought this might come up, as evidenced by
the fact that I initialize the "pglist_data_node_present_pages"
member offset, but don't use it anywhere.

Let me tinker with this for a while...

Thanks,
Dave
--
Crash-utility mailing list
Crash-utility redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility