[ https://issues.apache.org/jira/browse/HADOOP-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur updated HADOOP-1565:
-------------------------------------
Attachment: (was: memoryReduction.patch)
> DFSScalability: reduce memory usage of namenode
> -----------------------------------------------
>
> Key: HADOOP-1565
> URL: https://issues.apache.org/jira/browse/HADOOP-1565
> Project: Hadoop
> Issue Type: Bug
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: memoryReduction2.patch
>
>
> Experiments have demonstrated that a single file/block
> needs about 300 to 500 bytes of main memory on a 64-bit
> Namenode. This limits the size of the file system that a
> single namenode can support. Most of this overhead occurs
> because a block and/or filename is inserted into multiple
> TreeMaps and/or HashSets.
> Here are a few ideas that can be measured to see whether
> they yield an appreciable reduction in memory usage:
> 1. Change FSDirectory.children from a TreeMap to an
> array. Do a binary search in this array when looking up
> children. This saves a TreeMap object for every intermediate
> node in the directory tree.
> 2. Change INode so that it is no longer an inner class.
> This saves one "parent object" reference for each INode
> instance: 4 bytes per inode.
> 3. Keep all DatanodeDescriptors in an array.
> BlocksMap.nodes[] currently holds a 64-bit reference to the
> DatanodeDescriptor object. Instead, it can hold a 'short'
> index into that array. This will probably save about 16
> bytes per block.
> 4. Change DatanodeDescriptor.blocks from a
> SortedTreeMap to a HashMap? The CPU cost of block report
> processing may increase.
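
Idea 1 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: DirNode, addChild and getChild are hypothetical names standing in for INode and its children handling, and the array is simply reallocated on every insert.

```java
// Hypothetical stand-in for an INode; not the actual Hadoop class.
final class DirNode {
    private static final DirNode[] NO_CHILDREN = new DirNode[0];

    final String name;
    // Children kept sorted by name; replaces a per-directory TreeMap.
    private DirNode[] children = NO_CHILDREN;

    DirNode(String name) {
        this.name = name;
    }

    // Binary search by name. Returns the index if found,
    // otherwise -(insertionPoint + 1), like Arrays.binarySearch.
    private int indexOf(String childName) {
        int lo = 0;
        int hi = children.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = children[mid].name.compareTo(childName);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    // Returns the child with the given name, or null if absent.
    DirNode getChild(String childName) {
        int i = indexOf(childName);
        return i >= 0 ? children[i] : null;
    }

    // Inserts the child at its sorted position; false if the name exists.
    boolean addChild(DirNode child) {
        int i = indexOf(child.name);
        if (i >= 0) {
            return false;
        }
        int at = -(i + 1);
        DirNode[] grown = new DirNode[children.length + 1];
        System.arraycopy(children, 0, grown, 0, at);
        grown[at] = child;
        System.arraycopy(children, at, grown, at + 1, children.length - at);
        children = grown;
        return true;
    }

    int childCount() {
        return children.length;
    }
}
```

Lookups stay O(log n) as with a TreeMap, but inserts become O(n) because of the array copy, so the trade-off would need to be measured on directories with many children.
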
> For the record, a TreeMap entry has the following fields:
> Object key;
> Object value;
> Entry left = null;
> Entry right = null;
> Entry parent;
> boolean color = BLACK;
> and a HashMap entry has:
> final Object key;
> Object value;
> final int hash;
> Entry next;
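
The field lists above allow a rough per-entry size estimate. The sketch below is back-of-the-envelope arithmetic, not a measurement: it assumes a 64-bit JVM with 8-byte references, a 16-byte object header and 8-byte object alignment, and the class and method names are made up for illustration. Actual sizes depend on the JVM and its settings.

```java
// Rough per-entry size estimate from the field lists above.
// All layout constants are assumptions, not measured values.
final class EntrySizeEstimate {
    static final int HEADER = 16; // assumed object header size in bytes
    static final int REF = 8;     // assumed reference size in bytes
    static final int ALIGN = 8;   // assumed object alignment in bytes

    // Rounds a size up to the next multiple of ALIGN.
    static int align(int bytes) {
        return (bytes + ALIGN - 1) / ALIGN * ALIGN;
    }

    // key, value, left, right, parent (5 references) + boolean color
    static int treeMapEntryBytes() {
        return align(HEADER + 5 * REF + 1);
    }

    // key, value, next (3 references) + int hash
    static int hashMapEntryBytes() {
        return align(HEADER + 3 * REF + 4);
    }

    public static void main(String[] args) {
        System.out.println("TreeMap entry: " + treeMapEntryBytes() + " bytes");
        System.out.println("HashMap entry: " + hashMapEntryBytes() + " bytes");
    }
}
```

Under these assumptions a TreeMap entry comes to 64 bytes and a HashMap entry to 48 bytes, not counting the HashMap's bucket array or the key and value objects themselves.
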
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.