Eliminate redundant searches in the namespace directory tree.
-------------------------------------------------------------
Key: HADOOP-2002
URL: https://issues.apache.org/jira/browse/HADOOP-2002
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.13.0
Reporter: Konstantin Shvachko
Fix For: 0.16.0
There is no need to look for the same INode multiple times
in the same name-node operation.
For example, in FSNamesystem.exists():
  public boolean exists(String src) {
    if (dir.getFileBlocks(src) != null || dir.isDir(src)) {
      return true;
    } else {
      return false;
    }
  }
both getFileBlocks() and isDir() call rootDir.getNode(src)
inside, which causes two separate lookups in the directory
tree while one is enough.
Why not check, in a single lookup, both whether the inode is
a directory and whether it has blocks?
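A minimal sketch of the single-lookup idea follows. SketchINode and
SketchDirectory are hypothetical stand-ins invented for illustration,
not the real Hadoop classes; a map replaces the actual tree walk, and
the lookups counter exists only to make the saving visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified inode; the real Hadoop INode class differs. */
class SketchINode {
  private final boolean dir;
  private final long[] blocks; // null for directories in this sketch

  SketchINode(boolean dir, long[] blocks) {
    this.dir = dir;
    this.blocks = blocks;
  }

  boolean isDir() { return dir; }
  long[] getBlocks() { return blocks; }
}

/** Hypothetical stand-in for the directory tree; a map replaces the tree walk. */
class SketchDirectory {
  private final Map<String, SketchINode> nodes = new HashMap<String, SketchINode>();
  int lookups = 0; // counts searches so the saving is visible

  void put(String path, SketchINode node) { nodes.put(path, node); }

  SketchINode getNode(String path) {
    lookups++;
    return nodes.get(path);
  }

  /** One lookup answers both questions: directory, or file with blocks? */
  synchronized boolean exists(String src) {
    SketchINode node = getNode(src);
    return node != null && (node.isDir() || node.getBlocks() != null);
  }
}
```

With this shape each exists() call costs exactly one search, where the
version quoted above can cost two.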
Other methods have the same problem:
- completeFile() calls getINode at least 3 times in
different places.
- getAdditionalBlock() makes 2 getINode calls.
- startFile() makes 5 calls by my count; I may have missed
some.
To prevent that, we should define all methods below the top
level in terms of INode parameters rather than path names.
E.g., all FSDirectory methods should take an INode as a
parameter, not a String.
We should be careful, though, not to use an INode across
separate synchronized sections.
Once the lock is released, the tree may change, so the INode
should be looked up by its path again.
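The proposal and its locking caveat could be sketched as below. All
names here (SketchNamesystem, Node, allocateBlock, complete) are
hypothetical and chosen for illustration; they are not the actual
FSNamesystem or FSDirectory signatures. The assumption is that each
top-level operation is one synchronized section.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical top-level class: resolves a path once per synchronized section. */
class SketchNamesystem {
  /** Hypothetical, simplified inode for a file. */
  static class Node {
    boolean underConstruction = true;
    int blocks = 0;
  }

  private final Map<String, Node> tree = new HashMap<String, Node>();
  int lookups = 0; // counts path resolutions, for illustration only

  private Node getNode(String src) {
    lookups++;
    return tree.get(src);
  }

  // Helpers below the top level take the resolved node, not the path.
  private static boolean isUnderConstruction(Node file) { return file.underConstruction; }
  private static void addBlock(Node file) { file.blocks++; }

  synchronized void create(String src) { tree.put(src, new Node()); }
  synchronized void delete(String src) { tree.remove(src); }

  /** One synchronized section: resolve once, reuse the node for every step. */
  synchronized boolean allocateBlock(String src) {
    Node file = getNode(src);
    if (file == null || !isUnderConstruction(file)) {
      return false;
    }
    addBlock(file);
    return true;
  }

  /** A separate synchronized section: resolve the path again, never reuse an old node. */
  synchronized boolean complete(String src) {
    Node file = getNode(src);
    if (file == null) {
      return false; // the file may have been deleted since the last call
    }
    file.underConstruction = false;
    return true;
  }
}
```

If a caller kept the Node from allocateBlock() and used it in a later
call, a delete() in between would leave it holding a node that is no
longer in the tree; resolving by path again avoids that.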
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.