It would be a good idea to have Mapper and Reducer expose a
getLogger() method. They could do so by extending a separate
interface such as Loggable.
The logger would be initialized when the Map and Reduce tasks are
initialized, and its name would end with the job ID, for example
hadoop.mapred.jobs.<jobid>. This lets user-written MapReduce code
log to a common logger for the job. Hadoop code can also log to the
same logger, for example when a task fails.
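
To make the proposal concrete, here is a rough sketch of what the
interface and the per-job logger name could look like. It assumes
java.util.logging only to stay self-contained; Loggable, JobLoggers,
and WordCountMapper are hypothetical names used for illustration and
are not existing Hadoop classes.

```java
import java.util.logging.Logger;

// Hypothetical interface from the proposal; not an existing Hadoop API.
interface Loggable {
    Logger getLogger();
}

// Hypothetical helper that builds the per-job logger name.
final class JobLoggers {
    static final String PREFIX = "hadoop.mapred.jobs.";

    private JobLoggers() {}

    // Returns the logger shared by all code running for the given job.
    static Logger forJob(String jobId) {
        return Logger.getLogger(PREFIX + jobId);
    }
}

// Stand-in for a user-written Mapper; the real Mapper interface is omitted.
class WordCountMapper implements Loggable {
    private Logger logger;

    // Assumed to be called by the framework when the task is initialized.
    void configure(String jobId) {
        this.logger = JobLoggers.forJob(jobId);
    }

    @Override
    public Logger getLogger() {
        return logger;
    }
}
```

User code would then call getLogger().info(...) inside map() or
reduce(), and framework code could fetch the same logger by name.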
These logs can then be redirected to a job-specific Hadoop
directory, so that logs from all nodes running tasks of the same MR
job are available in a single directory in DFS. This also helps
separate Hadoop's internal logs from user logs without any logging
configuration on the user's part.
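
A minimal sketch of the redirection idea, again assuming
java.util.logging. JobLogRedirect and the <logRoot>/<jobId>/<taskId>.log
layout are hypothetical, and the sketch writes to the local file
system only; collecting the files into DFS is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical helper; the directory layout is an assumption.
final class JobLogRedirect {
    private JobLogRedirect() {}

    // Sends everything logged on jobLogger to <logRoot>/<jobId>/<taskId>.log.
    static Path attach(Logger jobLogger, Path logRoot, String jobId, String taskId)
            throws IOException {
        Path jobDir = Files.createDirectories(logRoot.resolve(jobId));
        Path logFile = jobDir.resolve(taskId + ".log");

        // Second argument 'true' appends instead of truncating the file.
        FileHandler handler = new FileHandler(logFile.toString(), true);
        handler.setFormatter(new SimpleFormatter());
        jobLogger.addHandler(handler);

        // Stop records from also reaching parent handlers, so job logs
        // stay separate from the framework's own log output.
        jobLogger.setUseParentHandlers(false);
        return logFile;
    }
}
```

Because the handler is attached by the framework at task start, the
user's map and reduce code would need no logging configuration.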
Thoughts?
~Sanjay