[ https://issues.apache.org/jira/browse/HADOOP-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537052 ]
Doug Cutting commented on HADOOP-1981:
--------------------------------------
I'd rather keep this separate from HADOOP-2046, since it is not
just documentation, but an incompatible code change.
As for names, I still like having 'output' in them, to
remove potential confusion with join-like stuff that
operates on inputs. We probably don't need 'key' in their
names, since only keys are comparable anyway. So I'd vote
for outputSortComparator and outputGroupComparator. Perhaps
in HADOOP-2046 we should document "grouping" as a
primary mapreduce pipeline stage: map, (combine), sort,
group, reduce?
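
To make the sort and group stages concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class SortGroupSketch, the Pair type, the "prefix#suffix" key layout, and the method names are all hypothetical and exist only for illustration. It assumes the sort comparator orders the full keys and the grouping comparator decides which adjacent keys share a single reduce call.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortGroupSketch {
    /** A (key, value) record as it might leave the map stage (hypothetical type). */
    static final class Pair {
        final String key;
        final String value;
        Pair(String key, String value) { this.key = key; this.value = value; }
    }

    /**
     * Sort stage followed by group stage. Each inner list holds the values
     * that one reduce call would see, in sorted key order.
     */
    static List<List<String>> sortThenGroup(List<Pair> mapOutput,
                                            Comparator<String> sortComparator,
                                            Comparator<String> groupComparator) {
        // Sort: order all records by their full key.
        List<Pair> sorted = new ArrayList<>(mapOutput);
        sorted.sort((a, b) -> sortComparator.compare(a.key, b.key));

        // Group: start a new group whenever the grouping comparator says
        // the current key differs from the previous one.
        List<List<String>> groups = new ArrayList<>();
        String previousKey = null;
        for (Pair p : sorted) {
            if (previousKey == null || groupComparator.compare(previousKey, p.key) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(p.value);
            previousKey = p.key;
        }
        return groups;
    }

    /** Part of the key before '#'; assumed here to be the grouping field. */
    static String prefix(String key) {
        return key.substring(0, key.indexOf('#'));
    }

    static List<List<String>> demo() {
        List<Pair> mapOutput = new ArrayList<>();
        mapOutput.add(new Pair("b#2", "b-second"));
        mapOutput.add(new Pair("a#2", "a-second"));
        mapOutput.add(new Pair("b#1", "b-first"));
        mapOutput.add(new Pair("a#1", "a-first"));

        // Sort on the whole key, but group on the prefix only.
        Comparator<String> sortComparator = Comparator.naturalOrder();
        Comparator<String> groupComparator = Comparator.comparing(SortGroupSketch::prefix);
        return sortThenGroup(mapOutput, sortComparator, groupComparator);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[a-first, a-second], [b-first, b-second]]
    }
}
```

With both comparators set to the natural order, each of the four keys would form its own group; relaxing only the grouping comparator is what lets one reduce call see several sorted keys.
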
> Need to document the controls for sorting and grouping into the reduce
> ----------------------------------------------------------------------
>
> Key: HADOOP-1981
> URL: https://issues.apache.org/jira/browse/HADOOP-1981
> Project: Hadoop
> Issue Type: Task
> Components: mapred
> Reporter: Owen O'Malley
> Assignee: Arun C Murthy
>
> The JavaDoc for the Reducer should document how to control the sort order of keys and values via the JobConf methods:
>
> setOutputKeyComparatorClass
> setOutputValueGroupingComparator
>
> Both methods desperately need better names. (I'd vote for setKeySortingComparator and setKeyGroupingComparator.)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.