Thread: Updated: (HADOOP-941) Enhancements to Hadoop record I/O - Part 1




2007-02-28 15:44:51
     [ https://issues.apache.org/jira/browse/HADOOP-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Milind Bhandarkar updated HADOOP-941:
-------------------------------------

    Description: 
Hadoop record I/O can be used effectively outside of Hadoop.
Its utility would increase if developers could use it
without having to import Hadoop classes or depend on
Hadoop jars. The following changes to the current translator
and runtime are proposed.

Proposed Changes:

1. Use java.lang.String as a native type for ustring
(instead of Text).
2. Provide a Buffer class as a native Java type for buffer
(instead of BytesWritable), so that BytesWritable could
later be defined with the following DDL:
module org.apache.hadoop.io {
  record BytesWritable {
    buffer value;
  }
}
3. Member names in generated classes should not have the
'm' prefix. In the above example, the private member name
would be 'value', not 'mvalue' as it is now.
4. Convert getters and setters to CamelCase, e.g. in the
above example the getter will be:
  public Buffer getValue();
5. Generate clone() methods for records in Java, i.e. the
generated classes should implement Cloneable.
6. Make generated Java code for maps and vectors use Java
generics.

These are the proposed user-visible changes. Internally, the
translator will be restructured so that it is easier to
plug in translators for different targets.
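
For illustration only, the six proposed changes might produce generated
Java roughly like the sketch below. This is not the actual rcc output.
The body of Buffer, the SampleRecord record, and the choice of ArrayList
and TreeMap as the container types for vector and map are all
assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.TreeMap;

// Hypothetical stand-in for the proposed Buffer type (item 2).
// The real class may look quite different.
final class Buffer implements Cloneable {
    private byte[] bytes;

    Buffer(byte[] bytes) { this.bytes = bytes.clone(); }

    byte[] get() { return bytes.clone(); }

    @Override
    public Buffer clone() {
        try {
            Buffer copy = (Buffer) super.clone();
            copy.bytes = bytes.clone();   // deep copy of the payload
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);  // cannot happen: we are Cloneable
        }
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Buffer && Arrays.equals(bytes, ((Buffer) o).bytes);
    }

    @Override
    public int hashCode() { return Arrays.hashCode(bytes); }
}

// Sketch of a class generated from: record BytesWritable { buffer value; }
class BytesWritable implements Cloneable {
    private Buffer value;                                       // item 3: 'value', not 'mvalue'

    public Buffer getValue() { return value; }                  // item 4: CamelCase getter
    public void setValue(Buffer value) { this.value = value; }  // item 4: CamelCase setter

    @Override
    public BytesWritable clone() {                              // item 5: Cloneable records
        try {
            BytesWritable copy = (BytesWritable) super.clone();
            if (value != null) copy.value = value.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Hypothetical record with a ustring, a vector and a map field.
class SampleRecord implements Cloneable {
    private String name = "";                                   // item 1: ustring as String
    private ArrayList<String> tags = new ArrayList<String>();   // item 6: generic vector
    private TreeMap<String, Integer> counts =
        new TreeMap<String, Integer>();                         // item 6: generic map

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public ArrayList<String> getTags() { return tags; }
    public TreeMap<String, Integer> getCounts() { return counts; }

    @Override
    public SampleRecord clone() {
        try {
            SampleRecord copy = (SampleRecord) super.clone();
            copy.tags = new ArrayList<String>(tags);            // copy containers, not references
            copy.counts = new TreeMap<String, Integer>(counts);
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

None of these classes imports anything from Hadoop, which is the point
of items 1 and 2: user code depends only on java.lang and java.util types.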


  was:
Hadoop record I/O can be used effectively outside of Hadoop.
It would increase its utility if developers can use it
without having to import hadoop classes, or having to depend
on Hadoop jars. Following changes to the current translator
and runtime are proposed.

Proposed Changes:

1. Use java.lang.String as a native type for ustring
(instead of Text.)
2. Provide a Buffer class as a native Java type for buffer
(instead of BytesWritable), so that later BytesWritable
could be implemented as following DDL:
module org.apache.hadoop.io {
  record BytesWritable {
    buffer value;
  }
}
3. Member names in generated classes should not have
prefixes 'm' before their names. In the above example, the
private member name would be 'value' not 'mvalue' as it is
done now.
4. Convert getters and setters to have CamelCase. e.g. in
the above example the getter will be:
  public Buffer getValue();
5. Provide a 'swiggable' C binding, so that processing the
generated C code with swig allows it to be used in scripting
languages such as Python and Perl.
6. The default --language="java" target would
generate class code for records that would not have Hadoop
dependency on WritableComparable interface, but instead
would have "implements Record, Comparable". (i.e.
It will not have write() and readFields() methods.) An
additional option "--writable" will need to be
specified on rcc commandline to generate classes that
"implements Record, WritableComparable".
7. Optimize generated write() and readFields() methods, so
that they do not have to create BinaryOutputArchive or
BinaryInputArchive every time these methods are called on a
record.
8. Implement ByteInStream and ByteOutStream for C++ runtime,
as they will be needed for using Hadoop Record I/O with
forthcoming C++ MapReduce framework (currently, only
FileStreams are provided.)
9. Generate clone() methods for records in Java i.e. the
generated classes should implement Cloneable.
10. As part of Hadoop build process, produce a tar bundle
for Record I/O alone. This tar bundle will contain the
translator classes and ant task (lib/rcc.jar), translator
script (bin/rcc), Java runtime (recordio.jar) that includes
org.apache.hadoop.record.*, sources for the java runtime
(src/java), and c/c++ runtime sources with Makefiles
(src/c++, src/c).
11. Make generated Java codes for maps and vectors use Java
generics.

These are the proposed user-visible changes. Internally, the
translator will be restructured so that it is easier to
plug-in translators for different targets.
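
One possible way to realize item 7 above is to keep a per-thread archive
object and rebind it to the caller's stream on each call, instead of
allocating a new archive inside every write(). The sketch below only
illustrates that idea. ReusableArchive and PointRecord are hypothetical
names, not the classes in the Hadoop record runtime.

```java
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical archive that is allocated once per thread and then reused.
final class ReusableArchive {
    private DataOutput out;

    private ReusableArchive() {}

    private static final ThreadLocal<ReusableArchive> CACHE =
        new ThreadLocal<ReusableArchive>() {
            @Override
            protected ReusableArchive initialValue() {
                return new ReusableArchive();
            }
        };

    // Returns this thread's archive, rebound to the given stream.
    static ReusableArchive get(DataOutput out) {
        ReusableArchive archive = CACHE.get();
        archive.out = out;
        return archive;
    }

    // The tag is unused in a binary encoding; it is kept to mirror a tagged archive API.
    void writeInt(int i, String tag) throws IOException {
        out.writeInt(i);
    }
}

// Hypothetical generated record whose write() reuses the cached archive.
class PointRecord {
    int x;
    int y;

    public void write(DataOutput out) throws IOException {
        ReusableArchive archive = ReusableArchive.get(out);  // no allocation after the first call
        archive.writeInt(x, "x");
        archive.writeInt(y, "y");
    }
}
```

Because the cache is per thread, two threads never share an archive, so
the rebinding in get() needs no locking. The same approach would apply
to readFields() with an input-side archive.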


        Summary: Enhancements to Hadoop record I/O - Part 1 
(was: Make Hadoop Record I/O Easier to use outside Hadoop)

Split the issue of making record I/O usable outside Hadoop
into a separate issue. Will pull the current patch, and
upload a new patch.

> Enhancements to Hadoop record I/O - Part 1
> ------------------------------------------
>
>                 Key: HADOOP-941
>                 URL: https://issues.apache.org/jira/browse/HADOOP-941
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.10.1
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>
> Hadoop record I/O can be used effectively outside of Hadoop.
> Its utility would increase if developers could use it
> without having to import Hadoop classes or depend on
> Hadoop jars. The following changes to the current translator
> and runtime are proposed.
> Proposed Changes:
> 1. Use java.lang.String as a native type for ustring
> (instead of Text).
> 2. Provide a Buffer class as a native Java type for buffer
> (instead of BytesWritable), so that BytesWritable could
> later be defined with the following DDL:
> module org.apache.hadoop.io {
>   record BytesWritable {
>     buffer value;
>   }
> }
> 3. Member names in generated classes should not have the
> 'm' prefix. In the above example, the private member name
> would be 'value', not 'mvalue' as it is now.
> 4. Convert getters and setters to CamelCase, e.g. in the
> above example the getter will be:
>   public Buffer getValue();
> 5. Generate clone() methods for records in Java, i.e. the
> generated classes should implement Cloneable.
> 6. Make generated Java code for maps and vectors use Java
> generics.
> These are the proposed user-visible changes. Internally, the
> translator will be restructured so that it is easier to
> plug in translators for different targets.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.

