Thread: Just another report of performance improvements with recent releases...




2007-07-20 12:57:55
I've been using the Hadoop project off and on for the last year in some
ongoing work studying Wikipedia. One of the tasks I developed computes
the revision-to-revision diff across all edits in the Wikipedia history.
From the time I first developed the job (last summer) to the latest run
(last week, on the 0.13.0 release), I've seen a pretty remarkable
increase in performance. Even though the input size has more than
doubled, the time to run the job on Hadoop has dropped by half, for a
roughly 4x overall improvement in performance. Thanks, everyone!

-- 
Bryan A. P. Pendleton
Ph: (877) geek-1-bp
