Thread: Hadoop Job Submission

Hadoop Job Submission
2008-01-11 10:00:50
Hi,
I have some basic questions about Hadoop job submission. Could
you please help me with them?

1)      Once the Hadoop daemons (dfs, JobTracker, etc.) have been
started by the hadoop user, can any user submit a job to Hadoop?

2)      Or does each user have to start the Hadoop daemons
before submitting a job?

3)      Is there any queue available, like Condor, for submitting
multiple jobs?

Thanks,
Senthil
Re: Hadoop Job Submission
2008-01-12 18:00:47
Hi,

Once you have started the jobtracker+namenode on your cluster, you can
launch a job from any node of the cluster.

AFAIK, to submit multiple jobs you need to do that yourself, either:
 - by writing a bash script that launches several job jars one after
   the other, or
 - by bundling several jobs in a single job.jar (calling the API method
   JobClient.runJob(job) repeatedly for each job in jobs); for this you
   have to create a new instance of JobClient for each job.
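
The first option above can be sketched as a small shell script. This is only an illustration under stated assumptions: the `hadoop` launcher is assumed to be on the PATH (in a default install it lives under bin/ of the Hadoop directory), the jar names are passed as arguments, and each jar is assumed to name its main class in its manifest. The `run_jobs` function and the `DRY_RUN` variable are names invented for this sketch, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Run several Hadoop job jars one after the other, stopping at the
# first failure. Set DRY_RUN=1 to print the commands instead of
# running them (DRY_RUN is a hypothetical switch for this sketch).

run_jobs() {
  local jar
  for jar in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "hadoop jar $jar"
    else
      # `hadoop jar` returns when the jar's main class exits; if that
      # class submits with JobClient.runJob, that is when the job ends.
      hadoop jar "$jar" || { echo "job failed: $jar" >&2; return 1; }
    fi
  done
}

# Jar names come from the command line.
run_jobs "$@"
```

Saved under a hypothetical name such as run_jobs.sh, it could be tried with `DRY_RUN=1 ./run_jobs.sh first.jar second.jar`, which only prints the two commands and does not contact the cluster.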

RE: Hadoop Job Submission
2008-01-14 14:10:29
Hi,
Thanks for the reply.

Actually, what I am trying to ask is: if the
jobtracker+namenode are started by user hadoop, can user
"senthil" submit a job without starting his
own jobtracker+namenode?

The test Hadoop cluster I set up uses individual Red Hat
Linux machines. So for the user hadoop I need to copy the
SSH key to all the machines, so that user hadoop can ssh to
all the nodes without a password.

Do I need to do this (generating an SSH key and copying it to all
the nodes) for all the users who are going to use Hadoop and
MapReduce?

Thanks,
Senthil

Re: Hadoop Job Submission
2008-01-14 15:11:25
On Jan 14, 2008, at 12:10 PM, Natarajan, Senthil wrote:

> Hi,
> Thanks for the reply.
>
> Actually, what I am trying to ask is: if the jobtracker+namenode are
> started by user hadoop, can user "senthil" submit a job without
> starting his own jobtracker+namenode?

It works fine, but you need to define a map/reduce system directory
(mapred.system.dir) that is constant, i.e. the same for every user.
(The default defines a directory name that depends on the user...)
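
As a rough sketch of what that setting might look like, the snippet below prints a property block that could be merged into conf/hadoop-site.xml on the JobTracker and on the machines that submit jobs. The path /hadoop/mapred/system is a hypothetical value chosen for illustration, and `print_system_dir_property` is a helper name invented here. The only point is that the value contains no per-user component.

```shell
#!/usr/bin/env bash
# Print a <property> block for conf/hadoop-site.xml that pins
# mapred.system.dir to a fixed path (passed as the first argument).

print_system_dir_property() {
  cat <<EOF
<property>
  <name>mapred.system.dir</name>
  <value>$1</value>
</property>
EOF
}

# /hadoop/mapred/system is a hypothetical path; pick one that suits
# your cluster and use the same value everywhere.
print_system_dir_property /hadoop/mapred/system
```

The printed block goes inside the <configuration> element of the site file. Whether the daemons need a restart afterwards is not stated in this thread, so treat that as something to check for your release.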

-- Owen
