Thread: Created: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file




Created: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
user name
2006-08-02 00:06:13
Fetcher ignores the fetcher.parse value configured in config file
-----------------------------------------------------------------

                 Key: NUTCH-337
                 URL: http://issues.apache.org/jira/browse/NUTCH-337
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 0.8, 0.9
            Reporter: Jeremy Huylebroeck
            Priority: Trivial


When Fetcher is called from the command line with the noParsing
parameter, everything is fine.
If noParsing is not given, the value in nutch-site.xml (or
nutch-default.xml) should be used, but "true" is always passed
to the call to fetch.
It should be the value from the conf.
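
The expected behaviour described above can be sketched as follows. This is only an illustration, not the actual Fetcher code and not the attached patch: java.util.Properties stands in for the Nutch configuration object, the class and method names are hypothetical, and the "-noParsing" flag spelling and the fallback default of true are assumptions.

```java
import java.util.Properties;

public class FetcherParseFlagSketch {

    // Hypothetical helper: decide whether the fetcher should parse content.
    static boolean resolveParsing(String[] args, Properties conf) {
        // Start from the configured value instead of a hard-coded "true";
        // fall back to true only when the key is absent (assumed default).
        boolean parsing = Boolean.parseBoolean(conf.getProperty("fetcher.parse", "true"));
        for (String arg : args) {
            // The command-line flag, when present, overrides the config value.
            if ("-noParsing".equals(arg)) {
                parsing = false;
            }
        }
        return parsing;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("fetcher.parse", "false");
        // No flag given: the configured value should win.
        System.out.println(resolveParsing(new String[] {"segment"}, conf));
        // Flag given: parsing is disabled regardless of the config.
        System.out.println(resolveParsing(new String[] {"segment", "-noParsing"}, new Properties()));
    }
}
```

The point of the sketch is the order of precedence: the config value is the starting point and the flag can only switch parsing off.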


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the
administrators: http://issues.apache.org/jira/secure/Administrators.jspa

-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
nutch
user name
2006-08-02 07:37:50
I use Nutch 0.8 (mapred). Nutch is running on 3 servers.
When Nutch tries to index a segment, I get this error on the tasktracker:
060727 215111 task_0025_r_000000_1  SEVERE FSError from child
060727 215111 task_0025_r_000000_1 org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:152)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.fs.FSDataOutputStream$Summer.write(FSDataOutputStream.java:69)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:98)
060727 215111 task_0025_r_000000_1      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
060727 215111 task_0025_r_000000_1      at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
060727 215111 task_0025_r_000000_1      at java.io.DataOutputStream.write(DataOutputStream.java:90)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:192)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:873)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:760)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:696)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:522)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:316)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:755)
060727 215111 task_0025_r_000000_1 Caused by: java.io.IOException: No space left on device
060727 215111 task_0025_r_000000_1      at java.io.FileOutputStream.writeBytes(Native Method)
060727 215111 task_0025_r_000000_1      at java.io.FileOutputStream.write(FileOutputStream.java:260)
060727 215111 task_0025_r_000000_1      at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:150)
060727 215111 task_0025_r_000000_1      ... 12 more


But the server running the tasktracker has 115G of free space on
its HDD. I tried getting the segment from DFS; the segment occupies
2.4G on the HDD. Why do I get these errors?
Can anybody help me solve this problem?



nutch
user name
2006-08-02 07:41:00
I forgot... One more question:
Is this a problem with Nutch or with Hadoop?

-----Original Message-----
From: antonorbita1.ru [mailto:antonorbita1.ru] 
Sent: Wednesday, August 02, 2006 11:38 AM
To: nutch-devlucene.apache.org
Subject: nutch
Importance: High

I use nutch 0.8(mapred). Nutch started on 3 servers.
When my nutch try index segment I get error on tasktracker:
<skiped>




nutch
user name
2006-08-02 14:00:45
Most probably you have run out of space in the tmp (local)
filesystem.

Use properties like

<property>
  <name>mapred.system.dir</name>
  <value><!-- path to a filesystem with lots of space --></value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value><!-- path to a filesystem with lots of space --></value>
</property>

in hadoop-site.xml to get around this problem.
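
To see whether a candidate directory really has room, something like the following could be run on each tasktracker node. It is a rough sketch, not part of Hadoop or Nutch: the class name is hypothetical, it assumes Java 6 or later for File.getUsableSpace(), and it falls back to the JVM temp directory only as an illustrative default when no argument is given.

```java
import java.io.File;

public class LocalDirSpaceSketch {

    // Usable bytes on the partition that holds dir. Walks up to the nearest
    // existing ancestor so directories that are not created yet still work.
    static long usableBytes(File dir) {
        File f = dir.getAbsoluteFile();
        while (f != null && !f.exists()) {
            f = f.getParentFile();
        }
        return f == null ? 0L : f.getUsableSpace();
    }

    public static void main(String[] args) {
        // Pass the value intended for mapred.local.dir; it may be a
        // comma-separated list of directories.
        String dirs = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        for (String d : dirs.split(",")) {
            File dir = new File(d.trim());
            System.out.printf("%s: %d MB usable%n", dir, usableBytes(dir) / (1024 * 1024));
        }
    }
}
```

Run it as the same user that runs the tasktracker, since the usable space reported can depend on the account's permissions and quotas.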


antonorbita1.ru wrote:

>I forget....  One more question:
>This problem with nutch or hadoop?
>
>-----Original Message-----
>From: antonorbita1.ru [mailto:antonorbita1.ru] 
>Sent: Wednesday, August 02, 2006 11:38 AM
>To: nutch-devlucene.apache.org
>Subject: nutch
>Importance: High
>
>I use nutch 0.8(mapred). Nutch started on 3 servers.
>When my nutch try index segment I get error on tasktracker:
><skiped>

nutch
user name
2006-08-03 05:39:21
My settings:
....
<property>
  <name>mapred.local.dir</name>
  <value>/hadoop/mapred/local</value>
  <description>The local directory where MapReduce stores intermediate
  data files.  May be a comma-separated list of
  directories on different devices in order to spread disk i/o.
  </description>
</property>

<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
  <description>The shared directory where MapReduce stores control files.
  </description>
</property>
....

The device mounted on "/" has 115G of free space.

[rootxxxxx /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             133G   13G  113G  11% /

Does anybody have other ideas?











Updated: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
user name
2006-08-18 06:32:14
     [ http://issues.apache.org/jira/browse/NUTCH-337?page=all ]

Stefan Groschupf updated NUTCH-337:
-----------------------------------

    Attachment: respectFetcherParsePropertyV1.patch

Hi Jeremy, thanks for catching this. Attached a fix. Should
be easy for a contributor to commit this to trunk....



        
Updated: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
user name
2006-08-18 06:32:15
     [ http://issues.apache.org/jira/browse/NUTCH-337?page=all ]

Stefan Groschupf updated NUTCH-337:
-----------------------------------

    Priority: Major  (was: Trivial)



        
Closed: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
user name
2006-09-23 18:57:23
     [ http://issues.apache.org/jira/browse/NUTCH-337?page=all ]

Andrzej Bialecki  closed NUTCH-337.
-----------------------------------

    Fix Version/s: 0.8.1
                   0.9.0
       Resolution: Fixed

Patch applied to branch-0.8 and trunk. Thanks!



        