Yes, Jose is correct: these tags come from the default rules in the
tagger, which fire when the words in question are not in the lexicon.
Possible solutions are to map these tags onto tags from your Spanish
tagset, to add the unrecognised words to the Spanish lexicon manually,
or to modify the tagger code to change the way the default rules are
applied.
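
As a rough illustration of the first option, here is a minimal sketch in
Python of remapping the fallback tags after tagging. It is not part of
GATE or the Spanish plugin. The names FALLBACK_TO_SPANISH, remap_tag and
remap_tokens are hypothetical, and the target tags "NC" and "NP" are only
assumed coarse labels for common and proper nouns; check them against
your own Spanish tagset before using anything like this.

```python
# Hypothetical mapping from the tagger's fallback tags to coarse Spanish
# noun tags. The target values are assumptions, not taken from the GATE
# Spanish plugin; replace them with tags from your own tagset.
FALLBACK_TO_SPANISH = {
    "NN": "NC",   # assumed: common noun
    "NNS": "NC",  # assumed: common noun (the plural marking is lost here)
    "NNP": "NP",  # assumed: proper noun
}


def remap_tag(tag):
    """Return the mapped tag for a fallback tag, or the tag unchanged."""
    return FALLBACK_TO_SPANISH.get(tag, tag)


def remap_tokens(tokens):
    """Remap fallback tags in a list of (word, tag) pairs.

    Tags that are not in the mapping pass through untouched, so tags
    already in the Spanish tagset are preserved.
    """
    return [(word, remap_tag(tag)) for word, tag in tokens]
```

Note that a mapping like this only renames the tags. The words were still
tagged by the default rules, so the remapped tags are guesses and not
lexicon-backed decisions.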

I'm afraid we don't have a better version of the Spanish tagger; if we
did, it would have been included.

There are quite a few people on this list using GATE for Spanish, so
someone may have a solution they have already tried.

Regards
Diana

José Ramón Pérez Agüera wrote:
>Hi Sergi,
>
>I work with GATE's POS tagger in my thesis, and I think that these tags
>(NN, NNS, NNP) are generic: the tagger uses them by default when it does
>not know the suitable tag. I need to re-train the POS tagger for Spanish,
>but this is not possible with GATE's API. I don't have a solution, sorry,
>but I think this is the problem.
>
>Regards, and sorry for my English
>
>jose
>
>José Ramón Pérez Agüera
>Despacho 411 tlf. 913947599
>Dept. de Sistemas Informáticos y Programación
>Facultad de Informática
>Universidad Complutense de Madrid
>
>----- Original message -----
>From: Sergi Fernandez <devilsf hotmail.com>
>Date: Tuesday, February 14, 2006 0:40 am
>Subject: Unknown taggs for Spanish.
>
>>Hi there!
>>
>>Thank you for your quick answer!!
>>
>>I've just solved the problem of using independent grammars.
>>
>>I'm working with the Spanish Plugin for GATE 3.0. As I read in the
>>documentation "D1.4.1a Language Issues", GATE uses a tagger based on
>>the Brill tagger, but trained on Spanish text. The tags for Spanish
>>are different from the ones for English, and they are defined in
>>"Guía para la anotación morfosintáctica del corpus CLiC-TALP" by
>>M. Civit. Until now everything was OK. But right now I'm working with
>>GATE for my final degree project, and some tags don't work quite
>>well. What I mean is that some tags, such as NN, NNP or NNS, are not
>>described in my guide by M. Civit.
>>
>>I believe those tags come from the English tagger.
>>
>>I'm trying to adapt Text2Onto, from the AIFB Institute at the
>>University of Karlsruhe (Germany), to Spanish, and maybe 30% or 40%
>>of nouns are tagged with those "English tags". That causes a very
>>large loss of accuracy and recall if I use only the correct tags.
>>If instead I treat the NNS and NNP tags as nouns, I gain recall but
>>lose accuracy. So my questions are: What can I do with those tags?
>>Is there a newer, more accurate version of the Spanish plugin
>>available? Could you please explain, or tell me where to find, how
>>the Spanish plugin is built, so that I can figure out which of those
>>tags to accept or reject?
>>
>>Best regards.
>>
>>Sergi