If you have the Java code used to train the English tagger, I
could work on a port to Spanish. Would that be possible?
Regards,
jose
José Ramón Pérez Agüera
Office 411, tel. 913947599
Dept. de Sistemas Informáticos y Programación
Facultad de Informática
Universidad Complutense de Madrid
----- Original message -----
From: Diana Maynard <d.maynard dcs.shef.ac.uk>
Date: Tuesday, February 14, 2006 12:16 pm
Subject: Re: Unknown taggs for Spanish.
> Yes, Jose is correct in that these tags are caused by the default
> rules in the tagger, which fire when the words in question are not
> in the lexicon. Possible solutions are to map these into tags from
> your Spanish tagset, to modify the Spanish lexicon manually to
> include the missing words that are not being recognised, or to
> modify the tagger code appropriately to change the way the default
> rules are applied.
> I'm afraid we don't have a better version of the Spanish tagger;
> if we did, it would have been included.
> There are quite a few people on this list using GATE for Spanish;
> someone might have a solution they have already tried.
> Regards
> Diana
>
>
> José Ramón Pérez Agüera wrote:
>
> >Hi Sergi,
> >
> >I work with GATE's POS tagger in my thesis, and I think that these
> >tags (NN, NNS, NNP) are generic: the tagger uses them by default
> >when it doesn't know the suitable tag. I need to retrain the POS
> >tagger for Spanish, but this is not possible with GATE's API. I
> >don't have a solution, sorry, but I think this is the problem.
> >
> >Regards, and sorry for my English
> >
> >jose
> >
> >José Ramón Pérez Agüera
> >Office 411, tel. 913947599
> >Dept. de Sistemas Informáticos y Programación
> >Facultad de Informática
> >Universidad Complutense de Madrid
> >
> >----- Original message -----
> >From: Sergi Fernandez <devilsf hotmail.com>
> >Date: Tuesday, February 14, 2006 0:40 am
> >Subject: Unknown taggs for Spanish.
> >
> >
> >
> >>Hi there!
> >>
> >>Thank you for your quick answer!!
> >>
> >>I've just solved the problem of using independent grammars.
> >>
> >>I'm working with the Spanish plugin for GATE 3.0. As I read in the
> >>documentation "D1.4.1a Language Issues", GATE uses a tagger based
> >>on the Brill tagger, but trained on Spanish text. The tags for
> >>Spanish are different from the ones for English, and they are
> >>defined in "Guía para la anotación morfosintáctica del corpus
> >>CLiC-TALP" by M. Civit. Until now everything was OK. But right now
> >>I'm working with GATE for my final degree project, and there are
> >>some tags that don't work quite well. What I mean is that there
> >>are some tags, such as NN, NNP or NNS, that are not described in
> >>my guide by M. Civit.
> >>
> >>I believe those tags came from the English tagger.
> >>
> >>I'm trying to adapt Text2Onto, from the AIFB Institute at the
> >>University of Karlsruhe (Germany), to Spanish, and maybe 30% or
> >>40% of nouns are tagged with those "English tags". That causes a
> >>very large loss of accuracy and recall if I want to use only the
> >>correct tags. If I instead consider the NNS and NNP tags as nouns,
> >>I gain recall but lose accuracy. So my questions are: What can I
> >>do with those tags? Is there a newer, more accurate version of the
> >>Spanish plugin available? Could you please explain, or tell me
> >>where to find, how the Spanish plugin is built, so that I can
> >>figure out which of those tags to accept or reject?
> >>
> >>Best regards.
> >>
> >>Sergi
> >>
> >>
> >
> >
> >
> >
>
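
As a rough illustration of the first option Diana mentions (mapping the default tags into the Spanish tagset), a post-processing step might look something like the sketch below. It is not GATE code and does not use the GATE API: the class name `DefaultTagMapper`, the method `mapTag`, and the target tags (EAGLES-style placeholders with gender left unspecified) are all assumptions made for illustration, and the real targets would have to come from the tagset actually in use.

```java
import java.util.Map;

// Hypothetical helper, not part of GATE: rewrites the default tags that the
// tagger emits for out-of-lexicon words into tags from a Spanish tagset.
final class DefaultTagMapper {

    // ASSUMPTION: the right-hand tags are placeholders in an EAGLES-like
    // style; replace them with the tags your Spanish tagset really defines.
    private static final Map<String, String> FALLBACK = Map.of(
            "NN", "NC0S000",   // default singular noun -> common noun, singular
            "NNS", "NC0P000",  // default plural noun   -> common noun, plural
            "NNP", "NP00000"); // default proper noun   -> proper noun

    private DefaultTagMapper() {
        // Utility class, no instances.
    }

    // Returns the mapped tag for a default tag, or the tag unchanged
    // when it is not one of the default tags listed above.
    static String mapTag(String tag) {
        return FALLBACK.getOrDefault(tag, tag);
    }
}
```

Tags that are already part of the Spanish tagset pass through untouched, so a step like this could be applied to every token after tagging. It only relabels the default guesses; it does not make them more likely to be correct.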