
Thread: Unknown taggs for Spanish.




2006-02-14 11:33:52
If you have the Java code used to train the tagger for English, I
could work on a port to Spanish. Is that possible?

Regards

jose

José Ramón Pérez Agüera
Despacho 411 tlf. 913947599
Dept. de Sistemas Informáticos y Programación
Facultad de Informática
Universidad Complutense de Madrid

----- Original message -----
From: Diana Maynard <d.maynarddcs.shef.ac.uk>
Date: Tuesday, February 14, 2006 12:16 pm
Subject: Re: Unknown taggs for Spanish.

> Yes, Jose is correct in that these tags are caused by the default
> rules in the tagger which fire when the words in question are not in
> the lexicon. Possible solutions are either to map these into tags from
> your Spanish tagset, modify the Spanish lexicon manually to include
> missing words not being recognised, or modify the tagger code
> appropriately to change the way the default rules are applied.
> I'm afraid we don't have a better version of the Spanish tagger - if
> we did, it would have been included.
> There are quite a few people on this list using GATE for Spanish -
> someone might have some solution they have already tried.
> Regards
> Diana
> 
> 
> José Ramón Pérez Agüera wrote:
> 
> >Hi Sergi,
> >
> >I work with GATE's POS tagger in my thesis, and I think that these
> >tags (NN, NNS, NNP) are generic and the tagger uses them by default
> >when it doesn't know the suitable tag. I need to retrain the POS
> >tagger for Spanish, but this is not possible with GATE's API. I don't
> >have any solution, sorry, but I think this is the problem.
> >
> >Regards, and sorry for my English
> >
> >jose
> >
> >José Ramón Pérez Agüera
> >Despacho 411 tlf. 913947599
> >Dept. de Sistemas Informáticos y Programación
> >Facultad de Informática
> >Universidad Complutense de Madrid
> >
> >----- Original message -----
> >From: Sergi Fernandez <devilsfhotmail.com>
> >Date: Tuesday, February 14, 2006 0:40 am
> >Subject: Unknown taggs for Spanish.
> >
> >  
> >
> >>Hi there!
> >>
> >>Thank you for your quick answer!!
> >>
> >>I've just solved the problem of using independent grammars.
> >>
> >>I'm working with the Spanish plugin for GATE 3.0. As I read in the
> >>documentation "D1.4.1a Language Issues", GATE uses a tagger based on
> >>the Brill tagger, but trained on Spanish text. The tags for Spanish
> >>are different from the ones for English, and they are defined in
> >>"Guía para la anotación morfosintáctica del corpus CLiC-TALP" by
> >>M. Civit. Until now everything was OK. But right now I'm working
> >>with GATE for my final degree project and there are some tags that
> >>don't work quite well. What I mean is that there are some tags, such
> >>as NN, NNP or NNS, that are not described in my guide by M. Civit.
> >>
> >>I believe those tags came from the English tagger.
> >>
> >>I'm trying to adapt Text2Onto, from the AIFB Institute at the
> >>University of Karlsruhe (Germany), to Spanish, and maybe 30% or 40%
> >>of nouns are tagged with those "English tags". That causes a large
> >>loss of accuracy and recall if I only want to use the correct tags.
> >>Then, if I consider the NNS and NNP tags as nouns, I gain recall but
> >>lose accuracy. So my questions are: What can I do with those tags?
> >>Is there a new, more accurate version of the Spanish plugin
> >>available? Could you please explain, or tell me where to find, how
> >>the Spanish plugin is built, so that I can figure out which of those
> >>tags to accept or reject?
> >>
> >>Best regards.
> >>
> >>Sergi
> >>    
> >>
> >
> >
> >  
> >
> 
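
One of the options Diana mentions in the thread above, mapping the default tags into the Spanish tagset, could be sketched roughly as follows. This is only an illustration in Python and not GATE code: the function name `remap_default_tags`, the example tokens, and the target tags (`NC0S000`, `NC0P000`, `NP00000`, written in the positional style of the CLiC-TALP tagset with gender left unspecified) are assumptions and should be checked against the Civit guide before use.

```python
# Hypothetical mapping from the tagger's default (Penn-style) tags to
# placeholder tags in the style of the Spanish tagset. The target values
# are assumptions for illustration, not taken from the GATE Spanish plugin.
DEFAULT_TAG_MAP = {
    "NN": "NC0S000",   # assumed: common noun, singular, gender unspecified
    "NNS": "NC0P000",  # assumed: common noun, plural, gender unspecified
    "NNP": "NP00000",  # assumed: proper noun
}


def remap_default_tags(tagged_tokens, tag_map=DEFAULT_TAG_MAP):
    """Return (word, tag, was_remapped) triples.

    Tokens whose tag is one of the default tags are given the mapped tag
    and flagged, so later processing can treat them with less confidence.
    All other tokens pass through unchanged.
    """
    result = []
    for word, tag in tagged_tokens:
        if tag in tag_map:
            result.append((word, tag_map[tag], True))
        else:
            result.append((word, tag, False))
    return result


if __name__ == "__main__":
    # Hypothetical tagger output: two out-of-lexicon words and one known verb.
    sample = [("casa", "NN"), ("Madrid", "NNP"), ("come", "VMIP3S0")]
    for word, tag, remapped in remap_default_tags(sample):
        print(word, tag, "(remapped)" if remapped else "")
```

Keeping the `was_remapped` flag, instead of silently overwriting the tag, leaves it to the downstream application to decide whether to count those tokens as nouns, which is the recall versus accuracy trade-off Sergi describes.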

