------------------------------------------------------------
FINITE-STATE METHODS AND NATURAL LANGUAGE PROCESSING
FSMNLP 2008
SEVENTH INTERNATIONAL WORKSHOP
SECOND CALL FOR PAPERS
11-12 SEPTEMBER 2008, ISPRA, ITALY
HTTP://LANGTECH.JRC.IT/FSMNLP2008
CONTACT: FSMNLP2008 [AD] JRC [DOT] IT
------------------------------------------------------------
THIS YEAR FSMNLP IS MERGED WITH THE FASTAR
(FINITE AUTOMATA SYSTEMS - THEORETICAL AND APPLIED RESEARCH)
WORKSHOP (HTTP://WWW.FASTAR.ORG).
AIM AND SCOPE
THE AIM OF FSMNLP 2008 IS TO BRING TOGETHER MEMBERS OF THE
RESEARCH AND INDUSTRIAL COMMUNITY WORKING ON FINITE-STATE
BASED MODELS IN LANGUAGE TECHNOLOGY, COMPUTATIONAL LINGUISTICS,
WEB MINING, LINGUISTICS, AND COGNITIVE SCIENCE, OR ON RELATED
THEORY AND METHODS IN FIELDS SUCH AS COMPUTER SCIENCE AND
MATHEMATICS.
THE WORKSHOP WILL BE A FORUM FOR RESEARCHERS AND
PRACTITIONERS WORKING
* ON NLP APPLICATIONS,
* ON THE THEORETICAL AND IMPLEMENTATION ASPECTS, OR
* ON THEIR COMBINATION.
THE SPECIAL THEME OF FSMNLP 2008 CENTERS ON HIGH-PERFORMANCE
FINITE-STATE DEVICES IN LARGE-SCALE NATURAL LANGUAGE TEXT
PROCESSING SYSTEMS AND APPLICATIONS. IN PARTICULAR, WE INVITE
NOVEL, HIGH-QUALITY PAPERS ON TOPICS INCLUDING:
* PRACTICES AND EXPERIENCE IN DEPLOYMENT OF FINITE-STATE
TECHNIQUES
IN REAL-WORLD APPLICATIONS PROCESSING MASSIVE AMOUNTS OF
NATURAL LANGUAGE DATA
* INDUSTRIAL-STRENGTH FINITE-STATE PATTERN ENGINES FOR
INFORMATION RETRIEVAL,
INFORMATION EXTRACTION AND RELATED TEXT-MINING TASKS
* SCALABILITY ISSUES IN FS-BASED LARGE-SCALE TEXT
PROCESSING SYSTEMS
* EFFICIENT FINITE-STATE METHODS IN SEARCH ENGINES
* IMPLEMENTATION, CONSTRUCTION, COMPRESSION AND PROCESSING
TECHNIQUES FOR HUGE FINITE-STATE
DEVICES AND NETWORKS
* NOVEL APPLICATION AND EFFICIENCY-ORIENTED FINITE-STATE
PARADIGMS (COMPILATION AND PROCESSING),
E.G., FINITE-STATE DEVICES WITH RICH LABEL
ANNOTATIONS, UNIFICATION-BASED FINITE-STATE DEVICES
* COMPARATIVE STUDIES OF TIME AND SPACE EFFICIENT
FINITE-STATE METHODS (VS. OTHER TECHNIQUES)
UTILIZED IN NLP APPLICATIONS
* NOVEL APPLICATION AREAS FOR FINITE-STATE DEVICES IN
TEXT PROCESSING
AND INFORMATION MANAGEMENT SYSTEMS
* DESIGN PATTERNS FOR IMPLEMENTING FINITE-STATE DEVICES
AND TOOLKITS
WE ALSO INVITE SUBMISSIONS THAT ARE RELATED TO THE
TRADITIONAL FSMNLP THEMES
INCLUDING BUT NOT LIMITED TO:
1. NLP APPLICATIONS AND LINGUISTIC ASPECTS OF FINITE-STATE
METHODS
THE TOPIC INCLUDES BUT IS NOT RESTRICTED TO:
* SPEECH, SIGN LANGUAGE, PHONOLOGY, HYPHENATION, PROSODY,
* SCRIPTS, TEXT NORMALIZATION, SEGMENTATION, TOKENIZATION,
INDEXING,
* MORPHOLOGY, STEMMING, LEMMATISATION, INFORMATION
RETRIEVAL, WEB MINING, SPELLING CORRECTION,
* SYNTAX, POS TAGGING, PARTIAL PARSING, DISAMBIGUATION,
INFORMATION EXTRACTION, QUESTION ANSWERING
* MACHINE TRANSLATION, TRANSLATION MEMORIES, GLOSSING,
DIALECT ADAPTATION,
* ANNOTATED CORPORA AND TREEBANKS, SEMI-AUTOMATIC
ANNOTATION, ERROR MINING, SEARCHING
2. FINITE-STATE MODELS OF LANGUAGE
WITH THIS MORE FOCUSED TOPIC (A SUBSET OF TOPIC 1) WE INVITE
PAPERS ON ASPECTS THAT MOTIVATE THE SUFFICIENCY OF FINITE-STATE
METHODS OR THEIR SUBSETS FOR CAPTURING VARIOUS REQUIREMENTS OF
NATURAL LANGUAGE PROCESSING. THE TOPIC INCLUDES BUT IS NOT
RESTRICTED TO:
* PERFORMANCE, LINGUISTIC APPLICABILITY, FINITE-STATE
HYPOTHESES
* ZIPF'S LAW AND COVERAGE, MODEL CHECKING AGAINST FINITE
CORPORA
* REGULAR APPROXIMATIONS UNDER PARAMETERIZED COMPLEXITY,
LIMITATIONS AND DEFINITIONS OF RELEVANT
COMPLEXITIES SUCH AS AMBIGUITY, RECURSION, CROSSINGS,
RULE APPLICATIONS, CONSTRAINT VIOLATIONS,
REDUPLICATION, EXPONENTS, DISCONTINUITY, PATH-WIDTH,
AND INDUCTION DEPTH
* SIMILARITY INFERENCES, DISSIMILATION, SEGMENTAL LENGTH,
COUNTER-FREENESS, ASYNCHRONOUS MACHINES
* GARDEN-PATH SENTENCES, DETERMINISTIC PARSING, EXPECTED
PARSES, MARKOV CHAINS
* INCREMENTAL PARSING, UNCERTAINTY, RELIABILITY/VARIANCE
IN STOCHASTIC PARSING,
LINEAR SEQUENTIAL MACHINES
3. PRACTICES FOR BUILDING LEXICAL TRANSDUCERS FOR THE
WORLD'S LANGUAGES.
THE TOPIC CONCERNS THE USABILITY OF FINITE-STATE METHODS IN
NLP. IT INCLUDES BUT IS NOT RESTRICTED TO:
* REQUIRED USER TRAINING AND CONSULTATION, LEARNING CURVE
OF NON-SPECIALISTS
* QUESTIONNAIRES, DISCOVERY METHODS, ADAPTIVE
COMPUTER-AIDED GLOSSING AND INTERLINEARIZATION
* EXAMPLE-BASED GRAMMARS, UNSUPERVISED LEARNING,
SEMI-AUTOMATIC LEARNING, USER-DRIVEN LEARNING
(SEE TOPIC 5 TOO)
* LOW LITERACY LEVEL AND RESTRICTED AVAILABILITY OF
TRAINING DATA, WRITING SYSTEMS/PHONOLOGY
UNDER DEVELOPMENT, NEW NON-ROMAN SCRIPTS, ENDANGERED
LANGUAGES
* LINGUIST'S WORKBENCHES, STEALTH-TO-WEALTH PARSER
DEVELOPMENT
* EXPERIENCES OF USING EXISTING TOOLS (E.G. TWOL) FOR
COMPUTATIONAL MORPHOLOGY AND PHONOLOGY
4. SPECIFICATION AND IMPLEMENTATION OF SETS, RELATIONS AND
MULTIPLICITIES IN NLP USING FINITE STATE DEVICES
THE TOPIC INCLUDES BUT IS NOT RESTRICTED TO:
* REGULAR RULE FORMALISMS, GRAMMAR SYSTEMS, EXPRESSIONS,
OPERATIONS, CLOSURE PROPERTIES,
COMPLEXITIES
* ALGORITHMS FOR COMPILATION, APPROXIMATION, MANIPULATION,
OPTIMIZATION,
AND LAZY EVALUATION OF FINITE MACHINES
* FINITE STRING AND TREE AUTOMATA, TRANSDUCERS, MORPHISMS
AND BIMORPHISMS
* WEIGHTS, REGISTERS, MULTIPLE TAPES, ALPHABETS, STATE
COVERS AND PARTITIONS, REPRESENTATIONS
* LOCALITY, CONSTRAINT PROPAGATION, STAR-FREE LANGUAGES,
DATA VS. QUERY COMPLEXITY
* LOGICAL SPECIFICATION, MSO(SLR,MATCHES), FO(STR,<),
LTL, GENERALIZED RESTRICTION, LOCAL GRAMMARS
* MULTI-TAPE AUTOMATA, SAME-LENGTH RELATIONS AND
PARTITION-BASED MORPHOLOGY, SEMITIC MORPHOLOGY
* AUTOSEGMENTAL PHONOLOGY, SHUFFLE, TRAJECTORIES,
SYNCHRONIZATION, SEGMENTAL ANCHORING,
ALIGNMENT CONSTRAINTS, SYLLABLE STRUCTURE,
PARTIAL-ORDER REDUCTIONS
* VARIETIES OF REGULAR LANGUAGES AND RELATIONS,
DESCRIPTIVE COMPLEXITY OF FINITE-STATE BASED GRAMMARS
* AUTOMATON-BASED APPROACHES TO DECLARATIVE CONSTRAINT
GRAMMARS,
CONSTRAINTS IN OPTIMALITY THEORY
* PARALLEL CORPUS ANNOTATIONS, REGISTER AUTOMATA, ACYCLIC
TIMED AUTOMATA
5. MACHINE LEARNING OF FINITE-STATE MODELS OF NATURAL
LANGUAGE
THIS TOPIC INCLUDES BUT IS NOT RESTRICTED TO:
* LEARNING REGULAR RULE SYSTEMS, LEARNING TOPOLOGIES OF
FINITE AUTOMATA AND TRANSDUCERS
* PARAMETER ESTIMATION AND SMOOTHING, LEXICAL OPENNESS
* COMPUTER-DRIVEN GRAMMAR WRITING, USER-DRIVEN GRAMMAR
LEARNING, DISCOVERY PROCEDURES
* DATA SCARCITY, REALISTIC VARIATIONS OF GOLD'S MODEL,
LEARNABILITY AND COGNITIVE SCIENCE
* INCOMPLETELY SPECIFIED FINITE-STATE NETWORKS
* MODEL-THEORETIC GRAMMARS, GRADIENT WELL/ILL-FORMEDNESS
6. FINITE-STATE MANIPULATION SOFTWARE (WITH RELEVANCE TO THE
ABOVE THEMES)
THIS TOPIC INCLUDES BUT IS NOT RESTRICTED TO:
* REGULAR EXPRESSION PRE-COMPILERS SUCH AS REGEXOPT,
XFST2FSA, STANDARDS AND INTERFACES
FOR FINITE-STATE BASED SOFTWARE COMPONENTS, CONVERSION
TOOLS
* TOOLS SUCH AS LEXC, LEXTOOLS, INTEX, XFST, FSM, GRM,
WFSC, FIRE ENGINE, FADD, FSA/UTR,
SRILM, FIRE STATION AND GRAIL
* FREE OR ALMOST FREE SOFTWARE SUCH AS MIT FST, CARMEL,
RWTH FSA, FSA UTILITIES, FSM<2.0>,
UNITEX, OPENFIRE, OPENFST, VAUCANSON, SFST, PCKIMMO,
MONA, HOPSKIP, ASTL, UCFSM,
HALEX, SML, AND WFST (SEE
HTTP://FORUMS.CSC.FI/KITWIKI/PILOT/VIEW/KITWIKI/FSMREG
FOR MORE EXAMPLES)
* RESULTS OBTAINABLE WITH SUCH EXPLORATION TOOLS AS
AUTOMATA, AUTOGRAPHE, AMORE, AND TESTAS
* VISUALIZATION TOOLS SUCH AS GRAPHVIZ AND VAUCANSON-G
* LANGUAGE-SPECIFIC RESOURCES AND DESCRIPTIONS, FREELY
AVAILABLE BENCHMARKING RESOURCES
THE DESCRIPTIONS OF THE TOPICS ABOVE ARE NOT MEANT TO BE
COMPLETE, AND SHOULD EXTEND TO COVER ALL TRADITIONAL FSMNLP
TOPICS. SUBMITTED PAPERS OR ABSTRACTS MAY FALL INTO SEVERAL
CATEGORIES.
SUBMISSION
WE EXPECT THREE KINDS OF SUBMISSIONS:
- FULL PAPERS,
- SHORT PAPERS, AND
- INTERACTIVE SOFTWARE DEMOS.
SUBMISSIONS ARE ELECTRONIC AND IN PDF FORMAT VIA A WEB-BASED
SUBMISSION SERVER.
AUTHORS ARE ENCOURAGED TO USE SPRINGER LNCS STYLE
(PROCEEDINGS AND OTHER MULTIAUTHOR VOLUMES)
FOR LATEX IN PRODUCING THE PDF DOCUMENT. MORE INFORMATION ON
THIS STYLE CAN BE FOUND AT:
HTTP://WWW.SPRINGER.COM/EAST/HOME/COMPUTER/LNCS?SGWID=5-164-7-72376-0
THE PAGE LIMIT FOR FULL PAPERS IS 12 PAGES, WHEREAS SHORT
PAPERS AND SOFTWARE DEMO DESCRIPTIONS ARE LIMITED TO 6 PAGES.
INFORMATION ABOUT THE AUTHOR(S) SHOULD BE OMITTED FROM
SUBMITTED PAPERS, SINCE THE REVIEW PROCESS WILL BE BLIND.
MORE DETAILED INFORMATION ABOUT SUBMISSION IS AVAILABLE AT:
HTTP://LANGTECH.JRC.IT/FSMNLP2008/M/SUBMISSION.HTML
PUBLICATION
THE PAPERS AND ABSTRACTS WILL BE PUBLISHED IN THE FSMNLP 2008
PROCEEDINGS (PAPER VERSION).
WE ARE CURRENTLY NEGOTIATING PUBLICATION OF THE POSTPROCEEDINGS
WITH A SCIENTIFIC PRESS COMPANY.
PUBLICATION OF EXTENDED AND REVISED VERSIONS OF THE PAPERS
IN A SPECIAL JOURNAL ISSUE IS ALSO PLANNED.
IMPORTANT DATES
PAPER SUBMISSIONS DUE: 11 MAY
NOTIFICATION OF ACCEPTANCE: 11 JUNE
CAMERA-READY VERSIONS DUE: 30 JUNE
PROGRAM COMMITTEE
CYRIL ALLAUZEN (GOOGLE RESEARCH, NEW YORK, USA)
FRANCISCO CASACUBERTA (INSTITUTO TECNOLOGICO DE INFORMÁTICA,
VALENCIA, SPAIN)
JEAN-MARC CHAMPARNAUD (UNIVERSITÉ DE ROUEN, FRANCE)
MAXIME CROCHEMORE (DEPARTMENT OF COMPUTER SCIENCE, KING'S
COLLEGE LONDON, U.K.)
JAN DACIUK (GDAŃSK UNIVERSITY OF TECHNOLOGY, POLAND)
KARIN HAENELT (FRAUNHOFER GESELLSCHAFT AND UNIVERSITY OF
HEIDELBERG, GERMANY)
THOMAS HANNEFORTH (UNIVERSITY OF POTSDAM, GERMANY)
COLIN DE LA HIGUERA (JEAN MONNET UNIVERSITY, SAINT-ETIENNE,
FRANCE)
ANDRÉ KEMPE (YAHOO SEARCH TECHNOLOGIES, PARIS, FRANCE)
DERRICK KOURIE (DEPT. OF COMPUTER SCIENCE, UNIVERSITY OF
PRETORIA, SOUTH AFRICA)
ANDRAS KORNAI (BUDAPEST INSTITUTE OF TECHNOLOGY, HUNGARY AND
METACARTA, CAMBRIDGE, USA)
MARCUS KRACHT (UNIVERSITY OF CALIFORNIA, LOS ANGELES, USA)
HANS-ULRICH KRIEGER (DFKI GMBH, SAARBRÜCKEN, GERMANY)
ERIC LAPORTE (UNIVERSITÉ DE MARNE-LA-VALLÉE, FRANCE)
STOYAN MIHOV (BULGARIAN ACADEMY OF SCIENCES, SOFIA,
BULGARIA)
HERMANN NEY (RWTH AACHEN UNIVERSITY, GERMANY)
KEMAL OFLAZER (SABANCI UNIVERSITY, TURKEY AND CARNEGIE
MELLON UNIVERSITY, PITTSBURGH, USA)
JAKUB PISKORSKI (JOINT RESEARCH CENTER OF THE EUROPEAN
COMMISSION, ITALY)
MICHAEL RILEY (GOOGLE RESEARCH, NEW YORK, USA)
STRAHIL RISTOV (RUDER BOSKOVIC INSTITUTE, ZAGREB, CROATIA)
WOJCIECH RYTTER (WARSAW UNIVERSITY, POLAND)
JACQUES SAKAROVITCH (ECOLE NATIONALE SUPÉRIEURE DES
TÉLÉCOMMUNICATIONS, PARIS, FRANCE)
MAX SILBERZTEIN (UNIVERSITÉ DE FRANCHE-COMTÉ, FRANCE)
WOJCIECH SKUT (GOOGLE RESEARCH, MOUNTAIN VIEW, USA)
BRUCE WATSON (DEPT. OF COMPUTER SCIENCE, UNIVERSITY OF
PRETORIA, SOUTH AFRICA)
SHULY WINTNER (UNIVERSITY OF HAIFA, ISRAEL)
ATRO VOUTILAINEN (CONNEXOR OY, FINLAND)
ANSSI YLI-JYRÄ (UNIVERSITY OF HELSINKI AND CSC - SCIENTIFIC
COMPUTING LTD., ESPOO, FINLAND)
SHENG YU (UNIVERSITY OF WESTERN ONTARIO, CANADA)
LYNETTE VAN ZIJL (STELLENBOSCH UNIVERSITY, SOUTH AFRICA)