Invited Speakers

Photo of Lluís Padró

Lluís Padró has taught at the UPC since 1991 and has held a PhD in computer science since 1998. His teaching experience covers a wide range of subjects in undergraduate computing courses at the UPC, for software, telecommunications, and civil engineering students. The subjects taught range from basic programming to computability theory, including software engineering, compilers, and operating systems, with special emphasis on open source systems. He has directed more than 30 graduation projects in Software Engineering and has been a member of many committees for graduation projects, master's theses, and PhD dissertations.

His research area is artificial intelligence, specifically natural language processing, and particularly the construction of language analyzers. He has published in the main conferences of the area (ACL, ANLP, NAACL, EACL, Coling, EMNLP, RANLP, ...) and in both national and international journals (Machine Learning, Computers & the Humanities, Procesamiento del Lenguaje Natural, ...). He has advised four doctoral dissertations and is currently advising another two. He has also given courses in PhD programs at the UPC and at the Basque Country University (EHU/UPV). He has also participated in and directed several funded research projects, both national and European, such as EuroOpenTrad (CFIT -350401-2007-1, IST-020302-2008-51) and EuroWordNet (LE-24 003).

Professor Padró is the developer and administrator of the FreeLing project, a suite of free software that provides linguistic analysis functions for texts in several languages, designed as a basis for developing language processing applications. In the field of free software, he has given many courses and talks, both technical and aimed at dissemination.

The title of the talk given by Lluís Padró will be: "FreeLing: Open-Source Natural Language Processing for Research and Development"
[Talk slides]

Photo of Aarne Ranta

Aarne Ranta received his PhD from the University of Helsinki in 1990. His research on constructive type theory and computational linguistics led to the monograph "Type Theoretical Grammar" (OUP 1994) and later to the system GF (Grammatical Framework), which is a programming language for multilingual grammars and their applications. GF has been used in several projects on technical translation, natural language interfaces, spoken dialogue systems, and the creation of lexical and grammatical resources. A full professor of Computer Science since 2005, Ranta has supervised 5 PhD theses to completion and is currently supervising 4. He has long teaching experience in logic, language technology, compiler construction, and functional programming. A native speaker of Finnish, Ranta speaks six languages and reads another six.

Since March 2010, Ranta has been the coordinator of the European FP-7 project MOLTO (Multilingual On-Line Translation), which develops tools and case studies for high-quality translation among 15 languages. MOLTO tools are based on GF, but the project also develops hybrid models combining grammars with statistical MT.

Aarne Ranta's talk summary:

The World Wide Web is a globally accessible, multilingual source of information, which has created an urgent need for automatic translation. Some of this need is satisfied by tools such as Google Translate, Bing Translator, and Babelfish/Systran, which provide quick translations of any web page between a large number of languages (currently 57 in Google Translate). While these systems are impressive and useful, they have the character of *consumer* tools rather than *producer* tools. A consumer who uses one of these systems to translate a document does so at her own risk. But if the producer herself publishes a translation, she is responsible for its correctness. For instance, an international e-commerce company publishing offers and product descriptions would take too high a risk by translating them with any of the consumer tools.

There is little hope that machine translation can achieve both broad coverage and high quality at the same time. In the resulting trade-off, consumer tools must opt for coverage, whereas producer tools should opt for quality. Fortunately, a producer of information is often in control of the content to be translated, so that unlimited coverage is not necessary. The MOLTO project (www.molto-project.eu) targets this need and aims to show the feasibility of production-quality translation for limited domains but a high number of simultaneous languages (up to 15 within the project's case studies). The technology used by MOLTO is based on GF (Grammatical Framework), which is a programming language for multilingual grammars.

The talk will explain how production-quality translation systems can be built rapidly and economically by using GF. The key concept of GF is a *multilingual grammar*, a grammar that uses an *abstract syntax* as an interlingua, a shared semantic structure for multiple languages. The languages are related to the abstract syntax by reversible mappings. To help in writing these mappings, GF provides a Resource Grammar Library, which currently implements the morphology and basic syntax of 18 languages.
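
The idea of an abstract syntax serving as an interlingua can be illustrated with a toy sketch. The following is written in Python, not in GF, and does not use any GF API; the tiny vocabulary, the tree shape, and the function names (`linearize`, `parse`, `translate`) are all assumptions made up for this illustration. Parsing is done here by brute-force enumeration of the abstract trees, which is only workable because the example language is tiny.

```python
from itertools import product

# Abstract syntax (hypothetical): language-independent trees of the form
# ("Is", (determiner, kind), quality), built from the symbols below.
DETS = ["This", "That"]
KINDS = ["Wine", "Cheese"]
QUALITIES = ["Fresh", "Expensive"]

# Concrete syntaxes: one word table per language for the same abstract symbols.
LEXICON = {
    "eng": {"This": "this", "That": "that", "Wine": "wine", "Cheese": "cheese",
            "Fresh": "fresh", "Expensive": "expensive", "is": "is"},
    "spa": {"This": "este", "That": "ese", "Wine": "vino", "Cheese": "queso",
            "Fresh": "fresco", "Expensive": "caro", "is": "es"},
}


def linearize(tree, lang):
    """Map an abstract tree to a sentence in the given language."""
    _, (det, kind), quality = tree
    lex = LEXICON[lang]
    return " ".join([lex[det], lex[kind], lex["is"], lex[quality]])


def all_trees():
    """Enumerate every abstract tree of this toy grammar."""
    for det, kind, quality in product(DETS, KINDS, QUALITIES):
        yield ("Is", (det, kind), quality)


def parse(sentence, lang):
    """Reverse of linearize: find the trees whose linearization is the sentence."""
    return [tree for tree in all_trees() if linearize(tree, lang) == sentence]


def translate(sentence, src, tgt):
    """Translate by parsing into the abstract syntax, then linearizing."""
    return [linearize(tree, tgt) for tree in parse(sentence, src)]


if __name__ == "__main__":
    print(translate("this wine is expensive", "eng", "spa"))
    print(translate("ese queso es fresco", "spa", "eng"))
```

In this sketch, adding a language means adding one more word table, and translation between any pair of languages goes through the shared trees rather than through pairwise rules.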

While the translation systems built in MOLTO are essentially grammar-based, they will integrate SMT methods to improve coverage and to learn parts of grammars from data. GF and related tools are open-source software available for all major platforms and constantly developed by an international community.
[Talk slides]