Lexical Resources For Portuguese: Strategies


Valeria de Paiva (Nuance Communications, USA)


FGV - Praia de Botafogo, 190, room 317


September 18, 2014, at 4:00 PM

Lexical resources are essential for the development of Natural Language Processing and all the applications that come from it: Information Retrieval, Information Extraction, Summarization, Question Answering, Sentiment Analysis, Conversational Assistants, etc. Even for English, the most analyzed language, some resources are still missing; the existing lexica are not inter-related, and information that could, in principle, be obtained by pooling resources together is not readily available. The situation is much worse for Portuguese, where we lack the big projects that have accumulated resources such as WordNet, COMLEX, VerbNet, the Penn TreeBank, FrameNet, GATE, NLTK, MindNet, Delph-IN, ParGram, GF, etc. Together with colleagues, I've embarked on a long-term project of building lexical resources for Portuguese and making them open, so that others can build on them. Our basic strategy is to do as much as we can automatically, but to verify the automatically created data to the best of our abilities. I will describe the tools we have created and the ones we are working on, and explain why.


Valeria de Paiva is a mathematician, logician and computer scientist based in Sunnyvale, CA. She works as a Senior Research Scientist at Nuance Communications in the Artificial Intelligence and Reasoning group, using logic to improve communication with computer systems. Before that she was a research scientist at the Intelligent Systems Laboratory of PARC (Palo Alto Research Center), CA, for many years (2000-2008) and a search analyst at Cuil, Inc. and at Rearden Commerce, now Deem. She received her PhD in Mathematics from Cambridge University for work on "Dialectica Categories", and has worked on logical approaches to computation ever since. She is on the editorial boards of "Theory and Applications of Categories", "Logical Methods in Computer Science" and "Logica Universalis", as well as on the editorial board of the Springer book series "Logic, Language and Information". She is an Honorary Research Fellow at the School of Computer Science, University of Birmingham, UK, where she was an Assistant Professor, and she has recently taught at Stanford University and Santa Clara University in California.

Note for visitors:

