sustento
- Generation of Linguistic Knowledge for Multi-document Automatic
Summarization (coordinator: Ariani Di Felippo, DL-UFSCar)
sustento is a two-year research project that aims to generate
knowledge to support more linguistically motivated strategies for
multi-document automatic summarization of texts in Brazilian
Portuguese. Specifically, the project focuses on three related tasks:
(i) linguistic characterization of multi-document summaries and their
manual production, since multi-document summarization has so far
relied only on clues about human summarization; (ii) corpus-based
studies of multi-document phenomena (redundancy, contradiction and
complementarity); and (iii) representation of semantic-conceptual
knowledge and construction of resources and tools, since there are no
methods based on this level of knowledge for multi-document
summarization of Brazilian Portuguese texts.
TermiNet
- Instantiation and Application of a Methodology for the Development of
"Terminological Wordnets" in Brazilian Portuguese
(coordinator: Ariani Di Felippo, DL-UFSCar)
Due to the increasing need to process specialized texts,
domain-specific (or terminological) lexical databases have been built
in many languages, especially in wordnet format. Despite the
existence of a reasonable number of terminological wordnets in many
languages, there is no clear and generic methodology for building
them. For Brazilian Portuguese (BP) in particular, there is no
domain-specific lexical database in the wordnet model. Consequently,
we propose: (i) to instantiate a generic NLP methodology for
developing terminological wordnets, and (ii) to apply it to build a
terminological wordnet in BP. This methodology distinguishes itself
by reconciling the linguistic and computational facets of NLP
research. Besides the benefits to the NLP domain, terminological
wordnets may also contribute to the development of
terminological/terminographic products, since the organization of
lexical-conceptual knowledge is an essential step in building such
products.
PorSimples
- Simplification of Portuguese Text for Digital Inclusion and
Accessibility (coordinator: Sandra M. Aluísio, ICMC-USP)
In the PorSimples project we propose the development of a technology
to facilitate access to information for functional illiterates (FI)
and potentially for people with other cognitive disabilities (e.g.
aphasia or dyslexia). This technology will be made available by means
of two systems aimed at distinct users: an authoring system to help
authors produce simplified texts targeting FI, and a simplification
system to allow FI to read Web content. The latter explores the tasks
of summarization and simplification, as well as text presentation
schemes that should highlight the associations among the main ideas
of the text, the named entities, semantic roles and lexical
elaboration.
PLN-BR
- Tools and Resources for Information Retrieval from Textual Bases in
Brazilian Portuguese (coordinator: Maria das Graças V. Nunes, ICMC-USP)
This project aimed to create an inter-institutional space for
interaction and exchange of research practices in Computational
Linguistics, for the investigation and development of information
representation and retrieval tasks in Brazilian Portuguese.
ProCaCoSa
- Coreference Chains Processing for Automatic Summarization of
Portuguese texts (coordinator: Lucia H. M. Rino, DC-UFSCar)
This project aims at analyzing and solving
summarization problems caused by unresolved coreferences in content
selection and structuring during summary production. The general
purpose is to use information about the coreference chains in the
source text to produce better summaries.
EXPLOSA
- EXPLOration of several methods for Automatic Summarization
(coordinator: Lucia H. M. Rino, DC-UFSCar)
Fundamental and experimental approaches are
tackled by means of a variety of small projects under the EXPLOSA
scenario. The former is pursued through discourse-driven text
generation; the latter, through extraction-based automatic
summarization (AS) methods.