The sucinto project aimed to investigate generic and topic-focused multi-document
summarization strategies for providing more feasible and intelligent
access to online information from news agencies. This commitment
brought back old and well-known scientific challenges from the first studies in
summarization in the 1950s and introduced several new and exciting challenges,
e.g., dealing with redundant, complementary, and contradictory
information, normalizing different writing styles and referring
expression choices, balancing different perspectives and sides of
the same events and facts, properly handling evolving events and
their narration at different moments, and arranging pieces of information
from different texts to produce coherent and cohesive summaries, among several others.
An ultimate goal of this project was to bring the developed tools together
as online applications for end users.
This project took into consideration not only classical approaches
to single and multi-document summarization, but also new ones, following different paradigms and using
knowledge of varied nature ranging from empirical and statistical data
to semantic and discourse models. Research interests included (i) the
modeling of the summarization process (content selection, planning,
aggregation, generalization, substitution, information ordering, etc.) by means of Cross-document
Structure Theory (CST), Rhetorical Structure Theory (RST), ontologies,
and statistical models of language and summarization, (ii) the investigation of related
tasks such as discourse parsing, topic detection, temporal annotation and resolution,
coreference resolution, text-summary alignment, and multilingual processing, and (iii) the
linguistic characterization of multi-document summaries and their manual
production. The project was developed at NILC (Interinstitutional
Center for Computational Linguistics), one of the largest research
groups on Natural Language Processing and Computational Linguistics in
Brazil. It started in 2007 as a natural follow-up to previous projects on
single-document summarization carried out at NILC (FAPESP #2006/02887-9;
see also related projects). It was supported by the research agencies
FAPESP, CNPq, and CAPES, which granted scholarships for undergraduate and
graduate students and regular financial support for the project
(FAPESP #2015/17841-3, FAPESP #2012/03071-3, FAPESP #2009/05603-0).
The project officially ended at the end of 2017.