NILC-WISE - Web
Interface for Summary Evaluation - an easy-to-use online
interface for running
ROUGE (Lin, 2004)
for evaluating summaries
Summarization
extension to Google Chrome - extension for on-line news
summarization, based on the
RSumm system
OpCluster-PT - as described in the MSc Dissertation of Vargas (2017), a new computational method based on semantic relations and linguistic rules to automatically detect fine-grained opinions in User-Generated Content (UGC)
Models for summary
coherence evaluation - a set of implemented models for summary
coherence evaluation, following several approaches, from traditional
entity grids to discourse grids. See the
PhD thesis of Marcio de Souza
Dias for more information.
RC-4
multi-document summarizer - based on the best RST & CST-based
summarization strategy proposed by
Cardoso (2014)
RCT-4
multi-document summarizer - based on the best RST & CST &
subtopics-based summarization strategy proposed by
Cardoso (2014). This method differs from the one above in that it
adds subtopic segmentation and treatment.
Text-summary alignment - tool that includes a set of methods for
aligning texts and their multi-document summaries, as developed by
Agostini et al. (2014)
TextTiling for Portuguese -
topical segmentation tool adapted to news texts in Brazilian
Portuguese, based on the work of
Hearst (1997)
ViSum - a visualization
system for multi-document summarization (described by
Lima, 2013)
Lemmatizer for
Portuguese - based on the MXPOST part-of-speech tagger and
UNITEX dictionaries for Portuguese, this tool produces the lemmas of
the words of a text stored in a plain text file. The source code is
also provided. For more details, see the readme.pdf file or contact
Erick G. Maziero (the developer of the system)
NCLEANER trained model
for Portuguese - a trained model to be used with NCleaner
(Evert, 2008) for cleaning web pages in Portuguese. The model
was trained with 184 texts from several online sources, such as Terra,
UOL, BBC, Exame, Estadão, IG,
R7, Zero Hora, G1, JB Online, and O
Globo, among others.
CSTTool
- a semi-automatic editing tool for annotating texts according to
the Cross-document Structure Theory (see
Aleixo and Pardo, 2008)
Newshead
- an on-line tool for searching and clustering related news
RSTeval - a tool
for discourse parsing evaluation, following the evaluation method of
Marcu (2000) - the tool can compare RST trees
(automatically or manually produced), producing precision and recall
numbers (see Maziero and
Pardo, 2009)
Syntax-based text segmentation tool - a tool for detecting
elementary discourse units in texts - it uses the parser PALAVRAS
(Bick, 2000) for analyzing the input text and, then, applies
syntactical segmentation rules
RST Toolkit - utility programs for processing RST files,
offering several computational facilities for both computational and
linguistic purposes
Sentence ordering
program - program for ordering sentences in a multi-document
summary (given the source texts) (see
Lima and Pardo, 2012)
CSTSumm - a multi-document summarizer based on CST
information (see README.txt in the rar file) (see
Castro Jorge, 2010)
RSumm - a
multi-document summarizer based on the relationship maps proposed by
Salton et al. (1997) (see
Ribaldo et al., 2012 and
Ribaldo, 2013)
DiZer 2.0 - an
on-line RST discourse parser, which is easily adaptable and portable
to different text types/genres and languages (see
Maziero et al., 2011)
CSTParser - a state-of-the-art CST
discourse parser for Portuguese, using both symbolic and machine
learning techniques (see
Maziero, 2012)
Its stand-alone
(offline) version (with some adaptations in relation to the
online version) is also freely available for use
NASP (see NASP++ below) - a tool for aiding
in word sense annotation of nouns in Portuguese, using Princeton
WordNet as the sense repository
NASP++ - an improved version of NASP
(see above), with more facilities (e.g., the underlying generation
of ontologies for the annotated words) and adapted to other
part-of-speech tags
MulSEN - a
multilingual version of NASP (see above)
CSTNews-Update - a new arrangement of CSTNews texts for training
and testing update summarization methods for Portuguese
Corpora for sentence compression - two corpora composed of long
(original) sentences and their compressed versions for Portuguese
Corpus of automatic multi-document summaries with linguistic errors
- a corpus of automatic multi-document summaries (for the texts of
the CSTNews corpus) produced by 4 different
summarizers with varied performances, manually annotated with
linguistic errors
OpiSums-PT - a corpus of
(extractive and abstractive) opinion summaries (170, in total) for
reviews of books (13 reviews) and electronic products (4 reviews),
written in Brazilian Portuguese
Aspect ontologies - groups of (hierarchically organized) opinion
aspects for supporting opinion mining tasks, including the domains
of smartphones, digital cameras and books, in OWL format
CSTNews
interface - on-line browsing interface to the CSTNews corpus
CSTNews - a corpus with 50
clusters of news texts - in Portuguese - along with their multi-document summaries,
as well as several discourse and semantic annotations (see
Aleixo and Pardo, 2008;
Cardoso et al., 2011)