Carnegie Mellon University

Narrative Summarization in the Domain of Finance

thesis
posted on 2021-10-01, 20:58 authored by Samir Abdaljalil
The number of electronic text documents is growing, and so is the need for automatic text summarizers. In the finance domain, documents can be quite long, averaging approximately 180 pages. This creates a need for efficient ways to use technology to exploit these textual datasets, and it goes hand in hand with the pressing need to make investment and financial decisions quickly to maximize financial gain. However, exhaustive reading of financial documents is extremely laborious. Hence, automatic summarization methods could greatly simplify this task. In this work, we present several approaches for summarizing the qualitative sections of annual reports using extractive summarization, Natural Language Processing (NLP), and Machine Learning techniques. We investigated multiple approaches under two types of exploration: sentence-based summarization and section-based summarization tailored to the structure of financial reports. We then evaluated the quality of the summaries using an existing dataset, published by the FNS-2020 shared task, that consists of annual reports by British firms listed on the London Stock Exchange.

Date

2021-04-30

Advisor(s)

Houda Bouamor

Department

  • Information Systems
