posted on 2021-10-01, 20:58, authored by Samir Abdaljalil
The number of electronic text documents is growing, and so is the need for automatic text summarizers. In the finance domain, documents can be quite long, averaging approximately 180 pages. This creates a need for efficient, technology-driven ways to exploit these textual datasets. It goes hand in hand with the pressing need to make investment and financial decisions quickly in order to maximize financial gain. However, exhaustive reading of financial documents is extremely laborious. Hence, automatic summarization methods could greatly simplify this task. In this work, we present several approaches for summarizing the qualitative sections of annual reports using extractive summarization, Natural Language Processing (NLP), and Machine Learning techniques. We investigated multiple approaches under two types of exploration: sentence-based summarization and section-based summarization tailored to the structure of financial reports. We then evaluated the quality of the summaries using an existing dataset published by the FNS-2020 shared task, which consists of annual reports by British firms listed on the London Stock Exchange.