Carnegie Mellon University

Neural Sequential Modeling and Applications

thesis
posted on 2025-05-20, 19:54, authored by Guokun Lai

How to model sequential data in various settings is an important machine learning problem across many domains, including prediction over time series data, natural language text, and event streams. Sequential data in different fields usually have different characteristics. For example, natural-language text can be viewed as a sequence of discrete variables, while sensor-network signals can be treated as a multivariate sequence in a continuous vector space. To develop successful neural network models for such varied real-world domains, we need to customize the architectures and algorithms to the nature of the data and the problems. This thesis designs novel and efficient neural network solutions for sequential modeling and its applications. Specifically, the contributions can be grouped into four parts.

  • The first part focuses on the correlation among variables in multivariate sequential data, such as the time series of multiple sensors, and proposes two novel algorithms, namely the Depthwise Separable Graph Convolution Network (DSGC) (Chapter 2) [60] and the Factorized Recurrent Neural Network (FRNN) (Chapter 3) [63], for leveraging correlation patterns and improving prediction accuracy.
  • The second part focuses on incorporating human prior knowledge into the temporal modeling of dependency patterns in sequential data. Specifically, we propose a novel approach named the Long- and Short-term Time-series Network (LST-Net) (Chapter 4) [59], which has proven particularly effective at capturing various periodic patterns in different applications.
  • The third part focuses on efficient algorithms for Transformers in sequence classification tasks. Specifically, by identifying the computational redundancy in commonly used Transformer architectures and proposing a novel replacement, namely the Funnel Transformer (Chapter 5) [27], we achieve a better trade-off between computation and accuracy.
  • The fourth part focuses on the modeling and prediction of temporal relationships among events, where the major challenge is learning effectively from sparsely labeled data. We address this challenge by combining advanced data augmentation, semi-supervised learning, and the introduction of human prior knowledge (Chapter 6). As a result, we improve the state-of-the-art performance on this task by a large margin.

History

Date

2021-08-19

Degree Type

  • Dissertation

Thesis Department

  • Language Technologies Institute

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Yiming Yang
