Please use this identifier to cite or link to this item:
http://hdl.handle.net/10791/334
Title: | Leveraging local and global word context for multi-label document classification |
Authors: | Ellis, Robert |
Supervisor(s): | Wen, Dunwei (Faculty of Science and Technology, School of Computing and Information Systems) |
Examining Committee: | Dewan, Ali (Faculty of Science and Technology, School of Computing and Information Systems); Bagheri, Ebrahim (Ryerson University) |
Degree: | Master of Science, Information Systems (MScIS) |
Department: | Faculty of Science and Technology |
Keywords: | Recurrent Convolutional Neural Network; Classification; Attention; Hierarchy; Ensemble; Siamese |
Issue Date: | 25-Nov-2020 |
Abstract: | With the increasing volume of text documents, it is crucial to identify the themes and topics they contain. Labelling documents with the identified topics is called multi-label classification. Interdependencies exist not only between words, but also between sentences and paragraphs; these longer sequences and more complex relationships make label identification harder. Five novel deep neural networks are proposed and evaluated on their performance in classifying longer documents. The RCLNN applies the RCL to NLP, combining it with a CNN, an architecture that has demonstrated success on short text. The QRCNN similarly extends a CNN, pairing it with a QRNN. The remaining three models build on these base models, integrating them in a novel pseudo-Siamese approach. Experiments find the QRCNN to be the highest-performing model overall, with the PSRCNNA model a close second, indicating that the pseudo-Siamese approach can perform well when combined with attention. |
Graduation Date: | Nov-2020 |
URI: | http://hdl.handle.net/10791/334 |
Appears in Collections: | Theses & Dissertations |
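The abstract above describes hybrid models that pair convolutional and recurrent branches in a pseudo-Siamese arrangement with attention. The following is a minimal illustrative sketch of that general idea only: two non-weight-sharing branches over a shared embedding (a convolutional branch for local word context and an attention-pooled recurrent branch for longer-range context) whose features are concatenated for per-label prediction. All layer choices, sizes, and names are assumptions for illustration, not the thesis's RCLNN, QRCNN, or PSRCNNA implementations.

# Illustrative sketch only; layer sizes and names are assumed, not taken from the thesis.
import torch
import torch.nn as nn

class PseudoSiameseClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_labels, hidden_dim=128):
        super().__init__()
        # A shared embedding feeds two different, non-weight-sharing branches,
        # hence "pseudo"-Siamese rather than a true Siamese network.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Branch 1: 1-D convolution captures local n-gram (short-text) features.
        self.conv = nn.Conv1d(embed_dim, hidden_dim, kernel_size=3, padding=1)
        # Branch 2: bidirectional GRU captures longer-range word context.
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Additive attention scores each recurrent time step before pooling.
        self.attn = nn.Linear(2 * hidden_dim, 1)
        # Concatenated branch features produce one logit per label (multi-label).
        self.classifier = nn.Linear(hidden_dim + 2 * hidden_dim, num_labels)

    def forward(self, token_ids):                       # token_ids: (batch, seq_len)
        x = self.embedding(token_ids)                   # (batch, seq_len, embed_dim)
        # Convolutional branch: max-pool over time.
        c = torch.relu(self.conv(x.transpose(1, 2)))    # (batch, hidden_dim, seq_len)
        c = c.max(dim=2).values                         # (batch, hidden_dim)
        # Recurrent branch: attention-weighted sum over time steps.
        h, _ = self.rnn(x)                              # (batch, seq_len, 2*hidden_dim)
        weights = torch.softmax(self.attn(h), dim=1)    # (batch, seq_len, 1)
        r = (weights * h).sum(dim=1)                    # (batch, 2*hidden_dim)
        return self.classifier(torch.cat([c, r], dim=1))  # raw logits per label

# Usage: logits pair with a sigmoid (and BCEWithLogitsLoss during training) so that
# each label is predicted independently, which is what makes the task multi-label.
model = PseudoSiameseClassifier(vocab_size=5000, embed_dim=100, num_labels=10)
logits = model(torch.randint(1, 5000, (4, 200)))        # 4 documents, 200 tokens each
probabilities = torch.sigmoid(logits)                   # independent label probabilities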
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.