Please use this identifier to cite or link to this item:
http://hdl.handle.net/10791/418
Title: | ANALYSIS OF PARALLEL PRE-PROCESSING OF MALWARE DATA FOR MACHINE LEARNING IN PYTHON |
Authors: | Larsen, Nels |
Supervisor(s): | Qing Tan (Athabasca University) |
Examining Committee: | Dr. Harris Wang (Athabasca University); Dr. Ali Dewan (Athabasca University) |
Degree: | Master of Science, Information Systems (MScIS) |
Department: | Faculty of Science and Technology |
Keywords: | Python Optimization Effective Cloud Computing Virtual Machines Concurrent Processing Dictionary |
Issue Date: | 10-Aug-2023 |
Abstract: | Resources needed for machine learning can be demanding. Training machine learning models on large datasets can take hours or days. The shift to cloud resources encourages machine learning practitioners to use powerful virtual machines, whose high costs vary with the time the machine is operational. The second part of this thesis research analyzes the data processing execution time for the malware dataset. The processing can be improved by better hardware, and this research provides specific recommendations for hardware improvement. This research demonstrates how a dictionary that accepts multiple concurrent inputs can reduce the time taken for data processing tasks. Standardizing the time taken for a task to run allows for better budgeting decisions and encourages documentation and process changes that give future researchers better outcomes. |
Graduation Date: | Aug-2023 |
URI: | http://hdl.handle.net/10791/418 |
Appears in Collections: | Theses & Dissertations |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.