Graduate Theses & Dissertations

Time Series Algorithms in Machine Learning - A Graph Approach to Multivariate Forecasting
Forecasting future values of time series has long been a field with many and varied applications, from climate and weather forecasting to stock prediction and economic planning to the control of industrial processes. Many of these problems involve not only a single time series but many simultaneous series which may influence each other. This thesis provides machine learning methods for handling such problems. We first consider single time series with both single and multiple features. We review the algorithms and unique challenges involved in applying machine learning to time series. Many machine learning algorithms, when used for regression, are designed to produce a single output value for each timestamp of interest with no measure of confidence; however, evaluating the uncertainty of the predictions is an important component of practical forecasting. We therefore discuss methods of constructing uncertainty estimates in the form of prediction intervals for each prediction. Stability over long time horizons is also a concern for these algorithms, as recursion is a common method used to generate predictions over long time intervals. To address this, we present methods of maintaining stability in the forecast even over large time horizons. These methods are applied to an electricity forecasting problem, where we demonstrate their effectiveness for support vector machines, neural networks, and gradient boosted trees. We next consider spatiotemporal problems, which consist of multiple interlinked time series, each of which may contain multiple features. We represent these problems using graphs, allowing us to learn relationships using graph neural networks. Existing methods of doing this generally make use of separate time and spatial (graph) layers, or simply replace operations in temporal layers with graph operations. We show that these approaches have difficulty learning relationships that contain time lags of several time steps.
To address this, we propose a new layer inspired by the long short-term memory (LSTM) recurrent neural network, which adds a distinct memory state dedicated to learning graph relationships while keeping the original memory state. This allows the model to consider temporally distant events at other nodes without affecting its ability to model long-term relationships at a single node. We show that this model is capable of learning the long-term patterns that existing models struggle with. We then apply this model to a number of real-world bike-share and traffic datasets, where we observe improved performance when compared to other models with similar numbers of parameters. Author Keywords: forecasting, graph neural network, LSTM, machine learning, neural network, time series
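The abstract above notes that recursion is a common way to produce forecasts over long horizons. As a rough illustration only, the sketch below shows the general idea: a one-step model is applied repeatedly, and each prediction is fed back in as an input. The function names and the toy `mean_of_lags` model are hypothetical and are not taken from the thesis.

```python
from typing import Callable, List, Sequence


def recursive_forecast(
    history: Sequence[float],
    one_step_model: Callable[[List[float]], float],
    n_lags: int,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` steps ahead by feeding predictions back as inputs."""
    if n_lags < 1 or len(history) < n_lags:
        raise ValueError("need at least n_lags >= 1 observations")
    # Start from the most recent n_lags observed values.
    window = list(history[-n_lags:])
    forecasts: List[float] = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the prediction.
        window = window[1:] + [next_value]
    return forecasts


def mean_of_lags(window: List[float]) -> float:
    """Toy stand-in for a trained one-step regressor (an assumption)."""
    return sum(window) / len(window)


print(recursive_forecast([1.0, 2.0, 3.0], mean_of_lags, n_lags=2, horizon=3))
```

Because every step after the first consumes earlier predictions, errors can compound, which is the stability concern the abstract raises.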
Positive Solutions for Boundary Value Problems of Second Order Ordinary Differential Equations
In this thesis, we study modelling with non-linear ordinary differential equations, and the existence of positive solutions for Boundary Value Problems (BVPs). These problems have wide applications in many areas. The focus is on extensions of previous work on non-linear second-order differential equations with boundary conditions involving the first-order derivative. The contribution of this thesis is fourfold. First, using a fixed point theorem on order intervals, the existence of a positive solution on an interval for a non-local boundary value problem is obtained. Second, considering a different boundary value problem that contains the first-order derivative in the non-linear term, an increasing solution is obtained by applying the Krasnoselskii-Guo fixed point theorem. Third, the existence of two solutions, one solution and no solution for a BVP is proved by using fixed point index and iteration methods. Last, the results on Green's functions unify some methods in studying the existence of positive solutions for BVPs of nonlinear differential equations. Examples are presented to illustrate the applications of our results. Author Keywords: Banach Space, Boundary Value Problems, Differential Equations, Fixed Point, Norm, Positive Solutions
Application of One-factor Models for Prices of Crops and Option Pricing Process
This thesis is intended to help crop-dependent farmers hedge the price risks of their crops. First, we applied a one-factor model, which incorporated a deterministic function and a stochastic process, to predict the future prices of a crop (soybean). A discrete form was employed for one-month-ahead prediction. For general prediction, de-trending and de-cyclicality were used to remove the deterministic function. Three candidate stochastic differential equations (SDEs) were chosen to simulate the stochastic process: the mean-reverting Ornstein-Uhlenbeck (OU) process, the OU process with zero mean, and Brownian motion with drift. Least squares and maximum likelihood methods were used to estimate the parameters. Results indicated that the one-factor model worked well for soybean prices. We also provided a two-factor model as an alternative, and it also performed well in this case. In the second main part, a zero-cost option package was introduced and we theoretically analyzed the process of hedging. In the last part, option premiums obtained from the one-factor model were compared with those obtained from the Black-Scholes model. The differences and similarities suggested that the deterministic function, especially its cyclicality, played an essential role in the soybean price, and that the one-factor model was therefore more suitable than the Black-Scholes model for this underlying asset. Author Keywords: Brownian motion, Least Squares Method, Maximum Likelihood Method, One-factor Model, Option Pricing, Ornstein-Uhlenbeck Process
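To make the mean-reverting OU process and least squares estimation mentioned above more concrete, here is a minimal sketch using only the Python standard library. It assumes the standard form dX = theta (mu - X) dt + sigma dW with its exact discretization, and it fits the parameters by regressing each value on the previous one. The parameter values, function names, and monthly step size are illustrative assumptions, not the thesis's actual setup or data.

```python
import math
import random


def simulate_ou(theta, mu, sigma, x0, dt, n_steps, seed=0):
    """Simulate an OU path using the exact one-step transition."""
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    noise_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    path = [x0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay + noise_sd * rng.gauss(0.0, 1.0))
    return path


def fit_ou_least_squares(path, dt):
    """Estimate (theta, mu, sigma) by regressing X[t+1] on X[t]."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # estimates exp(-theta * dt)
    intercept = mean_y - slope * mean_x    # estimates mu * (1 - slope)
    resid_var = sum(
        (yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)
    ) / (n - 2)
    theta = -math.log(slope) / dt          # requires 0 < slope < 1
    mu = intercept / (1.0 - slope)
    sigma = math.sqrt(resid_var * 2.0 * theta / (1.0 - slope ** 2))
    return theta, mu, sigma


# Illustrative values only: monthly steps (dt = 1/12) around a mean of 10.
path = simulate_ou(theta=2.0, mu=10.0, sigma=0.5, x0=10.0, dt=1.0 / 12.0, n_steps=20000)
theta_hat, mu_hat, sigma_hat = fit_ou_least_squares(path, dt=1.0 / 12.0)
print(theta_hat, mu_hat, sigma_hat)
```

On a long simulated path the estimates should land close to the values used to generate it; with short real price series, the mean-reversion rate in particular is typically much harder to pin down.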
Exploring the Scalability of Deep Learning on GPU Clusters
In recent years, we have observed an unprecedented rise in popularity of AI-powered systems. They have become ubiquitous in modern life, being used by countless people every day. Many of these AI systems are powered, entirely or partially, by deep learning models. From language translation to image recognition, deep learning models are being used to build systems with unprecedented accuracy. The primary downside is the significant time required to train the models. Fortunately, the time needed for training the models is reduced through the use of GPUs rather than CPUs. However, with model complexity ever increasing, training times even with GPUs are on the rise. One possible solution to ever-increasing training times is to use parallelization to enable the distributed training of models on GPU clusters. This thesis investigates how to utilise clusters of GPU-accelerated nodes to achieve the best scalability possible, thus minimising model training times. Author Keywords: Compute Canada, Deep Learning, Distributed Computing, Horovod, Parallel Computing, TensorFlow
Effect of Listing a Stock on the S&P 500 Index on the Stock’s Volatility
This paper investigates the effect of listing a stock on the S&P 500 Index on the stock’s volatility, using two econometric models: GARCH and EGARCH. The study mainly addresses three issues: first, it analyzes stock volatility in two sub-periods; second, it determines whether the announcement can account for the fluctuations in the price of the stock; and finally, it investigates the change in the stock’s variance. After isolating the effects of external and industry shocks by using the returns on the S&P 500 Index as a proxy, the author finds evidence of structural change in a stock’s volatility after the stock is added to the index. Additionally, the existence of a dominant symmetric effect, which captures the response of volatility to news, indicates that information flowing into the market increased after the stock was included in the index. However, the rate at which old news is captured in the price falls. The empirical evidence also suggests that, on average, a stock’s variance falls and that the announcement to list a stock on the index has little effect on the stock’s price. Author Keywords: EGARCH, GARCH, S&P 500 Index, Symmetric Effect, Volatility
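For readers unfamiliar with GARCH, the sketch below shows the conditional variance recursion of a GARCH(1,1) model, sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The abstract does not state the model order or any parameter values, so the (1,1) order, the function name, and the numbers used here are assumptions for illustration only.

```python
from typing import List, Sequence


def garch11_variance(
    returns: Sequence[float], omega: float, alpha: float, beta: float
) -> List[float]:
    """Return one conditional variance per return under GARCH(1,1)."""
    if omega <= 0.0 or alpha < 0.0 or beta < 0.0 or alpha + beta >= 1.0:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    if not returns:
        return []
    # Initialise at the unconditional (long-run) variance.
    variances = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        # alpha weights the latest squared shock; beta carries over past variance.
        variances.append(omega + alpha * r * r + beta * variances[-1])
    return variances


# Illustrative parameters: a single large shock raises the next period's variance.
print(garch11_variance([0.0, 3.0, 0.0, 0.0], omega=0.1, alpha=0.1, beta=0.8))
```

In this recursion alpha governs how strongly new shocks move volatility and beta governs how slowly older information decays, which is the kind of distinction the abstract draws between new information and old news.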
Historic Magnetogram Digitization
The conversion of historical analog images to time series data was performed by using deconvolution for pre-processing, followed by the use of custom-built digitization algorithms. These algorithms have been developed to be user-friendly, with the objective of aiding in the creation of a data set from decades of mechanical observations collected from the Agincourt and Toronto geomagnetic observatories beginning in the 1840s. The algorithms follow a structure which begins with pre-processing, followed by tracing and pattern detection. Each digitized magnetogram was then visually inspected, and the algorithm performance verified, to ensure accuracy and to allow the data to later be connected to create a long-running time series. Author Keywords: Magnetograms
Assessing the Cost of Reproduction between Male and Female Sex Functions in Hermaphroditic Plants
The cost of reproduction refers to the use of resources for the production of offspring that decreases the availability of resources for future reproductive events and other biological processes. Models of sex-allocation provide insights into optimal patterns of resource investment in male and female sex functions and have been extended to include other components of the life history, enabling assessment of the costs of reproduction. These models have shown that, in general, costs of reproduction through female function should exceed costs through male function. However, those previous models only considered allocations from a single pool of shared resources. Recent studies have indicated that the type of resource currency can differ for female and male sex functions, and that this might affect costs of reproduction via effects on other components of the life history. Using multiple invasibility analysis, this study examined resource allocation to male and female sex functions, while simultaneously considering allocations to survival and growth. Allocation patterns were modelled using both shared and separate resource pools. Under shared resources, allocation patterns to male and female sex function followed the results of earlier models. When resource pools were separate, however, allocations to male function often exceeded allocations to female function, even if fitness gains increased less strongly with investment in male function than with investment in female function. These results demonstrate that the costs of reproduction are affected by (1) the types of resources needed for reproduction via female or male function and (2) trade-offs with other components of the life history. Future studies of the costs of reproduction should examine whether allocations to reproduction via female versus male function usually entail the use of different types of resources. Author Keywords: Cost of Reproduction, Gain Curve, Life History, Resource Allocation Patterns, Resource Currencies
Range-Based Component Models for Conditional Volatility and Dynamic Correlations
Volatility modelling is an important task in the financial markets. This paper first evaluates the range-based DCC-CARR model of Chou et al. (2009) in modelling larger systems of assets, vis-à-vis the traditional return-based DCC-GARCH. Extending Colacito, Engle and Ghysels (2011), range-based volatility specifications are then employed in the first stage of DCC-MIDAS conditional covariance estimation, including the CARR model of Chou et al. (2005). A range-based analog to the GARCH-MIDAS model of Engle, Ghysels and Sohn (2013) is also proposed and tested, which decomposes volatility into short- and long-run components and corrects for microstructure biases inherent to high-frequency price-range data. Estimator forecasts are evaluated and compared in a minimum-variance portfolio allocation experiment following the methodology of Engle and Colacito (2006). Some consistent inferences are drawn from the results, supporting the models proposed here as empirically relevant alternatives. Range-based DCC-MIDAS estimates produce efficiency gains over DCC-CARR which increase with portfolio size. Author Keywords: asset allocation, DCC MIDAS, dynamic correlations, forecasting, portfolio risk management, volatility
Prescription Drugs
Medication used to treat human illness is one of the greatest developments in human history. In Canada, prescription drugs have been developed and made available to treat a wide variety of illnesses, from infections to heart disease. Records of prescription drug fulfillment at coarse Canadian geographic scales were obtained from Health Canada in order to track the use of these drugs by the Canadian population. The records were in a variety of inconsistent formats, including a large selection of years for which only paper tabular records (hard copies) were available. In this work, we organize, digitize, proof and synthesize the full available data set of prescription drug records, from paper to final database. Extensive quality control was performed on the data before use. This data was then analyzed for temporal and spatial changes in prescription drug use across Canada from 1990 to 2013. In addition, one of the major research areas in environmental epidemiology is the study of population health risk associated with exposure to ambient air pollution. Prescription drugs can moderate public health risk by reducing the drug user's physiological symptoms and preventing acute health effects (e.g., strokes and heart attacks). The cleaned prescription drug data was considered in the context of a common model to examine its influence on the association between air pollution exposure and various health outcomes. Since prescription drug data were available only at the provincial level, a Bayesian hierarchical model was employed to include prescription drugs as a covariate at the regional level; the regional estimates were then combined to estimate the association at the national level. Although further investigations are required, the study results suggest that prescription drugs influenced the air pollution related public health risk. Author Keywords: Data, Error checking, Population health, Prescriptions
Modelling Request Access Patterns for Information on the World Wide Web
In this thesis, we present a framework to model user object-level request patterns in the World Wide Web. This framework consists of three sub-models: one for file access, one for Web pages, and one for storage sites. Web pages are modelled as being made up of objects of different types and sizes, which are characterized by way of categories. We developed a discrete event simulation to investigate the performance of systems that utilize our model. Using this simulation, we established parameters that produce a wide range of conditions that serve as a basis for generating a variety of user request patterns. We demonstrated that with our framework, we can affect the mean response time (our performance metric of choice) by varying the composition of Web pages using our categories. To further test our framework, it was applied to a Web caching system, for which our results showed improved mean response time and server load. Author Keywords: discrete event simulation (DES), Internet, performance modelling, Web caching, World Wide Web
Long-term Financial Sustainability of China's Urban Basic Pension System
Population aging has become a worldwide concern since the nineteenth century. The decrease in birth rate and the increase in life expectancy will make China’s population age rapidly. If the growth rate of the number of workers is less than that of the number of retirees, in the long run, there will be fewer workers per retiree. This will put great pressure on China’s public pension system in the next several decades. This is a global problem known as the “pension crisis”. In this thesis, a long-term vision for China’s urban pension system is presented. Based on mathematical models and projections for demographic variables, economic variables and pension scheme variables, we test how changes in key variables affect the balances of the pension fund in the next 27 years. This thesis applies methods of deterministic and stochastic modeling as well as sensitivity analysis to the problem. Using sensitivity analysis, we find that the pension fund balance is highly sensitive to changes in retirement age compared with other key variables. Monte Carlo simulations are also used to find the possible distributions of the pension fund balance by the end of the projection period. Finally, according to this analysis, several changes in retirement age are recommended in order to maintain the sustainability of China’s urban basic pension scheme. Author Keywords: China, demographic changes, Monte Carlo simulation, pension fund, sensitivity tests, sustainability
Modelling Depressive Symptoms in Emerging Adulthood
Depression during the transition into adulthood is a growing mental health concern, with overwhelming evidence linking the developmental risk for depressive symptoms with maternal depression. However, there is a lack of research on the protective role of socioemotional competencies in this context. This study examines independent and joint effects of maternal depression and trait emotional intelligence (TEI) on the longitudinal trajectory of depressive symptoms during emerging adulthood. A series of latent growth models was applied to three biennial cycles of data from a nationally representative sample (N=933) from the Canadian National Longitudinal Survey of Children and Youth. We assessed the trajectory of self-reported depressive symptoms from age 20 to 24 years, as well as whether it was moderated by maternal depression at age 10 to 11 and TEI at age 20, separately by gender. The results indicated that mean levels of depression declined during emerging adulthood in females, but remained relatively stable in males. Maternal depressive symptoms significantly and positively predicted depressive symptoms across the whole of emerging adulthood in females, but only at age 20-21 in males. In addition, the likelihood of developing depressive symptoms was attenuated by higher global TEI in both females and males, and additionally by higher interpersonal skills in males. Our findings suggest that interventions for depressive symptoms in emerging adulthood should consider the development of socioemotional competencies. Author Keywords: Depression, Depressive Symptoms, Emerging Adulthood, Intergenerational Risk, Longitudinal, Trait Emotional Intelligence
