Graduate Theses & Dissertations


Characteristics of Models for Representation of Mathematical Structure in Typesetting Applications and the Cognition of Digitally Transcribing Mathematics
The digital typesetting of mathematics can present many challenges to users, especially those of novice to intermediate experience levels. Through a series of experiments, we show that two models used to represent mathematical structure in these typesetting applications, the 1-dimensional structure-based model and the 2-dimensional freeform model, cause interference with users' working memory during the process of transcribing mathematical content. This is a notable finding, as a connection between working memory and mathematical performance has been established in the literature. Furthermore, we find that elements of these models allow them to handle various types of mathematical notation with different degrees of success. Notably, the 2-dimensional freeform model allows users to insert and manipulate exponents with increased efficiency and reduced cognitive load and working memory interference, while the 1-dimensional structure-based model allows for handling of the fraction structure with greater efficiency and decreased cognitive load. Author Keywords: mathematical cognition, mathematical software, user experience, working memory
Development of a Cross-Platform Solution for Calculating Certified Emission Reduction Credits in Forestry Projects under the Kyoto Protocol of the UNFCCC
This thesis presents an exploration of the requirements for and development of a software tool to calculate Certified Emission Reduction (CER) credits for afforestation and reforestation projects conducted under the Clean Development Mechanism (CDM). We examine the relevant methodologies and tools to determine what is required to create a software package that can support a wide variety of projects involving a large variety of data and computations. During requirements gathering, it was determined that the software package would need to support entering and editing equations at runtime. To create the software, we used Java as the programming language, an H2 database to store our data, and an XML file to store our configuration settings. Through these choices, we built a cross-platform software solution for the purpose outlined above. The end result is a versatile software tool through which users can create and customize projects to meet their unique needs as well as utilize the features provided to streamline the management of their CDM projects. Author Keywords: Carbon Emissions, Climate Change, Forests, Java, UNFCCC, XML
Educational Data Mining and Modelling on Trent University Students’ Academic Performance
Higher education is important. It enhances both individual and social welfare by improving productivity, life satisfaction, and health outcomes, and by reducing rates of crime. Universities play a critical role in providing that education. Because academic institutions face resource constraints, it is important that they deploy resources in support of student success in the most efficient ways possible. To inform that efficient deployment, this research analyzes institutional data reflecting undergraduate student performance to identify predictors of student success measured by GPA, rates of credit accumulation, and graduation rates. Using methods of cluster analysis and machine learning, the analysis yields predictions for the probabilities of individual success. Author Keywords: Educational data mining, Students’ academic performance modelling
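As an illustration of the cluster-then-summarize idea in this abstract, the following is a minimal Python sketch, assuming a numeric feature matrix and a binary graduation indicator; the feature choice, cluster count, and scikit-learn pipeline are assumptions for illustration, not the thesis's actual model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_success_rates(features, graduated, n_clusters=4, seed=0):
    """Group students by standardized performance features and report the
    observed graduation rate per cluster as a rough success probability.

    `features` (2-D numeric array) and `graduated` (0/1 NumPy array) are
    hypothetical inputs; the thesis's variables are not reproduced here.
    """
    X = StandardScaler().fit_transform(features)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    return {c: float(np.mean(graduated[labels == c])) for c in range(n_clusters)}
```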
Sinc-Collocation Difference Methods for Solving the Gross-Pitaevskii Equation
The time-dependent Gross-Pitaevskii Equation, describing the movement of particles in quantum mechanics, may not be solved analytically due to its inherent nonlinearity. Hence numerical methods are important for approximating the solution. This study develops a scheme that is discrete in time and space to simulate the solution on a finite domain, using the Crank-Nicolson difference method and Sinc Collocation Methods (SCM), respectively. In theory and practice, the time discretization is second-order accurate, while the SCMs' errors decay exponentially. A new SCM with a unique boundary treatment is proposed and compared with the original SCM and other similar numerical techniques in time cost and numerical error. As a result, the new SCM decays errors faster than the original one. Also, to attain the same accuracy, the new SCM interpolates fewer nodes than the original SCM, which saves computational cost. The new SCM is capable of approximating partial differential equations under different boundary conditions, which can be extensively applied in fitting theory. Author Keywords: Crank-Nicolson difference method, Gross-Pitaevskii Equation, Sinc-Collocation methods
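For reference, a minimal sketch of the equation and time stepping named in this abstract, written in a common dimensionless one-dimensional form; the potential $V(x)$ and interaction coefficient $\beta$ are generic placeholders, not necessarily the thesis's choices:

$$ i\,\partial_t \psi = -\tfrac{1}{2}\,\partial_{xx}\psi + V(x)\,\psi + \beta\,|\psi|^2\psi, \qquad i\,\frac{\psi^{n+1}-\psi^{n}}{\Delta t} = \tfrac{1}{2}\Big[\mathcal{L}\big(\psi^{n+1}\big) + \mathcal{L}\big(\psi^{n}\big)\Big], $$

where $\mathcal{L}(\psi) = -\tfrac{1}{2}\,\partial_{xx}\psi + V\psi + \beta|\psi|^2\psi$; averaging $\mathcal{L}$ over the two time levels is what gives the Crank-Nicolson scheme its second-order accuracy in $\Delta t$.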
Combinatorial Collisions in Database Matching
Databases containing information such as location points, web searches and financial transactions are becoming the new normal as technology advances. Consequently, searches and cross-referencing in big data are becoming a common problem, as computing and statistical analysis increasingly allow the contents of such databases to be analyzed and dredged for data. Searches through big data are frequently done without a hypothesis formulated beforehand, and as these databases grow and become more complex, the room for error also increases. Regardless of how these searches are framed, the data they collect may lead to false convictions. DNA databases may be of particular interest, since DNA is often viewed as significant evidence; however, such evidence is sometimes not interpreted in a proper manner in the courtroom. In this thesis, we present and validate a framework for investigating various collisions within databases using Monte Carlo simulations, with examples from DNA. We also discuss how DNA evidence may be wrongly portrayed in the courtroom, and the explanation behind this. We then outline the problem which may occur when numerous types of databases are searched for suspects, and a framework to address these problems. Author Keywords: big data analysis, collisions, database searches, DNA databases, monte carlo simulation
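The kind of Monte Carlo collision estimate this abstract describes can be sketched as follows; this simplified example assumes uniformly random, independent profiles, a toy stand-in rather than the thesis's DNA-specific match probabilities.

```python
import random

def collision_probability(db_size, profile_space, trials=10_000):
    """Monte Carlo estimate of the chance that at least two records in a
    database of `db_size` share the same randomly drawn profile."""
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(db_size):
            profile = random.randrange(profile_space)
            if profile in seen:
                hits += 1
                break
            seen.add(profile)
    return hits / trials

# Example: 10,000 records drawn uniformly from 10 million possible profiles.
print(collision_probability(10_000, 10_000_000))
```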
Influence of geodemographic factors on electricity consumption and forecasting models
The residential sector is a major consumer of electricity, and its demand will rise by 65 percent by the end of 2050. The electricity consumption of a household is determined by various factors, e.g., house size, socio-economic status of the family, size of the family, etc. Previous studies have only identified a limited number of socio-economic and dwelling factors. In this thesis, we study the significance of 826 geodemographic factors on electricity consumption for 4917 homes in the City of London. Geodemographic factors cover a wide array of categories, e.g., social, economic, dwelling, family structure, health, education, finance, occupation, and transport. Using Spearman correlation, we have identified 354 factors that are strongly correlated with electricity consumption. We also examine the impact of using geodemographic factors in designing forecasting models. In particular, we develop an encoder-decoder LSTM model which shows improved accuracy with geodemographic factors. We believe that our study will help energy companies design better energy management strategies. Author Keywords: Electricity forecasting, Encoder-decoder model, Geodemographic factors, Socio-economic factors
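A minimal sketch of the Spearman screening step described above, assuming the data sits in a pandas DataFrame of numeric columns with a hypothetical `annual_kwh` consumption column; the column names and the 0.05 significance cut-off are illustrative, not the thesis's.

```python
import pandas as pd
from scipy.stats import spearmanr

def screen_factors(df, target="annual_kwh", alpha=0.05):
    """Rank geodemographic factors by Spearman correlation with consumption.

    The DataFrame layout, the `annual_kwh` target column, and the 0.05
    threshold are illustrative assumptions only.
    """
    rows = []
    for col in df.columns:
        if col == target:
            continue
        rho, p = spearmanr(df[col], df[target], nan_policy="omit")
        if p < alpha:
            rows.append((col, rho, p))
    result = pd.DataFrame(rows, columns=["factor", "spearman_rho", "p_value"])
    # Strongest correlations (by absolute value) first.
    return result.reindex(result["spearman_rho"].abs().sort_values(ascending=False).index)
```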
Pathways to Innovation
Research and development activities conducted at universities and firms fuel economic growth and play a key role in the process of innovation. Specifically, prior research has investigated the widespread university-to-firm research development path and concluded that universities are better suited for early stages of research while firms are better positioned for later stages. This thesis aims to present a novel explanation for the pervasive university-to-firm research development path. The model developed uses game theory to visualize and analyze interactions between a firm and a university under different strategies. The results reveal that academic research signals knowledge, which helps attract tuition-paying students. Generating these tuition revenues is facilitated by university research discoveries which, once published, a firm can build upon to make new innovative products. In an environment of weak intellectual property rights, moreover, the university-to-firm research development path enables firms to bypass the hefty costs that are involved in basic research activities. The model also provides a range of solution scenarios in which a university and firm may find it viable to initiate a research line. Author Keywords: Game theory, Intellectual property rights, Nash equilibrium, Research and development, University-to-firm research path
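To make the game-theoretic framing concrete, here is a small sketch that enumerates pure-strategy Nash equilibria of a two-player game; the strategy labels and payoff numbers are purely hypothetical and are not the payoffs derived in the thesis.

```python
import numpy as np

def pure_nash_equilibria(u_payoff, f_payoff):
    """Enumerate pure-strategy Nash equilibria of a two-player game.

    Rows index the university's strategies, columns the firm's; a cell is
    an equilibrium when neither player can gain by deviating unilaterally.
    """
    eqs = []
    for i in range(u_payoff.shape[0]):
        for j in range(u_payoff.shape[1]):
            if u_payoff[i, j] >= u_payoff[:, j].max() and f_payoff[i, j] >= f_payoff[i, :].max():
                eqs.append((i, j))
    return eqs

# Hypothetical payoffs. University strategies: {do basic research, don't};
# firm strategies: {develop the published research, don't}.
U = np.array([[4, 1], [2, 2]])
F = np.array([[5, 0], [1, 1]])
print(pure_nash_equilibria(U, F))
```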
Automated Grading of UML Class Diagrams
Learning how to model the structural properties of a problem domain or an object-oriented design in the form of a class diagram is an essential learning task in many software engineering courses. Since grading UML assignments is a cumbersome and time-consuming task, there is a need for an automated grading approach that can assist instructors by speeding up the grading process, as well as ensuring consistency and fairness for large classrooms. This thesis presents an approach for automated grading of UML class diagrams. A metamodel is proposed to establish mappings between the instructor solution and all the solutions for a class, which allows the instructor to easily adjust the grading scheme. The grading algorithm uses syntactic, semantic, and structural matching to match a student's solution with the instructor's solution. The efficiency of this automated grading approach has been empirically evaluated in two real-world settings: a beginner undergraduate class of 103 students required to create an object-oriented design model, and an advanced undergraduate class of 89 students elaborating a domain model. The experimental results show that the grading approach should be configurable so that it can adapt its grading strategy and strictness to the level of the students and the grading styles of different instructors. It is also important to consider multiple solution variants in the grading process. The grading algorithm and tool are proposed and validated experimentally. Author Keywords: automated grading, class diagrams, model comparison
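A toy sketch of the syntactic-matching component: each diagram is represented as a mapping from class name to attribute set, and each instructor class is matched to the most similarly named student class. The data layout, similarity measure, and threshold are assumptions for illustration; the thesis's metamodel additionally performs semantic and structural matching.

```python
from difflib import SequenceMatcher

def grade_class_diagram(student, instructor, name_threshold=0.8):
    """Toy syntactic matching between two class diagrams, each given as
    {class_name: set(attribute_names)}. Returns a score in [0, 1]."""
    def similarity(a, b):
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    score, total = 0.0, 0.0
    for ref_name, ref_attrs in instructor.items():
        total += 1 + len(ref_attrs)
        # Find the student class whose name best matches the reference class.
        best = max(student, key=lambda s: similarity(s, ref_name), default=None)
        if best is None or similarity(best, ref_name) < name_threshold:
            continue
        score += 1 + len(ref_attrs & student[best])
    return score / total if total else 0.0
```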
Solving Differential and Integro-Differential Boundary Value Problems using a Numerical Sinc-Collocation Method Based on Derivative Interpolation
In this thesis, a new sinc-collocation method based upon derivative interpolation is developed for solving linear and nonlinear boundary value problems involving differential as well as integro-differential equations. The sinc-collocation method is chosen for its ease of implementation, exponential convergence of error, and ability to handle singularities in the BVP. We present a unique method of treating boundary conditions and introduce the concept of the stretch factor into the conformal mappings of domains. The result is a method that achieves great accuracy while reducing computational cost. In most cases, the results from the method greatly exceed the published results of comparable methods in both accuracy and efficiency. The method is tested on the Blasius problem and the Lane-Emden problem, and generalised to cover Fredholm-Volterra integro-differential problems. The results show that the sinc-collocation method with derivative interpolation is a viable and preferable method for solving nonlinear BVPs. Author Keywords: Blasius, Boundary Value Problem, Exponential convergence, Integro-differential, Nonlinear, Sinc
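For context, the standard sinc basis behind the collocation method is

$$ \operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x}, \qquad S(k,h)(x) = \operatorname{sinc}\!\left(\frac{x - kh}{h}\right), \qquad f(x) \approx \sum_{k=-N}^{N} f(kh)\, S(k,h)(x), $$

whose interpolation error decays at a rate of order $e^{-c\sqrt{N}}$ for functions analytic in a strip when $h \propto 1/\sqrt{N}$; the thesis's derivative interpolation, stretch factor, and boundary treatment build on this basis but are not reproduced here.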
Fraud Detection in Financial Businesses Using Data Mining Approaches
The purpose of this research is to apply four methods to two data sets, a synthetic dataset and a real-world dataset, and compare the results to each other with the intention of arriving at methods to prevent fraud. The methods used are Logistic Regression, Isolation Forest, an Ensemble Method, and Generative Adversarial Networks (GAN). Results show that all four models achieve accuracies between 91% and 99%, except Isolation Forest, which gave 69% accuracy on the synthetic dataset. The four models detect fraud well when built on a training set and tested with a test set. Logistic Regression achieves good results with less computational effort. Isolation Forest achieves lower accuracy when the data is sparse and not preprocessed correctly. The Ensemble Method achieves the highest accuracy for both datasets. GAN achieves good results but overfits if a large number of epochs is used. Future work could incorporate other classifiers. Author Keywords: Ensemble Method, GAN, Isolation forest, Logistic Regression, Outliers
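A minimal sketch of how two of the listed detectors could be fit and compared on a labelled transaction set with scikit-learn; the split, contamination rate, and lack of preprocessing are simplifying assumptions, not the thesis's experimental setup.

```python
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def compare_detectors(X, y, contamination=0.01):
    """Fit a supervised and an unsupervised detector on labelled data
    (y = 1 for fraud) and report test-set accuracy for each."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)

    lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    lr_acc = accuracy_score(y_te, lr.predict(X_te))

    iso = IsolationForest(contamination=contamination, random_state=0).fit(X_tr)
    # IsolationForest flags anomalies as -1; map them to the fraud label 1.
    iso_pred = (iso.predict(X_te) == -1).astype(int)
    iso_acc = accuracy_score(y_te, iso_pred)

    return {"logistic_regression": lr_acc, "isolation_forest": iso_acc}
```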
Problem Solving as a Path to Understanding Mathematics Representations
Little is actually known about how people cognitively process and integrate information when solving complex mathematical problems. In this thesis, eye-tracking was used to examine how people read and integrate information from mathematical symbols and complex formulas, with eye fixations being used as a measure of their current focus of attention. Each participant in the studies was presented with a series of stimuli in the form of mathematical problems, and their eyes were tracked as they worked through the problem mentally. From these examinations, we were able to demonstrate differences in both comprehension and problem-solving, with the results suggesting that what information is selected, and how, is responsible for a large portion of success in solving such problems. We were also able to examine how different mathematical representations of the same mathematical object are attended to by students. Author Keywords: eye-tracking, mathematical notation, mathematical representations, problem identification, problem-solving, symbolism
Framework for Testing Time Series Interpolators
The spectrum of a given time series is a characteristic function describing its frequency properties. Spectrum estimation methods require time series data to be contiguous in order for robust estimators to retain their performance. This poses a fundamental challenge, especially when considering real-world scientific data that is often plagued by missing values and/or irregularly recorded measurements. One area of research devoted to this problem seeks to repair the original time series through interpolation. Several algorithms have proven successful for the interpolation of considerably large gaps of missing data, but most are only valid for use on stationary time series: processes whose statistical properties are time-invariant, which is not a common property of real-world data. The Hybrid Wiener interpolator is a method that was designed for repairing nonstationary data, rendering it suitable for spectrum estimation. This thesis presents a computational framework designed for conducting systematic testing on the statistical performance of this method in light of changes to gap structure and departures from the stationarity assumption. A comprehensive audit of the Hybrid Wiener interpolator against other state-of-the-art algorithms is also explored. Author Keywords: applied statistics, hybrid wiener interpolator, imputation, interpolation, R statistical software, time series
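The testing loop this abstract describes, i.e. imposing a controlled gap structure, repairing it, and scoring the result, can be sketched as follows; the gap parameters and the linear-interpolation stand-in are illustrative only, since the thesis's framework targets the Hybrid Wiener interpolator and richer gap designs.

```python
import numpy as np

def evaluate_interpolator(series, interpolate, gap_length=20, n_gaps=5, seed=0):
    """Punch gaps of a chosen length into a complete series, repair them
    with `interpolate`, and report the mean squared error on the gaps."""
    rng = np.random.default_rng(seed)
    gappy = series.astype(float).copy()
    mask = np.zeros(series.size, dtype=bool)
    for _ in range(n_gaps):
        start = rng.integers(0, series.size - gap_length)
        mask[start:start + gap_length] = True
    gappy[mask] = np.nan
    repaired = interpolate(gappy)
    return float(np.mean((repaired[mask] - series[mask]) ** 2))

# Example with simple linear interpolation as a stand-in repair method.
def linear_fill(x):
    idx = np.arange(x.size)
    ok = ~np.isnan(x)
    return np.interp(idx, idx[ok], x[ok])

t = np.linspace(0, 10, 500)
print(evaluate_interpolator(np.sin(t), linear_fill))
```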
