T.J. Viering
26 records found
Malware Evolution
Unraveling Malware Genomics: Synergistic Approach using Deep Learning and Phylogenetic Analysis for Evolutionary Insights
The rapid advancement of artificial intelligence technologies has significantly increased the complexity of polymorphic and metamorphic malware, presenting new challenges to cybersecurity defenses. Our study introduces a novel bioinformatics-inspired approach, leveraging dee ...
Learning curves illustrate the relationship between the performance of learning algorithms and the increasing volume of training data [1, 2, 3]. While the concept of learning curves is well-established, clustering these curves based on fitting parameters remains an underexplored
...
Learning curves are useful to determine the amount of data needed for a certain performance. The conventional belief is that increasing the amount of data improves performance. However, recent work challenges this assumption, and shows nonmonotonic behaviors of certain learners o
...
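As a rough, generic sketch of how an empirical learning curve of the kind studied here can be computed (the learner, dataset, and size grid below are illustrative assumptions, not the setup used in the work above):

```python
# Minimal sketch of computing an empirical learning curve with scikit-learn.
# The learner, dataset, and evaluation grid are illustrative choices only.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_digits(return_X_y=True)

train_sizes, train_scores, test_scores = learning_curve(
    LogisticRegression(max_iter=1000),
    X, y,
    train_sizes=np.linspace(0.1, 1.0, 8),  # fractions of the training split
    cv=5,
    scoring="accuracy",
)

for n, score in zip(train_sizes, test_scores.mean(axis=1)):
    print(f"n={n:5d}  mean test accuracy={score:.3f}")
```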
Learning Curve Extrapolation using Machine Learning
Benefits and Limitations of using LCPFN for Learning Curve Extrapolation
This study explores the extrapolation of learning curves, a crucial aspect in evaluating learner performance with varying dataset sample sizes. We use the Learning Curve Prior Fitted Network (LC-PFN), a transformer pre-trained on synthetic data with proficiency in approximate Bay
...
Learning Curves
How do Data Imbalances affect the Learning Curves using the Nearest Mean Model?
This research investigates the impact of data imbalances on the learning curve using the nearest mean model. Learning curves are useful to represent the performance of the model as the training size increases. Imbalanced datasets are often encountered in real-life scenarios and p
...
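A minimal sketch of the kind of experiment this describes, using scikit-learn's NearestCentroid as the nearest mean model; the 9:1 class ratio, synthetic data generator, and sample-size grid are assumptions made for illustration, not the thesis's actual setup:

```python
# Learning curve of a nearest-mean (nearest-centroid) classifier on an
# imbalanced binary problem (illustrative setup only).
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestCentroid

X, y = make_classification(n_samples=5000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

for n in [100, 200, 500, 1000, 2000]:
    clf = NearestCentroid().fit(X_pool[:n], y_pool[:n])
    acc = balanced_accuracy_score(y_test, clf.predict(X_test))
    print(f"n={n:4d}  balanced accuracy={acc:.3f}")
```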
Clustering Learning Curves in Machine Learning using K-Means Algorithm
Can patterns be identified amongst learning curves after the application of the K-Means algorithm using point and statistical vectors?
A learning curve can serve as an indicator of the “performance of trained models versus the training set size” [1]. Recent research on learning curve analysis has been carried out within the Learning Curve Database (LCDB) [2]. This paper will investigate if there are similarities
...
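A minimal sketch of clustering learning curves with K-Means; here each curve is represented by its fitted power-law parameters rather than by the point or statistical vectors the question refers to, and the synthetic curves and cluster count are assumptions for illustration, not the LCDB-based setup above:

```python
# Cluster synthetic learning curves via K-Means on fitted power-law
# parameters err(n) = a * n**(-b) + c (illustrative feature choice).
import numpy as np
from scipy.optimize import curve_fit
from sklearn.cluster import KMeans

def pow_law(n, a, b, c):
    return a * n**(-b) + c

rng = np.random.default_rng(0)
n = np.arange(10, 1000, 10, dtype=float)

# A few noisy synthetic curves with different shapes.
curves = [pow_law(n, a, b, c) + rng.normal(0, 0.005, n.size)
          for a, b, c in [(1.0, 0.5, 0.05), (0.8, 0.3, 0.10),
                          (1.2, 0.7, 0.02), (0.9, 0.31, 0.11)]]

# Represent each curve by its fitted parameter vector.
features = []
for y in curves:
    params, _ = curve_fit(pow_law, n, y, p0=[1.0, 0.5, 0.1], maxfev=10000)
    features.append(params)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(np.array(features))
print(labels)
```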
Machine learning algorithms (learners) are typically expected to produce monotone learning curves, meaning that their performance improves as the size of the training dataset increases. However, it is important to note that this behavior is not universally observed. Recently ...
“How Much Data is Enough?” Learning curves for machine learning
Investigating alternatives to the Levenberg-Marquardt algorithm for learning curve extrapolation
This research explores fitting algorithms for learning curves. Learning curves describe how the performance of a machine learning model changes with the size of the training input. Therefore, fitting these learning curves and extrapolating them can help determine the req
...
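For context, the Levenberg-Marquardt algorithm is the default unconstrained solver behind scipy.optimize.curve_fit. A minimal sketch of fitting a power-law learning curve with it; the curve model and the noisy synthetic data are illustrative assumptions:

```python
# Fit a power-law learning curve with the Levenberg-Marquardt solver
# exposed through scipy.optimize.curve_fit (method="lm").
import numpy as np
from scipy.optimize import curve_fit

def pow_law(n, a, b, c):
    # error(n) = a * n^(-b) + c, a common parametric learning-curve model
    return a * n**(-b) + c

rng = np.random.default_rng(1)
n = np.array([16, 32, 64, 128, 256, 512, 1024], dtype=float)
err = pow_law(n, 0.9, 0.4, 0.06) + rng.normal(0, 0.003, n.size)

params, cov = curve_fit(pow_law, n, err, p0=[1.0, 0.5, 0.1], method="lm")
a, b, c = params
print(f"fitted: a={a:.3f}, b={b:.3f}, c={c:.3f}")
print("extrapolated error at n=8192:", pow_law(8192, a, b, c))
```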
A Comparative Analysis of Learning Curve Models and their Applicability in Different Scenarios
Finding dataset patterns that lead to a certain parametric curve model
Learning curves display predictions of the chosen model’s performance for different training set sizes. They can help estimate the amount of data required to achieve a minimal error rate, thus aiding in reducing the cost of data collection. However, our understanding and knowledg
...
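A small sketch of what such a comparison can look like: a few common parametric forms (power law, exponential, logarithmic) are fitted to the same curve and ranked by error on held-out anchors. The data are synthetic and the fit/validation split is an illustrative assumption, not the paper's protocol:

```python
# Compare several parametric learning-curve models on synthetic data.
import numpy as np
from scipy.optimize import curve_fit

models = {
    "pow3": (lambda n, a, b, c: a * n**(-b) + c,        [1.0, 0.5, 0.1]),
    "exp3": (lambda n, a, b, c: a * np.exp(-b * n) + c, [1.0, 0.01, 0.1]),
    "log2": (lambda n, a, b:    -a * np.log(n) + b,     [0.1, 1.0]),
}

rng = np.random.default_rng(2)
n = np.array([16, 32, 64, 128, 256, 512, 1024, 2048], dtype=float)
err = 0.8 * n**(-0.35) + 0.05 + rng.normal(0, 0.003, n.size)

n_fit, err_fit = n[:6], err[:6]    # fit on the smaller anchors
n_val, err_val = n[6:], err[6:]    # validate on the largest anchors

for name, (f, p0) in models.items():
    params, _ = curve_fit(f, n_fit, err_fit, p0=p0, maxfev=20000)
    mse = np.mean((f(n_val, *params) - err_val) ** 2)
    print(f"{name}: validation MSE = {mse:.2e}")
```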
Learning curves in machine learning are graphical representations that depict the relationship between a model's performance and the amount of training data it has been exposed to. They play a fundamental role in understanding how knowledge and skills are acquired across a range of domains. Altho
...
Empirical Investigation of Learning Curves
Assessing Convexity Characteristics
Nonconvexity in learning curves is almost always undesirable. A machine learning model with a non-convex learning curve either requires a larger quantity of data before progress in its accuracy is observed or experiences an exponential decrease in accuracy at low sample sizes, with no im
...
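A minimal sketch of one way convexity of a discretely sampled curve can be assessed, by checking that segment slopes are non-decreasing; the sample curve below is a made-up illustration, not data from this work:

```python
# Assess convexity of a discretely sampled learning curve via slopes.
import numpy as np

n = np.array([16, 32, 64, 128, 256, 512], dtype=float)
err = np.array([0.42, 0.31, 0.24, 0.20, 0.18, 0.17])

slopes = np.diff(err) / np.diff(n)   # discrete slope on each segment
second = np.diff(slopes)             # change of slope between segments

print("segment slopes:", np.round(slopes, 5))
# A convex curve has non-decreasing slopes on the sampled anchors.
print("convex on sampled anchors:", bool(np.all(second >= 0)))
```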
Non-Monotonicity in Empirical Learning Curves
Identifying non-monotonicity through slope approximations on discrete points
Learning curves are used to characterize the performance of a Machine Learning (ML) model with respect to the size of its training set. It was commonly thought that adding more training samples would increase the model's accuracy (i.e., that learning curves are monotone), but recent works s
...
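A minimal sketch of flagging non-monotonic regions by approximating slopes between consecutive anchors and checking their sign; the error curve below (with a small bump) is a made-up illustration, not the data used in this work:

```python
# Detect non-monotonicity in an empirical error curve from discrete slopes.
import numpy as np

n = np.array([16, 32, 64, 128, 256, 512, 1024], dtype=float)
err = np.array([0.40, 0.30, 0.25, 0.27, 0.22, 0.20, 0.19])  # bump at n=128

slopes = np.diff(err) / np.diff(n)
# For an error curve, monotone improvement means all slopes are <= 0.
violations = np.where(slopes > 0)[0]

for i in violations:
    print(f"error increases between n={int(n[i])} and n={int(n[i + 1])}")
print("curve is monotone:", violations.size == 0)
```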
Learning curves have been used extensively to analyse learners' behaviour and to support practical tasks such as model selection, speeding up training, and tuning models. Nonetheless, we still have a relatively limited understanding of the behaviour of learning curves themselves, in particul
...
Learning curves display the accuracy or error on test data of a machine learning algorithm trained on different amounts of training data. They can be modeled by parametric curve models that help predict accuracy improvement through curve extrapolation methods. However,
...
Extrapolation of the learning curve provides an estimate of how much data is needed to achieve the desired performance. It can be beneficial when gathering data is complex or computational resources are limited. One of the essential processes of learning curve extrapolation is cur
...
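Once a parametric curve has been fitted, it can be inverted to estimate how much data is needed for a target error. A minimal sketch under a power-law model; the fitted parameter values below are assumed purely for illustration:

```python
# Invert a fitted power law err(n) = a * n^(-b) + c to estimate the
# number of samples needed to reach a target error.
a, b, c = 0.9, 0.4, 0.06   # illustrative fitted values
target_err = 0.10

if target_err <= c:
    print("target is below the estimated irreducible error c; unreachable under this model")
else:
    # Solve a * n^(-b) + c = target_err for n.
    n_needed = (a / (target_err - c)) ** (1.0 / b)
    print(f"estimated samples needed: ~{int(round(n_needed))}")
```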
The learning curve illustrates how the generalization performance of the learner evolves with more training data. It can predict the amount of data needed for decent accuracy and the highest achievable accuracy. However, the behavior of learning curves is not well understood. Man
...
Does a convolutional neural network (CNN) always have to be deep to learn a task? This is an important question as deeper networks are generally harder to train. We trained shallow and deep CNNs and evaluated their performance on simple regression tasks, such as computing the mea
...
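A rough sketch of the kind of comparison this describes, using PyTorch: a shallow and a deeper CNN are trained on the toy regression task of predicting the mean pixel value of an image. Architecture sizes, data, and training budget are assumptions for illustration, not the thesis's setup:

```python
# Compare a shallow and a deeper CNN on a simple regression task.
import torch
import torch.nn as nn

def make_cnn(depth):
    layers, ch = [], 1
    for _ in range(depth):
        layers += [nn.Conv2d(ch, 8, kernel_size=3, padding=1), nn.ReLU()]
        ch = 8
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(ch, 1)]
    return nn.Sequential(*layers)

torch.manual_seed(0)
X = torch.rand(512, 1, 16, 16)               # random grayscale images
y = X.mean(dim=(1, 2, 3)).unsqueeze(1)       # target: mean pixel value

for depth in (1, 4):
    model = make_cnn(depth)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(200):                     # short full-batch training budget
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    print(f"depth={depth}: final training MSE = {loss.item():.5f}")
```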
This research provides an overview of how training Convolutional Neural Networks (CNNs) on imbalanced datasets affects their performance. Datasets can be imbalanced for several reasons; there are, for example, naturally fewer samples of rare diseases. Since the
...
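One common mitigation in this setting is to reweight the training loss by inverse class frequency. A minimal PyTorch sketch; the class counts are assumed for illustration and the CNN itself is omitted:

```python
# Reweight the cross-entropy loss by inverse class frequency.
import torch
import torch.nn as nn

class_counts = torch.tensor([900.0, 80.0, 20.0])   # e.g. a 45:4:1 imbalance
weights = class_counts.sum() / (len(class_counts) * class_counts)

# Pass the weights to the loss so rare classes contribute more per sample.
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(16, 3)                  # stand-in for CNN outputs
targets = torch.randint(0, 3, (16,))
print("weighted loss:", criterion(logits, targets).item())
```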
With an expected 8.3 trillion photos stored in 2021 [1], convolutional neural networks (CNNs) are becoming preeminent in the field of image recognition. However, with this type of deep neural network (DNN) still being seen as a black box, it is hard to fully employ its capabi
...
It sounds like Greek to me
Performance of phonetic representations for language identification
This paper compares the performance of two phonetic notations, IPA and ASJPcode, with the alphabetical notation for word-level language identification. Two machine learning models, a Multilayer Perceptron and a Logistic Regression model, are used to classify words using each o
...
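A minimal sketch of word-level language identification with the two classifier families mentioned above, using character n-gram features over whatever string representation (orthographic, IPA, or ASJPcode) is supplied. The tiny word list is a made-up stand-in for a real dataset:

```python
# Word-level language identification with character n-gram features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

words = ["water", "wasser", "agua", "eau", "apple", "apfel", "manzana", "pomme"]
langs = ["en", "de", "es", "fr", "en", "de", "es", "fr"]

for clf in (LogisticRegression(max_iter=1000),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)):
    model = make_pipeline(
        CountVectorizer(analyzer="char", ngram_range=(1, 3)),
        clf,
    ).fit(words, langs)
    print(type(clf).__name__, model.predict(["wein", "vino"]))
```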