MAKE | Topical Collection : Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction

2024

Jump to: 2023, 2022

22 pages, 2817 KiB

Open AccessArticle

Enhanced Graph Representation Convolution: Effective Inferring Gene Regulatory Network Using Graph Convolution Network with Self-Attention Graph Pooling Layer

by Duaa Mohammad Alawad, Ataur Katebi and Md Tamjidul Hoque

Mach. Learn. Knowl. Extr. 2024, 6(3), 1818-1839; https://doi.org/10.3390/make6030089 - 1 Aug 2024

Viewed by 178

Abstract

Studying gene regulatory networks (GRNs) is paramount for unraveling the complexities of biological processes and their associated disorders, such as diabetes, cancer, and Alzheimer’s disease. Recent advancements in computational biology have aimed to enhance the inference of GRNs from gene expression data, a [...] Read more.

Studying gene regulatory networks (GRNs) is paramount for unraveling the complexities of biological processes and their associated disorders, such as diabetes, cancer, and Alzheimer’s disease. Recent advancements in computational biology have aimed to enhance the inference of GRNs from gene expression data, a non-trivial task given the networks’ intricate nature. The challenge lies in accurately identifying the myriad interactions among transcription factors and target genes, which govern cellular functions. This research introduces a cutting-edge technique, EGRC (Effective GRN Inference applying Graph Convolution with Self-Attention Graph Pooling), which innovatively conceptualizes GRN reconstruction as a graph classification problem, where the task is to discern the links within subgraphs that encapsulate pairs of nodes. By leveraging Spearman’s correlation, we generate potential subgraphs that bring nonlinear associations between transcription factors and their targets to light. We use mutual information to enhance this, capturing a broader spectrum of gene interactions. Our methodology bifurcates these subgraphs into ‘Positive’ and ‘Negative’ categories. ‘Positive’ subgraphs are those where a transcription factor and its target gene are connected, including interactions among their neighbors. ‘Negative’ subgraphs, conversely, denote pairs without a direct connection. EGRC utilizes dual graph convolution network (GCN) models that exploit node attributes from gene expression profiles and graph embedding techniques to classify these. The performance of EGRC is substantiated by comprehensive evaluations using the DREAM5 datasets. Notably, EGRC attained an AUROC of 0.856 and an AUPR of 0.841 on the E. coli dataset. In contrast, the in silico dataset achieved an AUROC of 0.5058 and an AUPR of 0.958. Furthermore, on the S. cerevisiae dataset, EGRC recorded an AUROC of 0.823 and an AUPR of 0.822. These results underscore the robustness of EGRC in accurately inferring GRNs across various organisms. The advanced performance of EGRC represents a substantial advancement in the field, promising to deepen our comprehension of the intricate biological processes and their implications in both health and disease. Full article

► Show Figures

Figure 1

20 pages, 1138 KiB

Open AccessArticle

Diverse Machine Learning for Forecasting Goal-Scoring Likelihood in Elite Football Leagues

by Christina Markopoulou, George Papageorgiou and Christos Tjortjis

Mach. Learn. Knowl. Extr. 2024, 6(3), 1762-1781; https://doi.org/10.3390/make6030086 - 28 Jul 2024

Viewed by 284

Abstract

The field of sports analytics has grown rapidly, with a primary focus on performance forecasting, enhancing the understanding of player capabilities, and indirectly benefiting team strategies and player development. This work aims to forecast and comparatively evaluate players’ goal-scoring likelihood in four elite [...] Read more.

The field of sports analytics has grown rapidly, with a primary focus on performance forecasting, enhancing the understanding of player capabilities, and indirectly benefiting team strategies and player development. This work aims to forecast and comparatively evaluate players’ goal-scoring likelihood in four elite football leagues (Premier League, Bundesliga, La Liga, and Serie A) by mining advanced statistics from 2017 to 2023. Six types of machine learning (ML) models were developed and tested individually through experiments on the comprehensive datasets collected for these leagues. We also tested the upper 30th percentile of the best-performing players based on their performance in the last season, with varied features evaluated to enhance prediction accuracy in distinct scenarios. The results offer insights into the forecasting abilities of those leagues, identifying the best forecasting methodologies and the factors that most significantly contribute to the prediction of players’ goal-scoring. XGBoost consistently outperformed other models in most experiments, yielding the most accurate results and leading to a well-generalized model. Notably, when applied to Serie A, it achieved a mean absolute error (MAE) of 1.29. This study provides insights into ML-based performance prediction, advancing the field of player performance forecasting. Full article

► Show Figures

Figure 1

22 pages, 1933 KiB

Open AccessArticle

Learning Effective Good Variables from Physical Data

by Giulio Barletta, Giovanni Trezza and Eliodoro Chiavazzo

Mach. Learn. Knowl. Extr. 2024, 6(3), 1597-1618; https://doi.org/10.3390/make6030077 - 12 Jul 2024

Viewed by 476

Abstract

We assume that a sufficiently large database is available, where a physical property of interest and a number of associated ruling primitive variables or observables are stored. We introduce and test two machine learning approaches to discover possible groups or combinations of primitive [...] Read more.

We assume that a sufficiently large database is available, where a physical property of interest and a number of associated ruling primitive variables or observables are stored. We introduce and test two machine learning approaches to discover possible groups or combinations of primitive variables, regardless of data origin, being it numerical or experimental: the first approach is based on regression models, whereas the second on classification models. The variable group (here referred to as the new effective good variable) can be considered as successfully found when the physical property of interest is characterized by the following effective invariant behavior: in the first method, invariance of the group implies invariance of the property up to a given accuracy; in the other method, upon partition of the physical property values into two or more classes, invariance of the group implies invariance of the class. For the sake of illustration, the two methods are successfully applied to two popular empirical correlations describing the convective heat transfer phenomenon and to the Newton’s law of universal gravitation. Full article

► Show Figures

Graphical abstract

19 pages, 771 KiB

Open AccessSystematic Review

Navigating the Multimodal Landscape: A Review on Integration of Text and Image Data in Machine Learning Architectures

by Maisha Binte Rashid, Md Shahidur Rahaman and Pablo Rivas

Mach. Learn. Knowl. Extr. 2024, 6(3), 1545-1563; https://doi.org/10.3390/make6030074 - 9 Jul 2024

Viewed by 675

Abstract

Images and text have become essential parts of the multimodal machine learning (MMML) framework in today’s world because data are always available, and technological breakthroughs bring disparate forms together, and while text adds semantic richness and narrative to images, images capture visual subtleties [...] Read more.

Images and text have become essential parts of the multimodal machine learning (MMML) framework in today’s world because data are always available, and technological breakthroughs bring disparate forms together, and while text adds semantic richness and narrative to images, images capture visual subtleties and emotions. Together, these two media improve knowledge beyond what would be possible with just one revolutionary application. This paper investigates feature extraction and advancement from text and image data using pre-trained models in MMML. It offers a thorough analysis of fusion architectures, outlining text and image data integration and evaluating their overall advantages and effects. Furthermore, it draws attention to the shortcomings and difficulties that MMML currently faces and guides areas that need more research and development. We have gathered 341 research articles from five digital library databases to accomplish this. Following a thorough assessment procedure, we have 88 research papers that enable us to evaluate MMML in detail. Our findings demonstrate that pre-trained models, such as BERT for text and ResNet for images, are predominantly employed for feature extraction due to their robust performance in diverse applications. Fusion techniques, ranging from simple concatenation to advanced attention mechanisms, are extensively adopted to enhance the representation of multimodal data. Despite these advancements, MMML models face significant challenges, including handling noisy data, optimizing dataset size, and ensuring robustness against adversarial attacks. Our findings highlight the necessity for further research to address these challenges, particularly in developing methods to improve the robustness of MMML models. Full article

► Show Figures

Figure 1

26 pages, 8622 KiB

Open AccessArticle

Using Deep Q-Learning to Dynamically Toggle between Push/Pull Actions in Computational Trust Mechanisms

by Zoi Lygizou and Dimitris Kalles

Mach. Learn. Knowl. Extr. 2024, 6(3), 1413-1438; https://doi.org/10.3390/make6030067 - 27 Jun 2024

Viewed by 359

Abstract

Recent work on decentralized computational trust models for open multi-agent systems has resulted in the development of CA, a biologically inspired model which focuses on the trustee’s perspective. This new model addresses a serious unresolved problem in existing trust and reputation models, namely [...] Read more.

Recent work on decentralized computational trust models for open multi-agent systems has resulted in the development of CA, a biologically inspired model which focuses on the trustee’s perspective. This new model addresses a serious unresolved problem in existing trust and reputation models, namely the inability to handle constantly changing behaviors and agents’ continuous entry and exit from the system. In previous work, we compared CA to FIRE, a well-known trust and reputation model, and found that CA is superior when the trustor population changes, whereas FIRE is more resilient to the trustee population changes. Thus, in this paper, we investigate how the trustors can detect the presence of several dynamic factors in their environment and then decide which trust model to employ in order to maximize utility. We frame this problem as a machine learning problem in a partially observable environment, where the presence of several dynamic factors is not known to the trustor, and we describe how an adaptable trustor can rely on a few measurable features so as to assess the current state of the environment and then use Deep Q-Learning (DQL), in a single-agent reinforcement learning setting, to learn how to adapt to a changing environment. We ran a series of simulation experiments to compare the performance of the adaptable trustor with the performance of trustors using only one model (FIRE or CA) and we show that an adaptable agent is indeed capable of learning when to use each model and, thus, perform consistently in dynamic environments. Full article

► Show Figures

Figure 1

18 pages, 2035 KiB

Open AccessReview

Machine Learning in Geosciences: A Review of Complex Environmental Monitoring Applications

by Maria Silvia Binetti, Carmine Massarelli and Vito Felice Uricchio

Mach. Learn. Knowl. Extr. 2024, 6(2), 1263-1280; https://doi.org/10.3390/make6020059 - 5 Jun 2024

Viewed by 1004

Abstract

This is a systematic literature review of the application of machine learning (ML) algorithms in geosciences, with a focus on environmental monitoring applications. ML algorithms, with their ability to analyze vast quantities of data, decipher complex relationships, and predict future events, and they [...] Read more.

This is a systematic literature review of the application of machine learning (ML) algorithms in geosciences, with a focus on environmental monitoring applications. ML algorithms, with their ability to analyze vast quantities of data, decipher complex relationships, and predict future events, and they offer promising capabilities to implement technologies based on more precise and reliable data processing. This review considers several vulnerable and particularly at-risk themes as landfills, mining activities, the protection of coastal dunes, illegal discharges into water bodies, and the pollution and degradation of soil and water matrices in large industrial complexes. These case studies about environmental monitoring provide an opportunity to better examine the impact of human activities on the environment, with a specific focus on water and soil matrices. The recent literature underscores the increasing importance of ML in these contexts, highlighting a preference for adapted classic models: random forest (RF) (the most widely used), decision trees (DTs), support vector machines (SVMs), artificial neural networks (ANNs), convolutional neural networks (CNNs), principal component analysis (PCA), and much more. In the field of environmental management, the following methodologies offer invaluable insights that can steer strategic planning and decision-making based on more accurate image classification, prediction models, object detection and recognition, map classification, data classification, and environmental variable predictions. Full article

► Show Figures

Figure 1

20 pages, 951 KiB

Open AccessReview

Bayesian Networks for the Diagnosis and Prognosis of Diseases: A Scoping Review

by Kristina Polotskaya, Carlos S. Muñoz-Valencia, Alejandro Rabasa, Jose A. Quesada-Rico, Domingo Orozco-Beltrán and Xavier Barber

Mach. Learn. Knowl. Extr. 2024, 6(2), 1243-1262; https://doi.org/10.3390/make6020058 - 4 Jun 2024

Viewed by 1151

Abstract

Bayesian networks (BNs) are probabilistic graphical models that leverage Bayes’ theorem to portray dependencies and cause-and-effect relationships between variables. These networks have gained prominence in the field of health sciences, particularly in diagnostic processes, by allowing the integration of medical knowledge into models [...] Read more.

Bayesian networks (BNs) are probabilistic graphical models that leverage Bayes’ theorem to portray dependencies and cause-and-effect relationships between variables. These networks have gained prominence in the field of health sciences, particularly in diagnostic processes, by allowing the integration of medical knowledge into models and addressing uncertainty in a probabilistic manner. Objectives: This review aims to provide an exhaustive overview of the current state of Bayesian networks in disease diagnosis and prognosis. Additionally, it seeks to introduce readers to the fundamental methodology of BNs, emphasising their versatility and applicability across varied medical domains. Employing a meticulous search strategy with MeSH descriptors in diverse scientific databases, we identified 190 relevant references. These were subjected to a rigorous analysis, resulting in the retention of 60 papers for in-depth review. The robustness of our approach minimised the risk of selection bias. Results: The selected studies encompass a wide range of medical areas, providing insights into the statistical methodology, implementation feasibility, and predictive accuracy of BNs, as evidenced by an average area under the curve (AUC) exceeding 75%. The comprehensive analysis underscores the adaptability and efficacy of Bayesian networks in diverse clinical scenarios. The majority of the examined studies demonstrate the potential of BNs as reliable adjuncts to clinical decision-making. The findings of this review affirm the role of Bayesian networks as accessible and versatile artificial intelligence tools in healthcare. They offer a viable solution to address complex medical challenges, facilitating timely and informed decision-making under conditions of uncertainty. The extensive exploration of Bayesian networks presented in this review highlights their significance and growing impact in the realm of disease diagnosis and prognosis. It underscores the need for further research and development to optimise their capabilities and broaden their applicability in addressing diverse and intricate healthcare challenges. Full article

► Show Figures

Figure 1

9 pages, 1615 KiB

Open AccessArticle

Evaluation of AI ChatBots for the Creation of Patient-Informed Consent Sheets

by Florian Jürgen Raimann, Vanessa Neef, Marie Charlotte Hennighausen, Kai Zacharowski and Armin Niklas Flinspach

Mach. Learn. Knowl. Extr. 2024, 6(2), 1145-1153; https://doi.org/10.3390/make6020053 - 24 May 2024

Viewed by 858

Abstract

Introduction: Large language models (LLMs), such as ChatGPT, are a topic of major public interest, and their potential benefits and threats are a subject of discussion. The potential contribution of these models to health care is widely discussed. However, few studies to date [...] Read more.

Introduction: Large language models (LLMs), such as ChatGPT, are a topic of major public interest, and their potential benefits and threats are a subject of discussion. The potential contribution of these models to health care is widely discussed. However, few studies to date have examined LLMs. For example, the potential use of LLMs in (individualized) informed consent remains unclear. Methods: We analyzed the performance of the LLMs ChatGPT 3.5, ChatGPT 4.0, and Gemini with regard to their ability to create an information sheet for six basic anesthesiologic procedures in response to corresponding questions. We performed multiple attempts to create forms for anesthesia and analyzed the results checklists based on existing standard sheets. Results: None of the LLMs tested were able to create a legally compliant information sheet for any basic anesthesiologic procedure. Overall, fewer than one-third of the risks, procedural descriptions, and preparations listed were covered by the LLMs. Conclusions: There are clear limitations of current LLMs in terms of practical application. Advantages in the generation of patient-adapted risk stratification within individual informed consent forms are not available at the moment, although the potential for further development is difficult to predict. Full article

► Show Figures

Figure 1

17 pages, 9163 KiB

Open AccessArticle

EyeXNet: Enhancing Abnormality Detection and Diagnosis via Eye-Tracking and X-ray Fusion

by Chihcheng Hsieh, André Luís, José Neves, Isabel Blanco Nobre, Sandra Costa Sousa, Chun Ouyang, Joaquim Jorge and Catarina Moreira

Mach. Learn. Knowl. Extr. 2024, 6(2), 1055-1071; https://doi.org/10.3390/make6020048 - 9 May 2024

Viewed by 1293

Abstract

Integrating eye gaze data with chest X-ray images in deep learning (DL) has led to contradictory conclusions in the literature. Some authors assert that eye gaze data can enhance prediction accuracy, while others consider eye tracking irrelevant for predictive tasks. We argue that [...] Read more.

Integrating eye gaze data with chest X-ray images in deep learning (DL) has led to contradictory conclusions in the literature. Some authors assert that eye gaze data can enhance prediction accuracy, while others consider eye tracking irrelevant for predictive tasks. We argue that this disagreement lies in how researchers process eye-tracking data as most remain agnostic to the human component and apply the data directly to DL models without proper preprocessing. We present EyeXNet, a multimodal DL architecture that combines images and radiologists’ fixation masks to predict abnormality locations in chest X-rays. We focus on fixation maps during reporting moments as radiologists are more likely to focus on regions with abnormalities and provide more targeted regions to the predictive models. Our analysis compares radiologist fixations in both silent and reporting moments, revealing that more targeted and focused fixations occur during reporting. Our results show that integrating the fixation masks in a multimodal DL architecture outperformed the baseline model in five out of eight experiments regarding average Recall and six out of eight regarding average Precision. Incorporating fixation masks representing radiologists’ classification patterns in a multimodal DL architecture benefits lesion detection in chest X-ray (CXR) images, particularly when there is a strong correlation between fixation masks and generated proposal regions. This highlights the potential of leveraging fixation masks to enhance multimodal DL architectures for CXR image analysis. This work represents a first step towards human-centered DL, moving away from traditional data-driven and human-agnostic approaches. Full article

► Show Figures

Figure 1

22 pages, 11463 KiB

Open AccessArticle

VOD: Vision-Based Building Energy Data Outlier Detection

by Jinzhao Tian, Tianya Zhao, Zhuorui Li, Tian Li, Haipei Bie and Vivian Loftness

Mach. Learn. Knowl. Extr. 2024, 6(2), 965-986; https://doi.org/10.3390/make6020045 - 3 May 2024

Viewed by 1290

Abstract

Outlier detection plays a critical role in building operation optimization and data quality maintenance. However, existing methods often struggle with the complexity and variability of building energy data, leading to poorly generalized and explainable results. To address the gap, this study introduces a [...] Read more.

Outlier detection plays a critical role in building operation optimization and data quality maintenance. However, existing methods often struggle with the complexity and variability of building energy data, leading to poorly generalized and explainable results. To address the gap, this study introduces a novel Vision-based Outlier Detection (VOD) approach, leveraging computer vision models to spot outliers in the building energy records. The models are trained to identify outliers by analyzing the load shapes in 2D time series plots derived from the energy data. The VOD approach is tested on four years of workday time-series electricity consumption data from 290 commercial buildings in the United States. Two distinct models are developed for different usage purposes, namely a classification model for broad-level outlier detection and an object detection model for the demands of precise pinpointing of outliers. The classification model is also interpreted via Grad-CAM to enhance its usage reliability. The classification model achieves an F1 score of 0.88, and the object detection model achieves an Average Precision (AP) of 0.84. VOD is a very efficient path to identifying energy consumption outliers in building operations, paving the way for the enhancement of building energy data quality, operation efficiency, and energy savings. Full article

► Show Figures

Figure 1

21 pages, 7555 KiB

Open AccessArticle

Quantum-Enhanced Representation Learning: A Quanvolutional Autoencoder Approach against DDoS Threats

by Pablo Rivas, Javier Orduz, Tonni Das Jui, Casimer DeCusatis and Bikram Khanal

Mach. Learn. Knowl. Extr. 2024, 6(2), 944-964; https://doi.org/10.3390/make6020044 - 1 May 2024

Viewed by 1594

Abstract

Motivated by the growing threat of distributed denial-of-service (DDoS) attacks and the emergence of quantum computing, this study introduces a novel “quanvolutional autoencoder” architecture for learning representations. The architecture leverages the computational advantages of quantum mechanics to improve upon traditional machine learning techniques. [...] Read more.

Motivated by the growing threat of distributed denial-of-service (DDoS) attacks and the emergence of quantum computing, this study introduces a novel “quanvolutional autoencoder” architecture for learning representations. The architecture leverages the computational advantages of quantum mechanics to improve upon traditional machine learning techniques. Specifically, the quanvolutional autoencoder employs randomized quantum circuits to analyze time-series data from DDoS attacks, offering a robust alternative to classical convolutional neural networks. Experimental results suggest that the quanvolutional autoencoder performs similarly to classical models in visualizing and learning from DDoS hive plots and leads to faster convergence and learning stability. These findings suggest that quantum machine learning holds significant promise for advancing data analysis and visualization in cybersecurity. The study highlights the need for further research in this fast-growing field, particularly for unsupervised anomaly detection. Full article

► Show Figures

Figure 1

18 pages, 1057 KiB

Open AccessArticle

Prompt Engineering or Fine-Tuning? A Case Study on Phishing Detection with Large Language Models

by Fouad Trad and Ali Chehab

Mach. Learn. Knowl. Extr. 2024, 6(1), 367-384; https://doi.org/10.3390/make6010018 - 6 Feb 2024

Cited by 7 | Viewed by 5622

Abstract

Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the necessity for intensive human involvement in training and maintaining ML models. Despite these [...] Read more.

Large Language Models (LLMs) are reshaping the landscape of Machine Learning (ML) application development. The emergence of versatile LLMs capable of undertaking a wide array of tasks has reduced the necessity for intensive human involvement in training and maintaining ML models. Despite these advancements, a pivotal question emerges: can these generalized models negate the need for task-specific models? This study addresses this question by comparing the effectiveness of LLMs in detecting phishing URLs when utilized with prompt-engineering techniques versus when fine-tuned. Notably, we explore multiple prompt-engineering strategies for phishing URL detection and apply them to two chat models, GPT-3.5-turbo and Claude 2. In this context, the maximum result achieved was an F1-score of 92.74% by using a test set of 1000 samples. Following this, we fine-tune a range of base LLMs, including GPT-2, Bloom, Baby LLaMA, and DistilGPT-2—all primarily developed for text generation—exclusively for phishing URL detection. The fine-tuning approach culminated in a peak performance, achieving an F1-score of 97.29% and an AUC of 99.56% on the same test set, thereby outperforming existing state-of-the-art methods. These results highlight that while LLMs harnessed through prompt engineering can expedite application development processes, achieving a decent performance, they are not as effective as dedicated, task-specific LLMs. Full article

► Show Figures

Figure 1

2023

Jump to: 2024, 2022

23 pages, 9134 KiB

Open AccessArticle

Transforming Simulated Data into Experimental Data Using Deep Learning for Vibration-Based Structural Health Monitoring

by Abhijeet Kumar, Anirban Guha and Sauvik Banerjee

Mach. Learn. Knowl. Extr. 2024, 6(1), 18-40; https://doi.org/10.3390/make6010002 - 27 Dec 2023

Viewed by 2030

Abstract

While machine learning (ML) has been quite successful in the field of structural health monitoring (SHM), its practical implementation has been limited. This is because ML model training requires data containing a variety of distinct instances of damage captured from a real structure [...] Read more.

While machine learning (ML) has been quite successful in the field of structural health monitoring (SHM), its practical implementation has been limited. This is because ML model training requires data containing a variety of distinct instances of damage captured from a real structure and the experimental generation of such data is challenging. One way to tackle this issue is by generating training data through numerical simulations. However, simulated data cannot capture the bias and variance of experimental uncertainty. To overcome this problem, this work proposes a deep-learning-based domain transformation method for transforming simulated data to the experimental domain. Use of this technique has been demonstrated for debonding location and size predictions of stiffened panels using a vibration-based method. The results are satisfactory for both debonding location and size prediction. This domain transformation method can be used in any field in which experimental data for training machine-learning models is scarce. Full article

► Show Figures

Figure 1

30 pages, 3241 KiB

Open AccessArticle

Detecting Adversarial Examples Using Surrogate Models

by Borna Feldsar, Rudolf Mayer and Andreas Rauber

Mach. Learn. Knowl. Extr. 2023, 5(4), 1796-1825; https://doi.org/10.3390/make5040087 - 27 Nov 2023

Cited by 1 | Viewed by 1963

Abstract

Deep Learning has enabled significant progress towards more accurate predictions and is increasingly integrated into our everyday lives in real-world applications; this is true especially for Convolutional Neural Networks (CNNs) in the field of image analysis. Nevertheless, it has been shown that Deep [...] Read more.

Deep Learning has enabled significant progress towards more accurate predictions and is increasingly integrated into our everyday lives in real-world applications; this is true especially for Convolutional Neural Networks (CNNs) in the field of image analysis. Nevertheless, it has been shown that Deep Learning is vulnerable against well-crafted, small perturbations to the input, i.e., adversarial examples. Defending against such attacks is therefore crucial to ensure the proper functioning of these models—especially when autonomous decisions are taken in safety-critical applications, such as autonomous vehicles. In this work, shallow machine learning models, such as Logistic Regression and Support Vector Machine, are utilised as surrogates of a CNN based on the assumption that they would be differently affected by the minute modifications crafted for CNNs. We develop three detection strategies for adversarial examples by analysing differences in the prediction of the surrogate and the CNN model: namely, deviation in (i) the prediction, (ii) the distance of the predictions, and (iii) the confidence of the predictions. We consider three different feature spaces: raw images, extracted features, and the activations of the CNN model. Our evaluation shows that our methods achieve state-of-the-art performance compared to other approaches, such as Feature Squeezing, MagNet, PixelDefend, and Subset Scanning, on the MNIST, Fashion-MNIST, and CIFAR-10 datasets while being robust in the sense that they do not entirely fail against selected single attacks. Further, we evaluate our defence against an adaptive attacker in a grey-box setting. Full article

► Show Figures

Figure 1

16 pages, 1858 KiB

Open AccessArticle

Unraveling COVID-19 Dynamics via Machine Learning and XAI: Investigating Variant Influence and Prognostic Classification

by Oliver Lohaj, Ján Paralič, Peter Bednár, Zuzana Paraličová and Matúš Huba

Mach. Learn. Knowl. Extr. 2023, 5(4), 1266-1281; https://doi.org/10.3390/make5040064 - 25 Sep 2023

Cited by 1 | Viewed by 1824

Abstract

Machine learning (ML) has been used in different ways in the fight against COVID-19 disease. ML models have been developed, e.g., for diagnostic or prognostic purposes and using various modalities of data (e.g., textual, visual, or structured). Due to the many specific aspects [...] Read more.

Machine learning (ML) has been used in different ways in the fight against COVID-19 disease. ML models have been developed, e.g., for diagnostic or prognostic purposes and using various modalities of data (e.g., textual, visual, or structured). Due to the many specific aspects of this disease and its evolution over time, there is still not enough understanding of all relevant factors influencing the course of COVID-19 in particular patients. In all aspects of our work, there was a strong involvement of a medical expert following the human-in-the-loop principle. This is a very important but usually neglected part of the ML and knowledge extraction (KE) process. Our research shows that explainable artificial intelligence (XAI) may significantly support this part of ML and KE. Our research focused on using ML for knowledge extraction in two specific scenarios. In the first scenario, we aimed to discover whether adding information about the predominant COVID-19 variant impacts the performance of the ML models. In the second scenario, we focused on prognostic classification models concerning the need for an intensive care unit for a given patient in connection with different explainability AI (XAI) methods. We have used nine ML algorithms, namely XGBoost, CatBoost, LightGBM, logistic regression, Naive Bayes, random forest, SGD, SVM-linear, and SVM-RBF. We measured the performance of the resulting models using precision, accuracy, and AUC metrics. Subsequently, we focused on knowledge extraction from the best-performing models using two different approaches as follows: (a) features extracted automatically by forward stepwise selection (FSS); (b) attributes and their interactions discovered by model explainability methods. Both were compared with the attributes selected by the medical experts in advance based on the domain expertise. Our experiments showed that adding information about the COVID-19 variant did not influence the performance of the resulting ML models. It also turned out that medical experts were much more precise in the identification of significant attributes than FSS. Explainability methods identified almost the same attributes as a medical expert and interesting interactions among them, which the expert discussed from a medical point of view. The results of our research and their consequences are discussed. Full article

► Show Figures

Figure 1

13 pages, 1436 KiB

Open AccessArticle

Improving Spiking Neural Network Performance with Auxiliary Learning

by Paolo G. Cachi, Sebastián Ventura and Krzysztof J. Cios

Mach. Learn. Knowl. Extr. 2023, 5(3), 1010-1022; https://doi.org/10.3390/make5030052 - 5 Aug 2023

Viewed by 2119

Abstract

The use of back propagation through the time learning rule enabled the supervised training of deep spiking neural networks to process temporal neuromorphic data. However, their performance is still below non-spiking neural networks. Previous work pointed out that one of the main causes [...] Read more.

The use of back propagation through the time learning rule enabled the supervised training of deep spiking neural networks to process temporal neuromorphic data. However, their performance is still below non-spiking neural networks. Previous work pointed out that one of the main causes is the limited number of neuromorphic data currently available, which are also difficult to generate. With the goal of overcoming this problem, we explore the usage of auxiliary learning as a means of helping spiking neural networks to identify more general features. Tests are performed on neuromorphic DVS-CIFAR10 and DVS128-Gesture datasets. The results indicate that training with auxiliary learning tasks improves their accuracy, albeit slightly. Different scenarios, including manual and automatic combination losses using implicit differentiation, are explored to analyze the usage of auxiliary tasks. Full article

► Show Figures

Figure 1

27 pages, 8245 KiB

Open AccessArticle

Classification Confidence in Exploratory Learning: A User’s Guide

by Peter Salamon, David Salamon, V. Adrian Cantu, Michelle An, Tyler Perry, Robert A. Edwards and Anca M. Segall

Mach. Learn. Knowl. Extr. 2023, 5(3), 803-829; https://doi.org/10.3390/make5030043 - 21 Jul 2023

Viewed by 1626

Abstract

This paper investigates the post-hoc calibration of confidence for “exploratory” machine learning classification problems. The difficulty in these problems stems from the continuing desire to push the boundaries of which categories have enough examples to generalize from when curating datasets, and confusion regarding [...] Read more.

This paper investigates the post-hoc calibration of confidence for “exploratory” machine learning classification problems. The difficulty in these problems stems from the continuing desire to push the boundaries of which categories have enough examples to generalize from when curating datasets, and confusion regarding the validity of those categories. We argue that for such problems the “one-versus-all” approach (top-label calibration) must be used rather than the “calibrate-the-full-response-matrix” approach advocated elsewhere in the literature. We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation using only the test set and the final model. Chief among these methods is the use of kernel density ratios for confidence calibration including a novel algorithm for choosing the bandwidth. We test our claims and explore the limits of calibration on a bioinformatics application (PhANNs) as well as the classic MNIST benchmark. Finally, our analysis argues that post-hoc calibration should always be performed, may be performed using only the test dataset, and should be sanity-checked visually. Full article

► Show Figures

Figure 1

17 pages, 3106 KiB

Open AccessArticle

The Value of Numbers in Clinical Text Classification

by Kristian Miok, Padraig Corcoran and Irena Spasić

Mach. Learn. Knowl. Extr. 2023, 5(3), 746-762; https://doi.org/10.3390/make5030040 - 7 Jul 2023

Cited by 2 | Viewed by 2271

Abstract

Clinical text often includes numbers of various types and formats. However, most current text classification approaches do not take advantage of these numbers. This study aims to demonstrate that using numbers as features can significantly improve the performance of text classification models. This [...] Read more.

Clinical text often includes numbers of various types and formats. However, most current text classification approaches do not take advantage of these numbers. This study aims to demonstrate that using numbers as features can significantly improve the performance of text classification models. This study also demonstrates the feasibility of extracting such features from clinical text. Unsupervised learning was used to identify patterns of number usage in clinical text. These patterns were analyzed manually and converted into pattern-matching rules. Information extraction was used to incorporate numbers as features into a document representation model. We evaluated text classification models trained on such representation. Our experiments were performed with two document representation models (vector space model and word embedding model) and two classification models (support vector machines and neural networks). The results showed that even a handful of numerical features can significantly improve text classification performance. We conclude that commonly used document representations do not represent numbers in a way that machine learning algorithms can effectively utilize them as features. Although we demonstrated that traditional information extraction can be effective in converting numbers into features, further community-wide research is required to systematically incorporate number representation into the word embedding process. Full article

► Show Figures

Figure 1

12 pages, 767 KiB

Open AccessArticle

Using Machine Learning with Eye-Tracking Data to Predict if a Recruiter Will Approve a Resume

by Angel Pina, Corbin Petersheim, Josh Cherian, Joanna Nicole Lahey, Gerianne Alexander and Tracy Hammond

Mach. Learn. Knowl. Extr. 2023, 5(3), 713-724; https://doi.org/10.3390/make5030038 - 28 Jun 2023

Cited by 1 | Viewed by 2338

Abstract

When job seekers are unsuccessful in getting a position, they often do not get feedback to inform them on how to develop a better application in the future. Therefore, there is a critical need to understand what qualifications recruiters value in order to [...] Read more.

When job seekers are unsuccessful in getting a position, they often do not get feedback to inform them on how to develop a better application in the future. Therefore, there is a critical need to understand what qualifications recruiters value in order to help applicants. To address this need, we utilized eye-trackers to measure and record visual data of recruiters screening resumes to gain insight into which Areas of Interest (AOIs) influenced recruiters’ decisions the most. Using just this eye-tracking data, we trained a machine learning classifier to predict whether or not a recruiter would move a resume on to the next level of the hiring process with an AUC of 0.767. We found that features associated with recruiters looking outside the content of a resume were most predictive of their decision as well as total time viewing the resume and time spent on the Experience and Education sections. We hypothesize that this behavior is indicative of the recruiter reflecting on the content of the resume. These initial results show that applicants should focus on designing clear and concise resumes that are easy for recruiters to absorb and think about, with additional attention given to the Experience and Education sections. Full article

► Show Figures

Figure 1

14 pages, 3220 KiB

Open AccessArticle

A Mathematical Framework for Enriching Human–Machine Interactions

by Andrée C. Ehresmann, Mathias Béjean and Jean-Paul Vanbremeersch

Mach. Learn. Knowl. Extr. 2023, 5(2), 597-610; https://doi.org/10.3390/make5020034 - 6 Jun 2023

Viewed by 1657

Abstract

This paper presents a conceptual mathematical framework for developing rich human–machine interactions in order to improve decision-making in a social organisation, S. The idea is to model how S can create a “multi-level artificial cognitive system”, called a data analyser (DA), to collaborate [...] Read more.

This paper presents a conceptual mathematical framework for developing rich human–machine interactions in order to improve decision-making in a social organisation, S. The idea is to model how S can create a “multi-level artificial cognitive system”, called a data analyser (DA), to collaborate with humans in collecting and learning how to analyse data, to anticipate situations, and to develop new responses, thus improving decision-making. In this model, the DA is “processed” to not only gather data and extend existing knowledge, but also to learn how to act autonomously with its own specific procedures or even to create new ones. An application is given in cases where such rich human–machine interactions are expected to allow the DA+S partnership to acquire deep anticipation capabilities for possible future changes, e.g., to prevent risks or seize opportunities. The way the social organization S operates over time, including the construction of DA, is described using the conceptual framework comprising “memory evolutive systems” (MES), a mathematical theoretical approach introduced by Ehresmann and Vanbremeersch for evolutionary multi-scale, multi-agent and multi-temporality systems. This leads to the definition of a “data analyser–MES”. Full article

► Show Figures

Figure 1

2022

Jump to: 2024, 2023

13 pages, 981 KiB

Open AccessArticle

Multimodal AutoML via Representation Evolution

by Blaž Škrlj, Matej Bevec and Nada Lavrač

Mach. Learn. Knowl. Extr. 2023, 5(1), 1-13; https://doi.org/10.3390/make5010001 - 23 Dec 2022

Cited by 2 | Viewed by 2598

Abstract

With the increasing amounts of available data, learning simultaneously from different types of inputs is becoming necessary to obtain robust and well-performing models. With the advent of representation learning in recent years, lower-dimensional vector-based representations have become available for both images and texts, [...] Read more.

With the increasing amounts of available data, learning simultaneously from different types of inputs is becoming necessary to obtain robust and well-performing models. With the advent of representation learning in recent years, lower-dimensional vector-based representations have become available for both images and texts, while automating simultaneous learning from multiple modalities remains a challenging problem. This paper presents an AutoML (automated machine learning) approach to automated machine learning model configuration identification for data composed of two modalities: texts and images. The approach is based on the idea of representation evolution, the process of automatically amplifying heterogeneous representations across several modalities, optimized jointly with a collection of fast, well-regularized linear models. The proposed approach is benchmarked against 11 unimodal and multimodal (texts and images) approaches on four real-life benchmark datasets from different domains. It achieves competitive performance with minimal human effort and low computing requirements, enabling learning from multiple modalities in automated manner for a wider community of researchers. Full article

► Show Figures

Figure 1

17 pages, 4418 KiB

Open AccessArticle

Ontology Completion with Graph-Based Machine Learning: A Comprehensive Evaluation

by Sebastian Mežnar, Matej Bevec, Nada Lavrač and Blaž Škrlj

Mach. Learn. Knowl. Extr. 2022, 4(4), 1107-1123; https://doi.org/10.3390/make4040056 - 1 Dec 2022

Cited by 1 | Viewed by 4163

Abstract

Increasing quantities of semantic resources offer a wealth of human knowledge, but their growth also increases the probability of wrong knowledge base entries. The development of approaches that identify potentially spurious parts of a given knowledge base is therefore highly relevant. We propose [...] Read more.

Increasing quantities of semantic resources offer a wealth of human knowledge, but their growth also increases the probability of wrong knowledge base entries. The development of approaches that identify potentially spurious parts of a given knowledge base is therefore highly relevant. We propose an approach for ontology completion that transforms an ontology into a graph and recommends missing edges using structure-only link analysis methods. By systematically evaluating thirteen methods (some for knowledge graphs) on eight different semantic resources, including Gene Ontology, Food Ontology, Marine Ontology, and similar ontologies, we demonstrate that a structure-only link analysis can offer a scalable and computationally efficient ontology completion approach for a subset of analyzed data sets. To the best of our knowledge, this is currently the most extensive systematic study of the applicability of different types of link analysis methods across semantic resources from different domains. It demonstrates that by considering symbolic node embeddings, explanations of the predictions (links) can be obtained, making this branch of methods potentially more valuable than black-box methods. Full article

► Show Figures

Figure 1

17 pages, 1861 KiB

Open AccessFeature PaperArticle

Semantic Interactive Learning for Text Classification: A Constructive Approach for Contextual Interactions

by Sebastian Kiefer, Mareike Hoffmann and Ute Schmid

Mach. Learn. Knowl. Extr. 2022, 4(4), 994-1010; https://doi.org/10.3390/make4040050 - 13 Nov 2022

Cited by 1 | Viewed by 2299

Abstract

Interactive Machine Learning (IML) can enable intelligent systems to interactively learn from their end-users, and is quickly becoming more and more relevant to many application domains. Although it places the human in the loop, interactions are mostly performed via mutual explanations that miss [...] Read more.

Interactive Machine Learning (IML) can enable intelligent systems to interactively learn from their end-users, and is quickly becoming more and more relevant to many application domains. Although it places the human in the loop, interactions are mostly performed via mutual explanations that miss contextual information. Furthermore, current model-agnostic IML strategies such as CAIPI are limited to ’destructive’ feedback, meaning that they solely allow an expert to prevent a learner from using irrelevant features. In this work, we propose a novel interaction framework called Semantic Interactive Learning for the domain of document classification, located at the intersection between Natural Language Processing (NLP) and Machine Learning (ML). We frame the problem of incorporating constructive and contextual feedback into the learner as a task involving finding an architecture that enables more semantic alignment between humans and machines while at the same time helping to maintain the statistical characteristics of the input domain when generating user-defined counterexamples based on meaningful corrections. Therefore, we introduce a technique called SemanticPush that is effective for translating conceptual corrections of humans to non-extrapolating training examples such that the learner’s reasoning is pushed towards the desired behavior. Through several experiments we show how our method compares to CAIPI, a state of the art IML strategy, in terms of Predictive Performance and Local Explanation Quality in downstream multi-class classification tasks. Especially in the early stages of interactions, our proposed method clearly outperforms CAIPI while allowing for contextual interpretation and intervention. Overall, SemanticPush stands out with regard to data efficiency, as it requires fewer queries from the pool dataset to achieve high accuracy. Full article

► Show Figures

Figure 1

30 pages, 4010 KiB

Open AccessFeature PaperArticle

Actionable Explainable AI (AxAI): A Practical Example with Aggregation Functions for Adaptive Classification and Textual Explanations for Interpretable Machine Learning

by Anna Saranti, Miroslav Hudec, Erika Mináriková, Zdenko Takáč, Udo Großschedl, Christoph Koch, Bastian Pfeifer, Alessa Angerschmid and Andreas Holzinger

Mach. Learn. Knowl. Extr. 2022, 4(4), 924-953; https://doi.org/10.3390/make4040047 - 27 Oct 2022

Cited by 18 | Viewed by 3783

Abstract

In many domains of our daily life (e.g., agriculture, forestry, health, etc.), both laymen and experts need to classify entities into two binary classes (yes/no, good/bad, sufficient/insufficient, benign/malign, etc.). For many entities, this decision is difficult and we need another class called “maybe”, [...] Read more.

In many domains of our daily life (e.g., agriculture, forestry, health, etc.), both laymen and experts need to classify entities into two binary classes (yes/no, good/bad, sufficient/insufficient, benign/malign, etc.). For many entities, this decision is difficult and we need another class called “maybe”, which contains a corresponding quantifiable tendency toward one of these two opposites. Human domain experts are often able to mark any entity, place it in a different class and adjust the position of the slope in the class. Moreover, they can often explain the classification space linguistically—depending on their individual domain experience and previous knowledge. We consider this human-in-the-loop extremely important and call our approach actionable explainable AI. Consequently, the parameters of the functions are adapted to these requirements and the solution is explained to the domain experts accordingly. Specifically, this paper contains three novelties going beyond the state-of-the-art: (1) A novel method for detecting the appropriate parameter range for the averaging function to treat the slope in the “maybe” class, along with a proposal for a better generalisation than the existing solution. (2) the insight that for a given problem, the family of t-norms and t-conorms covering the whole range of nilpotency is suitable because we need a clear “no” or “yes” not only for the borderline cases. Consequently, we adopted the Schweizer–Sklar family of t-norms or t-conorms in ordinal sums. (3) A new fuzzy quasi-dissimilarity function for classification into three classes: Main difference, irrelevant difference and partial difference. We conducted all of our experiments with real-world datasets. Full article

► Show Figures

Figure 1

13 pages, 1012 KiB

Open AccessReview

Artificial Intelligence Methods for Identifying and Localizing Abnormal Parathyroid Glands: A Review Study

by Ioannis D. Apostolopoulos, Nikolaos I. Papandrianos, Elpiniki I. Papageorgiou and Dimitris J. Apostolopoulos

Mach. Learn. Knowl. Extr. 2022, 4(4), 814-826; https://doi.org/10.3390/make4040040 - 21 Sep 2022

Cited by 5 | Viewed by 2440

Abstract

Background: Recent advances in Artificial Intelligence (AI) algorithms, and specifically Deep Learning (DL) methods, demonstrate substantial performance in detecting and classifying medical images. Recent clinical studies have reported novel optical technologies which enhance the localization or assess the viability of Parathyroid Glands (PG) [...] Read more.

Background: Recent advances in Artificial Intelligence (AI) algorithms, and specifically Deep Learning (DL) methods, demonstrate substantial performance in detecting and classifying medical images. Recent clinical studies have reported novel optical technologies which enhance the localization or assess the viability of Parathyroid Glands (PG) during surgery, or preoperatively. These technologies could become complementary to the surgeon’s eyes and may improve surgical outcomes in thyroidectomy and parathyroidectomy. Methods: The study explores and reports the use of AI methods for identifying and localizing PGs, Primary Hyperparathyroidism (PHPT), Parathyroid Adenoma (PTA), and Multiglandular Disease (MGD). Results: The review identified 13 publications that employ Machine Learning and DL methods for preoperative and operative implementations. Conclusions: AI can aid in PG, PHPT, PTA, and MGD detection, as well as PG abnormality discrimination, both during surgery and non-invasively. Full article

► Show Figures

Figure 1

14 pages, 1132 KiB

Open AccessArticle

Benefits from Variational Regularization in Language Models

by Cornelia Ferner and Stefan Wegenkittl

Mach. Learn. Knowl. Extr. 2022, 4(2), 542-555; https://doi.org/10.3390/make4020025 - 9 Jun 2022

Cited by 4 | Viewed by 2468

Abstract

Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy in latent space. In analogy with variational autoencoders, we suggest applying [...] Read more.

Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy in latent space. In analogy with variational autoencoders, we suggest applying a token-level variational loss to a Transformer architecture and optimizing the standard deviation of the prior distribution in the loss function as the model parameter to increase isotropy. The resulting latent space is complete and interpretable: any given point is a valid embedding and can be decoded into text again. This allows for text manipulations such as paraphrase generation directly in latent space. Surprisingly, features extracted at the sentence level also show competitive results on benchmark classification tasks. Full article

► Show Figures

Figure 1

14 pages, 805 KiB

Open AccessArticle

The Case of Aspect in Sentiment Analysis: Seeking Attention or Co-Dependency?

by Anastazia Žunić, Padraig Corcoran and Irena Spasić

Mach. Learn. Knowl. Extr. 2022, 4(2), 474-487; https://doi.org/10.3390/make4020021 - 13 May 2022

Cited by 2 | Viewed by 2835

Abstract

(1) Background: Aspect-based sentiment analysis (SA) is a natural language processing task, the aim of which is to classify the sentiment associated with a specific aspect of a written text. The performance of SA methods applied to texts related to health and well-being [...] Read more.

(1) Background: Aspect-based sentiment analysis (SA) is a natural language processing task, the aim of which is to classify the sentiment associated with a specific aspect of a written text. The performance of SA methods applied to texts related to health and well-being lags behind that of other domains. (2) Methods: In this study, we present an approach to aspect-based SA of drug reviews. Specifically, we analysed signs and symptoms, which were extracted automatically using the Unified Medical Language System. This information was then passed onto the BERT language model, which was extended by two layers to fine-tune the model for aspect-based SA. The interpretability of the model was analysed using an axiomatic attribution method. We performed a correlation analysis between the attribution scores and syntactic dependencies. (3) Results: Our fine-tuned model achieved accuracy of approximately

95 %

on a well-balanced test set. It outperformed our previous approach, which used syntactic information to guide the operation of a neural network and achieved an accuracy of approximately

82 %

. (4) Conclusions: We demonstrated that a BERT-based model of SA overcomes the negative bias associated with health-related aspects and closes the performance gap against the state-of-the-art in other domains. Full article

► Show Figures

Figure 1

Journal Menu

Journal Browser

Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction

Share This Topical Collection

Editor

Topical Collection Information

Published Papers (27 papers)

2024

Jump to: 2023, 2022

2023

Jump to: 2024, 2022

2022

Jump to: 2024, 2023

Further Information

Guidelines

MDPI Initiatives

Follow MDPI