-
ConText at WASSA 2024 Empathy and Personality Shared Task: History-Dependent Embedding Utterance Representations for Empathy and Emotion Prediction in Conversations
Authors:
Patrícia Pereira,
Helena Moniz,
Joao Paulo Carvalho
Abstract:
Empathy and emotion prediction are key components in the development of effective and empathetic agents, amongst several other applications. The WASSA shared task on empathy and emotion prediction in interactions presents an opportunity to benchmark approaches to these tasks. Appropriately selecting and representing the historical context is crucial in the modelling of empathy and emotion in conversations. In our submissions, we model empathy, emotion polarity and emotion intensity of each utterance in a conversation by feeding the utterance to be classified together with its conversational context, i.e., a certain number of previous conversational turns, as input to an encoder Pre-trained Language Model, to which we append a regression head for prediction. We also model perceived counterparty empathy of each interlocutor by feeding all utterances from the conversation and a token identifying the interlocutor for which we are predicting the empathy. Our system officially ranked $1^{st}$ at the CONV-turn track and $2^{nd}$ at the CONV-dialog track.
Submitted 4 July, 2024;
originally announced July 2024.
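The abstract above describes pairing each utterance with a fixed number of previous conversational turns before encoding. A minimal sketch of that input construction follows. The helper name `build_context_inputs`, the speaker prefixes, the separator string, and the window size are all assumptions for illustration; the abstract does not specify the authors' exact formatting.

```python
from typing import List, Tuple


def build_context_inputs(
    turns: List[Tuple[str, str]],
    window: int,
    sep: str = " </s> ",  # hypothetical separator, not taken from the paper
) -> List[str]:
    """Join each utterance with up to `window` preceding turns."""
    inputs = []
    for i, (speaker, text) in enumerate(turns):
        start = max(0, i - window)  # clip the history at the conversation start
        history = [f"{s}: {t}" for s, t in turns[start:i]]
        inputs.append(sep.join(history + [f"{speaker}: {text}"]))
    return inputs


# Toy conversation (invented for illustration only).
conversation = [
    ("A", "I read the article."),
    ("B", "It was upsetting."),
    ("A", "I agree."),
]
inputs = build_context_inputs(conversation, window=1)
```

Each resulting string would then be tokenized and passed to an encoder with a regression head; that step is omitted here because it depends on the chosen model.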
-
Observation of Dynamic Nuclear Polarization Echoes
Authors:
Nino Wili,
Anders B. Nielsen,
José P. Carvalho,
Niels Chr. Nielsen
Abstract:
It is demonstrated that the time evolution of the electron-nuclear polarization transfer process during pulsed dynamic nuclear polarization (DNP) can be reversed on a microsecond timescale, leading to the observation of DNP echoes. The DNP echoes are induced by consecutive application of two pulse trains that produce effective Hamiltonians that differ only in the sign of the effective hyperfine coupling. The experiments have been performed on a frozen solution of trityl radicals in water/glycerol on a home-built X-band EPR/DNP spectrometer at 80 K. We envisage that DNP echoes will play an important role in future development of pulsed DNP for sensitivity-enhanced NMR, hyperfine spectroscopy, and quantum sensing.
Submitted 27 June, 2024; v1 submitted 26 June, 2024;
originally announced June 2024.
-
On the control of a simplified k-e model of turbulence
Authors:
Pitágoras Pinheiro de Carvalho,
Juan Bautista Límaco Ferrel,
Enrique Fernandez-Cara
Abstract:
This paper deals with the control of a class of turbulent flows. We consider a simplified k-e model with distributed controls, locally supported in space. We prove that the system is partially locally null-controllable, in the sense that the velocity field can be driven exactly to zero if the initial state is small enough. The proof relies on an argument that combines several techniques: fixed-point formulation, linearization, energy and Carleman estimates, local inversion, etc. This result can be viewed as a nontrivial step towards the control of turbulent fluids.
Submitted 26 May, 2024;
originally announced May 2024.
-
Beyond Repetition: The Role of Varied Questioning and Feedback in Knowledge Generalization
Authors:
Gautam Yadav,
Paulo F. Carvalho,
Elizabeth A. McLaughlin,
Kenneth R. Koedinger
Abstract:
This study examines the effects of question type and feedback on learning outcomes in a hybrid graduate-level course. By analyzing data from 32 students over 30,198 interactions, we assess the efficacy of unique versus repeated questions and the impact of feedback on student learning. The findings reveal students demonstrate significantly better knowledge generalization when encountering unique questions compared to repeated ones, even though they perform better with repeated opportunities. Moreover, we find that the timing of explanatory feedback is a more robust predictor of learning outcomes than the practice opportunities themselves. These insights suggest that educational practices and technological platforms should prioritize a variety of questions to enhance the learning process. The study also highlights the critical role of feedback; opportunities preceding feedback are less effective in enhancing learning.
Submitted 15 May, 2024;
originally announced May 2024.
-
Uncovering Name-Based Biases in Large Language Models Through Simulated Trust Game
Authors:
Yumou Wei,
Paulo F. Carvalho,
John Stamper
Abstract:
Gender and race inferred from an individual's name are a notable source of stereotypes and biases that subtly influence social interactions. Abundant evidence from human experiments has revealed the preferential treatment that one receives when one's name suggests a predominant gender or race. As large language models acquire more capabilities and begin to support everyday applications, it becomes crucial to examine whether they manifest similar biases when encountering names in a complex social interaction. In contrast to previous work that studies name-based biases in language models at a more fundamental level, such as word representations, we challenge three prominent models to predict the outcome of a modified Trust Game, a well-publicized paradigm for studying trust and reciprocity. To ensure the internal validity of our experiments, we have carefully curated a list of racially representative surnames to identify players in a Trust Game and rigorously verified the construct validity of our prompts. The results of our experiments show that our approach can detect name-based biases in both base and instruction-tuned models.
Submitted 22 April, 2024;
originally announced April 2024.
-
ProtoAL: Interpretable Deep Active Learning with prototypes for medical imaging
Authors:
Iury B. de A. Santos,
André C. P. L. F. de Carvalho
Abstract:
The adoption of Deep Learning algorithms in the medical imaging field is a prominent area of research, with high potential for advancing AI-based Computer-aided diagnosis (AI-CAD) solutions. However, current solutions face challenges due to a lack of interpretability features and high data demands, prompting recent efforts to address these issues. In this study, we propose the ProtoAL method, where we integrate an interpretable DL model into the Deep Active Learning (DAL) framework. This approach aims to address both challenges by focusing on the medical imaging context and utilizing an inherently interpretable model based on prototypes. We evaluated ProtoAL on the Messidor dataset, achieving an area under the precision-recall curve of 0.79 while utilizing only 76.54% of the available labeled data. These capabilities can enhance the practical usability of a DL model in the medical field, providing a means of trust calibration for domain experts and a suitable solution for learning in the data-scarce contexts often found there.
Submitted 6 April, 2024;
originally announced April 2024.
-
Insights on induced magnetic moments and spin textures in synthetic ferrimagnetic Pt/Co/Gd heterolayers
Authors:
J. Brandão,
P. C. Carvalho,
I. P. Miranda,
T. J. A. Mori,
F. Béron,
A. Bergman,
H. M. Petrilli,
A. B. Klautau,
J. C. Cezar
Abstract:
To develop new devices based on synthetic ferrimagnetic (S-FiM) heterostructures, understanding the material's physical properties is pivotal. Here, the induced magnetic moment (IMM), magnetic exchange coupling, and spin textures were investigated at room temperature in Pt/Co/Gd multilayers using a multiscale approach. The magnitude and direction of the IMM were interpreted experimentally and theoretically in the framework of both X-ray magnetic circular dichroism (XMCD) and density functional theory (DFT). The results demonstrate that the IMM transferred by Co across the Gd paramagnetic (PM) thickness leads to a flipped spin state (FSS) within the Gd layers, in which their magnetic moments couple antiparallel/parallel with the ferromagnetic (FM) Co near/far from the Co/Gd interface, respectively. For the Pt, in both Pt/Co and Gd/Pt interfaces the IMM follows the same direction as the Co magnetic moment, with negligible IMM in the Gd/Pt interface. Additionally, zero-field spin spirals were imaged using scanning transmission X-ray microscopy (STXM), while micromagnetic simulations were employed to unfold the interactions stabilizing the FiM configurations, where the existence of a sizable Dzyaloshinskii-Moriya interaction is demonstrated to be crucial for the formation of those spin textures. Our outcomes may add fundamental physical and technological aspects for using FiM films in antiferromagnetic spintronic devices.
Submitted 6 April, 2024;
originally announced April 2024.
-
Is $γ_{KLS}$-generalized statistical field theory complete?
Authors:
P. R. S. Carvalho
Abstract:
In this Letter we introduce a field-theoretic approach for computing the critical properties of $γ_{KLS}$-generalized systems undergoing continuous phase transitions, namely $γ_{KLS}$-statistical field theory. From this new approach emerges the new generalized O($N$)$_{γ_{KLS}}$ universality class, which is capable of encompassing nonconventional critical exponents for real imperfect systems known as manganites, which are not described by standard statistical field theory. We compare the generalized results with those obtained from measurements in manganites. The agreement is satisfactory, with relative errors $< 5\%$ for most of the manganites used. Although the present approach describes the aforementioned nonconventional critical indices, we show that it is not complete. For example, it does not explain the results for some other manganites, which are explained only by the nonextensive statistical field theory recently introduced in the literature. So, $γ_{KLS}$-statistical field theory has to be discarded for statistical mechanics generalization purposes.
Submitted 1 April, 2024;
originally announced April 2024.
-
Property $(\diamond)$ for Ore extensions of small Krull dimension
Authors:
Ken Brown,
Paula A. A. B. Carvalho,
Jerzy Matczuk
Abstract:
This paper is a continuation of a project to determine which skew polynomial algebras $S = R[θ; α]$ satisfy property $(\diamond)$, namely that the injective hull of every simple $S$-module is locally artinian, where $k$ is a field, $R$ is a commutative noetherian $k$-algebra, and $α$ is a $k$-algebra automorphism of $R$. Earlier work (which we review) and further analysis done here lead us to focus on the case where $S$ is a primitive domain and $R$ has Krull dimension 1 and contains an uncountable field. We show first that if $|\mathrm{Spec}(R)|$ is infinite then $S$ does not satisfy $(\diamond)$. Secondly, we show that when $R = k[X]_{<X>}$ and $α(X) = qX$, where $q \in k \setminus \{0\}$ is not a root of unity, $S$ does not satisfy $(\diamond)$. This is in complete contrast to our earlier result that, when $R = k[[X]]$ and $α$ is an arbitrary $k$-algebra automorphism of infinite order, $S$ satisfies $(\diamond)$. A number of open questions are stated.
Submitted 14 March, 2024;
originally announced March 2024.
-
Efficient Parameter Mining and Freezing for Continual Object Detection
Authors:
Angelo G. Menezes,
Augusto J. Peterlevitz,
Mateus A. Chinelatto,
André C. P. L. F. de Carvalho
Abstract:
Continual Object Detection is essential for enabling intelligent agents to interact proactively with humans in real-world settings. While parameter-isolation strategies have been extensively explored in the context of continual learning for classification, they have yet to be fully harnessed for incremental object detection scenarios. Drawing inspiration from prior research that focused on mining individual neuron responses and integrating insights from recent developments in neural pruning, we proposed efficient ways to identify which layers are the most important for a network to maintain the performance of a detector across sequential updates. The presented findings highlight the substantial advantages of layer-level parameter isolation in facilitating incremental learning within object detection models, offering promising avenues for future research and application in real-world scenarios.
Submitted 19 February, 2024;
originally announced February 2024.
-
Pulse vaccination in a modified SIR model: global dynamics, bifurcations and seasonality
Authors:
João P. S. Maurício de Carvalho,
Alexandre A. Rodrigues
Abstract:
We analyze a periodically-forced dynamical system inspired by the SIR model with pulse vaccination. We fully characterize its dynamics according to the proportion $p$ of vaccinated individuals and the time $T$ between doses. If the basic reproduction number is less than 1 (i.e. $\mathcal{R}_p<1$), then we obtain precise conditions for the existence and global stability of a disease-free $T$-periodic solution. Otherwise, if $\mathcal{R}_p>1$, then a globally stable $T$-periodic solution emerges with positive coordinates. We draw a bifurcation diagram $(T,p)$ and we describe the associated bifurcations. We also find analytically and numerically chaotic dynamics by adding seasonality to the disease transmission rate. In a realistic context, low vaccination coverage and intense seasonality may result in unpredictable dynamics. Previous experiments have suggested chaos in periodically-forced biological impulsive models, but no analytic proof has been given.
Submitted 4 December, 2023;
originally announced December 2023.
-
Dialogue Quality and Emotion Annotations for Customer Support Conversations
Authors:
John Mendonça,
Patrícia Pereira,
Miguel Menezes,
Vera Cabarrão,
Ana C. Farinha,
Helena Moniz,
João Paulo Carvalho,
Alon Lavie,
Isabel Trancoso
Abstract:
Task-oriented conversational datasets often lack topic variability and linguistic diversity. However, with the advent of Large Language Models (LLMs) pretrained on extensive, multilingual and diverse text data, these limitations seem overcome. Nevertheless, their generalisability to different languages and domains in dialogue applications remains uncertain without benchmarking datasets. This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations. By performing annotations that take into consideration the complete instances that compose a conversation, one can form a broader perspective of the dialogue as a whole. Furthermore, it provides a unique and valuable resource for the development of text classification models. To this end, we present benchmarks for Emotion Recognition and Dialogue Quality Estimation and show that further research is needed to leverage these models in a production setting.
Submitted 23 November, 2023;
originally announced November 2023.
-
On homological reduction of Poisson structures
Authors:
Pedro H. Carvalho
Abstract:
Given a $\mathfrak{g}$-action on a Poisson manifold $(M, π)$ and an equivariant map $J: M \rightarrow \mathfrak{h}^*,$ for $\mathfrak{h}$ a $\mathfrak{g}$-module, we obtain, under natural compatibility and regularity conditions previously considered by Cattaneo-Zambon, a homotopy Poisson algebra generalizing the classical BFV algebra described by Kostant-Sternberg in the usual hamiltonian setting. As an application of our methods, we also derive homological models for the reduced spaces associated to quasi-Poisson and hamiltonian quasi-Poisson spaces.
Submitted 11 December, 2023; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations
Authors:
Patrícia Pereira,
Rui Ribeiro,
Helena Moniz,
Luisa Coheur,
Joao Paulo Carvalho
Abstract:
Fuzzy Fingerprints have been successfully used as an interpretable text classification technique, but, like most other techniques, have been largely surpassed in performance by Large Pre-trained Language Models, such as BERT or RoBERTa. These models deliver state-of-the-art results in several Natural Language Processing tasks, namely Emotion Recognition in Conversations (ERC), but suffer from the lack of interpretability and explainability. In this paper, we propose to combine the two approaches to perform ERC, as a means to obtain simpler and more interpretable Large Language Models-based classifiers. We propose to feed the utterances and their previous conversational turns to a pre-trained RoBERTa, obtaining contextual embedding utterance representations, that are then supplied to an adapted Fuzzy Fingerprint classification module. We validate our approach on the widely used DailyDialog ERC benchmark dataset, in which we obtain state-of-the-art level results using a much lighter model.
Submitted 8 September, 2023;
originally announced September 2023.
-
Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation
Authors:
John Mendonça,
Patrícia Pereira,
Helena Moniz,
João Paulo Carvalho,
Alon Lavie,
Isabel Trancoso
Abstract:
Despite significant research effort in the development of automatic dialogue evaluation metrics, little thought is given to evaluating dialogues other than in English. At the same time, ensuring metrics are invariant to semantically similar responses is also an overlooked topic. In order to achieve the desired properties of robustness and multilinguality for dialogue evaluation metrics, we propose a novel framework that takes advantage of the strengths of current evaluation models with the newly-established paradigm of prompting Large Language Models (LLMs). Empirical results show our framework achieves state of the art results in terms of mean Spearman correlation scores across several benchmarks and ranks first place on both the Robust and Multilingual tasks of the DSTC11 Track 4 "Automatic Evaluation Metrics for Open-Domain Dialogue Systems", proving the evaluation capabilities of prompted LLMs.
Submitted 8 September, 2023; v1 submitted 31 August, 2023;
originally announced August 2023.
-
Quantitative Transversal Theorems in the Plane
Authors:
Ilani Axelrod-Freed,
João Pedro Carvalho,
Yuki Takahashi
Abstract:
Hadwiger's theorem is a variant of Helly-type theorems involving common transversals to families of convex sets instead of common intersections. In this paper, we obtain a quantitative version of Hadwiger's theorem on the plane: given an ordered family of pairwise disjoint and compact convex sets in $\mathbb{R}^2$ and any real-valued monotone function on convex subsets of $\mathbb{R}^2,$ if every three sets have a common transversal, respecting the order, such that the intersection of the sets with each half-plane defined by the transversal are valued at least (or at most) some constant $α,$ then the entire family has a common transversal with the same property. Unlike previous generalizations of Hadwiger's theorem, we prove that disjointness is necessary for the quantitative case. We also prove colorful versions of our results.
Submitted 21 August, 2023;
originally announced August 2023.
-
Analysis of the orbital evolution of space debris using a solar sail and natural forces
Authors:
Jean Paulo dos S. Carvalho,
Rodolpho Vilhena de Moraes,
Antonio Fernando Bertachini de A. Prado
Abstract:
In this work, we analyze the orbital evolution of space debris located in the geostationary orbit (GEO) and consider the possibility of using a solar sail to help clean the space environment. The main natural environmental perturbations that act on the orbit of the debris are included in the dynamics. Such forces acting on the solar sail can force the growth of the eccentricity of these objects in GEO. Several authors have presented models of the solar radiation pressure (SRP) based on the single-averaged model. However, a literature review shows that these authors consider the Earth around the Sun in a circular and inclined orbit. Our contribution to the SRP model is a different approach, in which we consider the Sun in an elliptical and inclined orbit, which is valid for other bodies in the solar system when the eccentricity cannot be neglected. The expression of the SRP is developed up to the second order. We found that the first-order term is much larger than the second-order term, so the quadrupole term can be neglected. Another contribution is the approach to identify the initial conditions of the argument of perigee (g) and the longitude of the ascending node (h), where some values in the (g, h) plane amplify the eccentricity growth. In the numerical simulations we use real space debris data taken from the website Stuff in Space. The solar sail helps to clean up the space environment using a propulsion system powered by the Sun itself, a clean and abundant energy source, unlike chemical propellants, and thus contributes to the sustainability of space exploration.
Submitted 16 July, 2023;
originally announced July 2023.
-
Is Kaniadakis $κ$-generalized statistical mechanics general?
Authors:
T. F. A. Alves,
J. F. da Silva Neto,
F. W. S. Lima,
G. A. Alves,
P. R. S. Carvalho
Abstract:
In this Letter we introduce a field-theoretic approach for computing the critical properties of systems undergoing continuous phase transitions governed by the $κ$-generalized statistics, namely $κ$-generalized statistical field theory. In particular, we show, through analytic and simulation results, that the $κ$-generalized Ising-like systems are not capable of describing the nonconventional critical properties of real imperfect crystals, e.g. manganites, whereas an alternative generalized theory, namely nonextensive statistical field theory, is, as shown recently in the literature. Although $κ$-Ising-like systems do not depend on $κ$, we show that a few distinct systems do. Thus the $κ$-generalized statistical field theory is not general, i.e. it fails to generalize Ising-like systems for describing the critical behavior of imperfect crystals, and must be discarded as a generalization of statistical mechanics. For the latter systems we present the physical interpretation of the theory by furnishing the general physical interpretation of the deformation $κ$-parameter.
Submitted 11 July, 2023;
originally announced July 2023.
-
On Pareto equilibria for bi-objective diffusive optimal control problems
Authors:
Pitágoras P. de Carvalho,
Enrique Fernández-Cara,
Juan Límaco,
Denilson Menezes,
Yuri Thamsten
Abstract:
We investigate Pareto equilibria for bi-objective optimal control problems. Our framework comprises the situation in which an agent acts with a distributed control in a portion of a given domain, and aims to achieve two distinct (possibly conflicting) targets. We analyze systems governed by linear and semilinear heat equations and also systems with multiplicative controls. We develop numerical methods relying on a combination of finite elements and finite differences. We illustrate the computational methods we develop via numerous experiments.
Submitted 10 July, 2023;
originally announced July 2023.
-
Local null controllability of a class of non-Newtonian incompressible viscous fluids
Authors:
Pitágoras de Carvalho,
Juan Límaco,
Denilson Menezes,
Yuri Thamsten
Abstract:
We investigate the null controllability property of systems that mathematically describe the dynamics of some non-Newtonian incompressible viscous flows. The principal model we study was proposed by O. A. Ladyzhenskaya, although the techniques we develop here apply to other fluids having a shear-dependent viscosity. Taking advantage of the Pontryagin Minimum Principle, we utilize a bootstrapping argument to prove that sufficiently smooth controls to the forced linearized Stokes problem exist, as long as the initial data in turn has enough regularity. From there, we extend the result to the nonlinear problem. As a byproduct, we devise a quasi-Newton algorithm to compute the states and a control, which we prove to converge in an appropriate sense. We finish the work with some numerical experiments.
Submitted 8 July, 2023;
originally announced July 2023.
-
Fast Matrix Multiplication via Compiler-only Layered Data Reorganization and Intrinsic Lowering
Authors:
Braedy Kuzma,
Ivan Korostelev,
João P. L. de Carvalho,
José E. Moreira,
Christopher Barton,
Guido Araujo,
José Nelson Amaral
Abstract:
The resurgence of machine learning has increased the demand for high-performance basic linear algebra subroutines (BLAS), which have long depended on libraries to achieve peak performance on commodity hardware. High-performance BLAS implementations rely on a layered approach that consists of tiling and packing layers, for data (re)organization, and micro kernels that perform the actual computations. The creation of high-performance micro kernels requires significant development effort to write tailored assembly code for each architecture. This hand optimization task is complicated by the recent introduction of matrix engines by IBM's POWER10 MMA, Intel AMX, and Arm ME to deliver high-performance matrix operations. This paper presents a compiler-only alternative to the use of high-performance libraries by incorporating, to the best of our knowledge and for the first time, the automatic generation of the layered approach into LLVM, a production compiler. Modular design of the algorithm, such as the use of LLVM's matrix-multiply intrinsic for a clear interface between the tiling and packing layers and the micro kernel, makes it easy to retarget the code generation to multiple accelerators. The use of intrinsics enables a comprehensive performance study. In processors without hardware matrix engines, the tiling and packing delivers performance up to 22x (Intel), for small matrices, and more than 6x (POWER9), for large matrices, faster than PLuTo, a widely used polyhedral optimizer. The performance also approaches high-performance libraries and is only 34% slower than OpenBLAS and on-par with Eigen for large matrices. With MMA in POWER10 this solution is, for large matrices, over 2.6x faster than the vector-extension solution, matches Eigen performance, and achieves up to 96% of BLAS peak performance.
Submitted 15 May, 2023;
originally announced May 2023.
-
A Review of Benchmarks for Visual Defect Detection in the Manufacturing Industry
Authors:
Philippe Carvalho,
Alexandre Durupt,
Yves Grandvalet
Abstract:
The field of industrial defect detection using machine learning and deep learning is a subject of active research. Datasets, also called benchmarks, are used to compare and assess research results. There are a number of datasets in industrial visual inspection, of varying quality. Thus, it is a difficult task to determine which dataset to use. Generally speaking, datasets which include a testing set, with precise labeling and made in real-world conditions, should be preferred. We propose a study of existing benchmarks to compare and expose their characteristics and their use-cases. A study of industrial metrics requirements, as well as testing procedures, will be presented and applied to the studied benchmarks. We discuss our findings by examining the current state of benchmarks for industrial visual inspection, and by presenting guidelines on the usage of benchmarks.
Submitted 5 May, 2023;
originally announced May 2023.
-
Context-Dependent Embedding Utterance Representations for Emotion Recognition in Conversations
Authors:
Patrícia Pereira,
Helena Moniz,
Isabel Dias,
Joao Paulo Carvalho
Abstract:
Emotion Recognition in Conversations (ERC) has been gaining increasing importance as conversational agents become more and more common. Recognizing emotions is key for effective communication, being a crucial component in the development of effective and empathetic conversational agents. Knowledge and understanding of the conversational context are extremely valuable for identifying the emotions of the interlocutor. We thus approach Emotion Recognition in Conversations leveraging the conversational context, i.e., taking into account previous conversational turns. The usual approach to model the conversational context has been to produce context-independent representations of each utterance and subsequently perform contextual modeling of these. Here we propose context-dependent embedding representations of each utterance by leveraging the contextual representational power of pre-trained transformer language models. In our approach, we feed the conversational context appended to the utterance to be classified as input to the RoBERTa encoder, to which we append a simple classification module, thus discarding the need to deal with context after obtaining the embeddings, since these already constitute an efficient representation of such context. We also investigate how the number of introduced conversational turns influences our model performance. The effectiveness of our approach is validated on the open-domain DailyDialog dataset and on the task-oriented EmoWOZ dataset.
Submitted 3 June, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
PGTask: Introducing the Task of Profile Generation from Dialogues
Authors:
Rui Ribeiro,
Joao P. Carvalho,
Luísa Coheur
Abstract:
Recent approaches have attempted to personalize dialogue systems by incorporating profile information into models. However, this knowledge is scarce and difficult to obtain, which makes the extraction/generation of profile information from dialogues a fundamental asset. To surpass this limitation, we introduce the Profile Generation Task (PGTask). We contribute a new dataset for this problem, comprising profile sentences aligned with related utterances, extracted from a corpus of dialogues. Furthermore, using state-of-the-art methods, we provide a benchmark for profile generation on this novel dataset. Our experiments disclose the challenges of profile generation, and we hope that this introduces a new research direction.
Submitted 26 August, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Context Matters: Adaptive Mutation for Grammars
Authors:
Pedro Carvalho,
Jessica Mégane,
Nuno Lourenço,
Penousal Machado
Abstract:
This work proposes Adaptive Facilitated Mutation, a self-adaptive mutation method for Structured Grammatical Evolution (SGE), biologically inspired by the theory of facilitated variation. In SGE, the genotype of individuals contains a list for each non-terminal of the grammar that defines the search space. In our proposed mutation, each individual contains an array with a different, self-adaptive mutation rate for each non-terminal. We also propose Function Grouped Grammars, a grammar design procedure, to enhance the benefits of the proposed mutation. Experiments were conducted on three symbolic regression benchmarks using Probabilistic Structured Grammatical Evolution (PSGE), a variant of SGE. Results show our approach is similar or better when compared with the standard grammar and mutation.
Submitted 25 March, 2023;
originally announced March 2023.
-
Advancing Direct Convolution using Convolution Slicing Optimization and ISA Extensions
Authors:
Victor Ferrari,
Rafael Sousa,
Marcio Pereira,
João P. L. de Carvalho,
José Nelson Amaral,
José Moreira,
Guido Araujo
Abstract:
Convolution is one of the most computationally intensive operations that must be performed for machine-learning model inference. A traditional approach to compute convolutions is known as the Im2Col + BLAS method. This paper proposes SConv: a direct-convolution algorithm based on an MLIR/LLVM code-generation toolchain that can be integrated into machine-learning compilers. This algorithm introduces: (a) Convolution Slicing Analysis (CSA) - a convolution-specific 3D cache-blocking analysis pass that focuses on tile reuse over the cache hierarchy; (b) Convolution Slicing Optimization (CSO) - a code-generation pass that uses CSA to generate a tiled direct-convolution macro-kernel; and (c) Vector-Based Packing (VBP) - an architecture-specific optimized input-tensor packing solution based on vector-register shift instructions for convolutions with unitary stride. Experiments conducted on 393 convolutions from full ONNX-MLIR machine-learning models indicate that the elimination of the Im2Col transformation and the use of fast packing routines result in a total packing time reduction, on full model inference, of 2.0x - 3.9x on Intel x86 and 3.6x - 7.2x on IBM POWER10. The speedup over an Im2Col + BLAS method based on current BLAS implementations for end-to-end machine-learning model inference is in the range of 9% - 25% for Intel x86 and 10% - 42% for IBM POWER10 architectures. The total convolution speedup for model inference is 12% - 27% on Intel x86 and 26% - 46% on IBM POWER10. SConv also outperforms BLAS GEMM, when computing pointwise convolutions, in more than 83% of the 219 tested instances.
Submitted 8 March, 2023;
originally announced March 2023.
-
Deformable registration with intensity correction for CESM monitoring response to Neoadjuvant Chemotherapy
Authors:
Clément Jailin,
Pablo Milioni De Carvalho,
Sara Mohamed,
Laurence Vancamberg,
Amr Farouk Ibrahim Moustafa,
Mohammed Gomaa,
Rasha Mohammed Kamal,
Serge Muller
Abstract:
This paper proposes a robust longitudinal registration method for Contrast Enhanced Spectral Mammography (CESM) in monitoring neoadjuvant chemotherapy (NAC). Because breast texture intensity changes with the treatment, a non-rigid registration procedure with local intensity compensations is developed. The approach allows registering the low energy images of the exams acquired before and after the chemotherapy. The measured motion is then applied to the corresponding recombined images. In the difference of the registered images, called the residual, the breast texture that did not change between the two exams vanishes. Consequently, this registered residual allows identifying local density and iodine changes, especially in the lesion area. The method is validated with a synthetic NAC case where ground truths are available. Then the procedure is applied to 51 patients with 208 CESM image pairs acquired before and after the chemotherapy treatment. The proposed registration converged in all 208 cases. The intensity-compensated registration approach is evaluated with different mathematical metrics and through the repositioning of clinical landmarks (RMSE: 5.9 mm) and outperforms state-of-the-art registration techniques.
Submitted 22 February, 2023;
originally announced February 2023.
-
Structured mutation inspired by evolutionary theory enriches population performance and diversity
Authors:
Stefano Tiso,
Pedro Carvalho,
Nuno Lourenço,
Penousal Machado
Abstract:
Grammar-Guided Genetic Programming (GGGP) employs a variety of insights from evolutionary theory to autonomously design solutions for a given task. Recent insights from evolutionary biology can lead to further improvements in GGGP algorithms. In this paper, we apply principles from the theory of Facilitated Variation and knowledge about heterogeneous mutation rates and mutation effects to improve the variation operators. We term this new method of variation Facilitated Mutation (FM). We test FM performance on the evolution of neural network optimizers for image classification, a relevant task in evolutionary computation, with important implications for the field of machine learning. We compare FM and FM combined with crossover (FMX) against a typical mutation regime to assess the benefits of the approach. We find that FMX in particular provides statistical improvements in key metrics, creating a superior optimizer overall (+0.48\% average test accuracy), improving the average quality of solutions (+50\% average population fitness), and discovering more diverse high-quality behaviors (+400 high-quality solutions discovered per run on average). Additionally, FM and FMX can reduce the number of fitness evaluations in an evolutionary run, reducing computational costs in some scenarios.
Submitted 12 July, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
SIR model with vaccination: bifurcation analysis
Authors:
João P. S. Maurício de Carvalho,
Alexandre A. Rodrigues
Abstract:
There are few adapted SIR models in the literature that combine vaccination and logistic growth. In this article, we study bifurcations of a SIR model where the class of Susceptible individuals grows logistically and has been subject to constant vaccination. We explicitly prove that the endemic equilibrium is a codimension two singularity in the parameter space $(\mathcal{R}_0, p)$, where $\mathcal{R}_0$ is the basic reproduction number and $p$ is the proportion of Susceptible individuals successfully vaccinated at birth.
We exhibit explicitly the Hopf, transcritical, Belyakov, heteroclinic and saddle-node bifurcation curves unfolding the singularity. The two parameters $(\mathcal{R}_0, p)$ are written in a useful way to evaluate the proportion of vaccinated individuals necessary to eliminate the disease and to conclude how the vaccination may affect the outcome of the epidemic. We also exhibit the region in the parameter space where the disease persists and we illustrate our main result with numerical simulations, emphasizing the role of the parameters.
Submitted 25 April, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Deep Emotion Recognition in Textual Conversations: A Survey
Authors:
Patrícia Pereira,
Helena Moniz,
Joao Paulo Carvalho
Abstract:
While Emotion Recognition in Conversations (ERC) has seen a tremendous advancement in the last few years, new applications and implementation scenarios present novel challenges and opportunities. These range from leveraging the conversational context and modelling speaker and emotion dynamics, to interpreting common sense expressions, informal language and sarcasm, addressing the challenges of real-time ERC, recognizing emotion causes, handling different taxonomies across datasets, multilingual ERC, and interpretability. This survey starts by introducing ERC, elaborating on the challenges and opportunities pertaining to this task. It proceeds with a description of the emotion taxonomies and a variety of ERC benchmark datasets employing such taxonomies. This is followed by descriptions of the most prominent works in ERC with explanations of the Deep Learning architectures employed. Then, it provides advisable ERC practices towards better frameworks, elaborating on methods to deal with subjectivity in annotations and modelling and methods to deal with the typically unbalanced ERC datasets. Finally, it presents systematic review tables comparing several works regarding the methods used and their performance. The survey highlights the advantage of leveraging techniques to address unbalanced data, the exploration of mixed emotions and the benefits of incorporating annotation subjectivity in the learning phase.
Submitted 22 May, 2024; v1 submitted 16 November, 2022;
originally announced November 2022.
-
Squeezing as a probe of the universality hypothesis
Authors:
P. A. L. Mourão,
H. A. S. Costa,
P. R. S. Carvalho
Abstract:
We compute analytically the radiative quantum corrections, up to next-to-leading loop order, to the universal critical exponents for both massless and massive O($N$) $\lambda\phi^{4}$ scalar squeezed field theories for probing the universality hypothesis. For that, we employ six distinct and independent methods. The outcomes for the universal squeezed critical exponents obtained through these methods are identical to one another and reduce to the conventional ones when squeezing is absent. Although the squeezing mechanism modifies the internal properties of the field, the squeezed critical indices are not affected by the squeezing effect, thus implying the validity of the universality hypothesis, at least at the loop level considered. At the end, we present the corresponding physical interpretation for the results in terms of the geometric symmetry properties of the squeezed field.
Submitted 16 November, 2022;
originally announced November 2022.
-
Diffusion of muonic hydrogen in hydrogen gas and the measurement of the 1$s$ hyperfine splitting of muonic hydrogen
Authors:
J. Nuber,
A. Adamczak,
M. Abdou Ahmed,
L. Affolter,
F. D. Amaro,
P. Amaro,
P. Carvalho,
Y. -H. Chang,
T. -L. Chen,
W. -L. Chen,
L. M. P. Fernandes,
M. Ferro,
D. Goeldi,
T. Graf,
M. Guerra,
T. W. Hänsch,
C. A. O. Henriques,
M. Hildebrandt,
P. Indelicato,
O. Kara,
K. Kirch,
A. Knecht,
F. Kottmann,
Y. -W. Liu,
J. Machado
, et al. (24 additional authors not shown)
Abstract:
The CREMA collaboration is pursuing a measurement of the ground-state hyperfine splitting (HFS) in muonic hydrogen ($\mu$p) with 1 ppm accuracy by means of pulsed laser spectroscopy. In the proposed experiment, the $\mu$p atom is excited by a laser pulse from the singlet to the triplet hyperfine sub-levels, and is quenched back to the singlet state by an inelastic collision with a H$_2$ molecule. The resulting increase of kinetic energy after this cycle modifies the $\mu$p atom diffusion in the hydrogen gas and the arrival time of the $\mu$p atoms at the target walls. This laser-induced modification of the arrival times is used to expose the atomic transition. In this paper we present the simulation of the $\mu$p diffusion in the H$_2$ gas which is at the core of the experimental scheme. These simulations have been implemented with the Geant4 framework by introducing various low-energy processes including the motion of the H$_2$ molecules, i.e. the effects related to the hydrogen target temperature. The simulations have been used to optimize the hydrogen target parameters (pressure, temperatures and thickness) and to estimate signal and background rates. These rates allow us to estimate the maximum time needed to find the resonance and the statistical accuracy of the spectroscopy experiment.
Submitted 24 May, 2023; v1 submitted 15 November, 2022;
originally announced November 2022.
-
Experimental validation of nonextensive statistical field theory: applications to manganites
Authors:
P. R. S. Carvalho
Abstract:
In this Letter we validate experimentally the nonextensive statistical field theory, a new general field-theoretic approach introduced recently in the literature. With such an approach, we are capable of computing the critical properties of nonextensive systems undergoing continuous phase transitions, belonging to the new generalized O($N$)$_{q}$ universality class. We compare the values of the nonextensive critical indices evaluated from the theory with those already obtained in the literature through experiments for various distinct manganites presenting nonconventional critical behavior. The agreement is satisfactory: the relative errors are $< 5\%$ for most of the manganites employed, and the agreement is better the more closely the condition $|1 - q| < 1$, the limit of validity of the theory, is satisfied. We present the physical interpretation of the experimental results and of the theory through the general physical interpretation of the $q$-parameter.
Submitted 14 November, 2022;
originally announced November 2022.
-
An agent-based approach to procedural city generation incorporating Land Use and Transport Interaction models
Authors:
Luiz Fernando Silva Eugênio dos Santos,
Claus Aranha,
André Ponce de Leon F de Carvalho
Abstract:
We apply the knowledge of urban settings established through the study of Land Use and Transport Interaction (LUTI) models to develop reward functions for an agent-based system capable of planning realistic artificial cities. The system aims to replicate at the micro scale the main components of real settlements, such as zoning and accessibility in a road network. Moreover, we propose a novel representation for the agent's environment that efficiently combines the road graph with a discrete model for the land. Our system starts from an empty map consisting only of the road network graph, and the agent incrementally expands it by building new sites while distinguishing land uses between residential, commercial, industrial, and recreational.
Submitted 21 October, 2022;
originally announced November 2022.
-
EmulART: Emulating Radiative Transfer -- A pilot study on autoencoder based dimensionality reduction for radiative transfer models
Authors:
João Rino-Silvestre,
Santiago González-Gaitán,
Marko Stalevski,
Majda Smole,
Pedro Guilherme-Garcia,
João Paulo Carvalho,
Ana Maria Mourão
Abstract:
Dust is a major component of the interstellar medium. Through scattering, absorption and thermal re-emission, it can profoundly alter astrophysical observations. Models for dust composition and distribution are necessary to better understand and curb their impact on observations. A new approach for serial and computationally inexpensive production of such models is here presented. Traditionally these models are studied with the help of radiative transfer modelling, a critical tool to understand the impact of dust attenuation and reddening on the observed properties of galaxies and active galactic nuclei. Such simulations present, however, an approximately linear computational cost increase with the desired information resolution. Our new efficient model generator proposes a denoising variational autoencoder (or alternatively PCA), for spectral compression, combined with an approximate Bayesian method for spatial inference, to emulate high information radiative transfer models from low information models. For a simple spherical dust shell model with anisotropic illumination, our proposed approach successfully emulates the reference simulation starting from less than 1% of the information. Our emulations of the model at different viewing angles present median residuals below 15% across the spectral dimension, and below 48% across spatial and spectral dimensions. EmulART infers estimates for ~85% of information missing from the input, all within a total running time of around 20 minutes, estimated to be 6x faster than the present target high information resolution simulations, and up to 50x faster when applied to more complicated simulations.
Submitted 22 December, 2022; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Value of Bidirectional V2G Smart Charging Responsive Services: Insights from a Simple CA Model
Authors:
Pedro M. S. Carvalho,
Luis A. F. M. Ferreira
Abstract:
In this paper, particle-hopping cellular automaton (CA) models of elastic demand are used to investigate the value added to plug-in electric vehicle (PEV) aggregators by adopting vehicle-to-grid (V2G) responsive services. CA models used earlier to study load-shifting responses are modified to capture the discharge/recharge capabilities of V2G. Results on ramping responses from CA are then analysed to discuss the small contribution to system controllability added by V2G responsive services.
Submitted 14 September, 2022;
originally announced September 2022.
-
Model interpretation using improved local regression with variable importance
Authors:
Gilson Y. Shimizu,
Rafael Izbicki,
Andre C. P. L. F. de Carvalho
Abstract:
A fundamental question on the use of ML models concerns the explanation of their predictions for increasing transparency in decision-making. Although several interpretability methods have emerged, some gaps regarding the reliability of their explanations have been identified. For instance, most methods are unstable (meaning that they give very different explanations with small changes in the data), and do not cope well with irrelevant features (that is, features not related to the label). This article introduces two new interpretability methods, namely VarImp and SupClus, that overcome these issues by using local regression fits with a weighted distance that takes into account variable importance. Whereas VarImp generates explanations for each instance and can be applied to datasets with more complex relationships, SupClus interprets clusters of instances with similar explanations and can be applied to simpler datasets where clusters can be found. We compare our methods with state-of-the-art approaches and show that they yield better explanations according to several metrics, particularly in high-dimensional problems with irrelevant features, as well as when the relationship between features and target is non-linear.
Submitted 12 September, 2022;
originally announced September 2022.
-
Multimetric Finsler Geometry
Authors:
Patrícia Carvalho,
Cristian Landri,
Ravi Mistry,
Aleksandr Pinzul
Abstract:
Motivated in part by the bi-gravity approach to massive gravity, we introduce and study the multimetric Finsler geometry. For the case of an arbitrary number of dimensions, we study some general properties of the geometry in terms of its Riemannian ingredients, while in the 2-dimensional case, we derive all the Cartan equations as well as explicitly find the Holmes-Thompson measure.
Submitted 7 August, 2022;
originally announced August 2022.
-
No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling
Authors:
Marília Costa Rosendo Silva,
Felipe Alves Siqueira,
João Pedro Mantovani Tarrega,
João Vitor Pataca Beinotti,
Augusto Sousa Nunes,
Miguel de Mattos Gardini,
Vinícius Adolfo Pereira da Silva,
Nádia Félix Felipe da Silva,
André Carlos Ponce de Leon Ferreira de Carvalho
Abstract:
Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variability depending on the machine learning algorithm. Furthermore, the distortions can be misleading when regarding cluster geometry. Amongst the causes, the presence of outliers and anomalies can be a determining factor. Despite the relevance of initialization and outlier issues for text clustering and topic modeling, the authors did not find an in-depth analysis of them. This survey provides a systematic literature review (2011-2022) of these subareas and proposes a common terminology since similar procedures have different terms. The authors describe research opportunities, trends, and open issues. The appendices summarize the theoretical background of the text vectorization, the factorization, and the clustering algorithms that are directly or indirectly related to the reviewed works.
Submitted 2 August, 2022;
originally announced August 2022.
-
Breast shape estimation and correction in CESM biopsy
Authors:
Ruben Sanchez,
Clément Jailin,
Ann-Katherine Carton,
Pablo Milioni de Carvalho,
Laurence Casteignau,
Serge Muller
Abstract:
Description of purpose: Contrast-enhanced spectral mammography can be used to guide needle biopsies. However, in the vertical approach the compressed breast is deformed, generating a so-called bump in the paddle aperture, which may interfere with the visibility of contrast-uptakes. Local thickness estimation would provide an enhanced image quality of the recombined image, increasing the visibility of the contrast-uptakes to be targeted during the biopsy procedure. In this work we propose a method to estimate the shape of the breast bump in the vertical biopsy approach. Materials and Methods: Our method consists of two steps: first, we compute a raw thickness which does not take into account the presence of contrast-uptakes; second, we use a physical model to separate the sparse iodine texture from the breast shape. This physical model is composed of a sum of Fourier components, describing the main shape of the bump, a series of low-order polynomials, describing the main compressed thickness, paddle tilt and deflection, and non-linear components describing the translation and rotation of the paddle aperture. A 3D object mimicking a bump was fabricated to test the pertinence of our shape model. Also, clinical images of 21 patients who underwent CESM-guided biopsy were visually assessed. Results: Comparison between raw and final estimated thickness of our 3D test object shows an error standard deviation of 0.37 mm, similar to the noise standard deviation of 0.32 mm. The visual assessment of clinical cases showed that the thickness correction removes the superimposed low-frequency pattern due to non-uniform thickness of the bump, improving the identification of the lesion to be targeted. Conclusion: The proposed method for thickness estimation is adapted to CESM-guided biopsies in the vertical approach and it improves the identification of the contrast-uptakes that need to be targeted during the procedure.
Submitted 28 July, 2022;
originally announced July 2022.
-
Critical and injective modules over skew polynomial rings
Authors:
Ken Brown,
Paula A. A. B. Carvalho,
Jerzy Matczuk
Abstract:
Let $R$ be a commutative local $k$-algebra of Krull dimension one, where $k$ is a field. Let $α$ be a $k$-algebra automorphism of $R$, and define $S$ to be the skew polynomial algebra $R[θ; α]$. We offer, under some additional assumptions on $R$, a criterion for $S$ to have injective hulls of all simple $S$-modules locally Artinian - that is, for $S$ to satisfy property $(\diamond)$. It is easy and well known that if $α$ is of finite order, then $S$ has this property, but in order to get the criterion when $α$ has infinite order we found it necessary to classify all cyclic (Krull) critical $S$-modules in this case, a result which may be of independent interest. With the help of the above we show that $\hat{S}=k[[X]][θ; α]$ satisfies $(\diamond)$ for all $k$-algebra automorphisms $α$ of $k[[X]]$.
Submitted 22 June, 2022;
originally announced June 2022.
-
Lessons learned from the NeurIPS 2021 MetaDL challenge: Backbone fine-tuning without episodic meta-learning dominates for few-shot learning image classification
Authors:
Adrian El Baz,
Ihsan Ullah,
Edesio Alcobaça,
André C. P. L. F. Carvalho,
Hong Chen,
Fabio Ferreira,
Henry Gouk,
Chaoyu Guan,
Isabelle Guyon,
Timothy Hospedales,
Shell Hu,
Mike Huisman,
Frank Hutter,
Zhengying Liu,
Felix Mohr,
Ekrem Öztürk,
Jan N. van Rijn,
Haozhe Sun,
Xin Wang,
Wenwu Zhu
Abstract:
Although deep neural networks are capable of achieving performance superior to humans on various tasks, they are notorious for requiring large amounts of data and computing resources, restricting their success to domains where such resources are available. Metalearning methods can address this problem by transferring knowledge from related tasks, thus reducing the amount of data and computing resources needed to learn new tasks. We organize the MetaDL competition series, which provides opportunities for research groups all over the world to create and experimentally assess new meta-(deep)learning solutions for real problems. In this paper, authored collaboratively by the competition organizers and the top-ranked participants, we describe the design of the competition, the datasets, the best experimental results, as well as the top-ranked methods in the NeurIPS 2021 challenge, which attracted 15 active teams who made it to the final phase (by outperforming the baseline), making over 100 code submissions during the feedback phase. The solutions of the top participants have been open-sourced. The lessons learned include that learning good representations is essential for effective transfer learning.
Submitted 11 July, 2022; v1 submitted 15 June, 2022;
originally announced June 2022.
-
Continual Object Detection: A review of definitions, strategies, and challenges
Authors:
Angelo G. Menezes,
Gustavo de Moura,
Cézanne Alves,
André C. P. L. F. de Carvalho
Abstract:
The field of Continual Learning investigates the ability to learn consecutive tasks without losing performance on those previously learned. Its focus has been mainly on incremental classification tasks. We believe that research in continual object detection deserves even more attention due to its vast range of applications in robotics and autonomous vehicles. This scenario is more complex than conventional classification given the occurrence of instances of classes that are unknown at the time, but can appear in subsequent tasks as a new class to be learned, resulting in missing annotations and conflicts with the background label. In this review, we analyze the current strategies proposed to tackle the problem of class-incremental object detection. Our main contributions are: (1) a short and systematic review of the methods that propose solutions to traditional incremental object detection scenarios; (2) a comprehensive evaluation of the existing approaches using a new metric to quantify the stability and plasticity of each technique in a standard way; (3) an overview of the current trends within continual object detection and a discussion of possible future research directions.
Submitted 30 May, 2022;
originally announced May 2022.
-
Nonextensive percolation and Lee-Yang edge singularity from nonextensive $λφ^{3}$ scalar field theory
Authors:
P. R. S. Carvalho
Abstract:
We compute the critical exponents for nonextensive $λφ^{3}$ scalar field theory for all loop orders and $|q - 1| < 1$. We apply the results to both the nonextensive percolation and Lee-Yang edge singularity problems. The corresponding systems are nonextensive generalizations of their extensive counterparts. For that we employ tools from the recently introduced nonextensive statistical field theory. The nonextensive critical exponents computed depend on the nonextensive parameter $q$, which encodes global correlations among the degrees of freedom of the system. The extensive results are recovered in the limit $q\rightarrow 1$. There is an interplay between global correlations and fluctuations, since the nonextensive critical exponents depend on $q$. This dependence is in agreement with the universality hypothesis.
Submitted 31 May, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.
-
An Efficient Contact Algorithm for Rigid/Deformable Interaction based on the Dual Mortar Method
Authors:
R. Pinto Carvalho,
A. M. Couto Carneiro,
F. M. Andrade Pires,
A. Popp
Abstract:
In a wide range of practical problems, such as forming operations and impact tests, assuming that one of the contacting bodies is rigid is an excellent approximation to the physical phenomenon. In this work, the well-established dual mortar method is adopted to enforce interface constraints in the finite deformation frictionless contact of rigid and deformable bodies. The efficiency of the nonlinear contact algorithm proposed here is based on two main contributions. Firstly, a variational formulation of the method using the so-called Petrov-Galerkin scheme is investigated, as it unlocks a significant simplification by removing the need to explicitly evaluate the dual basis functions. The corresponding first-order dual mortar interpolation is presented in detail. Particular focus is, then, placed on the extension for second-order interpolation by employing a piecewise linear interpolation scheme, which critically retains the geometrical information of the finite element mesh. Secondly, a new definition for the nodal orthonormal moving frame attached to each contact node is suggested. It reduces the geometrical coupling between the nodes and consequently decreases the stiffness matrix bandwidth. The proposed contributions decrease the computational complexity of dual mortar methods for rigid/deformable interaction, especially in the three-dimensional setting, while preserving accuracy and robustness.
Submitted 6 October, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
Nonextensive statistical field theory
Authors:
P. R. S. Carvalho
Abstract:
We introduce a field-theoretic approach for describing the critical behavior of nonextensive systems, that is, systems displaying global correlations among their degrees of freedom, encoded by the nonextensive parameter $q$. As applications, we report, to our knowledge, the first analytical computation of both universal static and dynamic $q$-dependent nonextensive critical exponents for O($N$) vector models, valid for all loop orders and $|q - 1| < 1$. From this emerges the new nonextensive O($N$)$_{q}$ universality class. We employ six independent methods which furnish identical results. In particular, the results for nonextensive 2d Ising systems, exact within the referred approximation, agree with those obtained from computer simulations within the margin of error, with the agreement improving as $q$ approaches $1$. We argue that the present approach can be applied to all models described by extensive statistical field theory as well. The results show an interplay between global correlations and fluctuations.
Submitted 6 December, 2021; v1 submitted 1 December, 2021;
originally announced December 2021.
-
Laser excitation of the 1s-hyperfine transition in muonic hydrogen
Authors:
P. Amaro,
A. Adamczak,
M. Abdou Ahmed,
L. Affolter,
F. D. Amaro,
P. Carvalho,
T. -L. Chen,
L. M. P. Fernandes,
M. Ferro,
D. Goeldi,
T. Graf,
M. Guerra,
T. W. Hänsch,
C. A. O. Henriques,
Y. -C. Huang,
P. Indelicato,
O. Kara,
K. Kirch,
A. Knecht,
F. Kottmann,
Y. -W. Liu,
J. Machado,
M. Marszalek,
R. D. P. Mano,
C. M. B. Monteiro, et al. (21 additional authors not shown)
Abstract:
The CREMA collaboration is pursuing a measurement of the ground-state hyperfine splitting (HFS) in muonic hydrogen ($μ$p) with 1 ppm accuracy by means of pulsed laser spectroscopy to determine the two-photon-exchange contribution with $2\times10^{-4}$ relative accuracy. In the proposed experiment, the $μ$p atom undergoes a laser excitation from the singlet hyperfine state to the triplet hyperfine state, then is quenched back to the singlet state by an inelastic collision with a H$_2$ molecule. The resulting increase of kinetic energy after the collisional deexcitation is used as a signature of a successful laser transition between hyperfine states. In this paper, we calculate the combined probability that a $μ$p atom initially in the singlet hyperfine state undergoes a laser excitation to the triplet state followed by a collision-induced deexcitation back to the singlet state. This combined probability has been computed using the optical Bloch equations including the inelastic and elastic collisions. Omitting the decoherence effects caused by the laser bandwidth and collisions would overestimate the transition probability by more than a factor of two in the experimental conditions. Moreover, we also account for Doppler effects and provide the matrix element, the saturation fluence, the elastic and inelastic collision rates for the singlet and triplet states, and the resonance linewidth. This calculation thus quantifies one of the key unknowns of the HFS experiment, leading to a precise definition of the requirements for the laser system and to an optimization of the hydrogen gas target where $μ$p is formed and the laser spectroscopy will occur.
Submitted 7 June, 2022; v1 submitted 30 November, 2021;
originally announced December 2021.
-
Search for coherent elastic neutrino-nucleus scattering at a nuclear reactor with CONNIE 2019 data
Authors:
CONNIE collaboration,
Alexis Aguilar-Arevalo,
Javier Bernal,
Xavier Bertou,
Carla Bonifazi,
Gustavo Cancelo,
Victor G. P. B. de Carvalho,
Brenda A. Cervantes-Vergara,
Claudio Chavez,
Gustavo Coelho Corrêa,
Juan C. D'Olivo,
João C. dos Anjos,
Juan Estrada,
Aldo R. Fernandes Neto,
Guillermo Fernandez Moroni,
Ana Foguel,
Richard Ford,
Julián Gasanego Barbuscio,
Juan Gonzalez Cuevas,
Susana Hernandez,
Federico Izraelevitch,
Ben Kilminster,
Kevin Kuk,
Herman P. Lima Jr,
Martin Makler, et al. (11 additional authors not shown)
Abstract:
The Coherent Neutrino-Nucleus Interaction Experiment (CONNIE) is taking data at the Angra 2 nuclear reactor with the aim of detecting the coherent elastic scattering of reactor antineutrinos with silicon nuclei using charge-coupled devices (CCDs). In 2019 the experiment operated with a hardware binning applied to the readout stage, leading to lower levels of readout noise and improving the detection threshold down to 50 eV. The results of the analysis of 2019 data are reported here, corresponding to the detector array of 8 CCDs with a fiducial mass of 36.2 g and a total exposure of 2.2 kg-days. The difference between the reactor-on and reactor-off spectra shows no excess at low energies and yields upper limits at 95% confidence level for the neutrino interaction rates. In the lowest-energy range, 50-180 eV, the expected limit stands at 34 (39) times the standard model prediction, while the observed limit is 66 (75) times the standard model prediction with Sarkis (Chavarria) quenching factors.
Submitted 6 April, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Forecasting Financial Market Structure from Network Features using Machine Learning
Authors:
Douglas Castilho,
Tharsis T. P. Souza,
Soong Moon Kang,
João Gama,
André C. P. L. F. de Carvalho
Abstract:
We propose a model that forecasts market correlation structure from link- and node-based financial network features using machine learning. To this end, market structure is modeled as a dynamic asset network by quantifying time-dependent co-movement of asset price returns across company constituents of major global market indices. We provide empirical evidence using three different network filtering methods to estimate market structure, namely Dynamic Asset Graph (DAG), Dynamic Minimal Spanning Tree (DMST) and Dynamic Threshold Networks (DTN). Experimental results show that the proposed model can forecast market structure with high predictive performance, with up to $40\%$ improvement over a time-invariant correlation-based benchmark. Non-pair-wise correlation features proved to be important compared to traditionally used pair-wise correlation measures for all markets studied, particularly in the long-term forecasting of stock market structure. Evidence is provided for stock constituents of the DAX30, EUROSTOXX50, FTSE100, HANGSENG50, NASDAQ100 and NIFTY50 market indices. Findings can be useful to improve portfolio selection and risk management methods, which commonly rely on a backward-looking covariance matrix to estimate portfolio risk.
Submitted 22 October, 2021;
originally announced October 2021.
-
Learning multiplane images from single views with self-supervision
Authors:
Gustavo Sutter P. Carvalho,
Diogo C. Luvizon,
Antonio Joia Neto,
Andre G. C. Pacheco,
Otavio A. B. Penatti
Abstract:
Generating static novel views from an already captured image is a hard task in computer vision and graphics, in particular when the single input image has dynamic parts such as persons or moving objects. In this paper, we tackle this problem by proposing a new framework, called CycleMPI, that is capable of learning a multiplane image representation from single images through a cyclic training strategy for self-supervision. Our framework does not require stereo data for training; therefore, it can be trained with massive visual data from the Internet, resulting in a better generalization capability even for very challenging cases. Although our method does not require stereo data for supervision, it reaches results on stereo datasets comparable to the state of the art in a zero-shot scenario. We evaluated our method on the RealEstate10K and Mannequin Challenge datasets for view synthesis and presented qualitative results on the Places II dataset.
Submitted 19 October, 2021; v1 submitted 18 October, 2021;
originally announced October 2021.