-
Hunting for extra dimensions in black hole shadows
Authors:
A. S. Lemos,
J. A. V. Campos,
F. A. Brito
Abstract:
Observational data of the Sagittarius A* (Sgr A*) shadow released by the Event Horizon Telescope (EHT) are used to investigate possible deviations in the black hole shadow radius, in search of physics beyond the Standard Model (SM) arising from extra-dimensional theory. We consider the brane-world scenario described by the Randall-Sundrum model and determine the black hole shadow radius correction owing to the higher dimension. From data of the shadow radius in units of BH mass determined by KECK- and VLTI-based estimates, we impose restrictions on the deviation obtained and set an upper limit on the curvature radius of Anti-de Sitter ($\mathrm{AdS_{5}}$) spacetime $\ell\lesssim4.3\times10^{-2}\,\mathrm{AU}$ (at $95\%$ confidence level).
Submitted 5 July, 2024;
originally announced July 2024.
-
Improving Reward Models with Synthetic Critiques
Authors:
Zihuiwen Ye,
Fraser Greenlee-Scott,
Max Bartolo,
Phil Blunsom,
Jon Ander Campos,
Matthias Gallé
Abstract:
Reward models (RM) play a critical role in aligning language models through the process of reinforcement learning from human feedback. RMs are trained to predict a score reflecting human preference, which requires significant time and cost for human annotation. Additionally, RMs tend to quickly overfit on superficial features in the training set, hindering their generalization performance on unseen distributions. We propose a novel approach using synthetic natural language critiques generated by large language models to provide additional feedback, evaluating aspects such as instruction following, correctness, and style. This offers richer signals and more robust features for RMs to assess and score on. We demonstrate that high-quality critiques improve the performance and data efficiency of RMs initialized from different pretrained models. Conversely, we also show that low-quality critiques negatively impact performance. Furthermore, incorporating critiques enhances the interpretability and robustness of RM training.
Submitted 31 May, 2024;
originally announced May 2024.
-
Aya 23: Open Weight Releases to Further Multilingual Progress
Authors:
Viraat Aryabumi,
John Dang,
Dwarak Talupuru,
Saurabh Dash,
David Cairuz,
Hangyu Lin,
Bharat Venkitesh,
Madeline Smith,
Jon Ander Campos,
Yi Chern Tan,
Kelly Marchisio,
Max Bartolo,
Sebastian Ruder,
Acyr Locatelli,
Julia Kreutzer,
Nick Frosst,
Aidan Gomez,
Phil Blunsom,
Marzieh Fadaee,
Ahmet Üstün,
Sara Hooker
Abstract:
This technical report introduces Aya 23, a family of multilingual language models. Aya 23 builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a highly performant pre-trained model with the recently released Aya collection (Singh et al., 2024). The result is a powerful multilingual large language model serving 23 languages, expanding state-of-the-art language modeling capabilities to approximately half of the world's population. The Aya model covered 101 languages, whereas Aya 23 is an experiment in depth vs breadth, exploring the impact of allocating more capacity to fewer languages that are included during pre-training. Aya 23 outperforms both previous massively multilingual models like Aya 101 for the languages it covers, as well as widely used models like Gemma, Mistral and Mixtral on an extensive range of discriminative and generative tasks. We release the open weights for both the 8B and 35B models as part of our continued commitment to expanding access to multilingual progress.
Submitted 31 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively
Authors:
Tiziano Labruna,
Jon Ander Campos,
Gorka Azkune
Abstract:
In this paper, we demonstrate how Large Language Models (LLMs) can effectively learn to use an off-the-shelf information retrieval (IR) system specifically when additional context is required to answer a given question. Given the performance of IR systems, the optimal strategy for question answering does not always entail external information retrieval; rather, it often involves leveraging the parametric memory of the LLM itself. Prior research has identified this phenomenon in the PopQA dataset, wherein the most popular questions are effectively addressed using the LLM's parametric memory, while less popular ones require IR system usage. Following this, we propose a tailored training approach for LLMs, leveraging existing open-domain question answering datasets. Here, LLMs are trained to generate a special token, <RET>, when they do not know the answer to a question. Our evaluation of the Adaptive Retrieval LLM (Adapt-LLM) on the PopQA dataset showcases improvements over the same LLM under three configurations: (i) retrieving information for all the questions, (ii) using always the parametric memory of the LLM, and (iii) using a popularity threshold to decide when to use a retriever. Through our analysis, we demonstrate that Adapt-LLM is able to generate the <RET> token when it determines that it does not know how to answer a question, indicating the need for IR, while it achieves notably high accuracy levels when it chooses to rely only on its parametric memory.
Submitted 6 May, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
The Active Asteroids Citizen Science Program: Overview and First Results
Authors:
Colin Orion Chandler,
Chadwick A. Trujillo,
William J. Oldroyd,
Jay K. Kueny,
William A. Burris,
Henry H. Hsieh,
Jarod A. DeSpain,
Nima Sedaghat,
Scott S. Sheppard,
Kennedy A. Farrell,
David E. Trilling,
Annika Gustafsson,
Mark Jesus Mendoza Magbanua,
Michele T. Mazzucato,
Milton K. D. Bosch,
Tiffany Shaw-Diaz,
Virgilio Gonano,
Al Lamperti,
José A. da Silva Campos,
Brian L. Goodwin,
Ivan A. Terentev,
Charles J. A. Dukes,
Sam Deen
Abstract:
We present the Citizen Science program Active Asteroids and describe discoveries stemming from our ongoing project. Our NASA Partner program is hosted on the Zooniverse online platform and launched on 2021 August 31, with the goal of engaging the community in the search for active asteroids -- asteroids with comet-like tails or comae. We also set out to identify other unusual active solar system objects, such as active Centaurs, active quasi-Hilda asteroids, and Jupiter-family comets (JFCs). Active objects are rare in large part because they are difficult to identify, so we ask volunteers to assist us in searching for active bodies in our collection of millions of images of known minor planets. We produced these cutout images with our project pipeline that makes use of publicly available Dark Energy Camera (DECam) data. Since the project launch, roughly 8,300 volunteers have scrutinized some 430,000 images to great effect, which we describe in this work. In total we have identified previously unknown activity on 15 asteroids, plus one Centaur, that were thought to be asteroidal (i.e., inactive). Of the asteroids, we classify four as active quasi-Hilda asteroids, seven as JFCs, and four as active asteroids, consisting of one Main-belt comet (MBC) and three MBC candidates. We also include our findings concerning known active objects that our program facilitated, an unanticipated avenue of scientific discovery. These include discovering activity occurring during an orbital epoch for which objects were not known to be active, and the reclassification of objects based on our dynamical analyses.
Submitted 14 March, 2024;
originally announced March 2024.
-
Weak Coupling Regime in Dilatonic f(R,T) Cosmology
Authors:
F. A. Brito,
C. H. A. B. Borges,
J. A. V. Campos,
F. G. Costa
Abstract:
We consider $f(R,T)$ modified theories of gravity in the context of string theory inspired dilaton gravity. We deal with a specific model that under certain conditions describes the late time Universe in accord with observational data in modern cosmology and addresses the $H_0$ tension. This is done by exploring the space of parameters made out of those coming from the modified gravity and dilatonic charge sectors. We employ numerical methods to obtain several important observable quantities.
Submitted 8 March, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Absorption, scattering, quasinormal modes and shadow by canonical acoustic black holes in Lorentz-violating background
Authors:
J. A. V. Campos,
M. A. Anacleto,
F. A. Brito,
E. Passos
Abstract:
In the present work, we study the scattering for a black hole described by the canonical acoustic metric with Lorentz violation using asymptotic and numerical methods. In this scenario, we also check the effects of quasinormal modes and the acoustic shadow radius. In the eikonal limit the relationship between the shadow radius and the real part of the quasinormal frequency is preserved.
Submitted 10 June, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
NLP Evaluation in trouble: On the Need to Measure LLM Data Contamination for each Benchmark
Authors:
Oscar Sainz,
Jon Ander Campos,
Iker García-Ferrero,
Julen Etxaniz,
Oier Lopez de Lacalle,
Eneko Agirre
Abstract:
In this position paper, we argue that the classical evaluation on Natural Language Processing (NLP) tasks using annotated benchmarks is in trouble. The worst kind of data contamination happens when a Large Language Model (LLM) is trained on the test split of a benchmark, and then evaluated in the same benchmark. The extent of the problem is unknown, as it is not straightforward to measure. Contamination causes an overestimation of the performance of a contaminated model in a target benchmark and associated task with respect to their non-contaminated counterparts. The consequences can be very harmful, with wrong scientific conclusions being published while other correct ones are discarded. This position paper defines different levels of data contamination and argues for a community effort, including the development of automatic and semi-automatic measures to detect when data from a benchmark was exposed to a model, and suggestions for flagging papers with conclusions that are compromised by data contamination.
Submitted 27 October, 2023;
originally announced October 2023.
-
Unsupervised Domain Adaption for Neural Information Retrieval
Authors:
Carlos Dominguez,
Jon Ander Campos,
Eneko Agirre,
Gorka Azkune
Abstract:
Neural information retrieval requires costly annotated data for each target domain to be competitive. Synthetic annotation by query generation using Large Language Models or rule-based string manipulation has been proposed as an alternative, but their relative merits have not been analysed. In this paper, we compare both methods head-to-head using the same neural IR architecture. We focus on the BEIR benchmark, which includes test datasets from several domains with no training data, and explore two scenarios: zero-shot, where the supervised system is trained in a large out-of-domain dataset (MS-MARCO); and unsupervised domain adaptation, where, in addition to MS-MARCO, the system is fine-tuned in synthetic data from the target domain. Our results indicate that Large Language Models outperform rule-based methods in all scenarios by a large margin, and, more importantly, that unsupervised domain adaptation is effective compared to applying a supervised IR system in a zero-shot fashion. In addition we explore several sizes of open Large Language Models to generate synthetic data and find that a medium-sized model suffices. Code and models are publicly available for reproducibility.
Submitted 13 October, 2023;
originally announced October 2023.
-
Scattering and absorption by extra-dimensional black holes with GUP
Authors:
M. A. Anacleto,
J. A. V. Campos,
F. A. Brito,
E. Maciel,
E. Passos
Abstract:
In this paper, we consider the Schwarzschild-Tangherlini black hole to investigate the process of scalar wave scattering by the black hole in a spacetime of (d + 1) dimensions and also with the generalized uncertainty principle (GUP). In this scenario, we analytically determine the phase shift and explore the effect of extra dimensions by calculating the differential scattering and absorption cross-section by applying the partial wave method at low and high-frequency limits. We show at high dimensions that the absorption is not zero as the mass parameter approaches zero.
Submitted 10 July, 2024; v1 submitted 18 July, 2023;
originally announced July 2023.
-
IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named Entity Recognition using Knowledge Bases
Authors:
Iker García-Ferrero,
Jon Ander Campos,
Oscar Sainz,
Ander Salaberria,
Dan Roth
Abstract:
Named Entity Recognition (NER) is a core natural language processing task in which pre-trained language models have shown remarkable performance. However, standard benchmarks like CoNLL 2003 do not address many of the challenges that deployed NER systems face, such as having to classify emerging or complex entities in a fine-grained way. In this paper we present a novel NER cascade approach comprising three steps: first, identifying candidate entities in the input sentence; second, linking each candidate to an existing knowledge base; third, predicting the fine-grained category for each entity candidate. We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities. Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting where we leverage knowledge bases of high-resource languages.
Submitted 27 April, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
Training Language Models with Language Feedback at Scale
Authors:
Jérémy Scheurer,
Jon Ander Campos,
Tomasz Korbak,
Jun Shern Chan,
Angelica Chen,
Kyunghyun Cho,
Ethan Perez
Abstract:
Pretrained language models often generate outputs that are not in line with human preferences, such as harmful text or factually incorrect summaries. Recent work approaches the above issues by learning from a simple form of human feedback: comparisons between pairs of model-generated outputs. However, comparison feedback only conveys limited information about human preferences. In this paper, we introduce Imitation learning from Language Feedback (ILF), a new approach that utilizes more informative language feedback. ILF consists of three steps that are applied iteratively: first, conditioning the language model on the input, an initial LM output, and feedback to generate refinements. Second, selecting the refinement incorporating the most feedback. Third, finetuning the language model to maximize the likelihood of the chosen refinement given the input. We show theoretically that ILF can be viewed as Bayesian Inference, similar to Reinforcement Learning from human feedback. We evaluate ILF's effectiveness on a carefully-controlled toy task and a realistic summarization task. Our experiments demonstrate that large language models accurately incorporate feedback and that finetuning with ILF scales well with the dataset size, even outperforming finetuning on human summaries. Learning from both language and comparison feedback outperforms learning from each alone, achieving human-level summarization performance.
Submitted 22 February, 2024; v1 submitted 28 March, 2023;
originally announced March 2023.
-
Improving Code Generation by Training with Natural Language Feedback
Authors:
Angelica Chen,
Jérémy Scheurer,
Tomasz Korbak,
Jon Ander Campos,
Jun Shern Chan,
Samuel R. Bowman,
Kyunghyun Cho,
Ethan Perez
Abstract:
The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF). ILF requires only a small amount of human-written feedback during training and does not require the same feedback at test time, making it both user-friendly and sample-efficient. We further show that ILF can be seen as a form of minimizing the KL divergence to the ground truth distribution and demonstrate a proof-of-concept on a neural program synthesis task. We use ILF to improve a Codegen-Mono 6.1B model's pass@1 rate by 38% relative (and 10% absolute) on the Mostly Basic Python Problems (MBPP) benchmark, outperforming both fine-tuning on MBPP and fine-tuning on repaired programs written by humans. Overall, our results suggest that learning from human-written natural language feedback is both more effective and sample-efficient than training exclusively on demonstrations for improving an LLM's performance on code generation tasks.
Submitted 22 February, 2024; v1 submitted 28 March, 2023;
originally announced March 2023.
-
Absorption, scattering and shadow by a noncommutative black hole with global monopole
Authors:
M. A. Anacleto,
F. A. Brito,
J. A. V. Campos,
E. Passos
Abstract:
In this paper, we investigate the process of massless scalar wave scattering due to a noncommutative black hole with a global monopole through the partial wave method. We computed the differential scattering and absorption cross sections in the low-frequency limit. We also calculated, in the high-frequency limit, the absorption and the shadow radius by the null geodesic method. We showed that noncommutativity causes a reduction in the differential scattering/absorption cross section and shadow radius, while the presence of the global monopole has the effect of increasing the value of such quantities. In the limit of vanishing mass parameter, we verify that the differential scattering cross section, the absorption cross section, and the shadow radius approach a non-zero value proportional to a minimum mass.
Submitted 28 December, 2022;
originally announced December 2022.
-
Training Language Models with Language Feedback
Authors:
Jérémy Scheurer,
Jon Ander Campos,
Jun Shern Chan,
Angelica Chen,
Kyunghyun Cho,
Ethan Perez
Abstract:
Pretrained language models often do not perform tasks in ways that are in line with our preferences, e.g., generating offensive text or factually incorrect summaries. Recent work approaches the above issue by learning from a simple form of human evaluation: comparisons between pairs of model-generated task outputs. Comparison feedback conveys limited information about human preferences per human evaluation. Here, we propose to learn from natural language feedback, which conveys more information per human evaluation. We learn from language feedback on model outputs using a three-step learning algorithm. First, we condition the language model on the initial output and feedback to generate many refinements. Second, we choose the refinement with the highest similarity to the feedback. Third, we finetune a language model to maximize the likelihood of the chosen refinement given the input. In synthetic experiments, we first evaluate whether language models accurately incorporate feedback to produce refinements, finding that only large language models (175B parameters) do so. Using only 100 samples of human-written feedback, our learning algorithm finetunes a GPT-3 model to roughly human-level summarization ability.
Submitted 17 November, 2022; v1 submitted 29 April, 2022;
originally announced April 2022.
-
Quasinormal modes and shadow of a Schwarzschild black hole with GUP
Authors:
M. A. Anacleto,
J. A. V. Campos,
F. A. Brito,
E. Passos
Abstract:
We consider quantum corrections for the Schwarzschild black hole metric by using the generalized uncertainty principle (GUP) to investigate quasinormal modes, shadow and their relationship in the eikonal limit. We calculate the quasinormal frequencies of the quantum-corrected Schwarzschild black hole by using the sixth-order Wentzel-Kramers-Brillouin (WKB) approximation, and also perform a numerical analysis that confirms the results obtained from this approach. We also find that the shadow radius is nonzero even in the limit of very small mass for a finite GUP parameter.
Submitted 12 November, 2021; v1 submitted 10 August, 2021;
originally announced August 2021.
-
Quasinormal modes and shadow of noncommutative black hole
Authors:
J. A. V. Campos,
M. A. Anacleto,
F. A. Brito,
E. Passos
Abstract:
In this paper we investigate quasinormal modes (QNM) for a scalar field around a noncommutative Schwarzschild black hole. We verify the effect of noncommutativity on quasinormal frequencies by applying two procedures widely used in the literature. The first is the Wentzel-Kramers-Brillouin (WKB) approximation up to sixth order. The second is the continued fraction method developed by Leaver. We also show that, due to noncommutativity, the shadow radius is reduced when we increase the noncommutative parameter. In addition, we find that the shadow radius is nonzero even in the zero mass limit for a finite noncommutative parameter.
Submitted 12 May, 2022; v1 submitted 19 March, 2021;
originally announced March 2021.
-
Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning
Authors:
Jon Ander Campos,
Kyunghyun Cho,
Arantxa Otegi,
Aitor Soroa,
Gorka Azkune,
Eneko Agirre
Abstract:
The interaction of conversational systems with users poses an exciting opportunity for improving them after deployment, but little evidence has been provided of its feasibility. In most applications, users are not able to provide the correct answer to the system, but they are able to provide binary (correct, incorrect) feedback. In this paper we propose feedback-weighted learning based on importance sampling to improve upon an initial supervised system using binary user feedback. We perform simulated experiments on document classification (for development) and Conversational Question Answering datasets like QuAC and DoQA, where binary user feedback is derived from gold annotations. The results show that our method is able to improve over the initial supervised system, getting close to a fully-supervised system that has access to the same labeled examples in in-domain experiments (QuAC), and even matching in out-of-domain experiments (DoQA). Our work opens the prospect to exploit interactions with real users and improve conversational systems after deployment.
Submitted 1 November, 2020;
originally announced November 2020.
-
Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems
Authors:
Jan Deriu,
Don Tuggener,
Pius von Däniken,
Jon Ander Campos,
Alvaro Rodrigo,
Thiziri Belkacem,
Aitor Soroa,
Eneko Agirre,
Mark Cieliebak
Abstract:
The lack of time-efficient and reliable evaluation methods hamper the development of conversational dialogue systems (chatbots). Evaluations requiring humans to converse with chatbots are time and cost-intensive, put high cognitive demands on the human judges, and yield low-quality results. In this work, we introduce \emph{Spot The Bot}, a cost-efficient and robust evaluation framework that replac…
▽ More
The lack of time-efficient and reliable evaluation methods hamper the development of conversational dialogue systems (chatbots). Evaluations requiring humans to converse with chatbots are time and cost-intensive, put high cognitive demands on the human judges, and yield low-quality results. In this work, we introduce \emph{Spot The Bot}, a cost-efficient and robust evaluation framework that replaces human-bot conversations with conversations between bots. Human judges then only annotate for each entity in a conversation whether they think it is human or not (assuming there are humans participants in these conversations). These annotations then allow us to rank chatbots regarding their ability to mimic the conversational behavior of humans. Since we expect that all bots are eventually recognized as such, we incorporate a metric that measures which chatbot can uphold human-like behavior the longest, i.e., \emph{Survival Analysis}. This metric has the ability to correlate a bot's performance to certain of its characteristics (e.g., \ fluency or sensibleness), yielding interpretable results. The comparably low cost of our framework allows for frequent evaluations of chatbots during their evaluation cycle. We empirically validate our claims by applying \emph{Spot The Bot} to three domains, evaluating several state-of-the-art chatbots, and drawing comparisons to related work. The framework is released as a ready-to-use tool.
Submitted 5 October, 2020;
originally announced October 2020.
-
DoQA -- Accessing Domain-Specific FAQs via Conversational QA
Authors:
Jon Ander Campos,
Arantxa Otegi,
Aitor Soroa,
Jan Deriu,
Mark Cieliebak,
Eneko Agirre
Abstract:
The goal of this work is to build conversational Question Answering (QA) interfaces for the large body of domain-specific information available in FAQ sites. We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs. The dialogues are collected from three Stack Exchange sites using the Wizard of Oz method with crowdsourcing. Compared to previous work, DoQA comprises well-defined information needs, leading to more coherent and natural conversations with fewer factoid questions, and is multi-domain. In addition, we introduce a more realistic information retrieval (IR) scenario where the system needs to find the answer in any of the FAQ documents. The results of an existing strong system show that, thanks to transfer learning from a Wikipedia QA dataset and fine-tuning on a single FAQ domain, it is possible to build high-quality conversational QA systems for FAQs without in-domain training data. The good results carry over into the more challenging IR scenario. In both cases, there is still ample room for improvement, as indicated by the higher human upper bound.
Submitted 18 May, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
Give your Text Representation Models some Love: the Case for Basque
Authors:
Rodrigo Agerri,
Iñaki San Vicente,
Jon Ander Campos,
Ander Barrena,
Xabier Saralegi,
Aitor Soroa,
Eneko Agirre
Abstract:
Word embeddings and pre-trained language models make it possible to build rich representations of text and have enabled improvements across most NLP tasks. Unfortunately, they are very expensive to train, and many small companies and research groups tend to use models that have been pre-trained and made available by third parties, rather than building their own. This is suboptimal as, for many languages, the models have been trained on smaller (or lower quality) corpora. In addition, monolingual pre-trained models for non-English languages are not always available. At best, models for those languages are included in multilingual versions, where each language shares the quota of substrings and parameters with the rest of the languages. This is particularly true for smaller languages such as Basque. In this paper we show that a number of monolingual models (FastText word embeddings, FLAIR and BERT language models) trained with larger Basque corpora produce much better results than publicly available versions in downstream NLP tasks, including topic classification, sentiment classification, PoS tagging and NER. This work sets a new state-of-the-art in those tasks for Basque. All benchmarks and models used in this work are publicly available.
Submitted 2 April, 2020; v1 submitted 31 March, 2020;
originally announced April 2020.
-
Quantum-corrected scattering and absorption of a Schwarzschild black hole with GUP
Authors:
M. A. Anacleto,
F. A. Brito,
J. A. V. Campos,
E. Passos
Abstract:
In this paper we implement quantum corrections to the Schwarzschild black hole metric using the generalized uncertainty principle (GUP) in order to investigate the scattering process. We mainly compute, in the low-energy limit, the differential scattering and absorption cross sections by using the partial wave method. We determine the phase shift analytically and verify that these quantities are modified by the GUP. We find that, due to the quantum corrections from the GUP, the absorption is not zero as the mass parameter goes to zero. A numerical analysis has also been performed for arbitrary frequencies.
Submitted 9 April, 2020; v1 submitted 27 March, 2020;
originally announced March 2020.
-
Absorption and scattering of a self-dual black hole
Authors:
M. A. Anacleto,
F. A. Brito,
J. A. V. Campos,
E. Passos
Abstract:
In this paper we investigate the scattering of massless scalar waves by a self-dual black hole through the partial wave method. We calculate the phase shift analytically in the low-energy limit and show that the dominant term of the differential cross section in the small-angle limit is modified by the presence of parameters related to the polymeric function and minimum area of a self-dual black hole. We also find that the absorption cross section is given by the event horizon area of the self-dual black hole in the low-frequency limit. We also show that, contrary to the case of a Schwarzschild black hole, the differential scattering/absorption cross section of a self-dual black hole is nonzero in the zero-mass limit. In addition, we verify these results by numerically solving the radial equation for arbitrary frequencies.
Submitted 27 February, 2020;
originally announced February 2020.
-
Absorption and scattering of a noncommutative black hole
Authors:
M. A. Anacleto,
F. A. Brito,
J. A. V. Campos,
E. Passos
Abstract:
In this paper we use the partial wave method to explore the scattering of massless scalar waves by the noncommutative Schwarzschild black hole with a Lorentzian smeared mass distribution. We determine the phase shift analytically in the low-frequency limit and show that the scattering and absorption cross sections are modified. In particular, we show that in the low-frequency limit the scattering/absorption cross section decreases as the value of the noncommutativity parameter increases. In addition, we have confirmed this result by solving the problem numerically for arbitrary frequencies. The modifications found for the scattering and absorption cross sections are similar to those of the Reissner-Nordström black hole.
Submitted 21 February, 2020; v1 submitted 29 July, 2019;
originally announced July 2019.
-
Higher-derivative analogue Aharonov-Bohm effect, absorption and superresonance
Authors:
M. A. Anacleto,
F. A. Brito,
J. A. V. Campos,
E. Passos
Abstract:
In this paper we consider the acoustic metric obtained from an Abelian Higgs model extended with higher derivative terms in the bosonic sector which describes an acoustic rotating black hole and analyze the phenomena of superresonance, analogue Aharonov-Bohm effect and absorption. We investigate these effects by computing the scalar wave scattering by a draining bathtub vortex and discuss the physical interpretation of higher derivative terms. Furthermore, we show that the absorption does not vanish as the draining parameter tends to zero.
Submitted 19 June, 2020; v1 submitted 31 October, 2018;
originally announced October 2018.
-
Oscillating soliton stars with network of domain walls
Authors:
Stephen Owusu,
Jose A. V. Campos,
Francisco A. Brito
Abstract:
In this work we study oscillating soliton stars (oscillatons) with a network of domain walls. We consider a Lagrangian with three scalar fields coupled among themselves by a specific potential. We choose an appropriate potential to admit the formation of a network of domain walls on the oscillatons. With small perturbations applied to this potential, we then solve the Einstein-Klein-Gordon (EKG) equations numerically and analyze the mass profile of this new object. From the results we discuss how the stability of the oscillatons is affected by the network. Under some conditions the network provides a `bouncing stability' to the oscillatons.
Submitted 17 December, 2023; v1 submitted 1 June, 2017;
originally announced June 2017.
-
Single Molecule Magnets and the Lipkin-Meshkov-Glick model
Authors:
Jorge A. Campos,
Jorge G. Hirsch
Abstract:
We discuss the description of quantum magnetization in the superparamagnetic compound Fe$_8$ using a generalization of the Lipkin-Meshkov-Glick Hamiltonian. We study the variation of the energy spectra and of the wave functions as functions of the intensity of an external magnetic field along the three magnetic anisotropy axes.
Submitted 2 August, 2011;
originally announced August 2011.