-
Designing an Evaluation Framework for Large Language Models in Astronomy Research
Authors:
John F. Wu,
Alina Hyk,
Kiera McCormick,
Christine Ye,
Simone Astarita,
Elina Baral,
Jo Ciuca,
Jesse Cranney,
Anjalie Field,
Kartheik Iyer,
Philipp Koehn,
Jenn Kotler,
Sandor Kruk,
Michelle Ntampaka,
Charles O'Neill,
Joshua E. G. Peek,
Sanjib Sharma,
Mikaeel Yunus
Abstract:
Large Language Models (LLMs) are shifting how scientific research is done. It is imperative to understand how researchers interact with these models and how scientific sub-communities like astronomy might benefit from them. However, there is currently no standard for evaluating the use of LLMs in astronomy. Therefore, we present the experimental design for an evaluation study on how astronomy researchers interact with LLMs. We deploy a Slack chatbot that can answer queries from users via Retrieval-Augmented Generation (RAG); these responses are grounded in astronomy papers from arXiv. We record and anonymize user questions and chatbot answers, user upvotes and downvotes to LLM responses, user feedback to the LLM, and retrieved documents and similarity scores with the query. Our data collection method will enable future dynamic evaluations of LLM tools for astronomy.
Submitted 30 May, 2024;
originally announced May 2024.
-
Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
Authors:
Charles O'Neill,
Thang Bui
Abstract:
This paper introduces an efficient and robust method for discovering interpretable circuits in large language models using discrete sparse autoencoders. Our approach addresses key limitations of existing techniques, namely computational complexity and sensitivity to hyperparameters. We propose training sparse autoencoders on carefully designed positive and negative examples, where the model can only correctly predict the next token for the positive examples. We hypothesise that learned representations of attention head outputs will signal when a head is engaged in specific computations. By discretising the learned representations into integer codes and measuring the overlap between codes unique to positive examples for each head, we enable direct identification of attention heads involved in circuits without the need for expensive ablations or architectural modifications. On three well-studied tasks - indirect object identification, greater-than comparisons, and docstring completion - the proposed method achieves higher precision and recall in recovering ground-truth circuits compared to state-of-the-art baselines, while reducing runtime from hours to seconds. Notably, we require only 5-10 text examples for each task to learn robust representations. Our findings highlight the promise of discrete sparse autoencoders for scalable and efficient mechanistic interpretability, offering a new direction for analysing the inner workings of large language models.
Submitted 21 May, 2024;
originally announced May 2024.
-
Infinite free resolutions over numerical semigroup algebras via specialization
Authors:
Tara Gomes,
Christopher O'Neill,
Aleksandra Sobieska,
Eduardo Torres Dávila
Abstract:
Each numerical semigroup $S$ with smallest positive element $m$ corresponds to an integer point in a polyhedral cone $C_m$, known as the Kunz cone. The faces of $C_m$ form a stratification of numerical semigroups that has been shown to respect a number of algebraic properties of $S$, including the combinatorial structure of the minimal free resolution of the defining toric ideal $I_S$. In this work, we prove that the structure of the infinite free resolution of the ground field $\Bbbk$ over the semigroup algebra $\Bbbk[S]$ also respects this stratification, yielding a new combinatorial approach to classifying homological properties like Golodness and rationality of the Poincaré series in this setting. Additionally, we give a complete classification of such resolutions in the special case $m = 4$, and demonstrate that the associated graded algebras do not generally respect the same stratification.
Submitted 2 May, 2024;
originally announced May 2024.
-
Families of numerical semigroups and a special case of the Huneke-Wiegand conjecture
Authors:
Miguel Landeros,
Christopher O'Neill,
Roberto Pelayo,
Karina Peña,
James Ren,
Brian Wissman
Abstract:
The Huneke-Wiegand conjecture is a decades-long open question in commutative algebra. García-Sánchez and Leamer showed that a special case of this conjecture concerning numerical semigroup rings $\Bbbk[Γ]$ can be answered in the affirmative by locating certain arithmetic sequences within the numerical semigroup $Γ$. In this paper, we use their approach to prove the Huneke-Wiegand conjecture in the case where $Γ$ is generated by a generalized arithmetic sequence and showcase how visualizations can be leveraged to find the requisite arithmetic sequences.
Submitted 18 April, 2024;
originally announced April 2024.
-
Perspicacious $l_p$ norm parameters
Authors:
Christopher O'Neill,
Vadim Ponomarenko,
Eric Ren
Abstract:
Fix $t\in [1,\infty]$. Let $S$ be an atomic commutative semigroup and, for all $x\in S$, let $\mathscr{L}_t(x):=\{\|f\|_t:f\in Z(x)\}$ be the "$t$-length set" of $x$ (using the standard $l_p$-space definition of $\|\cdot\|_t$). The $t$-Delta set of $x$ (denoted $Δ_t(x)$) is the set of gaps between consecutive elements of $\mathscr{L}_t(x)$; the $t$-Delta set of $S$ is then defined by $Δ_t(S) = \bigcup\limits_{x\in S} Δ_t(x)$. Though all existing literature on this topic considers the $1$-Delta set, recent results on the $t$-elasticity of numerical semigroups (Behera et al.) for $t\neq 1$ have brought attention to other invariants, such as the $t$-Delta set for $t\neq 1$, as well. Here we characterize $Δ_t(S)$ for all numerical semigroups $\langle a_1,a_2\rangle$ and all $t\in(1,\infty)$ outside a small family of extremal examples. We also determine the cardinality and describe the distribution of that aberrant family.
Submitted 2 April, 2024;
originally announced April 2024.
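The definitions in the abstract above can be made concrete with a small brute-force computation. The following is a minimal sketch, not the authors' method: it enumerates the factorizations of a single element of a numerical semigroup, takes the $t$-norm of each, and returns the gaps between consecutive values. The function names, the rounding used to merge floating-point duplicates, and the example semigroup generated by 6, 9 and 20 are illustrative assumptions.

```python
import math


def factorizations(n, generators):
    """All coefficient tuples (c_1, ..., c_k), c_i >= 0, with sum(c_i * g_i) == n."""
    gens = tuple(generators)

    def rec(remaining, idx):
        g = gens[idx]
        if idx == len(gens) - 1:
            # Last generator: it must absorb whatever remains.
            if remaining % g == 0:
                yield (remaining // g,)
            return
        for c in range(remaining // g + 1):
            for tail in rec(remaining - c * g, idx + 1):
                yield (c,) + tail

    return list(rec(n, 0))


def t_norm(f, t):
    """Standard l_p norm of a factorization f, with t = math.inf giving the max norm."""
    if t == math.inf:
        return float(max(f))
    return sum(c ** t for c in f) ** (1.0 / t)


def t_length_set(n, generators, t):
    """Sorted t-length set of n; rounding (an assumption) merges float duplicates."""
    return sorted({round(t_norm(f, t), 9) for f in factorizations(n, generators)})


def t_delta_set(n, generators, t):
    """Sorted set of gaps between consecutive elements of the t-length set of n."""
    lengths = t_length_set(n, generators, t)
    return sorted({round(b - a, 9) for a, b in zip(lengths, lengths[1:])})
```

Calling `t_delta_set(60, [6, 9, 20], 1)` gives the classical Delta set of the element 60, while passing `t=2` or `t=math.inf` switches the norm. The enumeration is exponential in the number of generators, so this is only suitable for small examples.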
-
Two Online Map Matching Algorithms Based on Analytic Hierarchy Process and Fuzzy Logic
Authors:
Jeremy J. Lin,
Tomoro Mochida,
Riley C. W. O'Neill,
Atsuro Yoshida,
Masashi Yamazaki,
Akinobu Sasada
Abstract:
The aim of this paper is to develop new map matching algorithms and to improve upon previous work. We address two key approaches: Analytic Hierarchy Process (AHP) map matching and fuzzy logic map matching. AHP is a decision-making method that combines mathematical analysis with human judgment, and fuzzy logic is an approach to computing that models imprecise modes of reasoning using degrees of truth between 0 and 1 rather than the usual Boolean logic. Of these algorithms, our application of AHP to map matching is newly developed in this paper, whereas our application of fuzzy logic to map matching is mostly the same as in existing research, apart from some small changes. We chose these methods because both are designed to handle imprecise information and both are simple to implement.
Submitted 19 February, 2024;
originally announced February 2024.
-
Measuring Sharpness in Grokking
Authors:
Jack Miller,
Patrick Gleeson,
Charles O'Neill,
Thang Bui,
Noam Levi
Abstract:
Neural networks sometimes exhibit grokking, a phenomenon where perfect or near-perfect performance is achieved on a validation set well after the same performance has been obtained on the corresponding training set. In this workshop paper, we introduce a robust technique for measuring grokking, based on fitting an appropriate functional form. We then use this to investigate the sharpness of transitions in training and validation accuracy under two settings. The first setting is the theoretical framework developed by Levi et al. (2023) where closed form expressions are readily accessible. The second setting is a two-layer MLP trained to predict the parity of bits, with grokking induced by the concealment strategy of Miller et al. (2023). We find that trends between relative grokking gap and grokking sharpness are similar in both settings when using absolute and relative measures of sharpness. Reflecting on this, we make progress toward explaining some trends and identify the need for further study to untangle the various mechanisms which influence the sharpness of grokking.
Submitted 14 February, 2024;
originally announced February 2024.
-
Counting edges in factorization graphs of numerical semigroup elements
Authors:
Mariah Moschetti,
Christopher O'Neill
Abstract:
A numerical semigroup $S$ is an additively-closed set of non-negative integers, and a factorization of an element $n$ of $S$ is an expression of $n$ as a sum of generators of $S$. It is known that for a given numerical semigroup $S$, the number of factorizations of $n$ coincides with a quasipolynomial (that is, a polynomial whose coefficients are periodic functions of $n$). One of the standard methods for computing certain semigroup-theoretic invariants involves assembling a graph or simplicial complex derived from the factorizations of $n$. In this paper, we prove that for two such graphs (which we call the factorization support graph and the trade graph), the number of edges coincides with a quasipolynomial function of $n$, and identify the degree, period, and leading coefficient of each. In the process, we uncover a surprising geometric connection: a combinatorially-assembled cubical complex that is homeomorphic to real projective space.
Submitted 9 May, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
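As a small illustration of the factorization counts mentioned in the abstract above, the sketch below counts the ways to write $n$ as a non-negative integer combination of the generators, using a standard coin-change style dynamic program. It does not construct the factorization support graph or the trade graph from the paper; the function name and the example generators 6, 9 and 20 are assumptions chosen for illustration, and the generators are assumed to be distinct positive integers.

```python
def count_factorizations(n, generators):
    """Number of ways to write n as a non-negative integer combination of the generators."""
    # counts[v] holds the number of factorizations of v using the generators seen so far.
    counts = [1] + [0] * n
    for g in generators:
        for value in range(g, n + 1):
            counts[value] += counts[value - g]
    return counts[n]
```

Tabulating `count_factorizations(n, [6, 9, 20])` over a range of `n` is one way to observe the periodic, polynomial-like growth that the quasipolynomial description predicts.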
-
Numerical semigroups, polyhedra, and posets IV: walking the faces of the Kunz cone
Authors:
Cole Brower,
Joseph McDonough,
Christopher O'Neill
Abstract:
A numerical semigroup is a cofinite subset of $\mathbb Z_{\ge 0}$ containing $0$ and closed under addition. Each numerical semigroup $S$ with smallest positive element $m$ corresponds to an integer point in the Kunz cone $\mathcal C_m \subseteq \mathbb R^{m-1}$, and the face of $\mathcal C_m$ containing that integer point determines certain algebraic properties of $S$. In this paper, we introduce the Kunz fan, a pure, polyhedral cone complex composed of a faithful projection of certain faces of $\mathcal C_m$. We characterize several aspects of the Kunz fan in terms of the combinatorics of Kunz nilsemigroups, which are known to index the faces of $\mathcal C_m$, and our results culminate in a method of "walking" the face lattice of the Kunz cone in a manner analogous to that of a Gröbner walk. We apply our results in several contexts, including a wealth of computational data obtained from the aforementioned "walks" and a proof of a recent conjecture concerning which numerical semigroups achieve the highest minimal presentation cardinality.
Submitted 11 January, 2024;
originally announced January 2024.
-
AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Authors:
Ernest Perkowski,
Rui Pan,
Tuan Dung Nguyen,
Yuan-Sen Ting,
Sandor Kruk,
Tong Zhang,
Charlie O'Neill,
Maja Jablonska,
Zechang Sun,
Michael J. Smith,
Huiling Liu,
Kevin Schawinski,
Kartheik Iyer,
Ioana Ciucă for UniverseTBD
Abstract:
We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like GPT-4 excel in broader question-answering scenarios due to superior reasoning capabilities, our findings suggest that continual pre-training with limited resources can still enhance model performance on specialized topics. Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use. Comprehensive quantitative benchmarking is currently in progress and will be detailed in an upcoming full paper. The model, AstroLLaMA-Chat, is now available at https://huggingface.co/universeTBD, providing the first open-source conversational AI tool tailored for the astronomy community.
Submitted 5 January, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
The structure theorem for sets of length for numerical semigroups
Authors:
Gilad Moskowitz,
Christopher O'Neill
Abstract:
For sufficiently nice families of semigroups and monoids, the structure theorem for sets of length states that the length set of any sufficiently large element is an arithmetic sequence with some values omitted near the ends. In this paper, we prove a specialized version of the structure theorem that holds for any numerical semigroup $S$. Our description utilizes two other numerical semigroups $S_{\mathsf M}$ and $S_{\mathsf m}$, derived from the generators of $S$: for sufficiently large $n \in S$, the Apéry sets of $S_{\mathsf M}$ and $S_{\mathsf m}$ specify precisely which lengths appear in the length set of $n$, and their gaps specify which lengths are "missing". We also provide an explicit bound on which elements satisfy the structure theorem.
Submitted 9 November, 2023;
originally announced November 2023.
-
Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity
Authors:
Jack Miller,
Charles O'Neill,
Thang Bui
Abstract:
In some settings neural networks exhibit a phenomenon known as grokking, where they achieve perfect or near-perfect accuracy on the validation set long after the same performance has been achieved on the training set. In this paper, we discover that grokking is not limited to neural networks but occurs in other settings such as Gaussian process (GP) classification, GP regression, linear regression and Bayesian neural networks. We also uncover a mechanism by which to induce grokking on algorithmic datasets via the addition of dimensions containing spurious information. The presence of the phenomenon in non-neural architectures shows that grokking is not restricted to settings considered in current theoretical and empirical studies. Instead, grokking may be possible in any model where solution search is guided by complexity and error.
Submitted 31 March, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Atomic density of arithmetical congruence monoids
Authors:
Nils Olsson,
Christopher O'Neill,
Derek Rawling
Abstract:
Consider the set $M_{a,b} = \{n \in \mathbb Z_{\ge 1} : n \equiv a \bmod b\} \cup \{1\}$ for $a, b \in \mathbb Z_{\ge 1}$. If $a^2 \equiv a \bmod b$, then $M_{a,b}$ is closed under multiplication and known as an arithmetic congruence monoid (ACM). A non-unit $n \in M_{a,b}$ is an atom if it cannot be expressed as a product of non-units, and the atomic density of $M_{a,b}$ is the limiting proportion of elements that are atoms. In this paper, we characterize the atomic density of $M_{a,b}$ in terms of $a$ and $b$.
Submitted 11 October, 2023;
originally announced October 2023.
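The notion of an atom in $M_{a,b}$ from the abstract above can be explored numerically. The sketch below is a brute-force illustration, not the characterization proved in the paper: it lists the non-unit elements of $M_{a,b}$ up to a bound, marks every product of two non-units as reducible, and reports the remaining atoms. The function names and the bound are assumptions, and the proportion computed at a finite bound is only an estimate of the limiting atomic density. The code assumes $a^2 \equiv a \bmod b$, so that products of elements stay in the monoid.

```python
def acm_nonunits(a, b, bound):
    """Non-unit elements of M_{a,b} in [2, bound], in increasing order."""
    return [n for n in range(2, bound + 1) if n % b == a % b]


def acm_atoms(a, b, bound):
    """Atoms of M_{a,b} up to bound: non-units that are not a product of two non-units."""
    elems = acm_nonunits(a, b, bound)
    reducible = set()
    for i, x in enumerate(elems):
        if x * x > bound:
            break
        for y in elems[i:]:
            if x * y > bound:
                break
            reducible.add(x * y)
    return [n for n in elems if n not in reducible]


def atom_proportion(a, b, bound):
    """Fraction of non-unit elements up to bound that are atoms (finite estimate)."""
    elems = acm_nonunits(a, b, bound)
    return len(acm_atoms(a, b, bound)) / len(elems) if elems else 0.0
```

For example, `acm_atoms(1, 4, 50)` treats 9, 21 and 49 as atoms of $M_{1,4}$, since their integer factors 3 and 7 do not lie in the monoid, while 25 and 45 are reducible.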
-
Minimal free resolutions of numerical semigroup algebras via Apéry specialization
Authors:
Benjamin Braun,
Tara Gomes,
Ezra Miller,
Christopher O'Neill,
Aleksandra Sobieska
Abstract:
Numerical semigroups with multiplicity $m$ are parameterized by integer points in a polyhedral cone $C_m$, according to Kunz. For the toric ideal of any such semigroup, the main result here constructs a free resolution whose overall structure is identical for all semigroups parametrized by the relative interior of a fixed face of $C_m$. The matrix entries of this resolution are monomials whose exponents are parametrized by the coordinates of the corresponding point in $C_m$, and minimality of the resolution is achieved when the semigroup is maximal embedding dimension, which is the case parametrized by the interior of $C_m$ itself.
Submitted 21 June, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
On faces of the Kunz cone and the numerical semigroups within them
Authors:
Levi Borevitz,
Tara Gomes,
Jiajie Ma,
Harper Niergarth,
Christopher O'Neill,
Daniel Pocklington,
Rosa Stolk,
Jessica Wang,
Shuhang Xue
Abstract:
A numerical semigroup is a cofinite subset of the non-negative integers that is closed under addition and contains 0. Each numerical semigroup $S$ with fixed smallest positive element $m$ corresponds to an integer point in a rational polyhedral cone $\mathcal C_m$, called the Kunz cone. Moreover, numerical semigroups corresponding to points in the same face $F \subseteq \mathcal C_m$ are known to share many properties, such as the number of minimal generators. In this work, we classify which faces of $\mathcal C_m$ contain points corresponding to numerical semigroups. Additionally, we obtain sharp bounds on the number of minimal generators of $S$ in terms of the dimension of the face of $\mathcal C_m$ containing the point corresponding to $S$.
Submitted 14 September, 2023;
originally announced September 2023.
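To make the correspondence between a numerical semigroup and a point more tangible, the sketch below computes the Apéry set of $S$ with respect to its smallest generator $m$, that is, the smallest element of $S$ in each residue class modulo $m$. It then checks the inequalities $a_i + a_j \ge a_{(i+j) \bmod m}$. Using the Apéry set as the coordinates of the point and these inequalities as the description of $\mathcal C_m$ is an assumption here, since the abstract states neither; the function names are hypothetical. The generators are assumed to have greatest common divisor 1, so that every residue class is reached.

```python
import heapq


def apery_set(generators):
    """Smallest element of the semigroup in each residue class mod m = min(generators)."""
    m = min(generators)
    best = {0: 0}
    heap = [(0, 0)]  # (value, residue); Dijkstra over residue classes
    while heap:
        value, residue = heapq.heappop(heap)
        if value > best.get(residue, float("inf")):
            continue  # stale heap entry
        for g in generators:
            new_value = value + g
            new_residue = new_value % m
            if new_value < best.get(new_residue, float("inf")):
                best[new_residue] = new_value
                heapq.heappush(heap, (new_value, new_residue))
    return [best[i] for i in range(m)]


def satisfies_kunz_inequalities(apery):
    """Check a_i + a_j >= a_{(i+j) mod m} for all nonzero residues i, j."""
    m = len(apery)
    return all(
        apery[i] + apery[j] >= apery[(i + j) % m]
        for i in range(1, m)
        for j in range(1, m)
    )
```

For the semigroup generated by 6, 9 and 20, `apery_set([6, 9, 20])` returns one value per residue modulo 6, and the largest value minus 6 recovers the largest integer outside the semigroup.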
-
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Authors:
Tuan Dung Nguyen,
Yuan-Sen Ting,
Ioana Ciucă,
Charlie O'Neill,
Ze-Chang Sun,
Maja Jabłońska,
Sandor Kruk,
Ernest Perkowski,
Jack Miller,
Jason Li,
Josh Peek,
Kartheik Iyer,
Tomasz Różański,
Pranav Khetarpal,
Sharaf Zaman,
David Brodrick,
Sergio J. Rodríguez Méndez,
Thang Bui,
Alyssa Goodman,
Alberto Accomazzi,
Jill Naiman,
Jesse Cranney,
Kevin Schawinski,
UniverseTBD
Abstract:
Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model generates more insightful and scientifically relevant text completions and embedding extraction than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
Submitted 12 September, 2023;
originally announced September 2023.
-
Adversarial Fine-Tuning of Language Models: An Iterative Optimisation Approach for the Generation and Detection of Problematic Content
Authors:
Charles O'Neill,
Jack Miller,
Ioana Ciuca,
Yuan-Sen Ting,
Thang Bui
Abstract:
In this paper, we tackle the emerging challenge of unintended harmful content generation in Large Language Models (LLMs) with a novel dual-stage optimisation technique using adversarial fine-tuning. Our two-pronged approach employs an adversarial model, fine-tuned to generate potentially harmful prompts, and a judge model, iteratively optimised to discern these prompts. In this adversarial cycle, the two models seek to outperform each other in the prompting phase, generating a dataset of rich examples which are then used for fine-tuning. This iterative application of prompting and fine-tuning allows continuous refinement and improved performance. The performance of our approach is evaluated through classification accuracy on a dataset consisting of problematic prompts not detected by GPT-4, as well as a selection of contentious but unproblematic prompts. We show a considerable increase in classification accuracy of the judge model on this challenging dataset as it undergoes the optimisation process. Furthermore, we show that a rudimentary model, ada, can achieve 13% higher accuracy on the hold-out test set than GPT-4 after only a few rounds of this process, and that this fine-tuning improves performance in parallel tasks such as toxic comment identification.
Submitted 26 August, 2023;
originally announced August 2023.
-
Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation
Authors:
Charles O'Neill,
Yuan-Sen Ting,
Ioana Ciuca,
Jack Miller,
Thang Bui
Abstract:
Large Language Models (LLMs) hold immense potential to generate synthetic data of high quality and utility, which has numerous applications from downstream model training to practical data utilisation. However, contemporary models, despite their impressive capacities, consistently struggle to produce both coherent and diverse data. To address the coherency issue, we introduce contrastive expert guidance, where the difference between the logit distributions of fine-tuned and base language models is emphasised to ensure domain adherence. In order to ensure diversity, we utilise existing real and synthetic examples as negative prompts to the model. We call this dual-pronged approach to logit reshaping STEER: Semantic Text Enhancement via Embedding Repositioning. STEER operates at inference-time and systematically guides the LLMs to strike a balance between adherence to the data distribution (ensuring semantic fidelity) and deviation from prior synthetic examples or existing real datasets (ensuring diversity and authenticity). This delicate balancing act is achieved by dynamically moving towards or away from chosen representations in the latent space. STEER demonstrates improved performance over previous synthetic data generation techniques, exhibiting better balance between data diversity and coherency across three distinct tasks: hypothesis generation, toxic and non-toxic comment generation, and commonsense reasoning task generation. We demonstrate how STEER allows for fine-tuned control over the diversity-coherency trade-off via its hyperparameters, highlighting its versatility.
Submitted 17 August, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Numerical semigroups via projections and via quotients
Authors:
Tristram Bogart,
Christopher O'Neill,
Kevin Woods
Abstract:
We examine two natural operations to create numerical semigroups. We say that a numerical semigroup $\mathcal{S}$ is $k$-normalescent if it is the projection of the set of integer points in a $k$-dimensional polyhedral cone, and we say that $\mathcal{S}$ is a $k$-quotient if it is the quotient of a numerical semigroup with $k$ generators. We prove that all $k$-quotients are $k$-normalescent, and although the converse is false in general, we prove that the projection of the set of integer points in a cone with $k$ extreme rays (possibly lying in a dimension smaller than $k$) is a $k$-quotient. The discrete geometric perspective of studying cones is useful for studying $k$-quotients: in particular, we use it to prove that the sum of a $k_1$-quotient and a $k_2$-quotient is a $(k_1+k_2)$-quotient. In addition, we prove several results about when a numerical semigroup is not $k$-normalescent.
Submitted 13 April, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Rice paddy disease classifications using CNNs
Authors:
Charles O'Neill
Abstract:
Rice is a staple food in the world's diet, and yet huge percentages of crop yields are lost each year to disease. To combat this problem, people have been searching for ways to automate disease diagnosis. Here, we extend previous modelling work by analysing how sensitive disease-classification accuracy is to both model architecture and common computer vision techniques. In doing so, we maximise accuracy whilst working within the constraints of smaller model sizes, minimal GPU resources and shorter training times. Whilst previous state-of-the-art models achieved 93% accuracy when predicting only 5 diseases, we improve this to 98.7% using 10 disease classes.
△ Less
Submitted 15 March, 2023;
originally announced March 2023.
-
Eigenvalue initialisation and regularisation for Koopman autoencoders
Authors:
Jack W. Miller,
Charles O'Neill,
Navid C. Constantinou,
Omri Azencot
Abstract:
Regularising the parameter matrices of neural networks is ubiquitous in training deep models. Typical regularisation approaches suggest initialising weights using small random values and penalising weights to promote sparsity. However, these widely used techniques may be less effective in certain scenarios. Here, we study the Koopman autoencoder model, which includes an encoder, a Koopman operator layer, and a decoder. These models have been designed and dedicated to tackle physics-related problems with interpretable dynamics and an ability to incorporate physics-related constraints. However, the majority of existing work employs standard regularisation practices. In our work, we take a step toward augmenting Koopman autoencoders with initialisation and penalty schemes tailored for physics-related settings. Specifically, we propose the "eigeninit" initialisation scheme that samples initial Koopman operators from specific eigenvalue distributions. In addition, we suggest the "eigenloss" penalty scheme that penalises the eigenvalues of the Koopman operator during training. We demonstrate the utility of these schemes on two synthetic data sets: a driven pendulum and flow past a cylinder; and two real-world problems: ocean surface temperatures and cyclone wind fields. We find on these datasets that eigenloss and eigeninit improve the convergence rate by up to a factor of 5, and that they reduce the cumulative long-term prediction error by up to a factor of 3. Such a finding points to the utility of incorporating similar schemes as an inductive bias in other physics-related deep learning approaches.
Submitted 25 December, 2022; v1 submitted 22 December, 2022;
originally announced December 2022.
-
When is a numerical semigroup a quotient?
Authors:
Tristram Bogart,
Christopher O'Neill,
Kevin Woods
Abstract:
A natural operation on numerical semigroups is taking a quotient by a positive integer. If $\mathcal S$ is a quotient of a numerical semigroup with $k$ generators, we call $\mathcal S$ a $k$-quotient. We give a necessary condition for a given numerical semigroup $\mathcal S$ to be a $k$-quotient, and present, for each $k \ge 3$, the first known family of numerical semigroups that cannot be written as a $k$-quotient. We also examine the probability that a randomly selected numerical semigroup with $k$ generators is a $k$-quotient.
Submitted 16 December, 2022;
originally announced December 2022.
-
Unsupervised language models for disease variant prediction
Authors:
Allan Zhou,
Nicholas C. Landolfi,
Daniel C. O'Neill
Abstract:
There is considerable interest in predicting the pathogenicity of protein variants in human genes. Due to the sparsity of high quality labels, recent approaches turn to unsupervised learning, using Multiple Sequence Alignments (MSAs) to train generative models of natural sequence variation within each gene. These generative models then predict variant likelihood as a proxy for evolutionary fitness. In this work we instead combine this evolutionary principle with pretrained protein language models (LMs), which have already shown promising results in predicting protein structure and function. Instead of training separate models per-gene, we find that a single protein LM trained on broad sequence datasets can score pathogenicity for any gene variant zero-shot, without MSAs or finetuning. We call this unsupervised approach VELM (Variant Effect via Language Models), and show that it achieves scoring performance comparable to the state of the art when evaluated on clinically labeled variants of disease-related genes.
Submitted 7 December, 2022;
originally announced December 2022.
-
Convexity in (colored) affine semigroups
Authors:
Jesus A. De Loera,
Christopher O'Neill,
Chengyang Wang
Abstract:
In this paper, we explore affine semigroup versions of the convex geometry theorems of Helly, Tverberg, and Caratheodory. Additionally, we develop a new theory of colored affine semigroups, where the semigroup generators each receive a color and the elements of the semigroup take into account the colors used (the classical theory of affine semigroups coincides with the case in which all generators have the same color). We prove an analog of Tverberg's theorem and colorful Helly's theorem for semigroups, as well as a version of colorful Caratheodory's theorem for cones. We also demonstrate that colored numerical semigroups are particularly rich by introducing a colored version of the Frobenius number.
Submitted 4 October, 2023; v1 submitted 5 December, 2022;
originally announced December 2022.
-
Graver bases of shifted numerical semigroups with 3 generators
Authors:
James Howard,
Christopher O'Neill
Abstract:
A numerical semigroup $M$ is a subset of the non-negative integers that is closed under addition. A factorization of $n \in M$ is an expression of $n$ as a sum of generators of $M$, and the Graver basis of $M$ is a collection $Gr(M_t)$ of trades between the generators of $M$ that allows for efficient movement between factorizations. Given positive integers $r_1, \ldots, r_k$, consider the family $M_t = \langle t + r_1, \ldots, t + r_k\rangle$ of "shifted" numerical semigroups whose generators are obtained by translating $r_1, \ldots, r_k$ by an integer parameter $t$. In this paper, we characterize the Graver basis $Gr(M_t)$ of $M_t$ for sufficiently large $t$ in the case $k = 3$, in the form of a recursive construction of $Gr(M_t)$ from that of smaller values of $t$. As a consequence of our result, the number of trades in $Gr(M_t)$, when viewed as a function of $t$, is eventually quasilinear. We also obtain a sharp lower bound on the start of quasilinear behavior.
Submitted 10 December, 2022; v1 submitted 5 December, 2022;
originally announced December 2022.
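The notion of a factorization used in the abstract above can be made concrete with a short sketch. The function name `factorizations` and the generators 6, 9, 20 are illustrative assumptions, not taken from the paper, and the brute-force enumeration is not the paper's method. It lists every way to write $n$ as a non-negative integer combination of the generators; the difference between two factorizations of the same element illustrates the kind of trade the abstract refers to.

```python
def factorizations(n, gens):
    """List all tuples (a_1, ..., a_k) of non-negative integers
    with a_1*gens[0] + ... + a_k*gens[k-1] == n."""
    if not gens:
        # No generators left: only the empty sum, which equals 0.
        return [()] if n == 0 else []
    g, rest = gens[0], gens[1:]
    result = []
    for a in range(n // g + 1):
        # Use the first generator a times, then factor the remainder.
        for tail in factorizations(n - a * g, rest):
            result.append((a,) + tail)
    return result


# Hypothetical example: M = <6, 9, 20>, n = 18.
facts = factorizations(18, (6, 9, 20))
# Difference between the two factorizations found (one example of a trade).
trade = tuple(x - y for x, y in zip(facts[1], facts[0]))
```

Here 18 = 9 + 9 = 6 + 6 + 6, so the two factorizations have lengths 2 and 3.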
-
Enumerating numerical sets associated to a numerical semigroup
Authors:
April Chen,
Nathan Kaplan,
Liam Lawson,
Christopher O'Neill,
Deepesh Singhal
Abstract:
A numerical set $T$ is a subset of $\mathbb N_0$ that contains $0$ and has finite complement. The atom monoid of $T$ is the set of $x \in \mathbb N_0$ such that $x+T \subseteq T$. Marzuola and Miller introduced the anti-atom problem: how many numerical sets have a given atom monoid? This is equivalent to asking for the number of integer partitions with a given set of hook lengths. We introduce the void poset of a numerical semigroup $S$ and show that numerical sets with atom monoid $S$ are in bijection with certain order ideals of this poset. We use this characterization to answer the anti-atom problem when $S$ has small type.
Submitted 16 June, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
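The definition of the atom monoid in the abstract above, $\{x \in \mathbb N_0 : x + T \subseteq T\}$, can be checked directly on small examples. The following is a minimal sketch, assuming the numerical set $T$ is represented by its finite complement (its gaps); the function name `atom_monoid_gaps` is hypothetical and the approach is a brute-force check, not the void-poset method of the paper. It relies on the observation that $x + T \subseteq T$ exactly when $g - x$ is a gap for every gap $g \ge x$.

```python
def atom_monoid_gaps(gaps):
    """Given the gaps (complement in N_0) of a numerical set T,
    return the gaps of its atom monoid {x : x + T is a subset of T}."""
    gaps = set(gaps)
    if 0 in gaps:
        raise ValueError("a numerical set must contain 0")
    if not gaps:
        return set()  # T = N_0, whose atom monoid is N_0 itself
    largest_gap = max(gaps)

    def shifts_into_T(x):
        # x + t lands on a gap g only if t = g - x lies in T,
        # so x + T stays inside T iff g - x is a gap for every gap g >= x.
        return all((g - x) in gaps for g in gaps if g >= x)

    # Every x above the largest gap is automatically in the atom monoid.
    return {x for x in range(1, largest_gap + 1) if not shifts_into_T(x)}
```

For instance, with gaps {2} we have T = {0, 1, 3, 4, ...}, which is not closed under addition, and the atom monoid is {0, 3, 4, ...}. When T is already a numerical semigroup, such as the one with gaps {1, 2, 3, 5}, the atom monoid is T itself.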
-
On the cardinality of minimal presentations of numerical semigroups
Authors:
Ceyhun Elmacioglu,
Kieran Hilmer,
Christopher O'Neill,
Melin Okandan,
Hannah Park-Kaufmann
Abstract:
In this paper, we consider the following question: "given the multiplicity $m$ and embedding dimension $e$ of a numerical semigroup $S$, what can be said about the cardinality $η$ of a minimal presentation of $S$?" We approach this question from a combinatorial (poset-theoretic) perspective, utilizing the recently-introduced notion of a Kunz nilsemigroup. In addition to making significant headway on this question beyond what was previously known, in the form of both explicit constructions and general bounds, we provide a self-contained introduction to Kunz nilsemigroups that avoids the polyhedral geometry necessary for much of their source material.
Submitted 9 January, 2024; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Modification of the radioactive heat budget of Earth-like exoplanets by the loss of primordial atmospheres
Authors:
N. Erkaev,
M. Scherf,
O. Herbort,
H. Lammer,
P. Odert,
D. Kubyshkina,
M. Leitzinger,
P. Woitke,
C. O'Neill
Abstract:
The initial abundances of radioactive heat-producing isotopes in the interior of a terrestrial planet are important drivers of its thermal evolution and the related tectonics and possible evolution to an Earth-like habitat. The moderately volatile element K can be outgassed from a magma ocean into H$_2$-dominated primordial atmospheres of protoplanets with assumed masses between 0.55-1.0 $M_{\rm Earth}$ at the time when the gas disk evaporated. We estimate this outgassing and let these planets grow through impacts of depleted and non-depleted material that resembles the same $^{40}$K abundance of average carbonaceous chondrites until the growing protoplanets reach 1.0 $M_{\rm Earth}$. We examine different atmospheric compositions and, as a function of pressure and temperature, calculate the proportion of K by Gibbs Free Energy minimisation using the GGChem code. We find that for H$_2$-envelopes and for magma ocean surface temperatures that are $\ge$ 2500 K, no K condensates are thermally stable, so that outgassed $^{40}$K can populate the atmosphere to a great extent. However, due to magma ocean turn-over time and the limited diffusion of $^{40}$K into the upper atmosphere, from the entire $^{40}$K in the magma ocean only a fraction may be available for escaping into space. The escape rates of the primordial atmospheres and the dragged $^{40}$K are further simulated for different stellar EUV-activities with a multispecies hydrodynamic upper atmosphere evolution model. Our results lead to different abundances of heat-producing elements within the fully grown planets, which may give rise to different thermal and tectonic histories of terrestrial planets and their habitability conditions.
Submitted 29 September, 2022;
originally announced September 2022.
-
Stochastic accretion of the Earth
Authors:
Paolo A. Sossi,
Ingo L. Stotz,
Seth A. Jacobson,
Alessandro Morbidelli,
Hugh St. C. O'Neill
Abstract:
Earth is depleted in volatile elements relative to chondritic meteorites, its possible building blocks. The extent of this depletion increases with decreasing condensation temperature, and is approximated by a cumulative normal distribution, unlike that in any chondrite. However, moderately volatile elements, occupying the mid-range of the distribution, have chondritic isotope ratios, contrary to that expected from loss by partial vaporisation/condensation. Here we reconcile these observations by showing, using N-body simulations, that Earth accreted stochastically from many precursor bodies whose variable compositions reflect the temperatures at which they formed. Impact-induced atmospheric loss was efficient only when the proto-Earth was small, and elements that accreted thereafter retain near-chondritic isotope ratios. Earth's composition is reproduced when initial temperatures of planetesimal- to embryo-sized bodies are set by disk accretion rates of (1.08 $\pm$ 0.17) $\times$ 10$^{-7}$ solar masses/yr, although they may be perturbed by $^{26}$Al heating on bodies formed at different times. The model implies a heliocentric gradient in composition and rapid planetesimal formation within $\sim$ 1 Myr, in accord with radiometric volatile depletion ages of Earth.
Submitted 17 July, 2022;
originally announced July 2022.
-
Length density and numerical semigroups
Authors:
Cole Brower,
Scott Chapman,
Travis Kulhanek,
Joseph McDonough,
Christopher O'Neill,
Vody Pavlyuk,
Vadim Ponomarenko
Abstract:
Length density is a recently introduced factorization invariant, assigned to each element $n$ of a cancellative commutative atomic semigroup $S$, that measures how far the set of factorization lengths of $n$ is from being a full interval. We examine length density of elements of numerical semigroups (that is, additive subsemigroups of the non-negative integers).
Submitted 20 October, 2021;
originally announced October 2021.
-
Interference suppression techniques for OPM-based MEG: Opportunities and challenges
Authors:
Robert A Seymour,
Nicholas Alexander,
Stephanie Mellor,
George C O'Neill,
Tim M Tierney,
Gareth R Barnes,
Eleanor A Maguire
Abstract:
One of the primary technical challenges facing magnetoencephalography (MEG) is that the magnitude of neuromagnetic fields is several orders of magnitude lower than interfering signals. Recently, a new type of sensor has been developed - the optically pumped magnetometer (OPM). These sensors can be placed directly on the scalp and move with the head during participant movement, making them wearable. This opens up a range of exciting experimental and clinical opportunities for OPM-based MEG experiments, including paediatric studies, and the incorporation of naturalistic movements into neuroimaging paradigms. However, OPMs face some unique challenges in terms of interference suppression, especially in situations involving mobile participants, and when OPMs are integrated with electrical equipment required for naturalistic paradigms, such as motion capture systems. Here we briefly review various hardware solutions for OPM interference suppression. We then outline several signal processing strategies aimed at increasing the signal from neuromagnetic sources. These include regression-based strategies, temporal filtering and spatial filtering approaches. The focus is on the practical application of these signal processing algorithms to OPM data. In a similar vein, we include two worked-through experiments using OPM data collected from a whole-head sensor array. These tutorial-style examples illustrate how the steps for suppressing external interference can be implemented, including the associated data and code so that researchers can try the pipelines for themselves. With the popularity of OPM-based MEG rising, there will be an increasing need to deal with interference suppression. We hope this practical paper provides a resource for OPM-based MEG researchers to build upon.
Submitted 29 November, 2021; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Factorization length distribution for affine semigroups IV: a geometric approach to weighted factorization lengths in three-generator numerical semigroups
Authors:
Stephan Ramon Garcia,
Christopher O'Neill,
Gabe Udell
Abstract:
For numerical semigroups with three generators, we study the asymptotic behavior of weighted factorization lengths, that is, linear functionals of the coefficients in the factorizations of semigroup elements. This work generalizes many previous results, provides more natural and intuitive proofs, and yields a completely explicit error bound.
Submitted 13 August, 2021;
originally announced August 2021.
-
An assessment of Sentinel-1 radar and Sentinel-2 multispectral data for remote archaeological investigation and preservation: Qubbet el-Hawa, Egypt
Authors:
Craig O'Neill,
Martin Bommas
Abstract:
Remote sensing for archaeological investigations using surface response is reasonably well established; however, remote subsurface exploration is limited by depth, penetration, and ground resolution. Furthermore, the conservation of archaeological sites requires constant monitoring capability, which is often not feasible between annual field seasons, but may be provided by modern satellite coverage. Here we develop an approach using Sentinel-1 C-band radar backscatter, and Sentinel-2 multispectral data, to map and characterise the site of Qubbet el-Hawa, Egypt. The multispectral bands analysed show similar sensitivity to satellite imagery. However, the radar backscatter is sensitive to exposed known structures, as well as disturbances to soil textural/composition profile due to excavation/erosion. Sub-resolution features such as causeways manifest as a 'radar-break' in the backscatter - a discontinuity in otherwise continuous radar units. Furthermore, the finite subsurface response in the backscatter under the arid conditions of the site means we are able to delineate some shallow subsurface structures and map their orientation beneath the surface in areas not yet excavated. The sensitivity of Sentinel-1 backscatter to soil disturbance and human activity at Qubbet el-Hawa, and the short (~12 day) recurrence time of the satellites, make it an important tool in heritage conservation.
Submitted 26 January, 2021;
originally announced January 2021.
-
Decreasing water budget of the Australian continent from Grace satellite gravity data
Authors:
Craig O'Neill,
Serena Chandler-Ho
Abstract:
Increasing aridification of continental areas due to global climate change has impacted freshwater availability, particularly in extremely dry landmasses, such as Australia. Multiple demands on water resources require integrated basin management approaches, necessitating knowledge of total water storage, and changes in water mass. Such monitoring is not practical at continental scales using tradit…
▽ More
Increasing aridification of continental areas due to global climate change has impacted freshwater availability, particularly in extremely dry landmasses, such as Australia. Multiple demands on water resources require integrated basin management approaches, necessitating knowledge of total water storage, and changes in water mass. Such monitoring is not practical at continental scales using traditional methods. Satellite gravity has proven successful at documenting changes in total water mass at regional scales, and here we use data from the Grace and Grace-FO missions, spanning 2002 - 2020, to track regional water budget trends in Australia most heavily utilised basin systems, including the Murray-Darling Basin. The period of analysis covers the Millennium drought (2002-2009) and 2010-11 heavy flooding events, which contribute significant signal variability. However our extended datasets demonstrate a negative trend in the geoid anomaly over the Murray-Darling Basin of -1.5mm, equivalent to a water loss rate of -0.91 Gt yr-1. With the exception of northern Australia, similar scale geoid declines are observed in most Australian basin systems analysed - implying declining total water storage. Long-term declines in water availability require concerted management plans, balancing the requirements of agriculture and industry, with domestic use, traditional owners, and healthy freshwater ecosystems.
△ Less
Submitted 26 January, 2021;
originally announced January 2021.
-
The fundamental theorem of finite fields: a proof from first principles
Authors:
Anastasia Chavez,
Christopher O'Neill
Abstract:
A mathematics student's first introduction to the fundamental theorem of finite fields (FTFF) often occurs in an advanced abstract algebra course and invokes the power of Galois theory to prove it. Yet the combinatorial and algebraic coding theory applications of finite fields can show up early on for students in STEM. To make the FTFF more accessible to students lacking exposure to Galois theory, we provide a proof from algebraic "first principles."
Submitted 20 August, 2021; v1 submitted 18 October, 2020;
originally announced October 2020.
-
Numerical semigroups, polyhedra, and posets III: minimal presentations and face dimension
Authors:
Tara Gomes,
Christopher O'Neill,
Eduardo Torres Davila
Abstract:
This paper is the third in a series of manuscripts that examine the combinatorics of the Kunz polyhedron $P_m$, whose positive integer points are in bijection with numerical semigroups (cofinite subsemigroups of $\mathbb Z_{\ge 0}$) whose smallest positive element is $m$. The faces of $P_m$ are indexed by a family of finite posets (called Kunz posets) obtained from the divisibility posets of the numerical semigroups lying on a given face. In this paper, we characterize to what extent the minimal presentation of a numerical semigroup can be recovered from its Kunz poset. In doing so, we prove that all numerical semigroups lying on the interior of a given face of $P_m$ have identical minimal presentation cardinality, and we provide a combinatorial method of obtaining the dimension of a face from its corresponding Kunz poset.
Submitted 7 May, 2023; v1 submitted 13 September, 2020;
originally announced September 2020.
-
Field Induced Modulated State in the Ferromagnet PrPtAl
Authors:
Christopher D. O'Neill,
Gino Abdul-Jabbar,
Didier Wermeille,
Philippe Bourges,
Frank Krüger,
Andrew D. Huxley
Abstract:
The theory of quantum order-by-disorder (QOBD) explains the formation of modulated magnetic states at the boundary between ferromagnetism and paramagnetism in zero field. PrPtAl has been argued to provide an archetype for this. Here, we report the phase diagram in magnetic field, applied along both the easy $a$-axis and hard $b$-axis. For field aligned to the $b$-axis, we find that the magnetic transition temperatures are suppressed and at low temperature there is a single modulated fan state, separating an easy $a$-axis ferromagnetic state from a field polarised state. This fan state is well explained with the QOBD theory in the presence of anisotropy and field. Experimental evidence supporting the QOBD explanation is provided by the large increase in the $T^2$ coefficient of the resistivity and direct detection of enhanced magnetic fluctuations with inelastic neutron scattering, across the field range spanned by the fan state. This shows that the QOBD mechanism can explain field induced modulated states that persist to very low temperature.
Submitted 27 April, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.
-
Changes of Fermi Surface Topology due to the Rhombohedral Distortion in SnTe
Authors:
Christopher D. O'Neill,
Oliver J. Clark,
Harry D. J. Keen,
Federico Mazzola,
Igor Marković,
Dmitry A. Sokolov,
Andreas Malekos,
Phil D. C. King,
Andreas Hermann,
Andrew D. Huxley
Abstract:
Stoichiometric SnTe is theoretically a small gap semiconductor that undergoes a ferroelectric distortion on cooling. In reality however, crystals are always non-stoichiometric and metallic; the ferroelectric transition is therefore more accurately described as a polar structural transition. Here we study the Fermi surface using quantum oscillations as a function of pressure. We find the oscillation spectrum changes at high pressure, due to the suppression of the polar transition and less than 10 kbar is sufficient to stabilize the undistorted cubic lattice. This is accompanied by a large decrease in the Hall and electrical resistivity. Combined with our density functional theory (DFT) calculations and angle resolved photoemission spectroscopy (ARPES) measurements this suggests the Fermi surface $L$-pockets have lower mobility than the tubular Fermi surfaces that connect them. Also captured in our DFT calculations is a small widening of the band gap and shift in density of states for the polar phase. Additionally we find the unusual phenomenon of a linear magnetoresistance that exists irrespective of the distortion that we attribute to regions of the Fermi surface with high curvature.
Submitted 21 August, 2020;
originally announced August 2020.
-
On length densities
Authors:
Scott T. Chapman,
Christopher O'Neill,
Vadim Ponomarenko
Abstract:
For a commutative cancellative monoid $M$, we introduce the notion of the length density of both a nonunit $x\in M$, denoted $\mathrm{LD}(x)$, and the entire monoid $M$, denoted $\mathrm{LD}(M)$. This invariant is related to three widely studied invariants in the theory of non-unit factorizations, $L(x)$, $\ell(x)$, and $ρ(x)$. We consider some general properties of $\mathrm{LD}(x)$ and $\mathrm{LD}(M)$ and give a wide variety of examples using numerical semigroups, Puiseux monoids, and Krull monoids. While we give an example of a monoid $M$ with irrational length density, we show that if $M$ is finitely generated, then $\mathrm{LD}(M)$ is rational and there is a nonunit element $x\in M$ with $\mathrm{LD}(M)=\mathrm{LD}(x)$ (such a monoid is said to have accepted length density). While it is well-known that the much studied asymptotic versions of $L(x)$, $\ell(x)$ and $ρ(x)$ (denoted $\overline{L}(x)$, $\overline{\ell}(x)$, and $\overline{ρ}(x)$) always exist, we show the somewhat surprising result that $\overline{\mathrm{LD}}(x) = \lim_{n\rightarrow \infty} \mathrm{LD}(x^n)$ may not exist. We also give some finiteness conditions on $M$ that force the existence of $\overline{\mathrm{LD}}(x)$.
Submitted 15 August, 2020;
originally announced August 2020.
-
Planetary thermal evolution models with tectonic transitions
Authors:
Craig O'Neill
Abstract:
Thermal history calculations provide important insights into the interior evolution of planets, but incorporate simplified dynamics from the systems they represent. Planetary interiors typically incorporate complex rheologies, viscous layering, lateral heterogeneities, and time delays in processes, which have not traditionally been represented by parameterised approaches. Here we develop numerical models for planetary evolution, incorporating the physical complexity of Earth's interior, and use them to generate statistically-based Nu-Ra scalings. These encapsulate the main effects of tectonic transitions, geometry, depth-dependent rheology, and time-sensitivity. We find an exponent $β$ of ~0.26 best describes the Nu-Ra relationship for evolving mobile-lid systems, and $β$ ~0.12 for stagnant-lid systems. Systems with time-dependent subduction have $β$ varying from ~0.26 during the Hadean, when external factors such as impacts facilitate tectonics, to ~0.12 during the Archaean, when the system is dominated by long periods of quiescence, and systems driven by external forcings (e.g. due to impacts in the first 100 Myr of Earth's history) may exhibit much higher exponents. We also find a time-lag between Ra (which primarily depends on mantle temperature) and Nu (normalised surface heat flow) of around 200-300 Myr, suggesting a significant delay between mantle thermal configuration and its surface manifestation. These results provide an approach for the rapid characterisation of tectonic, volcanic, and atmospheric evolution of terrestrial exoplanets.
Submitted 10 June, 2020;
originally announced June 2020.
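As a rough illustration of what the two exponents quoted in this abstract imply, the sketch below assumes the conventional power-law form Nu = C · Ra^β. The prefactor `C` (here `prefactor`) and the function names are hypothetical and are not taken from the paper; only the exponent values ~0.26 and ~0.12 come from the abstract.

```python
def nusselt(ra, beta, prefactor=1.0):
    """Nusselt number from a power-law scaling Nu = prefactor * Ra**beta.

    The prefactor is a placeholder; the abstract only reports exponents.
    """
    return prefactor * ra ** beta


def heat_flow_ratio(ra_factor, beta):
    """Factor by which Nu changes when Ra is multiplied by ra_factor.

    Under a pure power law the prefactor cancels, so only beta matters.
    """
    return ra_factor ** beta


if __name__ == "__main__":
    # Compare the sensitivity of surface heat flow to a tenfold change in Ra
    # for the mobile-lid (~0.26) and stagnant-lid (~0.12) exponents.
    for label, beta in (("mobile lid", 0.26), ("stagnant lid", 0.12)):
        print(label, heat_flow_ratio(10.0, beta))
```

Because the prefactor cancels in the ratio, the comparison between regimes depends only on the exponents reported in the abstract.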
-
On minimal presentations of shifted affine semigroups with few generators
Authors:
Christopher O'Neill,
Isabel White
Abstract:
An affine semigroup is a finitely generated subsemigroup of $(\mathbb Z_{\ge 0}^d, +)$, and a numerical semigroup is an affine semigroup with $d = 1$. A growing body of recent work examines shifted families of numerical semigroups, that is, families of numerical semigroups of the form $M_n = \langle n + r_1, \ldots, n + r_k \rangle$ for fixed $r_1, \ldots, r_k$, with one semigroup for each value of the shift parameter $n$. It has been shown that within any shifted family of numerical semigroups, the size of any minimal presentation is bounded (in fact, this size is eventually periodic in $n$). In this paper, we consider shifted families of affine semigroups, and demonstrate that some, but not all, shifted families of 4-generated affine semigroups have arbitrarily large minimal presentations.
Submitted 2 June, 2020;
originally announced June 2020.
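The shifted-family construction $M_n = \langle n + r_1, \ldots, n + r_k \rangle$ described in this abstract can be sketched for the numerical ($d = 1$) case as follows. The function names and the dynamic-programming membership check are illustrative choices, not the authors' method, and the sketch assumes all generators are positive integers.

```python
def shifted_generators(n, shifts):
    """Generators n + r_1, ..., n + r_k of the shifted semigroup M_n."""
    return tuple(n + r for r in shifts)


def elements_up_to(gens, bound):
    """List the elements of <gens> in [0, bound] by dynamic programming.

    A value v > 0 lies in the semigroup exactly when v - g does for some
    generator g <= v. Assumes every generator is a positive integer.
    """
    reachable = [False] * (bound + 1)
    reachable[0] = True
    for v in range(1, bound + 1):
        reachable[v] = any(v >= g and reachable[v - g] for g in gens)
    return [v for v, ok in enumerate(reachable) if ok]


if __name__ == "__main__":
    shifts = (0, 3, 14)  # hypothetical fixed values r_1, r_2, r_3
    for n in (6, 7, 8):
        gens = shifted_generators(n, shifts)
        print(n, gens, elements_up_to(gens, 30))
```

Note that $\langle n + r_1, \ldots, n + r_k \rangle$ is cofinite in the non-negative integers only when the generators have greatest common divisor 1; the sketch does not check this.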
-
Factorization length distribution for affine semigroups III: modular equidistribution for numerical semigroups with arbitrarily many generators
Authors:
Stephan Ramon Garcia,
Mohamed Omar,
Christopher O'Neill,
Timothy Wesley
Abstract:
For numerical semigroups with a specified list of (not necessarily minimal) generators, we describe the asymptotic distribution of factorization lengths with respect to an arbitrary modulus. In particular, we prove that the factorization lengths are equidistributed across all congruence classes that are not trivially ruled out by modular considerations.
Submitted 1 October, 2020; v1 submitted 29 May, 2020;
originally announced June 2020.
-
A Global Fireball Observatory
Authors:
H. A. R. Devillepoix,
M. Cupák,
P. A. Bland,
E. K. Sansom,
M. C. Towner,
R. M. Howie,
B. A. D. Hartig,
T. Jansen-Sturgeon,
P. M. Shober,
S. L. Anderson,
G. K. Benedix,
D. Busan,
R. Sayers,
P. Jenniskens,
J. Albers,
C. D. K. Herd,
P. J. A. Hill,
P. G. Brown,
Z. Krzeminski,
G. R. Osinski,
H. Chennaoui Aoudjehane,
Z. Benkhaldoun,
A. Jabiri,
M. Guennoun,
A. Barka
, et al. (24 additional authors not shown)
Abstract:
The world's meteorite collections contain a very rich picture of what the early Solar System would have been made of; however, the lack of spatial context for these samples with respect to their parent population is an issue. The asteroid population is equally rich in surface mineralogies, and mapping these two populations (meteorites and asteroids) together is a major challenge for planetary science. Directly probing asteroids achieves this at a high cost. Observing meteorite falls and calculating their pre-atmospheric orbits, on the other hand, is a cheaper way to approach the problem. The Global Fireball Observatory (GFO) collaboration was established in 2017 and brings together multiple institutions (from Australia, USA, Canada, Morocco, Saudi Arabia, the UK, and Argentina) to maximise the area for fireball observation time and therefore meteorite recoveries. The members have a choice to operate independently, but they can also choose to work in a fully collaborative manner with other GFO partners. This efficient approach leverages the experience gained from the Desert Fireball Network (DFN) pathfinder project in Australia. The state-of-the-art technology (DFN camera systems and data reduction) and experience of the support teams are shared between all partners, freeing up time for science investigations and meteorite searching. With all networks combined, the GFO collaboration already covers 0.6% of the Earth's surface for meteorite recovery as of mid-2019, and aims to reach 2% in the early 2020s. We estimate that after 5 years of operation, the GFO will have observed a fireball from virtually every meteorite type. This combined effort will bring new, fresh, extra-terrestrial material to the labs, yielding new insights about the formation of the Solar System.
Submitted 12 June, 2020; v1 submitted 2 April, 2020;
originally announced April 2020.
-
On Atomic Density of Numerical Semigroup Algebras
Authors:
A. A. Antoniou,
R. A. C. Edmonds,
B. Kubik,
C. O'Neill,
S. Talbott
Abstract:
A numerical semigroup $S$ is a cofinite, additively-closed subset of the nonnegative integers that contains $0$. In this paper, we initiate the study of atomic density, an asymptotic measure of the proportion of irreducible elements in a given ring or semigroup, for semigroup algebras. It is known that the atomic density of the polynomial ring $\mathbb{F}_q[x]$ is zero for any finite field $\mathbb{F}_q$; we prove that the numerical semigroup algebra $\mathbb{F}_q[S]$ also has atomic density zero for any numerical semigroup $S$. We also examine the particular algebra $\mathbb{F}_2[x^2,x^3]$ in more detail, providing a bound on the rate of convergence of the atomic density as well as a counting formula for irreducible polynomials using Möbius inversion, comparable to the formula for irreducible polynomials over a finite field $\mathbb{F}_q$.
Submitted 6 March, 2021; v1 submitted 3 March, 2020;
originally announced March 2020.
-
Weighted Means of B-Splines, Positivity of Divided Differences, and Complete Homogeneous Symmetric Polynomials
Authors:
Albrecht Boettcher,
Stephan Ramon Garcia,
Mohamed Omar,
Christopher O'Neill
Abstract:
We employ the fact that certain divided differences can be written as weighted means of B-splines and hence are positive. These divided differences include the complete homogeneous symmetric polynomials of even degree $2p$, the positivity of which is a classical result by D. B. Hunter. We extend Hunter's result to complete homogeneous symmetric polynomials of fractional degree, which are defined via Jacobi's bialternant formula. We show in particular that these polynomials have positive real part for real degrees $μ$ with $|μ-2p|< 1/2$. We also prove a positivity criterion for linear combinations of the classical complete homogeneous symmetric polynomials and a sufficient criterion for the positivity of linear combinations of products of such polynomials.
Submitted 20 January, 2020; v1 submitted 6 January, 2020;
originally announced January 2020.
-
Distances between factorizations in the Chicken McNugget monoid
Authors:
Scott Chapman,
Pedro Garcia-Sanchez,
Christopher O'Neill
Abstract:
We use the Chicken McNugget monoid to demonstrate various factorization properties related to relations and chains of factorizations. We study in depth the catenary and tame degrees of this monoid.
Submitted 9 December, 2019;
originally announced December 2019.
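To make the objects in this abstract concrete, the sketch below enumerates factorizations in the Chicken McNugget monoid, taken here to be the numerical semigroup generated by 6, 9, and 20, and computes the distance between two factorizations. The distance used is the standard one from factorization theory (remove the common part of two factorizations, then take the larger remaining length); the abstract itself does not spell out either definition, and all function names are hypothetical.

```python
from itertools import product

GENERATORS = (6, 9, 20)  # assumed generators of the Chicken McNugget monoid


def factorizations(n, gens=GENERATORS):
    """All tuples of non-negative coefficients whose weighted sum equals n."""
    ranges = [range(n // g + 1) for g in gens]
    return [
        coeffs
        for coeffs in product(*ranges)
        if sum(c * g for c, g in zip(coeffs, gens)) == n
    ]


def length_set(n, gens=GENERATORS):
    """Sorted set of factorization lengths (total number of generators used)."""
    return sorted({sum(f) for f in factorizations(n, gens)})


def distance(z1, z2):
    """Distance between two factorizations.

    The common part is the componentwise minimum; the distance is the
    larger of the two lengths left after removing it.
    """
    common = [min(a, b) for a, b in zip(z1, z2)]
    rest1 = sum(a - c for a, c in zip(z1, common))
    rest2 = sum(b - c for b, c in zip(z2, common))
    return max(rest1, rest2)


if __name__ == "__main__":
    for f in factorizations(44):
        print(f, "length", sum(f))
    print(distance((4, 0, 1), (1, 2, 1)))
```

The brute-force enumeration is only meant for small elements; it is not an efficient way to compute the catenary or tame degree.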
-
Numerical semigroups, polyhedra, and posets II: locating certain families of semigroups
Authors:
Jackson Autry,
Abigail Ezell,
Tara Gomes,
Christopher O'Neill,
Christopher Preuss,
Tarang Saluja,
Eduardo Torres Davila
Abstract:
Several recent papers have examined a rational polyhedron $P_m$ whose integer points are in bijection with the numerical semigroups (cofinite, additively closed subsets of the non-negative integers) containing $m$. A combinatorial description of the faces of $P_m$ was recently introduced, one that can be obtained from the divisibility posets of the numerical semigroups a given face contains. In this paper, we study the faces of $P_m$ containing arithmetical numerical semigroups and those containing certain glued numerical semigroups, as an initial step towards better understanding the full face structure of $P_m$. In most cases, such faces only contain semigroups from these families, yielding a tight connection to the geometry of $P_m$.
Submitted 4 October, 2021; v1 submitted 9 December, 2019;
originally announced December 2019.
-
Numerical semigroups, polyhedra, and posets I: the group cone
Authors:
Nathan Kaplan,
Christopher O'Neill
Abstract:
Several recent papers have explored families of rational polyhedra whose integer points are in bijection with certain families of numerical semigroups. One such family, first introduced by Kunz, has integer points in bijection with numerical semigroups of fixed multiplicity, and another, introduced by Hellus and Waldi, has integer points corresponding to oversemigroups of numerical semigroups with two generators. In this paper, we provide a combinatorial framework from which to study both families of polyhedra. We introduce a new family of polyhedra called group cones, each constructed from some finite abelian group, from which both of the aforementioned families of polyhedra are directly determined but that are more natural to study from a standpoint of polyhedral geometry. We prove that the faces of group cones are naturally indexed by a family of finite posets, and illustrate how this combinatorial data relates to semigroups living in the corresponding faces of the other two families of polyhedra.
Submitted 30 March, 2022; v1 submitted 8 December, 2019;
originally announced December 2019.
-
Factorization length distribution for affine semigroups II: asymptotic behavior for numerical semigroups with arbitrarily many generators
Authors:
Stephan Ramon Garcia,
Mohamed Omar,
Christopher O'Neill,
Samuel Yih
Abstract:
For numerical semigroups with a specified list of (not necessarily minimal) generators, we obtain explicit asymptotic expressions, and in some cases quasipolynomial/quasirational representations, for all major factorization length statistics. This involves a variety of tools that are not standard in the subject, such as algebraic combinatorics (Schur polynomials), probability theory (weak convergence of measures, characteristic functions), and harmonic analysis (Fourier transforms of distributions). We provide instructive examples which demonstrate the power and generality of our techniques. We also highlight unexpected consequences in the theory of homogeneous symmetric functions.
Submitted 8 October, 2020; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Recreating the OSIRIS-REx Slingshot Manoeuvre from a Network of Ground-Based Sensors
Authors:
Trent Jansen-Sturgeon,
Benjamin A. D. Hartig,
Gregory J. Madsen,
Philip A. Bland,
Eleanor K. Sansom,
Hadrien A. R. Devillepoix,
Robert M. Howie,
Martin Cupak,
Martin C. Towner,
Morgan A. Cox,
Nicole D. Nevill,
Zacchary N. P. Hoskins,
Geoffrey P. Bonning,
Josh Calcino,
Jake T. Clark,
Bryce M. Henson,
Andrew Langendam,
Samuel J. Matthews,
Terence P. McClafferty,
Jennifer T. Mitchell,
Craig J. O'Neill,
Luke T. Smith,
Alastair W. Tait
Abstract:
Optical tracking systems typically trade off astrometric precision against field-of-view. In this work, we showcase a networked approach to optical tracking using very wide field-of-view imagers that have relatively low astrometric precision on the scheduled OSIRIS-REx slingshot manoeuvre around Earth on September 22nd, 2017. As part of a trajectory designed to get OSIRIS-REx to NEO 101955 Bennu, this flyby event was viewed from 13 remote sensors spread across Australia and New Zealand to promote triangulatable observations. Each observatory in this portable network was constructed to be as lightweight and portable as possible, with hardware based on the successful design of the Desert Fireball Network.
Over a 4 hour collection window, we gathered 15,439 images of the night sky in the predicted direction of the OSIRIS-REx spacecraft. Using a specially developed streak detection and orbit determination data pipeline, we detected 2,090 line-of-sight observations. Our fitted orbit was determined to be within about 10 km of orbital telemetry along the observed 109,262 km length of the OSIRIS-REx trajectory, thus demonstrating the impressive capability of a networked approach to SSA.
Submitted 2 November, 2019;
originally announced November 2019.