-
Trace is the New AutoDiff -- Unlocking Efficient Optimization of Computational Workflows
Authors:
Ching-An Cheng,
Allen Nie,
Adith Swaminathan
Abstract:
We study a class of optimization problems motivated by automating the design and update of AI systems like coding assistants, robots, and copilots. We propose an end-to-end optimization framework, Trace, which treats the computational workflow of an AI system as a graph akin to neural networks, based on a generalization of back-propagation. Optimization of computational workflows often involves rich feedback (e.g. console output or user's responses), heterogeneous parameters (e.g. prompts, hyper-parameters, codes), and intricate objectives (beyond maximizing a score). Moreover, its computation graph can change dynamically with the inputs and parameters. We frame a new mathematical setup of iterative optimization, Optimization with Trace Oracle (OPTO), to capture and abstract these properties so as to design optimizers that work across many domains. In OPTO, an optimizer receives an execution trace along with feedback on the computed output and updates parameters iteratively. Trace is the tool to implement OPTO in practice. Trace has a Python interface that efficiently converts a computational workflow into an OPTO instance using a PyTorch-like interface. Using Trace, we develop a general-purpose LLM-based optimizer called OptoPrime that can effectively solve OPTO problems. In empirical studies, we find that OptoPrime is capable of first-order numerical optimization, prompt optimization, hyper-parameter tuning, robot controller design, code debugging, etc., and is often competitive with specialized optimizers for each domain. We believe that Trace, OptoPrime and the OPTO framework will enable the next generation of interactive agents that automatically adapt using various kinds of feedback. Website: https://microsoft.github.io/Trace
Submitted 23 June, 2024;
originally announced June 2024.
-
Stability of the Toda equations related to a perturbed $R_I$ type recurrence relation
Authors:
Vinay Shukla,
A. Swaminathan
Abstract:
In this manuscript, a modified $R_I$ type recurrence relation is considered whose recurrence coefficients are perturbed by addition or multiplication of a constant. The perturbed system of recurrence coefficients is represented by Toda lattice equations, which are derived. These equations are then represented in a matrix form. With the help of this matrix representation, a known Lax pair is recovered. Inferences about the stability of the resulting perturbed system of Toda equations are drawn based on numerical experiments.
Submitted 14 June, 2024;
originally announced June 2024.
-
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Authors:
Benjamin Biggs,
Arjun Seshadri,
Yang Zou,
Achin Jain,
Aditya Golatkar,
Yusheng Xie,
Alessandro Achille,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup samples from a point in weight space that approximates the geometric mean of the distributions of constituent datasets, which offers anti-memorization guarantees and enables zero-shot style mixing. Empirically, Diffusion Soup outperforms a paragon model trained on the union of all data shards and achieves a 30% improvement in Image Reward (.34 $\to$ .44) on domain sharded data, and a 59% improvement in IR (.37 $\to$ .59) on aesthetic data. In both cases, souping also prevails in TIFA score (respectively, 85.5 $\to$ 86.5 and 85.6 $\to$ 86.8). We demonstrate robust unlearning -- removing any individual domain shard only lowers performance by 1% in IR (.45 $\to$ .44) -- and validate our theoretical insights on anti-memorization using real data. Finally, we showcase Diffusion Soup's ability to blend the distinct styles of models finetuned on different shards, resulting in the zero-shot generation of hybrid styles.
Submitted 12 June, 2024;
originally announced June 2024.
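The re-averaging idea in the Diffusion Soup abstract above can be illustrated with a minimal sketch. It assumes uniform averaging and uses plain Python dictionaries as a hypothetical stand-in for model weights; the names `Weights`, `average_weights`, and the shard labels are illustrative and do not come from the paper, whose actual procedure may differ.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


def average_weights(shards: Dict[str, Weights]) -> Weights:
    """Uniformly average the parameters of all shard models (assumed scheme)."""
    if not shards:
        raise ValueError("need at least one shard model")
    models = list(shards.values())
    n = len(models)
    souped: Weights = {}
    for name in models[0]:
        # Group the i-th value of this parameter across all shard models.
        columns = zip(*(m[name] for m in models))
        souped[name] = [sum(col) / n for col in columns]
    return souped


# Toy "models", one per data shard (values are made up for illustration).
shards = {
    "shard_a": {"layer.w": [1.0, 2.0], "layer.b": [0.0]},
    "shard_b": {"layer.w": [3.0, 4.0], "layer.b": [1.0]},
    "shard_c": {"layer.w": [5.0, 6.0], "layer.b": [2.0]},
}

soup = average_weights(shards)

# Removing a shard is just re-averaging the remaining models, with no retraining.
remaining = {k: v for k, v in shards.items() if k != "shard_c"}
soup_without_c = average_weights(remaining)
```

In this sketch, adding or removing a shard only requires the stored per-shard weights, and the averaged model is the same size as any single shard model.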
-
On Overcoming Miscalibrated Conversational Priors in LLM-based Chatbots
Authors:
Christine Herlihy,
Jennifer Neville,
Tobias Schnabel,
Adith Swaminathan
Abstract:
We explore the use of Large Language Model (LLM)-based chatbots to power recommender systems. We observe that the chatbots respond poorly when they encounter under-specified requests (e.g., they make incorrect assumptions, hedge with a long response, or refuse to answer). We conjecture that such miscalibrated response tendencies (i.e., conversational priors) can be attributed to LLM fine-tuning using annotators -- single-turn annotations may not capture multi-turn conversation utility, and the annotators' preferences may not even be representative of users interacting with a recommender system.
We first analyze public LLM chat logs to conclude that query under-specification is common. Next, we study synthetic recommendation problems with configurable latent item utilities and frame them as Partially Observed Decision Processes (PODP). We find that pre-trained LLMs can be sub-optimal for PODPs and derive better policies that clarify under-specified queries when appropriate. Then, we re-calibrate LLMs by prompting them with learned control messages to approximate the improved policy. Finally, we show empirically that our lightweight learning approach effectively uses logged conversation data to re-calibrate the response strategies of LLM-based chatbots for recommendation tasks.
Submitted 1 June, 2024;
originally announced June 2024.
-
The Importance of Directional Feedback for LLM-based Optimizers
Authors:
Allen Nie,
Ching-An Cheng,
Andrey Kolobov,
Adith Swaminathan
Abstract:
We study the potential of using large language models (LLMs) as an interactive optimizer for solving maximization problems in a text space using natural language and numerical feedback. Inspired by the classical optimization literature, we classify the natural language feedback into directional and non-directional, where the former is a generalization of the first-order feedback to the natural language space. We find that LLMs are especially capable of optimization when they are provided with directional feedback. Based on this insight, we design a new LLM-based optimizer that synthesizes directional feedback from the historical optimization trace to achieve reliable improvement over iterations. Empirically, we show our LLM-based optimizer is more stable and efficient in solving optimization problems, from maximizing mathematical functions to optimizing prompts for writing poems, compared with existing techniques.
Submitted 20 June, 2024; v1 submitted 26 May, 2024;
originally announced May 2024.
-
A common zero at the end point of the support of measure for the quasi-natured spectrally transformed polynomials
Authors:
Vikash Kumar,
A. Swaminathan
Abstract:
In this work, the explicit expressions of coefficients involved in quasi-type kernel polynomials of order one and quasi-Geronimus polynomials of order one are determined for Jacobi polynomials. These coefficients are responsible for establishing the orthogonality of quasi-spectral polynomials for Jacobi polynomials. Additionally, the orthogonality of quasi-type kernel Laguerre polynomials of order one is derived. In the process of achieving orthogonality, one zero in both cases is located on the boundary of the support of the measure. This allows us to derive the chain sequence and minimal parameter sequence at the point lying at the end point of the support of the measure. Also, this leads to the question of characterizing such spectrally transformed polynomials.
Furthermore, the interlacing properties among the zeros of quasi-spectral orthogonal Jacobi polynomials and Jacobi polynomials are illustrated.
Submitted 20 May, 2024;
originally announced May 2024.
-
A positive proportion of monic odd-degree hyperelliptic curves of genus $g \geq 4$ have no unexpected quadratic points
Authors:
Jef Laga,
Ashvin A. Swaminathan
Abstract:
Let $\mathcal{F}_g$ be the family of monic odd-degree hyperelliptic curves of genus $g$ over $\mathbb{Q}$. Poonen and Stoll have shown that for every $g \geq 3$, a positive proportion of curves in $\mathcal{F}_g$ have no rational points except the point at infinity. In this note, we prove the analogue for quadratic points: for each $g\geq 4$, a positive proportion of curves in $\mathcal{F}_g$ have no points defined over quadratic extensions except those that arise by pulling back rational points from $\mathbb{P}^1$.
Submitted 15 May, 2024;
originally announced May 2024.
-
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models
Authors:
Prannay Kaul,
Zhizhong Li,
Hao Yang,
Yonatan Dukler,
Ashwin Swaminathan,
C. J. Taylor,
Stefano Soatto
Abstract:
Mitigating hallucinations in large vision-language models (LVLMs) remains an open problem. Recent benchmarks do not address hallucinations in open-ended free-form responses, which we term "Type I hallucinations". Instead, they focus on hallucinations responding to very specific question formats -- typically a multiple-choice response regarding a particular object or attribute -- which we term "Type II hallucinations". Additionally, such benchmarks often require external API calls to models which are subject to change. In practice, we observe that a reduction in Type II hallucinations does not lead to a reduction in Type I hallucinations but rather that the two forms of hallucinations are often anti-correlated. To address this, we propose THRONE, a novel object-based automatic framework for quantitatively evaluating Type I hallucinations in LVLM free-form outputs. We use public language models (LMs) to identify hallucinations in LVLM responses and compute informative metrics. By evaluating a large selection of recent LVLMs using public datasets, we show that an improvement in existing metrics does not lead to a reduction in Type I hallucinations, and that established benchmarks for measuring Type I hallucinations are incomplete. Finally, we provide a simple and effective data augmentation method to reduce Type I and Type II hallucinations as a strong baseline.
Submitted 8 May, 2024;
originally announced May 2024.
-
Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model
Authors:
Xiaolong Li,
Jiawei Mo,
Ying Wang,
Chethan Parameshwara,
Xiaohan Fei,
Ashwin Swaminathan,
CJ Taylor,
Zhuowen Tu,
Paolo Favaro,
Stefano Soatto
Abstract:
In this paper, we propose an effective two-stage approach named Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts while achieving high fidelity by using a pre-trained multi-view diffusion model. Multi-view diffusion models, such as MVDream, have been shown to generate high-fidelity 3D assets using score distillation sampling (SDS). However, applied naively, these methods often fail to comprehend compositional text prompts, and may often entirely omit certain subjects or parts. To address this issue, we first advocate leveraging text-guided 4-view images as the bottleneck in the text-to-3D pipeline. We then introduce an attention refocusing mechanism to encourage text-aligned 4-view image generation, without the necessity to re-train the multi-view diffusion model or craft a high-quality compositional 3D dataset. We further propose a hybrid optimization strategy to encourage synergy between the SDS loss and the sparse RGB reference images. Our method consistently outperforms previous state-of-the-art (SOTA) methods in generating compositional 3D assets, excelling in both quality and accuracy, and enabling diverse 3D from the same text prompt.
Submitted 28 April, 2024;
originally announced April 2024.
-
Mixed-Query Transformer: A Unified Image Segmentation Architecture
Authors:
Pei Wang,
Zhaowei Cai,
Hao Yang,
Ashwin Swaminathan,
R. Manmatha,
Stefano Soatto
Abstract:
Existing unified image segmentation models either employ a unified architecture across multiple tasks but use separate weights tailored to each dataset, or apply a single set of weights to multiple datasets but are limited to a single task. In this paper, we introduce the Mixed-Query Transformer (MQ-Former), a unified architecture for multi-task and multi-dataset image segmentation using a single set of weights. To enable this, we propose a mixed query strategy, which can effectively and dynamically accommodate different types of objects without heuristic designs. In addition, the unified architecture allows us to use data augmentation with synthetic masks and captions to further improve model generalization. Experiments demonstrate that MQ-Former can not only effectively handle multiple segmentation datasets and tasks compared to specialized state-of-the-art models with competitive performance, but also generalize better to open-set segmentation tasks, evidenced by over 7 points higher performance than the prior art on the open-vocabulary SeginW benchmark.
Submitted 5 April, 2024;
originally announced April 2024.
-
On the Scalability of Diffusion-based Text-to-Image Generation
Authors:
Hao Li,
Yang Zou,
Ying Wang,
Orchid Majumder,
Yusheng Xie,
R. Manmatha,
Ashwin Swaminathan,
Zhuowen Tu,
Stefano Ermon,
Stefano Soatto
Abstract:
Scaling up model and data size has been quite successful for the evolution of LLMs. However, the scaling law for diffusion-based text-to-image (T2I) models is not fully explored. It is also unclear how to efficiently scale the model for better performance at reduced cost. The different training settings and expensive training cost make a fair model comparison extremely difficult. In this work, we empirically study the scaling properties of diffusion-based T2I models by performing extensive and rigorous ablations on scaling both denoising backbones and training set, including training scaled UNet and Transformer variants ranging from 0.4B to 4B parameters on datasets up to 600M images. For model scaling, we find the location and amount of cross attention distinguish the performance of existing UNet designs. And increasing the transformer blocks is more parameter-efficient for improving text-image alignment than increasing channel numbers. We then identify an efficient UNet variant, which is 45% smaller and 28% faster than SDXL's UNet. On the data scaling side, we show the quality and diversity of the training set matter more than simply dataset size. Increasing caption density and diversity improves text-image alignment performance and the learning efficiency. Finally, we provide scaling functions to predict the text-image alignment performance as functions of the scale of model size, compute and dataset size.
Submitted 3 April, 2024;
originally announced April 2024.
-
CPR: Retrieval Augmented Generation for Copyright Protection
Authors:
Aditya Golatkar,
Alessandro Achille,
Luca Zancato,
Yu-Xiang Wang,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
Retrieval Augmented Generation (RAG) is emerging as a flexible and robust technique to adapt models to private user data without training, to handle credit attribution, and to allow efficient machine unlearning at scale. However, RAG techniques for image generation may lead to parts of the retrieved samples being copied in the model's output. To reduce risks of leaking private information contained in the retrieved set, we introduce Copy-Protected generation with Retrieval (CPR), a new method for RAG with strong copyright protection guarantees in a mixed-private setting for diffusion models. CPR allows conditioning the output of diffusion models on a set of retrieved images, while also guaranteeing that unique identifiable information about those examples is not exposed in the generated outputs. In particular, it does so by sampling from a mixture of public (safe) distribution and private (user) distribution by merging their diffusion scores at inference. We prove that CPR satisfies Near Access Freeness (NAF) which bounds the amount of information an attacker may be able to extract from the generated images. We provide two algorithms for copyright protection, CPR-KL and CPR-Choose. Unlike previously proposed rejection-sampling-based NAF methods, our methods enable efficient copyright-protected sampling with a single run of backward diffusion. We show that our method can be applied to any pre-trained conditional diffusion model, such as Stable Diffusion or unCLIP. In particular, we empirically show that applying CPR on top of unCLIP improves quality and text-to-image alignment of the generated results (81.4 to 83.17 on TIFA benchmark), while enabling credit attribution, copyright protection, and deterministic, constant time, unlearning.
Submitted 27 March, 2024;
originally announced March 2024.
-
Multi-Modal Hallucination Control by Visual Information Grounding
Authors:
Alessandro Favero,
Luca Zancato,
Matthew Trager,
Siddharth Choudhary,
Pramuditha Perera,
Alessandro Achille,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
Generative Vision-Language Models (VLMs) are prone to generate plausible-sounding textual answers that, however, are not always grounded in the input image. We investigate this phenomenon, usually referred to as "hallucination", and show that it stems from an excessive reliance on the language prior. In particular, we show that as more tokens are generated, the reliance on the visual prompt decreases, and this behavior strongly correlates with the emergence of hallucinations. To reduce hallucinations, we introduce Multi-Modal Mutual-Information Decoding (M3ID), a new sampling method for prompt amplification. M3ID amplifies the influence of the reference image over the language prior, hence favoring the generation of tokens with higher mutual information with the visual prompt. M3ID can be applied to any pre-trained autoregressive VLM at inference time without necessitating further training and with minimal computational overhead. If training is an option, we show that M3ID can be paired with Direct Preference Optimization (DPO) to improve the model's reliance on the prompt image without requiring any labels. Our empirical findings show that our algorithms maintain the fluency and linguistic capabilities of pre-trained VLMs while reducing hallucinations by mitigating visually ungrounded answers. Specifically, for the LLaVA 13B model, M3ID and M3ID+DPO reduce the percentage of hallucinated objects in captioning tasks by 25% and 28%, respectively, and improve the accuracy on VQA benchmarks such as POPE by 21% and 24%.
Submitted 20 March, 2024;
originally announced March 2024.
-
Fast Sparse View Guided NeRF Update for Object Reconfigurations
Authors:
Ziqi Lu,
Jianbo Ye,
Xiaohan Fei,
Xiaolong Li,
Jiawei Mo,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
Neural Radiance Field (NeRF), as an implicit 3D scene representation, lacks the inherent ability to accommodate changes made to the initial static scene. If objects are reconfigured, it is difficult to update the NeRF to reflect the new state of the scene without time-consuming data re-capturing and NeRF re-training. To address this limitation, we develop the first method for updating NeRFs in response to physical changes. Our method takes only sparse new images (e.g. 4) of the altered scene as extra inputs and updates the pre-trained NeRF in around 1 to 2 minutes. In particular, we develop a pipeline to identify scene changes and update the NeRF accordingly. Our core idea is the use of a second helper NeRF to learn the local geometry and appearance changes, which sidesteps the optimization difficulties in direct NeRF fine-tuning. The interpolation power of the helper NeRF is the key to accurately reconstructing the un-occluded object regions under sparse view supervision. Our method imposes no constraints on NeRF pre-training, and requires no extra user input or explicit semantic priors. It is an order of magnitude faster than re-training NeRF from scratch while maintaining on-par and even superior performance.
Submitted 16 March, 2024;
originally announced March 2024.
-
Recovering orthogonality from quasi-nature of Spectral transformations
Authors:
Vikash Kumar,
Francisco Marcellán,
A. Swaminathan
Abstract:
In this contribution, quasi-orthogonality of polynomials generated by Geronimus and Uvarov transformations is analyzed. An attempt is made to discuss the recovery of the source orthogonal polynomial from the quasi-Geronimus and quasi-Uvarov polynomials of order one. Moreover, the discussion on the difference equation satisfied by quasi-Geronimus and quasi-Uvarov polynomials is presented. Furthermore, the orthogonality of quasi-Geronimus and quasi-Uvarov polynomials is achieved through the reduction of the degree of coefficients in the difference equation. During this procedure, alternative representations of the parameters responsible for achieving orthogonality are derived. One of these representations involves the Stieltjes transform of the measure. Finally, the recurrence coefficients ensuring the existence of a measure that makes the quasi-Geronimus Laguerre polynomial of order one an orthogonal polynomial are calculated.
Submitted 17 May, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks
Authors:
Jiacen Xu,
Jack W. Stokes,
Geoff McDonald,
Xuesong Bai,
David Marshall,
Siyue Wang,
Adith Swaminathan,
Zhou Li
Abstract:
Large language models (LLMs) have demonstrated impressive results on natural language tasks, and security researchers are beginning to employ them in both offensive and defensive systems. In cyber-security, there have been multiple research efforts that utilize LLMs focusing on the pre-breach stage of attacks like phishing and malware generation. However, a comprehensive study is still lacking on whether LLM-based systems can be leveraged to simulate the post-breach stage of attacks that are typically human-operated, or "hands-on-keyboard" attacks, under various attack techniques and environments.
As LLMs inevitably advance, they may be able to automate both the pre- and post-breach attack stages. This shift may transform organizational attacks from rare, expert-led events to frequent, automated operations requiring no expertise and executed at automation speed and scale. This risks fundamentally changing global computer security and correspondingly causing substantial economic impacts, and a goal of this work is to better understand these risks now so we can better prepare for these inevitable ever-more-capable LLMs on the horizon. On the immediate impact side, this research serves three purposes. First, an automated LLM-based, post-breach exploitation framework can help analysts quickly test and continually improve their organization's network security posture against previously unseen attacks. Second, an LLM-based penetration test system can extend the effectiveness of red teams with a limited number of human analysts. Finally, this research can help defensive systems and teams learn to detect novel attack behaviors preemptively before their use in the wild....
Submitted 1 March, 2024;
originally announced March 2024.
-
A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D
Authors:
Xiaohan Fei,
Chethan Parameshwara,
Jiawei Mo,
Xiaolong Li,
Ashwin Swaminathan,
CJ Taylor,
Paolo Favaro,
Stefano Soatto
Abstract:
The development of generative models that create 3D content from a text prompt has made considerable strides thanks to the use of the score distillation sampling (SDS) method on pre-trained diffusion models for image generation. However, the SDS method is also the source of several artifacts, such as the Janus problem, the misalignment between the text prompt and the generated 3D model, and 3D model inaccuracies. While existing methods heavily rely on the qualitative assessment of these artifacts through visual inspection of a limited set of samples, in this work we propose more objective quantitative evaluation metrics, which we cross-validate via human ratings, and show analysis of the failure cases of the SDS technique. We demonstrate the effectiveness of this analysis by designing a novel computationally efficient baseline model that achieves state-of-the-art performance on the proposed metrics while addressing all the above-mentioned artifacts.
Submitted 28 February, 2024;
originally announced February 2024.
-
LLF-Bench: Benchmark for Interactive Learning from Language Feedback
Authors:
Ching-An Cheng,
Andrey Kolobov,
Dipendra Misra,
Allen Nie,
Adith Swaminathan
Abstract:
We introduce a new benchmark, LLF-Bench (Learning from Language Feedback Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively learn from natural language feedback and instructions. Learning from language feedback (LLF) is essential for people, largely because the rich information this feedback provides can help a learner avoid much of trial and error and thereby speed up the learning process. Large Language Models (LLMs) have recently enabled AI agents to comprehend natural language -- and hence AI agents can potentially benefit from language feedback during learning like humans do. But existing interactive benchmarks do not assess this crucial capability: they either use numeric reward feedback or require no learning at all (only planning or information retrieval). LLF-Bench is designed to fill this omission. LLF-Bench is a diverse collection of sequential decision-making tasks that includes user recommendation, poem writing, navigation, and robot control. The objective of an agent is to interactively solve these tasks based on their natural-language instructions and the feedback received after taking actions. Crucially, to ensure that the agent actually "learns" from the feedback, LLF-Bench implements several randomization techniques (such as paraphrasing and environment randomization) to ensure that the task isn't familiar to the agent and that the agent is robust to various verbalizations. In addition, LLF-Bench provides a unified OpenAI Gym interface for all its tasks and allows the users to easily configure the information the feedback conveys (among suggestion, explanation, and instantaneous performance) to study how agents respond to different types of feedback. Together, these features make LLF-Bench a unique research platform for developing and testing LLF agents.
Submitted 13 December, 2023; v1 submitted 11 December, 2023;
originally announced December 2023.
-
Orthogonality of a new family of $q$-Sobolev type polynomials
Authors:
Neha,
A. Swaminathan
Abstract:
In this work, we introduce and construct specific $q$-polynomials that are derived from the well-established families of $q$-orthogonal polynomials, namely little $q$-Jacobi polynomials and $q$-Laguerre polynomials, respectively. We examine these newly constructed $q$-polynomials and observe that they possess integral representations of little $q$-Jacobi polynomials and $q$-Laguerre polynomials. These polynomials solve a third-order $q$-difference equation and display an unconventional four-term recurrence relation. This unique recurrence relation leads us to categorize them as $q$-Sobolev-type orthogonal polynomials. This motivation leads to defining the general Sobolev-type orthogonality for $q$-polynomials. Special cases of these polynomials are also explored and discussed. Furthermore, we delve into the behavior of these $q$-orthogonal polynomials of Sobolev type as the parameters approach $1$. We also examine their zeros and interlacing properties.
Submitted 7 December, 2023;
originally announced December 2023.
-
Do text-free diffusion models learn discriminative visual representations?
Authors:
Soumik Mukhopadhyay,
Matthew Gwilliam,
Yosuke Yamaguchi,
Vatsal Agarwal,
Namitha Padmanabhan,
Archana Swaminathan,
Tianyi Zhou,
Abhinav Shrivastava
Abstract:
While many unsupervised learning models focus on one family of tasks, either generative or discriminative, we explore the possibility of a unified representation learner: a model which addresses both families of tasks simultaneously. We identify diffusion models, a state-of-the-art method for generative tasks, as a prime candidate. Such models involve training a U-Net to iteratively predict and remove noise, and the resulting model can synthesize high-fidelity, diverse, novel images. We find that the intermediate feature maps of the U-Net are diverse, discriminative feature representations. We propose a novel attention mechanism for pooling feature maps and further leverage this mechanism as DifFormer, a transformer feature fusion of features from different diffusion U-Net blocks and noise steps. We also develop DifFeed, a novel feedback mechanism tailored to diffusion. We find that diffusion models are better than GANs, and, with our fusion and feedback mechanisms, can compete with state-of-the-art unsupervised image representation learning methods for discriminative tasks - image classification with full and semi-supervision, transfer for fine-grained classification, object detection and segmentation, and semantic segmentation. Our project website (https://mgwillia.github.io/diffssl/) and code (https://github.com/soumik-kanad/diffssl) are available publicly.
Submitted 29 November, 2023; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Interactive Robot Learning from Verbal Correction
Authors:
Huihan Liu,
Alice Chen,
Yuke Zhu,
Adith Swaminathan,
Andrey Kolobov,
Ching-An Cheng
Abstract:
The ability to learn and refine behavior after deployment has become ever more important for robots as we design them to operate in unstructured environments like households. In this work, we design a new learning system based on a large language model (LLM), OLAF, that allows everyday users to teach a robot using verbal corrections when the robot makes mistakes, e.g., by saying "Stop what you're doing. You should move closer to the cup." A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on the verbal feedback to avoid repeating mistakes in the future. This is in contrast to existing LLM-based robotic systems, which only follow verbal commands or corrections but do not learn from them. We demonstrate the efficacy of our design in experiments where a user teaches a robot to perform long-horizon manipulation tasks both in simulation and on physical hardware, achieving on average 20.0% improvement in policy success rate. Videos and more results are at https://ut-austin-rpl.github.io/olaf/
Submitted 26 October, 2023;
originally announced October 2023.
-
Chop & Learn: Recognizing and Generating Object-State Compositions
Authors:
Nirat Saini,
Hanyu Wang,
Archana Swaminathan,
Vinoj Jayasundara,
Bo He,
Kamal Gupta,
Abhinav Shrivastava
Abstract:
Recognizing and generating object-state compositions has been a challenging task, especially when generalizing to unseen compositions. In this paper, we study the task of cutting objects in different styles and the resulting object state changes. We propose a new benchmark suite Chop & Learn, to accommodate the needs of learning objects and different cut styles using multiple viewpoints. We also propose a new task of Compositional Image Generation, which can transfer learned cut styles to different objects, by generating novel object-state images. Moreover, we also use the videos for Compositional Action Recognition, and show valuable uses of this dataset for multiple video tasks. Project website: https://chopnlearn.github.io.
Submitted 25 September, 2023;
originally announced September 2023.
-
Inequalities involving a measure of Marcellán class and zeros of corresponding orthogonal polynomials
Authors:
Vikash Kumar,
A. Swaminathan
Abstract:
Let $\tilde{Φ}_n$ be a quasi-orthogonal polynomial of order 1 on the unit circle, obtained from an orthogonal polynomial $Φ_n$ with measure $μ$. The measure $μ$ is said to be in the Marcellán class if there exists another measure $\tilde{μ}$ such that $\tilde{Φ}_n$ is a monic orthogonal polynomial. This article aims to investigate various properties related to the Marcellán class. First, we study the behaviour of the zeros of $Φ_n$ and $\tilde{Φ}_n$. Along with numerical examples, we analyze the zeros of $Φ_n$, its POPUC and the linear combination of the POPUC. Further, comparisons of the norm inequalities for $Φ_n$ and $\tilde{Φ}_n$ are obtained by involving their measures. This leads to the study of the Lubinsky type inequality between the measures $μ$ and $\tilde{μ}$, without using the ordering relation between $μ$ and $\tilde{μ}$. Additionally, similar types of inequalities for the kernel type polynomials related to $μ$ and $\tilde{μ}$ are obtained.
Submitted 10 June, 2024; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Training Data Protection with Compositional Diffusion Models
Authors:
Aditya Golatkar,
Alessandro Achille,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time. The individual models can be trained in isolation, at different times, and on different distributions and domains and can be later composed to achieve performance comparable to a paragon model trained on all data simultaneously. Furthermore, each model only contains information about the subset of the data it was exposed to during training, enabling several forms of training data protection. In particular, CDMs enable perfect selective forgetting and continual learning for large-scale diffusion models, and allow serving customized models based on the user's access rights. Empirically, the quality (FID) of the class-conditional CDMs (8-splits) is within 10% (on fine-grained vision datasets) of a monolithic model (no splits), and CDMs allow (8x) faster forgetting compared to a monolithic model with a maximum FID increase of 1%. When applied to text-to-image generation, CDMs improve alignment (TIFA) by 14.33% over a monolithic model trained on MSCOCO. CDMs also allow determining the importance of a subset of the data (attribution) in generating particular samples, and reduce memorization.
Submitted 13 February, 2024; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Diffusion Models Beat GANs on Image Classification
Authors:
Soumik Mukhopadhyay,
Matthew Gwilliam,
Vatsal Agarwal,
Namitha Padmanabhan,
Archana Swaminathan,
Srinidhi Hegde,
Tianyi Zhou,
Abhinav Shrivastava
Abstract:
While many unsupervised learning models focus on one family of tasks, either generative or discriminative, we explore the possibility of a unified representation learner: a model which uses a single pre-training stage to address both families of tasks simultaneously. We identify diffusion models as a prime candidate. Diffusion models have risen to prominence as a state-of-the-art method for image generation, denoising, inpainting, super-resolution, manipulation, etc. Such models involve training a U-Net to iteratively predict and remove noise, and the resulting model can synthesize high fidelity, diverse, novel images. The U-Net architecture, as a convolution-based architecture, generates a diverse set of feature representations in the form of intermediate feature maps. We present our findings that these embeddings are useful beyond the noise prediction task, as they contain discriminative information and can also be leveraged for classification. We explore optimal methods for extracting and using these embeddings for classification tasks, demonstrating promising results on the ImageNet classification task. We find that with careful feature selection and pooling, diffusion models outperform comparable generative-discriminative methods such as BigBiGAN for classification tasks. We investigate diffusion models in the transfer learning regime, examining their performance on several fine-grained visual classification datasets. We compare these embeddings to those generated by competing architectures and pre-trainings for classification tasks.
Submitted 17 July, 2023;
originally announced July 2023.
-
Towards Visual Foundational Models of Physical Scenes
Authors:
Chethan Parameshwara,
Alessandro Achille,
Matthew Trager,
Xiaolong Li,
Jiawei Mo,
Matthew Trager,
Ashwin Swaminathan,
CJ Taylor,
Dheera Venkatraman,
Xiaohan Fei,
Stefano Soatto
Abstract:
We describe a first step towards learning general-purpose visual representations of physical scenes using only image prediction as a training criterion. To do so, we first define "physical scene" and show that, even though different agents may maintain different representations of the same scene, the underlying physical scene that can be inferred is unique. Then, we show that NeRFs cannot represent the physical scene, as they lack extrapolation mechanisms. Those, however, could be provided by Diffusion Models, at least in theory. To test this hypothesis empirically, NeRFs can be combined with Diffusion Models, a process we refer to as NeRF Diffusion, used as unsupervised representations of the physical scene. Our analysis is limited to visual data, without external grounding mechanisms that can be provided by independent sensory modalities.
Submitted 6 June, 2023;
originally announced June 2023.
-
A Privacy-Preserving Federated Learning Approach for Kernel methods
Authors:
Anika Hannemann,
Ali Burak Ünal,
Arjhun Swaminathan,
Erik Buchmann,
Mete Akgün
Abstract:
It is challenging to implement kernel methods if the data sources are distributed and cannot be joined at a trusted third party for privacy reasons. It is even more challenging if the use case rules out privacy-preserving approaches that introduce noise. An example of such a use case is machine learning on clinical data. To realize exact privacy-preserving computation of kernel methods, we propose FLAKE, a Federated Learning Approach for KErnel methods on horizontally distributed data. With FLAKE, the data sources mask their data so that a centralized instance can compute a Gram matrix without compromising privacy. The Gram matrix allows the calculation of many kernel matrices, which can be used to train kernel-based machine learning algorithms such as Support Vector Machines. We prove that FLAKE prevents an adversary from learning the input data or the number of input features under a semi-honest threat model. Experiments on clinical and synthetic data confirm that FLAKE outperforms comparable methods in accuracy and efficiency. The time needed to mask the data and to compute the Gram matrix is several orders of magnitude less than the time needed to train a Support Vector Machine. Thus, FLAKE can be applied to many use cases.
Submitted 5 June, 2023;
originally announced June 2023.
-
Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery
Authors:
Debadutta Dash,
Rahul Thapa,
Juan M. Banda,
Akshay Swaminathan,
Morgan Cheatham,
Mehr Kashyap,
Nikesh Kotecha,
Jonathan H. Chen,
Saurabh Gombar,
Lance Downing,
Rachel Pedreira,
Ethan Goh,
Angel Arnaout,
Garret Kenn Morris,
Honor Magon,
Matthew P Lungren,
Eric Horvitz,
Nigam H. Shah
Abstract:
Despite growing interest in using large language models (LLMs) in healthcare, current explorations do not assess the real-world utility and safety of LLMs in clinical settings. Our objective was to determine whether two LLMs can serve information needs submitted by physicians as questions to an informatics consultation service in a safe and concordant manner. Sixty-six questions from an informatics consult service were submitted to GPT-3.5 and GPT-4 via simple prompts. Twelve physicians assessed the LLM responses' possibility of patient harm and concordance with existing reports from an informatics consultation service. Physician assessments were summarized based on majority vote. For no questions did a majority of physicians deem either LLM response as harmful. For GPT-3.5, responses to 8 questions were concordant with the informatics consult report, 20 were discordant, and 9 were unable to be assessed. There were 29 responses with no majority on "Agree", "Disagree", and "Unable to assess". For GPT-4, responses to 13 questions were concordant, 15 were discordant, and 3 were unable to be assessed. There were 35 responses with no majority. Responses from both LLMs were largely devoid of overt harm, but less than 20% of the responses agreed with an answer from an informatics consultation service, responses contained hallucinated references, and physicians were divided on what constitutes harm. These results suggest that while general purpose LLMs are able to provide safe and credible responses, they often do not meet the specific information need of a given question. A definitive evaluation of the usefulness of LLMs in healthcare settings will likely require additional research on prompt engineering, calibration, and custom-tailoring of general purpose models.
Submitted 30 April, 2023; v1 submitted 26 April, 2023;
originally announced April 2023.
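The counts reported in the abstract above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: the dictionary keys and the helper name `concordance_rate` are illustrative assumptions, and only the numbers themselves come from the abstract.

```python
# Counts copied from the abstract; key names and helper are hypothetical.
TOTAL_QUESTIONS = 66

tallies = {
    "GPT-3.5": {"concordant": 8, "discordant": 20,
                "unable_to_assess": 9, "no_majority": 29},
    "GPT-4": {"concordant": 13, "discordant": 15,
              "unable_to_assess": 3, "no_majority": 35},
}


def concordance_rate(counts, total=TOTAL_QUESTIONS):
    """Fraction of questions whose response was judged concordant."""
    return counts["concordant"] / total


for model, counts in tallies.items():
    # The four outcome categories should account for every question.
    assert sum(counts.values()) == TOTAL_QUESTIONS
    print(f"{model}: {concordance_rate(counts):.1%} concordant")
```

Both tallies sum to 66, and the concordance rates come out to roughly 12.1% and 19.7%, which is consistent with the stated "less than 20%" for either model.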
-
SAFE: Machine Unlearning With Shard Graphs
Authors:
Yonatan Dukler,
Benjamin Bowman,
Alessandro Achille,
Aditya Golatkar,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
We present Synergy Aware Forgetting Ensemble (SAFE), a method to adapt large models on a diverse collection of data while minimizing the expected cost to remove the influence of training samples from the trained model. This process, also known as selective forgetting or unlearning, is often conducted by partitioning a dataset into shards, training fully independent models on each, then ensembling the resulting models. Increasing the number of shards reduces the expected cost to forget but at the same time it increases inference cost and reduces the final accuracy of the model since synergistic information between samples is lost during the independent model training. Rather than treating each shard as independent, SAFE introduces the notion of a shard graph, which allows incorporating limited information from other shards during training, trading off a modest increase in expected forgetting cost with a significant increase in accuracy, all while still attaining complete removal of residual influence after forgetting. SAFE uses a lightweight system of adapters which can be trained while reusing most of the computations. This allows SAFE to be trained on shards an order-of-magnitude smaller than current state-of-the-art methods (thus reducing the forgetting costs) while also maintaining high accuracy, as we demonstrate empirically on fine-grained computer vision datasets.
Submitted 22 August, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Generalized co-polynomials of $R_{II}$ type and associated quadrature rules
Authors:
Vinay Shukla,
A. Swaminathan
Abstract:
When the co-recursion and co-dilation in the recurrence relation of certain sequences of orthogonal polynomials are not at the same level, the behaviour of the modified orthogonal polynomials is expected to have different properties compared to the situation of the same level of perturbation. This manuscript attempts to derive structural relations between the perturbed and original $R_{II}$ type orthogonal polynomials. The classical result is improved using a transfer matrix approach. It turns out that the $R_{II}$ fraction with perturbation is the rational spectral transformation of the unperturbed one. The derived notions are used to deduce some consequences for the polynomials orthogonal on the real line. A natural question that arises while dealing with perturbations at different levels, i.e., which perturbation, co-recursion or co-dilation, needs to be performed first, is answered.
Submitted 12 May, 2024; v1 submitted 24 April, 2023;
originally announced April 2023.
-
Counting integral points on symmetric varieties with applications to arithmetic statistics
Authors:
Arul Shankar,
Artane Siad,
Ashvin A. Swaminathan
Abstract:
In this article, we combine Bhargava's geometry-of-numbers methods with the dynamical point-counting methods of Eskin--McMullen and Benoist--Oh to develop a new technique for counting integral points on symmetric varieties lying within fundamental domains for coregular representations. As applications, we study the distribution of the $2$-torsion subgroup of the class group in thin families of cubic number fields, as well as the distribution of the $2$-Selmer groups in thin families of elliptic curves over $\mathbb{Q}$. For example, our results suggest that the existence of a generator of the ring of integers with small norm has an increasing effect on the average size of the $2$-torsion subgroup of the class group, relative to the Cohen--Lenstra predictions.
Submitted 3 April, 2023;
originally announced April 2023.
-
Learning Expressive Prompting With Residuals for Vision Transformers
Authors:
Rajshekhar Das,
Yonatan Dukler,
Avinash Ravichandran,
Ashwin Swaminathan
Abstract:
Prompt learning is an efficient approach to adapt transformers by inserting a learnable set of parameters into the input and intermediate representations of a pre-trained model. In this work, we present Expressive Prompts with Residuals (EXPRES), which modifies the prompt learning paradigm specifically for effective adaptation of vision transformers (ViT). Our method constructs downstream representations via learnable "output" tokens that are akin to the learned class tokens of the ViT. Further, for better steering of the downstream representation processed by the frozen transformer, we introduce residual learnable tokens that are added to the output of various computations. We apply EXPRES for image classification, few-shot learning, and semantic segmentation, and show our method is capable of achieving state-of-the-art prompt tuning on 3/3 categories of the VTAB benchmark. In addition to strong performance, we observe that our approach is an order of magnitude more prompt efficient than existing visual prompting baselines. We analytically show the computational benefits of our approach over weight space adaptation techniques like finetuning. Lastly, we systematically corroborate the architectural design of our method via a series of ablation experiments.
Submitted 27 March, 2023;
originally announced March 2023.
-
Your representations are in the network: composable and parallel adaptation for large scale models
Authors:
Yonatan Dukler,
Alessandro Achille,
Hao Yang,
Varsha Vivek,
Luca Zancato,
Benjamin Bowman,
Avinash Ravichandran,
Charless Fowlkes,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
We propose InCA, a lightweight method for transfer learning that cross-attends to any activation layer of a pre-trained model. During training, InCA uses a single forward pass to extract multiple activations, which are passed to external cross-attention adapters, trained anew and combined or selected for downstream tasks. We show that, even when selecting a single top-scoring adapter, InCA achieves performance comparable to full fine-tuning, at a cost comparable to fine-tuning just the last layer. For example, with a cross-attention probe 1.3% the size of a pre-trained ViT-L/16 model, we achieve performance within 0.2% of the full fine-tuning paragon at a computational training cost of 51% of the baseline, on average across 11 downstream classification tasks. Unlike other forms of efficient adaptation, InCA does not require backpropagating through the pre-trained model, thus leaving its execution unaltered at both training and inference. The versatility of InCA is best illustrated in fine-grained tasks, which may require accessing information absent in the last layer but accessible in intermediate layer activations. Since the backbone is fixed, InCA allows parallel ensembling as well as parallel execution of multiple tasks. InCA achieves state-of-the-art performance in the ImageNet-to-Sketch multi-task benchmark.
Submitted 31 October, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
A Meta-Learning Approach to Predicting Performance and Data Requirements
Authors:
Achin Jain,
Gurumurthy Swaminathan,
Paolo Favaro,
Hao Yang,
Avinash Ravichandran,
Hrayr Harutyunyan,
Alessandro Achille,
Onkar Dabeer,
Bernt Schiele,
Ashwin Swaminathan,
Stefano Soatto
Abstract:
We propose an approach to estimate the number of samples required for a model to reach a target performance. We find that the power law, the de facto principle to estimate model performance, leads to large error when using a small dataset (e.g., 5 samples per class) for extrapolation. This is because the log-performance error against the log-dataset size follows a nonlinear progression in the few-shot regime followed by a linear progression in the high-shot regime. We introduce a novel piecewise power law (PPL) that handles the two data regimes differently. To estimate the parameters of the PPL, we introduce a random forest regressor trained via meta learning that generalizes across classification/detection tasks, ResNet/ViT based architectures, and random/pre-trained initializations. The PPL improves the performance estimation on average by 37% across 16 classification and 33% across 10 detection datasets, compared to the power law. We further extend the PPL to provide a confidence bound and use it to limit the prediction horizon that reduces over-estimation of data by 76% on classification and 91% on detection datasets.
Submitted 2 March, 2023;
originally announced March 2023.
-
Recovering orthogonality from Quasi-type Kernel Polynomials using specific spectral transformations
Authors:
Vikash Kumar,
A. Swaminathan
Abstract:
In this work, the concept of quasi-type kernel polynomials with respect to a moment functional is introduced. The difference equation satisfied by these polynomials, along with the criterion for orthogonality conditions, is discussed. The process of recovering orthogonality for the linear combination of a quasi-type kernel polynomial with another orthogonal polynomial, which is identified by involving linear spectral transformation, is provided. This process involves an expression for the ratio of iterated kernel polynomials. This leads to considering the limiting case of the ratio of kernel polynomials involving continued fractions. Special cases of such ratios in terms of certain continued fractions are exhibited.
Submitted 29 January, 2023; v1 submitted 19 November, 2022;
originally announced November 2022.
-
The mean number of $2$-torsion elements in the class groups of cubic orders
Authors:
Ashvin Swaminathan
Abstract:
We determine the mean number of 2-torsion elements in class groups of cubic orders, when such orders are enumerated by discriminant. Specifically, we prove that when isomorphism classes of totally real (resp., complex) cubic orders are enumerated by discriminant, the average $2$-torsion in the class group is $1 + \frac{1}{4} \times \frac{ζ(2)}{ζ(4)}$ (resp., $1 + \frac{1}{2} \times \frac{ζ(2)}{ζ(4)}$). In particular, we find that the average $2$-torsion in the class group increases when one ranges over all orders in cubic fields instead of restricting to the subfamily of rings of integers of cubic fields, where the average $2$-torsion in the class group was first determined in work of Bhargava to be $\frac{5}{4}$ (resp., $\frac{3}{2}$).
By work of Bhargava--Varma, proving this result amounts to obtaining an asymptotic count of the number of "reducible" $\operatorname{SL}_3(\mathbb{Z})$-orbits on the space $\mathbb{Z}^2 \otimes_{\mathbb{Z}} \operatorname{Sym}^2 \mathbb{Z}^3$ of $3 \times 3$ symmetric integer matrices having bounded invariants and satisfying local conditions. In this paper, we resolve the generalization of this orbit-counting problem where the dimension $3$ is replaced by any fixed odd integer $N \geq 3$. More precisely, we determine asymptotic formulas for the number of reducible $\operatorname{SL}_N(\mathbb{Z})$-orbits on $\mathbb{Z}^2 \otimes_{\mathbb{Z}} \operatorname{Sym}^2 \mathbb{Z}^N$ satisfying general infinite sets of congruence conditions.
Submitted 16 November, 2022;
originally announced November 2022.
-
Spectral properties related to generalized complementary Romanovski-Routh polynomials
Authors:
Vinay Shukla,
A. Swaminathan
Abstract:
Complementary Romanovski-Routh polynomials play an important role in extracting specific properties of orthogonal polynomials. In this work, a generalized form of the Complementary Romanovski-Routh polynomials (GCRR) that has the Gaussian hypergeometric representation and satisfies a particular type of recurrence called $R_{II}$ type three term recurrence relation involving two arbitrary parameters is considered. Self perturbations of GCRR polynomials leading to two different types of $R_{II}$ type orthogonal polynomials are identified. Spectral properties of these resultant polynomials in terms of tri-diagonal linear pencils are analyzed. The LU decomposition of these pencil matrices provides interesting properties involving biorthogonality. Interlacing properties between the zeros of the polynomials under discussion are established.
Submitted 7 September, 2022;
originally announced September 2022.
-
Hindsight Learning for MDPs with Exogenous Inputs
Authors:
Sean R. Sinclair,
Felipe Frujeri,
Ching-An Cheng,
Luke Marshall,
Hugo Barbalho,
Jingling Li,
Jennifer Neville,
Ishai Menache,
Adith Swaminathan
Abstract:
Many resource management problems require sequential decision-making under uncertainty, where the only uncertainty affecting the decision outcomes comes from exogenous variables outside the control of the decision-maker. We model these problems as Exo-MDPs (Markov Decision Processes with Exogenous Inputs) and design a class of data-efficient algorithms for them termed Hindsight Learning (HL). Our HL algorithms achieve data efficiency by leveraging a key insight: having samples of the exogenous variables, past decisions can be revisited in hindsight to infer counterfactual consequences that can accelerate policy improvements. We compare HL against classic baselines in the multi-secretary and airline revenue management problems. We also scale our algorithms to a business-critical cloud resource management problem -- allocating Virtual Machines (VMs) to physical machines, and simulate their performance with real datasets from a large public cloud provider. We find that HL algorithms outperform domain-specific heuristics, as well as state-of-the-art reinforcement learning methods.
Submitted 23 October, 2023; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Chain sequences and Zeros of a perturbed $R_{II}$ type recurrence relation
Authors:
Vinay Shukla,
A. Swaminathan
Abstract:
In this manuscript, new algebraic and analytic aspects of the orthogonal polynomials satisfying $R_{II}$ type recurrence relation given by \begin{align*} \mathcal{P}_{n+1}(x) = (x-c_n)\mathcal{P}_n(x)-λ_n (x-a_n)(x-b_n)\mathcal{P}_{n-1}(x), \quad n \geq 0, \end{align*} where $λ_n$ is a positive chain sequence and $a_n$, $b_n$, $c_n$ are sequences of real or complex numbers with $\mathcal{P}_{-1}(x) = 0$ and $\mathcal{P}_0(x) = 1$ are investigated when the recurrence coefficients are perturbed. Specifically, representation of new perturbed polynomials (co-polynomials of $R_{II}$ type) in terms of original ones with the interlacing and monotonicity properties of zeros are given. For finite perturbations, a transfer matrix approach is used to obtain new structural relations. Effect of co-dilation in the corresponding chain sequences and their consequences onto the unit circle are analysed. A particular perturbation in the corresponding chain sequence called complementary chain sequences and its effect on the corresponding Verblunsky coefficients is also studied.
Submitted 23 January, 2022;
originally announced January 2022.
-
Spectral transformation associated with a perturbed $R_I$ type recurrence relation
Authors:
Vinay Shukla,
A. Swaminathan
Abstract:
In this work, orthogonal polynomials satisfying $R_I$ type recurrence relation $\mathcal{P}_{n+1}(z) = (z-c_n)\mathcal{P}_n(z)-λ_n (z-a_n)\mathcal{P}_{n-1}(z),$ with $\mathcal{P}_{-1}(z) = 0$ and $\mathcal{P}_0(z) = 1$ are analyzed when the recurrence coefficients are modified. The structural relationship between the perturbed and the unperturbed polynomials along with the spectral properties and spectral transformation of continued fraction are investigated. It is demonstrated that the transfer matrix method is computationally more efficient than the classical method for obtaining perturbed $R_I$ polynomials. Further, an interesting consequence of co-dilation on the Carathéodory function is presented. Finally, the study of co-recursion and co-dilation in connection to the unit circle is carried out with the help of an illustration. The interlacing and monotonicity of zeros between L-Jacobi polynomials and their perturbed forms are demonstrated.
Submitted 19 May, 2024; v1 submitted 14 January, 2022;
originally announced January 2022.
-
Geometry-of-numbers methods in the cusp
Authors:
Arul Shankar,
Artane Siad,
Ashvin Swaminathan,
Ila Varma
Abstract:
In this article, we develop new methods for counting integral orbits having bounded invariants that lie inside the cusps of fundamental domains for coregular representations. We illustrate these methods for a representation of cardinal interest in number theory, namely that of the split orthogonal group acting on the space of quadratic forms.
Submitted 22 June, 2022; v1 submitted 18 October, 2021;
originally announced October 2021.
-
The second moment of the size of the $2$-Selmer group of elliptic curves
Authors:
Manjul Bhargava,
Arul Shankar,
Ashvin Swaminathan
Abstract:
In this paper, we prove that when elliptic curves over $\mathbb{Q}$ are ordered by height, the second moment of the size of the $2$-Selmer group is at most $15$. This confirms a conjecture of Poonen and Rains.
Submitted 18 October, 2021;
originally announced October 2021.
-
Hermite equivalence of polynomials
Authors:
Manjul Bhargava,
Jan-Hendrik Evertse,
Kálmán Győry,
László Remete,
Ashvin A. Swaminathan
Abstract:
In this paper, we resurrect a long-forgotten notion of equivalence for univariate polynomials with integral coefficients introduced by Hermite in the 1850s. We show that the Hermite equivalence class of a polynomial has a very natural interpretation in terms of the invariant ring and invariant ideal associated with the polynomial. We apply this interpretation to shed light on the relationship between Hermite equivalence and more familiar notions of polynomial equivalence, such as ${\rm GL}_2(\mathbb{Z})$- and $\mathbb{Z}$-equivalence. Specifically, we prove that ${\rm GL}_2(\mathbb{Z})$-equivalent polynomials are Hermite equivalent and, for polynomials of degree $2$ or $3$, the converse is also true. On the other hand, for every $n\geq 4$, we give infinite collections of examples of polynomials $f,g\in \mathbb{Z}[X]$ of degree $n$ that are Hermite equivalent but not ${\rm GL}_2(\mathbb{Z})$-equivalent.
Submitted 14 September, 2022; v1 submitted 7 September, 2021;
originally announced September 2021.
-
Electroencephalogram Signal Processing with Independent Component Analysis and Cognitive Stress Classification using Convolutional Neural Networks
Authors:
Venkatakrishnan Sutharsan,
Alagappan Swaminathan,
Saisrinivasan Ramachandran,
Madan Kumar Lakshmanan,
Balaji Mahadevan
Abstract:
An electroencephalogram (EEG) is a recording of bio-electrical activity acquired from electrodes placed on the scalp. In EEG recordings, the signals obtained are contaminated predominantly by the electrooculogram (EOG) signal. Since this artifact has a higher magnitude than EEG signals, it has to be removed in order to better understand the functioning of the human brain in applications such as medical diagnosis. This paper proposes using Independent Component Analysis (ICA) along with cross-correlation to de-noise the EEG signal. This is done by selecting the component based on its cross-correlation coefficient relative to a threshold value and reducing its effect instead of zeroing it out completely, thus reducing the information loss. The results on the recorded data show that this algorithm can eliminate the EOG signal artifact with little loss in EEG data. The denoising is verified by an increase in SNR value and a decrease in the cross-correlation coefficient value. The denoised signals are used to train an Artificial Neural Network (ANN) which examines the features of the input EEG signal and predicts the stress levels of the individual.
Submitted 22 August, 2021;
originally announced August 2021.
-
Wind Power Projection using Weather Forecasts by Novel Deep Neural Networks
Authors:
Alagappan Swaminathan,
Venkatakrishnan Sutharsan,
Tamilselvi Selvaraj
Abstract:
The transition from conventional methods of energy production to renewable energy production necessitates better prediction models of the upcoming supply of renewable energy. In wind power production, error in forecasting production is impossible to negate owing to the intermittence of wind. For successful power grid integration, it is crucial to understand the uncertainties that arise in predicting wind power production and use this information to build an accurate and reliable forecast. This can be achieved by observing the fluctuations in wind power production with changes in different parameters such as wind speed, temperature, and wind direction, and deriving functional dependencies for the same. Using optimized machine learning algorithms, it is possible to find obscured patterns in the observations and obtain meaningful data, which can then be used to accurately predict wind power requirements. Using data provided by Gamesa's wind farm at Bableshwar, the paper explores the use of both parametric and non-parametric models for predicting wind power using power curves. The obtained results are compared to better understand the accuracy of the models used and to determine the most suitable model for predicting wind power production based on the given data set.
Submitted 22 August, 2021;
originally announced August 2021.
-
Heuristic-Guided Reinforcement Learning
Authors:
Ching-An Cheng,
Andrey Kolobov,
Adith Swaminathan
Abstract:
We provide a framework for accelerating reinforcement learning (RL) algorithms by heuristics constructed from domain knowledge or offline data. Tabula rasa RL algorithms require environment interactions or computation that scales with the horizon of the sequential decision-making task. Using our framework, we show how heuristic-guided RL induces a much shorter-horizon subproblem that provably solves the original task. Our framework can be viewed as a horizon-based regularization for controlling bias and variance in RL under a finite interaction budget. On the theoretical side, we characterize properties of a good heuristic and its impact on RL acceleration. In particular, we introduce the novel concept of an improvable heuristic, a heuristic that allows an RL agent to extrapolate beyond its prior knowledge. On the empirical side, we instantiate our framework to accelerate several state-of-the-art algorithms in simulated robotic control tasks and procedurally generated games. Our framework complements the rich literature on warm-starting RL with expert demonstrations or exploratory datasets, and introduces a principled method for injecting prior knowledge into RL.
Submitted 22 November, 2021; v1 submitted 4 June, 2021;
originally announced June 2021.
-
Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning
Authors:
Bogdan Mazoure,
Paul Mineiro,
Pavithra Srinath,
Reza Sharifi Sedeh,
Doina Precup,
Adith Swaminathan
Abstract:
We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility. Optimizing a long-term metric is challenging because the learning signal (whether the recommendations achieved their desired goals) is delayed and confounded by other user interactions with the system. Targeting immediately measurable proxies such as clicks can lead to suboptimal recommendations due to misalignment with the long-term metric. We develop a new reinforcement learning algorithm called Short Horizon Policy Improvement (SHPI) that approximates policy-induced drift in user behavior across sessions. SHPI is a straightforward modification of episodic RL algorithms for session-based recommendation, that additionally gives an appropriate termination bonus in each session. Empirical results on four recommendation tasks show that SHPI can outperform state-of-the-art recommendation techniques like matrix factorization with offline proxy signals, bandits with myopic online proxies, and RL baselines with limited amounts of user interaction.
Submitted 14 September, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Sufficiency for Nephroid Starlikeness using Hypergeometric Functions
Authors:
A. Swaminathan,
Lateef Ahmad Wani
Abstract:
Let $\mathcal{A}$ consist of analytic functions $f:\mathbb{D}\to\mathbb{C}$ satisfying $f(0)=f'(0)-1=0$. Let $\mathcal{S}^*_{Ne}$ be the recently introduced Ma-Minda type function family associated with the $2$-cusped kidney-shaped {\it nephroid} curve $\left((u-1)^2+v^2-\frac{4}{9}\right)^3-\frac{4 v^2}{3}=0$ given by \begin{align*} \mathcal{S}^*_{Ne}:= \left\{f\in\mathcal{A}:\frac{zf'(z)}{f(z)}\prec\varphi_{\scriptscriptstyle {Ne}}(z)=1+z-z^3/3\right\}. \end{align*} In this paper, we adopt a novel technique that uses the geometric properties of {\it hypergeometric functions} to determine sharp estimates on $β$ so that each of the differential subordinations \begin{align*} p(z)+βzp'(z)\prec \begin{cases} \sqrt{1+z}; \\ 1+z; \\ e^z; \end{cases} \end{align*} implies $p(z)\prec\varphi_{\scriptscriptstyle{Ne}}(z)$, where $p(z)$ is analytic satisfying $p(0)=1$. As applications, we establish conditions that are sufficient to deduce that $f\in\mathcal{A}$ is a member of $\mathcal{S}^*_{Ne}$.
Submitted 17 April, 2021; v1 submitted 10 April, 2021;
originally announced April 2021.
-
Average $2$-Torsion in Class Groups of Rings Associated to Binary $n$-ic Forms
Authors:
Ashvin Swaminathan
Abstract:
Let $n \geq 3$ be an integer. In this paper, we study the average behavior of the $2$-torsion in class groups of number fields cut out by integral binary $n$-ic forms having any fixed odd leading coefficient. Specifically, we compute upper bounds on the average size of the $2$-torsion in the class groups of such number fields. Conditional on a uniformity estimate, we further prove that each of these upper bounds is in fact an equality.
Our theorems extend recent work of Bhargava-Hanke-Shankar in the cubic case and of Siad in the monic case to binary forms of any degree with any fixed odd leading coefficient. When $n$ is odd, we find that fixing the leading coefficient increases the average $2$-torsion in the class group, relative to the prediction of Cohen-Lenstra-Martinet-Malle. When $n$ is even, such predictions are yet to be formulated; along with Siad's results in the monic case, our theorems are the first of their kind to describe the average behavior of the $p$-torsion in class groups of degree-$n$ fields where $p \mid n > 2$.
To prove these theorems, we first answer a question of Ellenberg by parametrizing square roots of the class of the inverse different of a ring cut out by a binary form in terms of the integral orbits of a certain coregular representation. This parametrization has a range of interesting applications, from studying $2$-parts of class groups to studying $2$-Selmer groups of hyperelliptic Jacobians.
Submitted 19 October, 2021; v1 submitted 27 November, 2020;
originally announced November 2020.
-
Provably Good Batch Reinforcement Learning Without Great Exploration
Authors:
Yao Liu,
Adith Swaminathan,
Alekh Agarwal,
Emma Brunskill
Abstract:
Batch reinforcement learning (RL) is important for applying RL algorithms to many high-stakes tasks. Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance. Recent algorithms have shown promise but can still be overly optimistic in their expected outcomes. Theoretical work that provides strong guarantees on the performance of the output policy relies on a strong concentrability assumption, which makes it unsuitable for cases where the ratio between state-action distributions of behavior policy and some candidate policies is large. This is because in the traditional analysis, the error bound scales up with this ratio. We show that a small modification to the Bellman optimality and evaluation back-ups to take a more conservative update can have much stronger guarantees. In certain settings, the resulting algorithms can find the approximately best policy within the state-action space explored by the batch data, without requiring a priori assumptions of concentrability. We highlight the necessity of our conservative update and the limitations of previous algorithms and analyses by illustrative MDP examples, and demonstrate an empirical comparison of our algorithm and other state-of-the-art batch RL baselines in standard benchmarks.
Submitted 22 July, 2020; v1 submitted 16 July, 2020;
originally announced July 2020.