Showing 1–50 of 157 results for author: Amini, A

  1. arXiv:2407.06057  [pdf, other]

    cs.CL cs.AI cs.LG

    Variational Best-of-N Alignment

    Authors: Afra Amini, Tim Vieira, Ryan Cotterell

    Abstract: Best-of-N (BoN) is a popular and effective algorithm for aligning language models to human preferences. The algorithm works as follows: at inference time, N samples are drawn from the language model, and the sample with the highest reward, as judged by a reward model, is returned as the output. Despite its effectiveness, BoN is computationally expensive; it reduces sampling throughput by a factor…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2406.15149  [pdf, other]

    cs.RO cs.AI cs.CV

    Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks

    Authors: Alex Quach, Makram Chahine, Alexander Amini, Ramin Hasani, Daniela Rus

    Abstract: Simulators are powerful tools for autonomous robot learning as they offer scalable data generation, flexible design, and optimization of trajectories. However, transferring behavior learned from simulation data into the real world proves to be difficult, usually mitigated with compute-heavy domain randomization methods or further model fine-tuning. We present a method to improve generalization and…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2406.13121  [pdf, other]

    cs.CL cs.AI cs.IR

    Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

    Authors: Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu

    Abstract: Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 29 pages. Dataset available at https://github.com/google-deepmind/loft

  4. arXiv:2406.10686  [pdf, other]

    cs.LG cs.AI stat.ML

    Graph Neural Thompson Sampling

    Authors: Shuang Wu, Arash A. Amini

    Abstract: We consider an online decision-making problem with a reward function defined over graph-structured data. We formally formulate the problem as an instance of graph action bandit. We then propose \texttt{GNN-TS}, a Graph Neural Network (GNN) powered Thompson Sampling (TS) algorithm which employs a GNN approximator for estimating the mean reward function and the graph neural tangent features for unce…

    Submitted 20 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  5. arXiv:2406.06349  [pdf, other]

    cs.IT

    ARMA Processes with Discrete-Continuous Excitation: Compressibility Beyond Sparsity

    Authors: Mohammad-Amin Charusaie, Stefano Rini, Arash Amini

    Abstract: Rényi Information Dimension (RID) plays a central role in quantifying the compressibility of random variables with singularities in their distribution, encompassing and extending beyond the class of sparse sources. The RID, from a high-level perspective, represents the average number of bits needed for coding the i.i.d. samples of a random variable with high precision. There are two main extension…

    Submitted 10 June, 2024; originally announced June 2024.

  6. arXiv:2406.06014  [pdf, other]

    math.ST cs.SI stat.ME stat.ML

    Network two-sample test for block models

    Authors: Chung Kyong Nguen, Oscar Hernan Madrid Padilla, Arash A. Amini

    Abstract: We consider the two-sample testing problem for networks, where the goal is to determine whether two sets of networks originated from the same stochastic model. Assuming no vertex correspondence and allowing for different numbers of nodes, we address a fundamental network testing problem that goes beyond simple adjacency matrix comparisons. We adopt the stochastic block model (SBM) for network dist…

    Submitted 10 June, 2024; originally announced June 2024.

  7. arXiv:2405.07890  [pdf, other]

    eess.SP cs.IT

    Subspace-Informed Matrix Completion

    Authors: Hamideh. Sadat Fazael Ardakani, Sajad Daei, Arash Amini, Mikael Skoglund, Gabor Fodor

    Abstract: In this work, we consider the matrix completion problem, where the objective is to reconstruct a low-rank matrix from a few observed entries. A commonly employed approach involves nuclear norm minimization. For this method to succeed, the number of observed entries needs to scale at least proportional to both the rank of the ground-truth matrix and the coherence parameter. While the only prior inf…

    Submitted 24 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2111.00235

  8. arXiv:2404.01924  [pdf, other]

    cs.CV

    Toward Efficient Visual Gyroscopes: Spherical Moments, Harmonics Filtering, and Masking Techniques for Spherical Camera Applications

    Authors: Yao Du, Carlos M. Mateo, Mirjana Maras, Tsun-Hsuan Wang, Marc Blanchon, Alexander Amini, Daniela Rus, Omar Tahri

    Abstract: Unlike a traditional gyroscope, a visual gyroscope estimates camera rotation through images. The integration of omnidirectional cameras, offering a larger field of view compared to traditional RGB cameras, has proven to yield more accurate and robust results. However, challenges arise in situations that lack features, have substantial noise causing significant errors, and where certain features in…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  9. arXiv:2404.01750  [pdf, other]

    cs.CV

    Exploring Latent Pathways: Enhancing the Interpretability of Autonomous Driving with a Variational Autoencoder

    Authors: Anass Bairouk, Mirjana Maras, Simon Herlin, Alexander Amini, Marc Blanchon, Ramin Hasani, Patrick Chareyre, Daniela Rus

    Abstract: Autonomous driving presents a complex challenge, which is usually addressed with artificial intelligence models that are end-to-end or modular in nature. Within the landscape of modular approaches, a bio-inspired neural circuit policy model has emerged as an innovative control module, offering a compact and inherently interpretable system to infer a steering wheel command from abstract visual feat…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  10. arXiv:2403.17240  [pdf, other]

    cs.CL

    The Role of $n$-gram Smoothing in the Age of Neural Networks

    Authors: Luca Malagutti, Andrius Buinovskij, Anej Svete, Clara Meister, Afra Amini, Ryan Cotterell

    Abstract: For nearly three decades, language models derived from the $n$-gram assumption held the state of the art on the task. The key to their success lay in the application of various smoothing techniques that served to combat overfitting. However, when neural language models toppled $n$-gram models as the best performers, $n$-gram smoothing techniques became less relevant. Indeed, it would hardly be an…

    Submitted 30 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: NAACL 2024

  11. arXiv:2403.12279  [pdf, other]

    cs.RO

    Scalable Networked Feature Selection with Randomized Algorithm for Robot Navigation

    Authors: Vivek Pandey, Arash Amini, Guangyi Liu, Ufuk Topcu, Qiyu Sun, Kostas Daniilidis, Nader Motee

    Abstract: We address the problem of sparse selection of visual features for localizing a team of robots navigating an unknown environment, where robots can exchange relative position measurements with neighbors. We select a set of the most informative features by anticipating their importance in robot localization by simulating trajectories of robots over a prediction horizon. Through theoretical proofs, w…

    Submitted 18 March, 2024; originally announced March 2024.

  12. arXiv:2403.10705  [pdf, other]

    cs.SI

    Susceptibility of Communities against Low-Credibility Content in Social News Websites

    Authors: Yigit Ege Bayiz, Arash Amini, Radu Marculescu, Ufuk Topcu

    Abstract: Social news websites, such as Reddit, have evolved into prominent platforms for sharing and discussing news. A key issue on social news websites is the formation of echo chambers, which often lead to the spread of highly biased or uncredible news. We develop a method to identify communities within a social news website that are prone to uncredible or highly biased news. We employ a user embe…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 11 pages, 2 figures, Under review in ICWSM 2024

  13. arXiv:2402.10938  [pdf, other]

    cs.CL cs.SI

    News Source Credibility Assessment: A Reddit Case Study

    Authors: Arash Amini, Yigit Ege Bayiz, Ashwin Ram, Radu Marculescu, Ufuk Topcu

    Abstract: In the era of social media platforms, identifying the credibility of online content is crucial to combat misinformation. We present the CREDiBERT (CREDibility assessment using Bi-directional Encoder Representations from Transformers), a source credibility assessment model fine-tuned for Reddit submissions focusing on political discourse as the main contribution. We adopt a semi-supervised training…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 12 pages; 3 figures

  14. arXiv:2402.10571  [pdf, other]

    cs.CL cs.AI cs.LG

    Direct Preference Optimization with an Offset

    Authors: Afra Amini, Tim Vieira, Ryan Cotterell

    Abstract: Direct preference optimization (DPO) is a successful fine-tuning strategy for aligning large language models with human preferences without the need to train a reward model or employ reinforcement learning. DPO, as originally formulated, relies on binary preference data and fine-tunes a language model to increase the likelihood of a preferred response over a dispreferred response. However, not all…

    Submitted 6 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  15. arXiv:2401.07742  [pdf, ps, other]

    math.GR

    On pure subrings of sp-groups

    Authors: A. Amini, B. Amini, E. Momtahan

    Abstract: Let $G$ be an sp-group such that for every prime $p$, $G_p$ is elementary. Suppose that $\frac{G}{\oplus_{p\in \mathbb{P}} G_p}$ is torsion-free divisible. We show that $\operatorname{End}_{\mathbb{Z}}(G)$ is an sp-group and every subring $R$ of…

    Submitted 15 January, 2024; originally announced January 2024.

  16. arXiv:2401.05857   

    eess.SY

    Secure Dynamic Event-triggered Consensus Under Asynchronous Denial of Service

    Authors: Ali Azarbahram, Amir Amini

    Abstract: This article proposes a secure implementation for consensus using a dynamic event-triggered (DET) communication scheme in high-order nonlinear multi-agent systems (MAS) under asynchronous (distributed) denial of service (DoS) attacks. By introducing a linear auxiliary trajectory of the system, the DET data transmission scheme among the neighboring agents is employed to reduce the communication for…

    Submitted 26 February, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: This work needs to be revised fundamentally with a greater emphasis on the nonlinear dynamics and the destructive effects of independent DoS attacks over the communication links between actual and auxiliary states

  17. arXiv:2312.17710  [pdf, other]

    cs.CL cs.LG

    Principled Gradient-based Markov Chain Monte Carlo for Text Generation

    Authors: Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell

    Abstract: Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence. However, as we show in this paper, previous attempts at this approach to text generation all fail to sample correctly from the target language model distributions. To address this limitation, we consider the pr…

    Submitted 29 December, 2023; originally announced December 2023.

    Comments: Preprint

  18. arXiv:2312.16940  [pdf, other]

    cs.LG eess.SP

    Joint Signal Recovery and Graph Learning from Incomplete Time-Series

    Authors: Amirhossein Javaheri, Arash Amini, Farokh Marvasti, Daniel P. Palomar

    Abstract: Learning a graph from data is the key to taking advantage of graph signal processing tools. Most of the conventional algorithms for graph learning require complete data statistics, which might not be available in some scenarios. In this work, we aim to learn a graph from incomplete time-series observations. From another viewpoint, we consider the problem of semi-blind recovery of time-varying grap…

    Submitted 28 December, 2023; originally announced December 2023.

  19. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  20. arXiv:2311.15451  [pdf, other]

    cs.CL cs.LG

    Uncertainty-aware Language Modeling for Selective Question Answering

    Authors: Qi Yang, Shreya Ravikumar, Fynn Schmitt-Ulms, Satvik Lolla, Ege Demir, Iaroslav Elistratov, Alex Lavaee, Sadhana Lolla, Elaheh Ahmadi, Daniela Rus, Alexander Amini, Alejandro Perez

    Abstract: We present an automatic large language model (LLM) conversion approach that produces uncertainty-aware LLMs capable of estimating uncertainty with every prediction. Our approach is model- and data-agnostic, is computationally-efficient, and does not rely on external models or systems. We evaluate converted models on the selective question answering setting -- to answer as many questions as possibl…

    Submitted 26 November, 2023; originally announced November 2023.

  21. arXiv:2311.05756  [pdf, other]

    math.ST stat.AP stat.ML

    Step and Smooth Decompositions as Topological Clustering

    Authors: Luciano Vinas, Arash A. Amini

    Abstract: We investigate a class of recovery problems for which observations are a noisy combination of continuous and step functions. These problems can be seen as non-injective instances of non-linear ICA with direct applications to image decontamination for magnetic resonance imaging. Alternately, the problem can be viewed as clustering in the presence of structured (smooth) contaminant. We show that a g…

    Submitted 9 November, 2023; originally announced November 2023.

  22. arXiv:2311.05003  [pdf, ps, other]

    eess.SP cs.IT

    Harmonic Retrieval Using Weighted Lifted-Structure Low-Rank Matrix Completion

    Authors: Mohammad Bokaei, Saeed Razavikia, Stefano Rini, Arash Amini, Hamid Behrouzi

    Abstract: In this paper, we investigate the problem of recovering the frequency components of a mixture of $K$ complex sinusoids from a random subset of $N$ equally-spaced time-domain samples. Because of the random subset, the samples are effectively non-uniform. Besides, the frequency values of each of the $K$ complex sinusoids are assumed to vary continuously within a given range. For this problem, we p…

    Submitted 8 November, 2023; originally announced November 2023.

  23. arXiv:2310.17642  [pdf, other]

    cs.RO cs.CV cs.LG

    Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models

    Authors: Tsun-Hsuan Wang, Alaa Maalouf, Wei Xiao, Yutong Ban, Alexander Amini, Guy Rosman, Sertac Karaman, Daniela Rus

    Abstract: As autonomous driving technology matures, end-to-end methodologies have emerged as a leading strategy, promising seamless integration from perception to control via deep learning. However, existing systems grapple with challenges such as unexpected open set environments and the complexity of black-box models. At the same time, the evolution of deep learning introduces larger, multimodal foundation…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Project webpage: https://drive-anywhere.github.io Explainer video: https://www.youtube.com/watch?v=4n-DJf8vXxo&feature=youtu.be

  24. arXiv:2310.12021  [pdf, other]

    math.OC eess.SY

    Data-Driven Distributionally Robust Mitigation of Risk of Cascading Failures

    Authors: Guangyi Liu, Arash Amini, Vivek Pandey, Nader Motee

    Abstract: We introduce a novel data-driven method to mitigate the risk of cascading failures in delayed discrete-time Linear Time-Invariant (LTI) systems. Our approach involves formulating a distributionally robust finite-horizon optimal control problem, where the objective is to minimize a given performance function while satisfying a set of distributionally chances constraints on cascading failures, which…

    Submitted 18 October, 2023; originally announced October 2023.

  25. arXiv:2310.11276  [pdf, other]

    eess.IV cs.CV

    Video Super-Resolution Using a Grouped Residual in Residual Network

    Authors: MohammadHossein Ashoori, Arash Amini

    Abstract: Super-resolution (SR) is the technique of increasing the nominal resolution of image / video content accompanied with quality improvement. Video super-resolution (VSR) can be considered as the generalization of single image super-resolution (SISR). This generalization should be such that more detail is created in the output using adjacent input frames. In this paper, we propose a grouped residual…

    Submitted 17 October, 2023; originally announced October 2023.

  26. arXiv:2310.05250  [pdf, other]

    cs.LG stat.ML

    Simplifying GNN Performance with Low Rank Kernel Models

    Authors: Luciano Vinas, Arash A. Amini

    Abstract: We revisit recent spectral GNN approaches to semi-supervised node classification (SSNC). We posit that many of the current GNN architectures may be over-engineered. Instead, simpler, traditional methods from nonparametric estimation, applied in the spectral domain, could replace many deep-learning inspired GNN designs. These conventional techniques appear to be well suited for a variety of graph t…

    Submitted 8 October, 2023; originally announced October 2023.

  27. arXiv:2310.02932  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Assessing Large Language Models on Climate Information

    Authors: Jannis Bulian, Mike S. Schäfer, Afra Amini, Heidi Lam, Massimiliano Ciaramita, Ben Gaiarin, Michelle Chen Hübscher, Christian Buck, Niels G. Mede, Markus Leippold, Nadine Strauß

    Abstract: As Large Language Models (LLMs) rise in popularity, it is necessary to assess their capability in critically relevant domains. We present a comprehensive evaluation framework, grounded in science communication research, to assess LLM responses to questions about climate change. Our framework emphasizes both presentational and epistemological adequacy, offering a fine-grained analysis of LLM genera…

    Submitted 28 May, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

  28. arXiv:2309.04922  [pdf, other]

    eess.SY

    Quantification of Distributionally Robust Risk of Cascade of Failures in Platoon of Vehicles

    Authors: Vivek Pandey, Guangyi Liu, Arash Amini, Nader Motee

    Abstract: Achieving safety is a critical aspect of attaining autonomy in a platoon of autonomous vehicles. In this paper, we propose a distributionally robust risk framework to investigate cascading failures in platoons. To examine the impact of network connectivity and system dynamics on the emergence of cascading failures, we consider a time-delayed network model of the platoon of vehicles as a benchmark.…

    Submitted 9 September, 2023; originally announced September 2023.

  29. arXiv:2308.00231  [pdf, other]

    cs.LG cs.AI

    Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks

    Authors: Sadhana Lolla, Iaroslav Elistratov, Alejandro Perez, Elaheh Ahmadi, Daniela Rus, Alexander Amini

    Abstract: The modern pervasiveness of large-scale deep neural networks (NNs) is driven by their extraordinary performance on complex problems but is also plagued by their sudden, unexpected, and often catastrophic failures, particularly on challenging scenarios. Existing algorithms that provide risk-awareness to NNs are complex and ad-hoc. Specifically, these methods require significant engineering changes,…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Neural Information Processing Systems (NeurIPS) 2022. Workshop on Machine Learning for Autonomous Driving (ML4AD)

    Journal ref: Neural Information Processing Systems (NeurIPS) 2022. Workshop on Machine Learning for Autonomous Driving (ML4AD)

  30. arXiv:2307.13503  [pdf, other]

    cs.LG stat.ML

    Continuous Time Evidential Distributions for Irregular Time Series

    Authors: Taylor W. Killian, Haoran Zhang, Thomas Hartvigsen, Ava P. Amini

    Abstract: Prevalent in many real-world settings such as healthcare, irregular time series are challenging to formulate predictions from. It is difficult to infer the value of a feature at any given time when observations are sporadic, as it could take on a range of values depending on when it was last observed. To characterize this uncertainty we present EDICT, a strategy that learns an evidential distribut…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: ICML 2023 Workshop on Interpretable Machine Learning in Healthcare. Code is available at https://github.com/twkillian/EDICT

  31. arXiv:2307.11550  [pdf, other]

    cs.CV

    YOLOPose V2: Understanding and Improving Transformer-based 6D Pose Estimation

    Authors: Arul Selvam Periyasamy, Arash Amini, Vladimir Tsaturyan, Sven Behnke

    Abstract: 6D object pose estimation is a crucial prerequisite for autonomous robot manipulation applications. The state-of-the-art models for pose estimation are convolutional neural network (CNN)-based. Lately, the Transformer, an architecture originally proposed for natural language processing, is achieving state-of-the-art results in many computer vision tasks as well. Equipped with the multi-head self-atte…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: Robotics and Autonomous Systems Journal, Elsevier, to appear 2023. arXiv admin note: substantial text overlap with arXiv:2205.02536

  32. arXiv:2307.09210  [pdf, other]

    stat.ME cs.SI stat.ML

    Nested stochastic block model for simultaneously clustering networks and nodes

    Authors: Nathaniel Josephs, Arash A. Amini, Marina Paez, Lizhen Lin

    Abstract: We introduce the nested stochastic block model (NSBM) to cluster a collection of networks while simultaneously detecting communities within each network. NSBM has several appealing features including the ability to work on unlabeled networks with potentially different node sets, the flexibility to model heterogeneous communities, and the means to automatically select the number of classes for the…

    Submitted 18 July, 2023; originally announced July 2023.

  33. arXiv:2306.12146  [pdf, other]

    cs.CL cs.HC

    Which Spurious Correlations Impact Reasoning in NLI Models? A Visual Interactive Diagnosis through Data-Constrained Counterfactuals

    Authors: Robin Chan, Afra Amini, Mennatallah El-Assady

    Abstract: We present a human-in-the-loop dashboard tailored to diagnosing potential spurious features that NLI models rely on for predictions. The dashboard enables users to generate diverse and challenging examples by drawing inspiration from GPT-3 suggestions. Additionally, users can receive feedback from a trained NLI model on how challenging the newly created example is and make refinements based on the…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: 7 pages, Accepted at ACL 2023: System Demonstrations

  34. arXiv:2306.05477  [pdf, other]

    cs.CL cs.AI

    Hexatagging: Projective Dependency Parsing as Tagging

    Authors: Afra Amini, Tianyu Liu, Ryan Cotterell

    Abstract: We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags. In contrast to many approaches to dependency parsing, our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other. A…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: accepted at ACL 2023

  35. arXiv:2306.03061  [pdf, other]

    cs.CL cs.AI

    Structured Voronoi Sampling

    Authors: Afra Amini, Li Du, Ryan Cotterell

    Abstract: Gradient-based sampling algorithms have demonstrated their effectiveness in text generation, especially in the context of controlled text generation. However, there exists a lack of theoretically grounded and principled approaches for this task. In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods. We use discr…

    Submitted 6 June, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

  36. arXiv:2305.15057  [pdf, other]

    cs.CL

    Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective

    Authors: Tianyu Liu, Afra Amini, Mrinmaya Sachan, Ryan Cotterell

    Abstract: Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language. Such tasks, in general, require exhaustive pair-wise comparisons of tokens, thus having a quadratic runtime complexity in the length of the string. We show that these exhaustive comparisons can be avoided, and, moreover, the complexity of such tasks can be reduced to linear by cast…

    Submitted 12 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023, 23 pages

  37. arXiv:2305.14171  [pdf, other]

    cs.CL

    In-Context Probing: Toward Building Robust Classifiers via Probing Large Language Models

    Authors: Afra Amini, Massimiliano Ciaramita

    Abstract: Large language models are able to learn new tasks in context, where they are provided with instructions and a few annotated examples. However, the effectiveness of in-context learning is dependent on the provided context, and the performance on a downstream task can vary considerably, depending on the instruction. Importantly, such dependency on the context can surface in unpredictable ways, e.g.,…

    Submitted 22 December, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  38. arXiv:2304.02733  [pdf, other]

    cs.RO cs.LG eess.SY

    Learning Stability Attention in Vision-based End-to-end Driving Policies

    Authors: Tsun-Hsuan Wang, Wei Xiao, Makram Chahine, Alexander Amini, Ramin Hasani, Daniela Rus

    Abstract: Modern end-to-end learning systems can learn to explicitly infer control from perception. However, it is difficult to guarantee stability and robustness for these systems since they are often exposed to unstructured, high-dimensional, and complex observation spaces (e.g., autonomous driving from a stream of pixel inputs). We propose to leverage control Lyapunov functions (CLFs) to equip end-to-end…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: First two authors contributed equally; L4DC 2023

  39. arXiv:2302.02956  [pdf, other]

    cs.RO

    RoboCup 2022 AdultSize Winner NimbRo: Upgraded Perception, Capture Steps Gait and Phase-based In-walk Kicks

    Authors: Dmytro Pavlichenko, Grzegorz Ficht, Arash Amini, Mojtaba Hosseini, Raphael Memmesheimer, Angel Villar-Corrales, Stefan M. Schulz, Marcell Missura, Maren Bennewitz, Sven Behnke

    Abstract: Beating the human world champions by 2050 is an ambitious goal of the Humanoid League that provides a strong incentive for RoboCup teams to further improve and develop their systems. In this paper, we present upgrades of our system which enabled our team NimbRo to win the Soccer Tournament, the Drop-in Games, and the Technical Challenges in the Humanoid AdultSize League of RoboCup 2022. Strong per…

    Submitted 7 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Journal ref: In: RoboCup 2022: Robot World Cup XXV. LNCS 13561, Springer, May 2023

  40. arXiv:2302.01428  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation

    Authors: Noel Loo, Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus

    Abstract: Modern deep learning requires large volumes of data, which could contain sensitive or private information that cannot be leaked. Recent work has shown for homogeneous neural networks a large portion of this training data could be reconstructed with only access to the trained network parameters. While the attack was shown to work empirically, there exists little formal understanding of its effectiv…

    Submitted 9 November, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

  41. arXiv:2301.06974  [pdf, other]

    cs.IR

    Towards Improving the Explainability of Text-based Information Retrieval with Knowledge Graphs

    Authors: Boqi Chen, Kua Chen, Yujing Yang, Afshin Amini, Bharat Saxena, Cecilia Chávez-García, Majid Babaei, Amir Feizpour, Dániel Varró

    Abstract: Thanks to recent advancements in machine learning, vector-based methods have been adopted in many modern information retrieval (IR) systems. While showing promising retrieval performance, these approaches typically fail to explain why a particular document is retrieved as a query result to address explainable information retrieval (XIR). Knowledge graphs record structured information about entities…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 7 pages, The 1st Workshop on Trustworthy Learning on Graphs (TrustLOG)

  42. arXiv:2211.11069  [pdf, other]

    eess.SY cs.LG math.DS math.ST

    Learning Nonlinear Couplings in Network of Agents from a Single Sample Trajectory

    Authors: Arash Amini, Qiyu Sun, Nader Motee

    Abstract: We consider a class of stochastic dynamical networks whose governing dynamics can be modeled using a coupling function. It is shown that the dynamics of such networks can generate geometrically ergodic trajectories under some reasonable assumptions. We show that a general class of coupling functions can be learned using only one sample trajectory from the network. This is practically plausible as…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: 15 pages, 5 figures

    MSC Class: 93E35 (Primary) 93B70; 47H25 (Secondary)

  43. arXiv:2211.07344  [pdf, other]

    cs.CL

    On Parsing as Tagging

    Authors: Afra Amini, Ryan Cotterell

    Abstract: There have been many proposals to reduce constituency parsing to tagging in the literature. To better understand what these approaches have in common, we cast several existing proposals into a unifying pipeline consisting of three steps: linearization, learning, and decoding. In particular, we show how to reduce tetratagging, a state-of-the-art constituency tagger, to shift-reduce parsing by perf…

    Submitted 20 November, 2022; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: Will appear in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

  44. arXiv:2210.12067  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Efficient Dataset Distillation Using Random Feature Approximation

    Authors: Noel Loo, Ramin Hasani, Alexander Amini, Daniela Rus

    Abstract: Dataset distillation compresses large datasets into smaller synthetic coresets which retain performance with the aim of reducing the storage and computational burden of processing the entire dataset. Today's best-performing algorithm, Kernel Inducing Points (KIP), which makes use of the correspondence between infinite-width neural networks and kernel-ridge regression, is prohibitively slo…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to the Conference on the Advances in Neural Information Processing Systems (NeurIPS) 2022

  45. arXiv:2210.12030  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Evolution of Neural Tangent Kernels under Benign and Adversarial Training

    Authors: Noel Loo, Ramin Hasani, Alexander Amini, Daniela Rus

    Abstract: Two key challenges facing modern deep learning are mitigating deep networks' vulnerability to adversarial attacks and understanding deep learning's generalization capabilities. Towards the first issue, many defense strategies have been developed, with the most common being Adversarial Training (AT). Towards the second challenge, one of the dominant theories that has emerged is the Neural Tangent K…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to the Conference on Advances in Neural Information Processing Systems (NeurIPS) 2022

  46. arXiv:2210.04303  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE cs.RO

    Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap

    Authors: Mathias Lechner, Ramin Hasani, Alexander Amini, Tsun-Hsuan Wang, Thomas A. Henzinger, Daniela Rus

    Abstract: There is an ever-growing zoo of modern neural network models that can efficiently learn end-to-end control from visual observations. These advanced deep models, ranging from convolutional to patch-based networks, have been extensively tested on offline image classification and regression tasks. In this paper, we study these vision architectures with respect to the open-loop to closed-loop causalit…

    Submitted 9 October, 2022; originally announced October 2022.

  47. arXiv:2209.15611  [pdf, other]

    q-bio.BM cs.AI

    Protein structure generation via folding diffusion

    Authors: Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. Zou, Alex X. Lu, Ava P. Amini

    Abstract: The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model th…

    Submitted 23 November, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

    ACM Class: I.2.0; J.3

  48. arXiv:2209.13619  [pdf, other]

    physics.med-ph cs.CV stat.AP

    LapGM: A Multisequence MR Bias Correction and Normalization Model

    Authors: Luciano Vinas, Arash A. Amini, Jade Fischer, Atchar Sudhyadhom

    Abstract: A spatially regularized Gaussian mixture model, LapGM, is proposed for the bias field correction and magnetic resonance normalization problem. The proposed spatial regularizer gives practitioners fine-tuned control over the balance between removing the bias field and preserving image contrast for multi-sequence magnetic resonance images. The fitted Gaussian parameters of LapGM serve as control v…

    Submitted 27 September, 2022; originally announced September 2022.

  49. arXiv:2209.12951  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.NE

    Liquid Structural State-Space Models

    Authors: Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus

    Abstract: A proper parametrization of state transition matrices of linear state-space models (SSMs) followed by standard nonlinearities enables them to efficiently learn representations from sequential data, establishing the state-of-the-art on a large series of long-range sequence modeling benchmarks. In this paper, we show that we can improve further when the structural SSM such as S4 is given by a linear…

    Submitted 26 September, 2022; originally announced September 2022.

  50. arXiv:2207.07755  [pdf, other]

    math.DS eess.SY

    Carleman Linearization of Nonlinear Systems and Its Finite-Section Approximations

    Authors: Arash Amini, Cong Zheng, Qiyu Sun, Nader Motee

    Abstract: The Carleman linearization is one of the mainstream approaches to lift a finite-dimensional nonlinear dynamical system into an infinite-dimensional linear system with the promise of providing accurate approximations of the original nonlinear system over larger regions around the equilibrium for longer time horizons with respect to the conventional first-order linearization approach. Finite-section…

    Submitted 19 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

    Comments: 25 Pages, 10 figures

    MSC Class: 34H05 (Primary); 65P99; 37M99 (Secondary)

    ACM Class: G.1.7; G.1.2