Showing 1–44 of 44 results for author: Mehri, S

  1. arXiv:2311.17376  [pdf, other]

    cs.CL

    CESAR: Automatic Induction of Compositional Instructions for Multi-turn Dialogs

    Authors: Taha Aksu, Devamanyu Hazarika, Shikib Mehri, Seokhwan Kim, Dilek Hakkani-Tür, Yang Liu, Mahdi Namazifar

    Abstract: Instruction-based multitasking has played a critical role in the success of large language models (LLMs) in multi-turn dialog applications. While publicly available LLMs have shown promising performance, when exposed to complex instructions with multiple constraints, they lag against state-of-the-art models like ChatGPT. In this work, we hypothesize that the availability of large-scale complex dem…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023

  2. arXiv:2311.14543  [pdf, other]

    cs.CL cs.AI

    Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language

    Authors: Di Jin, Shikib Mehri, Devamanyu Hazarika, Aishwarya Padmakumar, Sungjin Lee, Yang Liu, Mahdi Namazifar

    Abstract: Learning from human feedback is a prominent technique to align the output of large language models (LLMs) with human expectations. Reinforcement learning from human feedback (RLHF) leverages human preference signals that are in the form of ranking of response pairs to perform this alignment. However, human preference on LLM outputs can come in much richer forms including natural language, which ma…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted by Workshop on Instruction Tuning and Instruction Following at NeurIPS 2023, Submitted to AAAI 2024

  3. arXiv:2310.20072  [pdf, other]

    cs.CL cs.LG

    Automatic Evaluation of Generative Models with Instruction Tuning

    Authors: Shuhaib Mehri, Vered Shwartz

    Abstract: Automatic evaluation of natural language generation has long been an elusive goal in NLP. A recent paradigm fine-tunes pre-trained language models to emulate human judgements for a particular task and evaluation criterion. Inspired by the generalization ability of instruction-tuned models, we propose a learned metric based on instruction tuning. To test our approach, we collected HEAP, a dataset of…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 11 pages, 1 figure

  4. arXiv:2304.10162  [pdf, other]

    math.PR cs.PF

    Poly-Exp Bounds in Tandem Queues

    Authors: Florin Ciucu, Sima Mehri

    Abstract: When the arrival processes are Poisson, queueing networks are well-understood in terms of the product-form structure of the number of jobs $N_i$ at the individual queues; much less is known about the waiting time $W$ across the whole network. In turn, for non-Poisson arrivals, little is known about either $N_i$'s or $W$. This paper considers a tandem network…

    Submitted 20 April, 2023; originally announced April 2023.

  5. arXiv:2303.03069  [pdf, other]

    cond-mat.soft cond-mat.mes-hall cond-mat.mtrl-sci

    Hidden scale invariance in the Gay-Berne model. II. Smectic B phase

    Authors: Saeed Mehri, Jeppe C. Dyre, Trond S. Ingebrigtsen

    Abstract: This paper complements a previous study of the isotropic and nematic phases of the Gay-Berne liquid-crystal model [Mehri et al., Phys. Rev. E 105, 064703 (2022)] with a study of its smectic B phase found at high density and low temperatures. We find also in this phase strong correlations between the virial and potential-energy thermal fluctuations, reflecting hidden scale invariance and implying t…

    Submitted 6 April, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Journal ref: Phys. Rev. E 107, 044702 (2023)

  6. arXiv:2301.12004  [pdf, other]

    cs.CL

    Understanding the Effectiveness of Very Large Language Models on Dialog Evaluation

    Authors: Jessica Huynh, Cathy Jiao, Prakhar Gupta, Shikib Mehri, Payal Bajaj, Vishrav Chaudhary, Maxine Eskenazi

    Abstract: Language models have steadily increased in size over the past few years. They achieve a high level of performance on various natural language processing (NLP) tasks such as question answering and summarization. Large language models (LLMs) have been used for generation and can now output human-like text. Due to this, there are other downstream tasks in the realm of dialog that can now harness the…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: Accepted for publication at IWSDS 2023

  7. arXiv:2208.10918  [pdf, other]

    cs.HC cs.AI cs.CL

    The DialPort tools

    Authors: Jessica Huynh, Shikib Mehri, Cathy Jiao, Maxine Eskenazi

    Abstract: The DialPort project http://dialport.org/, funded by the National Science Foundation (NSF), covers a group of tools and services that aim at fulfilling the needs of the dialog research community. Over the course of six years, several offerings have been created, including the DialPort Portal and DialCrowd. This paper describes these contributions, which will be demoed at SIGDIAL, including impleme…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted to SIGDIAL 2022

  8. arXiv:2207.14403  [pdf, other]

    cs.CL

    Interactive Evaluation of Dialog Track at DSTC9

    Authors: Shikib Mehri, Yulan Feng, Carla Gordon, Seyed Hossein Alavi, David Traum, Maxine Eskenazi

    Abstract: The ultimate goal of dialog research is to develop systems that can be effectively used in interactive settings by real users. To this end, we introduced the Interactive Evaluation of Dialog Track at the 9th Dialog System Technology Challenge. This track consisted of two sub-tasks. The first sub-task involved building knowledge-grounded response generation models. The second sub-task aimed to exte…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Presented at LREC 2022 and DSTC9 Workshop at AAAI 2021

  9. arXiv:2207.14393  [pdf, other]

    cs.CL cs.AI

    LAD: Language Models as Data for Zero-Shot Dialog

    Authors: Shikib Mehri, Yasemin Altun, Maxine Eskenazi

    Abstract: To facilitate zero-shot generalization in task-oriented dialog, this paper proposes Language Models as Data (LAD). LAD is a paradigm for creating diverse and accurate synthetic data which conveys the necessary structural constraints and can be used to train a downstream neural dialog model. LAD leverages GPT-3 to induce linguistic diversity. LAD achieves significant performance gains in zero-shot s…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Accepted as a long paper to SIGDial 2022

  10. arXiv:2206.05131  [pdf, other]

    cond-mat.soft cond-mat.dis-nn cond-mat.mtrl-sci

    Single-parameter aging in the weakly nonlinear limit

    Authors: Saeed Mehri, Lorenzo Costigliola, Jeppe C. Dyre

    Abstract: Physical aging deals with slow property changes over time caused by molecular rearrangements. This is relevant for non-crystalline materials like polymers and inorganic glasses, both in production and during subsequent use. The Narayanaswamy theory from 1971 describes physical aging - an inherently nonlinear phenomenon - in terms of a linear convolution integral over the so-called material time…

    Submitted 6 July, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

    Journal ref: Thermo 2, 160 (2022) [Open access]

  11. arXiv:2205.12673  [pdf, other]

    cs.CL

    InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning

    Authors: Prakhar Gupta, Cathy Jiao, Yi-Ting Yeh, Shikib Mehri, Maxine Eskenazi, Jeffrey P. Bigham

    Abstract: Instruction tuning is an emergent paradigm in NLP wherein natural language instructions are leveraged with language models to induce zero-shot performance on unseen tasks. Instructions have been shown to enable good performance on unseen tasks and datasets in both large and small language models. Dialogue is an especially interesting area to explore instruction tuning because dialogue systems perf…

    Submitted 26 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  12. Hidden scale invariance in the Gay-Berne model

    Authors: Saeed Mehri, Jeppe C. Dyre, Trond S. Ingebrigtsen

    Abstract: This paper presents a numerical study of the Gay-Berne liquid crystal model with parameters corresponding to calamitic (rod-shaped) molecules. The focus is on the isotropic and nematic phases at temperatures above unity. There we find strong correlations between the virial and potential-energy thermal fluctuations, reflecting the hidden-scale invariance symmetry. This implies the existence of isom…

    Submitted 19 June, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Journal ref: Phys. Rev. E 105, 064703 (2022)

  13. arXiv:2203.10012  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges

    Authors: Shikib Mehri, Jinho Choi, Luis Fernando D'Haro, Jan Deriu, Maxine Eskenazi, Milica Gasic, Kallirroi Georgila, Dilek Hakkani-Tur, Zekang Li, Verena Rieser, Samira Shaikh, David Traum, Yi-Ting Yeh, Zhou Yu, Yizhe Zhang, Chen Zhang

    Abstract: This is a report on the NSF Future Directions Workshop on Automatic Evaluation of Dialog. The workshop explored the current state of the art along with its limitations and suggested promising directions for future work in this important and very rapidly changing area of research.

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: Report from the NSF AED Workshop (http://dialrc.org/AED/)

  14. Predicting nonlinear physical aging of glasses from equilibrium relaxation via the material time

    Authors: Birte Riechers, Lisa A. Roed, Saeed Mehri, Trond S. Ingebrigtsen, Tina Hecksher, Jeppe C. Dyre, Kristine Niss

    Abstract: The noncrystalline glassy state of matter plays a role in virtually all fields of materials science and offers complementary properties to those of the crystalline counterpart. The caveat of the glassy state is that it is out of equilibrium and therefore exhibits physical aging, i.e., material properties change over time. For half a century, the physical aging of glasses has been known to be descr…

    Submitted 21 March, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

    Comments: Published in Science Advances

    Journal ref: Sci. Adv. 8, eabl9809 (2022) [Open access]

  15. arXiv:2106.07056  [pdf, other]

    cs.CL cs.AI cs.LG

    Schema-Guided Paradigm for Zero-Shot Dialog

    Authors: Shikib Mehri, Maxine Eskenazi

    Abstract: Developing mechanisms that flexibly adapt dialog systems to unseen tasks and domains is a major challenge in dialog research. Neural models implicitly memorize task-specific dialog policies from the training data. We posit that this implicit memorization has precluded zero-shot transfer learning. To this end, we leverage the schema-guided paradigm, wherein the task-specific dialog policy is explic…

    Submitted 13 June, 2021; originally announced June 2021.

    Comments: Accepted at SIGDial 2021

  16. arXiv:2106.07055  [pdf, other]

    cs.CL cs.AI cs.LG

    GenSF: Simultaneous Adaptation of Generative Pre-trained Models and Slot Filling

    Authors: Shikib Mehri, Maxine Eskenazi

    Abstract: In transfer learning, it is imperative to achieve strong alignment between a pre-trained model and a downstream task. Prior work has done this by proposing task-specific pre-training objectives, which sacrifices the inherent scalability of the transfer learning paradigm. We instead achieve strong alignment by simultaneously modifying both the pre-trained model and the formulation of the downstream…

    Submitted 13 June, 2021; originally announced June 2021.

    Comments: Accepted at SIGDial 2021

  17. arXiv:2106.03706  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Assessment of Dialog Evaluation Metrics

    Authors: Yi-Ting Yeh, Maxine Eskenazi, Shikib Mehri

    Abstract: Automatic evaluation metrics are a crucial component of dialog systems research. Standard language evaluation metrics are known to be ineffective for evaluating dialog. As such, recent research has proposed a number of novel, dialog-specific metrics that correlate better with human judgements. Due to the fast pace of research, many of these metrics have been assessed on different datasets and ther…

    Submitted 7 July, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

  18. arXiv:2103.02650  [pdf, other]

    cs.LG

    Successor Feature Sets: Generalizing Successor Representations Across Policies

    Authors: Kianté Brantley, Soroush Mehri, Geoffrey J. Gordon

    Abstract: Successor-style representations have many advantages for reinforcement learning: for example, they can help an agent generalize from past experience to new goals, and they have been proposed as explanations of behavioral and neural data from human and animal learners. They also form a natural bridge between model-based and model-free RL methods: like the former they make predictions about future e…

    Submitted 15 March, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

  19. arXiv:2012.00358  [pdf, other]

    cond-mat.soft cond-mat.mtrl-sci cond-mat.stat-mech

    Single-parameter aging in a binary Lennard-Jones system

    Authors: Saeed Mehri, Trond S. Ingebrigtsen, Jeppe C. Dyre

    Abstract: This paper studies physical aging by computer simulations of a 2:1 Kob-Andersen binary Lennard-Jones mixture, a system that is less prone to crystallization than the standard 4:1 composition. Starting from thermal-equilibrium states, the time evolution of the following four quantities is monitored following up and down jumps in temperature: the potential energy, the virial, the average squared for…

    Submitted 22 January, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Journal ref: J. Chem. Phys. 154, 094504 (2021)

  20. arXiv:2011.06486  [pdf, ps, other]

    cs.CL

    Overview of the Ninth Dialog System Technology Challenge: DSTC9

    Authors: Chulaka Gunasekara, Seokhwan Kim, Luis Fernando D'Haro, Abhinav Rastogi, Yun-Nung Chen, Mihail Eric, Behnam Hedayatnia, Karthik Gopalakrishnan, Yang Liu, Chao-Wei Huang, Dilek Hakkani-Tür, Jinchao Li, Qi Zhu, Lingxiao Luo, Lars Liden, Kaili Huang, Shahin Shayandeh, Runze Liang, Baolin Peng, Zheng Zhang, Swadheen Shukla, Minlie Huang, Jianfeng Gao, Shikib Mehri, Yulan Feng, et al. (14 additional authors not shown)

    Abstract: This paper introduces the Ninth Dialog System Technology Challenge (DSTC-9). This edition of the DSTC focuses on applying end-to-end dialog technologies for four distinct tasks in dialog systems, namely, 1. Task-oriented dialog Modeling with unstructured knowledge access, 2. Multi-domain task-oriented dialog, 3. Interactive evaluation of dialog, and 4. Situated interactive multi-modal dialog. This…

    Submitted 12 November, 2020; originally announced November 2020.

  21. arXiv:2011.00669  [pdf, other]

    cs.CL

    Reasoning Over History: Context Aware Visual Dialog

    Authors: Muhammad A. Shah, Shikib Mehri, Tejas Srinivasan

    Abstract: While neural models have been shown to exhibit strong performance on single-turn visual question answering (VQA) tasks, extending VQA to a multi-turn, conversational setting remains a challenge. One way to address this challenge is to augment existing strong neural VQA models with the mechanisms that allow them to retain information from previous dialog turns. One strong VQA model is the MAC netwo…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: Accepted to NLP Beyond Text workshop, EMNLP 2020

  22. arXiv:2010.11853  [pdf, other]

    cs.CL

    STAR: A Schema-Guided Dialog Dataset for Transfer Learning

    Authors: Johannes E. M. Mosig, Shikib Mehri, Thomas Kober

    Abstract: We present STAR, a schema-guided task-oriented dialog dataset consisting of 127,833 utterances and knowledge base queries across 5,820 task-oriented dialogs in 13 domains that is especially designed to facilitate task and domain transfer learning in task-oriented dialog. Furthermore, we propose a scalable crowd-sourcing paradigm to collect arbitrarily large datasets of the same quality as STAR. Mo…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: Equal contribution: Johannes E. M. Mosig, Shikib Mehri

  23. arXiv:2010.08684  [pdf, other]

    cs.CL cs.AI

    Example-Driven Intent Prediction with Observers

    Authors: Shikib Mehri, Mihail Eric

    Abstract: A key challenge of dialog systems research is to effectively and efficiently adapt to new domains. A scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few-shot settings. In this paper, we focus on the intent classification problem which aims to identify user intents given utterances addressed to the dialog system. We propose two approaches f…

    Submitted 24 May, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  24. arXiv:2009.13570  [pdf, ps, other]

    cs.CL cs.AI

    DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue

    Authors: Shikib Mehri, Mihail Eric, Dilek Hakkani-Tur

    Abstract: A long-standing goal of task-oriented dialogue research is the ability to flexibly adapt dialogue models to new domains. To progress research in this direction, we introduce DialoGLUE (Dialogue Language Understanding Evaluation), a public benchmark consisting of 7 task-oriented dialogue datasets covering 4 distinct natural language understanding tasks, designed to encourage dialogue research in re…

    Submitted 30 September, 2020; v1 submitted 28 September, 2020; originally announced September 2020.

    Comments: Benchmark hosted on: https://evalai.cloudcv.org/web/challenges/challenge-page/708/

  25. arXiv:2006.12719  [pdf, ps, other]

    cs.CL cs.AI cs.HC

    Unsupervised Evaluation of Interactive Dialog with DialoGPT

    Authors: Shikib Mehri, Maxine Eskenazi

    Abstract: It is important to define meaningful and interpretable automatic evaluation metrics for open-domain dialog research. Standard language generation metrics have been shown to be ineffective for dialog. This paper introduces the FED metric (fine-grained evaluation of dialog), an automatic evaluation metric which uses DialoGPT, without any fine-tuning or supervision. It also introduces the FED dataset…

    Submitted 22 June, 2020; originally announced June 2020.

    Comments: Published at SIGdial 2020

  26. arXiv:2005.00456  [pdf, other]

    cs.CL cs.LG

    USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation

    Authors: Shikib Mehri, Maxine Eskenazi

    Abstract: The lack of meaningful automatic evaluation metrics for dialog has impeded open-domain dialog research. Standard language generation metrics have been shown to be ineffective for evaluating dialog models. To this end, this paper presents USR, an UnSupervised and Reference-free evaluation metric for dialog. USR is a reference-free metric that trains unsupervised models to measure several desirable…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020 as long paper

  27. arXiv:2004.01926  [pdf, other]

    cs.CL

    "None of the Above": Measure Uncertainty in Dialog Response Retrieval

    Authors: Yulan Feng, Shikib Mehri, Maxine Eskenazi, Tiancheng Zhao

    Abstract: This paper discusses the importance of uncovering uncertainty in end-to-end dialog tasks, and presents our experimental results on uncertainty classification on the Ubuntu Dialog Corpus. We show that, instead of retraining models for this specific purpose, the original retrieval model's underlying confidence concerning the best prediction can be captured with trivial additional computation.

    Submitted 14 May, 2020; v1 submitted 4 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020 as short paper

  28. arXiv:1912.09863  [pdf, other]

    math.PR math.AP math.NA

    Discretizations of Stochastic Evolution Equations in Variational Approach Driven by Jump-Diffusion

    Authors: Sima Mehri, Erfan Salavati, Bijan Z. Zangeneh

    Abstract: Stochastic evolution equations with compensated Poisson noise are considered in the variational approach with monotone and coercive coefficients. Here the Poisson noise is assumed to be time-homogeneous with $σ$-finite intensity measure on a metric space. By using finite element methods and Galerkin approximations, some explicit and implicit discretizations for this equation are presented and thei…

    Submitted 19 April, 2022; v1 submitted 20 December, 2019; originally announced December 2019.

    MSC Class: 60H15; 65M60; 60G51; 47H05; 47J35

  29. arXiv:1911.03861  [pdf, other]

    cs.CL cs.LG

    Increasing Robustness to Spurious Correlations using Forgettable Examples

    Authors: Yadollah Yaghoobzadeh, Soroush Mehri, Remi Tachet, T. J. Hazen, Alessandro Sordoni

    Abstract: Neural NLP models tend to rely on spurious correlations between labels and input features to perform their tasks. Minority examples, i.e., examples that contradict the spurious correlations present in the majority of data points, have been shown to increase the out-of-distribution generalization of pre-trained language models. In this paper, we first propose using example forgetting to find minori…

    Submitted 1 February, 2021; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: 14 pages, Accepted at EACL2021

  30. arXiv:1909.01322  [pdf, other]

    cs.CL cs.HC

    CMU GetGoing: An Understandable and Memorable Dialog System for Seniors

    Authors: Shikib Mehri, Alan W Black, Maxine Eskenazi

    Abstract: Voice-based technologies are typically developed for the average user, and thus generally not tailored to the specific needs of any subgroup of the population, like seniors. This paper presents CMU GetGoing, an accessible trip planning dialog system designed for senior users. The GetGoing system design is described in detail, with particular attention to the senior-tailored features. A user study…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted to the Dialog for Good (DiGo) workshop (http://dialogforgood.org) at SIGDial 2019

  31. arXiv:1908.10646  [pdf, ps, other]

    math.PR

    A Stochastic Gronwall Lemma and Well-Posedness of Path-Dependent SDEs Driven by Martingale Noise

    Authors: Sima Mehri, Michael Scheutzow

    Abstract: We show existence and uniqueness of solutions of stochastic path-dependent differential equations driven by cadlag martingale noise under joint local monotonicity and coercivity assumptions on the coefficients with a bound in terms of the supremum norm. In this set-up, the usual proof using the ordinary Gronwall lemma together with the Burkholder-Davis-Gundy inequality seems impossible. In order t…

    Submitted 28 August, 2019; originally announced August 2019.

    Comments: 18 pages

    MSC Class: 34K50; 60H10; 60G57; 34K28; 60G44

  32. arXiv:1908.09890  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Multi-Granularity Representations of Dialog

    Authors: Shikib Mehri, Maxine Eskenazi

    Abstract: Neural models of dialog rely on generalized latent representations of language. This paper introduces a novel training procedure which explicitly learns multiple representations of language at several levels of granularity. The multi-granularity training algorithm modifies the mechanism by which negative candidate responses are sampled in order to control the granularity of learned latent represen…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: Accepted as a long paper at EMNLP 2019

  33. arXiv:1907.10568  [pdf, other]

    cs.CL

    Investigating Evaluation of Open-Domain Dialogue Systems With Human Generated Multiple References

    Authors: Prakhar Gupta, Shikib Mehri, Tiancheng Zhao, Amy Pavel, Maxine Eskenazi, Jeffrey P. Bigham

    Abstract: The aim of this paper is to mitigate the shortcomings of automatic evaluation of open-domain dialog systems through multi-reference evaluation. Existing metrics have been shown to correlate poorly with human judgement, particularly in open-domain dialog. One alternative is to collect human annotations for evaluation, which can be expensive and time consuming. To demonstrate the effectiveness of mu…

    Submitted 8 September, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

    Comments: SIGDIAL 2019

  34. arXiv:1907.10016  [pdf, other]

    cs.CL cs.AI cs.LG

    Structured Fusion Networks for Dialog

    Authors: Shikib Mehri, Tejas Srinivasan, Maxine Eskenazi

    Abstract: Neural dialog models have exhibited strong performance, however their end-to-end nature lacks a representation of the explicit structure of dialog. This results in a loss of generalizability, controllability and a data-hungry nature. Conversely, more traditional dialog systems do have strong models of explicit structure. This paper introduces several approaches for explicitly incorporating structu…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: Accepted to SIGDial 2019

  35. arXiv:1906.00414  [pdf, other]

    cs.CL cs.AI

    Pretraining Methods for Dialog Context Representation Learning

    Authors: Shikib Mehri, Evgeniia Razumovskaia, Tiancheng Zhao, Maxine Eskenazi

    Abstract: This paper examines various unsupervised pretraining objectives for learning dialog context representations. Two novel methods of pretraining dialog context encoders are proposed, and a total of four methods are examined. Each pretraining objective is fine-tuned and evaluated on a set of downstream dialog tasks using the MultiWoz dataset and strong performance improvement is observed. Further eval…

    Submitted 3 June, 2019; v1 submitted 2 June, 2019; originally announced June 2019.

    Comments: Accepted to ACL 2019

  36. Weak Solutions to Vlasov-McKean Equations under Lyapunov-Type Conditions

    Authors: Sima Mehri, Wilhelm Stannat

    Abstract: We present a Lyapunov type approach to the problem of existence and uniqueness of general law-dependent stochastic differential equations. In the existing literature most results concerning existence and uniqueness are obtained under regularity assumptions of the coefficients w.r.t the Wasserstein distance. Some existence and uniqueness results for irregular coefficients have been obtained by cons…

    Submitted 16 November, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

    MSC Class: 60J60; 60H30; 93D30; 35Q83

    Journal ref: Stochastics and Dynamics 2019

  37. arXiv:1901.06613  [pdf, other]

    cs.CL cs.AI

    Beyond Turing: Intelligent Agents Centered on the User

    Authors: Maxine Eskenazi, Shikib Mehri, Evgeniia Razumovskaia, Tiancheng Zhao

    Abstract: Most research on intelligent agents centers on the agent and not on the user. We look at the origins of agent-centric research for slot-filling, gaming and chatbot agents. We then argue that it is important to concentrate more on the user. After reviewing relevant literature, some approaches for creating and assessing user-centric systems are proposed.

    Submitted 18 March, 2019; v1 submitted 19 January, 2019; originally announced January 2019.

    Comments: 13 pages

  38. arXiv:1810.11735  [pdf, other]

    cs.CL

    Middle-Out Decoding

    Authors: Shikib Mehri, Leonid Sigal

    Abstract: Despite being virtually ubiquitous, sequence-to-sequence models are challenged by their lack of diversity and inability to be externally controlled. In this paper, we speculate that a fundamental shortcoming of sequence generation models is that the decoding is done strictly from left-to-right, meaning that output values generated earlier have a profound effect on those generated later. To addres…

    Submitted 27 October, 2018; originally announced October 2018.

    Comments: Published as a conference paper at NIPS 2018

  39. Propagation of Chaos for Stochastic Spatially Structured Neuronal Networks with Delay driven by Jump Diffusions

    Authors: Sima Mehri, Michael Scheutzow, Wilhelm Stannat, Bijan Z. Zangeneh

    Abstract: Spatially structured neural networks driven by jump diffusion noise with monotone coefficients, fully path dependent delay and with a disorder parameter are considered. Well-posedness for the associated McKean-Vlasov equation and a corresponding propagation of chaos result in the infinite population limit are proven. Our existence result for the McKean-Vlasov equation is based on the Euler approxi…

    Submitted 27 May, 2019; v1 submitted 4 May, 2018; originally announced May 2018.

    Comments: In this version, a shorter title has been chosen. The manuscript has been accepted for publication in Annals of Applied Probability

    MSC Class: primary: 60K35; 92B20 secondary: 65C20; 60F99; 82C80

    Journal ref: Ann. Appl. Probab. 30 (2020), no. 1, 175-207

  40. arXiv:1712.09926  [pdf, other]

    cs.LG cs.NE stat.ML

    Rapid Adaptation with Conditionally Shifted Neurons

    Authors: Tsendsuren Munkhdalai, Xingdi Yuan, Soroush Mehri, Adam Trischler

    Abstract: We describe a mechanism by which artificial neural networks can learn rapid adaptation - the ability to adapt on the fly, with little data, to new tasks - that we call conditionally shifted neurons. We apply this mechanism in the framework of metalearning, where the aim is to replicate some of the flexibility of human learning in machines. Conditionally shifted neurons modify their activation valu…

    Submitted 3 July, 2018; v1 submitted 28 December, 2017; originally announced December 2017.

    Comments: ICML 2018; Added: additional ablation and speed comparison with MetaNet

  41. arXiv:1705.09792  [pdf, other]

    cs.NE cs.LG

    Deep Complex Networks

    Authors: Chiheb Trabelsi, Olexa Bilaniuk, Ying Zhang, Dmitriy Serdyuk, Sandeep Subramanian, João Felipe Santos, Soroush Mehri, Negar Rostamzadeh, Yoshua Bengio, Christopher J Pal

    Abstract: At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite…

    Submitted 25 February, 2018; v1 submitted 27 May, 2017; originally announced May 2017.

  42. arXiv:1612.07837  [pdf, other]

    cs.SD cs.AI

    SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

    Authors: Soroush Mehri, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron Courville, Yoshua Bengio

    Abstract: In this paper we propose a novel model for unconditional audio generation based on generating one audio sample at a time. We show that our model, which profits from combining memory-less modules, namely autoregressive multilayer perceptrons, and stateful recurrent neural networks in a hierarchical structure is able to capture underlying sources of variations in the temporal sequences over very lon…

    Submitted 11 February, 2017; v1 submitted 22 December, 2016; originally announced December 2016.

    Comments: Published as a conference paper at ICLR 2017

  43. arXiv:1509.03891  [pdf, other]

    cs.CV

    On Binary Classification with Single-Layer Convolutional Neural Networks

    Authors: Soroush Mehri

    Abstract: Convolutional neural networks are becoming standard tools for solving object recognition and visual tasks. However, most of the design and implementation of these complex models are based on trial-and-error. In this report, the main focus is to consider some of the important factors in designing convolutional networks to perform better. Specifically, classification with wide single-layer networks…

    Submitted 13 September, 2015; originally announced September 2015.

  44. arXiv:1404.5106  [pdf, other]

    math.HO math.CO

    The Hockey Stick Theorems in Pascal and Trinomial Triangles

    Authors: Sima Mehri

    Abstract: We have found some patterns in some triangles.

    Submitted 30 May, 2016; v1 submitted 21 April, 2014; originally announced April 2014.