Showing 1–50 of 108 results for author: Ramakrishnan, G

  1. arXiv:2407.06331  [pdf, other]

    cs.CL

    CharSS: Character-Level Transformer Model for Sanskrit Word Segmentation

    Authors: Krishnakant Bhatt, Karthika N J, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Subword tokens in Indian languages inherently carry meaning, and isolating them can enhance NLP tasks, making sub-word segmentation a crucial process. Segmenting Sanskrit and other Indian languages into subtokens is not straightforward, as it may include sandhi, which may lead to changes in the word boundaries. We propose a new approach of utilizing a Character-level Transformer model for Sanskrit…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2406.18135  [pdf]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition for Hindi

    Authors: Anish Saha, A. G. Ramakrishnan

    Abstract: Automatic speech recognition (ASR) is a key area in computational linguistics, focusing on developing technologies that enable computers to convert spoken language into text. This field combines linguistics and machine learning. ASR models, which map speech audio to transcripts through supervised learning, require handling real and unrestricted text. Text-to-speech systems directly work with real…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.17377  [pdf, other]

    cs.CL

    A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs

    Authors: Vaibhav Singh, Amrith Krishna, Karthika NJ, Ganesh Ramakrishnan

    Abstract: Low-resource languages, by their very definition, tend to be underrepresented in the pre-training corpora of Large Language Models. In this work, we investigate three low-resource cross-lingual approaches that enable an LLM to adapt to tasks in previously unseen languages. Llama-2 is an LLM where Indic languages, among many other language families, contribute to less than $0.005\%$ of the total $2$ tr…

    Submitted 25 June, 2024; originally announced June 2024.

  4. arXiv:2405.11200  [pdf, other]

    cs.CL

    LexGen: Domain-aware Multilingual Lexicon Generation

    Authors: Karthika NJ, Ayush Maheshwari, Atul Kumar Singh, Preethi Jyothi, Ganesh Ramakrishnan, Krishnakant Bhatt

    Abstract: Lexicon or dictionary generation across domains is of significant societal importance, as it can potentially enhance information accessibility for a diverse user base while preserving language identity. Prior work in the field primarily focuses on bilingual lexical induction, which deals with word alignments using mapping-based or corpora-based approaches. Though initiated by researchers, the rese…

    Submitted 18 May, 2024; originally announced May 2024.

  5. arXiv:2403.08370  [pdf, other]

    cs.CL cs.AI cs.LG

    SMART: Submodular Data Mixture Strategy for Instruction Tuning

    Authors: H S V N S Kowndinya Renduchintala, Sumit Bhatia, Ganesh Ramakrishnan

    Abstract: Instruction Tuning involves finetuning a language model on a collection of instruction-formatted datasets in order to enhance the generalizability of the model to unseen tasks. Studies have shown the importance of balancing different task proportions during finetuning, but finding the right balance remains challenging. Unfortunately, there's currently no systematic method beyond manual tuning or r…

    Submitted 7 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  6. arXiv:2403.04890  [pdf, other]

    cs.CL

    Few shot chain-of-thought driven reasoning to prompt LLMs for open ended medical question answering

    Authors: Ojas Gramopadhye, Saeel Sandeep Nachane, Prateek Chanda, Ganesh Ramakrishnan, Kshitij Sharad Jadhav, Yatin Nandwani, Dinesh Raghu, Sachindra Joshi

    Abstract: Large Language models (LLMs) have demonstrated significant potential in transforming healthcare by automating tasks such as clinical documentation, information retrieval, and decision support. In this aspect, carefully engineered prompts have emerged as a powerful tool for using LLMs for medical scenarios, e.g., patient clinical scenarios. In this paper, we propose a modified version of the MedQA-…

    Submitted 7 March, 2024; originally announced March 2024.

  7. arXiv:2402.15472  [pdf, other]

    cs.LG

    FAIR: Filtering of Automatically Induced Rules

    Authors: Divya Jyoti Bajpai, Ayush Maheshwari, Manjesh Kumar Hanawal, Ganesh Ramakrishnan

    Abstract: The availability of large annotated data can be a critical bottleneck in training machine learning algorithms successfully, especially when applied to diverse domains. Weak supervision offers a promising alternative by accelerating the creation of labeled training data using domain-specific rules. However, it requires users to write a diverse set of high-quality rules to assign labels to the unlab…

    Submitted 4 July, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: EACL 2024

  8. arXiv:2402.09811  [pdf, other]

    cs.CV

    TEXTRON: Weakly Supervised Multilingual Text Detection through Data Programming

    Authors: Dhruv Kudale, Badri Vishal Kasuba, Venkatapathy Subramanian, Parag Chaudhuri, Ganesh Ramakrishnan

    Abstract: Several recent deep learning (DL) based techniques perform considerably well on image-based multilingual text detection. However, their performance relies heavily on the availability and quality of training data. There are numerous types of page-level document images consisting of information in several modalities, languages, fonts, and layouts. This makes text detection a challenging problem in t…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted at the WACV 2024 Conference

  9. arXiv:2402.07173  [pdf, other]

    cs.CV

    INSITE: labelling medical images using submodular functions and semi-supervised data programming

    Authors: Akshat Gautam, Anurag Shandilya, Akshit Srivastava, Venkatapathy Subramanian, Ganesh Ramakrishnan, Kshitij Jadhav

    Abstract: The necessity of large amounts of labeled data to train deep models, especially in medical imaging, creates an implementation bottleneck in resource-constrained settings. In Insite (labelINg medical imageS usIng submodular funcTions and sEmi-supervised data programming) we apply informed subset selection to identify a small number of most representative or diverse images from a huge pool of unlabel…

    Submitted 11 February, 2024; originally announced February 2024.

  10. arXiv:2401.06989  [pdf, other]

    cs.LG

    Gradient Coreset for Federated Learning

    Authors: Durga Sivasubramanian, Lokesh Nagalapatti, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: Federated Learning (FL) is used to learn machine learning models with data that is partitioned across multiple clients, including resource-constrained edge devices. It is therefore important to devise solutions that are efficient in terms of compute, communication, and energy consumption, while ensuring compliance with the FL framework's privacy requirements. Conventional approaches to these probl…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: Accepted at WACV-24

  11. arXiv:2312.09599  [pdf]

    eess.SP

    Brain-scale Theta Band Functional Connectivity As A Signature of Slow Breathing and Breath-hold Phases

    Authors: Anusha A. S., Pradeep Kumar G., A. G. Ramakrishnan

    Abstract: The study reported herein attempts to understand the neural mechanisms engaged in the conscious control of breathing and breath-hold. The variations in the electroencephalogram (EEG) based functional connectivity (FC) of the human brain during consciously controlled breathing at 2 cycles per minute (cpm), and breath-hold have been investigated and reported here. An experimental protocol involving…

    Submitted 15 December, 2023; originally announced December 2023.

  12. arXiv:2311.13993  [pdf, other]

    cs.CV

    EIGEN: Expert-Informed Joint Learning Aggregation for High-Fidelity Information Extraction from Document Images

    Authors: Abhishek Singh, Venkatapathy Subramanian, Ayush Maheshwari, Pradeep Narayan, Devi Prasad Shetty, Ganesh Ramakrishnan

    Abstract: Information Extraction (IE) from document images is challenging due to the high variability of layout formats. Deep models such as LayoutLM and BROS have been proposed to address this problem and have shown promising results. However, they still require a large amount of field-level annotations for training these models. Other approaches using rule-based methods have also been proposed based on th…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: In Proceedings of ML for Health Conference, 2023 (co-located with NeurIPS)

  13. arXiv:2310.18590  [pdf, other]

    cs.LG cs.AI

    Using Early Readouts to Mediate Featural Bias in Distillation

    Authors: Rishabh Tiwari, Durga Sivasubramanian, Anmol Mekala, Ganesh Ramakrishnan, Pradeep Shenoy

    Abstract: Deep networks tend to learn spurious feature-label correlations in real-world supervised learning tasks. This vulnerability is aggravated in distillation, where a student model may have lesser representational capacity than the corresponding teacher model. Often, knowledge of specific spurious correlations is used to reweight instances & rebalance the learning process. We propose a novel early rea…

    Submitted 8 November, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

  14. arXiv:2310.17860  [pdf, other]

    astro-ph.IM hep-ex

    A meta-analysis of distance measurements to M87

    Authors: Gunasekar Ramakrishnan, Shantanu Desai

    Abstract: We obtain the median, arithmetic mean, and the weighted mean-based central estimates for the distance to M87 using all the measurements collated in De Grijs et al. (2020). We then reconstruct the error distribution for the residuals of the combined measurements and also after splitting them based on the tracers used. We then check for consistency with a Gaussian distribution and other symmetric distrib…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 16 pages, 1 figure

  15. arXiv:2310.17138  [pdf, other]

    cs.CV

    A Classifier Using Global Character Level and Local Sub-unit Level Features for Hindi Online Handwritten Character Recognition

    Authors: Anand Sharma, A. G. Ramakrishnan

    Abstract: A classifier is developed that defines a joint distribution of global character features, number of sub-units and local sub-unit features to model Hindi online handwritten characters. The classifier uses latent variables to model the structure of sub-units. The classifier uses histograms of points, orientations, and dynamics of orientations (HPOD) features to represent characters at global charact…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 23 pages, 8 jpg figures. arXiv admin note: text overlap with arXiv:2310.08222

  16. arXiv:2310.08222  [pdf, other]

    cs.CV

    Structural analysis of Hindi online handwritten characters for character recognition

    Authors: Anand Sharma, A. G. Ramakrishnan

    Abstract: Direction properties of online strokes are used to analyze them in terms of homogeneous regions or sub-strokes with points satisfying common geometric properties. Such sub-strokes are called sub-units. These properties are used to extract sub-units from Hindi ideal online characters. These properties along with some heuristics are used to extract sub-units from Hindi online handwritten characters.…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 34 pages, 36 jpg figures

  17. arXiv:2310.06702  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration

    Authors: Piyush Singh Pasi, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, Manoj Singh

    Abstract: The problem of audio-to-text alignment has seen a significant amount of research using complete supervision during training. However, this is typically not in the context of long audio recordings wherein the text being queried does not appear verbatim within the audio file. This work is a collaboration with a non-governmental organization called CARE India that collects long audio health surveys fro…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Work accepted at IJCAI-23, AI and Social Good Track

  18. arXiv:2309.02067  [pdf, other]

    cs.CV eess.SP

    Histograms of Points, Orientations, and Dynamics of Orientations Features for Hindi Online Handwritten Character Recognition

    Authors: Anand Sharma, A. G. Ramakrishnan

    Abstract: A set of features independent of character stroke direction and order variations is proposed for online handwritten character recognition. A method is developed that maps features like co-ordinates of points, orientations of strokes at points, and dynamics of orientations of strokes at points spatially as a function of co-ordinate values of the points and computes histograms of these features from…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: 21 pages, 12 jpg figures

  19. arXiv:2305.14004  [pdf]

    cs.CL

    Sāmayik: A Benchmark and Dataset for English-Sanskrit Translation

    Authors: Ayush Maheshwari, Ashim Gupta, Amrith Krishna, Atul Kumar Singh, Ganesh Ramakrishnan, G. Anil Kumar, Jitin Singla

    Abstract: We release Sāmayik, a dataset of around 53,000 parallel English-Sanskrit sentences, written in contemporary prose. Sanskrit is a classical language still in sustenance and has a rich documented heritage. However, due to the limited availability of digitized content, it still remains a low-resource language. Existing Sanskrit corpora, whether monolingual or bilingual, have predominantly focused on…

    Submitted 29 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: LREC-COLING, 2024

  20. arXiv:2305.06677  [pdf, other]

    cs.CL cs.AI cs.LG

    INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Language Models

    Authors: H S V N S Kowndinya Renduchintala, Krishnateja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh Iyer, Balaji Krishnamurthy

    Abstract: A salient characteristic of pre-trained language models (PTLMs) is a remarkable improvement in their generalization capability and emergence of new capabilities with increasing model capacity and pre-training dataset size. Consequently, we are witnessing the development of enormous models pushing the state-of-the-art. It is, however, imperative to realize that this inevitably leads to prohibitivel…

    Submitted 19 October, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

  21. arXiv:2305.02997  [pdf, other]

    cs.LG cs.AI stat.ML

    When Do Neural Nets Outperform Boosted Trees on Tabular Data?

    Authors: Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Benjamin Feuer, Chinmay Hegde, Ganesh Ramakrishnan, Micah Goldblum, Colin White

    Abstract: Tabular data is one of the most commonly used types of data in machine learning. Despite recent advances in neural nets (NNs) for tabular data, there is still an active discussion on whether or not NNs generally outperform gradient-boosted decision trees (GBDTs) on tabular data, with several recent works arguing either that GBDTs consistently outperform NNs on tabular data, or vice versa. In this…

    Submitted 30 October, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: NeurIPS Datasets and Benchmarks Track 2023

  22. arXiv:2211.07980  [pdf, other]

    cs.CL

    A Benchmark and Dataset for Post-OCR text correction in Sanskrit

    Authors: Ayush Maheshwari, Nikhil Singh, Amrith Krishna, Ganesh Ramakrishnan

    Abstract: Sanskrit is a classical language with about 30 million extant manuscripts fit for digitisation, available in written, printed or scanned image forms. However, it is still considered to be a low-resource language when it comes to available digital resources. In this work, we release a post-OCR text correction dataset containing around 218,000 sentences, with 1.5 million words, from 30 different book…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: Findings of EMNLP, 2022. Code and Data: https://github.com/ayushbits/pe-ocr-sanskrit

  23. arXiv:2211.01454  [pdf, other]

    cs.LG

    Speeding up NAS with Adaptive Subset Selection

    Authors: Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Ganesh Ramakrishnan

    Abstract: A majority of recent developments in neural architecture search (NAS) have been aimed at decreasing the computational cost of various techniques without affecting their final performance. Towards this goal, several low-fidelity and performance prediction methods have been considered, including those that train only on subsets of the training data. In this work, we present an adaptive subset select…

    Submitted 2 November, 2022; originally announced November 2022.

  24. arXiv:2210.16892  [pdf, other]

    cs.LG

    Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training

    Authors: Ashish Mittal, Durga Sivasubramanian, Rishabh Iyer, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: Training state-of-the-art ASR systems such as RNN-T often has a high associated financial and environmental cost. Training with a subset of training data could mitigate this problem if the subset selected could achieve on-par performance with training with the entire dataset. Although there are many data subset selection (DSS) algorithms, direct application to the RNN-T is difficult, especially the…

    Submitted 30 October, 2022; originally announced October 2022.

  25. arXiv:2210.06996  [pdf, other]

    cs.CL cs.LG

    DICTDIS: Dictionary Constrained Disambiguation for Improved NMT

    Authors: Ayush Maheshwari, Piyush Sharma, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant with the potential to help make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Dictionaries could present multiple candidate translations for a sou…

    Submitted 21 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

  26. arXiv:2210.03324  [pdf, other]

    cs.LG cs.AI stat.ML

    AutoML for Climate Change: A Call to Action

    Authors: Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White

    Abstract: The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications. The climate change AI (CCAI) community works on a diverse, challenging set of problems which often involve physics-constrained ML or heterogeneous spatiotemporal data. It would be desirable to use automated machine learning (AutoML)…

    Submitted 7 October, 2022; originally announced October 2022.

  27. arXiv:2210.01526  [pdf, other]

    cs.CV

    DIAGNOSE: Avoiding Out-of-distribution Data using Submodular Information Measures

    Authors: Suraj Kothawade, Akshit Srivastava, Venkat Iyer, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Avoiding out-of-distribution (OOD) data is critical for training supervised machine learning models in the medical imaging domain. Furthermore, obtaining labeled medical data is difficult and expensive since it requires expert annotators like doctors, radiologists, etc. Active learning (AL) is a well-known method to mitigate labeling costs by selecting the most diverse or uncertain samples. Howeve…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted to MICCAI 2022 MILLanD Workshop

  28. arXiv:2210.01520  [pdf, other]

    cs.CV

    CLINICAL: Targeted Active Learning for Imbalanced Medical Image Classification

    Authors: Suraj Kothawade, Atharv Savarkar, Venkat Iyer, Lakshman Tamil, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Training deep learning models on medical datasets that perform well for all classes is a challenging task. It is often the case that a suboptimal performance is obtained on some classes due to the natural class imbalance issue that comes with medical data. An effective way to tackle this problem is by using targeted active learning, where we iteratively add data points to the training data that be…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted to MICCAI 2022 MILLanD Workshop

  29. arXiv:2204.04653  [pdf, other]

    cs.CV cs.MM

    Counting in the 2020s: Binned Representations and Inclusive Performance Measures for Deep Crowd Counting Approaches

    Authors: Sravya Vardhani Shivapuja, Ashwin Gopinath, Ayush Gupta, Ganesh Ramakrishnan, Ravi Kiran Sarvadevabhatla

    Abstract: The data distribution in popular crowd counting datasets is typically heavy tailed and discontinuous. This skew affects all stages within the pipelines of deep crowd counting approaches. Specifically, the approaches exhibit unacceptably large standard deviation with respect to statistical measures (MSE, MAE). To address such concerns in a holistic manner, we make two fundamental contributions. Firstly, we mod…

    Submitted 10 April, 2022; originally announced April 2022.

    Comments: Extended version of arXiv:2108.08784. In review

  30. arXiv:2203.16860  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS eess.IV

    Investigating Modality Bias in Audio Visual Video Parsing

    Authors: Piyush Singh Pasi, Shubham Nemani, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: We focus on the audio-visual video parsing (AVVP) problem that involves detecting audio and visual event labels with temporal boundaries. The task is especially challenging since it is weakly supervised with only event labels available as a bag of labels for each video. An existing state-of-the-art model for AVVP uses a hybrid attention network (HAN) to generate cross-modal features for both audio…

    Submitted 11 November, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Work under review for ICASSP 2023

  31. arXiv:2203.08212  [pdf, other]

    cs.LG

    AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

    Authors: Krishnateja Killamsetty, Guttu Sai Abhishek, Aakriti, Alexandre V. Evfimievski, Lucian Popa, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Deep neural networks have seen great success in recent years; however, training a deep model is often challenging as its performance heavily depends on the hyper-parameters used. In addition, finding the optimal hyper-parameter configuration, even with state-of-the-art (SOTA) hyper-parameter optimization (HPO) algorithms, can be time-consuming, requiring multiple training runs over the entire data…

    Submitted 15 March, 2022; originally announced March 2022.

  32. arXiv:2203.05651  [pdf, other]

    cs.LG

    BASIL: Balanced Active Semi-supervised Learning for Class Imbalanced Datasets

    Authors: Suraj Kothawade, Pavan Kumar Reddy, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Current semi-supervised learning (SSL) methods assume a balance between the number of data points available for each class in both the labeled and the unlabeled data sets. However, there naturally exists a class imbalance in most real-world datasets. It is known that training models on such imbalanced datasets leads to biased models, which in turn lead to biased predictions towards the more freque…

    Submitted 10 March, 2022; originally announced March 2022.

  33. arXiv:2203.03171  [pdf]

    astro-ph.EP physics.bio-ph physics.space-ph

    Assessment of Microbial Habitability Across Solar System Targets

    Authors: Dimitra Atri, Todd Godderidge, Dee Cirium, Dimple Patel, Gunasekar Ramakrishnan

    Abstract: With a fleet of exploratory space missions on the horizon, the study of target specific biospheres is crucial for accurately determining the probability of the existence of microbial life on various planetary bodies and prioritising targets accordingly. Although previous studies have compared the potential habitability of objects in our solar system by bulk characteristics, it is less common that…

    Submitted 10 March, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: 42 pages, 3 figures

  34. arXiv:2203.01644  [pdf, other]

    cs.CL

    UDAAN: Machine Learning based Post-Editing tool for Document Translation

    Authors: Ayush Maheshwari, Ajay Ravindran, Venkatapathy Subramanian, Ganesh Ramakrishnan

    Abstract: We introduce UDAAN, an open-source post-editing tool that can reduce manual editing efforts to quickly produce publishable-standard documents in several Indic languages. UDAAN has an end-to-end Machine Translation (MT) plus post-editing pipeline wherein users can upload a document to obtain raw MT output. Further, users can edit the raw translations using our tool. UDAAN offers several advantages:…

    Submitted 21 November, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Demo paper at CoDS-COMAD 2023. Visit the project website at https://udaanproject.org

  35. arXiv:2202.10680  [pdf, other]

    cs.LG cs.IR

    Submodlib: A Submodular Optimization Library

    Authors: Vishal Kaushal, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Submodular functions are a special class of set functions which naturally model the notion of representativeness, diversity, coverage etc. and have been shown to be computationally very efficient. A lot of past work has applied submodular optimization to find optimal subsets in various contexts. Some examples include data summarization for efficient human consumption, finding effective smaller sub…

    Submitted 23 February, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

    Comments: 23 pages with references, 10 figures, 5 tables

  36. arXiv:2202.03250  [pdf, other]

    cs.LG

    Adaptive Mixing of Auxiliary Losses in Supervised Learning

    Authors: Durga Sivasubramanian, Ayush Maheshwari, Pradeep Shenoy, Prathosh AP, Ganesh Ramakrishnan

    Abstract: In several supervised learning scenarios, auxiliary losses are used in order to introduce additional information or constraints into the supervised learning objective. For instance, knowledge distillation aims to mimic outputs of a powerful teacher model; similarly, in rule-based approaches, weak labeling information is provided by labeling functions which may be noisy rule-based approximations to…

    Submitted 7 December, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

  37. arXiv:2202.01157  [pdf, other]

    cs.CL cs.LG

    Error Correction in ASR using Sequence-to-Sequence Models

    Authors: Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system. The outputs of an ASR system are largely prone to phonetic and spelling errors. In this paper, we propose to use a powerful pre-trained sequence-to-sequence model, BART, further adaptively trained to serve as a denoising model, to correct errors of such types…

    Submitted 23 August, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

  38. arXiv:2110.04908  [pdf, other]

    eess.AS cs.SD

    DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation

    Authors: Suraj Kothawade, Anmol Mekala, Chandra Sekhara D, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: State-of-the-art Automatic Speech Recognition (ASR) systems are known to exhibit disparate performance on varying speech accents. To improve performance on a specific target accent, a commonly adopted solution is to finetune the ASR model using accent-specific labeled speech. However, acquiring large amounts of labeled speech for specific target accents is challenging. Choosing an informative subs…

    Submitted 5 June, 2023; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: ACL 2023

  39. arXiv:2109.11410  [pdf, other]

    cs.LG

    Learning to Robustly Aggregate Labeling Functions for Semi-supervised Data Programming

    Authors: Ayush Maheshwari, Krishnateja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer, Marina Danilevsky, Lucian Popa

    Abstract: A critical bottleneck in supervised machine learning is the need for large amounts of labeled data which is expensive and time consuming to obtain. However, it has been shown that a small amount of labeled data, while insufficient to re-train a model, can be effectively used to generate human-interpretable labeling functions (LFs). These LFs, in turn, have been used to generate a large amount of a…

    Submitted 10 March, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

    Comments: Findings of ACL, 2022

  40. arXiv:2109.05494  [pdf, other]

    cs.CL cs.SD eess.AS

    Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages

    Authors: Anoop C S, Prathosh A P, A G Ramakrishnan

    Abstract: Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common acoustic space with a high-resource language having enough annotated data to build an ASR. In such cases, we show that the domain-independent acoustic models lea…

    Submitted 16 September, 2021; v1 submitted 12 September, 2021; originally announced September 2021.

    Comments: Submitted to ASRU 2021

  41. Wisdom of (Binned) Crowds: A Bayesian Stratification Paradigm for Crowd Counting

    Authors: Sravya Vardhani Shivapuja, Mansi Pradeep Khamkar, Divij Bajaj, Ganesh Ramakrishnan, Ravi Kiran Sarvadevabhatla

    Abstract: Datasets for training crowd counting deep networks are typically heavy-tailed in count distribution and exhibit discontinuities across the count range. As a result, the de facto statistical measures (MSE, MAE) exhibit large variance and tend to be unreliable indicators of performance across the count range. To address these concerns in a holistic manner, we revise processes at various stages of th…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: Accepted at ACM Multimedia (ACMMM) 2021. Code, pretrained models and interactive visualizations can be viewed at our project page https://deepcount.iiit.ac.in/

  42. arXiv:2108.00373  [pdf, other]

    cs.LG

    SPEAR : Semi-supervised Data Programming in Python

    Authors: Guttu Sai Abhishek, Harshad Ingole, Parth Laturia, Vineeth Dorna, Ayush Maheshwari, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: We present SPEAR, an open-source Python library for data programming with semi-supervision. The package implements several recent data programming approaches, including a facility to programmatically label and build training data. SPEAR facilitates weak supervision in the form of heuristics (or rules) and association of noisy labels to the training dataset. These noisy labels are aggregated to assign…

    Submitted 5 October, 2022; v1 submitted 1 August, 2021; originally announced August 2021.

    Comments: EMNLP Demonstrations - 2022

  43. arXiv:2106.15324  [pdf, other]

    cs.CV cs.AI cs.LG

    Effective Evaluation of Deep Active Learning on Image Classification Tasks

    Authors: Nathan Beck, Durga Sivasubramanian, Apurva Dani, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: With the goal of making deep learning more label-efficient, a growing number of papers have been studying active learning (AL) for deep models. However, there are a number of issues in the prevalent experimental settings, mainly stemming from a lack of unified implementation and benchmarking. Issues in the current literature include sometimes contradictory observations on the performance of differ…

    Submitted 2 November, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: 10 pages in main paper, 6 figures in main paper, 2 tables in main paper. 24 pages in total, 15 figures in total, 13 tables in total

  44. arXiv:2106.12491  [pdf, other]

    cs.LG stat.ML

    Training Data Subset Selection for Regression with Controlled Generalization Error

    Authors: Durga Sivasubramanian, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

    Abstract: Data subset selection from a large number of training instances has been a successful approach toward efficient and cost-effective machine learning. However, models trained on a smaller subset may show poor generalization ability. In this paper, our goal is to design an algorithm for selecting a subset of the training data, so that the model can be trained quickly, without significantly sacrificin…

    Submitted 23 June, 2021; originally announced June 2021.

    Journal ref: ICML 2021

  45. arXiv:2106.05852  [pdf]

    eess.AS cs.CL cs.LG cs.SD

    Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights

    Authors: Devaraja Adiga, Rishabh Kumar, Amrith Krishna, Preethi Jyothi, Ganesh Ramakrishnan, Pawan Goyal

    Abstract: Automatic speech recognition (ASR) in Sanskrit is interesting, owing to the various linguistic peculiarities present in the language. The Sanskrit language is lexically productive, undergoes euphonic assimilation of phones at the word boundaries and exhibits variations in spelling conventions and in pronunciations. In this work, we propose the first large scale study of automatic speech recognitio…

    Submitted 23 July, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted paper at the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021 Findings)

  46. arXiv:2105.10193  [pdf, other]

    cs.CL cs.LG

    Rule Augmented Unsupervised Constituency Parsing

    Authors: Atul Sahay, Anshul Nasery, Ayush Maheshwari, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Recently, unsupervised parsing of syntactic trees has gained considerable attention. A prototypical approach to such unsupervised parsing employs reinforcement learning and auto-encoders. However, no mechanism ensures that the learnt model leverages the well-understood language grammar. We propose an approach that utilizes very generic linguistic knowledge of the language present in the form of sy…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted at Findings of ACL 2021. 10 Pages, 5 Tables, 2 Figures

  47. arXiv:2105.00043  [pdf, other]

    cs.LG cs.CV

    Submodular Mutual Information for Targeted Data Subset Selection

    Authors: Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer

    Abstract: With the rapid growth of data, it is becoming increasingly difficult to train or improve deep learning models with the right subset of data. We show that this problem can be effectively solved at an additional labeling cost by targeted data subset selection (TSS) where a subset of unlabeled data points similar to an auxiliary set are added to the training data. We do so by using a rich class of Sub…

    Submitted 30 April, 2021; originally announced May 2021.

    Comments: Accepted to ICLR 2021 S2D-OLAD Workshop; https://s2d-olad.github.io/. arXiv admin note: substantial text overlap with arXiv:2103.00128

  48. arXiv:2104.06722  [pdf, other]

    cs.CL cs.LG

    WARM: A Weakly (+Semi) Supervised Model for Solving Math word Problems

    Authors: Oishik Chatterjee, Isha Pandey, Aashish Waikar, Vishwajeet Kumar, Ganesh Ramakrishnan

    Abstract: Solving math word problems (MWPs) is an important and challenging problem in natural language processing. Existing approaches to solve MWPs require full supervision in the form of intermediate equations. However, labeling every MWP with its corresponding equations is a time-consuming and expensive task. In order to address this challenge of equation annotation, we propose a weakly supervised model…

    Submitted 13 June, 2023; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: Accepted in COLING'22

  49. arXiv:2104.04998  [pdf, other]

    cs.CL cs.LG

    Unsupervised Learning of Explainable Parse Trees for Improved Generalisation

    Authors: Atul Sahay, Ayush Maheshwari, Ritesh Kumar, Ganesh Ramakrishnan, Manjesh Kumar Hanawal, Kavi Arya

    Abstract: Recursive neural networks (RvNN) have been shown useful for learning sentence representations and helped achieve competitive performance on several natural language inference tasks. However, recent RvNN-based models fail to learn simple grammar and meaningful semantics in their intermediate tree representation. In this work, we propose an attention mechanism over Tree-LSTMs to learn more meaningfu…

    Submitted 11 April, 2021; originally announced April 2021.

    Comments: 8 Pages, 5 Tables, 4 Figures. To appear at IJCNN 2021

  50. arXiv:2104.04598  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS eess.IV

    Cross-Modal learning for Audio-Visual Video Parsing

    Authors: Jatin Lamba, Abhishek, Jayaprakash Akula, Rishabh Dabral, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: In this paper, we present a novel approach to the audio-visual video parsing (AVVP) task that demarcates events from a video separately for audio and visual modalities. The proposed parsing approach simultaneously detects the temporal boundaries in terms of start and end times of such events. We show how AVVP can benefit from the following techniques geared towards effective cross-modal learning:…

    Submitted 21 June, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: Work accepted at Interspeech 2021