About
Activity
-
It went under the radar but OpenAI released their most performant GPT-4o assistant model yesterday. - Structured outputs with 100% reliability - 4x…
Liked by Mitodru Niyogi
-
Some HRs/Hiring Managers who were never researchers should understand that most researchers are not driven by money, otherwise they would have…
Posted by Mitodru Niyogi
-
Thrilled to announce the launch of Aurascape AI and our oversubscribed $12.8M seed funding round! 🚀 As a company born in the AI era, we are…
Liked by Mitodru Niyogi
Experience & Education
Licenses & Certifications
Publications
-
PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents
arXiv
In this paper, we present PARAMANU-AYN, a language model based exclusively on case documents of the Supreme Court of India, the Constitution of India, and the Indian Penal Code. The novel autoregressive (AR) decoder-based model is pretrained from scratch with a context size of 8192. We evaluated our pretrained legal model on perplexity. We also instruction-tuned the pretrained model on a set of 10,763 instructions covering various legal tasks such as legal reasoning, judgement explanation, legal clause generation, legal drafting, legal contract drafting, case summarization, and constitutional question-answering. We also had GPT-3.5-Turbo evaluate the instruction-tuned model's responses on clarity, relevance, completeness, and legal reasoning, each on a scale of 10. Our model can be run on CPU and achieved a CPU inference speed of 42.46 tokens/sec. We found that our models, despite not being pretrained on legal books, various legal contracts, and legal documents, were able to learn the domain knowledge required for drafting various legal contracts and clauses, and generalized to drafting legal contracts and clauses with limited instruction tuning. Hence, we conclude that very large amounts of data are not required to develop a strong domain-specialized generative language model (such as a legal one) from scratch. We believe this work is the first attempt to build a dedicated generative legal language model from scratch for the Indian Supreme Court jurisdiction, or in legal NLP overall. We plan to release our Paramanu-Ayn model at https://www.bharatgpts.com
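The perplexity metric used to evaluate the pretrained model is simply the exponential of the average per-token negative log-likelihood. A minimal illustrative sketch, not the paper's evaluation code:

```python
import math

def perplexity(neg_log_likelihoods):
    """Perplexity = exp(average per-token negative log-likelihood)."""
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))

# A model assigning probability 1/4 to every token has perplexity 4.
nlls = [math.log(4)] * 10
print(round(perplexity(nlls), 2))  # → 4.0
```

Lower is better: a model that spreads probability uniformly over 4 candidate tokens scores exactly 4.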
Other authors -
Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models
arXiv
We present Gyan AI Paramanu ("atom"), a family of novel language models for Indian languages. It is a collection of autoregressive monolingual, bilingual, and multilingual Indic language models pretrained from scratch on a single GPU for 10 Indian languages (Assamese, Bangla, Hindi, Konkani, Maithili, Marathi, Odia, Sanskrit, Tamil, Telugu) across 5 scripts (Bangla, Devanagari, Odia, Tamil, Telugu), with sizes ranging from 13.29M to 367.5M parameters. The models are pretrained with a context size of 1024 on a single GPU. The models are very efficient, small, fast, and powerful. We have also developed an efficient, advanced Indic tokenizer that can even tokenize unseen languages. To avoid the "curse of multilinguality" in our multilingual mParamanu model, we pretrained on comparable corpora grouped typologically by shared script. We performed human evaluation of our pretrained models for open-ended text generation on grammar, coherence, creativity, and factuality for Bangla, Hindi, and Sanskrit. Our Bangla, Hindi, and Sanskrit models outperformed the GPT-3.5-Turbo (ChatGPT), Bloom 7B, LLaMa-2 7B, OPT 6.7B, GPT-J 6B, GPTNeo 1.3B, and GPT2-XL large language models (LLMs) by a large margin despite being 20 to 66 times smaller than standard 7B LLMs. A CPU is enough to run inference on our pretrained models; no GPU is needed. We also instruction-tuned our pretrained Bangla, Hindi, Marathi, Tamil, and Telugu models on 23k instructions in the respective languages. Our pretrained and instruction-tuned models, the first of their kind and the most powerful efficient small generative language models yet developed for Indic languages, together with these results, lead to the conclusion that high-quality generative language models are possible without huge amounts of compute and enormous numbers of parameters.
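A tokenizer that can handle unseen languages typically falls back to raw UTF-8 bytes when no learned subword matches the input. The abstract does not describe Paramanu's tokenizer internals, so the sketch below shows only the generic byte-fallback idea, with a hypothetical `vocab` set standing in for a learned subword vocabulary:

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization with a UTF-8 byte
    fallback, so characters from unseen scripts are never out-of-vocabulary.
    `vocab` is a hypothetical set of learned subword strings."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest match first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No subword matched: emit the character's raw UTF-8 bytes.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens

print(tokenize("banana", {"ba", "n", "a"}))  # → ['ba', 'n', 'a', 'n', 'a']
print(tokenize("ç", {"ba", "n", "a"}))       # → ['<0xC3>', '<0xA7>']
```

The byte fallback guarantees every input is representable, at the cost of longer token sequences for out-of-vocabulary scripts.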
Other authors -
Neural Models for Source Code Synthesis and Completion
Heidelberg University / arXiv
Natural language (NL) to code suggestion systems assist developers in Integrated Development Environments (IDEs) by translating NL utterances into compilable code snippets. Current approaches mainly involve hard-coded, rule-based systems based on semantic parsing. These systems make heavy use of hand-crafted rules that map patterns in NL, or elements in its syntax parse tree, to various query constructs, and can only work on a limited subset of NL with restricted syntax. They are unable to extract semantic information from the developer's coding intent, and often fail to infer the types, names, and context of the source code needed for accurate system-level code suggestions. In this master's thesis, we present sequence-to-sequence deep learning models and training paradigms that map NL to general-purpose programming languages, assisting users with suggestions of source code snippets given an NL intent and also extending source-code auto-completion while they write code. The developed architecture incorporates contextual awareness into neural models that generate source code tokens directly, instead of generating parse trees or abstract meaning representations and converting them back to source code. The proposed pretraining strategy and data augmentation techniques improve the performance of the architecture, which was found to exceed the performance of the neural semantic parser TranX by 10.82% on the BLEU-4 metric. Thereafter, a finer analysis of the parsable code translations from NL intent was introduced for the CoNaLa challenge. The proposed system is bidirectional, as it can also generate NL code documentation from source code. Lastly, a RoBERTa masked language model for Python was proposed to extend the system to code completion.
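BLEU-4, the metric used for the TranX comparison, is the brevity-penalized geometric mean of 1- to 4-gram precisions. A simplified sentence-level sketch (real evaluations normally use corpus-level BLEU with proper smoothing, as in the standard toolkits):

```python
import math
from collections import Counter

def bleu4(candidate, reference):
    """Sentence-level BLEU-4: geometric mean of modified n-gram
    precisions (n=1..4) times a brevity penalty. Illustrative only."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    precisions = []
    for n in range(1, 5):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())       # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # tiny floor avoids log(0)
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)
```

An identical candidate and reference score 1.0; any missing n-grams pull the geometric mean down sharply.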
-
Learning Multilingual Embeddings for Cross-Lingual Information Retrieval in the Presence of Topically Aligned Corpora
CoRR (arXiv)
Cross-lingual information retrieval is a challenging task in the absence of aligned parallel corpora. In this paper, we address this problem by considering topically aligned corpora designed for evaluating an IR setup. To emphasize: we use neither sentence-aligned nor document-aligned corpora, nor any language-specific resources such as a dictionary, thesaurus, or grammar rules. Instead, we embed words into a common space and learn word correspondences directly from it. We test our proposed approach for bilingual IR on standard FIRE datasets for Bangla, Hindi, and English. The proposed method is superior to the state of the art not only on IR evaluation measures but also in terms of time requirements, and we extend it successfully to the trilingual setting.
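Once words or documents from both languages live in a common embedding space, retrieval reduces to nearest-neighbor search under cosine similarity. A toy sketch with made-up vectors (the paper's actual embedding-learning method is not reproduced here):

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=1):
    """Rank documents by cosine similarity to the query in a shared
    cross-lingual embedding space; return indices of the top-k docs."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarities
    return np.argsort(-scores)[:k]      # best-scoring documents first

# Toy shared space: a query in one language, documents in another.
query = np.array([1.0, 0.0, 0.2])
docs = np.array([[0.9, 0.1, 0.1],   # topically close to the query
                 [0.0, 1.0, 0.0]])  # unrelated
print(retrieve(query, docs))  # → [0]
```

Because scoring is a single matrix-vector product, this setup is also cheap at query time, which is consistent with the runtime advantage the abstract claims.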
Other authors -
Discovering conversational topics and emotions from Demonetization tweets in India
Springer
In Proceedings of the 2017 International Conference on Computational Intelligence: Theories, Applications and Future Directions (ICCI 2017), IIT Kanpur, India
Other authors -
IR-IITBHU at TREC 2016 Open Search Track: Retrieving documents using Divergence From Randomness model in Terrier
NIST Special Publication: SP 500-321
The Twenty-Fifth Text REtrieval Conference Proceedings (TREC 2016), Gaithersburg, Maryland.
Other authors
Courses
-
C Programming & Fundamentals
-
Computer Graphics
-
Computer Networks
-
Cryptography & Network Security
-
Cyber Law & Security
-
DBMS
-
Data Mining & Data Warehousing
-
Data Structures & Algorithms
-
Design & Analysis of Algorithms
-
Distributed Computing Systems
-
E-commerce
-
Engineering Economics
-
Formal Languages & Automata Theory
-
Image Processing
-
Industrial Management
-
Information Theory & Coding
-
Internetworking
-
Multimedia Systems
-
Object-Oriented Programming
-
Operating Systems
-
Operations Research
-
Web Technology
Honors & Awards
-
4th All Bengal Mathematics Talent Search Exam
JMMC RESEARCH FOUNDATION
Ranked 3rd in the All Bengal Mathematics Talent Search Exam 2011, conducted by JMMC
-
All India Camel Colour Contest (State Level)
Camlin Limited
Awarded the All India Camel Colour Contest prize at the state level three times in a row
-
Best All-rounder
Maria's Day School
Awarded in class 10 for all-round achievement in academics, co-curricular activities, and sports.
-
Best Speaker
Council for the Indian School Certificate Examinations
Best Speaker at secondary school in class X and runner-up in the Frank Anthony Memorial Debate (City Round)
-
Bivakar(5th year) Distinction in Painting
Bangiya Sangeet Parishad, Rabindra Bharati University, Kolkata
Fifth year qualified in painting with triple distinction, awarded by Bangiya Sangeet Parishad
-
Limca Quiz City Round, Bengal ALSOC, & TTIS School Quizzes
-
Ranked 3rd in the Limca Quiz City Round and top 8 in the Bengal ALSOC and TTIS School Quizzes
Languages
-
English
Native or bilingual proficiency
-
Bengali
Native or bilingual proficiency
-
Hindi
Professional working proficiency
-
German
Limited working proficiency
More activity by Mitodru
-
We’re opening applications for the next round of Llama Impact Grants. The program supports organizations pursuing ideas to use Llama to address…
Liked by Mitodru Niyogi
-
Rise and Exit of AI Unicorns Recent high-profile acquisitions of AI startups Character.ai (Google), Adept (Amazon), and Inflection AI (Microsoft)…
Liked by Mitodru Niyogi
-
As the [ICML] Int'l Conference on Machine Learning in Vienna drew to a close, I am proud to reflect on the progress we're making at AstraZeneca in…
Liked by Mitodru Niyogi