Ganga-1B — pre-trained Hindi AI model developed at IITGN

This project aims to develop pocket-sized open-source large language models for Indic languages, says Prof Mayank Singh

The Lingo Research Group at the Indian Institute of Technology Gandhinagar (IITGN) has developed Ganga-1B, a Hindi-language artificial intelligence (AI) model described as “a breakthrough in language models”. Named after the longest river flowing through the country, Ganga-1B is the first pre-trained Hindi model developed by an academic research laboratory.

“The initiative strives to achieve performance in understanding and generating text in Indian languages, the first milestone of which is the release of the Ganga-1B model, trained on an extensive monolingual Hindi language dataset,” said Mayank Singh, assistant professor of Computer Science and Engineering and head of IITGN’s Lingo Research Group.

The Ganga-1B model was trained on publicly available Hindi-language data, including news articles, web documents, books, government publications, educational materials and quality-filtered social media conversations.

“The unity project aims to develop pocket size open-source Large Language Models (LLMs) for Indic languages, created and trained from scratch from Indian data. This initiative will propel the Indian open-source community to build LLMs and chatbots that can be trained and deployed under resource-constrained scenarios,” Professor Mayank Singh told The Indian Express.

Ganga-1B, which was downloaded by more than 600 people in less than 48 hours following the announcement, took nearly 1.5 years to develop and was built using open-source data from various websites.

The research team has also been working on models for other languages, including Gujarati, Urdu, Tamil, Telugu and Marathi. It is exploring the use of AI in e-governance for regional languages and is working on an education LLM to support school students and teachers.

Native Indian speakers have further curated the dataset to ensure high quality.

First uploaded on: 09-07-2024 at 05:29 IST
