India's Multilingual GenAI Language Models 💬 🇮🇳
Gyan AI’s Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models
We present Gyan AI Paramanu (“atom”), a family of novel language models for Indian languages.
It is a collection of auto-regressive monolingual, bilingual, and multilingual Indic language models pretrained from scratch on a single GPU for 10 Indian languages (Assamese, Bangla, Hindi, Konkani, Maithili, Marathi, Odia, Sanskrit, Tamil, Telugu) across 5 scripts (Bangla, Devanagari, Odia, Tamil, Telugu), with sizes ranging from 13.29M to 367.5M parameters.
The models are pretrained with a context size of 1024 on a single GPU. They are efficient, small, fast, and powerful. We have also developed an efficient, advanced Indic tokenizer that can tokenize even unseen languages.
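As an illustration, the sketch below trains a SentencePiece BPE tokenizer with byte fallback, which is one standard way to let a tokenizer handle scripts it never saw during training. The corpus path, vocabulary size, and options here are assumptions, not the actual Paramanu tokenizer configuration.

```python
# Sketch: training a SentencePiece BPE tokenizer over a mixed Indic corpus.
# All paths and hyperparameters below are illustrative assumptions.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="indic_corpus.txt",         # hypothetical mixed-script training corpus
    model_prefix="indic_tokenizer",
    vocab_size=32000,                 # assumed vocabulary size
    model_type="bpe",
    character_coverage=1.0,           # retain all Indic codepoints
    byte_fallback=True,               # fall back to bytes, so unseen scripts still tokenize
)

sp = spm.SentencePieceProcessor(model_file="indic_tokenizer.model")
print(sp.encode("ভারত একটি বহুভাষিক দেশ।", out_type=str))  # Bangla: "India is a multilingual country."
```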
To avoid the “curse of multilinguality” in our multilingual mParamanu model, we pretrained on comparable corpora, grouping languages typologically by shared script. We performed human evaluation of our pretrained models for open-ended text generation on grammar, coherence, creativity, and factuality for Bangla, Hindi, and Sanskrit.
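To make the script-based grouping concrete, here is a minimal sketch that buckets text by Unicode block; this heuristic is our own illustration, not the authors' exact corpus-construction procedure.

```python
# Sketch: bucket text by script so that languages sharing a script (e.g. Hindi,
# Marathi, Sanskrit, Konkani, Maithili in Devanagari) are pretrained together.
# The heuristic is an illustrative assumption, not the authors' exact method.

SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),  # Hindi, Marathi, Sanskrit, Konkani, Maithili
    "Bangla":     (0x0980, 0x09FF),  # Bangla, Assamese
    "Odia":       (0x0B00, 0x0B7F),
    "Tamil":      (0x0B80, 0x0BFF),
    "Telugu":     (0x0C00, 0x0C7F),
}

def detect_script(text: str) -> str:
    """Return the script whose Unicode block covers the most characters in `text`."""
    counts = dict.fromkeys(SCRIPT_RANGES, 0)
    for ch in text:
        cp = ord(ch)
        for script, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[script] += 1
                break
    return max(counts, key=counts.get)

print(detect_script("भारत एक बहुभाषी देश है।"))  # -> Devanagari
```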
Our Bangla, Hindi, and Sanskrit models outperformed the GPT-3.5-Turbo (ChatGPT), Bloom 7B, LLaMA-2 7B, OPT 6.7B, GPT-J 6B, GPT-Neo 1.3B, and GPT2-XL large language models (LLMs) by a large margin, despite being 20 to 66 times smaller than standard 7B LLMs.
A CPU is sufficient for running inference on our pretrained models; no GPU is needed.
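For example, a sub-400M-parameter model fits comfortably in CPU RAM. Below is a minimal generation sketch using the Hugging Face transformers API; the checkpoint identifier is hypothetical and should be replaced with the actual released model path.

```python
# Sketch: CPU-only text generation with a small causal LM via Hugging Face
# transformers. The model identifier below is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gyanai/paramanu-hindi"  # hypothetical checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # <400M params fits easily in CPU RAM
model.eval()

prompt = "भारत की राजधानी"  # "The capital of India"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```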
We also instruction-tuned our pretrained Bangla, Hindi, Marathi, Tamil, and Telugu models on 23k instructions in the respective languages.
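The exact instruction format is not documented here; as a hedged illustration, an Alpaca-style record and prompt template for such fine-tuning might look like the following, where the field contents and the template itself are assumptions.

```python
# Sketch: an Alpaca-style instruction-tuning record and prompt template.
# The actual format of the 23k-instruction datasets is an assumption here.
record = {
    "instruction": "नीचे दिए गए वाक्य का एक-पंक्ति सारांश लिखिए।",  # "Write a one-line summary of the sentence below."
    "input": "पैरामाणु छोटे और कुशल भारतीय भाषा मॉडल हैं।",          # illustrative input text
    "output": "पैरामाणु कुशल भारतीय भाषा मॉडल हैं।",                # illustrative target
}

PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

prompt = PROMPT_TEMPLATE.format(instruction=record["instruction"], input=record["input"])
target = prompt + record["output"]  # the model is fine-tuned to continue the prompt with `output`
print(target)
```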
Our pretrained and instruction-tuned models are the first of their kind: the most powerful, efficient, small generative language models developed for Indic languages to date.
The results of our research lead to the conclusion that high-quality generative language models are achievable without enormous compute budgets or huge parameter counts.
Our models: Paramanu-Assamese, Paramanu-Bangla, Paramanu-Hindi, Paramanu-Marathi, Paramanu-Tamil, Paramanu-Telugu, Paramanu-Konkani-Maithili, Paramanu-Odia, Paramanu-Sanskrit, and multilingual mParamanu.
Bangla Evaluation
| Model | MMLU | HellaSwag | ARC |
|---|---|---|---|
| Bloom 7B | 28.2 | 32.8 | 29.2 |
| Bloomz 7B | 25.9 | 31.5 | 28.2 |
| Paramanu-Bangla 108.5M | 31.7 | 33.45 | 32.5 |
Hindi Evaluation
| Model | MMLU | HellaSwag | ARC |
|---|---|---|---|
| Bloom 7B | 27.5 | 36.4 | 29.2 |
| Bloomz 7B | 25.9 | 34.0 | 28.2 |
| Open Hathi 7B | 32.27 | 25.59 | 38.48 |
| Airavata 7B | 34.96 | 25.37 | 44.96 |
| Paramanu-Hindi 367.5M | 38.47 | 37.65 | 41.7 |
Zero-shot XNLI and XStoryCloze for Hindi
| XNLI | XStoryCloze |
|---|---|
| 33.49 | 52.42 |
Tamil Evaluation
| Model | MMLU | HellaSwag | ARC |
|---|---|---|---|
| Bloom 7B | 26.6 | 29.4 | 24.2 |
| Bloomz 7B | 26.7 | 29.5 | 25.6 |
| Paramanu-Tamil 207M | 30.70 | 32.42 | 33.8 |
Telugu Evaluation
| Model | MMLU | HellaSwag | ARC |
|---|---|---|---|
| Bloom 7B | 26.2 | 29.2 | 24.3 |
| Bloomz 7B | 25.7 | 30.7 | 25.8 |
| Paramanu-Telugu 208M | 30.23 | 32.2 | 32.9 |
Generative AI technology for multilingual India and the world.
- We are seeking grants/funding to keep pursuing world-class AI research for India and the world.
German Engineering in India
Interested in learning more about how we can grow your business using our state-of-the-art, domain-adaptive, sector-agnostic generative AI models?
Contact us today to schedule a consultation.
Email: info@bharatgpts.com