Sarvam-1: AI for Multilingual India

Sarvam AI, an emerging force in India's generative AI ecosystem, has launched Sarvam-1, a groundbreaking language model tailored specifically for Indian languages. The open-source model supports 10 Indian languages, including Bengali, Hindi, and Tamil, as well as English. Released in October 2024, Sarvam-1 builds on the company's previous model, Sarvam 2B, which was introduced in August 2024.

What Makes Sarvam-1 Special?

Built with 2 billion parameters, Sarvam-1 delivers strong natural language processing capabilities. Parameter count is a rough measure of a model's capacity: more parameters often mean better performance, at the cost of more compute. For perspective, Microsoft's Phi-3 Mini has 3.8 billion parameters. With fewer than 10 billion parameters, Sarvam-1 falls into the Small Language Model (SLM) category, in contrast to Large Language Models (LLMs) such as OpenAI's GPT-4, whose parameter counts are reported to run into the hundreds of billions or more.
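
To make the parameter figure concrete, the snippet below counts a model's parameters after downloading it with the Hugging Face transformers library. This is a minimal sketch: the repository id sarvamai/sarvam-1 is an assumption here, so confirm the exact id on the model's Hugging Face page.

    # Minimal sketch: counting a model's parameters with Hugging Face transformers.
    # The repository id "sarvamai/sarvam-1" is an assumption; confirm it on the
    # model card before running.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params / 1e9:.2f}B parameters")  # should print roughly 2B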

Technical Overview

Sarvam-1 was built on NVIDIA's NeMo framework, with training run on 1,024 GPUs provided by Yotta. A key challenge in creating the model was the scarcity of high-quality datasets for Indian languages. To address this, Sarvam AI built its own comprehensive training dataset, Sarvam-2T.

The Sarvam-2T Dataset

  • 2 trillion tokens in total, covering all 10 supported languages.
  • Synthetic data generation is used to improve dataset quality and coverage where high-quality web text is scarce.
  • Approximately 20% of the corpus is Hindi, with significant portions in English and programming languages and the remainder spread across the other Indic languages.

This multilingual mix helps Sarvam-1 perform strongly on both monolingual and cross-lingual tasks across its supported languages.

Performance and Benchmarks

Sarvam-1's tokenizer is optimized for Indic scripts, representing the same text in fewer tokens than English-centric tokenizers and thereby improving efficiency. Despite its smaller size, Sarvam-1 has outperformed several larger models, including Meta's Llama-3 and Google's Gemma-2, on Indic-language versions of benchmarks such as MMLU and ARC-Challenge.
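
The token-efficiency claim is easy to probe: a tokenizer tuned for Indic scripts should need far fewer tokens to encode Devanagari text than a general-purpose English-centric one. The sketch below compares token counts on a short Hindi sentence, using gpt2 as the baseline tokenizer; the sarvamai/sarvam-1 repository id is an assumption.

    # Sketch: comparing tokens-per-character ("fertility") on a Hindi sentence.
    # "sarvamai/sarvam-1" is an assumed repository id; "gpt2" serves as an
    # English-centric baseline tokenizer.
    from transformers import AutoTokenizer

    text = "भारत एक बहुभाषी देश है।"  # "India is a multilingual country."

    for repo in ("sarvamai/sarvam-1", "gpt2"):
        tok = AutoTokenizer.from_pretrained(repo)
        ids = tok.encode(text, add_special_tokens=False)
        print(f"{repo}: {len(ids)} tokens, "
              f"{len(ids) / len(text):.2f} tokens per character")

A lower tokens-per-character figure means the model processes the same text in fewer steps, which translates directly into faster and cheaper inference.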

Key Achievements

  • TriviaQA benchmark (Indic languages): 86.11 accuracy, versus 61.47 for Meta's Llama-3.1 8B.
  • Inference speed: 4 to 6 times faster than larger models such as Gemma-2-9B and Llama-3.1-8B.

These metrics highlight Sarvam-1’s computational efficiency and ability to deliver high performance even with fewer resources.
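
The speed figures above depend heavily on hardware and serving stack, so they are best verified on your own setup. The rough timing sketch below measures tokens generated per second with the transformers library; it is illustrative rather than a rigorous benchmark, and the sarvamai/sarvam-1 repository id is again an assumption.

    # Rough throughput sketch: tokens generated per second for a causal LM.
    # Not a rigorous benchmark; results vary with hardware, batch size, and
    # serving framework. The repository id is an assumption.
    import time
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "sarvamai/sarvam-1"
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo, torch_dtype=torch.bfloat16, device_map="auto"
    )

    inputs = tok("भारत की राजधानी", return_tensors="pt").to(model.device)
    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    elapsed = time.perf_counter() - start

    new_tokens = out.shape[-1] - inputs["input_ids"].shape[-1]
    print(f"{new_tokens / elapsed:.1f} tokens/sec")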

Real-World Applications

With its speed and efficiency, Sarvam-1 is well-suited for edge deployment, which is critical for use cases in resource-limited environments such as rural areas; a sketch of one possible quantized setup follows the list below. This makes it a practical tool for various applications, including:

  • Chatbots and virtual assistants
  • Voice recognition systems in regional languages
  • Translation tools for multilingual communication
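
A common route to edge-friendly deployment is weight quantization, which shrinks a model's memory footprint at a small cost in accuracy. The sketch below loads the model in 4-bit precision via bitsandbytes; this is one illustrative approach under stated assumptions (a CUDA GPU is available, and sarvamai/sarvam-1 is the repository id), not an officially documented deployment recipe for Sarvam-1.

    # Sketch: 4-bit quantized loading with bitsandbytes to cut memory use.
    # Requires a CUDA GPU plus the bitsandbytes and accelerate packages.
    # Illustrative only; the repository id is an assumption.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    repo = "sarvamai/sarvam-1"
    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo, quantization_config=quant, device_map="auto"
    )
    # The quantized model can now serve the chatbot, voice, or translation
    # workloads listed above within a much smaller memory budget.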

Open Access for Developers

Sarvam-1 is available for download on Hugging Face, a leading platform for open-source AI models. This accessibility empowers developers, researchers, and businesses to leverage the model for projects that require high-quality Indian language processing.
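
For developers who want to try it, the sketch below downloads the model and generates a short completion with the transformers library. As with the earlier snippets, the repository id sarvamai/sarvam-1 is an assumption; check the model card on Hugging Face for the exact id and any usage notes.

    # Minimal sketch: loading Sarvam-1 from Hugging Face and generating text.
    # The repository id is an assumption; confirm it on the model card.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "sarvamai/sarvam-1"
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo, torch_dtype=torch.bfloat16, device_map="auto"
    )

    prompt = "भारत में बोली जाने वाली प्रमुख भाषाएँ"  # "Major languages spoken in India"
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))

Note that a prompt phrased as a completion, rather than a chat instruction, tends to work better with base pretrained models like this one.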
