NVIDIA Explains AI Foundation Models (ChatGPT, Stable Diffusion, ChatRTX)

NVIDIA continues to detail developments in artificial intelligence, which is very present in its graphics cards and technologies such as DLSS. After a long article on Avatar Cloud Engine (ACE), the manufacturer returns this week to foundation models: neural networks trained on massive volumes of data that form the very basis of generative AI.

Without further ado, here is NVIDIA's presentation of foundation models:

Skyscrapers rest on solid foundations. The same goes for AI-powered applications.

A foundation model is an AI neural network trained on immense amounts of raw data, usually with unsupervised learning.

It is a type of artificial intelligence model trained to understand and generate human-like language. Imagine giving a computer a huge library of books to read and learn, so it can understand the context and meaning of words and sentences, just like a human does.

A foundation model’s deep knowledge base and ability to communicate in natural language make it useful for a wide range of applications, including text generation and summarization, copilot production, computer code analysis, image and video creation, as well as audio transcription and text-to-speech.

ChatGPT, one of the most notable applications of generative AI, is a chatbot built with OpenAI’s GPT foundation model. Now in its fourth version, GPT-4 is a large multimodal model that can ingest text or images and generate text or image responses.

Online applications built on foundation models typically access the models from a data center. But many of these models, and the applications they power, can now run locally on PCs and workstations equipped with NVIDIA GeForce and NVIDIA RTX GPUs.


Using foundation models

Foundation models can serve a variety of functions, including:

  • Language processing: understanding and generating text.
  • Code generation: analysis and debugging of computer code in many programming languages.
  • Visual processing: analyzing and generating images.
  • Speech: generating speech from text and transcribing speech to text.

They can be used as is or with additional refinement. Rather than training an entirely new AI model for each generative AI application, a costly and time-consuming endeavor, users commonly fine-tune foundation models for specialized use cases.

Pre-trained foundation models perform remarkably well, thanks to prompts and data-retrieval techniques such as retrieval-augmented generation (RAG). Foundation models also excel at transfer learning, meaning they can be trained to perform a second task related to their original purpose.

For example, a general-purpose large language model (LLM) designed to converse with humans can be trained to act as a customer service chatbot capable of responding to inquiries using a company knowledge base.

Companies across industries are fine-tuning foundation models to get the best performance from their AI applications.

Types of foundation models

More than 100 foundation models are in use, and the number continues to grow. LLMs and image generators are the two most popular types of foundation model. Most of them can be tried for free, on any hardware, in the NVIDIA API catalog.

LLMs are models that understand natural language and can answer queries. Gemma from Google is an example; it excels in text understanding, transformation and code generation. When asked about astronomer Cornelius Gemma, it noted that his “contributions to celestial navigation and astronomy had a significant impact on scientific progress.” It also provided information about his major achievements, legacy and other facts.

Extending the collaboration around the Gemma models, which are accelerated by NVIDIA TensorRT-LLM on RTX GPUs, Google’s CodeGemma brings powerful yet lightweight coding capabilities to the community. CodeGemma models are available as pre-trained 7B and 2B variants that specialize in code completion and code generation tasks.

The Mistral LLM from MistralAI can follow instructions, respond to requests and generate creative text. Asked to work in a variation of the series keyword, “decoded,” it helped brainstorm the title of this blog and write the text explaining what a foundation model is.

Meta’s Llama 2 is a cutting-edge LLM that generates text and code in response to prompts.

Mistral and Llama 2 are available in the NVIDIA ChatRTX tech demo, which runs on RTX PCs and workstations. ChatRTX lets users personalize these foundation models by connecting them to personal content (documents, doctors’ notes and other data) through RAG. It is accelerated by TensorRT-LLM for quick, contextual responses. And because it runs locally, results are fast and secure.
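To make the idea of RAG more concrete, here is a minimal sketch of the retrieval step in Python. It is not how ChatRTX or TensorRT-LLM is implemented: the function names (tokenize, retrieve, build_prompt), the sample notes and the simple word-count similarity are all assumptions chosen for illustration, and a real system would typically use learned embeddings and then send the assembled prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble the augmented prompt that would be handed to an LLM."""
    context = retrieve(query, documents, k)
    lines = ["Answer using only the context below.", "", "Context:"]
    lines += [f"- {c}" for c in context]
    lines += ["", f"Question: {query}"]
    return "\n".join(lines)


# Hypothetical personal notes standing in for a user's local documents.
NOTES = [
    "Warranty claims must be filed within 24 months of purchase.",
    "The office is closed on public holidays.",
    "Refunds are issued to the original payment method within 10 days.",
]

prompt = build_prompt("How long do refunds take?", NOTES, k=1)
print(prompt)
```

The model itself is unchanged in this sketch; only the prompt is enriched with the most relevant local text, which is why the approach can work without retraining and without the documents leaving the machine.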

Image generators such as Stable Diffusion XL and SDXL Turbo from StabilityAI let users generate stunning, realistic images and visuals. StabilityAI’s video generator, Stable Video Diffusion, uses a generative diffusion model to synthesize video sequences with a single image as the conditioning frame.

Multimodal foundation models can simultaneously process more than one type of data – such as text and images – to generate more sophisticated results.

A multimodal model that works with both text and images could allow users to upload an image and ask questions about it. These types of models are quickly finding their way into real-world applications such as customer service, where they can serve as faster, more user-friendly versions of traditional manuals.

Kosmos-2 is Microsoft’s groundbreaking multimodal model designed to understand and reason about the visual elements of images.

Think globally, run AI models locally

GeForce RTX and NVIDIA RTX GPUs can run foundation models locally.

The results are fast and secure. Rather than relying on cloud-based services, users can leverage applications like ChatRTX to process sensitive data on their local PC without sharing the data with a third party or needing an internet connection.

Users can choose from a growing catalog of open foundation models to download and run on their own hardware. This helps reduce costs compared to using cloud-based applications and APIs, and eliminates latency and network connectivity issues.

You can find NVIDIA GeForce RTX graphics cards on Amazon, Cdiscount and Fnac.
