Over the past year or so, open-source AI models have significantly closed the performance gap with popular closed-source models from OpenAI, Google, and others. Yet developers haven't widely adopted them because of the overhead of deploying and maintaining these models on different hardware. To address this, Hugging Face today announced Hugging Face Generative AI Services (HUGS), an optimized, zero-configuration inference microservice that helps developers accelerate the development of AI apps based on open models.
HUGS deployments also expose an OpenAI-compatible API, making them a drop-in replacement for existing apps built on top of model provider APIs. This should make it easy for developers to migrate apps from OpenAI's models to open-source models.
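To illustrate what that drop-in migration might look like, here is a minimal sketch using the official `openai` Python client pointed at a self-hosted, OpenAI-compatible endpoint. The base URL, API key, and model identifier below are placeholders for your own deployment, not values from the HUGS announcement.

```python
# Minimal sketch: reuse the OpenAI Python client against a self-hosted,
# OpenAI-compatible endpoint such as a HUGS deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical endpoint of your deployment
    api_key="-",  # placeholder; a local deployment may not validate the key
)

response = client.chat.completions.create(
    model="tgi",  # placeholder model identifier; use your deployment's model name
    messages=[{"role": "user", "content": "Hello, open models!"}],
)
print(response.choices[0].message.content)
```

In this pattern, the only code change from an OpenAI-backed app is the `base_url` (and, depending on the deployment, the `api_key` and `model` values); the rest of the application logic stays the same.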
HUGS is built on open-source technologies such as Text Generation Inference (TGI) and Transformers. It is optimized to run open models on a range of hardware accelerators, including NVIDIA GPUs, AMD GPUs, AWS Inferentia (coming soon), and Google TPUs (coming soon). Thirteen popular open LLMs, including Meta's Llama, are supported today, with more to follow. HUGS can be deployed on Amazon Web Services, Google Cloud Platform, and Microsoft Azure (coming soon), with on-demand pricing based on the uptime of each container on public clouds.
According to Hugging Face, HUGS offers the following advantages:
- In YOUR infrastructure: Deploy open models within your own secure environment. Keep your data and models off the Internet!
- Zero-configuration Deployment: HUGS reduces deployment time from weeks to minutes with zero-configuration setup, automatically optimizing the model and serving configuration for your NVIDIA GPU, AMD GPU, or AI accelerator.
- Hardware-Optimized Inference: Built on Hugging Face's Text Generation Inference (TGI), HUGS is optimized for peak performance across different hardware setups.
- Hardware Flexibility: Run HUGS on a variety of accelerators, including NVIDIA GPUs and AMD GPUs, with support for AWS Inferentia and Google TPUs coming soon.
- Model Flexibility: HUGS is compatible with a wide selection of open-source models, ensuring flexibility and choice for your AI applications.
- Industry Standard APIs: Deploy HUGS easily using Kubernetes with endpoints compatible with the OpenAI API, minimizing code changes.
- Enterprise Distribution: HUGS is an enterprise distribution of Hugging Face open source technologies, offering long-term support, rigorous testing, and SOC2 compliance.
- Enterprise Compliance: Minimizes compliance risks by including necessary licenses and terms of service.
You can learn more about HUGS on Hugging Face's website. With its focus on open source and ease of use, HUGS has the potential to democratize access to powerful AI models and accelerate the development of innovative AI applications.