Last month, Microsoft introduced the Phi-3.5 family of lightweight models, which brought several improvements over the previous generation. Phi-3.5-MoE was the first model in the Phi family to use Mixture of Experts (MoE) technology.
Microsoft has now announced that the Phi-3.5-MoE model is available in Azure AI Studio and on GitHub through a serverless API. This allows developers to use Phi-3.5-MoE in their workflows and applications without worrying about the underlying infrastructure.
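As a rough illustration of what that looks like in practice, here is a minimal sketch using the azure-ai-inference Python package against a serverless deployment. The endpoint URL and environment variable name are placeholders you would replace with your own deployment's values:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Endpoint and key come from your own Azure AI Studio deployment
# (both values below are placeholders).
client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.<region>.models.ai.azure.com",
    credential=AzureKeyCredential(os.environ["AZURE_AI_KEY"]),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Explain Mixture of Experts in one sentence."),
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```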
Phi-3.5-MoE and the other Phi-3.5 models are available in the East US 2, East US, North Central US, South Central US, West US 3, West US, and Sweden Central regions. As a serverless offering, developers pay only for what they consume: $0.00013 per 1K input tokens and $0.00052 per 1K output tokens.
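At those rates, estimating a workload's bill is simple arithmetic. A quick sketch, where the request volume and per-request token counts are hypothetical:

```python
INPUT_RATE = 0.00013 / 1_000   # dollars per input token
OUTPUT_RATE = 0.00052 / 1_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated serverless cost in dollars for one request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical workload: requests averaging ~500 input and ~200 output tokens.
per_request = estimate_cost(500, 200)
print(f"Per request: ${per_request:.6f}")                         # $0.000169
print(f"Per million requests: ${per_request * 1_000_000:,.2f}")   # $169.00
```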
In popular AI benchmarks, Phi-3.5-MoE outperforms nearly all other open models in its class, including Llama-3.1 8B, Gemma-2-9B, and Mistral-Nemo-12B, despite using fewer active parameters than those models. Microsoft also claims that the model delivers performance comparable to, or slightly exceeding, Google's Gemini-1.5-Flash, one of the most popular closed-source models in its class.
The model has 42B total parameters spread across 16 experts but activates only 6.6B of them per token. The Microsoft Research team designed it from scratch to improve performance, multilingual capability, and safety. Instead of relying on traditional training methods, the Phi team developed a new technique called GRIN (GRadient INformed) MoE to improve parameter efficiency and expert specialization, which Microsoft says yields significantly higher quality than conventional training.
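To make the sparse-activation idea concrete, the toy sketch below routes a single token through the top 2 of 16 experts in plain NumPy. It is purely illustrative: the dimensions, gating function, and expert shapes are invented, and it does not reproduce Microsoft's architecture or the GRIN training method.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # Phi-3.5-MoE has 16 experts
TOP_K = 2          # only a few experts run per token; the rest stay idle
D_MODEL = 64       # toy hidden size (the real model is far larger)

# Each expert is a small feed-forward layer; together the experts
# hold most of the model's parameters.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS)) * 0.02  # gating network

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top_k = np.argsort(logits)[-TOP_K:]      # indices of the chosen experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only TOP_K of NUM_EXPERTS experts do any work for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

token = rng.standard_normal(D_MODEL)
out = moe_layer(token)
print(out.shape)                                        # (64,)
print(f"Active experts per token: {TOP_K}/{NUM_EXPERTS}")
```

Because only two of the sixteen experts run for any given token, most of the expert parameters sit idle on each forward pass, which is how a 42B-parameter model can serve requests while activating only 6.6B parameters.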
With its impressive performance and accessibility, Phi-3.5-MoE is poised to empower developers and accelerate innovation in the AI landscape. Its serverless availability and pay-per-use model further lower the barriers to entry, making advanced AI capabilities more attainable than ever before.
Source: Microsoft