
Foxconn unveils its own large language model distilled from Meta's Llama 3.1


Foxconn, the company best known for assembling iPhones and other Apple products, has announced its first large language model (LLM), FoxBrain, which it intends to use to improve manufacturing and supply chain management.

In a statement, the Taiwanese company said that FoxBrain was trained using just 120 of Nvidia's H100 GPUs. The LLM is based on Meta's Llama 3.1 architecture with 70B parameters and was built using distillation, a technique in which a smaller "student" model is trained on the responses of a larger "teacher" model. Foxconn also acknowledged that its LLM isn't as good as China's DeepSeek distillation model, but said that its overall performance is very close to world-class standards.
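For readers curious how response-based distillation works in practice, below is a minimal, illustrative sketch in PyTorch. The tiny stand-in models, batch, and hyperparameters are placeholders for the sake of a runnable example, not details of Foxconn's actual training pipeline.

```python
# Illustrative sketch of response-based knowledge distillation (not Foxconn's code).
# A frozen "teacher" model produces soft output distributions, and a smaller
# "student" model is trained to match them via a KL-divergence loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(128, 1000)   # stand-in for a large pre-trained model
student = nn.Linear(128, 1000)   # stand-in for the smaller model being trained
teacher.eval()

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
temperature = 2.0                # softens the teacher's distribution

inputs = torch.randn(32, 128)    # dummy batch; real training uses tokenized text

with torch.no_grad():
    teacher_logits = teacher(inputs)
student_logits = student(inputs)

# The student learns to reproduce the teacher's softened output distribution.
loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * (temperature ** 2)

loss.backward()
optimizer.step()
```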

Dr. Yung-Hui Li, Director of the Artificial Intelligence Research Center at Hon Hai Research Institute, said:

"In recent months, the deepening of reasoning capabilities and the efficient use of GPUs have gradually become the mainstream development in the field of AI. Our FoxBrain model adopted a very efficient training strategy, focusing on optimizing the training process rather than blindly accumulating computing power.

Through carefully designed training methods and resource optimization, we have successfully built a local AI model with powerful reasoning capabilities."

Foxconn not only assembles Apple products but also produces Nvidia's artificial intelligence servers. Along with the 120 H100 GPUs, FoxBrain was scaled with Nvidia's Quantum-2 InfiniBand networking, and training was completed in about four weeks, at a total computational cost of 2,688 GPU days. Foxconn generated 98B tokens of high-quality pre-training data in Traditional Chinese, and the model supports a context window of 128K tokens.
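As a back-of-the-envelope check of those figures (assuming all 120 GPUs ran concurrently for the whole job), the stated compute budget works out to a little over three weeks of wall-clock time, broadly in line with the roughly four-week training window:

```python
# Rough sanity check of the stated training budget (assumes all GPUs run concurrently).
gpu_count = 120
gpu_days = 2688
wall_clock_days = gpu_days / gpu_count   # ~22.4 days, i.e. just over three weeks
print(f"{wall_clock_days:.1f} days of wall-clock training time")
```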

FoxBrain benchmarks: TMMLU+ benchmark results of FoxBrain, Meta-Llama-3.1-70B, and Taiwan-Llama-70B

Foxconn and Nvidia's partnership isn't new, and both companies are also working on other projects, including building the world's largest facility for manufacturing Blackwell GPUs.

Nvidia also provided Foxconn with its Taipei-1 Supercomputer to complete the pre-training of the model. Foxconn said that FoxBrain will become an "important engine" to upgrade its three major platforms: Smart Manufacturing, Smart EV, and Smart City.
