DeepSeek-V3 has a problem: it keeps claiming to be ChatGPT

AI and large language models are moving so fast it’s hard to keep up. It started with ChatGPT taking over the internet, and now we’ve got names like Gemini, Claude, and the newest contender, DeepSeek-V3. DeepSeek-V3 is an open-source LLM developed by DeepSeek AI, a Chinese company. This model has made headlines for its impressive performance and cost efficiency.

DeepSeek-V3 boasts 671 billion parameters, with 37 billion activated per token, and can handle context lengths up to 128,000 tokens. It was trained on 14.8 trillion tokens over approximately two months, using 2.788 million H800 GPU hours, at a reported cost of about $5.6 million. That is far less than the more than $100 million reportedly spent on training OpenAI's GPT-4.
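
To see where the roughly $5.6 million figure can come from, here is a quick back-of-the-envelope sketch. The GPU hours and parameter counts are the ones quoted above; the $2 per H800 GPU hour rental price is an assumption used for illustration, and the function names are made up for this example.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# ASSUMPTION: a rental price of $2 per H800 GPU hour (not stated in the text above).

GPU_HOURS = 2_788_000            # 2.788 million H800 GPU hours (from the article)
PRICE_PER_GPU_HOUR_USD = 2.0     # assumed rental price, for illustration only
TOTAL_PARAMS = 671e9             # 671 billion parameters (from the article)
ACTIVE_PARAMS = 37e9             # 37 billion activated per token (from the article)


def training_cost_usd(gpu_hours: float, price_per_hour: float) -> float:
    """Estimated compute cost: GPU hours multiplied by the hourly price."""
    return gpu_hours * price_per_hour


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that is used for each token."""
    return active / total


cost = training_cost_usd(GPU_HOURS, PRICE_PER_GPU_HOUR_USD)
print(f"Estimated cost: ${cost / 1e6:.3f}M")  # Estimated cost: $5.576M
print(f"Active per token: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")  # 5.5%
```

Under that pricing assumption the estimate covers GPU time only, so it should be read as a rough compute bill rather than the full cost of building the model.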

Despite its capabilities, users have noticed an odd behavior: DeepSeek-V3 sometimes claims to be ChatGPT. For example, when asked, "What model are you?" it responded, "ChatGPT, based on the GPT-4 architecture." This phenomenon, known as "identity confusion," occurs when an LLM misidentifies itself. Similar instances have been observed with other models, like Gemini-Pro, which has claimed to be Baidu's Wenxin when asked in Chinese.

This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while claiming to be DeepSeekV3 only 3 times.

Gives you a rough idea of some of their training data distribution. https://t.co/Zk1KUppBQM
— Lucas Beyer (bl16) (@giffmana) December 27, 2024
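
A count like "5 out of 8" is easy to reproduce once you have a batch of answers to "What model are you?". The snippet below is a minimal sketch of the tallying step only. It assumes you have already collected the responses yourself; the keyword lists, the function names, and the sample strings are all hypothetical placeholders written for this example, not real model output.

```python
from collections import Counter

# ASSUMPTION: simple keyword matching is enough to tell which identity a reply claims.
# These keyword lists are illustrative, not an established taxonomy.
IDENTITY_KEYWORDS = {
    "ChatGPT": ("chatgpt", "gpt-4", "openai"),
    "DeepSeek": ("deepseek",),
}


def claimed_identity(response: str) -> str:
    """Return the first identity label whose keywords appear in the response."""
    text = response.lower()
    for label, keywords in IDENTITY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return label
    return "other"


def tally_identities(responses: list[str]) -> Counter:
    """Count how often each identity is claimed across a batch of responses."""
    return Counter(claimed_identity(r) for r in responses)


# Placeholder strings arranged to mirror the 5-versus-3 split described above.
sample_responses = (
    ["I am ChatGPT, based on the GPT-4 architecture."] * 5
    + ["I am DeepSeek-V3."] * 3
)

counts = tally_identities(sample_responses)
print(counts["ChatGPT"], "of", len(sample_responses))  # 5 of 8
```

Keyword matching is crude: a reply that mentions both names is counted under whichever label is checked first, so a real experiment would want a stricter classifier and many more than eight samples.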

The cause of this identity confusion seems to come down to training data. DeepSeek-V3 likely picked up text generated by ChatGPT during its training, and somewhere along the way, it started associating itself with the name.

Researchers have looked into this problem in detail. A paper published in November 2024 found that around 25% of proprietary large language models experience this issue. And while it might seem like a harmless glitch, it can become a real problem in fields like education or professional services, where trust in AI outputs is critical.