Sunday, January 26, 2025

DeepSeek - China’s new AI model

  • Silicon Valley Is Raving About a Made-in-China AI Model: “Deepseek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen,” said Marc Andreessen, the Silicon Valley venture capitalist who has been advising President Trump.

    DeepSeek said training one of its latest models cost $5.6 million, compared with the $100 million to $1 billion range cited last year by Anthropic.

    DeepSeek said R1 and V3 both performed better than or close to leading Western models. As of Saturday, the two models were ranked in the top 10 on Chatbot Arena, a platform run by researchers at the University of California, Berkeley, that rates chatbot performance. A Google Gemini model was in the top spot, while DeepSeek bested Anthropic’s Claude and Grok from Elon Musk’s xAI.


  • How China’s new AI model DeepSeek is threatening U.S. dominance: “To see the DeepSeek new model, it’s super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient,” Microsoft CEO Satya Nadella said at the World Economic Forum in Davos. “We should take the developments out of China very, very seriously.”

    DeepSeek also had to navigate the strict semiconductor restrictions that the U.S. government has imposed on China, cutting the country off from access to the most powerful chips, like Nvidia’s H100s. The latest advancements suggest either that DeepSeek found a way to work around the rules or that the export controls were not the chokehold Washington intended.

    “They can take a really good, big model and use a process called distillation,” said Benchmark General Partner Chetan Puttagunta. “Basically you use a very large model to help your small model get smart at the thing you want it to get smart at. That’s actually very cost-efficient.”


  • DeepSeek R1 Explained to your grandma


  • How small Chinese AI start-up DeepSeek shocked Silicon Valley: DeepSeek’s R1 release sparked a frenzied debate in Silicon Valley about whether better resourced US AI companies, including Meta and Anthropic, can defend their technical edge.

    Industry insiders say DeepSeek’s singular focus on research makes it a dangerous competitor because it is willing to share its breakthroughs rather than protect them for commercial gains. DeepSeek has not raised money from outside funds or made significant moves to monetise its models.

    DeepSeek claimed it used just 2,048 Nvidia H800s and $5.6mn to train a model with 671bn parameters, a fraction of what OpenAI and Google spent to train comparably sized models.

    Ritwik Gupta, AI policy researcher at the University of California, Berkeley, said DeepSeek’s recent model releases demonstrate that “there is no moat when it comes to AI capabilities”. “The first person to train models has to expend lots of resources to get there,” he said. “But the second mover can get there cheaper and more quickly.”

    Gupta added that China had a much larger pool than the US of systems engineers who understand how to make the best use of computing resources to train and run models more cheaply.


  • The Empire Strikes Back: China Prepares One Trillion Yuan AI Plan to Rival $500 Billion US Stargate Project.
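
For readers curious about the "distillation" that Chetan Puttagunta describes above, here is a minimal sketch of the classic soft-label version of the idea: a small "student" model is scored on how closely its output probabilities match those of a large "teacher". This is an illustration only, not DeepSeek's actual training recipe; the function names, the toy scores, and the temperature value are all assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature softens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The loss is zero when the student reproduces the teacher exactly and
    grows as the two distributions drift apart.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target)
    q = softmax(student_logits, temperature)  # student (being trained)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate answers (made up for illustration).
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
poor_student = [0.5, 1.0, 4.0]  # prefers a different answer

print("good student loss:", distillation_loss(teacher, good_student))
print("poor student loss:", distillation_loss(teacher, poor_student))
```

In real training, the student's weights would be adjusted step by step to push this loss down, usually alongside an ordinary loss on labelled data. The cost saving comes from the student being far smaller than the teacher it learns from.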

