-
Silicon Valley Is Raving About a Made-in-China AI Model: “Deepseek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen,” said Marc Andreessen, the Silicon Valley venture capitalist who has been advising President Trump.
DeepSeek said training one of its latest models cost $5.6 million, compared with the $100 million to $1 billion range cited last year by Anthropic.
DeepSeek said R1 and V3 both performed better than or close to leading Western models. As of Saturday, the two models were ranked in the top 10 on Chatbot Arena, a platform hosted by University of California, Berkeley, researchers that rates chatbot performance. A Google Gemini model was in the top spot, while DeepSeek bested Anthropic’s Claude and Grok from Elon Musk’s xAI.
-
How China’s new AI model DeepSeek is threatening U.S. dominance: “To see the DeepSeek new model, it’s super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient,” Microsoft CEO Satya Nadella said at the World Economic Forum in Davos. “We should take the developments out of China very, very seriously.”
DeepSeek also had to navigate the strict semiconductor restrictions that the U.S. government has imposed on China, cutting the country off from access to the most powerful chips, like Nvidia’s H100s. The latest advancements suggest DeepSeek either found a way to work around the rules, or that the export controls were not the chokehold Washington intended.
“They can take a really good, big model and use a process called distillation,” said Benchmark General Partner Chetan Puttagunta. “Basically you use a very large model to help your small model get smart at the thing you want it to get smart at. That’s actually very cost-efficient.”
-
DeepSeek R1 Explained to your grandma:
-
How small Chinese AI start-up DeepSeek shocked Silicon Valley: DeepSeek’s R1 release sparked a frenzied debate in Silicon Valley about whether better resourced US AI companies, including Meta and Anthropic, can defend their technical edge.
Industry insiders say DeepSeek’s singular focus on research makes it a dangerous competitor because it is willing to share its breakthroughs rather than protect them for commercial gains. DeepSeek has not raised money from outside funds or made significant moves to monetise its models.
DeepSeek claimed it used just 2,048 Nvidia H800s and $5.6mn to train a model with 671bn parameters, a fraction of what OpenAI and Google spent to train comparably sized models.
Ritwik Gupta, AI policy researcher at the University of California, Berkeley, said DeepSeek’s recent model releases demonstrate that “there is no moat when it comes to AI capabilities”. “The first person to train models has to expend lots of resources to get there,” he said. “But the second mover can get there cheaper and more quickly.”
Gupta added that China had a much larger talent pool of systems engineers than the US who understand how to get the best use of computing resources to train and run models more cheaply.
-
The Empire Strikes Back: China Prepares One Trillion Yuan AI Plan to Rival $500 Billion US Stargate Project.
Sunday, January 26, 2025
DeepSeek - China’s new AI model