DeepSeek: China's AI Lab Disrupting the Tech World

A relative newcomer, DeepSeek, has rapidly ascended to prominence in the artificial intelligence arena, sparking debate and raising questions about the existing power dynamics within the industry. This Chinese AI lab, propelled by its high-performing chatbot app, has captured the attention of Wall Street analysts and tech experts alike, leading to discussions about the sustainability of U.S. dominance in AI and the future demand for AI-powering semiconductors.

From Hedge Fund to AI Powerhouse

DeepSeek's origins are rooted in the world of quantitative finance. Backed by High-Flyer Capital Management, a Chinese hedge fund utilizing AI for trading strategies, DeepSeek emerged from within this innovative environment.

Liang Wenfeng, an AI enthusiast and co-founder of High-Flyer, established the hedge fund in 2019 with a focus on AI-driven algorithms. In 2023, High-Flyer incubated DeepSeek as a dedicated AI research lab, separate from its core financial operations. This lab eventually spun off into an independent entity, retaining the DeepSeek name and benefiting from High-Flyer's continued investment.

From its inception, DeepSeek prioritized building its own data center infrastructure for model training. However, like other Chinese AI firms, it has faced challenges due to U.S. export restrictions on advanced hardware. The company has been compelled to use Nvidia H800 chips, a less potent version of the H100, to train some of its advanced models.

DeepSeek's workforce is known for its youthful energy and talent. The company aggressively recruits doctoral-level AI researchers from leading Chinese universities. Recognizing the importance of diverse perspectives, DeepSeek also hires individuals from non-technical backgrounds to enrich its AI's understanding across a wide spectrum of subjects.

The Rise of DeepSeek's AI Models

DeepSeek unveiled its initial suite of models, including DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat, in late 2023. However, it was the release of its next-generation DeepSeek-V2 family that truly established its reputation.

DeepSeek-V2, a versatile system capable of analyzing both text and images, demonstrated impressive performance on various AI benchmarks. Its cost-effectiveness compared to competing models forced domestic rivals like ByteDance and Alibaba to reduce their model usage fees and, in some cases, offer them for free.

The launch of DeepSeek-V3 further solidified DeepSeek's standing. According to the company's internal benchmark tests, DeepSeek-V3 outperforms both openly available models like Meta's Llama and "closed" models like OpenAI's GPT-4o.

DeepSeek's R1 "reasoning" model is another remarkable achievement. The company asserts that R1 performs comparably to OpenAI's o1 model on critical benchmarks.

As a reasoning model, R1 incorporates a self-checking mechanism, mitigating common errors. While reasoning models may take slightly longer to reach solutions, they offer greater reliability in fields like physics, science, and mathematics.

However, DeepSeek's models are subject to content review by Chinese authorities, ensuring responses align with "core socialist values." Consequently, the DeepSeek chatbot may avoid answering questions on sensitive topics like Tiananmen Square or Taiwan's independence.

Despite these limitations, DeepSeek's user base continues to grow. Its website drew 16.5 million visits in March, placing it second in popularity.

In May, DeepSeek released an updated version of its R1 reasoning AI model on Hugging Face, a platform for developers.

Furthering its innovation, DeepSeek introduced V3.2-exp in September, an experimental model designed to significantly reduce inference costs for long-context operations.

A Disruptive Approach to Business

DeepSeek's business model remains somewhat ambiguous. The company's pricing strategy often undercuts market rates, and it provides some services at no cost. Despite significant investor interest, DeepSeek has not sought external funding.

DeepSeek attributes its competitive pricing to breakthroughs in efficiency. However, some experts question the company's reported figures.

Regardless, developers have widely adopted DeepSeek's models, which are available under licenses permitting commercial use. According to Clem Delangue, CEO of Hugging Face, developers have created over 500 derivative models of R1, generating over 2.5 million downloads.

DeepSeek's success has been described as both "upending AI" and "over-hyped." It played a role in an 18% drop in Nvidia's stock price in January and drew a response from OpenAI CEO Sam Altman. In March, U.S. Commerce Department bureaus indicated DeepSeek would be banned on government devices.

Microsoft integrated DeepSeek into its Azure AI Foundry service. During Meta's first-quarter earnings call, CEO Mark Zuckerberg affirmed that AI infrastructure investment remains a "strategic advantage" for Meta. OpenAI has labeled DeepSeek "state-subsidized" and "state-controlled," suggesting the U.S. government consider banning its models.

Nvidia CEO Jensen Huang acknowledged DeepSeek's "excellent innovation" and noted that reasoning models benefit Nvidia due to their high compute requirements.

However, several entities, including South Korea, New York state, and U.S. government agencies, have restricted DeepSeek's use due to security concerns.

Microsoft vice chairman Brad Smith stated that Microsoft employees are prohibited from using DeepSeek due to data security and potential propaganda risks.

The trajectory of DeepSeek remains uncertain. The company will undoubtedly continue to develop increasingly powerful models. However, the U.S. government is expressing growing concerns about potential foreign influence.

The U.S. government is likely to ban DeepSeek on government devices.
