Behind DeepSeek: why does DeepSeek prefer young people with no work experience?

Reprinted from ChainCatcher
01/28/2025 · Author: Sam Gao, author of ElizaOS
0. Foreword
Recently, the back-to-back releases of DeepSeek V3 and R1 have sent American AI researchers, entrepreneurs, and investors into FOMO. The frenzy is as striking as ChatGPT's debut at the end of 2022.
With DeepSeek R1 fully open-sourced (the model can be downloaded free from Hugging Face for local inference) and priced extremely low (roughly 1/100 the price of OpenAI's o1), the DeepSeek app topped the U.S. Apple App Store in just 5 days.
So where did this mysterious new AI force, incubated by a Chinese quantitative trading firm, come from?
1. The origin of DeepSeek
I first heard of DeepSeek back in 2021. I was working at Alibaba's DAMO Academy at the time, when Luo Fuli, the "genius girl" on the next team who had published 8 ACL papers (ACL is a top NLP venue), left to join High-Flyer Quant. Everyone was curious: why would a highly profitable quantitative firm recruit AI researchers? Did High-Flyer also need to publish papers?
As far as I knew at the time, most of the AI researchers High-Flyer recruited worked independently, exploring cutting-edge directions. The core directions were large language models (LLMs) and text-to-image models (related to OpenAI's DALL-E at the time).
At the end of 2022, High-Flyer gradually began to absorb more and more top AI talent (mostly graduates of Tsinghua and Peking University). Spurred by ChatGPT, CEO Liang Wenfeng, who had accumulated years of experience in AI, resolved to enter the field of artificial general intelligence: "We have built a new company, starting from large language models, and later moving to vision."
Yes, that company is DeepSeek. In early 2023, the "six little dragons" of Chinese AI, represented by Zhipu AI, Moonshot AI, and Baichuan Intelligence, gradually moved to center stage, and to a large extent the attention was carried off by these companies chased by hot money.
So in 2023, as a pure research organization without a star founder (unlike Kai-Fu Lee's 01.AI, Yang Zhilin's Moonshot AI, or Wang Xiaochuan's Baichuan Intelligence), DeepSeek found it hard to raise money from the market on its own. High-Flyer therefore decided to spin DeepSeek off and fully fund its development. Even in the red-hot funding climate of 2023, no venture capital firm was willing to back DeepSeek: first, most of DeepSeek's researchers were freshly graduated PhDs rather than well-known top researchers; second, capital saw no clear path to an exit.
In an environment full of noise and impetuousness, DeepSeek began to write its own story of AI exploration:
- In November 2023, DeepSeek launched DeepSeek LLM, with up to 67 billion parameters and performance approaching GPT-4.
- In May 2024, DeepSeek-V2 was officially launched.
- In December 2024, DeepSeek-V3 was released; benchmarks showed it outperforming Llama 3.1 and Qwen 2.5 and matching GPT-4o and Claude 3.5 Sonnet, drawing industry-wide attention.
- In January 2025, DeepSeek-R1, the company's first reasoning-capable large model, was released. At less than 1/100 the price of OpenAI o1 and with outstanding performance, it made the world's tech community tremble: the world truly realized that China's strength had arrived... and that open source always wins!
2. Talent strategy
In the early days I met several DeepSeek researchers, mainly working on AIGC, such as authors of Janus (released in November 2024) and DreamCraft3D. One of them, @XingChaoliu, helped me polish my latest paper.
From what I have seen, most of the researchers I know are very young: current PhD students, or within 3 years of graduation.
Most of them are master's or doctoral students from universities in Beijing with strong academic records, typically having published 3-5 papers.
I asked friends at DeepSeek: why does Liang Wenfeng only recruit young people?
They pointed me to the words of Liang Wenfeng, CEO of High-Flyer, roughly as follows:
The veil of mystery around the DeepSeek team makes people curious: what is its secret weapon? Foreign media say the secret weapon is "youthful genius", enough to compete with the deep-pocketed American giants.
In the AI industry, hiring experienced veterans is the norm, and many Chinese AI startups prefer to recruit senior researchers or holders of overseas doctorates. DeepSeek goes against the grain: it prefers young people with no work history.
A headhunter who had worked with DeepSeek revealed that DeepSeek does not recruit senior technical staff: "Three to five years of work experience is already the most; more than eight years is basically a pass." In a May 2023 interview with 36Kr, Liang Wenfeng said that most of DeepSeek's developers are either fresh graduates or people who have only just started working in AI. He emphasized: "Most of our core technical positions are filled by fresh graduates or people with one or two years of work experience."
Without work histories to judge by, how does DeepSeek pick people? The answer: look at potential.
Liang Wenfeng has said that long experience is not that important; basic ability, creativity, and passion matter more. He believes the world's top 50 AI talents may not be in China yet, "but we can build such people ourselves."
This strategy reminds me of OpenAI's early playbook. When OpenAI was founded at the end of 2015, Sam Altman's core idea was to find young, ambitious researchers. Apart from president Greg Brockman and chief scientist Ilya Sutskever, the four remaining core members (Andrej Karpathy, Durk Kingma, John Schulman, and Wojciech Zaremba) were fresh PhD graduates, from Stanford University, the University of Amsterdam, UC Berkeley, and New York University respectively.
From left to right: Ilya Sutskever (former chief scientist), Greg Brockman (former president), Andrej Karpathy (former technical lead), Durk Kingma (former researcher), John Schulman (former head of the reinforcement learning team), and Wojciech Zaremba (current technical lead)
This "young wolves" strategy paid off handsomely for OpenAI, incubating the "father of GPT" (a graduate of an ordinary undergraduate college), the father of the text-to-image model DALL-E, Ramesh (an NYU undergraduate), and GPT-4o's multimodal lead, three-time Olympiad gold medalist Prafulla Dhariwal, among others. In its early days, OpenAI's plan to "save the world" was anything but clear; it was in the freewheeling collisions of these young people that a path was blazed, and OpenAI grew from an unknown junior orbiting DeepMind into a giant.
It was precisely because Liang Wenfeng saw the success of Sam Altman's strategy that he committed to this road. But unlike OpenAI, which waited 7 years for ChatGPT, Liang Wenfeng's investment paid off in just over 2 years. That is China speed.
3. Speaking up for DeepSeek
The DeepSeek R1 paper reports stunning metrics, but it also triggered doubts. There are two main ones:
- ① The mixture-of-experts (MoE) technique it uses is demanding to train and requires large amounts of high-quality data. This is why some question whether DeepSeek trained on data generated by OpenAI models.
- ② DeepSeek relies heavily on reinforcement learning (RL), which normally demands enormous hardware; yet compared with the ten-thousand-GPU clusters of Meta and OpenAI, DeepSeek's training used only 2,048 H800s.
Given these compute constraints and the complexity of MoE, a DeepSeek R1 that reportedly succeeded in a single run costing only about US$5 million looks suspicious to some. But whether your attitude toward R1 is admiration for its "low-cost miracle" or doubt that it is "flashy but hollow", you cannot ignore its dazzling innovation.
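To make the MoE idea above concrete, here is a minimal sketch of top-k expert routing, the general technique the doubts refer to. This is an illustrative toy, not DeepSeek's actual architecture; the expert count, top-k value, and dimensions are made-up values chosen for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8   # toy number of experts (illustrative, not DeepSeek's)
TOP_K = 2       # each token is routed to its 2 highest-scoring experts
D_MODEL = 16    # toy hidden size

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
# The router scores every expert for every token.
router_w = rng.normal(size=(D_MODEL, N_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top-k experts and mix their outputs."""
    logits = x @ router_w                          # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]  # indices of the best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        gates = np.exp(chosen - chosen.max())
        gates /= gates.sum()                       # softmax over chosen experts only
        for gate, e in zip(gates, top[t]):
            out[t] += gate * (x[t] @ experts[e])   # only k of n experts run per token
    return out

tokens = rng.normal(size=(4, D_MODEL))             # 4 toy tokens
y = moe_layer(tokens)
print(y.shape)
```

The point of the design is the last comment: although the layer holds 8 experts' worth of parameters, each token only pays the compute cost of 2 of them, which is exactly why MoE models can be large yet comparatively cheap to run.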
BitMEX co-founder Arthur Hayes posted: will the rise of DeepSeek cause global investors to question American exceptionalism? Is the value of U.S. assets seriously overestimated?
Andrew Ng, a professor at Stanford University, said publicly at this year's Davos Forum: "I was impressed by DeepSeek's progress. I think they are able to train models in a very economical way. The latest reasoning model they released is excellent..."
A16Z founder Marc Andreessen said: "DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen, and as open source, a profound gift to the world."
DeepSeek, which stood at the corner of the stage in 2023, finally stood at the summit of world AI just before the 2025 Lunar New Year.
4. ARGO and DeepSeek
As ARGO's technical developer and an AIGC researcher, I have brought ARGO's key features onto DeepSeek: as a workflow system, ARGO now runs its draft-generation and execution work on DeepSeek R1. ARGO has also adopted DeepSeek R1 as its standard built-in LLM and abandoned the expensive OpenAI models, because a workflow system typically involves heavy token consumption and long context (on average >= 10K tokens per run). With OpenAI or Claude 3.5, executing a workflow becomes very expensive, and before Web3 users capture real value, that kind of overdraft damages the product.
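The cost reasoning above can be sketched as back-of-the-envelope arithmetic. The per-million-token prices and daily run volume below are assumed illustrative figures (chosen to reflect the roughly 1/100 price ratio the article cites), not quoted official rates.

```python
AVG_TOKENS_PER_RUN = 10_000   # the article's ">= 10K tokens" per workflow run
RUNS_PER_DAY = 1_000          # hypothetical usage volume, for illustration only

PRICE_PER_M_TOKENS = {        # ASSUMED blended $/1M-token prices, not official rates
    "openai_o1": 30.0,
    "deepseek_r1": 0.30,      # ~1/100 of the o1 figure above, per the article's ratio
}

def daily_cost(model: str) -> float:
    """Dollars per day at the assumed price and volume."""
    tokens_per_day = AVG_TOKENS_PER_RUN * RUNS_PER_DAY
    return tokens_per_day / 1_000_000 * PRICE_PER_M_TOKENS[model]

for model in PRICE_PER_M_TOKENS:
    print(f"{model}: ${daily_cost(model):.2f}/day")
```

Under these assumptions the o1-backed workflow costs 100x more per day than the R1-backed one, which is the overdraft the paragraph above describes: for a token-hungry workflow product, the model price ratio passes straight through to the operating cost.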
As DeepSeek keeps improving, ARGO will cooperate ever more closely with the Chinese forces DeepSeek represents, including moving its Text2Image/Video interfaces and its LLM over to Chinese models.
On the cooperation front, ARGO will invite DeepSeek researchers to share technical results, and will provide grants to top AI researchers so that Web3 investors and users can keep up with AI progress.