Dialogue with Entropy Simplification Technology: Standing on DeepSeek's Shoulders, Pushing Open the Door to a New World of AI Investment Research | Gaorong Future

高榕创投 · February 21, 2025

From AI-assisted investment research to AI-assisted investment decision-making.

This Spring Festival, DeepSeek set the global AI world ablaze with its high-performance models, low training costs, and open-source approach.

Beyond the discussion of technical innovation, we are particularly focused on two practical questions: how do you translate model capabilities into concrete application scenarios, and how can vertical-domain entrepreneurs seize the opportunities DeepSeek creates?

Entropy Technology was among the first to validate the transformative impact of DeepSeek-R1 and reasoning models on financial investment research — a B2B use case.

Entropy positions itself as an infrastructure provider for AI-powered investment research. The company initially served large domestic asset management institutions, helping clients build investment research data centers. After large language models emerged, Entropy launched its flagship product AlphaEngine — a professional AI search engine for finance professionals. In just over a year, it grew its institutional investor user base from zero to 40,000+, serving more than 5,000 asset management firms domestically. Through AlphaEngine, users can efficiently search and review industry research reports, announcements, and meeting transcripts, and engage in Q&A with their document libraries via FinGPT.

After DeepSeek-R1's open-source release, Entropy moved quickly. Using DeepSeek-R1 combined with high-quality investment research chain-of-thought (CoT) trajectories as references, it distilled FinGPT Deep — China's first financial large language model with deep reasoning capabilities — and deployed it on AlphaEngine.

"The market response has been tremendous since launch. Many users have shared screenshots saying the AI's answers are incredible — a qualitative leap compared to previous models," said Fei Binjie, founder and CEO of Entropy Technology.

With deep reasoning capabilities, what upgrades does AlphaEngine enable? Let's look at two examples.

Example 1. Query to FinGPT: "After DeepSeek ignited the A-share market, which stocks have substantive connections?" The model works through multiple reasoning steps to arrive at its answer.

Example 2. Query to FinGPT: "Xiaomi's stock price and market cap have been climbing steadily. What is its future investment value?" FinGPT's response draws on fundamental analysis, business growth potential, and valuation analysis, with precise source tracing and a transparent view of the investment research thought process.

"We see the possibility of AI-assisted investment research evolving toward AI-assisted investment decision-making." Recently, we spoke with Fei Binjie about how to stand on DeepSeek's shoulders and push open the door to a new world of AI-powered investment research.

Here are Fei Binjie's insights:

DeepSeek-R1 allows users to train other models through distillation, and with DeepSeek's explosive popularity during the Spring Festival, we thought of using R1 to distill Entropy's FinGPT model. This would let us support user access with limited compute. We mobilized fewer than 30 GPUs and spent two weeks migrating and distilling the deep understanding and long-range reasoning capabilities embedded in R1's 600-billion-plus parameters into FinGPT, endowing it with logical reasoning, causal analysis, and multi-step decision-making capabilities.
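To make the distillation step more concrete, here is a minimal sketch of how a teacher model's reasoning traces might be packaged into supervised fine-tuning records for a smaller student model. It is an illustration under stated assumptions, not Entropy's actual pipeline: the `TeacherTrace` structure, the prompt/completion record layout, the length filter, and all function names are hypothetical. The only detail borrowed from R1 is its convention of wrapping reasoning in `<think>` tags.

```python
import json
from dataclasses import dataclass


@dataclass
class TeacherTrace:
    """One teacher response: a question, its reasoning chain, and the final answer."""
    question: str
    reasoning: str
    answer: str


def to_sft_record(trace: TeacherTrace) -> dict:
    """Format a teacher trace as a prompt/completion pair for student fine-tuning.

    The <think>...</think> wrapper mirrors the tag style R1 uses for reasoning;
    the dict layout itself is a hypothetical choice for this sketch.
    """
    completion = f"<think>\n{trace.reasoning.strip()}\n</think>\n{trace.answer.strip()}"
    return {"prompt": trace.question.strip(), "completion": completion}


def build_distillation_set(traces: list[TeacherTrace], min_reasoning_chars: int = 20) -> list[dict]:
    """Keep only traces with a non-trivial reasoning chain and a non-empty answer."""
    return [
        to_sft_record(t)
        for t in traces
        if len(t.reasoning.strip()) >= min_reasoning_chars and t.answer.strip()
    ]


# Made-up example traces; real ones would come from the teacher model.
traces = [
    TeacherTrace(
        question="Which factors drive margin for a handset maker?",
        reasoning="Start from the revenue mix, then compare component costs and pricing power.",
        answer="Product mix, component costs, and pricing power.",
    ),
    TeacherTrace(question="Empty example", reasoning="", answer="n/a"),
]
records = build_distillation_set(traces)
print(json.dumps(records[0], ensure_ascii=False, indent=2))
```

In this sketch the second trace is dropped because it carries no reasoning chain, which reflects the point that the student is meant to learn the reasoning process, not just the final answers.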

Leveraging FinGPT's deep reasoning capabilities, AlphaEngine achieved three core functional upgrades: 1) multi-step reasoning Q&A, 2) precise source tracing, and 3) integrated analyst thinking.

We were able to complete this work quickly and achieve a dramatic improvement in product experience because Entropy had already accumulated a comprehensive, deep investment research database and high-quality research CoT data.

In reality, investment research data remains far from fully exploited. A few years ago, the research data people frequently encountered included market data, financial data, announcement data, and domestic and international research reports. Today, a new format has emerged — meetings. Financial markets host massive numbers of meetings daily: earnings calls, public and private sessions organized by broker analysts for clients. These meetings are highly timely and tremendously valuable.

Entropy has assembled the most comprehensive collection of meeting transcripts in the market, with 300+ sessions updated daily across all industries. Mining the value of this meeting data efficiently requires, first of all, sufficiently accurate speech-to-text transcription: "If Company A is mistranscribed as Company B, even the strongest AI analysis capabilities are useless." Investment research also involves extensive specialized terminology, which makes transcription harder. That's why Entropy developed FinAudio, a model built specifically for meeting scenarios that accurately transcribes investment research audio into text.

Because FinGPT connects to industry-specific private document libraries, its responses are considerably more aligned with user needs than general-purpose models. For example, when asked about stocks worth watching after Ne Zha 2's release, other large models might reference internet news reports. FinGPT, by contrast, draws on expert interviews, research reports, or meeting notes from broker research chiefs.

Furthermore, Entropy has accumulated hundreds of thousands of high-quality investment research CoT samples (records of the reasoning process) for cold start, which also accelerated the entire process and improved model performance.

We've been tracking reasoning models for some time. From OpenAI's Q* to Strawberry to o1, the underlying principle is the same: scaling up inference compute improves model reasoning capabilities and enables the emergence of complex thinking logic. o1 essentially internalized the chain-of-thought reasoning capability that previously existed outside large language models.

The emergence of DeepSeek-R1 marks an entirely new inflection point. DeepSeek didn't simply learn from OpenAI and replicate an open-source version of o1. It developed original insights and innovations in algorithms, parallel computing, and other areas — such as using pure reinforcement learning to elicit reasoning capabilities and the GRPO algorithm. DeepSeek isn't merely standing on the shoulders of giants; it has become a new shoulder itself.
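As a rough illustration of the group-relative idea behind GRPO: for each prompt, a group of answers is sampled, and each answer's reward is normalized against the mean and standard deviation of its own group, so no separate value model is needed to estimate a baseline. The sketch below covers only this advantage step, not the clipped policy update or the KL penalty. The function name, the small `eps` term, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each sampled answer's reward against its own group.

    eps guards against division by zero when every answer in the
    group receives the same reward (all advantages are then 0).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four answers sampled for the same prompt, scored 1.0 if correct and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that beat their group's average get a positive advantage and are reinforced; those below average get a negative one.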

Reading the R1 paper, we were tremendously excited because it brings disruptive change to vertical-domain AI vendors.

1. DeepSeek validated that distillation can transfer a large model's reasoning capabilities to a smaller model, significantly boosting the smaller model's performance even beyond the original large model. The insight for us is that the best way to train an industry model is to take an open-source foundation model and distill an industry-specific smaller model from it.

2. Some have said that with R1, industry models no longer have a reason to exist. We believe the opposite: with R1, we can rapidly train a model that outperforms R1 within a specific vertical domain.

Two arguments underpin this: 1) compared to a general-purpose model of equivalent parameter scale, an industry model shows significantly improved performance in its domain after industry-specific training; 2) when a large model's reasoning capabilities are distilled into a smaller model, that smaller model's ability to handle industry problems markedly exceeds that of other smaller models of comparable scale.

3. General-purpose models and industry models must coexist. Imagine a tree: the general-purpose model is the trunk, and industry models are the branches. As the trunk grows upward, new branches sprout from its sides. Within their domains, these new branches (the industry models) will necessarily outperform the general model by an order of magnitude.

Each time the general model advances another step, we train new industry models on top of it and continue to bring new changes to the industry.

4. DeepSeek-R1 validated that pure reinforcement learning alone, without large-scale labeled data or SFT (supervised fine-tuning), can lead a large model to spontaneously generate reasoning processes. For model distillation, this suggests that industry-specific training sets are still needed, but they need not rely on manual annotation; they should be generated mainly through machine-driven, rule-based methods.

In other words, once you have prepared enough industry-specific data with definitive, checkable outcomes, you can batch-generate training data almost without limit. For example, have a large model recommend the 10 stocks most likely to rise next week. The model reasons through various approaches to arrive at its answer, and we can then use a clear, rule-based evaluation system to score those 10 stocks on their actual market performance over the following week. Because the evaluation signal (whether a stock rose or fell) is 100% accurate, training costs drop dramatically and convergence speeds up.

5. A few years ago, training industry models required practitioners to extensively label data and contribute knowledge. For instance, to teach a model how to analyze a company's investment value, analysts would need to build a knowledge framework and graph for the machine to learn from. Today, large models can spontaneously generate intelligence through reinforcement learning.

This is the significant opportunity R1 brings to third-party AI vendors. Our previous pain point was not having a large enough research team to contribute knowledge. Now we can tackle this through emergence, bringing industry-wide intelligence within reach.
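The rule-based evaluation described in point 4 can be sketched as follows. This is a minimal illustration, assuming the reward is simply the share of recommended stocks that closed higher one week later. The tickers, prices, and function names are made up for the example, and a production reward would presumably also need to account for benchmarks, transaction costs, and look-ahead leakage.

```python
def weekly_return(start_price: float, end_price: float) -> float:
    """Simple return over the evaluation week."""
    return (end_price - start_price) / start_price


def hit_rate_reward(picks: list[str], start: dict[str, float], end: dict[str, float]) -> float:
    """Fraction of picks that rose over the week.

    Picks with missing price data count as misses, and a flat week
    (return of exactly 0) does not count as a rise.
    """
    if not picks:
        return 0.0
    hits = 0
    for ticker in picks:
        if ticker in start and ticker in end and weekly_return(start[ticker], end[ticker]) > 0:
            hits += 1
    return hits / len(picks)


# Hypothetical prices at the time of recommendation and one week later.
start_prices = {"AAA": 10.0, "BBB": 20.0, "CCC": 5.0, "DDD": 8.0}
end_prices = {"AAA": 11.0, "BBB": 19.0, "CCC": 5.5, "DDD": 8.0}
model_picks = ["AAA", "BBB", "CCC", "DDD"]

reward = hit_rate_reward(model_picks, start_prices, end_prices)
print(reward)  # 0.5: AAA and CCC rose, BBB fell, DDD was flat
```

Because the score is computed mechanically from market data, no human annotator is involved, which is what allows training examples to be generated in bulk.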

From a practical application perspective, as large model reasoning capabilities evolve, AlphaEngine can now address customer needs that were previously out of reach.

Last year we conducted extensive customer research and compiled an AI-assisted investment "wish list" covering 30+ functions. Previous-generation large models could satisfy roughly 20% of these needs; with reasoning models, we can now essentially address 50%.

One recently cracked capability: AI-driven investment idea generation. Generating investment ideas is the core competency of the asset management industry. Investment research is fundamentally about extracting correct non-consensus insights from uncertain information. Investment opportunities span short-term (under one week), medium-term (1-3 months), and long-term horizons. The longer the horizon, the greater the reasoning difficulty because the depth of thinking required increases. Current reasoning model capabilities are relatively well-suited to identifying short-to-medium-term opportunities.

This also means R1's emergence opens substantial new territory for intelligent investment research. Borrowing the autonomous driving L1-L4 framework: what we previously achieved was L1 — AI-assisted investment research, where AI helps you search more accurately, essentially replacing interns. After R1, we see AI can evolve toward assisted investment decision-making — L2, beginning exploration with short-to-medium-term horizons and gradually advancing toward long-term investing.

Looking further ahead, we believe intelligent investment research products will inevitably evolve toward AI Agents. From a user experience standpoint, beyond searches the user initiates, future large models could compile and process daily briefings, proactively push them to you, and even execute certain tasks automatically.

Envisioning the L4 endgame for intelligent investment research: perhaps every fund company becomes an AI. What employees do is feed data to the AI, curate high-quality training sets for it, prepare exclusive premium industry materials, or sync meeting notes after expert interviews — then give all this data to the large model for comprehensive judgment.

We named our product AlphaEngine because we aim to build an engine that helps investors generate Alpha (excess returns).

We firmly believe that deep reasoning AI will become an entirely new source of Alpha. The key is which investment institutions can grasp it faster and internalize AI capabilities.

Why do we say this? Suppose you're an extremely diligent investment manager working 10 or even 12 hours a day. Attending 8 meetings a day is basically your upper limit, and with massive numbers of new research reports emerging daily, plus articles from various media outlets and influencers, you simply don't have the energy to read everything. AI, however, can read all of the materials every day and then analyze in depth which ones contain valuable leads and opportunities. This is something humans cannot do, but machines can.

Going forward, Entropy will continue deepening its focus on investment research, threading AI capabilities throughout investment professionals' workflows. For example, Entropy has already launched an expert services module that uses AI to efficiently match investment researchers with expert resources, maximizing the value of expert calls.

As large models race ahead, vertical-domain vendors should go deeper and be more thorough. Only by pushing to the utmost within one industry can you demonstrate value, and value then converts to profit. As we've always said: "Pick something two centimeters wide, and dig it two meters deep."

Perks

AlphaEngine Research Tool — Free VIP Access

Web: https://www.alphaengine.top

Mini Program: AlphaEngine

App: https://www.alphaengine.top/app/index.html

In the mini program or app, go to the "My" page and enter the code "gaorong" to redeem 1 month of VIP access. First come, first served, while supplies last.