MiniMax's Self-Developed "Abab6," China's First Hundred-Billion-Parameter MoE Large Language Model, Goes Fully Live | YunXun
Large parameters solve complex problems + MoE improves computational efficiency
Recently, MiniMax fully released its large language model Abab6, the first MoE (Mixture-of-Experts) LLM in China with more than a hundred billion parameters. The MoE architecture gives Abab6 the capacity for complex tasks that comes with a large parameter count, while keeping computation efficient enough for the model to train on sufficient data in a given amount of time. It also addresses problems that Abab5.5 ran into in more complex scenarios that demand finer-grained output.
Yunqi Capital has long tracked AI developments and led MiniMax's angel round in late 2021. We view model compute costs as a critical challenge for application-layer deployment today, and the MoE architecture is one of the most promising solutions currently available: it combines several expert models, each with a relatively small parameter count, into a single larger model, thereby reducing inference costs.
Below is a detailed evaluation of Abab6 under the MoE architecture, along with application examples.
Over the past six-plus months, MiniMax has served nearly a thousand customers, including major internet companies such as Kingsoft Office, Xiaohongshu, Tencent, Xiaomi, and China Literature. The MiniMax open platform processes tens of billions of tokens on average per day.
Abab6's parameters are an order of magnitude larger than the previous version. A bigger model means Abab6 can learn more nuanced patterns from its training corpus and accomplish more complex tasks.
To keep Abab6's inference fast, MiniMax adopted the MoE architecture. Under this structure, the model's parameters are divided into multiple groups of "experts," and only a subset of those experts participates in computation during each inference pass. As a result, Abab6 gains the complex task-handling capability that comes with a large parameter count, while the improved computational efficiency lets the model train on enough data per unit of time.
Currently, most open-source and academic large language model efforts do not use the MoE architecture. To train Abab6, MiniMax developed its own efficient MoE training and inference framework and invented several MoE model training techniques. To date, Abab6 is China's first large language model with more than a hundred billion parameters built on the MoE architecture.
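The routing idea described above can be illustrated with a minimal sketch. This is not MiniMax's implementation: the top-k softmax router, the tiny linear "experts," and every name and size here (`moe_forward`, `DIM`, `NUM_EXPERTS`, `TOP_K`) are assumptions chosen for illustration only.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def moe_forward(x, experts, router, top_k):
    """Send x to the top_k highest-scoring experts and mix their outputs."""
    logits = matvec(router, x)  # one routing score per expert
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])  # renormalize over chosen experts
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = matvec(experts[idx], x)  # only the chosen experts do any work
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen

# Hypothetical toy sizes; a real model is vastly larger.
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
rng = random.Random(0)
experts = [random_matrix(DIM, DIM, rng) for _ in range(NUM_EXPERTS)]
router = random_matrix(NUM_EXPERTS, DIM, rng)

x = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(x, experts, router, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS
print(f"active experts: {active}")
print(f"fraction of expert parameters used for this input: {active_fraction:.2f}")
```

With these toy settings, each input touches 2 of 8 experts, so only a quarter of the expert parameters take part in the computation even though all of them contribute to the model's total capacity.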
Abab6 Evaluation Results
We conducted automated evaluations of abab6, abab5.5, GPT-3.5, GPT-4, Claude 2.1, and the commercial version of Mistral-Medium. Since abab5.5 already performs well on simple tasks, we selected three evaluation methods covering more complex problems:

IFEval: This evaluation tests the model's ability to follow user instructions. We ask the model questions with specific constraints, such as "Using XX as the title, list three specific methods, with each description limited to two sentences," then measure how many responses strictly satisfy those constraints.
MT-Bench: This evaluation measures the model's comprehensive English capabilities. We ask the model questions across multiple categories including roleplay, writing, information extraction, reasoning, math, coding, and knowledge Q&A. Another large model (GPT-4) scores the responses, and we calculate the average.
AlignBench: This reflects the model's comprehensive Chinese capabilities, with a testing format similar to MT-Bench.
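As a rough sketch of how the two kinds of metric above can be computed: a strict instruction-following score counts a response only if it meets every constraint, and a judge-based score is a plain average of the judge's ratings. The checker functions, sample responses, and scores below are hypothetical and are not taken from IFEval, MT-Bench, or AlignBench themselves.

```python
from statistics import mean

def starts_with(prefix):
    """Constraint: the response must begin with the given prefix."""
    return lambda response: response.lstrip().startswith(prefix)

def has_n_paragraphs(n):
    """Constraint: the response must contain exactly n non-empty paragraphs."""
    def check(response):
        paragraphs = [p for p in response.split("\n\n") if p.strip()]
        return len(paragraphs) == n
    return check

def includes_keywords(*keywords):
    """Constraint: every keyword must appear somewhere in the response."""
    def check(response):
        text = response.lower()
        return all(k.lower() in text for k in keywords)
    return check

def strict_accuracy(samples):
    """Fraction of responses that satisfy all of their constraints."""
    passed = sum(all(check(resp) for check in checks) for resp, checks in samples)
    return passed / len(samples)

# Hypothetical responses paired with the constraints their prompts imposed.
samples = [
    ("Title: Sleep\n\nGo to bed early. Avoid screens.",
     [starts_with("Title:"), has_n_paragraphs(2)]),
    ("Tomorrow is a dream.",
     [includes_keywords("tomorrow", "dream"), has_n_paragraphs(2)]),
]
score = strict_accuracy(samples)
print(score)  # 0.5: the second response has one paragraph, not two

# Hypothetical judge ratings per category, averaged into one overall score.
judge_scores = {"writing": [8, 9], "math": [6, 7]}
overall = mean(s for scores in judge_scores.values() for s in scores)
print(overall)  # 7.5
```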
The results show that abab6 significantly outperforms our previous-generation model abab5.5 on all three test sets. It substantially surpasses GPT-3.5 in instruction following, comprehensive Chinese capability, and comprehensive English capability, and it holds a slight edge over Claude 2.1 on the same three measures. Against Mistral-Medium, Mistral's commercial model, abab6 is ahead in instruction following and comprehensive Chinese capability and on par in comprehensive English capability.
Abab6 is still being trained. Those interested in accessing the Abab6 model are welcome to register on the MiniMax open platform at api.minimax.chat.
Below are several concrete examples of Abab6 in action.
Application Examples
1. Fun Tutoring
Prompt: Teach the following math problem to a child around 7 years old in a simple, easy-to-understand, and fun way: There are 5 apples in a basket. Xiaoming eats 2 apples, then his mom puts 3 more apples in. How many apples are in the basket now? To keep the child engaged, please use SpongeBob's tone of voice.
Abab: Hey there, little buddies! It's your good friend SpongeBob! Today we're gonna play a math game and see what happens to the apples in our basket.
At the start, we had 5 apples in our basket — just like treasure in the underwater world! Then our good friend Xiaoming came along and ate 2 apples, like gobbling up 2 pearls from the sea floor. So now we had 5-2=3 apples left in the basket.
But don't worry, because Xiaoming's mom is super kind — she put 3 more apples in, like finding 3 more pearls! So now our basket has 3+3=6 apples again.
So now there are 6 apples in the basket, like 6 shiny new pearls in our underwater world. Got it? Remember, math is just like exploring the ocean floor — always full of surprises!
2. Game Design
Prompt: Create a fictional board game for me. The theme should revolve around Shanghai, and the game must include Shanghai's major tourist attractions, historical culture, and distinctive cuisine. Please format your response as a clear game rulebook, with descriptions of each game component organized as ordered lists.
Abab: Game Title: Magic City Quest
Game Type: City Building & Cultural Exploration
Game Objective:
Players become urban planners of Shanghai, developing their own city by investing in tourist attractions, historical landmarks, and food districts to attract visitors — ultimately becoming Shanghai's most successful urban planner.
Number of Players: 2-4
Play Time: Approximately 60 minutes
Game Components:
- Game Board: A map of Shanghai showing major tourist attractions, historical-cultural areas, and food districts. Different colored zones on the map represent different types of plots.
- Landmark Cards: Describe famous Shanghai tourist attractions such as the Bund, Yu Garden, Oriental Pearl Tower, etc. Each card shows investment cost and tourist appeal score.
- Historical-Culture Cards: Introduce Shanghai's historical culture, such as the Shanghai Museum, the Site of the First CPC National Congress, etc. Each card shows investment cost and historical-cultural score.
- Food District Cards: List Shanghai's specialty foods such as xiaolongbao, shengjian, Nanxiang xiaolongbao, etc. Each card shows investment cost and food score.
- Development Die: A six-sided die used to determine player turn order and development phases.
- Currency: Represents each player's funds, used to purchase plots and make investments.
- Scoreboard: Tracks each player's tourist appeal score, historical-cultural score, and food score.
- Player Tokens: Represent each player's position on the game board.
Game Rules:
At the start of the game, each player chooses a colored token and places it at the starting point on the game board.
Players take turns rolling the die, moving their token according to the die result, and drawing the corresponding card type based on their location.
Players may choose to purchase drawn landmark cards, historical-culture cards, or food district cards, placing them in their own urban planning area.
Each card has a corresponding investment cost; players must pay the required currency.
On their turn, players may choose to invest in existing buildings to increase their appeal or historical-cultural scores, or purchase new plots and cards.
The game ends after a predetermined number of rounds.
At game end, players calculate their total score from tourist appeal score, historical-cultural score, and food score combined.
The player with the highest total score becomes Shanghai's most successful urban planner and wins the game.
Strategic Tips:
Balance your investments — don't focus solely on one type of building.
Observe opponents' layouts and try to disrupt them by competing for popular tourist attractions or food districts.
Plan your cash flow carefully, ensuring sufficient funds for investments and unexpected events.
Use historical-culture cards to add cultural depth to your city, attracting visitors interested in history and culture.
Magic City Quest will lead players deep into Shanghai's culture and charm, letting them enjoy the game while growing to love and understand the city better through engaging city-building and economic competition.
3. Instruction Following
Prompt: What story does The Legend of Qin (Qin Shi Ming Yue) tell?
Answer in two paragraphs; the first paragraph's first sentence must begin with "Hailuo," its second sentence with "Ji," and its third sentence with "Zhi"; the second paragraph must begin with "STARFIELD," and include the keywords "tomorrow" and "dream" at appropriate points.
Abab: Hailuo shells hold secrets of history, and The Legend of Qin uses such a mysterious object as its opening to unfold a time-spanning adventure story. Ji — pushing imagination and creativity to their limits — this work combines the historical backdrop of late Qin dynasty with wuxia and fantasy elements, telling of heroic men and women who fight for their ideals and beliefs in turbulent times. Zhi — the brave and resourceful protagonists must not only face powerful enemies but also unravel historical mysteries one by one, their adventures filled with challenges and growth.
Under the STARFIELD, the dream of tomorrow appears especially precious in The Legend of Qin. The protagonists, caught in the torrent of history, continuously pursue their own tomorrow, their story inspiring viewers to contemplate the relationship between individuals and history, and how to persist in their dreams amid adversity. The dawn of tomorrow always seems to emerge at the darkest moments, and The Legend of Qin conveys beautiful aspirations for the future and an unwavering pursuit of dreams through such storytelling.
4. Meeting Summary
Prompt: We ask Abab6 to summarize all topics of a 10-minute meeting with 7 participants based on [Meeting Content], explicitly noting any [Action] items in the meeting content. The number of topics should not exceed 5. Please output in the following format:
Topic 1: xxxxx
Topic Content: xxxxx
Time: xxxxx
Topic 2: xxxx
Topic Content: xxxx
Time: xxxx
Abab:
Topic 1: Accelerating Project Progress While Ensuring Quality
Topic Content: Speaker 5 emphasized maintaining work quality while accelerating progress; Zhang Hua stated that stability is the prerequisite.
Time: 10:02:15
Topic 2: In-depth Discussion of Data Model
Topic Content: Speaker 8 proposed an in-depth discussion of the data model discussed yesterday; Speaker 2 and Speaker 5 indicated they would participate and provide materials and ideas.
Time: 10:02:33
Topic 3: Backup Issue
Topic Content: Speaker 3 asked whether the backup issue mentioned in the last meeting had been resolved; Speaker 6 indicated it was still being handled, expected to be resolved tomorrow.
Time: 10:03:00
Topic 4: Market Research and Questionnaire Development
Topic Content: Speaker 7 asked about the status of the market research questionnaire discussed in the past two days; Speaker 2 indicated a draft was ready, planned for completion before the weekend.
Time: 10:03:56
Topic 5: Work Assignment and Skills Summary
Topic Content: Speaker 5 proposed discussing upcoming work assignments to let everyone contribute in their strongest areas; Zhang Hua indicated he would summarize everyone's skills and preferences before finalizing through further discussion.
Time: 10:04:42
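Because the prompt fixes an output format and a topic limit, a summary like the one above can be checked mechanically. The following is a minimal sketch, assuming each topic occupies exactly three lines as in the requested template; `parse_summary` and `TOPIC_RE` are hypothetical names, and the sample text is shortened from the output above.

```python
import re

# One topic = three consecutive lines in the format the prompt requested.
TOPIC_RE = re.compile(r"Topic (\d+): (.+)\nTopic Content: (.+)\nTime: (.+)")

def parse_summary(text, max_topics=5):
    """Parse a meeting summary into a list of topic records."""
    topics = [
        {
            "index": int(index),
            "title": title.strip(),
            "content": content.strip(),
            "time": time.strip(),
        }
        for index, title, content, time in TOPIC_RE.findall(text)
    ]
    if len(topics) > max_topics:
        raise ValueError(f"expected at most {max_topics} topics, got {len(topics)}")
    return topics

sample = """Topic 1: Accelerating Project Progress While Ensuring Quality
Topic Content: Speaker 5 emphasized maintaining work quality while accelerating progress.
Time: 10:02:15
Topic 2: In-depth Discussion of Data Model
Topic Content: Speaker 8 proposed an in-depth discussion of the data model.
Time: 10:02:33"""

topics = parse_summary(sample)
for topic in topics:
    print(topic["index"], topic["time"], topic["title"])
```

A response that drifts from the template, for example by omitting a `Time:` line, simply yields fewer parsed topics, which makes format violations easy to spot.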
5. Write a Poem
Prompt: Write an English acrostic poem, 5 lines total, with the first letters spelling "ABAB6," on the theme of MiniMax's ABAB6 large language model release
Abab:
Astonishing breakthroughs in AI,
Bringing forth a new era of digital intelligence,
Awesome power in the palm of your hand,
Beyond imagination, the ABAB6 model shines,
Sixth sense in technology, MiniMax leads the way.









