Voice Agent Camp: On GPT-4o's First Anniversary, We're Launching a Startup Camp | Supersonic Plan 2025
Have you ever found yourself endlessly studying the voice interactions in Her, dreaming of replicating — or even surpassing — that blend of intelligence and emotional depth? At Voice Agent Camp, you'll meet others who share that same passion and ambition.
Voice Agent Camp is looking for early-stage startups reimagining real-time human-machine interaction through voice agents! Multimodal AI, voice synthesis, and multi-agent systems are maturing fast, and application scenarios are poised to explode. We'll recruit roughly 15 teams to build the next generation of products around a "Voice First" philosophy: AI companions, personal assistants, enterprise customer service, AI voice-enabled hardware, and more.
"Supersonic Plan 2025: Voice Agent Camp" kicks off in early June and runs for three months. Through peer learning, we'll exchange ideas on voice technology and product development, open-source strategy and business models, global expansion and growth. Participants also receive free technical resources, investor matchmaking, exhibition booth access, media interviews, and other startup acceleration support.
Apply here:
https://www.rtecommunity.dev/t/t_n5Fb9YDJS9JGPA
Voice Technology Is Maturing, and Applications Are About to Explode
We're launching this camp on the one-year anniversary of GPT-4o's release.
The past year has seen voice technology advance by leaps and bounds. End-to-end speech systems don't just recognize words — they convey rich sonic information like timbre, laughter, and sighs. Full-duplex technology enables fluid, natural bidirectional conversation with seamless interruption. As agent scenarios and devices diversify, foundational audio-video capabilities — noise suppression, voice activity detection, audio-video compression, transmission latency — have become critical to building smooth voice interactions.
Middleware layers like agent-building tools, along with data, search, and memory services surrounding the agent ecosystem, have dramatically streamlined development, making it easier for developers to rapidly test scenarios and build minimum viable products.
Meanwhile, advances in reasoning models, multi-agent architecture, and AI visual understanding are giving agents stronger intent recognition and cross-modal reasoning capabilities — with voice interaction poised to become a crucial interface and entry point.
In 2025, platforms with massive user bases like ChatGPT, Grok, and Meta AI are gradually rolling out voice and video calling features. This will not only accelerate voice-interactive applications "crossing the chasm" into mainstream life, but also generate demand for more specialized, domain-specific voice agent applications. Sectors like AI spoken-language tutoring and AI interview prep are emerging precisely from this trend.
Thanks to maturing foundational technology, increasingly rich middleware tools, and market cultivation by general-purpose voice-interactive applications, we believe this ecosystem will yield ever more innovative products for vertical scenarios.
Who and What Are We Looking For?
We hope to meet founders who:
- Have deep vertical expertise: A profound understanding of and unique insights into a specific industry or scenario.
- Skillfully leverage voice interaction: Can cleverly use voice and dialogue technology to unlock "Voice-to-X" scenario value and create distinctive user experiences.
- Are early-stage teams: Pre-Series A or Series A, with products still in early development.
As a community-driven startup partner program, we've witnessed surging demand for voice agent scenarios through numerous Demo Days and Workshops in the RTE Developer Community — AI companions and personal assistants, enterprise customer service, smart AI hardware, AI user interviews, AI podcasts, real-time translation, and other innovative applications.
We champion Voice First, but we care even more about your depth in a vertical domain and your potential for cross-business integration. We look forward to your joining us in expanding the boundaries of what's possible with voice agents.
How to Participate in Voice Agent Camp
"Supersonic Plan 2025: Voice Agent Camp" launches in early June and runs for three months. It follows a hybrid online-offline format, with core offline modules planned for Beijing or Shanghai. The opening kickoff and final closing ceremony require in-person attendance; activities in between will be conducted through a more flexible mix of online and offline engagement.







We'll share and exchange ideas on voice technology and product development, open-source strategy and business models, global expansion and growth through peer learning. Participants also receive abundant free technical resources, investor matchmaking, exhibition booth access, media interviews, and other startup acceleration support.

Five Years In, Supersonic Plan Has Gathered 103 Permanent Startup Partners and Continues to Provide Long-Term Entrepreneurial Support.
Peer Learning Partners
In this Voice Agent Camp, these experts and founders in voice and real-time interaction will be your co-learning partners:
- Zhizheng Wu, Associate Professor and PhD Advisor at The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen). Dr. Wu is dedicated to open-source voice technology exchange, leading development of the speech synthesis open-source systems Merlin, Amphion, and the open-source database Emilia.
- Keyu Chen, former AI Researcher at NetEase Fuxi & Shanghai AI Laboratory, co-founder of Pinch, a real-time AI voice translation meeting tool. Brings both academic research experience and experience as a serial YC founder.
- Le Wang, co-founder of Folotoy, an AI toy startup. As one of the earliest AI toy teams, they've practiced Build in Public from day one. Leveraging community strength, Folotoy's dozen-person team has shipped tens of thousands of units with a 20%+ repurchase rate.
...and we look forward to you joining them.
Meanwhile, you'll integrate into a community of thousands of Voice Agent and real-time interaction practitioners. Here, we exchange the latest technical and product insights through Voice Agent Learning Notes, Meetups, podcasts, and other formats.
Beyond Voice: Building the Future of Real-Time Interaction
To be sure, this is a camp focused on voice and agents. But if you're exploring concepts like context awareness, ambient agents, proactive agents, and always-on experiences, we welcome your application too. In our view, products built on these concepts will converge with voice in the future, and all represent the future that Real-Time Engagement (RTE) is working toward.
Apply to Supersonic Plan

- Application period: May 13 – May 31
- Camp dates: June 7 – August 31
Application steps:
- Fill out the application
- Online interview after initial screening
- Final selection and evaluation
- Notification and official camp launch
Resources & Benefits: Long-Term Support
Selected Supersonic Plan startup partners receive four categories of resources and opportunities:
Technology & Product
- Real-Time AI DevKit support, including free resource packages from Agora's Conversational AI Engine/RTC/RTM, SenseTime's SenseNova real-time interactive multimodal large model, iFlytek's startup incubation voice resources, and more (additional real-time multimodal development resources continuously added);
- The Agora Conversational AI Engine/RTC/RTM package includes 60,000 free minutes of Conversational AI Engine usage, or 1 million free minutes of real-time audio/video calling, or an equivalent RTM product acceleration package;
- Easemob (Hyphenate) product support worth ¥10,000;
- Cloud credits from AWS, Microsoft, or Google, or consultation support for overseas marketing;
- Free access, testing, and joint product development opportunities with 50+ partners (voice agent frameworks such as TEN, etc.)
VC & Incubation
- Green-channel opportunities for interviews at top accelerators and major tech company startup programs;
- Lifelong alumni community for mutual growth, including long-term technical communication and support, community benefits and events, talent development, referral rewards, industry insight sharing, and more;
- 1-on-1 deep-dive sessions with tier-one VC partners including 5Y Capital, Jinqiu Fund, Linear Capital, and GGV Capital.
Community & Ecosystem
- Supersonic Plan's unique buddy system — in-depth exchanges with buddies who are commercialization, product, and technical leaders in audio-video, networking, AI, and related fields;
- Permanent Supersonic Plan startup partner support (long-term technical communication and support, community benefits and events, talent development);
- Connections with leading global partners, with exclusive access to cutting-edge global trends and innovative playbooks that help you stay a step ahead.
Marketing & Brand
- Pitch opportunity at the annual RTE Scenario Showcase;
- Startup exhibition opportunities at industry conferences (speaking, showcasing, business matchmaking, etc.);
- Co-marketing media exposure (podcasts, interviews, VC Day);
- Exhibition and speaking opportunities at partner conferences and industry trade shows.
2024 RTE Scenario Showcase
Supersonic Plan culminates each year with the "RTE Scenario Showcase," inviting teams to demo their innovations and engage in deep conversations with a panel of top domestic and international investors. The 2024 showcase projects already point toward multimodal agent scenarios:
Infiniflow
An AI-native database that works with large models to serve RAG scenarios, providing complete industry RAG solutions and helping build more nuanced AI agents. A leading open-source project worldwide with 50,000 GitHub stars, it addresses the end-to-end pain points of deploying large models in enterprise applications.
Talk to Xiaotian
A free psychological chatbot providing 24/7 companionship and listening services, incubated by the Deep Learning Lab at Westlake University. Features extensive professional psychological assessments and secure psychological counseling services.
Traini
Focused on human-pet interaction, primarily providing pet behavior translation and service agent capabilities for pet parents. Created the world's first multimodal model for pet behavior translation.
FAQ
1. Are there any fees to apply or participate?
Application and selection involve no fees. Selected teams not based in the offline event city are responsible for their own travel expenses; we provide lunch and snacks during event days.
2. What happens after I submit the application?
After submission, Supersonic Plan will contact you (please ensure your contact information is accurate). Screening is rolling; interested applicants are encouraged to apply early. Selection involves online screening and judge scoring. After initial screening, we may conduct a 30-60 minute video call to better understand your product and support needs.
3. What information is required to apply? Do I need a business plan?
Filling out the application form is sufficient to apply, but a business plan (BP) will be needed later. During the online video interview, applicants must share their BP and present a company introduction and product demo of no more than 15 minutes. BP documents must also be provided before and after the interview for cross-evaluation. We recommend having a developed BP at the time of application.
4. How much time and effort is required after selection? Can I participate online?
The Supersonic startup partner camp spans roughly three months in a hybrid online-offline format. The opening kickoff and final closing ceremony — two core modules — require in-person participation in Shanghai (or Beijing). These core modules are highly interactive and cannot be attended online; they take place on alternate weekends. Additionally, based on specific project needs, one-on-one sessions with industry experts or investors may be scheduled on weekdays.
About Supersonic Plan
Supersonic Plan is a startup acceleration program for Real-Time Engagement (RTE) entrepreneurs. It focuses on new scenarios and technologies in real-time interaction, aiming to accelerate value growth for startups in the RTE space, jointly define and expand the RTE track, and empower developers to innovate and build at lower cost and higher efficiency.
About RTE Developer Community
RTE Developer Community is a developer community focused on Real-Time Engagement (RTE). We are dedicated to connecting developers and ecosystem partners in the industry, sparking new technologies and scenarios, and jointly exploring the infinite possibilities of real-time interaction. Here, you'll meet like-minded technology explorers, working together to transform how people connect with each other, with the world, and with AI.



5Y Capital seeks out, supports, and inspires lone entrepreneurs, providing support from the spiritual to all operational aspects. We believe that if the "crazy" you in others' eyes begins to be believed in, the world will become a different place.
BEIJING · SHANGHAI · SHENZHEN · HONG KONG

