Hi there,
We recently chatted about my business idea, Soundpaste, and I wanted to update you on my progress.
- Strong Validation of the Core Problem:
- Across almost every conversation there was clear, often unprompted validation that people who lack the time or inclination to read would like to consume written content (blogs, newsletters) in audio format.
- Creators, similarly, have archives of valuable written content they’d like to repurpose for audio but find current methods too time-consuming or technical. The “Seth Godin” dream use case resonated universally as a clear illustration of this unmet need. Just need to find many more Seth…
- WordPress Plugin as a Solid MVP:
- The initial product, the WordPress plugin, was consistently seen as a logical and strong starting point.
- Features like batch processing, handling API limitations (chunking/stitching), and branding (intros/outros) were recognized as key differentiators over simple TTS API usage.
- Critical Feature Requirements Crystallized:
- Effortless Distribution (RSS is King): The need for integrated RSS feed generation and hosting, making it seamless to get audio onto major podcast platforms, emerged as a non-negotiable critical feature.
- Monetization Tools are Highly Appealing:
- Easy Ad Insertion: The idea of simple, “no-hassle” ad integration, potentially LLM-powered for contextual relevance, was a major point of interest, especially for creators.
- Paywalled/Private Feeds: Enabling creators (especially Substack users) to offer exclusive audio to paying subscribers is a significant opportunity. I need to navigate a potential Substack partnership carefully, as they also offer a voice product, just no TTS yet.
- Voice Quality & Customization: Access to high-quality, expressive voices and the potential for voice cloning (with privacy considerations) are important. Dynamic voice selection was also noted.
- Automated Editing: Features to clean up audio (remove pauses, filler words) could add significant value, as editing is a major pain point.
- Refined Understanding of Target Audiences & Their Needs:
- WordPress Bloggers & Substack Writers: Confirmed as prime initial targets.
- Semi-Professional/Hobbyist Podcasters: For this group, the value proposition needs to be heavily skewed towards effortless monetization or solving a major time sink, as their current production might already be lean.
- Corporate Communications & B2B: An unexpected but promising avenue for internal comms, website accessibility, and branded content.
- Solo Creators & Professionals: Repurposing LinkedIn content or personal blogs for quick audio snippets is appealing. Open question: where does the initial content live? LinkedIn will not make it easy to grab the content.
- Monetization Strategy Evolving:
- While the “Bring Your Own Token” model is a good start, discussions highlighted the friction it creates and the opportunity to capture more value by bundling API usage into tiered subscriptions or credit packs.
- The focus shifted towards providing monetization tools as a core part of the value proposition, rather than just being a conversion utility.
- Key Strategic Opportunities Uncovered:
- Potential for Angel Investment/Partnerships: There was some interest in early investment discussions.
- Leveraging High-Profile Creators: Successfully onboarding a figure like Seth Godin could create a powerful marketing flywheel.
- Building a “Creator-First” Platform: The consensus is that while TTS tech is becoming a commodity, owning the creator workflow, user experience, branding, distribution, and monetization tools is where the defensible moat lies.
- Clearer View of Challenges & Competitive Landscape:
- Competition from TTS Providers Moving Up the Stack: Awareness that ElevenLabs and others are becoming app-layer competitors.
- Need for “Additional Value”: Simply creating more content isn’t enough; Soundpaste must offer tangible benefits like new revenue, significant time savings, or dramatically improved reach/accessibility.
- Distribution is Key: Getting content created is only half the battle; making it easily discoverable is crucial.
Overall Progress:
I’ve moved from a promising idea with a working WordPress plugin to a significantly more validated business concept. These discussions have:
- Confirmed the core need robustly.
- Highlighted critical features required for product-market fit, especially around distribution and monetization.
- Broadened my understanding of potential user segments and revenue models.
- Uncovered specific strategic opportunities for partnerships and growth.
- Sharpened my awareness of the competitive landscape and how to build a defensible business.
State of the app
- Advanced Audio Customization:
- Branding: Users can now apply custom intro and outro audio segments to their existing audio versions, allowing for consistent branding across content. This includes new interface elements to manage and trigger branding.
- Intro/Outro Support: Beyond just branding, the system now supports dedicated intro and outro audio files that can be stitched into the main audio content during generation.
- Enhanced Audio Processing & Voice Management:
- Flexible Audio Stitching: Users can now choose their preferred audio stitching method: either using local FFmpeg or integrating with the external “Soundpaste” service. Configuration options for Soundpaste have been added.
- Dynamic Voice List Refresh: A “Refresh Voice List” button in the admin panel allows users to manually update the available text-to-speech voices from providers like ElevenLabs and Cartesia, ensuring access to the latest options.
- Bulk Voice Overrides: For greater control, users can now apply voice and provider overrides in bulk for audio generation tasks.
- Improved Admin Interface & User Experience:
- Clearer Job Statuses: The display of audio generation job statuses (e.g., queued, processing, completed, error) has been enhanced with visual indicators and an auto-refresh feature on the job status page for real-time updates.
- Compact Audio Player: The audio player in the media library has been redesigned to be more compact, improving space efficiency.
- Dynamic Voice Information: The library now dynamically shows voice names from providers instead of just IDs.
- Better Error Handling & Feedback: Error logging and user feedback mechanisms have been improved.
- Underlying Structural Improvements:
- Action Scheduler Integration: Background processing of text-to-speech tasks is now handled more robustly using Action Scheduler, leading to better performance and reliability.
Thanks for reading if you made it this far 🙂
I’ll keep you updated on the progress.
