Voice AI Pricing: WebSocket + REST API & Web UI

How much does text to speech cost?
Learn about the Cost of Voice AI

All new accounts receive a small amount of voice compute, which can be used to generate speech through the text-to-speech web interface or to test an implementation using the REST API. Voice AI pricing is based on the number of words in the text submitted to be spoken.

Pay-as-you-go voice compute can be added at any time from the account page, which also displays the current balance for each type of compute.

With asticaVoice v2.0_full we are delivering the most realistic and natural-sounding voices to date while making it possible for developers to build affordable, lower-latency applications, including real-time scenarios.

Expanding on the original v1.0, which offered a selection of more than 40 voices spanning a wide range of nationalities, genders, and ages, asticaVoice v2.0_full includes an even larger selection of voices and the ability to clone your own voice.

Generate Speech
Synthesize speech for audio playback based on a provided text input.
Realistic Human Voices
Offers the highest degree of realism, with natural disfluencies.
Price per word: $0.000095
Approx. per minute: $0.012
  • Produces the most realistic sounding speech.
  • Optional timestamps of each word.
  • Low priority processing for reduced cost.
  • Suitable for real-time applications.
  • Voice cloning available at the same cost.
  • Supports English and French
  • Every audio generated is naturally unique.
  • REST API time to first audio: ~450ms
  • WS API time to first audio: ~250ms
Custom Voice Clones
Use your own voices at no additional cost per word.
Price per word: $0.000095
Approx. per minute: $0.012
Low Priority Mode (Discount)
Audio is processed within a low priority queue for reduced costs.
Price per word: $0.000065
Approx. per minute: $0.0084
Programmable Voices
Supports custom prompts to determine speaking style or persona.
Price per word: $0.000145
Approx. per minute: $0.0185
  • Supports custom prompts to dictate how the voice sounds.
  • Supports 55+ different languages
  • API time to first audio: ~800 - 1200ms
Neural Voices
Most accurate speech with slightly lower quality.
Price per word: $0.000075
Approx. per minute: $0.0115
  • Provides different ages and nationalities.
  • Supports 55+ different languages
  • API time to first audio: ~450ms
  • WS API time to first audio: ~250ms
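
Because pricing is per word, a rough cost estimate is simple multiplication. The following is a minimal sketch in Python using the per-word prices listed above; the tier labels and the `estimate_cost` function are hypothetical names chosen for illustration and are not part of any astica API.

```python
from decimal import Decimal

# Per-word prices in USD, copied from the pricing list above.
# The dictionary keys are illustrative labels, not API identifiers.
PRICE_PER_WORD = {
    "realistic": Decimal("0.000095"),
    "voice_clone": Decimal("0.000095"),
    "low_priority": Decimal("0.000065"),
    "programmable": Decimal("0.000145"),
    "neural": Decimal("0.000075"),
}


def estimate_cost(word_count: int, tier: str) -> Decimal:
    """Return the estimated cost in USD for `word_count` billable words."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Decimal arithmetic avoids binary floating-point rounding on tiny prices.
    return PRICE_PER_WORD[tier] * word_count


if __name__ == "__main__":
    for tier in PRICE_PER_WORD:
        print(f"{tier}: ${estimate_cost(1000, tier)} per 1,000 words")
```

The result is only an estimate: the billable word count may differ from a plain word count, as described in the notes on how words are measured.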

Use Coupon Code: evaluate
Receive $5.00 off your next purchase.

  1. Specific vocabulary containing a large number of characters will be measured as two words.
  2. Programmable voice prompts are counted and billed per word, in the same way as input text.
  3. Regular punctuation is included in the measurement (e.g., a comma or a sentence-ending period).
  4. All compute usage and costs of use will be calculated and displayed online. View Usage.
  5. All audio files are usable for commercial purposes.
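
The measurement rules above can be approximated before submitting text. The sketch below is a rough estimator only, and several details are assumptions because the page does not specify them: words are treated as whitespace-separated tokens with punctuation left attached, and the `long_word_chars` threshold of 20 characters is a hypothetical stand-in for "a large number of characters". The function name `count_billable_words` is also illustrative.

```python
def count_billable_words(text: str, prompt: str = "", long_word_chars: int = 20) -> int:
    """Estimate billable words for input text plus an optional voice prompt.

    Assumptions (not confirmed by the pricing page):
    - words are whitespace-separated tokens, with punctuation left attached;
    - a token longer than `long_word_chars` characters counts as two words.
    """
    total = 0
    # Prompt words are counted the same way as input text (note 2).
    for token in f"{text} {prompt}".split():
        # Very long tokens are measured as two words (note 1, assumed threshold).
        total += 2 if len(token) > long_word_chars else 1
    return total
```

The usage figures shown in the account remain the authoritative record of what was billed; this estimate may differ from them.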