elevenlabs

package module
v0.3.1 Latest Latest
Published: Sep 4, 2025 License: MIT Imports: 13 Imported by: 0

README

elevenlabs-go v0.3.X


This is a Go client library for the ElevenLabs voice cloning and speech synthesis platform. It provides an interface for Go programs to interact with the ElevenLabs API.

About this Fork

This repo is a fork of the original elevenlabs-go and integrates the updates from Mliviu79's fork.

Why is this needed?
This evolution of elevenlabs-go by WP-Nova-GmbH is primarily a package support takeover, because we need some of the elevenlabs features that earlier versions don't support. There is no need to reinvent the wheel, but there is a need for up-to-date packages when integrating TTS/STT in the long run. So here we go!

To keep the package versions clearly separated, they are labeled like this:

  • Original: v0.1.X
  • Mliviu79's fork: v0.2.X
  • WP-Nova-GmbH: v0.3.X

And here is the feature overview so you can decide what version to use :).

Feature v0.1.X v0.2.X v0.3.X
TextToSpeech [ ] [x] [ ]
TextToSpeechStream [ ] [x] [ ]
GetModels [x] [ ] [ ]
GetVoices [x] [ ] [ ]
GetDefaultVoiceSettings [x] [ ] [ ]
GetVoiceSettings [x] [ ] [ ]
GetVoice [x] [ ] [ ]
DeleteVoice [x] [ ] [ ]
EditVoiceSettings [x] [ ] [ ]
AddVoice [x] [ ] [ ]
EditVoice [x] [ ] [ ]
DeleteSample [x] [ ] [ ]
GetSampleAudio [x] [ ] [ ]
GetHistory [x] [ ] [ ]
GetHistoryItem [x] [ ] [ ]
DeleteHistoryItem [x] [ ] [ ]
GetHistoryItemAudio [x] [ ] [ ]
DownloadHistoryAudio [x] [ ] [ ]
GetSubscription [x] [ ] [ ]
GetUser [x] [ ] [ ]
SpeechToText [ ] [ ] [x]

Installation

go get github.com/WP-Nova-GmbH/elevenlabs-go

Example Usage

Make sure to replace "your-api-key" in all examples with your actual API key. Refer to the official Elevenlabs API documentation for further details.

Using a New Client Instance

The NewClient function returns a new Client instance and lets you pass a parent context, your API key, and a timeout duration.

package main

import (
 "context"
 "log"
 "os"
 "time"

 "github.com/WP-Nova-GmbH/elevenlabs-go"
)

func main() {
 // Create a new client
 client := elevenlabs.NewClient(context.Background(), "your-api-key", 30*time.Second)

 // Create a TextToSpeechRequest
 ttsReq := elevenlabs.TextToSpeechRequest{
  Text:    "Hello, world! My name is Adam, nice to meet you!",
  ModelID: "eleven_monolingual_v1",
 }

 // Call the TextToSpeech method on the client, using the "Adam"'s voice ID.
 audio, err := client.TextToSpeech("pNInz6obpgDQGcFmaJgB", ttsReq)
 if err != nil {
  log.Fatal(err)
 }

 // Write the audio file bytes to disk
 if err := os.WriteFile("adam.mp3", audio, 0644); err != nil {
  log.Fatal(err)
 }

 log.Println("Successfully generated audio file")
}

Using the Default Client and proxy functions

The library has a default client that you can configure and use through proxy functions, which wrap method calls to the default client. The default client has a default timeout of 30 seconds and is configured with context.Background() as the parent context. At minimum, you only need to set your API key to use the default client. Here's a version of the example above using shorthand functions only.

package main

import (
 "log"
 "os"
 "time"

 el "github.com/WP-Nova-GmbH/elevenlabs-go"
)

func main() {
 // Set your API key
 el.SetAPIKey("your-api-key")

 // Set a different timeout (optional)
 el.SetTimeout(15 * time.Second)

 // Call the TextToSpeech method on the client, using the "Adam"'s voice ID.
 audio, err := el.TextToSpeech("pNInz6obpgDQGcFmaJgB", el.TextToSpeechRequest{
   Text:    "Hello, world! My name is Adam, nice to meet you!",
   ModelID: "eleven_monolingual_v1",
  })
 if err != nil {
  log.Fatal(err)
 }

 // Write the audio file bytes to disk
 if err := os.WriteFile("adam.mp3", audio, 0644); err != nil {
  log.Fatal(err)
 }

 log.Println("Successfully generated audio file")
}

Streaming

The Elevenlabs API allows streaming of audio "as it is being generated". In elevenlabs-go, you pass an io.Writer to the TextToSpeechStream method, and the stream is continuously copied to it. Note that you will need to set the client timeout to a high enough value to ensure that the request does not time out mid-stream.

package main

import (
 "context"
 "log"
 "os/exec"
 "time"

 "github.com/WP-Nova-GmbH/elevenlabs-go"
)

func main() {
 message := `The concept of "flushing" typically applies to I/O buffers in many programming 
languages, which store data temporarily in memory before writing it to a more permanent location
like a file or a network connection. Flushing the buffer means writing all the buffered data
immediately, even if the buffer isn't full.`

 // Set your API key
 elevenlabs.SetAPIKey("your-api-key")

 // Set a large enough timeout to ensure the stream is not interrupted.
 elevenlabs.SetTimeout(1 * time.Minute)

 // We'll use mpv to play the audio from the stream piped to standard input
 cmd := exec.CommandContext(context.Background(), "mpv", "--no-cache", "--no-terminal", "--", "fd://0")

 // Get a pipe connected to the mpv's standard input
 pipe, err := cmd.StdinPipe()
 if err != nil {
  log.Fatal(err)
 }

 // Attempt to run the command in a separate process
 if err := cmd.Start(); err != nil {
  log.Fatal(err)
 }

 // Stream the audio to the pipe connected to mpv's standard input
 if err := elevenlabs.TextToSpeechStream(
  pipe,
  "pNInz6obpgDQGcFmaJgB",
  elevenlabs.TextToSpeechRequest{
   Text:    message,
   ModelID: "eleven_multilingual_v1",
  }); err != nil {
  log.Fatalf("Got %T error: %q\n", err, err)
 }

 // Close the pipe once the whole stream has been copied to it
 if err := pipe.Close(); err != nil {
  log.Fatalf("Could not close pipe: %s", err)
 }
 log.Print("Streaming finished.")

 // Wait for mpv to exit. With the pipe closed, it will do that as
 // soon as it finishes playing
 if err := cmd.Wait(); err != nil {
  log.Fatal(err)
 }

 log.Print("All done.")
}

TextToSpeechRequest Struct

The TextToSpeechRequest struct has been updated to match the latest ElevenLabs API:

type TextToSpeechRequest struct {
	Text                            string                           `json:"text"`
	ModelID                         string                           `json:"model_id,omitempty"`
	LanguageCode                    string                           `json:"language_code,omitempty"`
	PronunciationDictionaryLocators []PronunciationDictionaryLocator `json:"pronunciation_dictionary_locators,omitempty"`
	Seed                            int                              `json:"seed,omitempty"`
	PreviousText                    string                           `json:"previous_text,omitempty"`
	NextText                        string                           `json:"next_text,omitempty"`
	PreviousRequestIds              []string                         `json:"previous_request_ids,omitempty"`
	NextRequestIds                  []string                         `json:"next_request_ids,omitempty"`
	ApplyTextNormalization          bool                             `json:"apply_text_normalization,omitempty"`
	ApplyLanguageTextNormalization  bool                             `json:"apply_language_text_normalization,omitempty"`
	VoiceSettings                   *VoiceSettings                   `json:"voice_settings,omitempty"`
}
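
Because the optional fields carry omitempty tags, zero-valued fields are left out of the JSON request body. The following is a minimal sketch of that behavior using a local, trimmed-down copy of the struct; ttsRequest and encode are names invented for this illustration and are not part of the library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ttsRequest is a local, trimmed-down mirror of TextToSpeechRequest used only
// for this illustration; it copies a subset of the fields and tags shown above.
type ttsRequest struct {
	Text         string `json:"text"`
	ModelID      string `json:"model_id,omitempty"`
	LanguageCode string `json:"language_code,omitempty"`
	Seed         int    `json:"seed,omitempty"`
}

// encode marshals the request to its JSON wire form.
func encode(req ttsRequest) (string, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encode(ttsRequest{Text: "Hello", ModelID: "eleven_monolingual_v1"})
	if err != nil {
		panic(err)
	}
	// LanguageCode and Seed are zero values, so omitempty drops them.
	fmt.Println(body) // {"text":"Hello","model_id":"eleven_monolingual_v1"}
}
```

One consequence worth keeping in mind: with omitempty, an explicit Seed of 0 or a boolean set to false is indistinguishable from an unset field and is not sent.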

Text-to-Speech with Timestamps

This fork includes support for the new ElevenLabs API endpoint that provides character-level timestamps along with the generated audio. The response contains:

  • audio_base64: Base64 encoded audio data
  • alignment: Timestamp information for each character in the original text
  • normalized_alignment: Timestamp information for each character in the normalized text

Example Usage

package main

import (
 "context"
 "log"
 "os"
 "time"

 "github.com/WP-Nova-GmbH/elevenlabs-go"
)

func main() {
 // Create a new client
 client := elevenlabs.NewClient(context.Background(), "your-api-key", 30*time.Second)

 // Create a TextToSpeechRequest
 ttsReq := elevenlabs.TextToSpeechRequest{
  Text:    "Hello, world! This is a test with timestamps.",
  ModelID: "eleven_monolingual_v1",
 }

 // Create a file to write the JSON response containing audio and timestamps
 file, err := os.Create("response_with_timestamps.json")
 if err != nil {
  log.Fatal(err)
 }
 defer file.Close()

 // Call the TextToSpeechStreamWithTimestamps method, streaming the JSON response to the file
 err = client.TextToSpeechStreamWithTimestamps(file, "pNInz6obpgDQGcFmaJgB", ttsReq)
 if err != nil {
  log.Fatal(err)
 }

 log.Println("Successfully generated audio with timestamps")
}

The response structure includes:

type TextToSpeechWithTimestampsResponse struct {
	AudioBase64         string        `json:"audio_base64"`
	Alignment           AlignmentInfo `json:"alignment"`
	NormalizedAlignment AlignmentInfo `json:"normalized_alignment"`
}

type AlignmentInfo struct {
	Characters                 []string  `json:"characters"`
	CharacterStartTimesSeconds []float64 `json:"character_start_times_seconds"`
	CharacterEndTimesSeconds   []float64 `json:"character_end_times_seconds"`
}
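
Once a response of this shape is available as bytes, it can be unmarshalled and the audio decoded. The sketch below uses local mirrors of the two structs so that it compiles on its own. The names timestampsResponse, alignmentInfo and decodeResponse are made up for the illustration, the payload is hand-written rather than real API output, and standard (padded) base64 encoding is an assumption.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Local mirrors of the two structs above, so the sketch compiles on its own.
type alignmentInfo struct {
	Characters                 []string  `json:"characters"`
	CharacterStartTimesSeconds []float64 `json:"character_start_times_seconds"`
	CharacterEndTimesSeconds   []float64 `json:"character_end_times_seconds"`
}

type timestampsResponse struct {
	AudioBase64         string        `json:"audio_base64"`
	Alignment           alignmentInfo `json:"alignment"`
	NormalizedAlignment alignmentInfo `json:"normalized_alignment"`
}

// decodeResponse parses the JSON payload and decodes the embedded audio.
// Assumption: the audio uses standard base64 encoding.
func decodeResponse(payload []byte) (timestampsResponse, []byte, error) {
	var resp timestampsResponse
	if err := json.Unmarshal(payload, &resp); err != nil {
		return resp, nil, err
	}
	audio, err := base64.StdEncoding.DecodeString(resp.AudioBase64)
	if err != nil {
		return resp, nil, err
	}
	return resp, audio, nil
}

func main() {
	// Hand-written sample payload, not real API output.
	payload := []byte(`{"audio_base64":"aGk=","alignment":{"characters":["H","i"],` +
		`"character_start_times_seconds":[0,0.1],"character_end_times_seconds":[0.1,0.2]}}`)

	resp, audio, err := decodeResponse(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d audio bytes\n", len(audio))
	// Print each character with its start and end time.
	for i, ch := range resp.Alignment.Characters {
		fmt.Printf("%q %.2fs-%.2fs\n", ch,
			resp.Alignment.CharacterStartTimesSeconds[i],
			resp.Alignment.CharacterEndTimesSeconds[i])
	}
}
```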

Status and Future Plans

This library is an evolution of the original elevenlabs-go library. The project does not aim for complete coverage of the elevenlabs API; it exists to cover the needs of our (WP-Nova GmbH) AI platform: chat.wp-nova.ai.

For our AI speech features we use the elevenlabs services, so if you want to do the typical text/speech tasks and don't need special niche elevenlabs features, this lib is probably what you want.

Contributing

Contributions are welcome! If you have any ideas, improvements, or bug fixes, please open an issue or submit a pull request.

Looking for a Python library?

Elevenlabs's official Python library is excellent, and fellow Pythonistas are encouraged to use it (and also to give Go a go 😉🩵)!

Disclaimer

This is an independent project and is not affiliated with or endorsed by Elevenlabs. Elevenlabs and its trademarks are the property of their respective owners. The purpose of this project is to provide a client library to facilitate access to the public API provided by Elevenlabs within Go programs. Any use of Elevenlabs's trademarks within this project is for identification purposes only and does not imply endorsement, sponsorship, or affiliation.

License

This project is licensed under the MIT License.

Test Data

The audio files in this directory are from the Mozilla Common Voice dataset.

  • Source: Mozilla Common Voice
  • License: Creative Commons CC0 (Public Domain)

The files have been included here to be used for integration tests for this library.

Warranty

This code library is provided "as is" and without any warranties whatsoever. Use at your own risk. More details in the LICENSE file.

Documentation

Overview

Package elevenlabs provides an interface to interact with the Elevenlabs voice generation API in Go.

Constants

This section is empty.

Variables

This section is empty.

Functions

func AddVoice

func AddVoice(voiceReq AddEditVoiceRequest) (string, error)

AddVoice calls the AddVoice method on the default client.

func DeleteHistoryItem

func DeleteHistoryItem(itemId string) error

DeleteHistoryItem calls the DeleteHistoryItem method on the default client.

func DeleteSample

func DeleteSample(voiceId, sampleId string) error

DeleteSample calls the DeleteSample method on the default client.

func DeleteVoice

func DeleteVoice(voiceId string) error

DeleteVoice calls the DeleteVoice method on the default client.

func DownloadHistoryAudio

func DownloadHistoryAudio(dlReq DownloadHistoryRequest) ([]byte, error)

DownloadHistoryAudio calls the DownloadHistoryAudio method on the default client.

func EditVoice

func EditVoice(voiceId string, voiceReq AddEditVoiceRequest) error

EditVoice calls the EditVoice method on the default client.

func EditVoiceSettings

func EditVoiceSettings(voiceId string, settings VoiceSettings) error

EditVoiceSettings calls the EditVoiceSettings method on the default client.

func GetHistory

func GetHistory(queries ...QueryFunc) (GetHistoryResponse, NextHistoryPageFunc, error)

GetHistory calls the GetHistory method on the default client.

func GetHistoryItemAudio

func GetHistoryItemAudio(itemId string) ([]byte, error)

GetHistoryItemAudio calls the GetHistoryItemAudio method on the default client.

func GetSampleAudio

func GetSampleAudio(voiceId, sampleId string) ([]byte, error)

GetSampleAudio calls the GetSampleAudio method on the default client.

func SetAPIKey

func SetAPIKey(apiKey string)

SetAPIKey sets the API key for the default client.

It should be called before making any API calls with the default client if authentication is needed. The function takes a string argument which is the API key to be set.

func SetTimeout

func SetTimeout(timeout time.Duration)

SetTimeout sets the timeout duration for the default client.

It can be called if custom timeout settings are required for API calls. The function takes a time.Duration argument which is the timeout to be set.

func SpeechToText added in v0.3.1

func SpeechToText(sttReq SpeechToTextRequest, queries ...QueryFunc) ([]byte, error)

SpeechToText calls the SpeechToText method on the default client.

func TextToSpeech

func TextToSpeech(voiceID string, ttsReq TextToSpeechRequest, queries ...QueryFunc) ([]byte, error)

TextToSpeech calls the TextToSpeech method on the default client.

func TextToSpeechStream

func TextToSpeechStream(streamWriter io.Writer, voiceID string, ttsReq TextToSpeechRequest, queries ...QueryFunc) error

TextToSpeechStream calls the TextToSpeechStream method on the default client.

func TextToSpeechStreamWithTimestamps

func TextToSpeechStreamWithTimestamps(streamWriter io.Writer, voiceID string, ttsReq TextToSpeechRequest, queries ...QueryFunc) error

TextToSpeechStreamWithTimestamps calls the TextToSpeechStreamWithTimestamps method on the default client.

Types

type APIError

type APIError struct {
	Detail APIErrorDetail `json:"detail"`
}

APIError represents an error response from the API.

At this stage, any error that is not a ValidationError is returned in this format.

func (*APIError) Error

func (e *APIError) Error() string

type APIErrorDetail

type APIErrorDetail struct {
	Status         string `json:"status"`
	Message        string `json:"message"`
	AdditionalInfo string `json:"additional_info,omitempty"`
}

APIErrorDetail contains detailed information about an APIError.
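
Since Error is defined on *APIError, callers can use errors.As to tell API errors apart from other failures. The sketch below shows the pattern with local mirrors of the two types. apiError, apiErrorDetail, describe, the error message format and the "quota_exceeded" status are all invented for this illustration, and whether the library wraps the error or returns it directly is not something this page states (errors.As covers both cases).

```go
package main

import (
	"errors"
	"fmt"
)

// Local mirrors of APIErrorDetail and APIError (JSON tags omitted).
type apiErrorDetail struct {
	Status         string
	Message        string
	AdditionalInfo string
}

type apiError struct {
	Detail apiErrorDetail
}

// Error makes *apiError satisfy the error interface. The message format here
// is made up for the sketch.
func (e *apiError) Error() string {
	return e.Detail.Status + ": " + e.Detail.Message
}

// errConn stands in for a non-API failure such as a network error.
var errConn = errors.New("connection reset")

// describe reports the API status when err is (or wraps) an *apiError.
func describe(err error) string {
	var apiErr *apiError
	if errors.As(err, &apiErr) {
		return "API error with status " + apiErr.Detail.Status
	}
	return "other error: " + err.Error()
}

func main() {
	// "quota_exceeded" is a hypothetical status value.
	wrapped := fmt.Errorf("tts failed: %w", &apiError{Detail: apiErrorDetail{
		Status:  "quota_exceeded",
		Message: "not enough credits",
	}})
	fmt.Println(describe(wrapped))
	fmt.Println(describe(errConn))
}
```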

type AddEditVoiceRequest

type AddEditVoiceRequest struct {
	Name        string
	FilePaths   []string
	Description string
	Labels      map[string]string
}

type AddVoiceResponse

type AddVoiceResponse struct {
	VoiceId string `json:"voice_id"`
}

type AlignmentInfo

type AlignmentInfo struct {
	Characters                 []string  `json:"characters"`
	CharacterStartTimesSeconds []float64 `json:"character_start_times_seconds"`
	CharacterEndTimesSeconds   []float64 `json:"character_end_times_seconds"`
}

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client represents an API client that can be used to make calls to the Elevenlabs API. The NewClient function should be used when instantiating a new Client.

This library also includes a default client instance that can be used when it's more convenient or when only a single instance of Client will ever be used by the program. The default client's API key and timeout (which defaults to 30 seconds) can be modified with SetAPIKey and SetTimeout respectively, but the parent context is fixed and is set to context.Background().

func NewClient

func NewClient(ctx context.Context, apiKey string, reqTimeout time.Duration) *Client

NewClient creates and returns a new Client object with provided settings.

It should be used to instantiate a new client with a specific API key, request timeout, and context.

It takes a context.Context argument, which acts as the parent context for requests made by this client; a string argument that represents the API key to be used for authenticated requests; and a time.Duration argument that represents the timeout duration for the client's requests.

It returns a pointer to a newly created Client.

func (*Client) AddVoice

func (c *Client) AddVoice(voiceReq AddEditVoiceRequest) (string, error)

AddVoice adds a new voice to the user's VoiceLab.

It takes an AddEditVoiceRequest argument that contains the information of the voice to be added.

It returns the ID of the newly added voice, or an error.

func (*Client) DeleteHistoryItem

func (c *Client) DeleteHistoryItem(itemId string) error

DeleteHistoryItem deletes a specific history item by its ID.

It takes a string argument representing the ID of the history item to be deleted.

It returns nil if successful or an error otherwise.

func (*Client) DeleteSample

func (c *Client) DeleteSample(voiceId, sampleId string) error

DeleteSample deletes a sample associated with a specific voice.

It takes two string arguments representing the ID of the voice to which the sample belongs and the ID of the sample to be deleted respectively.

It returns nil if successful or an error otherwise.

func (*Client) DeleteVoice

func (c *Client) DeleteVoice(voiceId string) error

DeleteVoice deletes a voice.

It takes a string argument that represents the ID of the voice to be deleted.

It returns nil if successful or an error otherwise.

func (*Client) DownloadHistoryAudio

func (c *Client) DownloadHistoryAudio(dlReq DownloadHistoryRequest) ([]byte, error)

DownloadHistoryAudio downloads the audio data for one or more history items.

It takes a DownloadHistoryRequest argument that specifies the history item(s) to download.

It returns a byte slice containing the downloaded audio data. If one history item ID was provided, the byte slice is an MPEG-encoded audio file. If multiple item IDs were provided, the byte slice is a zip file packing the history items' audio files.
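
When several item IDs are requested, the returned bytes can be opened in memory with the standard archive/zip package. The sketch below is one way to list the entries. listZipEntries and buildSampleZip are hypothetical helpers, and the sample archive is built locally so the code runs without calling the API.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// listZipEntries returns the file names stored in a zip archive held in memory.
func listZipEntries(data []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(zr.File))
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

// buildSampleZip creates a tiny archive so the sketch runs without the API.
func buildSampleZip(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range names {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte("placeholder audio")); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := buildSampleZip("first.mp3", "second.mp3")
	if err != nil {
		panic(err)
	}
	names, err := listZipEntries(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // [first.mp3 second.mp3]
}
```

Passing a single-item download (plain audio bytes) to listZipEntries makes zip.NewReader return an error, which is one way to tell the two cases apart when the number of requested IDs is not at hand.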

func (*Client) EditVoice

func (c *Client) EditVoice(voiceId string, voiceReq AddEditVoiceRequest) error

EditVoice updates an existing voice belonging to the user.

It takes a string argument that represents the ID of the voice to update, and an AddEditVoiceRequest argument 'voiceReq' that contains the updated information for the voice.

It returns nil if successful or an error otherwise.

func (*Client) EditVoiceSettings

func (c *Client) EditVoiceSettings(voiceId string, settings VoiceSettings) error

EditVoiceSettings updates the settings for a specific voice.

It takes a string argument that represents the ID of the voice to which the settings to be updated belong, and a VoiceSettings argument that contains the new settings to be applied.

It returns nil if successful or an error otherwise.

func (*Client) GetDefaultVoiceSettings

func (c *Client) GetDefaultVoiceSettings() (VoiceSettings, error)

GetDefaultVoiceSettings retrieves the default settings for voices.

It returns a VoiceSettings object or an error.

func (*Client) GetHistory

func (c *Client) GetHistory(queries ...QueryFunc) (GetHistoryResponse, NextHistoryPageFunc, error)

GetHistory retrieves the history of all created audio and their metadata.

It accepts an optional list of QueryFunc 'queries' to modify the request. The QueryFunc functions relevant for this function are PageSize and StartAfter.

It returns a GetHistoryResponse object containing the history data, a function of type NextHistoryPageFunc to retrieve the next page of history, and an error.

Example
// Define a helper function to print history items
printHistory := func(r elevenlabs.GetHistoryResponse, p int) {
	fmt.Printf("--Page %d--\n", p)
	for i, h := range r.History {
		t := time.Unix(int64(h.DateUnix), 0)
		fmt.Printf("%d. %s - %s: %d bytes\n", p+i, t.Format("2006-01-02 15:04:05"), h.HistoryItemId, len(h.Text))
	}
}
// Create a new client
client := elevenlabs.NewClient(context.Background(), "your-api-key", 30*time.Second)

// Get and print the first page (5 items).
page := 1
historyResp, nextPage, err := client.GetHistory(elevenlabs.PageSize(5))
if err != nil {
	log.Fatal(err)
}
printHistory(historyResp, page)

// Get all other pages
for nextPage != nil {
	page++
	/*
		Retrieve the next page. The page size from the original call is retained but
		can be overwritten by passing a call to PageSize with the new size.
	*/
	historyResp, nextPage, err = nextPage()
	if err != nil {
		log.Fatal(err)
	}
	printHistory(historyResp, page)
}

func (*Client) GetHistoryItem

func (c *Client) GetHistoryItem(itemId string) (HistoryItem, error)

GetHistoryItem retrieves a specific history item by its ID.

It takes a string argument representing the ID of the history item to be retrieved.

It returns a HistoryItem object representing the retrieved history item, or an error.

func (*Client) GetHistoryItemAudio

func (c *Client) GetHistoryItemAudio(itemId string) ([]byte, error)

GetHistoryItemAudio retrieves the audio data for a specific history item by its ID.

It takes a string argument representing the ID of the history item for which the audio data is retrieved.

It returns a byte slice containing the audio data or an error.

func (*Client) GetModels

func (c *Client) GetModels() ([]Model, error)

GetModels retrieves the list of all available models.

It returns a slice of Model objects or an error.

func (*Client) GetSampleAudio

func (c *Client) GetSampleAudio(voiceId, sampleId string) ([]byte, error)

GetSampleAudio retrieves the audio data for a specific sample associated with a voice.

It takes two string arguments representing the IDs of the voice and sample respectively.

It returns a byte slice containing the audio data in case of success or an error.

func (*Client) GetSubscription

func (c *Client) GetSubscription() (Subscription, error)

GetSubscription retrieves the subscription details for the user.

It returns a Subscription object representing the subscription details, or an error.

func (*Client) GetUser

func (c *Client) GetUser() (User, error)

GetUser retrieves the user information.

It returns a User object representing the user details, or an error.

The Subscription object returned with User will not have the invoicing details populated. Use GetSubscription to retrieve the user's full subscription details.

func (*Client) GetVoice

func (c *Client) GetVoice(voiceId string, queries ...QueryFunc) (Voice, error)

GetVoice retrieves metadata about a certain voice.

It takes a string argument that represents the ID of the voice for which the metadata are retrieved and an optional list of QueryFunc 'queries' to modify the request. The QueryFunc relevant for this function is WithSettings.

It returns a Voice object or an error.

func (*Client) GetVoiceSettings

func (c *Client) GetVoiceSettings(voiceId string) (VoiceSettings, error)

GetVoiceSettings retrieves the settings for a specific voice.

It takes a string argument that represents the ID of the voice for which the settings are retrieved.

It returns a VoiceSettings object or an error.

func (*Client) GetVoices

func (c *Client) GetVoices() ([]Voice, error)

GetVoices retrieves the list of all voices available for use.

It returns a slice of Voice objects or an error.

func (*Client) SpeechToText added in v0.3.1

func (c *Client) SpeechToText(sttReq SpeechToTextRequest, queries ...QueryFunc) ([]byte, error)

SpeechToText converts a given audio file to text.

It takes a SpeechToTextRequest argument that contains the audio file to be converted and an optional list of QueryFunc 'queries' to modify the request.

It returns the JSON response as a byte slice, with the transcription in its text property, or an error.
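
Because the method returns raw JSON, the caller extracts the transcription. The sketch below decodes only the text property; transcript and extractText are hypothetical names, and the sample bytes are hand-written rather than real API output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transcript captures only the text property mentioned above; any other
// properties in the response are ignored by json.Unmarshal.
type transcript struct {
	Text string `json:"text"`
}

// extractText pulls the transcription out of the raw JSON response.
func extractText(raw []byte) (string, error) {
	var t transcript
	if err := json.Unmarshal(raw, &t); err != nil {
		return "", err
	}
	return t.Text, nil
}

func main() {
	// Hand-written sample standing in for the bytes SpeechToText returns;
	// the "other" property is made up to show that extra fields are skipped.
	raw := []byte(`{"text":"hello world","other":"ignored"}`)
	text, err := extractText(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // hello world
}
```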

func (*Client) TextToSpeech

func (c *Client) TextToSpeech(voiceID string, ttsReq TextToSpeechRequest, queries ...QueryFunc) ([]byte, error)

TextToSpeech converts a given text to speech audio using a certain voice and returns the audio.

It takes a string argument that represents the ID of the voice to be used for the text to speech conversion, a TextToSpeechRequest argument that contains the text to be used to generate the audio alongside other settings, and an optional list of QueryFunc 'queries' to modify the request. The QueryFunc functions relevant for this method are LatencyOptimizations and OutputFormat.

It returns a byte slice that contains MPEG-encoded audio data in case of success, or an error.

Example
// Create a new client
client := elevenlabs.NewClient(context.Background(), "your-api-key", 30*time.Second)

// Create a TextToSpeechRequest
ttsReq := elevenlabs.TextToSpeechRequest{
	Text:    "Hello, world! My name is Adam, nice to meet you!",
	ModelID: "eleven_monolingual_v1",
}

// Call the TextToSpeech method on the client, using the "Adam"'s voice ID.
audio, err := client.TextToSpeech("pNInz6obpgDQGcFmaJgB", ttsReq)
if err != nil {
	log.Fatal(err)
}

// Write the audio file bytes to disk
if err := os.WriteFile("adam.mp3", audio, 0644); err != nil {
	log.Fatal(err)
}

log.Println("Successfully generated audio file")

func (*Client) TextToSpeechStream

func (c *Client) TextToSpeechStream(streamWriter io.Writer, voiceID string, ttsReq TextToSpeechRequest, queries ...QueryFunc) error

TextToSpeechStream converts and streams a given text to speech audio using a certain voice.

It takes an io.Writer argument to which the streamed audio will be copied, a string argument that represents the ID of the voice to be used for the text to speech conversion, a TextToSpeechRequest argument that contains the text to be used to generate the audio alongside other settings, and an optional list of QueryFunc 'queries' to modify the request. The QueryFunc functions relevant for this method are LatencyOptimizations and OutputFormat.

It is important to set the timeout of the client to a duration large enough to maintain the desired streaming period.

It returns nil if successful or an error otherwise.

Example
message := `The concept of "flushing" typically applies to I/O buffers in many programming 
languages, which store data temporarily in memory before writing it to a more permanent location
like a file or a network connection. Flushing the buffer means writing all the buffered data
immediately, even if the buffer isn't full.`

// Set your API key
elevenlabs.SetAPIKey("your-api-key")

// Set a large enough timeout to ensure the stream is not interrupted.
elevenlabs.SetTimeout(1 * time.Minute)

// We'll use mpv to play the audio from the stream piped to standard input
cmd := exec.CommandContext(context.Background(), "mpv", "--no-cache", "--no-terminal", "--", "fd://0")

// Get a pipe connected to the mpv's standard input
pipe, err := cmd.StdinPipe()
if err != nil {
	log.Fatal(err)
}

// Attempt to run the command in a separate process
if err := cmd.Start(); err != nil {
	log.Fatal(err)
}

// Stream the audio to the pipe connected to mpv's standard input
if err := elevenlabs.TextToSpeechStream(
	pipe,
	"pNInz6obpgDQGcFmaJgB",
	elevenlabs.TextToSpeechRequest{
		Text:    message,
		ModelID: "eleven_multilingual_v1",
	}); err != nil {
	log.Fatalf("Got %T error: %q\n", err, err)
}

// Close the pipe once the whole stream has been copied to it
if err := pipe.Close(); err != nil {
	log.Fatalf("Could not close pipe: %s", err)
}
log.Print("Streaming finished.")

/*
	Wait for mpv to exit. With the pipe closed, it will do that as
	soon as it finishes playing
*/
if err := cmd.Wait(); err != nil {
	log.Fatal(err)
}

log.Print("All done.")

func (*Client) TextToSpeechStreamWithTimestamps

func (c *Client) TextToSpeechStreamWithTimestamps(streamWriter io.Writer, voiceID string, ttsReq TextToSpeechRequest, queries ...QueryFunc) error

TextToSpeechStreamWithTimestamps converts a given text to speech audio with character-level timestamps.

It takes an io.Writer argument to which the streamed response will be copied, a string argument that represents the ID of the voice to be used for the text to speech conversion, a TextToSpeechRequest argument that contains the text to be used to generate the audio alongside other settings and an optional list of QueryFunc 'queries' to modify the request. The QueryFunc functions relevant for this method are LatencyOptimizations and OutputFormat.

The response stream contains JSON data with base64 encoded audio and timestamp information for each character in the original and normalized text.

It is important to set the timeout of the client to a duration large enough to maintain the desired streaming period.

It returns nil if successful or an error otherwise.

type DownloadHistoryRequest

type DownloadHistoryRequest struct {
	HistoryItemIds []string `json:"history_item_ids"`
}

type Feedback

type Feedback struct {
	AudioQuality    bool    `json:"audio_quality"`
	Emotions        bool    `json:"emotions"`
	Feedback        string  `json:"feedback"`
	Glitches        bool    `json:"glitches"`
	InaccurateClone bool    `json:"inaccurate_clone"`
	Other           bool    `json:"other"`
	ReviewStatus    *string `json:"review_status,omitempty"`
	ThumbsUp        bool    `json:"thumbs_up"`
}

type File

type File struct {
	FileId         string `json:"file_id"`
	FileName       string `json:"file_name"`
	MimeType       string `json:"mime_type"`
	SizeBytes      int    `json:"size_bytes"`
	UploadDateUnix int    `json:"upload_date_unix"`
}

type FineTuning

type FineTuning struct {
	FineTuningRequested         bool                  `json:"fine_tuning_requested"`
	FineTuningState             string                `json:"finetuning_state"`
	IsAllowedToFineTune         bool                  `json:"is_allowed_to_fine_tune"`
	Language                    string                `json:"language"`
	ManualVerification          ManualVerification    `json:"manual_verification"`
	ManualVerificationRequested bool                  `json:"manual_verification_requested"`
	SliceIds                    []string              `json:"slice_ids"`
	VerificationAttempts        []VerificationAttempt `json:"verification_attempts"`
	VerificationAttemptsCount   int                   `json:"verification_attempts_count"`
	VerificationFailures        []string              `json:"verification_failures"`
}

type GetHistoryResponse

type GetHistoryResponse struct {
	History           []HistoryItem `json:"history"`
	LastHistoryItemId string        `json:"last_history_item_id"`
	HasMore           bool          `json:"has_more"`
}

type GetVoicesResponse

type GetVoicesResponse struct {
	Voices []Voice `json:"voices"`
}

type HistoryItem

type HistoryItem struct {
	CharacterCountChangeFrom int           `json:"character_count_change_from"`
	CharacterCountChangeTo   int           `json:"character_count_change_to"`
	ContentType              string        `json:"content_type"`
	DateUnix                 int           `json:"date_unix"`
	Feedback                 Feedback      `json:"feedback"`
	HistoryItemId            string        `json:"history_item_id"`
	ModelId                  string        `json:"model_id"`
	RequestId                string        `json:"request_id"`
	Settings                 VoiceSettings `json:"settings"`
	ShareLinkId              string        `json:"share_link_id"`
	State                    string        `json:"state"`
	Text                     string        `json:"text"`
	VoiceCategory            string        `json:"voice_category"`
	VoiceId                  string        `json:"voice_id"`
	VoiceName                string        `json:"voice_name"`
}

func GetHistoryItem

func GetHistoryItem(itemId string) (HistoryItem, error)

GetHistoryItem calls the GetHistoryItem method on the default client.

type Invoice

type Invoice struct {
	AmountDueCents         int `json:"amount_due_cents"`
	NextPaymentAttemptUnix int `json:"next_payment_attempt_unix"`
}

type Language

type Language struct {
	LanguageId string `json:"language_id"`
	Name       string `json:"name"`
}

type ManualVerification

type ManualVerification struct {
	ExtraText       string `json:"extra_text"`
	Files           []File `json:"files"`
	RequestTimeUnix int    `json:"request_time_unix"`
}

type Model

type Model struct {
	CanBeFineTuned                     bool       `json:"can_be_finetuned"`
	CanDoTextToSpeech                  bool       `json:"can_do_text_to_speech"`
	CanDoVoiceConversion               bool       `json:"can_do_voice_conversion"`
	CanUseSpeakerBoost                 bool       `json:"can_use_speaker_boost"`
	CanUseStyle                        bool       `json:"can_use_style"`
	Description                        string     `json:"description"`
	Languages                          []Language `json:"languages"`
	MaxCharactersRequestFreeUser       int        `json:"max_characters_request_free_user"`
	MaxCharactersRequestSubscribedUser int        `json:"max_characters_request_subscribed_user"`
	ModelId                            string     `json:"model_id"`
	Name                               string     `json:"name"`
	RequiresAlphaAccess                bool       `json:"requires_alpha_access"`
	ServesProVoices                    bool       `json:"serves_pro_voices"`
	TokenCostFactor                    float32    `json:"token_cost_factor"`
}

func GetModels

func GetModels() ([]Model, error)

GetModels calls the GetModels method on the default client.

type NextHistoryPageFunc

type NextHistoryPageFunc func(...QueryFunc) (GetHistoryResponse, NextHistoryPageFunc, error)

NextHistoryPageFunc represents functions that can be used to access subsequent history pages. It is returned by the GetHistory client method.

A NextHistoryPageFunc wraps a call to GetHistory and returns the next page together with another NextHistoryPageFunc. Once all history pages have been retrieved, nil is returned in its place.

As such, a "while"-style for loop or recursive calls to the returned NextHistoryPageFunc can be employed to retrieve all history in a paginated way if needed.
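
The loop described above can be sketched as follows. This is a minimal, self-contained illustration of the pattern only: Page, NextPageFunc, fetchPage and collectAll are local stand-ins invented for this sketch (the real NextHistoryPageFunc also accepts QueryFunc arguments and returns a GetHistoryResponse), and the data source is an in-memory slice rather than the API.

```go
package main

import "fmt"

// Page is a local stand-in for GetHistoryResponse (hypothetical, not the package type).
type Page struct {
	Items   []string
	HasMore bool
}

// NextPageFunc is a local stand-in for NextHistoryPageFunc without the QueryFunc arguments.
type NextPageFunc func() (Page, NextPageFunc, error)

// fetchPage is a hypothetical in-memory source that returns one page and,
// while more data remains, a function that fetches the following page.
func fetchPage(data []string, start, size int) (Page, NextPageFunc, error) {
	end := start + size
	if end > len(data) {
		end = len(data)
	}
	page := Page{Items: data[start:end], HasMore: end < len(data)}
	if !page.HasMore {
		return page, nil, nil // nil signals that no further pages exist
	}
	next := func() (Page, NextPageFunc, error) {
		return fetchPage(data, end, size)
	}
	return page, next, nil
}

// collectAll drains all pages with a "while"-style loop that stops when next is nil.
func collectAll(first Page, next NextPageFunc) ([]string, error) {
	all := append([]string(nil), first.Items...)
	for next != nil {
		var page Page
		var err error
		page, next, err = next()
		if err != nil {
			return all, err
		}
		all = append(all, page.Items...)
	}
	return all, nil
}

func main() {
	data := []string{"a", "b", "c", "d", "e"}
	first, next, err := fetchPage(data, 0, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	all, err := collectAll(first, next)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(all) // prints [a b c d e]
}
```

With the real client, the same loop shape would apply to the values returned by GetHistory.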

type PronunciationDictionaryLocator

type PronunciationDictionaryLocator struct {
	PronunciationDictionaryId string `json:"pronunciation_dictionary_id"`
	VersionId                 string `json:"version_id"`
}

type QueryFunc

type QueryFunc func(*url.Values)

QueryFunc represents functions that set a query string parameter to a given value.
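
A minimal sketch of how such option functions compose is shown below. QueryFunc is re-declared locally with the documented signature so the example runs on its own; pageSize, startAfter and buildQuery are hypothetical helpers modeled on the documented PageSize and StartAfter functions, and the way the package applies options internally may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// QueryFunc is re-declared here with the documented signature.
type QueryFunc func(*url.Values)

// pageSize is a hypothetical option modeled on the documented PageSize helper.
func pageSize(n int) QueryFunc {
	return func(q *url.Values) {
		q.Set("page_size", strconv.Itoa(n))
	}
}

// startAfter is a hypothetical option modeled on the documented StartAfter helper.
func startAfter(id string) QueryFunc {
	return func(q *url.Values) {
		q.Set("start_after_history_item_id", id)
	}
}

// buildQuery applies each option in order and returns the encoded query string.
func buildQuery(opts ...QueryFunc) string {
	q := url.Values{}
	for _, opt := range opts {
		opt(&q)
	}
	return q.Encode() // Encode sorts parameters by key
}

func main() {
	fmt.Println(buildQuery(pageSize(10), startAfter("abc123")))
	// prints page_size=10&start_after_history_item_id=abc123
}
```

Because every option has the same type, a caller can pass any combination of them to a variadic parameter such as the one on GetVoice.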

func LatencyOptimizations

func LatencyOptimizations(value int) QueryFunc

LatencyOptimizations returns a QueryFunc that sets the HTTP query parameter 'optimize_streaming_latency' to a certain value. It is meant to be used with TextToSpeech and TextToSpeechStream to turn on latency optimization.

Possible values:
0 - default mode (no latency optimizations).
1 - normal latency optimizations.
2 - strong latency optimizations.
3 - max latency optimizations.
4 - max latency optimizations, with text normalizer turned off (best latency, but can mispronounce things like numbers or dates).

func OutputFormat

func OutputFormat(value string) QueryFunc

OutputFormat returns a QueryFunc that sets the HTTP query parameter 'output_format' to a certain value. It is meant to be used with TextToSpeech and TextToSpeechStream to change the output format to a value other than the default (mp3_44100_128).

Possible values:
mp3_22050_32 - mp3 with 22.05kHz sample rate at 32kbps.
mp3_44100_32 - mp3 with 44.1kHz sample rate at 32kbps.
mp3_44100_64 - mp3 with 44.1kHz sample rate at 64kbps.
mp3_44100_96 - mp3 with 44.1kHz sample rate at 96kbps.
mp3_44100_128 - mp3 with 44.1kHz sample rate at 128kbps (default).
mp3_44100_192 - mp3 with 44.1kHz sample rate at 192kbps (Requires subscription of Creator tier or above).
pcm_8000 - PCM (S16LE) with 8kHz sample rate.
pcm_16000 - PCM (S16LE) with 16kHz sample rate.
pcm_22050 - PCM (S16LE) with 22.05kHz sample rate.
pcm_24000 - PCM (S16LE) with 24kHz sample rate.
pcm_44100 - PCM (S16LE) with 44.1kHz sample rate (Requires subscription of Independent Publisher tier or above).
pcm_48000 - PCM (S16LE) with 48kHz sample rate.
alaw_8000 - A-law with 8kHz sample rate.
ulaw_8000 - μ-law with 8kHz sample rate. Note that this format is commonly used for Twilio audio inputs.
opus_48000_32 - Opus with 48kHz sample rate at 32kbps.
opus_48000_64 - Opus with 48kHz sample rate at 64kbps.
opus_48000_96 - Opus with 48kHz sample rate at 96kbps.
opus_48000_128 - Opus with 48kHz sample rate at 128kbps (default).
opus_48000_192 - Opus with 48kHz sample rate at 192kbps (Requires subscription of Independent Publisher tier or above).
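
The format names above follow a codec_samplerate[_bitrate] pattern. The sketch below splits a name into those parts, which can be handy when choosing a file extension or configuring a player. FormatInfo and parseOutputFormat are hypothetical helpers written for this illustration; they are not part of the package and only assume the naming pattern visible in the list.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// FormatInfo is a hypothetical helper type, not part of the package.
type FormatInfo struct {
	Codec       string
	SampleRate  int // in Hz
	BitrateKbps int // 0 when the name carries no bitrate (pcm, alaw, ulaw)
}

// parseOutputFormat splits a name such as "mp3_44100_128" into its parts.
func parseOutputFormat(name string) (FormatInfo, error) {
	parts := strings.Split(name, "_")
	if len(parts) < 2 || len(parts) > 3 {
		return FormatInfo{}, fmt.Errorf("unexpected output format %q", name)
	}
	rate, err := strconv.Atoi(parts[1])
	if err != nil {
		return FormatInfo{}, fmt.Errorf("bad sample rate in %q: %w", name, err)
	}
	info := FormatInfo{Codec: parts[0], SampleRate: rate}
	if len(parts) == 3 {
		info.BitrateKbps, err = strconv.Atoi(parts[2])
		if err != nil {
			return FormatInfo{}, fmt.Errorf("bad bitrate in %q: %w", name, err)
		}
	}
	return info, nil
}

func main() {
	for _, name := range []string{"mp3_44100_128", "pcm_16000"} {
		info, err := parseOutputFormat(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: codec=%s rate=%d bitrate=%d\n", name, info.Codec, info.SampleRate, info.BitrateKbps)
	}
}
```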

func PageSize

func PageSize(n int) QueryFunc

PageSize returns a QueryFunc that sets the HTTP query parameter 'page_size' to a given value. It is meant to be used with GetHistory to set the number of elements returned in the GetHistoryResponse.History slice.

func StartAfter

func StartAfter(id string) QueryFunc

StartAfter returns a QueryFunc that sets the HTTP query parameter 'start_after_history_item_id' to a given item ID. It is meant to be used with GetHistory to specify the history item after which retrieval starts.

func WithSettings

func WithSettings() QueryFunc

WithSettings returns a QueryFunc that sets the HTTP query parameter 'with_settings' to true. It is meant to be used with GetVoice to include the voice's settings along with the Voice metadata.

type Recording

type Recording struct {
	MimeType       string `json:"mime_type"`
	RecordingId    string `json:"recording_id"`
	SizeBytes      int    `json:"size_bytes"`
	Transcription  string `json:"transcription"`
	UploadDateUnix int    `json:"upload_date_unix"`
}

type SpeechToTextRequest added in v0.3.1

type SpeechToTextRequest struct {
	ModelID    string
	FileName   string
	Audio      io.Reader
	FileFormat string `json:"file_format,omitempty"`
}

type Subscription

type Subscription struct {
	AllowedToExtendCharacterLimit  bool    `json:"allowed_to_extend_character_limit"`
	CanExtendCharacterLimit        bool    `json:"can_extend_character_limit"`
	CanExtendVoiceLimit            bool    `json:"can_extend_voice_limit"`
	CanUseInstantVoiceCloning      bool    `json:"can_use_instant_voice_cloning"`
	CanUseProfessionalVoiceCloning bool    `json:"can_use_professional_voice_cloning"`
	CharacterCount                 int     `json:"character_count"`
	CharacterLimit                 int     `json:"character_limit"`
	Currency                       string  `json:"currency"`
	NextCharacterCountResetUnix    int     `json:"next_character_count_reset_unix"`
	VoiceLimit                     int     `json:"voice_limit"`
	ProfessionalVoiceLimit         int     `json:"professional_voice_limit"`
	Status                         string  `json:"status"`
	Tier                           string  `json:"tier"`
	MaxVoiceAddEdits               int     `json:"max_voice_add_edits"`
	VoiceAddEditCounter            int     `json:"voice_add_edit_counter"`
	HasOpenInvoices                bool    `json:"has_open_invoices"`
	NextInvoice                    Invoice `json:"next_invoice"`
	WithInvoicingDetails           bool    `json:"with_invoicing_details"`
}

func GetSubscription

func GetSubscription() (Subscription, error)

GetSubscription calls the GetSubscription method on the default client.

type TextToSpeechRequest

type TextToSpeechRequest struct {
	Text                            string                           `json:"text"`
	ModelID                         string                           `json:"model_id,omitempty"`
	LanguageCode                    string                           `json:"language_code,omitempty"`
	PronunciationDictionaryLocators []PronunciationDictionaryLocator `json:"pronunciation_dictionary_locators,omitempty"`
	Seed                            int                              `json:"seed,omitempty"`
	PreviousText                    string                           `json:"previous_text,omitempty"`
	NextText                        string                           `json:"next_text,omitempty"`
	PreviousRequestIds              []string                         `json:"previous_request_ids,omitempty"`
	NextRequestIds                  []string                         `json:"next_request_ids,omitempty"`
	ApplyTextNormalization          bool                             `json:"apply_text_normalization,omitempty"`
	ApplyLanguageTextNormalization  bool                             `json:"apply_language_text_normalization,omitempty"`
	VoiceSettings                   *VoiceSettings                   `json:"voice_settings,omitempty"`
}
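
Most fields of this struct are tagged omitempty, so zero values are left out of the JSON body. The sketch below shows the effect using local mirrors of a few of the documented fields and tags; ttsRequest, voiceSettings and encode are stand-ins written for this example, and "example-model" is a placeholder rather than a real model ID.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// voiceSettings mirrors two of the documented VoiceSettings fields and tags.
type voiceSettings struct {
	Stability       float32 `json:"stability,omitempty"`
	SimilarityBoost float32 `json:"similarity_boost,omitempty"`
}

// ttsRequest mirrors a subset of the documented TextToSpeechRequest fields and tags.
type ttsRequest struct {
	Text          string         `json:"text"`
	ModelID       string         `json:"model_id,omitempty"`
	Seed          int            `json:"seed,omitempty"`
	VoiceSettings *voiceSettings `json:"voice_settings,omitempty"`
}

// encode marshals a request and returns the JSON as a string.
func encode(r ttsRequest) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Only the required text field is set; everything else is omitted.
	fmt.Println(encode(ttsRequest{Text: "Hello"}))
	// prints {"text":"Hello"}

	// A Stability of 0 is a zero value, so it is dropped from the body as well.
	fmt.Println(encode(ttsRequest{
		Text:          "Hello",
		ModelID:       "example-model", // placeholder, not a real model ID
		VoiceSettings: &voiceSettings{Stability: 0, SimilarityBoost: 0.75},
	}))
	// prints {"text":"Hello","model_id":"example-model","voice_settings":{"similarity_boost":0.75}}
}
```

One consequence worth keeping in mind: with these tags, an explicit 0 or false cannot be distinguished from "not set", so such values are never sent to the API.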

type TextToSpeechWithTimestampsResponse

type TextToSpeechWithTimestampsResponse struct {
	AudioBase64         string        `json:"audio_base64"`
	Alignment           AlignmentInfo `json:"alignment"`
	NormalizedAlignment AlignmentInfo `json:"normalized_alignment"`
}
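
The AudioBase64 field carries the audio as text, so it has to be decoded before it can be written to a file or played. A minimal sketch, assuming the value uses standard (padded) base64 encoding: timestampsResponse is a local mirror of just that one field, decodeAudio is a hypothetical helper, and the sample body is made up for the example.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// timestampsResponse mirrors only the audio_base64 field of the documented response.
type timestampsResponse struct {
	AudioBase64 string `json:"audio_base64"`
}

// decodeAudio extracts the raw audio bytes from a JSON response body.
// It assumes standard base64 encoding, which is an assumption of this sketch.
func decodeAudio(body []byte) ([]byte, error) {
	var resp timestampsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	audio, err := base64.StdEncoding.DecodeString(resp.AudioBase64)
	if err != nil {
		return nil, fmt.Errorf("decode audio: %w", err)
	}
	return audio, nil
}

func main() {
	// Made-up body: "aGVsbG8=" is the base64 form of the 5 bytes "hello".
	body := []byte(`{"audio_base64":"aGVsbG8="}`)
	audio, err := decodeAudio(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(audio)) // prints 5
}
```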

type User

type User struct {
	Subscription                Subscription `json:"subscription"`
	FirstName                   string       `json:"first_name,omitempty"`
	IsNewUser                   bool         `json:"is_new_user"`
	IsOnboardingComplete        bool         `json:"is_onboarding_complete"`
	XiApiKey                    string       `json:"xi_api_key"`
	CanUseDelayedPaymentMethods bool         `json:"can_use_delayed_payment_methods"`
}

func GetUser

func GetUser() (User, error)

GetUser calls the GetUser method on the default client.

type ValidationError

type ValidationError struct {
	Detail *[]ValidationErrorDetailItem `json:"detail"`
}

ValidationError represents a request validation error response from the API.

func (*ValidationError) Error

func (e *ValidationError) Error() string

type ValidationErrorDetailItem

type ValidationErrorDetailItem struct {
	Loc  []ValidationErrorDetailLocItem `json:"loc"`
	Msg  string                         `json:"msg"`
	Type string                         `json:"type"`
}

type ValidationErrorDetailLocItem

type ValidationErrorDetailLocItem string

func (*ValidationErrorDetailLocItem) UnmarshalJSON

func (i *ValidationErrorDetailLocItem) UnmarshalJSON(b []byte) error

type VerificationAttempt

type VerificationAttempt struct {
	Accepted            bool      `json:"accepted"`
	DateUnix            int       `json:"date_unix"`
	LevenshteinDistance float32   `json:"levenshtein_distance"`
	Recording           Recording `json:"recording"`
	Similarity          float32   `json:"similarity"`
	Text                string    `json:"text"`
}

type Voice

type Voice struct {
	AvailableForTiers       []string          `json:"available_for_tiers"`
	Category                string            `json:"category"`
	Description             string            `json:"description"`
	FineTuning              FineTuning        `json:"fine_tuning"`
	HighQualityBaseModelIds []string          `json:"high_quality_base_model_ids"`
	Labels                  map[string]string `json:"labels"`
	Name                    string            `json:"name"`
	PreviewUrl              string            `json:"preview_url"`
	Samples                 []VoiceSample     `json:"samples"`
	Settings                VoiceSettings     `json:"settings,omitempty"`
	Sharing                 VoiceSharing      `json:"sharing"`
	VoiceId                 string            `json:"voice_id"`
}

func GetVoice

func GetVoice(voiceId string, queries ...QueryFunc) (Voice, error)

GetVoice calls the GetVoice method on the default client.

func GetVoices

func GetVoices() ([]Voice, error)

GetVoices calls the GetVoices method on the default client.

type VoiceSample

type VoiceSample struct {
	FileName  string `json:"file_name"`
	Hash      string `json:"hash"`
	MimeType  string `json:"mime_type"`
	SampleId  string `json:"sample_id"`
	SizeBytes int    `json:"size_bytes"`
}

type VoiceSettings

type VoiceSettings struct {
	Stability       float32 `json:"stability,omitempty"`
	SimilarityBoost float32 `json:"similarity_boost,omitempty"`
	Style           float32 `json:"style,omitempty"`
	SpeakerBoost    bool    `json:"use_speaker_boost,omitempty"`
	Speed           float32 `json:"speed,omitempty"` // 0.25 to 4.0
}

func GetDefaultVoiceSettings

func GetDefaultVoiceSettings() (VoiceSettings, error)

GetDefaultVoiceSettings calls the GetDefaultVoiceSettings method on the default client.

func GetVoiceSettings

func GetVoiceSettings(voiceId string) (VoiceSettings, error)

GetVoiceSettings calls the GetVoiceSettings method on the default client.

type VoiceSharing

type VoiceSharing struct {
	ClonedByCount          int               `json:"cloned_by_count"`
	DateUnix               int               `json:"date_unix"`
	Description            string            `json:"description"`
	DisableAtUnix          bool              `json:"disable_at_unix"`
	EnabledInLibrary       bool              `json:"enabled_in_library"`
	FinancialRewardEnabled bool              `json:"financial_reward_enabled"`
	FreeUsersAllowed       bool              `json:"free_users_allowed"`
	HistoryItemSampleId    string            `json:"history_item_sample_id"`
	Labels                 map[string]string `json:"labels"`
	LikedByCount           int               `json:"liked_by_count"`
	LiveModerationEnabled  bool              `json:"live_moderation_enabled"`
	Name                   string            `json:"name"`
	NoticePeriod           int               `json:"notice_period"`
	OriginalVoiceId        string            `json:"original_voice_id"`
	PublicOwnerId          string            `json:"public_owner_id"`
	Rate                   float32           `json:"rate"`
	ReviewMessage          string            `json:"review_message"`
	ReviewStatus           string            `json:"review_status"`
	Status                 string            `json:"status"`
	VoiceMixingAllowed     bool              `json:"voice_mixing_allowed"`
	WhitelistedEmails      []string          `json:"whitelisted_emails"`
}

Directories

Path Synopsis
cmd
codegen command
