259 changes: 259 additions & 0 deletions livekit-plugins/livekit-plugins-camb/README.md
# Camb.ai Plugin for LiveKit Agents

Text-to-speech (TTS) plugin for the [Camb.ai](https://camb.ai) API, powered by its MARS-8 models.

## Features

- High-quality neural text-to-speech with MARS-8 series models
- Multiple model variants (mars-8, mars-8-flash, mars-8-instruct)
- User instructions for style and tone control
- Speed control and enhanced pronunciation
- Support for 140+ languages
- Real-time HTTP streaming
- Pre-built voice library

## Installation

```bash
pip install livekit-plugins-camb
```

## Prerequisites

You'll need a Camb.ai API key, which you can obtain from [Camb.ai Studio](https://studio.camb.ai/public/onboarding). Set it as an environment variable:

```bash
export CAMB_API_KEY=your_api_key_here
```
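
A missing key is easier to diagnose if it is caught before any request is made. The sketch below is a hypothetical helper (`require_api_key` is not part of the plugin) that reads `CAMB_API_KEY` from the environment and fails early with a clear message.

```python
import os


def require_api_key(var_name: str = "CAMB_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; it is not provided by the plugin.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before creating the TTS instance"
        )
    return key
```

The returned value could then be passed to the constructor as `TTS(api_key=require_api_key())`.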

## Quick Start

```python
from livekit.plugins.camb import TTS

# Note: `await` must run inside an async function (or an async REPL)

# Initialize TTS (uses CAMB_API_KEY env var)
tts = TTS()

# Synthesize speech and collect the streamed chunks into one frame
stream = tts.synthesize("Hello from Camb.ai!")
audio_frame = await stream.collect()

# Save to file
with open("output.wav", "wb") as f:
    f.write(audio_frame.to_wav_bytes())
```

## List Available Voices

```python
from livekit.plugins.camb import list_voices

voices = await list_voices()
for voice in voices:
    print(f"{voice.name} ({voice.id}): {voice.gender}, {voice.language}")
```

## Select a Specific Voice

```python
tts = TTS(voice_id=2681) # Use Attic voice
stream = tts.synthesize("Using a specific voice!")
```

## Model Selection

Camb.ai offers multiple MARS-8 models for different use cases:

```python
# Default balanced model
tts = TTS(model="mars-8")

# Faster inference
tts = TTS(model="mars-8-flash")

# Supports user instructions for style/tone
tts = TTS(
    model="mars-8-instruct",
    user_instructions="Speak in a friendly, conversational tone",
)
```

## Advanced Configuration

```python
tts = TTS(
    api_key="your-api-key",       # Or use CAMB_API_KEY env var
    voice_id=2681,                # Voice ID from list_voices() (Attic voice)
    language="en-us",             # BCP-47 locale
    model="mars-8-instruct",      # MARS model variant
    speed=1.0,                    # Speech rate (0.5-2.0)
    user_instructions="Speak energetically with clear enunciation",
    output_format="pcm_s16le",    # Audio format
    enhance_named_entities=True,  # Better pronunciation for names/places
)
```

## Usage with LiveKit Agents

```python
from livekit import agents, rtc
from livekit.plugins.camb import TTS


async def entrypoint(ctx: agents.JobContext):
    # Connect to room
    await ctx.connect()

    # Initialize TTS
    tts = TTS(language="en-us", speed=1.1)

    # Synthesize and collect the audio into a single frame
    stream = tts.synthesize("Hello from LiveKit with Camb.ai!")
    audio_frame = await stream.collect()

    # Publish to room (AudioSource and LocalAudioTrack live in livekit.rtc)
    source = rtc.AudioSource(tts.sample_rate, tts.num_channels)
    track = rtc.LocalAudioTrack.create_audio_track("tts", source)
    await ctx.room.local_participant.publish_track(track)
    await source.capture_frame(audio_frame)
```

## Configuration Options

### TTS Constructor Parameters

- **api_key** (str | None): Camb.ai API key
- **voice_id** (int): Voice ID to use (default: 2681)
- **language** (str): BCP-47 locale (default: "en-us")
- **model** (SpeechModel): MARS model variant (default: "mars-8")
- **speed** (float): Speech rate (default: 1.0)
- **user_instructions** (str | None): Style/tone guidance (requires mars-8-instruct)
- **output_format** (OutputFormat): Audio format (default: "pcm_s16le")
- **enhance_named_entities** (bool): Enhanced pronunciation (default: False)
- **base_url** (str): API base URL
- **http_session** (aiohttp.ClientSession | None): Reusable HTTP session
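
Two of the parameters above carry constraints stated elsewhere in this README: `speed` is quoted with a 0.5-2.0 range, and `user_instructions` requires the `mars-8-instruct` model. The following is a minimal sketch of a pre-flight check for those two rules. `check_options` is a hypothetical helper, and whether the plugin performs equivalent validation itself is not stated here.

```python
from typing import List, Optional

INSTRUCT_MODEL = "mars-8-instruct"
SPEED_RANGE = (0.5, 2.0)  # range quoted in the Advanced Configuration example


def check_options(
    model: str, speed: float, user_instructions: Optional[str] = None
) -> List[str]:
    """Return a list of problems; an empty list means the options look consistent.

    Hypothetical helper for illustration; it is not part of the plugin.
    """
    problems = []
    low, high = SPEED_RANGE
    if not low <= speed <= high:
        problems.append(f"speed {speed} is outside {low}-{high}")
    if user_instructions and model != INSTRUCT_MODEL:
        problems.append(
            f"user_instructions requires model={INSTRUCT_MODEL!r}, got {model!r}"
        )
    return problems
```

Returning a list rather than raising on the first issue lets a caller report every problem at once.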

### Available Models

- **mars-8**: Default, balanced quality and speed
- **mars-8-flash**: Faster inference, lower latency
- **mars-8-instruct**: Supports user_instructions for style control
- **mars-7**: Previous generation model
- **mars-6**: Older generation model
- **auto**: Automatic model selection

### Output Formats

- **pcm_s16le**: 16-bit PCM (recommended for streaming)
- **pcm_s32le**: 32-bit PCM (highest quality)
- **wav**: WAV with headers
- **flac**: Lossless compression
- **adts**: ADTS streaming format
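
Raw `pcm_s16le` output has no header, so most players cannot open it directly. As a sketch using only the standard library, the helper below wraps raw 16-bit PCM bytes in a WAV container. `pcm_s16le_to_wav` is a hypothetical name, and the defaults assume the 24000 Hz mono output listed under the `TTS` properties.

```python
import io
import wave


def pcm_s16le_to_wav(
    pcm: bytes, sample_rate: int = 24000, num_channels: int = 1
) -> bytes:
    """Wrap raw 16-bit little-endian PCM in a WAV container.

    Hypothetical helper for illustration; defaults are assumptions taken
    from the sample_rate and num_channels values quoted in this README.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(num_channels)
        wav.setsampwidth(2)  # 16 bits = 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

The same approach does not apply to `pcm_s32le` without changing the sample width to 4 bytes.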

## API Reference

### TTS Class

Main text-to-speech interface.

**Methods:**
- `synthesize(text: str) -> ChunkedStream`: Synthesize text to speech
- `update_options(**kwargs)`: Update voice settings dynamically
- `aclose()`: Clean up resources

**Properties:**
- `model` (str): Current MARS model name
- `provider` (str): Provider name ("Camb.ai")
- `sample_rate` (int): Audio sample rate (24000 Hz)
- `num_channels` (int): Number of audio channels (1)

### list_voices Function

```python
async def list_voices(
    api_key: str | None = None,
    base_url: str = "https://client.camb.ai/apis",
) -> list[VoiceInfo]
```

Returns a list of available voices with their metadata.

### VoiceInfo

Voice metadata object with:
- **id** (int): Unique voice identifier
- **name** (str): Human-readable voice name
- **gender** (str | None): Voice gender
- **language** (str | None): BCP-47 locale
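
Since `gender` and `language` may be `None`, filtering the result of `list_voices()` needs to tolerate missing values. The sketch below shows one way to do that. `Voice` is a stand-in dataclass with the same four fields (not the plugin's `VoiceInfo` class), and `find_voices` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Voice:
    """Stand-in mirroring the VoiceInfo fields; not the plugin's class."""

    id: int
    name: str
    gender: Optional[str] = None
    language: Optional[str] = None


def find_voices(
    voices: Iterable[Voice], language: str, gender: Optional[str] = None
) -> List[Voice]:
    """Return voices whose locale (and optionally gender) match, ignoring case."""
    wanted_language = language.lower()
    matches = [v for v in voices if (v.language or "").lower() == wanted_language]
    if gender is not None:
        wanted_gender = gender.lower()
        matches = [v for v in matches if (v.gender or "").lower() == wanted_gender]
    return matches
```

Because the function only reads the `language` and `gender` attributes, it should work equally well on the objects returned by `list_voices()`, assuming they expose the fields listed above.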

## Multi-Language Support

Camb.ai supports 140+ languages. Specify the language with a BCP-47 locale, and choose a voice that matches it:

```python
# French
tts = TTS(language="fr-fr", voice_id=...)

# Spanish
tts = TTS(language="es-es", voice_id=...)

# Japanese
tts = TTS(language="ja-jp", voice_id=...)
```

## Dynamic Options

Update TTS settings without recreating the instance:

```python
tts = TTS()

# Change voice
tts.update_options(voice_id=12345)

# Change speed and model
tts.update_options(speed=1.2, model="mars-8-flash")

# Add user instructions
tts.update_options(
    model="mars-8-instruct",
    user_instructions="Speak warmly and enthusiastically",
)
```

## Error Handling

The plugin reports failures using LiveKit's standard API exceptions:

```python
from livekit.agents import APIStatusError, APIConnectionError, APITimeoutError

try:
    stream = tts.synthesize("Hello!")
    audio = await stream.collect()
except APIStatusError as e:
    print(f"API error: {e.status_code} - {e.message}")
except APITimeoutError as e:
    # Catch the timeout first: APITimeoutError subclasses APIConnectionError
    print(f"Request timed out: {e}")
except APIConnectionError as e:
    print(f"Connection error: {e}")
```
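
Connection and timeout failures are often transient, so a caller may want to retry them. The following is a minimal sketch of a retry wrapper with exponential backoff, written against the standard library only. `with_retries` and its parameters are hypothetical, the default `retry_on` uses built-in exception types as placeholders, and whether the plugin already retries internally is not stated here.

```python
import asyncio


async def with_retries(
    make_call,
    retry_on=(ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 0.5,
):
    """Await make_call() up to `attempts` times, doubling the delay after each failure.

    Hypothetical helper for illustration; it is not part of the plugin.
    `make_call` is a zero-argument callable that returns a fresh awaitable.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(base_delay * 2**attempt)
```

With the plugin, this might be called as `await with_retries(lambda: tts.synthesize("Hello!").collect(), retry_on=(APIConnectionError, APITimeoutError))`. `APIStatusError` is left out of that tuple because a status error is not necessarily transient.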

## Future Features

Coming soon:
- GCP Vertex AI integration
- Voice cloning via custom voice creation
- Voice generation from text descriptions
- WebSocket streaming for real-time applications

## Links

- [Camb.ai Documentation](https://camb.mintlify.app/)
- [LiveKit Agents Documentation](https://docs.livekit.io/agents/)
- [GitHub Repository](https://github.com/livekit/agents)

## License

Apache License 2.0