aquarion.libs.libtts.kokoro

Kokoro TTS backend package.

Classes

`KokoroDeviceNames`(*values)	Kokoro TTS device names supported by this backend.
`KokoroLocales`(*values)	Voice locales supported by this backend.
`KokoroSettings`(*[, locale, voice, speed, ...])	Kokoro TTS backend settings.
`KokoroVoices`(*values)	Kokoro TTS voices supported by this backend.

class aquarion.libs.libtts.kokoro.KokoroDeviceNames(*values)

Bases: StrEnum

Kokoro TTS device names supported by this backend.

I.e. PyTorch device names.

cpu = 'cpu'

cuda = 'cuda'

class aquarion.libs.libtts.kokoro.KokoroLocales(*values)

Bases: StrEnum

Voice locales supported by this backend.

Each locale must also be supported by Kokoro itself.

en_GB = 'en_GB': British English (works with voices prefixed with bf_ or bm_)

en_US = 'en_US': American English (works with voices prefixed with af_ or am_)

fr_FR = 'fr_FR': French (works with voices prefixed with ff_ or fm_)
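
The prefix convention described for each locale can be checked mechanically. The following is a minimal sketch and not part of this package: `LOCALE_VOICE_PREFIXES` and `voice_matches_locale` are hypothetical names, and the mapping is simply copied from the locale descriptions above.

```python
# Hypothetical helper, not part of the backend.
# The mapping is copied from the locale descriptions above.
LOCALE_VOICE_PREFIXES = {
    "en_GB": ("bf_", "bm_"),
    "en_US": ("af_", "am_"),
    "fr_FR": ("ff_", "fm_"),
}


def voice_matches_locale(voice: str, locale: str) -> bool:
    """Return True if the voice name carries a prefix listed for the locale."""
    prefixes = LOCALE_VOICE_PREFIXES.get(locale)
    if prefixes is None:
        raise ValueError(f"unsupported locale: {locale!r}")
    # str.startswith accepts a tuple of candidate prefixes.
    return voice.startswith(prefixes)


print(voice_matches_locale("af_heart", "en_US"))  # True
print(voice_matches_locale("bf_emma", "en_US"))   # False
```

A check like this only flags a mismatch; whether the backend itself rejects or merely tolerates a mismatched voice is not stated here.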

class aquarion.libs.libtts.kokoro.KokoroSettings(*, locale: str = 'en_US', voice: KokoroVoices = KokoroVoices.af_heart, speed: float = 1.0, device: KokoroDeviceNames | None = None, repo_id: str = 'hexgrad/Kokoro-82M', model_path: FilePath | None = None, config_path: FilePath | None = None, voice_path: FilePath | None = None)

Bases: object

Kokoro TTS backend settings.

Note

To work in an offline or air-gapped environment, you must provide local paths for model_path, config_path, and voice_path.

config_path: Annotated[Path, PathType(path_type=file)] | None

Offline mode local file path to the Kokoro TTS config file.

This is only required for offline or air-gapped use; otherwise, files are downloaded and cached automatically.

Default:: None

Example

~/my_kokoro_tts_downloads/config.json

device: KokoroDeviceNames | None

The compute device to use to generate the speech.

I.e. to use the GPU or only the CPU.

device must be selected from KokoroDeviceNames or be None. If it is set to None, a GPU is used if one is present, with the CPU as the fallback.

Default:: None

Note

Kokoro TTS does not currently support integer GPU numbers, so if you have multiple GPUs, you will have to specify which one to use in some other way (e.g. environment variables).
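
One common way to pick a GPU through the environment is the CUDA runtime variable `CUDA_VISIBLE_DEVICES`. The sketch below assumes an NVIDIA/CUDA setup and that GPU index 1 is the desired one; both are illustrative assumptions, not requirements of this backend.

```python
import os

# Assumption: the second physical GPU (index 1) is the one to use.
# This must run before CUDA is initialized, so place it at the very top of
# the program, before importing torch or the Kokoro backend.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# From here on the process sees only that GPU, so a device setting of
# "cuda" (or None, which prefers a GPU when present) would resolve to it.
print(os.environ["CUDA_VISIBLE_DEVICES"])  # 1
```

The same effect can be had without code by exporting the variable in the shell that launches the program.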

property lang_code: str

The Kokoro TTS language code for the current locale.

E.g. a for American English, b for British English, f for French, etc.

This is not a setting; it is a derived property used by the Kokoro backend.
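
As an illustration of how such a derived value relates to the locale, here is a minimal sketch. `LANG_CODES` and `lang_code_for` are hypothetical stand-ins, not the backend's implementation, and only the three codes named above are included.

```python
# Hypothetical stand-in for the derived property.
# Only the three codes named in the description above are included.
LANG_CODES = {"en_US": "a", "en_GB": "b", "fr_FR": "f"}


def lang_code_for(locale: str) -> str:
    """Map a supported locale string to its single-letter language code."""
    try:
        return LANG_CODES[locale]
    except KeyError:
        raise ValueError(f"unsupported locale: {locale!r}") from None


print(lang_code_for("en_GB"))  # b
```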

locale: str

Used to help specify which language to speak.

locale influences pronunciation, inflections, etc. of the specified voice and must be one of the locales supported by this backend.

While locale must be a string to conform with the ITTSSettings interface, the valid / supported options for it are defined in KokoroLocales.

Default:: KokoroLocales.en_US

model_path: Annotated[Path, PathType(path_type=file)] | None

Offline mode local file path to the Kokoro TTS model file.

This is only required for offline or air-gapped use; otherwise, files are downloaded and cached automatically.

Default:: None

Example

~/my_kokoro_tts_downloads/kokoro-v1_0.pth

repo_id: str

The HuggingFace repository ID from which to download the Kokoro model.

This normally does not need to be changed, unless you have an alternative download location that works with the HuggingFace API.

Default:: hexgrad/Kokoro-82M

speed: float

The speed at which to speak.

Speech can be sped up or slowed down with this setting.

speed must be between 0.1 and 2.0, inclusive.

Default:: 1.0, i.e. normal speed.
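
The inclusive range for speed can be checked before building the settings. This is a minimal sketch: `validate_speed` is a hypothetical helper, and how the backend itself reports an out-of-range value is not stated here.

```python
def validate_speed(speed: float) -> float:
    """Return speed unchanged if it lies in the documented range [0.1, 2.0]."""
    if not 0.1 <= speed <= 2.0:
        raise ValueError(f"speed must be between 0.1 and 2.0, got {speed}")
    return speed


print(validate_speed(1.0))  # 1.0
print(validate_speed(2.0))  # 2.0 (the upper bound is allowed)
```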

to_dict() → dict[str, JSONSerializableTypes]

Export all settings as a dictionary of only JSON-serializable types.

Returns:: A dictionary where the keys are the setting names and the values are the setting values converted as necessary to simple base JSON-compatible types.

Example

{
    "locale": "en_US",
    "voice": "af_heart",
    "speed": 1.0,
    "device": "cuda",
    "repo_id": "hexgrad/Kokoro-82M",
    "model_path": "kokoro-v1_0.pth",
    "config_path": "config.json",
    "voice_path": "af_heart.pt",
}
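
Because the returned values are plain JSON-compatible types, the dictionary can be passed straight to the standard `json` module. The sketch below uses the example dictionary above as literal data instead of calling `to_dict()`, so it runs without the package installed.

```python
import json

# Literal copy of the example dictionary above, standing in for the result
# of to_dict().
settings_dict = {
    "locale": "en_US",
    "voice": "af_heart",
    "speed": 1.0,
    "device": "cuda",
    "repo_id": "hexgrad/Kokoro-82M",
    "model_path": "kokoro-v1_0.pth",
    "config_path": "config.json",
    "voice_path": "af_heart.pt",
}

# Serialize to text and read it back; the round trip preserves every value.
encoded = json.dumps(settings_dict, indent=2)
decoded = json.loads(encoded)
print(decoded == settings_dict)  # True
```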

voice: KokoroVoices

The voice in which to speak.

Voices are either male or female and are optimized for specific languages / dialects. voice must be selected from KokoroVoices.

For best results, use a voice that is optimized for the specified locale.

Default:: KokoroVoices.af_heart

voice_path: Annotated[Path, PathType(path_type=file)] | None

Offline mode local file path to the Kokoro TTS voice file.

This is only required for offline or air-gapped use; otherwise, files are downloaded and cached automatically.

If voice_path is not None, then the voice attribute is ignored.

Default:: None

Example

~/my_kokoro_tts_downloads/voices/af_heart.pt
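
Before running offline, it can help to confirm that all three files exist. The sketch below is an assumption-laden illustration: `missing_offline_files` is a hypothetical helper, the directory layout mirrors the example paths above, and `~` is expanded explicitly because it is not stated whether the settings class does so.

```python
from pathlib import Path


def missing_offline_files(base: Path) -> list[str]:
    """Return the names of the offline settings whose files are absent."""
    # Layout mirrors the example paths shown for each setting.
    candidates = {
        "model_path": base / "kokoro-v1_0.pth",
        "config_path": base / "config.json",
        "voice_path": base / "voices" / "af_heart.pt",
    }
    return [name for name, path in candidates.items() if not path.is_file()]


# Expand "~" explicitly so the check does not depend on shell expansion.
downloads = Path("~/my_kokoro_tts_downloads").expanduser()
missing = missing_offline_files(downloads)
if missing:
    print("Missing offline files:", ", ".join(missing))
```

An empty result means all three paths are ready to be passed as model_path, config_path, and voice_path.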

class aquarion.libs.libtts.kokoro.KokoroVoices(*values)

Bases: StrEnum

Kokoro TTS voices supported by this backend.

Voice grades and details can be found in VOICES.md.

af_bella = 'af_bella': American female voice, grade A- quality.

af_heart = 'af_heart': American female voice, grade A quality.

af_nicole = 'af_nicole': American female voice, grade B- quality.

am_fenrir = 'am_fenrir': American male voice, grade C+ quality.

am_michael = 'am_michael': American male voice, grade C+ quality.

am_puck = 'am_puck': American male voice, grade C+ quality.

bf_emma = 'bf_emma': British female voice, grade B- quality.

bm_fable = 'bm_fable': British male voice, grade C quality.

bm_george = 'bm_george': British male voice, grade C quality.

ff_siwis = 'ff_siwis': French female voice, grade B- quality.
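
The voice names above follow a visible pattern: the first letter matches the dialect and the second the gender. The sketch below decodes that pattern; `describe_voice` is a hypothetical helper, and the letter tables are inferred only from the voices listed here.

```python
# Letter tables inferred from the voice descriptions listed above.
REGIONS = {"a": "American", "b": "British", "f": "French"}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice: str) -> str:
    """Decode a voice name such as 'bm_george' into a readable description."""
    prefix, _, name = voice.partition("_")
    if len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice name: {voice!r}")
    region, gender = prefix
    if region not in REGIONS or gender not in GENDERS:
        raise ValueError(f"unknown voice prefix: {prefix!r}")
    return f"{REGIONS[region]} {GENDERS[gender]} voice '{name}'"


print(describe_voice("bm_george"))  # British male voice 'george'
```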