pretrained.vocoder.waveglow

Defines a pre-trained WaveGlow vocoder model.

This vocoder can be used with TTS models that output mel spectrograms to synthesize audio.

from pretrained.vocoder import pretrained_vocoder

vocoder = pretrained_vocoder("waveglow")

class pretrained.vocoder.waveglow.WaveGlowLoss(sigma: float = 1.0)[source]

Bases: Module

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(model_output: tuple[torch.Tensor, list[torch.Tensor], list[torch.Tensor]]) → Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the forward pass is defined in this method, call the Module instance itself rather than calling forward() directly: calling the instance runs the registered hooks, whereas calling forward() directly silently skips them.
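As a rough illustration of what a loss with this input signature might compute, the sketch below follows the negative log-likelihood from NVIDIA's reference WaveGlow implementation: a Gaussian prior term on the latent z, minus the summed coupling-layer log-scales, minus the summed log-determinants of the invertible 1x1 convolutions, averaged over the elements of z. That WaveGlowLoss matches this formula exactly is an assumption. The name waveglow_nll is hypothetical, and flat Python lists stand in for tensors.

```python
def waveglow_nll(z, log_s_list, log_det_w_list, sigma=1.0):
    """Hypothetical stand-in for the loss; not the library's own code.

    z:              flat list of latent values (stands in for the z tensor)
    log_s_list:     one flat list of log-scales per affine coupling layer
    log_det_w_list: one scalar log-determinant per invertible 1x1 convolution
    """
    # Gaussian prior term: sum(z^2) / (2 * sigma^2)
    prior_term = sum(v * v for v in z) / (2.0 * sigma * sigma)
    # Log-determinant contributions of the coupling layers and 1x1 convolutions
    log_s_total = sum(sum(log_s) for log_s in log_s_list)
    log_det_total = sum(log_det_w_list)
    # Average over the number of latent elements
    return (prior_term - log_s_total - log_det_total) / len(z)


example_loss = waveglow_nll(
    z=[1.0, -1.0, 2.0, 0.0],
    log_s_list=[[0.5, 0.5], [0.0]],
    log_det_w_list=[0.25, 0.75],
)
```

With sigma = 1.0 the prior term above is 6 / 2 = 3, each log-determinant total is 1, and the result is (3 - 1 - 1) / 4.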

class pretrained.vocoder.waveglow.Invertible1x1Conv(c: int)[source]

Bases: Module

weight_inv: Tensor

forward(z: Tensor) → tuple[torch.Tensor, torch.Tensor][source]

infer(z: Tensor) → Tensor[source]
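The signatures above suggest that forward returns the mixed signal together with a log-determinant, while infer applies the inverse mapping (the weight_inv attribute presumably caches the inverted weight). The sketch below illustrates that idea on a 2-channel toy example in plain Python. It is an assumption that the class behaves this way. All function names are hypothetical, and the batch dimension is omitted.

```python
import math


def matvec(w, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def det2(w):
    """Determinant of a 2x2 matrix."""
    return w[0][0] * w[1][1] - w[0][1] * w[1][0]


def inverse2(w):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    d = det2(w)
    return [[w[1][1] / d, -w[0][1] / d], [-w[1][0] / d, w[0][0] / d]]


def conv1x1_forward(w, z):
    """Mix channels at every time step; z is a list of per-step channel lists.

    Returns the mixed signal and the log-determinant of the mapping, which is
    log|det W| counted once per time step.
    """
    mixed = [matvec(w, step) for step in z]
    log_det = len(z) * math.log(abs(det2(w)))
    return mixed, log_det


def conv1x1_infer(w, z):
    """Undo the channel mixing by applying the inverse weight matrix."""
    w_inv = inverse2(w)
    return [matvec(w_inv, step) for step in z]


w = [[2.0, 0.0], [1.0, 1.0]]    # toy weight with determinant 2
z = [[1.0, 2.0], [3.0, -1.0]]   # two time steps, two channels
mixed, log_det = conv1x1_forward(w, z)
restored = conv1x1_infer(w, mixed)
```

Because the weight is a square invertible matrix shared across time steps, the inverse pass recovers the input exactly (up to floating-point error).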

class pretrained.vocoder.waveglow.WaveNetConfig(n_layers: int = 8, kernel_size: int = 3, n_channels: int = 512)[source]

Bases: object

n_layers: int = 8

kernel_size: int = 3

n_channels: int = 512

class pretrained.vocoder.waveglow.WaveNet(n_in_channels: int, n_mel_channels: int, config: WaveNetConfig, lora_rank: int | None = None)[source]

Bases: Module

forward(audio: Tensor, spect: Tensor) → Tensor[source]

class pretrained.vocoder.waveglow.WaveGlowConfig(n_mel_channels: int = 80, n_flows: int = 12, n_group: int = 8, n_early_every: int = 4, n_early_size: int = 2, sampling_rate: int = 22050, wavenet: pretrained.vocoder.waveglow.WaveNetConfig = <factory>, lora_rank: int | None = None)[source]

Bases: object

n_mel_channels: int = 80

n_flows: int = 12

n_group: int = 8

n_early_every: int = 4

n_early_size: int = 2

sampling_rate: int = 22050

wavenet: WaveNetConfig

lora_rank: int | None = None
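The `<factory>` default shown for wavenet indicates a dataclass field built with a default factory, so each config gets its own nested WaveNetConfig. The sketch below mirrors the documented fields with stand-in dataclasses (the *Sketch names are hypothetical, not the library's classes). It also shows how n_early_every and n_early_size would shrink the channel count passed through later flows, assuming the model follows NVIDIA's reference implementation, which emits n_early_size channels early after every n_early_every flows.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WaveNetConfigSketch:
    # Stand-in mirroring the documented WaveNetConfig defaults
    n_layers: int = 8
    kernel_size: int = 3
    n_channels: int = 512


@dataclass
class WaveGlowConfigSketch:
    # Stand-in mirroring the documented WaveGlowConfig defaults
    n_mel_channels: int = 80
    n_flows: int = 12
    n_group: int = 8
    n_early_every: int = 4
    n_early_size: int = 2
    sampling_rate: int = 22050
    wavenet: WaveNetConfigSketch = field(default_factory=WaveNetConfigSketch)
    lora_rank: Optional[int] = None


def remaining_channels_per_flow(cfg: WaveGlowConfigSketch) -> List[int]:
    """Channels still flowing through each flow step (assumed schedule)."""
    remaining = cfg.n_group
    schedule = []
    for k in range(cfg.n_flows):
        # After every n_early_every flows, n_early_size channels exit early
        if k > 0 and k % cfg.n_early_every == 0:
            remaining -= cfg.n_early_size
        schedule.append(remaining)
    return schedule


default_cfg = WaveGlowConfigSketch()
schedule = remaining_channels_per_flow(default_cfg)
```

Under these assumptions the default configuration processes 8 channels in the first four flows, 6 in the next four, and 4 in the last four.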

class pretrained.vocoder.waveglow.WaveGlow(config: WaveGlowConfig)[source]

Bases: Module

forward(forward_input: tuple[torch.Tensor, torch.Tensor]) → tuple[torch.Tensor, list[torch.Tensor], list[torch.Tensor]][source]

infer(spect: Tensor, sigma: float = 1.0) → Tensor[source]

remove_weight_norm() → None[source]

Removes weight normalization from all of the WaveGlow modules.

pretrained.vocoder.waveglow.pretrained_waveglow(*, fp16: bool = True, pretrained: bool = True, lora_rank: int | None = None) → WaveGlow[source]

Loads the pretrained WaveGlow model.

Reference: https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechSynthesis/Tacotron2/waveglow/entrypoints.py

Parameters:

fp16 – When True, returns a model with half-precision (float16) weights.
pretrained – When True, returns a model pre-trained on the LJ Speech dataset.
lora_rank – The LoRA rank to use, if LoRA is desired; None disables LoRA.

Returns:

The WaveGlow model