pretrained.vocoder.waveglow

Defines a pre-trained WaveGlow vocoder model.

This vocoder can be used with TTS models that output mel spectrograms to synthesize audio.

from pretrained.vocoder import pretrained_vocoder

vocoder = pretrained_vocoder("waveglow")

class pretrained.vocoder.waveglow.WaveGlowLoss(sigma: float = 1.0)

Bases: Module

Initializes internal Module state, shared by both nn.Module and ScriptModule.
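The signature alone does not say which quantity this loss computes. As a hedged sketch, assuming the class mirrors the NVIDIA reference implementation linked under pretrained_waveglow (an assumption this page does not confirm), the three entries of model_output are the latent z, the per-flow log-scales of the affine coupling layers, and the per-flow log-determinant terms of the invertible 1x1 convolutions. The loss would then be the negative log-likelihood of z under a zero-mean Gaussian with standard deviation sigma, averaged over the elements of z:

```latex
% Assumed form, following the NVIDIA reference implementation.
% z has shape (B, C, T); K is the number of flows.
% s_k is the k-th log-scale tensor (summed over all of its elements).
% \ell_k is the k-th log-determinant entry of model_output, which in the
% reference already includes the factor B \cdot T.
\mathcal{L}
  = \frac{1}{B\,C\,T}
    \left(
      \frac{\sum_{b,c,t} z_{b,c,t}^{2}}{2\sigma^{2}}
      - \sum_{k=1}^{K} \sum \log s_{k}
      - \sum_{k=1}^{K} \ell_{k}
    \right)
```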

forward(model_output: tuple[torch.Tensor, list[torch.Tensor], list[torch.Tensor]]) -> Tensor

Defines the computation performed at every call.

Note

As with any torch.nn.Module, call the module instance itself (for example, module(inputs)) rather than calling forward directly. Calling the instance runs any registered hooks, whereas calling forward directly silently skips them.
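The note above can be illustrated without PyTorch. The sketch below uses a hypothetical TinyModule, a simplified stand-in for the call protocol of torch.nn.Module (not the real class), to show why a forward hook fires when the instance is called but not when forward is called directly.

```python
class TinyModule:
    """Simplified stand-in for torch.nn.Module's call protocol (hypothetical)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Hooks are called as hook(module, inputs, output) after forward.
        self._forward_hooks.append(hook)

    def forward(self, x):
        return 2.0 * x

    def __call__(self, *args):
        # Calling the instance runs forward and then every registered hook.
        output = self.forward(*args)
        for hook in self._forward_hooks:
            hook(self, args, output)
        return output


seen = []
module = TinyModule()
module.register_forward_hook(lambda mod, inputs, output: seen.append(output))

module(3.0)          # runs forward, then the hook records the output
module.forward(3.0)  # runs forward only; the hook is never invoked

print(seen)  # only the first call was recorded
```

The real torch.nn.Module does more inside its call path (pre-hooks, for instance), but the direction of the advice is the same: prefer module(inputs).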

class pretrained.vocoder.waveglow.Invertible1x1Conv(c: int)

Bases: Module

weight_inv: Tensor

forward(z: Tensor) -> tuple[torch.Tensor, torch.Tensor]

infer(z: Tensor) -> Tensor
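For orientation, here is a hedged sketch of what forward and infer plausibly compute, assuming the layer follows the invertible 1x1 convolution of the NVIDIA reference implementation (this page does not document the behaviour). Under that assumption, forward mixes the c channels of z with a learned square matrix W and also returns the log-determinant term needed by the likelihood, while infer applies the inverse matrix, which the weight_inv attribute presumably caches.

```latex
% Assumed behaviour; W is a learned c x c matrix, z has shape (B, c, T).
\begin{aligned}
\text{forward:}\quad & z'_{b,:,t} = W\,z_{b,:,t},
  \qquad \ell = B\,T\,\log\lvert\det W\rvert \\
\text{infer:}\quad   & z_{b,:,t} = W^{-1}\,z'_{b,:,t}
\end{aligned}
```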

class pretrained.vocoder.waveglow.WaveNetConfig(n_layers: int = 8, kernel_size: int = 3, n_channels: int = 512)

Bases: object

n_layers: int = 8
kernel_size: int = 3
n_channels: int = 512

class pretrained.vocoder.waveglow.WaveNet(n_in_channels: int, n_mel_channels: int, config: WaveNetConfig, lora_rank: int | None = None)

Bases: Module

forward(audio: Tensor, spect: Tensor) -> Tensor

class pretrained.vocoder.waveglow.WaveGlowConfig(n_mel_channels: int = 80, n_flows: int = 12, n_group: int = 8, n_early_every: int = 4, n_early_size: int = 2, sampling_rate: int = 22050, wavenet: pretrained.vocoder.waveglow.WaveNetConfig = <factory>, lora_rank: int | None = None)

Bases: object

n_mel_channels: int = 80
n_flows: int = 12
n_group: int = 8
n_early_every: int = 4
n_early_size: int = 2
sampling_rate: int = 22050
wavenet: WaveNetConfig
lora_rank: int | None = None
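The field list above can be made concrete with a small sketch. The two dataclasses below are stand-in re-declarations built from the documented defaults, not the library's source. The <factory> default of wavenet is reproduced with default_factory. The helper channels_per_flow is hypothetical and assumes the early-output scheme of the NVIDIA reference implementation, in which n_early_size channels leave the flow every n_early_every flows.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WaveNetConfig:
    n_layers: int = 8
    kernel_size: int = 3
    n_channels: int = 512


@dataclass
class WaveGlowConfig:
    n_mel_channels: int = 80
    n_flows: int = 12
    n_group: int = 8
    n_early_every: int = 4
    n_early_size: int = 2
    sampling_rate: int = 22050
    # Dataclass instances are mutable, so the default goes through
    # default_factory; Sphinx renders such a default as <factory>.
    wavenet: WaveNetConfig = field(default_factory=WaveNetConfig)
    lora_rank: Optional[int] = None


def channels_per_flow(config: WaveGlowConfig) -> list:
    """Channel count seen by each flow (assumed NVIDIA early-output scheme)."""
    remaining = config.n_group
    channels = []
    for k in range(config.n_flows):
        # Every n_early_every flows (except the first), n_early_size channels
        # are emitted early and skip all later flows.
        if k > 0 and k % config.n_early_every == 0:
            remaining -= config.n_early_size
        channels.append(remaining)
    return channels


config = WaveGlowConfig()
print(channels_per_flow(config))  # [8, 8, 8, 8, 6, 6, 6, 6, 4, 4, 4, 4]
```

With the documented defaults, four channels would remain after the last flow under this assumption.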

class pretrained.vocoder.waveglow.WaveGlow(config: WaveGlowConfig)

Bases: Module

forward(forward_input: tuple[torch.Tensor, torch.Tensor]) -> tuple[torch.Tensor, list[torch.Tensor], list[torch.Tensor]]

infer(spect: Tensor, sigma: float = 1.0) -> Tensor
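To estimate how much audio infer returns for a given spectrogram, the arithmetic below may help. It is a sketch under stated assumptions: the hop length of 256 samples per mel frame is taken from the NVIDIA reference configuration and is not stated on this page, and the helper names are hypothetical. Only the sampling rate of 22050 Hz comes from WaveGlowConfig.

```python
def expected_audio_samples(n_frames: int, hop_length: int = 256) -> int:
    """Samples produced for n_frames mel frames (hop_length is an assumption)."""
    return n_frames * hop_length


def expected_audio_seconds(
    n_frames: int, hop_length: int = 256, sampling_rate: int = 22050
) -> float:
    """Duration in seconds of the synthesized audio under the same assumption."""
    return expected_audio_samples(n_frames, hop_length) / sampling_rate


# 100 mel frames at the assumed hop length of 256 samples per frame.
print(expected_audio_samples(100))            # 25600
print(round(expected_audio_seconds(100), 3))  # about 1.161 seconds
```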

remove_weight_norm() -> None

Removes weight normalization from all WaveGlow submodules.

pretrained.vocoder.waveglow.pretrained_waveglow(*, fp16: bool = True, pretrained: bool = True, lora_rank: int | None = None) -> WaveGlow

Loads the pretrained WaveGlow model.

Reference:

https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechSynthesis/Tacotron2/waveglow/entrypoints.py

Parameters:
  • fp16 – When True, returns a model with half-precision (float16) weights.

  • pretrained – When True, returns a model pre-trained on the LJ Speech dataset.

  • lora_rank – The LoRA rank to use, if LoRA is desired.

Returns:

The WaveGlow model
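A minimal usage sketch, built only from the signatures documented on this page. The helper name synthesize is hypothetical, and the layout of mel, a float32 tensor of shape (batch, n_mel_channels, frames), is an assumption, since the page does not state it. The imports are deferred into the function body so that the helper can be defined without loading the model.

```python
def synthesize(mel, *, sigma: float = 1.0):
    """Vocode a mel spectrogram with the pretrained WaveGlow model.

    `mel` is assumed to be a float32 tensor of shape
    (batch, n_mel_channels, frames); this layout is not confirmed by the docs.
    """
    # Imported lazily so that defining this helper needs neither package.
    import torch
    from pretrained.vocoder.waveglow import pretrained_waveglow

    # fp16=False keeps float32 weights, so `mel` does not need to be cast
    # to half precision.
    model = pretrained_waveglow(fp16=False, pretrained=True)
    model.eval()

    with torch.no_grad():  # inference only, no gradients needed
        return model.infer(mel, sigma=sigma)
```

With the default fp16=True the weights are float16, so the input would presumably need to be cast to half precision as well.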