torchsynth documentation

The fastest synth in the universe.

torchsynth is based on traditional modular synthesis and is written in PyTorch. It is GPU-optional and differentiable.

Most synthesizers are fast in terms of latency. torchsynth is fast in terms of throughput: on a single GPU it synthesizes audio 16200x faster than realtime (714 MHz, that is, roughly 714 million audio samples per second). This is of particular interest to audio ML researchers seeking large training corpora.
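The relationship between the realtime factor and the quoted 714 MHz figure can be checked with a little arithmetic. The sketch below assumes a 44.1 kHz sample rate, which is not stated in the text above but is consistent with both numbers; the function names are illustrative and are not part of torchsynth.

```python
# Assumption: 44.1 kHz sample rate. The text above gives only the
# 16200x realtime factor and the 714 MHz figure.
SAMPLE_RATE_HZ = 44_100
REALTIME_FACTOR = 16_200


def samples_per_second(realtime_factor: int, sample_rate_hz: int) -> int:
    """Audio samples synthesized per second of wall-clock time."""
    return realtime_factor * sample_rate_hz


def render_time_seconds(
    num_sounds: int, seconds_each: float, realtime_factor: int
) -> float:
    """Wall-clock seconds needed to render a corpus at the given speedup."""
    return num_sounds * seconds_each / realtime_factor


throughput = samples_per_second(REALTIME_FACTOR, SAMPLE_RATE_HZ)
print(f"{throughput / 1e6:.0f} MHz")  # 714 MHz

# Example corpus: 1024 sounds of 4 seconds each.
print(f"{render_time_seconds(1024, 4.0, REALTIME_FACTOR):.3f} s")  # 0.253 s
```

Under these assumptions, about 68 minutes of audio (1024 four-second sounds) would take roughly a quarter of a second to synthesize, ignoring any overhead outside synthesis itself.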

Additionally, all synthesized audio is returned together with the underlying latent parameters used to generate it. This is useful for multi-modal training regimes.
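To illustrate why pairing audio with its parameters is convenient, here is a minimal toy sketch using only the Python standard library. It is not torchsynth's API: `render_sine`, `render_batch`, the parameter names, and the sample rate are all hypothetical, and a single sine oscillator stands in for a full synthesizer.

```python
import math
import random

SAMPLE_RATE_HZ = 8_000  # hypothetical; kept low so the sketch stays small


def render_sine(frequency_hz: float, amplitude: float, duration_s: float = 0.01):
    """Render a sine tone as a list of float samples."""
    num_samples = round(SAMPLE_RATE_HZ * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE_HZ)
        for i in range(num_samples)
    ]


def render_batch(batch_size: int, seed: int):
    """Return (audio, params): each sound is paired with the parameters
    that produced it, so the pair can serve as a training example."""
    rng = random.Random(seed)  # seeded, so a batch can be regenerated exactly
    audio, params = [], []
    for _ in range(batch_size):
        p = {
            "frequency_hz": rng.uniform(100.0, 1000.0),
            "amplitude": rng.uniform(0.1, 1.0),
        }
        audio.append(render_sine(**p))
        params.append(p)
    return audio, params


audio, params = render_batch(batch_size=4, seed=0)
print(len(audio), len(audio[0]), sorted(params[0]))
```

Because the parameters are returned alongside the audio, a model can be trained to predict parameters from sound, or sound from parameters, without any separate labeling step.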

If you’d like to hear torchsynth, check out synth1K1, a dataset of 1024 4-second sounds rendered from the Voice synthesizer.

A second set of sounds was created with the Voice Drum Nebula. In torchsynth, a nebula is a set of hyperparameters that defines how synthesizer parameters are sampled. The hyperparameters of the Drum Nebula were hand-tuned to increase the likelihood of the Voice producing percussive samples.
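The idea of a nebula can be sketched in a few lines of plain Python. This is an illustration of the concept only, not torchsynth's implementation: `ParameterRange`, `sample_parameters`, the parameter names, and every numeric range below are hypothetical values chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterRange:
    """Hypothetical hyperparameters for sampling one synth parameter."""

    low: float
    high: float
    curve: float = 1.0  # 1.0 is uniform; values > 1 bias samples toward `low`

    def sample(self, rng: random.Random) -> float:
        u = rng.random()  # uniform in [0, 1)
        return self.low + (self.high - self.low) * u**self.curve


# A "nebula" here is just a mapping from parameter name to its range.
# All values are made up for illustration.
DEFAULT_NEBULA = {
    "attack_s": ParameterRange(0.0, 2.0),
    "decay_s": ParameterRange(0.0, 2.0),
}
DRUM_NEBULA = {
    "attack_s": ParameterRange(0.0, 0.05),  # fast attacks only
    "decay_s": ParameterRange(0.01, 0.5, curve=2.0),  # short decays favored
}


def sample_parameters(nebula: dict, seed: int) -> dict:
    """Draw one set of synth parameters from a nebula, reproducibly."""
    rng = random.Random(seed)
    return {name: r.sample(rng) for name, r in nebula.items()}


print(sample_parameters(DEFAULT_NEBULA, seed=0))
print(sample_parameters(DRUM_NEBULA, seed=0))
```

The synthesizer itself is unchanged between the two cases; only the sampling hyperparameters differ, which is what shifts the output toward short, percussive envelopes.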

Reproducibility

Contributing

© Copyright 2020-2021, Jordie Shier, Joseph Turian, Max Henry. Revision a2fdb489.