Last updated: Jun 18, 2023
Google Research has introduced MusicLM, a model that generates high-quality music from text descriptions.
The objective of MusicLM is to generate music that remains consistent over several minutes while staying faithful to the text description. The model outperforms previous systems in both audio quality and adherence to the text.
To support further research, the team has publicly released MusicCaps, a dataset of 5,500 music-text pairs for training and evaluating such models. MusicLM can also be conditioned on both text and a melody, generating music that follows the given melody in the style described by the text prompt.