Last updated: Jun 18, 2023
Google Research has introduced MusicLM, a model that generates high-quality music from text descriptions.
The objective of MusicLM is to generate music that remains consistent over several minutes while staying faithful to the text description. The model outperforms previous systems in both audio quality and adherence to the text.
To support further research, the team has publicly released MusicCaps, a dataset of 5,500 music-text pairs for training and evaluating such models. MusicLM can also be conditioned on both text and a melody, generating music that follows the given melody in the style described by the text prompt.