Introduction

For ease of comprehension, the staff notation shows only the melody.

Orange: Call. Blue: Response.

Example 1: Growth ("My Heart Will Go On", James Horner)

Growth Call (melody)
Call (melody and accompaniment)
Growth Response (melody)
Response (melody and accompaniment)

Example 2: Extension ("My Heart Will Go On", James Horner)

Extension Call (melody)
Call (melody and accompaniment)
Extension Response (melody)
Response (melody and accompaniment)

Example 3: Liquidation ("My Heart Will Go On", James Horner)

Liquidation Call (melody)
Call (melody and accompaniment)
Liquidation Response (melody)
Response (melody and accompaniment)

Example 4: Conversion ("Part of Your World", Alan Menken)

Conversion Call (melody)
Call (melody and accompaniment)
Conversion Response (melody)
Response (melody and accompaniment)

Example 5: Condensation ("The Sound of Silence", Paul Simon)

Condensation Call (melody)
Call (melody and accompaniment)
Condensation Response (melody)
Response (melody and accompaniment)



Experiments

Demo 1: Compare with human-composed music

Although human-composed music received higher ratings than the proposed model
for quality, groove, and coherence, the subjects perceived greater creativity in
the music generated by CRG. This suggests that CRG can be a valuable tool for
arrangement or style transfer, and that its ability to generate creative responses
could inspire and complement human creativity during the composition process.


Call Call (melody)
Call (melody and accompaniment)
Ground-truth Response Growth (melody)
Growth (melody and accompaniment)

CRG-composed Music

Growth Growth (melody)
Growth (melody and accompaniment)
Extension Extension (melody)
Extension (melody and accompaniment)
Liquidation Liquidation (melody)
Liquidation (melody and accompaniment)
Conversion Conversion (melody)
Conversion (melody and accompaniment)
Condensation Condensation (melody)
Condensation (melody and accompaniment)

Demo 2: Compare with SOTA

Both the subjective and the objective experiments indicate that our model produces outputs with higher
musical quality than the compared state-of-the-art models.



Call ("Pearl of the Orient" by Ta-yu Lo)
Ground-truth Response

Music-Transformer
CP-Transformer
HAT
CRG

Demo 3: Compare with ablated models

The results indicate that integrating a knowledge-enhanced mechanism and multiple training tasks
improves the performance of our proposed model.


Call ("Drowning Sorrows" by Buyi Mao)
Ground-truth Response

CRG-Base
CRG-K
CRG-KM1
CRG-KM2
CRG-KM12


Demo 4: Different knowledge types

Every type of knowledge contributes to improved performance. Notably, knowledge candidates that share similar composition attributes
appear to improve model performance more than the other knowledge types.


Call ("Sweet As Honey" by Teresa Teng)
Ground-truth Response

Random
Distinct
Similar
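
The three settings above differ only in how knowledge candidates are chosen for a given call. The sketch below illustrates one way such a selection could work; it is not the CRG implementation. It assumes each piece is summarized by a numeric vector of composition attributes and that similarity is measured by cosine similarity. The names `cosine`, `select_knowledge`, and the toy attribute vectors are all hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_knowledge(call_attrs, pool, k, mode, seed=0):
    """Pick k candidates from pool, a list of (name, attrs) pairs.

    mode is "random", "distinct" (least similar first),
    or "similar" (most similar first). All names are illustrative.
    """
    if mode not in ("random", "distinct", "similar"):
        raise ValueError(f"unknown mode: {mode}")
    k = min(k, len(pool))
    if mode == "random":
        # Seeded generator so the example is reproducible.
        return random.Random(seed).sample(pool, k)
    ranked = sorted(
        pool,
        key=lambda item: cosine(call_attrs, item[1]),
        reverse=(mode == "similar"),
    )
    return ranked[:k]


# Toy pool: attribute vectors are made up for illustration only.
pool = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
call = [1.0, 0.0]
similar = [name for name, _ in select_knowledge(call, pool, 2, "similar")]
distinct = [name for name, _ in select_knowledge(call, pool, 1, "distinct")]
```

With this toy pool, the "similar" setting ranks candidates by descending similarity to the call and the "distinct" setting by ascending similarity; the parameter `k` corresponds to the number of knowledge candidates varied in Demo 5.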

Demo 5: Different amounts of knowledge

Increasing the number of knowledge candidates reveals a trade-off between music diversity and
music quality.


Call ("Leave With Sorrow" by Yuanjie Li)
Ground-truth Response

k=1
k=1
k=3
k=3
k=5
k=5
k=8
k=8