Evaluation of Findings
This page presents a quantitative and a qualitative evaluation of DRSA, both of which indicate that the concepts it extracts are class-specific. The quantitative evaluation shows that subspaces are most effective when applied to explanations of their own genre; the qualitative evaluation shows that applying them across genres produces inferior explanations.
Quantitative Evaluation
To assess how class-distinctive the subspaces are, and to rule out the possibility that DRSA simply divides explanations by frequency band, an exhaustive cross-class evaluation was performed. Explanations of each target class c were decomposed using the subspaces of all 10 genres, not only the class's own subspaces U(c). In the resulting AUPC scores, the highest values consistently appear on the diagonal, indicating that the best disentanglement is obtained when explanations are decomposed with the subspaces of their true class.
Key findings: Hip-hop and classical achieve the best disentanglement across all classes. Genre similarities align with human intuition: metal responds strongly to blues and rock subspaces, while reggae achieves its worst score with classical subspaces. The rock class shows minimal improvement over standard explanations and similar scores for both rock and blues subspaces, reflecting known mislabeling issues in the GTZAN dataset. Taken together, these patterns indicate that DRSA learns meaningful, class-specific concepts rather than simple frequency-based divisions.
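The cross-class comparison described above can be sketched in a few lines. This is a minimal illustration, not the evaluation code: the helper names and the nested-dictionary layout are hypothetical, the numbers are synthetic placeholders rather than reported results, and a higher score is treated as better, following the wording of this section.

```python
# Hypothetical layout: scores[target][source] is the AUPC obtained when
# explanations of class `target` are decomposed with the subspaces of
# class `source`. Higher is treated as better, as in the text above.

def best_source_per_target(scores):
    """Map each target class to the source class with the highest score."""
    return {target: max(row, key=row.get) for target, row in scores.items()}


def diagonal_is_best(scores):
    """True if every target class scores highest with its own subspaces."""
    best = best_source_per_target(scores)
    return all(best[target] == target for target in scores)


# Synthetic placeholder numbers, NOT results from the evaluation.
toy_scores = {
    "jazz":   {"jazz": 0.9, "hiphop": 0.4, "rock": 0.5},
    "hiphop": {"jazz": 0.3, "hiphop": 0.8, "rock": 0.4},
    "rock":   {"jazz": 0.5, "hiphop": 0.4, "rock": 0.7},
}

print(best_source_per_target(toy_scores))
print(diagonal_is_best(toy_scores))  # True for these placeholder numbers
```

Applied to the full 10-genre score table, the same check would confirm whether each row peaks on the diagonal, and `best_source_per_target` would surface the rows where a neighbouring genre comes close or wins.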

Qualitative Evaluation
In this qualitative evaluation, explanations for jazz samples were disentangled with the subspaces extracted for hip-hop. For example, subspace 2 of hip-hop represents the kick drum (low frequencies). As shown in the figure below, this subspace fails to extract the low frequencies of jazz accurately, which indicates that subspaces respond not only to frequency bands but also to the higher-level dynamics and rhythmic structure of sound concepts. When inspecting this special case, it is recommended to also examine the "real" explanations of both genres, jazz and hip-hop, for comparison.
The figure shows snippets of two different jazz samples whose explanations were propagated through the hip-hop subspaces. The standard heatmap shows the local explanation obtained with standard LRP. The subspace heatmaps show the disentangled explanation components extracted with DRSA and LRP. Audio players for each explanation are provided in collapsible sections below the figure.
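To make "propagating an explanation through another genre's subspaces" more concrete, the sketch below splits a relevance score across subspaces. It assumes a formulation in which the relevance at a hidden layer can be written as the dot product of an activation vector a and a context vector c, and in which subspace k receives (U_k^T a)·(U_k^T c). It also assumes the subspaces together form a full orthogonal basis. All names, vectors, and bases here are hypothetical toy values, not the trained hip-hop or jazz subspaces.

```python
def dot(x, y):
    """Plain dot product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def householder(v):
    """Orthogonal matrix H = I - 2 v v^T / (v^T v), returned as a list of rows."""
    n = len(v)
    norm_sq = dot(v, v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)] for i in range(n)]


def subspace_relevances(U, groups, a, c):
    """Split the relevance a.c into one term per subspace.

    U is an orthogonal matrix whose columns are basis vectors; `groups`
    lists the column indices that belong to each subspace.
    """
    columns = [[row[j] for row in U] for j in range(len(U[0]))]
    return [sum(dot(columns[j], a) * dot(columns[j], c) for j in group)
            for group in groups]


# Toy activation and context vectors (assumed values for illustration).
a = [1.0, 2.0, 0.5, 0.0]
c = [0.3, -0.1, 0.8, 0.2]
groups = [[0, 1], [2, 3]]  # two subspaces of dimension 2

own_basis = householder([1.0, 0.0, 0.0, 0.0])    # stand-in for the class's own subspaces
other_basis = householder([1.0, 1.0, 1.0, 1.0])  # stand-in for another class's subspaces

own_parts = subspace_relevances(own_basis, groups, a, c)
other_parts = subspace_relevances(other_basis, groups, a, c)
print(own_parts, other_parts)
```

Under these assumptions the per-subspace terms always sum to the same total relevance a·c, whichever basis is used; only the way that total is divided changes. This is one way to read the qualitative result: borrowing another genre's subspaces does not alter the standard explanation, but it can distribute it over components that no longer correspond to coherent concepts.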

Extracted Concepts
The four concepts below are the subspaces the model learned for hip-hop, applied here to jazz explanations:
Hip-Hop Concept 1 on Jazz
Hip-hop vocal subspace applied to jazz - shows misalignment as jazz lacks rap vocals.
Hip-Hop Concept 2 on Jazz
Hip-hop kick drum subspace fails to extract jazz low frequencies, revealing class-specific rhythmic structures.
Hip-Hop Concept 3 on Jazz
Hip-hop rhythmic drum pattern subspace applied to jazz rhythm section - demonstrates structural mismatch.
Hip-Hop Concept 4 on Jazz
Hip-hop bass subspace on jazz bass line - highlights genre-specific low-frequency dynamics.
Audio Samples & Explanations
Click on each sample to expand and listen to the original audio, standard LRP explanation, and the four disentangled concept explanations.