Shades of Music

Project Work

Carried out by: Tim Langer

Supervisor: Dominikus Baur

Task Description

Visualizations of music collections often only consider the surface-level structure of a song, even though many songs also have a differentiated internal structure. This is particularly unfortunate for visualizations that try to depict song similarity, since two songs may well be similar to each other in one part but not at all in another.

Within this project, a system is to be developed that takes this inner structure of a song into account in its presentation: while a song is playing, other songs are displayed that are similar to the part currently being played. As a first indicator, the song lyrics are used (i.e. if one song contains the same sentence as another, the two are considered similar); later on, actual musical properties (similar tempo, similar rhythm, similar key) are also to be used for this similarity analysis.
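
A minimal sketch of this first, lyric-based notion of similarity, assuming plain-text lyrics are already available; the example lyrics and the normalization rules are purely illustrative:

    # Two songs count as similar (for a given part) if they share a
    # normalized lyric line. Example data and normalization are made up.
    import re

    def normalize(line):
        """Lower-case a lyric line and strip punctuation and extra whitespace."""
        return re.sub(r"[^a-z0-9 ]", "", line.lower()).strip()

    def shared_lines(lyrics_a, lyrics_b):
        """Return the set of normalized lines that occur in both songs."""
        lines_a = {normalize(l) for l in lyrics_a.splitlines() if l.strip()}
        lines_b = {normalize(l) for l in lyrics_b.splitlines() if l.strip()}
        return lines_a & lines_b

    song_a = "Hello darkness, my old friend\nI've come to talk with you again"
    song_b = "Hello darkness, my old friend\nA completely different second line"
    print(shared_lines(song_a, song_b))   # -> {'hello darkness my old friend'}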

Specific Tasks

  • Compile a bibliography of related/relevant scientific publications
  • Write detailed documentation in the Medieninformatik-Wiki
  • Design and implement a working prototype
  • Refine the work step by step
  • Write a report of at least 30 pages that describes the background, design, implementation and results and follows these guidelines (http://www.medien.ifi.lmu.de/lehre/arbeiten/richtlinien.xhtml)
  • Give a final presentation in the Oberseminar

Schedule

Start: 23.02.09

Month 1:

Week 1:
  • Tasks: familiarization, literature research
  • Result: at least 6 relevant (!) scientific (!!) papers

Week 2:
  • Tasks: literature research, start of the prototype implementation
  • Result: at least 12 relevant (!) scientific (!!) papers

Week 3:
  • Tasks: design
  • Result: design document (user interface, architecture)

Week 4:
  • Tasks: tests (audio analysis)
  • Result: findings

Month 2:
  • Implementation

Month 3:
  • Refinement
  • Written report
  • Final presentation

Design

Implementation

Echonest Tests
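
No test results are documented in this section yet. As a starting point, a rough sketch of how the planned audio-analysis tests could call The Echo Nest through its Python wrapper pyechonest; the API key is a placeholder and the exact attribute and field names (tempo, key, mode, sections, "start", "duration") are assumptions that need to be checked against the current pyechonest documentation:

    # Sketch only: analyze a local file with The Echo Nest and print global
    # attributes plus the returned structural sections. Attribute and field
    # names are assumptions, not verified against the live API.
    from pyechonest import config, track

    config.ECHO_NEST_API_KEY = "YOUR_API_KEY"   # placeholder

    t = track.track_from_filename("example_song.mp3")
    print("tempo: %s  key: %s  mode: %s" % (t.tempo, t.key, t.mode))

    for section in t.sections:   # assumed: list of dicts with times in seconds
        print("%.2f  %.2f" % (section["start"], section["duration"]))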

Literature

Provided:

Fujihara et al.: Hyperlinking Lyrics: A Method for Creating Hyperlinks Between Phrases in Song Lyrics

  • hyperlyrics: hyperlinked phrases/keywords of song lyrics
  • problem: automatic lyric recognition from audio material
  • hyperlinking (using some prepared text lyrics)
    • text --> text: keyword extraction (criteria: length & spread; roughly sketched below), estimate start & end time, hyperlink
    • text --> audio: extract the singing voice(s) / resynthesize, keyword (pronunciation) & garbage model, candidate detection & narrowing, hyperlink
  • only small test data
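
A rough, invented illustration of the "length & spread" keyword criterion mentioned above (thresholds and scoring are not the authors' method):

    # Keep words as hyperlink candidates if they are long enough and occur in
    # lyric lines that are sufficiently far apart ("spread").
    from collections import defaultdict

    def keyword_candidates(lyric_lines, min_length=5, min_spread=2):
        positions = defaultdict(list)
        for i, line in enumerate(lyric_lines):
            for word in line.lower().split():
                positions[word].append(i)
        return [w for w, pos in positions.items()
                if len(w) >= min_length and max(pos) - min(pos) >= min_spread]

    lines = ["the river flows", "down by the river", "far away", "the river again"]
    print(keyword_candidates(lines))   # -> ['river']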

--

Segmentation:

Masataka Goto: A chorus-section detecting method for musical audio signals

  • concentrates on chorus (repeating sections in general, many single ones might be a problem)
  • tests only with popular music
  • 80% correct finding of the chorus section

Matthew Cooper & Jonathan Foote: Summarizing popular music via structural similarity analysis

Namunu C Maddage, Changsheng Xu, Mohan S Kankanhalli, Xi Shao: Content-based music structure analysis with applications to music semantics understanding

S Abdallah, K Noland, M Sandler, M Casey, C Rhodes: Theory and evaluation of a Bayesian music structure extractor

  • general problems concerning segmentation
  • brief overview of segmentation methods (harmony, timbre, rhythm, pitch)
  • but: their own method is not really innovative

Jean-Julien Aucouturier, Mark Sandler: Segmentation of Musical Signals Using Hidden Markov Models.

  • textures: the composite “polyphonic timbre” resulting from instruments playing together
  • texture recognition using HMM (problem: number of states for the model; a stand-in sketch follows below)
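
Not the authors' implementation: a stand-in sketch of the same idea using the hmmlearn package on synthetic, MFCC-like frame features, where the number of states is exactly the open question noted above:

    # Fit a Gaussian HMM to per-frame timbre features and place segment
    # boundaries wherever the decoded state changes. Features are synthetic.
    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    rng = np.random.default_rng(0)
    features = np.vstack([rng.normal(0, 1, (100, 13)),    # "texture" A
                          rng.normal(3, 1, (100, 13))])   # "texture" B

    model = GaussianHMM(n_components=2, covariance_type="diag", n_iter=50)
    model.fit(features)
    states = model.predict(features)

    boundaries = np.nonzero(np.diff(states))[0] + 1   # frames where the state changes
    print(boundaries)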

Jean-Julien Aucouturier, Mark Sandler : Finding repeating patterns in acoustic musical signals

  • general application examples for music structuring:
    • retrieval & indexing
    • intelligent fast forward
    • audio thumbnailing (main characteristics)
  • (weird) graphical pattern detection

Batlle, Eloi ; Cano, Pedro: Automatic Segmentation for Music Classification using Competitive Hidden Markov Models

  • yet another HMM algorithm

J Ajmera, I McCowan, H Bourlard: Speech/music segmentation using entropy and dynamism features in a HMM classification framework

  • it is actually about discriminating speech from music and not segmenting music (or speech) --> remove

MA Bartsch, GH Wakefield: To catch a chorus: Using chroma-based representations for audio thumbnailing

  • again only popular music
  • Shepard (1960s): two distinct pitch attributes
    • tone height: general increase in pitch
    • chroma: cyclic, with octave periodicity
  • total of 12 distinct chroma classes (tests showed that this is sufficient without a great loss); see the small chroma sketch below
  • able to "encode" harmony within a song
  • evaluation showed superiority to MFCCs
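
A small illustrative sketch of the chroma idea (folding spectral energy into the 12 pitch classes and discarding tone height); not the paper's exact computation:

    # Map FFT bin magnitudes onto the 12 chroma classes.
    import numpy as np

    def chroma_from_spectrum(magnitudes, freqs, ref_freq=440.0):
        chroma = np.zeros(12)
        for mag, f in zip(magnitudes, freqs):
            if f <= 0:
                continue
            midi = 69 + 12 * np.log2(f / ref_freq)   # fractional MIDI note number
            chroma[int(round(midi)) % 12] += mag     # octave (tone height) folded away
        return chroma / (chroma.sum() or 1.0)

    freqs = np.array([220.0, 440.0, 880.0, 330.0])   # A3, A4, A5, E4
    mags = np.array([1.0, 1.0, 1.0, 0.5])
    print(chroma_from_spectrum(mags, freqs))         # the A pitch class dominates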

B Pardo, WP Birmingham: Automated partitioning of tonal music

  • MIDI input --> "piano-roll style display" & text output (chord-wise segmentation)
  • harmony-based partitioning - find the best partitioning points from all possible ones (whenever a note ends there is a point)

MM Goodwin, J Laroche: Audio segmentation by feature-space clustering using linear discriminant analysis and dynamic programming

  • extract a feature-space representation (sliding-window analysis); measures zero-crossing rate, spectral centroid, tilt, flux, ... (two of these are sketched below)
  • cluster sequences of the feature space using linear discriminant analysis (LDA)
  • dynamic programming (DP) used to set the optimal boundaries
  • various applications (speech/music discrimination, audio stream segmentation, music structure analysis)
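
Two of the listed frame features (zero-crossing rate and spectral centroid) in a simple sliding-window sketch; the test signal, frame size and hop size are arbitrary examples:

    # Compute per-frame zero-crossing rate and spectral centroid.
    import numpy as np

    def frame_features(signal, sr, frame=1024, hop=512):
        rows = []
        for start in range(0, len(signal) - frame, hop):
            x = signal[start:start + frame]
            zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2.0
            spec = np.abs(np.fft.rfft(x))
            freqs = np.fft.rfftfreq(frame, 1.0 / sr)
            centroid = (freqs * spec).sum() / (spec.sum() + 1e-12)
            rows.append((zcr, centroid))
        return np.array(rows)

    sr = 22050
    t = np.arange(sr) / sr                         # one second of audio
    feats = frame_features(np.sin(2 * np.pi * 440 * t), sr)
    print(feats.shape, feats[0])                   # (n_frames, 2)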

RB Dannenberg, N Hu: Pattern Discovery Techniques for Music Audio

  • basic concept influenced by human perception: repetition
  • good "related work" overview
  • overview of techniques:
    • monophonic pitch estimation (there is no chapter 4.2 !)
    • spectrum analysis (see Wakefield: chroma-based representation)
    • polyphonic transcription

T Zhang, CCJ Kuo: Heuristic approach for generic audio data segmentation and annotation

  • accuracy of more than 90%
  • measures energy function (amplitude variation), zero-crossing rate, fundamental frequency, spectral peak tracks
  • see Graphic
  • segment boundaries at every abrupt change of any of the measured values (a toy version is sketched below)
  • overlapping boundaries used? (not sure if that's right)
  • segment classification steps:
    • silence detection (silence = imperceptible audio)
    • separating sounds with music components
    • detect harmonic (environmental) sounds (fundamental frequency quite stable or not?)
    • distinguish pure music (average zero-crossing rate & fundamental frequency)
    • distinguish songs
    • distinguish pure speech
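
A toy version of the "boundary at every abrupt change" heuristic on a single feature track; threshold and data are invented:

    # Mark a segment boundary wherever the feature jumps by more than a threshold.
    import numpy as np

    def abrupt_changes(feature_track, threshold):
        jumps = np.abs(np.diff(feature_track))
        return np.nonzero(jumps > threshold)[0] + 1   # index just after each jump

    energy = np.array([0.1, 0.12, 0.11, 0.8, 0.82, 0.79, 0.2, 0.21])
    print(abrupt_changes(energy, threshold=0.3))      # -> [3 6]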

G Tzanetakis, P Cook: Marsyas: A framework for audio analysis

  • see chapter 3 (Segmentation)
  • feature vectors (no further information)
  • distance metric (Mahalanobis distance; see the sketch below)
  • derivative (?) & peaks
  • heuristic peak selection used for segmentation
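
An illustration of the Mahalanobis distance between successive feature vectors (using scipy), standing in for the distance step in the notes above; the feature vectors are random placeholders:

    # Distance between neighbouring frames; peaks in this curve would be
    # the candidate segment boundaries mentioned above.
    import numpy as np
    from scipy.spatial.distance import mahalanobis

    features = np.random.default_rng(1).normal(size=(200, 5))   # fake feature vectors
    inv_cov = np.linalg.inv(np.cov(features, rowvar=False))

    distances = [mahalanobis(features[i], features[i + 1], inv_cov)
                 for i in range(len(features) - 1)]
    print(max(distances))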

ALP Chen, M Chang, J Chen, JL Hsu, CH Hsu, SYS Hua: Query by Music Segments: An Efficient Approach for Song Retrieval (link might not work)

  • beat & pitch
  • four segment types defined (beat --> duration, pitch --> segment pitch (equals note number minus previous note number))
  • organisation of music segment sequences in suffix-trees to aid querying
  • approach: song retrieval in music databases

Lyrics:

B Logan, A Kositsky, P Moreno: Semantic Analysis of Song Lyrics

  • compares lyric-based similarity to acoustic-based similarity
  • Probabilistic Latent Semantic Analysis (PLSA): extract frequent words and map them to a topic (--> genre)
  • only artist similarity, not song similarity
  • genres: problems with reggae and electronic, superior with Latin music (table 5)

P Knees, M Schedl, G Widmer: Multiple lyrics alignment: Automatic retrieval of song lyrics

  • lyric web mining - merging different lyric versions (false lyrics, additional information added to the lyrics, ...)

Recommendation:

Social information filtering for music recommendation

A music recommendation system based on music data grouping and user interests

Music recommendation from song sets

Social information filtering: algorithms for automating “word of mouth”

Content-based, collaborative recommendation

Content-based recommendation systems

Toward the next generation of recommender systems: A survey of state- of-the-art and possible extensions

Personalization of user profiles for content-based music retrieval based on relevance feedback

General :

  • most research relies on the assumption that music exhibits structures such as verses, choruses, bridges and so on - what about music that does not fit such patterns?
  • boundary location is quite important

Notes:

  • test a longer DJ set vs. the songs it contains on their own
  • segment --> segment or segment --> song or both?

Web References

Provided:

The Echo Nest

LastFM

LastFM API

Musicbrainz

-- DominikusBaur - 28 Oct 2008