Music analysis and retrieval using large datasets of symbolic musical data has been hampered by the lack of an adequate, standardized format for symbolic music representation supported by commercial software tools. This gap makes it difficult to acquire and reuse either musical data or musical tools. The tools that are developed for music analysis research lack the technical underpinnings to scale up to commercial use in music information retrieval. The need to use databases to build collections of symbolic music information is well understood [7], but the technology has been lacking.
Building scalable database systems is a costly undertaking. It makes more sense for music applications to leverage the investment of other, better-funded application areas such as electronic commerce, as long as that technology is adequate, if not necessarily ideal, for the needs of musical applications.
XML has the potential to finally break through the database barrier, thanks to the efforts of the World Wide Web Consortium’s XML Query working group. The group’s mission is “to provide flexible query facilities to extract data from real and virtual documents on the Web, therefore finally providing the needed interaction between the web world and the database world. Ultimately, collections of XML files will be accessed like databases.” [18]
The current focus of the XML Query Working Group is the XQuery 1.0 language. Though this language is still a work in progress, available only in working draft form, there are already a dozen prototype implementations available for evaluation. These come from major relational database vendors like Oracle and Microsoft as well as from native XML database vendors like Software AG.
The combination of an XML language for music and an XML query language is not sufficient by itself to break through the database barrier for music information retrieval. The two languages must be able to work together to solve musical problems. Early XQuery working drafts had significant problems in this area, lacking powerful facilities to deal with queries that combine aspects of sequence and hierarchy. These shortcomings have been addressed in the XQuery 1.0 working draft of April 30, 2002, and we have now been able to build our first interesting musical queries using XQuery and MusicXML.
Given XQuery’s importance and scope, it is likely to be some time yet before the language definition is completed and issued as a W3C recommendation, and before commercial tools become available for effective development of XQuery applications. Fortunately, for research purposes, many analysis applications can be developed effectively today with existing tools: the XML Document Object Model (DOM) [17] and the XML Path Language 1.0 (XPath) [3].
Musical analysis is not just applicable in musicological research; it can also be useful in music publishing. For instance, as Recordare publishes its editions of classical art songs, it is helpful to show the range of each song. This process can be automated by a musical analysis program working on the MusicXML data. Figure 3 shows a screen shot from a program that generates a distribution graph of the pitch range for any particular part in a piece of music. Here we are computing the range for the voice part of the last song in Schumann’s Frauenliebe und Leben, Op. 42.

Figure 3: Pitch Range Distribution Analysis Program
Figure 4 shows the synopsis produced by clicking on the “Report” button. It focuses on the low and high notes.

Figure 4: Pitch Range Synopsis Report
The program that generates this synopsis report is easy to write using MusicXML data. For comparison, we will show two implementations. The first uses the DOM, programmed in Visual Basic 6.0 with Microsoft’s MSXML3 parser. The second is an equivalent program built with XQuery, using the QuiP 2.1.1 prototype from Software AG, which is based on the April 30, 2002 working draft of XQuery 1.0. QuiP and XQuery are both works in progress, so the syntax of a working program is likely to change by the time XQuery becomes a formal recommendation from the World Wide Web Consortium.
The DOM approach is implemented within a function that takes a MusicXML document and MusicXML part ID as input, and returns the dialog box string as output. After the initial variable declaration and initialization, the variable oNodes is assigned to all the <pitch> elements within the <part> specified by the PartID parameter. The selection is made using XPath 1.0 syntax.
The program then loops through each pitch, calling the MIDINote function to compute the MIDI note value from the different components of the <pitch> element. If the resulting pitch is lower or higher than any seen before, the spelling of the note is saved in a variable, using a separate SpellNote function on the same <pitch> element. The number of the measure containing the extreme pitch is also saved.
After all the pitches are searched, the program returns a string composed from the saved values for the lowest and highest MIDI pitches, along with their musical spellings and the measure where they were first encountered.
Function FindRange _
    (ThisXML As DOMDocument30, _
    ByVal PartID As String) As String
    Dim oRoot As IXMLDOMElement    ' Root of XML document
    Dim oNodes As IXMLDOMNodeList  ' Pitches to analyze
    Dim oElement As IXMLDOMElement ' Current pitch
    Dim oMeasure As IXMLDOMElement ' Parent measure
    Dim lPitch As Long             ' Current pitch
    Dim lMinPitch As Long          ' Lowest MIDI pitch
    Dim sMinPitch As String        ' Spelling of low pitch
    Dim lMaxPitch As Long          ' Highest MIDI pitch
    Dim sMaxPitch As String        ' Spelling of high pitch
    Dim sMinMeasure As String      ' Measure for low pitch
    Dim sMaxMeasure As String      ' Measure for high pitch
    ' Start outside the MIDI range (0-127) so that the
    ' first pitch found replaces both extremes.
    lMinPitch = 128
    lMaxPitch = -1
    ' MSXML3 defaults to XSLPattern; the ancestor:: axis
    ' used below requires XPath 1.0.
    ThisXML.setProperty "SelectionLanguage", "XPath"
    Set oRoot = ThisXML.documentElement
    Set oNodes = _
        oRoot.selectNodes( _
        "//part[@id='" & PartID & "']//pitch")
    ' Search each pitch for the lowest and highest
    ' values, saving the spelling and measure number.
    Do
        Set oElement = oNodes.nextNode
        If oElement Is Nothing Then Exit Do
        lPitch = MIDINote(oElement)
        If lPitch < lMinPitch Then
            lMinPitch = lPitch
            sMinPitch = SpellNote(oElement)
            Set oMeasure = _
                oElement.selectSingleNode _
                ("ancestor::measure")
            sMinMeasure = _
                oMeasure.getAttribute("number")
        End If
        If lPitch > lMaxPitch Then
            lMaxPitch = lPitch
            sMaxPitch = SpellNote(oElement)
            Set oMeasure = _
                oElement.selectSingleNode _
                ("ancestor::measure")
            sMaxMeasure = _
                oMeasure.getAttribute("number")
        End If
    Loop
    FindRange = "Lowest note is " & sMinPitch & _
        " (MIDI " & lMinPitch & _
        ") in measure " & sMinMeasure & vbCrLf & _
        "Highest note is " & sMaxPitch & _
        " (MIDI " & lMaxPitch & _
        ") in measure " & sMaxMeasure
End Function
The MIDINote function builds the MIDI note number by reading the <octave>, <step>, and <alter> elements in turn. The CLng function called here converts the text of each XML element into a 32-bit integer (the Long type in Visual Basic 6.0).
' Return MIDI note value from a MusicXML pitch
' element, ignoring microtones.
Function MIDINote _
    (ThisPitch As IXMLDOMElement) As Long
    Dim oElement As MSXML2.IXMLDOMElement
    Dim lTemp As Long ' Temporary pitch
    ' Get octave
    Set oElement = _
        ThisPitch.selectSingleNode("octave")
    lTemp = 12 * (CLng(oElement.Text) + 1)
    ' Get pitch step
    Set oElement = _
        ThisPitch.selectSingleNode("step")
    Select Case oElement.Text
        Case "a", "A": lTemp = lTemp + 9
        Case "b", "B": lTemp = lTemp + 11
        Case "c", "C": lTemp = lTemp + 0
        Case "d", "D": lTemp = lTemp + 2
        Case "e", "E": lTemp = lTemp + 4
        Case "f", "F": lTemp = lTemp + 5
        Case "g", "G": lTemp = lTemp + 7
    End Select
    ' Get alteration if any
    Set oElement = _
        ThisPitch.selectSingleNode("alter")
    If Not oElement Is Nothing Then
        lTemp = lTemp + CLng(oElement.Text)
    End If
    ' Assign and exit
    MIDINote = lTemp
End Function
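The arithmetic is easy to check by hand. As a minimal sketch, assuming Python and its standard xml.etree.ElementTree module rather than Visual Basic and MSXML, the same calculation might look like the following. The names midi_note and STEP_OFFSETS are hypothetical and are not part of the program described here. Fractional <alter> values are rounded to a whole number of semitones.

```python
import xml.etree.ElementTree as ET

# Semitone offset of each pitch step above C (hypothetical helper table).
STEP_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_note(pitch):
    """Return the MIDI note number for a MusicXML <pitch> element."""
    octave = int(pitch.findtext("octave"))
    step = pitch.findtext("step").strip().upper()
    # <alter> is optional; a missing element means no alteration.
    alter = round(float(pitch.findtext("alter", default="0")))
    # Octave 4 starts at MIDI note 60 (middle C), hence the "+ 1".
    return 12 * (octave + 1) + STEP_OFFSETS[step] + alter


c_sharp_4 = ET.fromstring(
    "<pitch><step>C</step><alter>1</alter><octave>4</octave></pitch>")
d_5 = ET.fromstring("<pitch><step>D</step><octave>5</octave></pitch>")
print(midi_note(c_sharp_4))  # 12 * 5 + 0 + 1 = 61
print(midi_note(d_5))        # 12 * 6 + 2 + 0 = 74
```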
The SpellNote function is even more straightforward, as the only conversion that needs to be done is to go from the numeric <alter> value to a text symbol for the sharps and flats in the note spelling.
' Spell the pitch as a string, e.g. "C#4"
Function SpellNote _
    (ThisPitch As IXMLDOMElement) As String
    Dim oElement As IXMLDOMElement
    Dim sSpell As String ' Temporary string
    Dim sAlter As String ' Alteration string
    ' Get pitch step
    Set oElement = _
        ThisPitch.selectSingleNode("step")
    sSpell = UCase$(oElement.Text)
    ' Get alteration if any
    Set oElement = _
        ThisPitch.selectSingleNode("alter")
    If Not oElement Is Nothing Then
        Select Case CLng(oElement.Text)
            Case -2: sAlter = "bb"
            Case -1: sAlter = "b"
            Case 0: sAlter = vbNullString
            Case 1: sAlter = "#"
            Case 2: sAlter = "##"
            Case Else
                sAlter = "(" & oElement.Text & ")"
        End Select
        sSpell = sSpell & sAlter
    End If
    ' Get octave
    Set oElement = _
        ThisPitch.selectSingleNode("octave")
    sSpell = sSpell & oElement.Text
    ' Assign and exit
    SpellNote = sSpell
End Function
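For readers without Visual Basic 6.0 or MSXML at hand, the overall DOM approach can be approximated with Python's standard xml.etree.ElementTree module. This is an illustrative sketch under stated assumptions, not the program described in this paper: the names find_range, midi_and_spelling, and SAMPLE are hypothetical, the input is assumed to be a partwise MusicXML document, and the code walks the <measure> elements directly because ElementTree's limited XPath support has no ancestor:: axis.

```python
import xml.etree.ElementTree as ET

STEP_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ALTER_SYMBOLS = {-2: "bb", -1: "b", 0: "", 1: "#", 2: "##"}


def midi_and_spelling(pitch):
    """Return (MIDI number, spelling such as 'C#4') for a <pitch> element."""
    step = pitch.findtext("step").strip().upper()
    octave = int(pitch.findtext("octave"))
    alter = round(float(pitch.findtext("alter", default="0")))
    spelling = step + ALTER_SYMBOLS.get(alter, "(%d)" % alter) + str(octave)
    return 12 * (octave + 1) + STEP_OFFSETS[step] + alter, spelling


def find_range(root, part_id):
    """Return (low, high) for one part, each a (midi, spelling, measure)
    tuple, or None if the part contains no pitches."""
    low = high = None
    for part in root.iter("part"):
        if part.get("id") != part_id:
            continue
        # Iterating measure by measure keeps the measure number at hand.
        for measure in part.findall("measure"):
            for pitch in measure.iter("pitch"):
                midi, spelling = midi_and_spelling(pitch)
                found = (midi, spelling, measure.get("number"))
                # Strict comparisons keep the first occurrence of each extreme.
                if low is None or midi < low[0]:
                    low = found
                if high is None or midi > high[0]:
                    high = found
    return None if low is None else (low, high)


# Hypothetical two-measure test document.
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>E</step><octave>4</octave></pitch></note>
      <note><pitch><step>C</step><alter>1</alter><octave>4</octave></pitch></note>
    </measure>
    <measure number="2">
      <note><pitch><step>D</step><octave>5</octave></pitch></note>
      <note><rest/></note>
    </measure>
  </part>
</score-partwise>"""

result = find_range(ET.fromstring(SAMPLE), "P1")
print(result)  # ((61, 'C#4', '1'), (74, 'D5', '2'))
```

Returning tuples rather than a formatted string leaves the presentation to the caller, which may be convenient when the same analysis feeds both a report and a graph.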
Our XQuery implementation follows a similar approach to the DOM implementation. Since QuiP is a standalone prototype tool for learning XQuery, we have hardcoded the file name and part ID that were parameterized in the DOM example.
This example takes a very simple approach to the query, reviewing all the pitches twice in order to locate the minimum and maximum values. Once we have these values, we find the pitch elements whose MIDI note values match the high and low values. XQuery results are returned in XML format, so we do not need a SpellNote function. We simply output the first <pitch> element that matches each of the extreme values, and then find the number of the measure that contains it. Note that the MIDINote function in this query recognizes only <alter> values of 1 and -1, treating any other value as 0.
XQuery builds on XPath 2.0, and the working draft does not support the ancestor:: axis, so our query assumes the <measure> element is the grandparent of the <pitch> element. Therefore this query will only work with partwise MusicXML files, not timewise files. We have revised the syntax slightly to better match the XQuery working draft, using the string function where QuiP 2.1.1 used the string-value function. [See updated examples for the November 15, 2002 XQuery working draft, published after this paper was presented.]
define function MIDINote(element $thispitch) returns integer
{
  let $step := $thispitch/step
  let $alter :=
    if (empty($thispitch/alter)) then 0
    else if (string($thispitch/alter) = "1") then 1
    else if (string($thispitch/alter) = "-1") then -1
    else 0
  let $octave :=
    integer(string($thispitch/octave))
  let $pitchstep :=
    if (string($step) = "C") then 0
    else if (string($step) = "D") then 2
    else if (string($step) = "E") then 4
    else if (string($step) = "F") then 5
    else if (string($step) = "G") then 7
    else if (string($step) = "A") then 9
    else if (string($step) = "B") then 11
    else 0
  return 12 * ($octave + 1) + $pitchstep + $alter
}
let $doc := document("MusicXML/Frauenliebe8.xml")
let $part := $doc//part[./@id = "P1"]
let $highnote :=
  max(for $pitch in $part//pitch
      return MIDINote($pitch))
let $lownote :=
  min(for $pitch in $part//pitch
      return MIDINote($pitch))
let $highpitch :=
  $part//pitch[MIDINote(.) = $highnote]
let $lowpitch :=
  $part//pitch[MIDINote(.) = $lownote]
let $highmeas :=
  string($highpitch[1]/../../@number)
let $lowmeas :=
  string($lowpitch[1]/../../@number)
return
  <result>
    <low-note>{$lowpitch[1]}
      <measure>{$lowmeas}</measure>
    </low-note>
    <high-note>{$highpitch[1]}
      <measure>{$highmeas}</measure>
    </high-note>
  </result>
This query returns the following result in XML:
<?xml version="1.0"?>
<result>
  <low-note>
    <pitch>
      <step>C</step>
      <alter>1</alter>
      <octave>4</octave>
    </pitch>
    <measure>16</measure>
  </low-note>
  <high-note>
    <pitch>
      <step>D</step>
      <octave>5</octave>
    </pitch>
    <measure>12</measure>
  </high-note>
</result>
Melody retrieval provides a more typical XQuery example, using a FLWR (for-let-where-return) expression. Here we are looking for instances of the Frere Jacques theme in the key of C. We simplify this query to look only for the pitch step sequence C, D, E, C. This query also assumes a partwise MusicXML file. It will match instances of the pitch sequence that cross <measure> boundaries, but will not match across <part> boundaries:
<result>
{let $doc :=
   document("MusicXML/frere-jacques.xml")
 let $notes := $doc//note
 for $note1 in
       $notes[string(./pitch/step) = "C"],
     $note2 in $notes[. follows $note1][1],
     $note3 in $notes[. follows $note2][1],
     $note4 in $notes[. follows $note3][1]
 let $meas1 := $note1/..
 let $part1 := $meas1/..
 let $part2 := $note2/../..
 let $part3 := $note3/../..
 let $part4 := $note4/../..
 where string($note2/pitch/step) = "D"
   and string($note3/pitch/step) = "E"
   and string($note4/pitch/step) = "C"
   and (string($part1/@id) =
        string($part2/@id))
   and (string($part2/@id) =
        string($part3/@id))
   and (string($part3/@id) =
        string($part4/@id))
 return
   <motif>
     {$note1/pitch} {$note2/pitch}
     {$note3/pitch} {$note4/pitch}
     <measure>{$meas1/@number}</measure>
     <part>{$part1/@id}</part>
   </motif>
}
</result>
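The same matching rule (four consecutive <note> elements within one part, allowed to span measures) can also be sketched procedurally. The following is a minimal illustration in Python using the standard xml.etree.ElementTree module, assuming a partwise file. The function name find_step_motif and the SAMPLE document are hypothetical and are not taken from the Frere Jacques test file.

```python
import xml.etree.ElementTree as ET


def find_step_motif(root, steps):
    """Yield (part id, measure number, notes) for each run of consecutive
    <note> elements in a part whose pitch steps equal `steps`."""
    for part in root.iter("part"):
        # Flatten the part into (measure number, note) pairs so that a
        # match may cross <measure> boundaries but never <part> boundaries.
        notes = [(measure.get("number"), note)
                 for measure in part.findall("measure")
                 for note in measure.findall("note")]
        # Rests have no <pitch>, so findtext returns None and cannot match.
        names = [note.findtext("pitch/step") for _, note in notes]
        for i in range(len(names) - len(steps) + 1):
            if names[i:i + len(steps)] == list(steps):
                window = [note for _, note in notes[i:i + len(steps)]]
                yield part.get("id"), notes[i][0], window


# Hypothetical document: the motif in P1 spans measures 1 and 2. The final
# C of P1 followed by the D, E, C that open P2 must not count as a match.
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>5</octave></pitch></note>
      <note><pitch><step>D</step><octave>5</octave></pitch></note>
      <note><pitch><step>E</step><octave>5</octave></pitch></note>
    </measure>
    <measure number="2">
      <note><pitch><step>C</step><octave>5</octave></pitch></note>
      <note><pitch><step>C</step><octave>5</octave></pitch></note>
    </measure>
  </part>
  <part id="P2">
    <measure number="1">
      <note><pitch><step>D</step><octave>4</octave></pitch></note>
      <note><pitch><step>E</step><octave>4</octave></pitch></note>
      <note><pitch><step>C</step><octave>4</octave></pitch></note>
      <note><rest/></note>
    </measure>
  </part>
</score-partwise>"""

matches = [(part_id, number) for part_id, number, _ in
           find_step_motif(ET.fromstring(SAMPLE), ["C", "D", "E", "C"])]
print(matches)  # [('P1', '1')]
```

As in the query, the measure reported is the one containing the first note of the motif.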
When run against a simple three-part round of Frere Jacques prepared in Finale and exported to MusicXML, the query returns six instances of the motif, the first of which is shown below:
<?xml version="1.0"?>
<result>
  <motif>
    <pitch>
      <step>C</step>
      <octave>5</octave>
    </pitch>
    <pitch>
      <step>D</step>
      <octave>5</octave>
    </pitch>
    <pitch>
      <step>E</step>
      <octave>5</octave>
    </pitch>
    <pitch>
      <step>C</step>
      <octave>5</octave>
    </pitch>
    <measure number="1" />
    <part id="P1" />
  </motif>
  <!-- Remaining 5 motifs removed for brevity -->
</result>