Content Classification
Last updated
Content Classification is an SDK feature that analyzes the entire query file to determine whether its segments contain MUSIC, SPEECH, or SILENCE. It works by creating a classification fingerprint - similar to an audio or melody fingerprint - that enables the system to label each detected segment and provide a confidence score. This feature can be run on its own or combined with audio/melody fingerprint searches in a single request.
Audio and melody/phonetic matching are designed to identify a piece of media by comparing its fingerprint to Pex’s database of known assets. This results in matches tied to specific tracks or works.
Content Classification, by contrast, does not attempt to identify an asset. Instead, it analyzes the audio content itself to label what kind of sound is present - even if that content is not in Pex’s database. This makes classification useful for understanding the nature of the audio regardless of whether a commercial match is found.
For example, if classification shows a MUSIC segment but there’s no identification match, that music could be non-commercial or from a sound library.
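One way to act on this is to subtract the time ranges covered by identification matches from a classified MUSIC segment, leaving the stretches of music that were not identified. The sketch below is illustrative only: the `unmatched_ranges` helper is hypothetical, and it assumes the matched ranges have already been pulled out of the identification results as `(start, end)` pairs in seconds, since the shape of those results is not described here.

```python
def unmatched_ranges(segment, matched):
    """Return the parts of `segment` not covered by any range in `matched`.

    `segment` is a (start, end) pair in seconds for one MUSIC segment.
    `matched` is a list of (start, end) pairs in seconds; this input format
    is an assumption made for illustration, not the SDK's match format.
    """
    start, end = segment
    gaps = []
    cursor = start  # everything before the cursor is already accounted for
    for m_start, m_end in sorted(matched):
        if m_end <= cursor:
            continue  # match ends before the uncovered region starts
        if m_start >= end:
            break  # remaining matches lie entirely after the segment
        if m_start > cursor:
            gaps.append((cursor, m_start))
        cursor = max(cursor, m_end)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


# A MUSIC segment from 2 s to 230 s with one identified stretch from 50 s to 120 s.
gaps = unmatched_ranges((2, 230), [(50, 120)])
```

Here `gaps` holds the two stretches of music on either side of the identified range. Those are the candidates for non-commercial or library music.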
Currently, Content Classification returns one or more of the following top-level categories:
MUSIC – Any type of music, instrumental or with vocals.
SPEECH – Spoken voice segments.
SILENCE – Periods with no significant audio.
Each category may include additional subclasses that provide more detail (e.g., instrumental, vocal style, or genre indicators).
A Content Classification response includes each detected category, the time range of each segment in seconds, a confidence score (0–100), and any subclasses with their own confidence values. Example:
"content_classification": {
  "silence": [
    {
      "start": 230,
      "end": 240,
      "confidence": 100.0,
      "subclasses": []
    }
  ],
  "music": [
    {
      "start": 2,
      "end": 230,
      "confidence": 98.4,
      "subclasses": [
        { "name": "singing", "confidence": 82.1 },
        { "name": "happy music", "confidence": 37.2 },
        { "name": "pop music", "confidence": 35.2 }
      ]
    }
  ],
  "speech": []
}
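A response like the one above could be summarized with a few lines of Python. This is a minimal sketch, not part of the SDK: the `summarize` helper and the 50.0 confidence threshold are illustrative assumptions, and the example fragment is wrapped in an outer pair of braces so that it parses as a standalone JSON document.

```python
import json

# The example fragment from above, wrapped in braces so it is valid JSON.
RESPONSE = json.loads("""
{
  "content_classification": {
    "silence": [
      {"start": 230, "end": 240, "confidence": 100.0, "subclasses": []}
    ],
    "music": [
      {"start": 2, "end": 230, "confidence": 98.4, "subclasses": [
        {"name": "singing", "confidence": 82.1},
        {"name": "happy music", "confidence": 37.2},
        {"name": "pop music", "confidence": 35.2}
      ]}
    ],
    "speech": []
  }
}
""")


def summarize(classification, min_confidence=50.0):
    """Total seconds per category, plus subclass names that pass the threshold.

    `min_confidence` is an arbitrary cut-off chosen for this sketch; segments
    and subclasses scoring below it are ignored.
    """
    summary = {}
    for category, segments in classification.items():
        seconds = 0
        names = []
        for seg in segments:
            if seg["confidence"] < min_confidence:
                continue
            seconds += seg["end"] - seg["start"]
            names.extend(
                sub["name"]
                for sub in seg["subclasses"]
                if sub["confidence"] >= min_confidence
            )
        summary[category] = {"seconds": seconds, "subclasses": names}
    return summary


summary = summarize(RESPONSE["content_classification"])
```

A category with no detected segments, such as `speech` here, still appears in the summary with zero seconds, which keeps downstream code from having to special-case missing keys.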