Oscilla Audio Object System

Overview

The Audio Object system provides a browser-native surface for placing, configuring, and triggering audio elements on the score. Audio objects are authored live through a GUI editor and rendered as SVG elements that participate fully in the standard DSL cue pipeline.

Core principle: Audio objects are serialized to DSL strings and injected into the score SVG as <rect> elements. The existing pipeline -- parseCueToAST, registerSingleCue, primeWaveform, checkCueTriggers, handleAudioCue -- handles everything from there. There is no separate rendering or trigger path.


Key Terms

Term          Definition
audioObject   A placed element on the contribution surface with audio trigger config
rect          The SVG <rect> element representing an audio object on the score
trigger       The nested config object defining source, type, playback, and impulse params
DSL           The serialized string form of a trigger config, set as the rect's id attribute
extent        Horizontal width of the audio object region in SVG world units
voice         A single active audio playback instance in the activeAudioCues registry
uid           Unique identifier for a voice: trigger-{id} for single, trigger-{id}-{ts} for pool

Architecture

Signal Flow

User places audioObject via GUI editor
        |
        v
  audioObjectState.addAudioObject(item)      JSON config stored on server (audio-objects.json)
        |
        v
  triggerRender() -> renderAll()
        |
        v
  renderAudioObjectsSVG()                     (audioObjectSVG.js)
        |
        +---> triggerConfigToDSL(item)        JSON config -> DSL string
        +---> createElementNS("rect")         SVG rect with DSL as id
        +---> registerSingleCue(rect)         (cueDispatcher.js)
                    |
                    +---> parseCueToAST(dsl)  DSL -> AST
                    +---> primeWaveform()     SVG polyline waveform
                    +---> primeOverlay()      HTML text overlay
                    +---> window.cues.push()  registered for playhead trigger
        
Playhead crosses rect
        |
        v
  checkCueTriggers()                          (cueDispatcher.js, called by RAF.js)
        |
        v
  handleCueTrigger(ast)
        |
        v
  handleAudioCue(ast)                         (js/cues/audio/audioFile.js)
  handleAudioImpulseCue(ast)                  (js/cues/audio/audioImpulse.js)
        |
        v
  oscilla:audio "play" / "stop"               (CustomEvent on window)

Module Dependency Graph

interactionSurface.js
    |
    +---> audioObjectState.js      (CRUD, persistence, state)
    +---> audioObjectEditor.js     (editor dialog UI)
    +---> audioObjectSVG.js        (SVG rendering, drag, registration)
    |         |
    |         +---> configToDSL.js          (JSON config -> DSL serializer)
    |         +---> cueDispatcher.js        (registerSingleCue, unregister, update)
    |                   |
    |                   +---> parseCueToAST           (DSL -> AST)
    |                   +---> primeWaveform            (SVG polyline waveform)
    |                   +---> primeAudioOverlay        (HTML text overlay)
    |                   +---> checkCueTriggers         (playhead crossing)
    |                   +---> handleAudioCue           (audio playback engine)
    |
    +---> trigger.js               (annotation triggers only)

Module Reference

File                                  Lines   Role
js/interaction/audioObjectSVG.js      ~426    SVG rect creation, drag handling, cue registration, cleanup
js/interaction/configToDSL.js         ~253    Serializer: JSON trigger config to DSL string
js/interaction/audioObjectEditor.js   ~799    Tabbed editor dialog (source, playback, impulse), audio browser, upload
js/interaction/audioObjectState.js    ~282    State management, server persistence (REST API), CRUD operations
js/interaction/trigger.js             ~610    Annotation/audioObject trigger execution (click-to-trigger), playhead gates
js/interaction/audioBrowser.js        ~393    Directory navigator for project audio files
js/interaction/audioRecorder.js       ~831    In-browser audio recording
js/interaction/interactionSurface.js  ~1066   Integration: mode toggle, click routing, render dispatch
js/cues/cueDispatcher.js              ~1580   Cue pipeline: assignCues, registerSingleCue, checkCueTriggers
js/cues/audio/audioFile.js            ~860    Shared audio engine (single, pool, impulse playback)

How It Works

State Management (audioObjectState.js)

State follows the same pattern as shared.js for annotations:

export const audioState = {
    initialized: false,
    enabled: true,
    audioMode: false,      // true when placement tool is active
    project: null,
    items: [],             // Array of audio object descriptors
    activeEditor: null,
};

Each item is a plain object:

{
    id: "aud_mlqfa042_reyiqu9l",  // unique ID from audioObjectId()
    kind: "audioObject",
    placement: {
        x: 5000,                  // SVG world-space X
        y: 200,                   // SVG world-space Y
        extent: 800,              // horizontal width in world units
    },
    trigger: {
        type: "audioPool",        // "audio" | "audioPool" | "audioImpulse"
        source: { path: "noise" },
        playback: {
            gain: 0.8, pan: 0, pitch: 1,
            fadeIn: 0, fadeOut: 0.5,
            loop: 1, toggle: false, order: "shuffle",
            poly: 6, panRandom: 0.3, pitchRandom: 0.1,
        },
        impulse: { ... },         // only for audioImpulse type
        playheadTrigger: true,
    },
    waveform: {
        show: true,
        displayMode: "waveform",  // "waveform" | "handle" | "hidden"
        height: 120,
    },
    scope: "local",               // "local" | "shared"
}
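A hypothetical factory producing an item with these defaults can make the shape concrete. makeAudioObject and the audioObjectId() stub below are illustrative names, not the real audioObjectState.js API:

```javascript
// Illustrative sketch only: the real defaults live in audioObjectState.js.
function audioObjectId() {
    // Stub: unique id in the same spirit as "aud_mlqfa042_reyiqu9l".
    return `aud_${Date.now().toString(36)}_${Math.random().toString(36).slice(2, 10)}`;
}

function makeAudioObject(x, y) {
    return {
        id: audioObjectId(),
        kind: "audioObject",
        placement: { x, y, extent: 800 },
        trigger: {
            type: "audio",
            source: { path: "" },
            playback: { gain: 0.8, pan: 0, pitch: 1, fadeIn: 0, fadeOut: 0.5 },
            playheadTrigger: true,
        },
        waveform: { show: true, displayMode: "waveform", height: 120 },
        scope: "local",
    };
}
```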

Persistence uses the server REST API, storing to audio-objects.json in the project folder. WebSocket sync broadcasts changes to other connected clients in real time.

DSL Serialization (configToDSL.js)

triggerConfigToDSL(item) converts the JSON trigger config to a valid DSL string. The DSL is set as the SVG rect's id attribute, making it a standard cue element.

audio type:
  { type: "audio", source: { path: "noise/test.wav" }, playback: { gain: 0.8 } }
  -> audio(src:"noise/test.wav", amp:0.8, uid:trigger-aud_abc123)

audioPool type:
  { type: "audioPool", source: { path: "noise" }, playback: { gain: 0.8, panRandom: 0.3 } }
  -> audioPool(path:noise, mode:shuffle, amp:0.8, pan:rand(-0.3, 0.3), uid:trigger-aud_abc123)

audioImpulse type:
  { type: "audioImpulse", source: { path: "sfx/rain" }, impulse: { rate: 30, jitter: 0.4 } }
  -> audioImpulse(path:"sfx/rain", rate:30, jitter:0.4, uid:trigger-impulse-aud_abc123)
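The three mappings above can be sketched as a minimal serializer. The real configToDSL.js is ~253 lines and handles many more parameters; quotePath and the exact branching here are assumptions:

```javascript
// Minimal sketch of triggerConfigToDSL; only the parameters shown in the
// examples above are handled. quotePath is a hypothetical helper.
function quotePath(p) {
    // Paths containing slashes are quoted so the DSL parser keeps them whole.
    return p.includes("/") ? `"${p}"` : p;
}

function triggerConfigToDSL(item) {
    const { type, source, playback = {}, impulse = {} } = item.trigger;
    const uid = `uid:trigger-${type === "audioImpulse" ? "impulse-" : ""}${item.id}`;
    if (type === "audio") {
        return `audio(src:${quotePath(source.path)}, amp:${playback.gain}, ${uid})`;
    }
    if (type === "audioPool") {
        const pan = `pan:rand(-${playback.panRandom}, ${playback.panRandom})`;
        return `audioPool(path:${quotePath(source.path)}, mode:${playback.order}, amp:${playback.gain}, ${pan}, ${uid})`;
    }
    // audioImpulse
    return `audioImpulse(path:${quotePath(source.path)}, rate:${impulse.rate}, jitter:${impulse.jitter}, ${uid})`;
}
```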

Key serialization rules (as seen in the examples above):

    • gain maps to amp; order maps to mode
    • Paths containing / are quoted; bare directory names are emitted unquoted
    • panRandom: r serializes as pan:rand(-r, r)
    • uid is trigger-{item.id}, or trigger-impulse-{item.id} for audioImpulse

SVG Rendering (audioObjectSVG.js)

renderAudioObjectsSVG(onEdit) is the main render function, called by renderAll() in interactionSurface.js. It:

  1. Clears previous SVG elements and unregisters their cues
  2. Creates a <g id="oscilla-audio-objects-svg-group"> in the score SVG
  3. For each visible item in audioState.items:
    • Generates DSL via triggerConfigToDSL(item)
    • Creates an SVG <rect> with the DSL as its id
    • Positions it using placementToSVG() (world coords + viewBox offset)
    • Sets dimensions from placement.extent and waveform.height
    • Attaches drag handler (edit mode only, uses getScreenCTM().inverse())
    • Calls registerSingleCue(rect) to enter the cue pipeline

Display modes control rect appearance:

Mode       Performance                           Edit Mode
waveform   Semi-transparent fill, 0.5px stroke   Same
handle     24px height, same fill                Same
hidden     Transparent, no pointer events        Dashed outline at 0.3 opacity
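A hypothetical mapping from displayMode to rect attributes, for illustration only: the real styling lives in audioObjectSVG.js, and the exact opacity values here are assumptions ("semi-transparent" is not specified numerically).

```javascript
// Hypothetical displayMode -> attribute map; values are illustrative.
function displayModeAttrs(mode, editMode = false) {
    switch (mode) {
        case "waveform":
            return { "fill-opacity": 0.25, "stroke-width": 0.5 };
        case "handle":
            return { height: 24, "fill-opacity": 0.25 };
        case "hidden":
            // Invisible in performance; dashed outline when editing.
            return editMode
                ? { fill: "none", "stroke-dasharray": "4 2", opacity: 0.3 }
                : { fill: "none", "pointer-events": "none" };
    }
}
```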

Dynamic Cue Registration (cueDispatcher.js)

Three functions support runtime cue management:

registerSingleCue(element) -- parses the element's id as DSL, primes overlays and waveforms, builds a cue entry, pushes to window.cues[]. The cue entry has _dynamicCue: true to distinguish it from Inkscape-authored cues.

unregisterSingleCue(element) -- removes from window.cues[] by element reference. Does not remove DOM elements (caller handles that).

updateCueElement(element, newDSL) -- unregisters, clears priming flags, destroys old overlays, sets new id, re-registers. Used when the editor saves changes to an existing audio object.

The priming follows the same pattern as assignCues():

cueAudio:        primeWaveform(ast, element)
                 primeAudioOverlay(ast, element)     [only if waveform:none]

cueAudioPool:    primePoolWaveform(ast, element)
                 primeAudioPoolOverlay(ast, element)

cueAudioImpulse: primeImpulseWaveform(ast, element)
                 primeAudioImpulseOverlay(ast, element)
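The register/unregister/update cycle can be sketched over a plain array. Parsing and priming are stubbed out here; the real registerSingleCue also primes waveforms and overlays as listed above:

```javascript
// Sketch of dynamic cue management; parseCueToAST is stubbed (assumption).
const cues = [];

function registerSingleCue(element) {
    const ast = { dsl: element.id };        // stub for parseCueToAST(element.id)
    // The real code primes waveform + overlay here before pushing.
    cues.push({ element, ast, id: element.id, _dynamicCue: true });
}

function unregisterSingleCue(element) {
    // Remove by element reference; the caller removes the DOM node itself.
    const i = cues.findIndex((c) => c.element === element);
    if (i !== -1) cues.splice(i, 1);
}

function updateCueElement(element, newDSL) {
    unregisterSingleCue(element);
    element.id = newDSL;                    // the DSL lives in the id attribute
    registerSingleCue(element);
}
```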

Drag Handling

SVG rect drag uses getScreenCTM().inverse() to convert screen coordinates to SVG world coordinates. On drag end, the new position is written back through updateAudioObject() which persists to the server and triggers a re-render.

The move threshold (4px) distinguishes a click from a drag. Clicks in edit mode route through onScoreClick() in interactionSurface.js, which matches [data-audio-object-id] on the SVG rect and opens the editor.
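The click-vs-drag decision reduces to a distance check against that threshold (isDrag is an illustrative name; the real handler lives in audioObjectSVG.js):

```javascript
// 4px move threshold: at or below it the gesture is treated as a click,
// above it as a drag.
function isDrag(startX, startY, endX, endY, threshold = 4) {
    return Math.hypot(endX - startX, endY - startY) > threshold;
}
```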

Coordinate System

User clicks score
        |
        v
  getScoreClickPlacement(evt)
  screenX = clientX - inner.getBoundingClientRect().left
  worldX = screenX / localScale
        |
        v
  Stored as placement.x/y (SVG world units)
        |
        v
  placementToSVG(svg, placement)
  svgX = viewBox.x + placement.x
  svgY = viewBox.y + placement.y
        |
        v
  rect.setAttribute("x", svgX)

localScale = svgScreenWidth / scoreWidth maps between CSS pixels and SVG world units.
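The two conversions above can be written directly. Function names mirror the text; the values passed in are illustrative:

```javascript
// Click position (CSS px) -> SVG world units, using localScale from above.
function screenToWorld(clientX, innerLeft, localScale) {
    return (clientX - innerLeft) / localScale;
}

// World units -> SVG coordinates, offset by the score's viewBox origin.
function placementToSVG(viewBox, placement) {
    return { x: viewBox.x + placement.x, y: viewBox.y + placement.y };
}
```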


Render Cycle

Every state mutation (add, update, delete, drag) flows through the same path:

CRUD operation (audioObjectState.js)
        |
        v
  triggerRender()
        |
        v
  renderAll()                                 (interactionSurface.js)
        |
        +---> renderAnnotations()             annotations + markers
        +---> renderAudioObjects()            
                    |
                    v
              renderAudioObjectsSVG(onEdit)    (audioObjectSVG.js)
                    |
                    +---> cleanupAudioSVGElements()  unregister + remove old rects
                    +---> for each item:
                            triggerConfigToDSL() -> makeAudioSVGRect() -> registerSingleCue()

The clear-and-rebuild cycle is the same pattern used by the annotation system. For incremental updates, updateAudioSVGElement(itemId) is available but not currently wired in.


Integration Points

What calls it

    • renderAll() in interactionSurface.js, which invokes renderAudioObjectsSVG(onEdit) on every render pass
    • onScoreClick() in interactionSurface.js, which routes edit-mode clicks on [data-audio-object-id] rects to the editor
    • checkCueTriggers() in cueDispatcher.js (driven by RAF.js), which fires registered cues on playhead crossing

What it calls

    • triggerConfigToDSL() in configToDSL.js to serialize each item's trigger config
    • registerSingleCue() / unregisterSingleCue() / updateCueElement() in cueDispatcher.js
    • handleAudioCue() / handleAudioImpulseCue() in js/cues/audio/ via the standard cue pipeline

Event contract

All audio types dispatch oscilla:audio CustomEvents on window:

// Play
{ uid: "trigger-aud_abc123", file: "noise/hit.wav", state: "play" }

// Stop
{ uid: "trigger-aud_abc123", file: "noise/hit.wav", state: "stop" }

UID patterns:

    • audio:         trigger-{id}
    • audioPool:     trigger-{id}-{ts} (one uid per spawned voice)
    • audioImpulse:  trigger-impulse-{id}

Debugging

Quick Health Check

// Check audio object state
console.log("Audio objects:", audioState?.items?.length);
console.log("Audio mode:", audioState?.audioMode);

// Check SVG elements
const g = document.getElementById("oscilla-audio-objects-svg-group");
console.log("SVG rects:", g?.children.length);

// Check cue registration
const dynamicCues = window.cues?.filter(c => c._dynamicCue);
console.log("Dynamic cues:", dynamicCues?.length);
dynamicCues?.forEach(c => console.log("  ", c.id.slice(0, 60)));

// Check active voices
console.log("Active voices:", window.activeAudioCues?.size);
for (const [k, v] of window.activeAudioCues || []) {
    console.log(" ", k, v._startedAt ? `started@${v._startedAt.toFixed(3)}` : "pending");
}

// Check impulse processes
console.log("Active impulses:", window.audioImpulses?.size);

Verify DSL Serialization

// Get an item from state and check its DSL
const item = audioState.items[0];
const dsl = triggerConfigToDSL(item);
console.log("DSL:", dsl);

// Verify it parses
const ast = parseCueToAST(dsl);
console.log("AST:", ast);

Verify SVG Rect Properties

// Check a specific rect
const rect = document.querySelector("[data-audio-object-id]");
console.log("id (DSL):", rect?.id);
console.log("x:", rect?.getAttribute("x"));
console.log("y:", rect?.getAttribute("y"));
console.log("width:", rect?.getAttribute("width"));
console.log("height:", rect?.getAttribute("height"));

Force Stop All Audio Objects

// Emergency stop
for (const [k, v] of window.activeAudioCues || []) {
    if (k.startsWith("trigger-")) {
        try { v?.stop?.(0.05); } catch {}
        window.activeAudioCues.delete(k);
    }
}
window.stopAllAudioImpulses?.();

Common Issues

Symptom                     Likely Cause                      Check
No rects in SVG             renderAudioObjectsSVG not called  Check renderAll() runs, check for [audioObjectSVG] Rendered log
Rect in wrong position      viewBox offset mismatch           Check svg.viewBox.baseVal.x/y, verify placementToSVG()
DSL parse failure           Unquoted path with slashes        Verify configToDSL quotes paths containing /
No audio on playhead cross  Cue not registered                Check window.cues for _dynamicCue: true entries
Click opens empty editor    Click not hitting rect            Check [data-audio-object-id] attribute on rect, check onScoreClick guard
Can't drag in edit mode     Not in audio mode                 Check audioState.audioMode === true
Waveform not rendering      primeWaveform not called          Check registerSingleCue logs, check audio file path resolves
Editor missing controls     Type not matching                 Check updateAdvancedVisibility() in audioObjectEditor.js
Double triggering           Old trigger.js path active        Verify checkAnnotationPlayheadTriggers no longer iterates audioState.items

Migration Notes

This architecture replaced the previous HTML pin system (audioObject.js, ~1438 lines deleted). Key differences:

Aspect         Old (HTML pins)                                New (SVG rects)
DOM elements   HTML divs in overlay layer                     SVG rects in score SVG
Waveforms      HTML canvas + drawWaveform()                   SVG polylines via primeWaveform()
Cursors        HTML divs animated via deferredCursorStart()   SVG cursors via standard cue pipeline
Text overlay   createAudioOverlay() called from makeAudioPin  primeAudioOverlay() called from registerSingleCue
Trigger path   executeTrigger() in trigger.js                 checkCueTriggers() in cueDispatcher.js
Playhead gate  checkPlayheadTriggerForItem() in trigger.js    checkCueTriggers() with triggerWidth-based region
Coordinates    getBoundingClientRect() + CSS left/top         getScreenCTM().inverse() + SVG x/y attributes
Registration   None (HTML elements, not cues)                 registerSingleCue() adds to window.cues[]

The annotation system continues to use trigger.js with executeTrigger() for its audio triggers. Only audio objects were migrated.
