AudioHijackKit2 Manual
======================


Class Organization
------------------

The conceptual root of AHKit2 is the AHAudioGraph class. Virtually nothing can be done without having an AHAudioGraph instance, so start from there.

AHAudioGraph contains and manages AHAudioNodes. You can add and remove nodes, start and stop audio processing, and serialize the graph. Note that AHAudioGraph is fully thread-safe, as is most of the rest of the framework.

AHAudioNode is a generic audio processing class. A node can have zero or more audio inputs, and zero or more audio outputs. Audio which arrives at an input is processed in a manner that's determined by code in subclasses of AHAudioNode.

Conceptually, there are three major types of node subclasses.

1: Source nodes have no inputs, only a single output. They produce audio from "somewhere else", such as a microphone, a network source, etc. This is how audio enters the graph. AHSourceAudioNode is an AHAudioNode subclass which provides various common services useful for source nodes.

2: Sink nodes have no outputs, only a single input. They consume audio and do "something" with it, such as playing it to an audio device, or recording it to a file. AHSinkAudioNode provides common services for these.

3: In/Out nodes have a single input and a single output. Audio goes into the input, and an equal amount of audio comes out of the output after processing. Processing could consist of changing gain, resampling, pitch change, equalizer, etc. AHInOutAudioNode provides common services for these.

Not all nodes strictly fit these categories. For example, AHDuckingNode is more or less an In/Out node, but has two inputs and combines them into a single output.

There's also one "meta" type, a composite node. A composite node is a node which exists to wrap a collection of other nodes. A composite node can present inputs and outputs of its subnodes as if they were its own. For example, AHVoiceOver is externally an In/Out node (although it does not subclass AHInOutAudioNode). Internally, it wraps an AHAudioDeviceInputNode and an AHDuckingNode to implement the VoiceOver service. To the outside world, it just looks like a single node with an input, an output, and various controls.

Nodes contain AHAudioPorts, which are inputs and outputs. AHAudioInputPort and AHAudioOutputPort are subclasses which implement inputs and outputs. For an outside user, these classes are what you use to connect nodes together. You can query a source or in/out node for its output, and query a sink or in/out node for its input. Tell the output to connect to the input, and the two nodes are now connected.

AHAudioGraph, AHAudioNode (and its subclasses), and AHAudioPort cover about 99% of what an AHKit2 user needs.
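
To make this concrete, here is a sketch of wiring a simple graph. The -addNode:, -start, and port-connection selectors shown are assumptions based on the description above, not verified signatures; check the headers for the real names.

    // Sketch only: selector names for adding nodes, connecting ports, and
    // starting the graph are assumptions. Check the AHKit2 headers.
    AHAudioGraph *graph = [[AHAudioGraph alloc] init];
    
    AHAudioDeviceInputNode *mic = [[AHAudioDeviceInputNode alloc] init];
    AHAudioDeviceOutputNode *speakers = [[AHAudioDeviceOutputNode alloc] init];
    
    [graph addNode: mic];
    [graph addNode: speakers];
    
    // Query the source for its output and the sink for its input, then connect.
    [[mic output] connectToInput: [speakers input]];
    
    [graph start]; // audio now flows from the mic to the speakers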


A Very Quick Tour of Useful Audio Node Classes
----------------------------------------------

Source nodes:
    AHDummySourceNode - Allows you to write custom audio into a graph from outside.
    AHApplicationSourceNode - Application hijacker node.
    AHSoundflowerNode - System audio node.
    AHAudioDeviceInputNode - Uses an audio input as an audio source.

Sink nodes:
    AHCallbackSinkAudioNode - Allows you to retrieve audio out of the graph and into custom code.
    AHAudioRecorderNode - Records audio to a file.
    AHAudioDeviceOutputNode - Plays audio to an audio output device.

In/Out nodes:
    AHPassThroughNode - Audio flows through unchanged. Useful as a common point for mixing a bunch of other nodes together, so that a graph can be re-wired more easily.
    AHAudioUnitNode - Allows using a CoreAudio AudioUnit as a node.
    There are also a bunch of audio effects nodes in the InOut group in the AHKit2 project.


Audio Buffers
-------------

AHAudioBuffer is a class that manages audio buffers. An audio buffer is conceptually a two-dimensional array of floats, with one dimension being channels and the second dimension being time. These floats can either be interleaved (a single array containing all of the channels laid out next to each other) or deinterleaved (an array of float *, each of which contains a single channel). AHAudioBuffer can accept and provide both types. Internally, it currently always stores data deinterleaved.

This class is conceptually a value class. Although the -interleavedFrames and -deinterleavedFrames methods aren't declared as returning const, you should treat the data they return as read-only.

There are a few useful patterns to use when dealing with AHAudioBuffer because of this read-only nature. It often happens when dealing with non-AHKit2 APIs that you need to provide memory to an API so it can write to it in order to return audio data to you. AHAudioBuffer provides some methods to help with allocating this memory.

As an example, suppose you need to allocate memory to hold 512 frames of two-channel audio data, write to it, and then wrap it in an AHAudioBuffer:

    float **bufs = [AHAudioBuffer allocateMemoryForChannels: 2 frames: 512];
    GetAudio(bufs, 2, 512); // call to external API
    double rate = 44100.0; // sample rate of the audio; use whatever rate your source actually runs at
    AHAudioBuffer *bufferObject = [AHAudioBuffer bufferWithDeinterleavedFloats: bufs channels: 2 frames: 512 rate: rate takeOwnership: YES];

Another common pattern is needing to process one audio buffer, and write the results into a new one. AHAudioBuffer provides a convenience method similar to the above, which automatically allocates memory to match the number of channels and frames in the buffer:

    float **bufs = [bufferObject allocateMatchingMemory];
    GetAudio(bufs, [bufferObject channels], [bufferObject frames]); // call to external API
    AHAudioBuffer *newBufferObject = [AHAudioBuffer bufferWithDeinterleavedFloats: bufs channels: [bufferObject channels] frames: [bufferObject frames] rate: [bufferObject rate] takeOwnership: YES];

And finally, the -copyMatchingMemory method will do the same thing, except the returned memory will have a copy of the buffer's audio data already present.

One last useful method is -convertToChannels:. This will mix down or expand an AHAudioBuffer to have the desired number of channels. For example, if you need to call into some audio processing that only handles mono audio, you can use [buf convertToChannels: 1] to ensure that your audio is mono. [buf convertToChannels: 2] will ensure that it's stereo, etc. An important note: -convertToChannels: doesn't currently handle converting between two different multi-channel counts, e.g. from 2 channels to 3. If you need this to work, tell Mike to implement it!
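
For instance, a sketch of feeding a mono-only routine (ProcessMono here is a stand-in for your external API):

    // Ensure the buffer is mono before handing it to a mono-only routine.
    AHAudioBuffer *mono = [buffer convertToChannels: 1];
    ProcessMono([mono deinterleavedFrames][0], [mono frames]); // channel 0 is the only channel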


Creating Source Nodes
---------------------

To create a source node, subclass AHSourceAudioNode. Then, you need to obtain audio from somewhere. Where you get it is up to you! Once you have it, you need to wrap it in an AHAudioBuffer and then call writeAudio:toOutput:, like so:

    float **bufs = [AHAudioBuffer allocateMemoryForChannels: 2 frames: 512];
    GetAudio(bufs, 2, 512); // call to external API
    AHAudioBuffer *bufferObject = [AHAudioBuffer bufferWithDeinterleavedFloats: bufs channels: 2 frames: 512 rate: rate takeOwnership: YES];
    [self writeAudio: bufferObject toOutput: [self output]];

And that's all there is to it. Note that with the current implementation of AHKit2, it's important that your sources are clocked to produce audio in real time, as when pulling audio from an audio device. Reading audio from a file and blasting it into the graph as fast as possible will cause audio to pile up or be dropped, because there is currently no limiting mechanism to throttle writes. (The Clocking section below describes how to drive such a source from the graph's clock instead.)


Creating Sink Nodes
-------------------

To create a sink node, subclass AHSinkAudioNode. Override the -processAudio:fromInput: method. In this method, do what you wish with the audio in the audio buffer.

Sinks are discouraged, because they can make it harder for the user to create graphs.
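
As an illustration, here's a sink that just measures the peak sample level. The peak computation is a stand-in for whatever your sink actually does with the audio:

    - (void)processAudio: (AHAudioBuffer *)buffer fromInput: (AHAudioInputPort *)input
    {
        // Scan every channel for the largest absolute sample value.
        float **channels = [buffer deinterleavedFrames];
        float peak = 0.0f;
        for( int ch = 0; ch < [buffer channels]; ch++ )
            for( int frame = 0; frame < [buffer frames]; frame++ )
                peak = MAX( peak, fabsf( channels[ch][frame] ) );
        // ...drive a level meter, log it, etc...
    }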


Creating In/Out Nodes
---------------------

To create an in/out node, subclass AHInOutAudioNode. Then override the -processAudio:fromInput: method, just like a sink node. At the end of this method, after processing the audio, write the new buffer to your output just like a source node. Example:

    - (void)processAudio: (AHAudioBuffer *)buffer fromInput: (AHAudioInputPort *)input
    {
        // NOTE: only works on stereo
        if( _balance == 0 )
        {
            [self writeAudio: buffer toOutput: [self output]];
        }
        else
        {
            buffer = [buffer convertToChannels: 2];
            
            int frames = [buffer frames];
            float **inputs = [buffer deinterleavedFrames];
            float **outputs = [buffer allocateMatchingMemory];
            
            // lots of code goes here to implement the balance processing
            
            AHAudioBuffer *out = [AHAudioBuffer bufferWithDeinterleavedFloats: outputs
                                                                     channels: 2
                                                                       frames: [buffer frames]
                                                                         rate: [buffer rate]
                                                                takeOwnership: YES];
            [self writeAudio: out toOutput: [self output]];
        }
    }


Clocking
--------

Note: clocking is new to AHKit2 as of 8/2010.

Nodes can be conceptually divided into two types: "active" nodes which produce or consume audio in real time, and "passive" nodes which can produce or consume audio arbitrarily. An example of the former is AHAudioDeviceInputNode; an example of the latter is AHBalanceNode.

Most passive nodes are driven by audio which arrives at their inputs. As audio arrives, they process it, and if they have outputs, write the processed audio there. Their sense of time is completely implicit.

Some passive nodes have no inputs, however, and can simply produce audio on demand. An example of this type of node is AHFileSourceNode, which can produce audio from a file as needed, but has no concept of how much audio to produce over a given amount of time.

Clocking allows this type of passive node to be driven by the activity of active nodes. It also allows nodes which aren't producing audio, or aren't producing it fast enough, to remain in sync. Finally, it allows nodes with disconnected outputs to receive silence on those outputs.

Much of this happens automatically, without intervention. Silence on disconnected outputs and sync between different sources are built in. If you're building your own node, you may need to add explicit clocking support, as described in the next two sections.


Clocking Active Nodes
---------------------

If you're creating a new active source node, one which produces audio at a defined rate in real time, then override the -isClockDriver method to return YES. This is all you have to do to make the graph see your node as an active node and use it to drive the graph's clock. You'll also want to override -wantsClocking to return YES to indicate that the graph should potentially pad the node's output for syncing. (Note that if you're subclassing AHSourceAudioNode, it already returns YES from -wantsClocking for you.) For an example of this, look at AHAudioDeviceInputNode.

If you're creating an active sink node (these are rare, but an example is AHAudioDeviceOutputNode), then you need to manually keep track of your node's concept of time, and inform the owning graph whenever it changes. To do this, create an instance variable to hold an AHAudioTimeCounter. Whenever your clock changes (basically, whenever audio is consumed by whatever real-time process the node tracks), use -addSamples:rate: on the time counter to update it, then use -clockingNodeUpdated: on the audio graph to inform it that your concept of time has changed.

You also need to provide your node's concept of time to the graph when it requests it. To do this, implement -updateTimeCounter: and call takeMaximum: on the passed-in counter, giving it your node's counter as the parameter, to update it.

Finally, override -isClockDriver to return YES, telling the graph that your node will be driving the clock.
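
Putting the above together, the skeleton of an active sink looks roughly like this. The -graph accessor and the audioWasConsumed:rate: callback are illustrative assumptions; substitute however your node reaches its owning graph and learns about consumed audio:

    - (BOOL)isClockDriver
    {
        return YES;
    }
    
    // Called whenever the real-time process this node tracks consumes audio.
    // (Illustrative; hook this to your node's actual consumption callback.)
    - (void)audioWasConsumed: (int)samples rate: (double)rate
    {
        [_timeCounter addSamples: samples rate: rate]; // _timeCounter is an AHAudioTimeCounter ivar
        [[self graph] clockingNodeUpdated: self]; // assumes an accessor for the owning graph
    }
    
    - (void)updateTimeCounter: (AHAudioTimeCounter *)counter
    {
        [counter takeMaximum: _timeCounter];
    }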


Clocking Passive Nodes
----------------------

If you're creating a passive node without inputs, you generally want its audio to be driven by the graph's clock. To do this, override -wantsClocking to return YES. (If you subclass AHSourceAudioNode, it already does this for you.) Do *not* override -isClockDriver to return YES.

Once you have done this, the graph will drive your node based on the clocks provided by the graph's active nodes. The default implementation in AHAudioNode responds to this by outputting silence.

To override this behavior, override -emitAudioOnOutput:forTimeCounter:. Generate your audio here and use -writeAudio:toOutput: to output it. To figure out how much audio to write, you can use -totalSamplesAtRate: on the provided time counter and on the output's time counter to see how many samples are needed to make up the difference.

When implementing this method, note that it will be called frequently and does *not* necessarily indicate that your node *needs* to generate audio. You should calculate the number of samples needed and only generate new audio if it exceeds some small threshold.
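
Here's a rough sketch of such an implementation. The -timeCounter accessor on the output, the fixed rate, and the threshold are assumptions for illustration, and GenerateAudio stands in for your node's actual audio generation:

    - (void)emitAudioOnOutput: (AHAudioOutputPort *)output forTimeCounter: (AHAudioTimeCounter *)counter
    {
        double rate = 44100.0; // assumed fixed rate for this sketch
        int needed = [counter totalSamplesAtRate: rate] - [[output timeCounter] totalSamplesAtRate: rate];
        if( needed < 64 ) // small threshold; don't bother generating tiny amounts
            return;
        
        float **bufs = [AHAudioBuffer allocateMemoryForChannels: 1 frames: needed];
        GenerateAudio( bufs[0], needed ); // stand-in for the node's audio generation
        AHAudioBuffer *buffer = [AHAudioBuffer bufferWithDeinterleavedFloats: bufs channels: 1 frames: needed rate: rate takeOwnership: YES];
        [self writeAudio: buffer toOutput: output];
    }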

For examples of how to implement this, look at AHWaveGeneratorNode.


Persistence
-----------

Note: property list persistence is new to AHKit2 as of Sep 2010.

Graphs can be saved using an NSCoder, but ideally they will be written to a property list. AHAudioNode subclasses should return an array of property names from -propertiesForSerialization; these values are automatically saved. Two caveats: if you're saving a node pointer, use AHAudioNodeNodeProperty so that it's serialized as a reference. Otherwise, the class must be savable in a property list. If it's not, you'll need to add two methods:

    - (id)plistRepresentation;
    + (id)objectWithPlistRepresentation: (id)plist;

-plistRepresentation should return an NSDictionary, with a "class" key holding a string representation of the class name.

If necessary, you can override -plistRepresentation in an AHAudioNode subclass.

Nodes are loaded from a property list using -initWithPlistRepresentation:. Some AHAudioNode subclasses will need to track internal nodes. These may not exist at initWithPlistRepresentation: time, so each node is later sent -finishWithPlistRepresentation:graph:.
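
As a sketch, a non-plist-savable value class might implement the pair like this. The class name and its frequency property are purely illustrative:

    - (id)plistRepresentation
    {
        return [NSDictionary dictionaryWithObjectsAndKeys:
            NSStringFromClass( [self class] ), @"class",
            [NSNumber numberWithDouble: _frequency], @"frequency",
            nil];
    }
    
    + (id)objectWithPlistRepresentation: (id)plist
    {
        MyValue *value = [[[MyValue alloc] init] autorelease];
        [value setFrequency: [[plist objectForKey: @"frequency"] doubleValue]];
        return value;
    }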
