Image Processing Techniques for Audio - Part 2
Last month we introduced the unusual technique of using image processing software as a tool to manipulate audio data. This month, we will further apply the technique to create some interesting effects. We will also take a look at using the image processor to generate new sounds from scratch.
To hear the potential of this unique method of sound processing, listen to the song clips included in the SongClips/ sub-directory.
First, a quick recap of the processing method covered last month:
STEP 1: Preparing the audio file.
To load a sound file into image processing software we first need to convert it to a format common to both audio and image data: RAW. This can be done simply with a program like SOX:
STEP 2: Loading the audio file.
If the image processing software does not contain a RAW loader module, the SCULPT module can be used. For 8-bit audio files use SCULPT GREY. The loader will prompt you for the width and height to use for the image. Use dimensions that are large enough to accommodate the entire audio file: WIDTH * HEIGHT = RAW BYTES.
STEP 3: Processing the loaded data.
Perform whatever image processing functions you like on the loaded sound.
STEP 4: Saving the data.
If provided, save using a RAW saver module, otherwise use a SCULPT module.
STEP 5: Listening to the modified sound.
To hear the raw sound, use a program such as Play16 and indicate the frequency to play back the sound at:
For the following tutorial, we will again use ImageFX 1.52 from the coverdisk of CU-Amiga - June 1995 to do the data manipulation. However, the methods described here can be easily applied to most image processing programs. For audio playback and audio format conversion we have included Play16 and SOX, respectively, in the SoundLab directory.
Also in the Sounds/ sub-directory we have included some sounds to experiment with. In the Sounds-Processed/ directory you will find some examples of these sounds after having been processed with ImageFX using the methods that are described here.
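Every sound in this tutorial is loaded with a width and a height, so it helps to be able to work out the WIDTH * HEIGHT = RAW BYTES rule from Step 2 for your own files. The following is a minimal Python sketch, not part of ImageFX or the SoundLab tools. The function names are our own, and it assumes you already know the size of the RAW file in bytes.

```python
import math


def exact_dimensions(n_bytes):
    """List every (width, height) pair with width * height == n_bytes.

    Pairs are returned with width >= height, widest image first.
    """
    pairs = []
    for height in range(1, math.isqrt(n_bytes) + 1):
        if n_bytes % height == 0:
            pairs.append((n_bytes // height, height))
    return pairs


def padded_height(n_bytes, width):
    """Smallest height for which width * height >= n_bytes.

    Useful when no convenient exact pair exists: the image is then
    slightly larger than the sound, as Step 2 allows.
    """
    return -(-n_bytes // width)  # integer ceiling division


if __name__ == "__main__":
    # 300 x 289 = 86700 bytes, the dimensions used for Piano.raw below
    # (assuming the file fills the image exactly).
    print(exact_dimensions(86700))
    print(padded_height(86700, 300))
```

An exact pair keeps every byte of the sound and nothing else. A padded height leaves a few unused pixels at the end of the image, which will be saved as extra bytes after the sound.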
Load the Sounds/Piano.raw sound using a width of 300 and height of 289. From the toolbox, select COLOR and then SOLARIZE (Figure 1). This is a preset function so it does not have any options.
The effect is not drastic but makes the piano sound like it is swarming with noisy bees. Compare it to the original Piano.raw sound:
Load the Sounds/Thunder.raw sound using a width of 250 and height of 281. Perform the same solarize function and save the result as Thunder-Solarize.raw. Listen and compare it to the original:
Play16 FREQ=16780 Sounds/Thunder.raw
Play16 FREQ=9600 Sounds/Sine.raw
Now, try the solarize function with the Sounds/YMMind.raw sound. Load it using a width of 200 and height of 141, solarize, save it as YMMind-Solarize.raw and listen with:
This effect makes the voice sound very nasal and cheesy. Add a pocket protector and a pair of glasses with some white tape on the nose bracket and we have got a real nerd here!
Clearly, it is important to try an effect on several different kinds of sounds. Otherwise, you might never know what potentially great surprises you have missed.
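To get a feel for what solarizing does to the sample values themselves, here is a small Python sketch. ImageFX does not document its exact solarize curve here, so the code assumes a common definition of the effect: every grey level at or above a threshold (128 by default) is inverted, and everything below it is left alone. The function name and the threshold parameter are our own.

```python
def solarize(samples: bytes, threshold: int = 128) -> bytes:
    """Invert every 8-bit grey level at or above the threshold.

    This is an assumed, textbook form of solarization and may differ
    from what ImageFX actually computes.
    """
    return bytes(255 - s if s >= threshold else s for s in samples)


if __name__ == "__main__":
    print(list(solarize(bytes([0, 100, 128, 200, 255]))))
```

Under this assumption, grey levels 128 to 255 are folded back onto 127 to 0. If the bytes are signed audio, those are exactly the negative samples, so the negative half of the wave is flipped up into the positive half. That is close to full-wave rectification and may account for the added buzz.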
Load the Sounds/YMMind.raw sound using a width of 200 and height of 141. From the toolbox, select ROTATE, set the angle to +28 degrees and select ANY ANGLE (Figure 2) to start the process.
This is a very drastic change, but not very useful, unless you like raunchy sounds like this. Reload the original sound, or use ImageFX's undo function and try the ROTATE effect with an angle of -90 degrees (Figure 2). Save it as YMMind-Rotate-90.raw and listen:
This is still a drastic change from the original speaking voice, but the change in rotation angle gives a much nicer result for this sound. If you needed a good electricity sound then it could be a very useful one.
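Why does a 90 degree rotation change the sound so much? The image is just the sound laid out row by row, so turning it on its side changes the order in which the bytes are saved. The Python sketch below illustrates this with a clockwise quarter turn. Whether ImageFX's -90 turns clockwise or counter-clockwise is not stated here, so the direction is an assumption, and the function name is our own.

```python
def rotate_90_clockwise(samples: bytes, width: int, height: int) -> bytes:
    """Rotate a row-major width x height byte image a quarter turn clockwise.

    The result is a row-major image that is `height` wide and `width` tall.
    """
    if len(samples) != width * height:
        raise ValueError("width * height must equal the number of bytes")
    out = bytearray(len(samples))
    for new_y in range(width):
        for new_x in range(height):
            # The new pixel (new_x, new_y) comes from the old pixel
            # at column new_y, row (height - 1 - new_x).
            out[new_y * height + new_x] = samples[(height - 1 - new_x) * width + new_y]
    return bytes(out)


if __name__ == "__main__":
    # A 3 x 2 image holding the "sound" 1, 2, 3, 4, 5, 6.
    print(list(rotate_90_clockwise(bytes([1, 2, 3, 4, 5, 6]), 3, 2)))
```

Neighbouring bytes in the output come from bytes that were a whole row (one image width) apart in the original, so the sound is played back in a heavily shuffled order rather than simply reversed or slowed down.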
To begin, we need to create a buffer to store the image that we are going to create. From the toolbox, select BUFFER and then CREATE BUFFER. Set the width to 1060, height to 226 and make it a GREYSCALE buffer (Figure 3). The width and height values given here determine the length of the sound file: 1060 x 226 = 239560 bytes. At a playback rate of 16780 Hz, that gives us 239560 / 16780 = 14.28 seconds of sound.
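The same arithmetic works for any buffer size. As a quick Python sketch (the function names are our own, and one byte per sample is assumed, as with the 8-bit greyscale buffer above):

```python
import math


def buffer_duration(width, height, rate_hz):
    """Seconds of 8-bit sound held by a width x height greyscale buffer."""
    return (width * height) / rate_hz


def rows_for_duration(seconds, width, rate_hz):
    """Smallest buffer height that holds at least `seconds` of sound."""
    return math.ceil(seconds * rate_hz / width)


if __name__ == "__main__":
    # The buffer from this article: 1060 x 226 at 16780 Hz.
    print(round(buffer_duration(1060, 226, 16780), 2))  # 14.28
```

Working backwards with rows_for_duration is handy when you know how long the sound should be and have already chosen a width.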
Signed Vs. Unsigned Data
The Laser-Copter sound that we just generated demonstrates this very well. The image processing software creates the image with all positive (unsigned) values. However, the Amiga uses the signed data format for audio, so Play16 assumes signed data by default when playing back the sound. For this example, that assumption would be wrong. Try it:
Instead of using the UNSIGNED option of Play16 you could first use SOX to convert the file.
In the SOX conversion command, -u indicates that the source file is in unsigned format, -b indicates that the source is in 8-bit byte format, and -s indicates that the destination file should be in signed format.
Note that SOX requires that a rate (-r) be given for the source file. For RAW data this value is not used, so it does not matter what value you use here. Refer to the SOX documentation for more details.
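For the curious, the 8-bit unsigned-to-signed conversion itself is very simple. The Python sketch below shows what such a conversion amounts to for each byte. It is an illustration only, not SOX's actual code, and the function name is our own.

```python
def unsigned_to_signed(samples: bytes) -> bytes:
    """Convert 8-bit unsigned samples to 8-bit signed (two's complement).

    Flipping the top bit subtracts 128 modulo 256, which moves the
    unsigned midpoint (128) to the signed zero point (0).
    """
    return bytes(s ^ 0x80 for s in samples)


if __name__ == "__main__":
    # Unsigned 0, 128, 255 become the bytes for signed -128, 0, +127.
    print(list(unsigned_to_signed(bytes([0, 128, 255]))))
```

Applying the same function a second time restores the original bytes, so it also converts signed data back to unsigned.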
Having converted the format already, it is no longer necessary to specify the UNSIGNED option. So, this will now work:
The Big Finish
These unique processes can produce some very useful results. But like any great experimental technique, it takes time and patience to get to the really good results. The best things in life come from hard work, and this is no exception.
In many cases, it may be necessary to clean up the created sounds by using audio waveform editing software to filter unwanted noise or cut bad sections. Sometimes, even just a small section of a processed sound is enough to make a terrific percussion sound or some strange instrument.
Use the results of this experimentation as a starting point. Experiment further with the ideas presented here and with the many other image processing effects available. Above all else, have fun! If you come up with anything particularly exciting, please drop me an email.