MIT Portfolio: Steven Pliam

SoundPlot: Exploring geometric expressions for auditory material.

Finding an appropriate or satisfactory representation of sonic material in the graphic or geometric domain has long been, for many people, a difficult challenge of interpretation. The context, whether science, music, or the graphic or plastic arts, usually provides an initial conceptual direction for addressing the problem. Nevertheless, the basic question of how one addresses the interpretive complexities of visually representing sonic phenomena remains.

The Cartesian X-Y coordinate scheme provides a very widely used two-dimensional means of plotting a sound waveform as a curve, whereby time corresponds to the X-axis and sonic amplitude or energy corresponds to the Y-axis. The result is a curve of varying complexity which directly represents the sonic energy over time. It is most notable that this representation can be generated physically by an oscilloscope, which electronically deflects a high-voltage cathode ray. The deflected beam then strikes a phosphor-coated screen, tracing a two-dimensional curve that graphs the sonic waveform. As this process is completely physical, one could argue that this representation, given its lack of abstraction or symbolic reliance, is very close to the actual phenomenon itself. Hence, an immediate question emerges: being sensitive to the given context, to what extent can we abstract from the actual physical phenomena of the auditory material and move toward a more symbolic representation without diminishing relevancy or meaning? Even the oscilloscope representation, which is directly generated from the physical event itself, is to some measure an abstraction.

We know that the phenomenon of sound is described physically by changing patterns of air pressure, literally greater and lesser densities of air molecules, as they propagate through space at a constant and specific speed (around 1100 ft/sec in normal conditions) for a given air density, humidity and so on. Yet how we actually experience sound seems to reside in a vastly different realm than that of the physical description. Consequently, while the visual representations given to us by the devices of applied science, such as the oscilloscope, may be very closely tied to the physical occurrence of an auditory phenomenon and may help to provide a scientific understanding of it, these representations seem to fall very short of adequately expressing the experiential qualities of sound. I would suggest that it is this experiential realm with which the architect or the artist is mostly concerned. A primary objective for the artist then becomes the achievement of a balance between an adequate representation of the physical phenomenon of sound and the level of abstraction necessary to realize a relevant and meaningful expression.

One extreme example of a high level of abstraction would be the traditional convention of musical notes on the staff, which is after all a highly symbolic representation of specific auditory events. It is easy to argue that this system is of very limited use for visually representing the broader, practically infinite set of possibilities of auditory material. Just as with any language or symbolic system, the limits are set by the language itself. This would seem to point to a general theoretical axiom: in order to give the greatest amount of freedom in visually expressing auditory phenomena, the system of representation within which one works must be suitably close to the physical nature of the phenomenon. With that axiom in mind, the representational system should also afford a widely expressive and dynamic means of concretizing ideas of sonic space.

Present-day computers and software naturally lend themselves, as powerful tools, to generating and articulating visual ideas for representing sound. Historically, many systems of visual representation have been put into practice. The visual parameters of color, shape, density, gradation, perspective, and so on have been used as symbolic dimensions, constituents of a graphic representational whole or sonogram. Such sonograms have been generated by computers with specific software, both for spectral analysis and for the descriptive synthesis of musical and sonic ideas. More recently, the emergence of three-dimensional sonograms has been a most interesting and exciting development. Quite ironically, the graphical extension of the sonogram into three dimensions establishes a much more profound visual connection with the physical reality of the sonic phenomenon. What is otherwise known as a 'wave terrain' is essentially a type of sonogram: a three-dimensional surface representing the changing behavior of a sonic waveform. This type of sonogram can begin to resemble very closely the actual physical distributions of air pressure waves in space that constitute the phenomenon of sound.

By historical convention, a three-dimensional sonogram is a frequency spectrum display that projects each frequency component over time. An integrated perspective plot shows a time-varying topology of energy on a frequency versus amplitude terrain, whereby the frequency of the constituent harmonics is usually plotted along the X-axis, the progression of time is plotted on the Y-axis, and energy level or amplitude is generally the vertical dimension or Z-axis. So-called 'control function' plots may map the separate amplitude and frequency envelopes for each partial of the sound; the result is much less of a terrain and more like a decomposition analysis of the sound.
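
The axis convention described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code of any particular sonogram program: the function names `dft_magnitudes` and `spectrum_terrain`, the parameters `frame_size` and `hop`, and the 1000 Hz test tone are all assumptions introduced here. Each short frame of the signal is analyzed separately, and every frequency bin becomes one point with X as frequency, Y as time, and Z as amplitude.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 for one frame, scaled by 1/N."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc) / n)
    return mags


def spectrum_terrain(samples, sample_rate, frame_size, hop):
    """Return (x, y, z) points: x = frequency in Hz, y = frame start time
    in seconds, z = magnitude of that frequency bin in that frame."""
    points = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        y = start / sample_rate
        for k, z in enumerate(dft_magnitudes(frame)):
            x = k * sample_rate / frame_size  # centre frequency of bin k
            points.append((x, y, z))
    return points


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

points = spectrum_terrain(tone, sample_rate, frame_size=64, hop=64)

# The ridge of the terrain in the first frame sits at the tone's frequency.
peak = max(points[:33], key=lambda p: p[2])
print(peak)
```

With these values each frame yields 33 frequency bins spaced 125 Hz apart, so the tone falls exactly on one bin and appears as a single ridge running along the time axis.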

SoundPlot as a generative medium for aesthetic form.

Thus far, the present discussion has been limited to visual representations of sonic material as an end result in itself. In other words, after such visuals are generated, there is no data of any kind to use, perhaps as a point of departure, for any further exploration or development of ideas, formally or otherwise.

When Herwig Baumgartner and Scott Uriu, both principals at SpaceKraft Lab, approached me, they asked whether a software program could translate auditory material into a geometric visual expression such as a three-dimensional terrain surface. Furthermore, such a surface would have to provide usable, highly numerically accurate data that could then be manipulated and applied in architectural design. It did not take long to realize that such software did not in fact exist and would have to be developed. As the conceptual and schematic planning for the software began, certain notions about how the program should work began to emerge. One immediate idea was that such a software tool, within this artistic context, should certainly not produce an exact one-to-one translation of a given auditory event, something that would lock the users in with no real room to explore possible geometric modes of expression. Rather, the program should enable the user to explore many different graphical iterations or interpretations of the same sonic material. Ideally, the software should be able to process any kind of sonic sample and produce many different possible geometric expressions of it while still adhering to the basic function of a meaningful translation. It became clear that to achieve this, many user parameters would have to be implemented.

Eventually, the approach chosen was to incorporate a series of algorithms that would generate accurate numerical data that could be used to construct a wave-terrain surface. Most of the algorithms centered around division or separation of the sound input. The two basic directions taken were to divide the sound by micro time intervals or by frequency.

The division or separation of the sound into its constituent frequencies is done with a widely known and used algorithm, the Fourier Transform. The transform discovered by Jean-Baptiste Joseph Fourier is just one of many similar transforms used for audio processing. The original Fourier Transform describes continuous signals, such as those used in electronic audio applications. The Discrete Fourier Transform (DFT) is the same idea applied to sampled signals. Incidentally, other frequency transforms include the Cosine Transform and the Discrete Cosine Transform (DCT). SoundPlot also uses the Fast Fourier Transform (FFT). To be clear, the FFT is not a different transform but a family of fast methods for computing the DFT, and there are many subtly different FFT algorithms. In short, these methods apply a certain mathematics to a sound sample in order to break the sample down into its constituent partials. Once this information is derived, it can be used to construct a surface terrain expressing the energy changes of the frequencies of the whole sound over time. SoundPlot uses the X-axis to represent the frequency spectrum, the Y-axis to represent time, and the Z-axis to represent amplitude. The trade-off for the speed of the FFT is that the user is confined to a limited set of numeric inputs on one parameter of the program, namely the number of data points. In order to use the FFT algorithm, the number of data points in that parameter must be a power of two, a restriction characteristic of the common radix-2 form of the FFT. To use an arbitrary number of data points, the DFT must be computed directly, which is of course much slower.
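
The trade-off between the two algorithms can be made concrete with a minimal sketch. The code below is not taken from SoundPlot; it assumes a textbook radix-2 FFT, and the function names `dft` and `fft_radix2` are introduced here purely for illustration. The direct DFT accepts any number of data points but does work proportional to the square of that number, while the radix-2 FFT halves the problem at every step and therefore only accepts a power of two.

```python
import cmath
import math


def dft(x):
    """Direct DFT: works for any length, costs on the order of N*N steps."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_radix2(x):
    """Recursive radix-2 FFT: same result as dft(), but the length must be
    a power of two because the input is split in half at every level."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("radix-2 FFT needs a power-of-two number of points")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


signal_8 = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.1, -0.6]  # power of two
signal_6 = signal_8[:6]                                   # not a power of two

fast = fft_radix2(signal_8)
slow = dft(signal_8)
any_length = dft(signal_6)  # the direct DFT has no length restriction
```

Calling `fft_radix2(signal_6)` raises a `ValueError`, which mirrors the restriction the user meets on the data-point parameter.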

The other basic method used in SoundPlot for producing data to construct a wave-terrain surface divides the sound source into micro 'grains', which are micro pictures of the whole sound at a given point in time. This approach is fundamentally different from the Fourier Transform algorithm in that there is no decomposition or 'breakdown' of the waveform as with the FFT. The geometric surface which results is a true wave-terrain surface that represents the progression of the sonic grains, taken at a user-defined interval, through time. Various user parameters can be controlled to affect the scale, the percentage of the signal source viewed, amplitude multipliers, resolution, and logarithmic scaling of the sonic material input. One very interesting characteristic of this kind of terrain surface is that, provided that the sonic grains are of an adequate size, the relationships of phase changes over time give the surface an aesthetically compelling form, which is unique among sonograms.
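
A minimal sketch of the grain approach follows, under stated assumptions. The function name `grain_terrain` and the parameters `grain_size`, `interval`, and `gain` are hypothetical stand-ins for the user parameters mentioned above, not the names SoundPlot actually uses, and the 100 Hz test tone is invented for the example. Each grain is copied unchanged into one row of the surface, so the waveform itself, including its phase, becomes the relief.

```python
import math


def grain_terrain(samples, sample_rate, grain_size, interval, gain=1.0):
    """Cut the signal into grains of grain_size samples, one every
    `interval` samples, and return one row of (x, y, z) points per grain:
    x = time inside the grain (s), y = grain start time (s),
    z = sample amplitude multiplied by gain."""
    rows = []
    for start in range(0, len(samples) - grain_size + 1, interval):
        y = start / sample_rate
        row = [(i / sample_rate, y, gain * samples[start + i])
               for i in range(grain_size)]
        rows.append(row)
    return rows


# Hypothetical input: a 100 Hz sine sampled at 8000 Hz (80 samples per cycle).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(800)]

# Grains one cycle long, taken every quarter cycle.
rows = grain_terrain(tone, sample_rate, grain_size=80, interval=20)
```

Because successive grains here start a quarter cycle apart, each row is phase-shifted against its neighbor, and the crests run diagonally across the surface. Changing `interval` relative to the period of the sound changes the slant of those ridges, which is one way the phase relationships show up in the form.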