
Zones of Intensities:
Auditory Hyper-Reality and Virtual Music Performance


"Today with the technical means that exist and are easily adaptable, the differentiation of the various masses and different planes as these beams of sound could be made discernible to the listener by means of certain acoustical arrangements. Moreover, such an acoustical arrangement would permit the delimitation of what I call Zones of Intensities. These zones would be differentiated by various timbres or colors and different loudnesses. Through such a physical process these zones would appear of different colors and of different magnitude in different perspectives for our perception. The role of color or timbre would be completely changed from being incidental, anecdotal, sensual or picturesque, it would become an agent of delineation like the different colors on a map separating different areas, and an integral part of form. These zones would be felt as isolated, and the hitherto unobtainable non-blending (or at least the sensation of non-blending) would become possible."

Edgard Varèse: The Liberation of Sound, 1936



Having composed music on a digital workstation for over fifteen years, drawing on, responding to and extending Western classical and concert techniques, I have sought to create music which seems to be played by musicians yet is impossible for hands on an instrument to achieve.

The piano soloist can play complex rhythms at great speed, repeat patterns that could not be maintained for such durations, crisply define widespread twenty-note chords, change tempi in contrary motion with inexorable, unhesitant resolve. Finely balanced dynamics may be achieved at very low or high volumes, absolute control of articulation perceived amidst compelling momentum.

This is amongst the most interesting offerings of a technology whose principal outcome, in the view of some opponents of digital technology in music, is instant gratification with pleasing results for the untrained user.

Digital reproduction may be used to examine and develop our assumptions about music, as composers and listeners, extending the boundaries of expressive possibility in new directions. Despite these remarkable developments there remains nonetheless a significant divide between the practices of listening to this new music and its older counterparts, played to sedentary audiences by players of orchestral instruments.

The spectacle of performance, its social rituals, the formal artifice of the event, variability of musical experience from one performance to another, the unpredictability of even an apparently familiar work: these cannot be matched by listening repeatedly to a fixed, uniform reproduction through speakers or headphones, possibly alone, in transit or semi-attentively whilst performing a task. Experiments both in interactivity and in an increased physicality of music listening have, though, been conducted since before the invention of recording.

The separation between performance and listening to recordings - and particularly to works which only exist as recorded productions - however remains profound.

Whilst it is possible therefore to give voice to new musical utterance with digital production, the means by which the listener in fact hears this remains deeply inferior to the experience of attending, amongst others, a performance in which communication occurs between artist and listener. Concert experiments have been conducted by modernist composers using computers and arrays of speakers either alongside or in place of performers.

In his Imaginary Landscape No. 4 (1951), John Cage placed twelve radios on stage, tuned and adjusted by performers. With the music of later electro-acoustic composers, often the only operatives on view to the listeners are engineers at mixing desks and computers, instructing automata by movement of slider controls to perform the predominantly invisible sonic transmissions.

It remains rightly unproven to audiences at large that these experiences can be as compelling or expressive as watching an accomplished soloist coax successive wafting bubbles of tactile sonic geometries from a strung wooden box or valve-stopped brass pipe.

Consider then the following scenario: Attending the first performance of a new work, instead of being seated in an auditorium to await the opening notes performed by musicians, on arrival in the performance space the listener puts on wireless headphones and the music begins. They find themselves at a particular point within a three-dimensional sonic projection in which an invisible orchestra and soloist seem to be playing music impossible to perform. Imaginary musicians are perceived all around, spaced perhaps as an orchestra but floating in height, separation and orientation.

The listener investigates the phenomenon, moving to find themselves closer to a particular sound and further from another. Moving through the empty space, it is as though amongst ghostly performers of music that human hands could not achieve.

An opportunity is presented by the combined use of certain technologies to entirely shift the impassive role of the listener - it is now the part of the listener to approach, examine, select from the combination of sounds as they would do at an exhibition of sculpture or painting.

The starting point of this project is the composition of music ostensibly for acoustic instruments but which is in fact unplayable by human hands, having been designed in a digital studio with minutely detailed simulation of live performance.

Digital realisation of music is now at a point where it can attain such subtlety and nuance of delivery as to be barely distinguishable from human performance; new uses emerge for the digital studio, creating entirely new ways of hearing music of greater complexity than is possible from live musicians. Expressive extensions can be reached for, beyond the extant technical boundaries of the centuries of musical practice already engrained in our collective consciousness.

This project is based on a composition for solo player-piano and 72-part string ensemble entitled "N". It has its roots in Western concert music traditions of orchestral and concerto composition but appropriates digital technology to demand more of the "performer" than would be humanly possible, partly with the aim of achieving such effects imagined by Varèse and others, at last possible seventy-five years later.

By using the computer to compose and determining from the outset that this would be the principal medium of the music's performance, certain of the constraints associated with composing for particular instruments are removed, in that the technically impossible is now theoretically possible to achieve.

Chords of many notes may be played at speeds where live articulation, however well rehearsed, would be imprecise amongst such a large number of players. Subtleties of tempo change, or several occurring at once, can be achieved through automata with accuracy impossible in the orchestral setting, even before consideration of the obstacles presented both by orchestral conservatism and by lack of rehearsal time for highly complex or dense music, not instantly audible in all its parts.

The former process by which the composer's musical intention, codified in score was then interpreted by conductor and musician is fundamentally altered by digital production. There is now a direct link between the composer and listener. A performer's interpretation no longer features in the process. The source of the music is now the very instructions programmed by the composer into the digital workstation.

The score's new form, a hybrid set of parameters defining the machine's mode of sound production, is similar to the instructions to a musician. Its difference however is a profound one. This data contains minutely precise instructions to the automated performer that would be impossible to convey to a musician or expect to be accurately rendered. The data and its sonic results also never change. Each rendition is identical, exactly as with forms of mechanical reproduction. The absence, in unwaveringly uniform digital music, of a tangible "aura" or "spirit" as bemoaned by Walter Benjamin is perhaps more vividly evident than in any other medium. Whether heard on two or ten speakers, in a hall or a vehicle, it is essentially identical.

The challenge of humanising musical automation has a long and singularly unsuccessful history, most especially since the advent of digital sound sources. A principal reason for this has been the low power, until quite recently, of processors. It is now possible to process CD-quality audio in real time on multiple output channels. In other words, the rates at which audio data is read, interpreted, adapted, transmitted, are now so high as to be imperceptibly different from the continuous stream of information entering the ear from pre-digital sources.
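The claim about processing rates can be given rough numbers. As a back-of-envelope sketch (the channel counts here are my illustrative assumptions, not figures from the project itself):

```python
# Back-of-envelope throughput for real-time multichannel audio.
# CD quality: 44,100 samples/s at 16 bits/sample. The channel
# counts below are illustrative assumptions only.

SAMPLE_RATE = 44_100   # samples per second
BIT_DEPTH = 16         # bits per sample

def stream_rate_kbps(channels: int) -> float:
    """Raw (uncompressed) data rate in kilobits per second."""
    return SAMPLE_RATE * BIT_DEPTH * channels / 1_000

print(f"stereo feed:  {stream_rate_kbps(2):.1f} kbit/s")
print(f"12 channels:  {stream_rate_kbps(12):.1f} kbit/s")
```

A single binaural headphone feed thus carries around 1.4 Mbit/s of raw audio, and a twelve-track instrument simulation several times that: rates that only recent processors can read, adapt and transmit for many listeners at once without audible interruption.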

Through these means smooth curves of transformation, occurring at speeds whereby they are only subliminally perceived, may be achieved. The distance between the best sampled instruments skilfully used and their sources is now frequently short enough for there to be considerable uncertainty whether a performance is real or simulated.

In exploiting the sequencer and digital audio workstation in this manner, multiple questions present themselves: How may the boundaries of musical sensory credulity be extended? If attempting to extend extant orchestral timbre, beyond what point has the simulacrum become so distant as to be merely referencing, rather than representing, performance on acoustic instruments? Once the music is composed, what are the ways to simulate a live performance? Is it a dilution of such an auditory experiment to retain actual performers?

In this project, five live performers are added to the simulated parts, at times doubling them and at others in counterpoint. The overwhelming majority of sound to which the audience listens is generated by computer transmission of recorded sound. Interplay and overlap between the live and recorded/generated lines obscures the sources' identity whilst adding clarity to doubled lines and to the correlation between the live sound sources and the generated music. The listener's ambulatory interaction with and movement between real and virtual sound sources, whose perceptual distance from each other has been extended and emphasised, sets up such a combination of variables that each audition is unique.

In motion-tracked wireless headphones, the listener moves within a hall, hearing music as though played from various points within the physical space, approaching and listening more closely here, moving away towards other sources of sound that combine as determined by the listener's position. The principal distinction between music heard via electronic reproduction and in live performance lies in the presence (in the latter) of unforeseen and uncontrollable factors. It has been this unpredictability and variation from one experience to another that for most listeners defines the live performance as superior in interest, colour and expressive power.

In this proposed setting, the natures of public performance and private listening become entwined and interchangeable as the listener moves around the space, inspecting at will the individual elements that constitute the composition, built of lines played by live performers and a combination of virtual musicians and sonic found objects distributed around the space.

A combination of audio and meta-data (Cartesian coordinates and directionality of sound source) is transmitted on each output channel. The "mix" heard by each listener is a uniquely rendered version of the three-dimensional recording which simulates movement by the listener amongst virtual performers. The output is spatialised and remains statically placed throughout the music. As the listener moves around the space, the relation in distance and direction between each instrument and the listener is re-expressed to give the impression that they are moving inside an environment containing actual sources of sound.

Example: A room is 10 m long and 10 m wide. 2 m from the far-left corner, a violist plays, facing the entrance. The listener enters the door at back left of the space and approaches the viola; when directly in front of the violist, the listener turns to the right. The instrument has been simulated, in response to the listener's movement, to approach, playing directly towards the listener, then suddenly to move behind the listener's left side. This effect is simultaneously rendered in multiple layers, repositioning the sources of sound relative to the listener's ears, which are at the shifting centre of a virtual sound field.
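The underlying geometry of that scenario can be sketched in a few lines. This is a simplified illustration, not the project's implementation; the coordinate frame (origin at the entrance door, the far wall along positive y) and the exact placement of the violist are my assumptions:

```python
import math

# Sketch of the relative-position calculation in the room example.
# Coordinates in metres; origin at the entrance (back-left corner),
# positive y towards the far wall. The exact viola placement is an
# assumed reading of "2 m from the far-left corner".

def source_azimuth(listener_xy, listener_heading_deg, source_xy):
    """Angle of the source relative to the listener's facing
    direction: 0 = dead ahead, positive = to the right."""
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))  # 0 deg = +y axis
    return (bearing - listener_heading_deg + 180) % 360 - 180

viola = (2.0, 8.0)  # assumed position near the far-left corner

# Listener stands 1 m in front of the violist, facing her:
print(source_azimuth((2.0, 7.0), 0.0, viola))   # 0.0 - straight ahead

# The listener turns 90 degrees to the right: viola now hard left.
print(source_azimuth((2.0, 7.0), 90.0, viola))  # -90.0
```

The rendering engine's task is to evaluate this relation continuously, for every source and every listener, and feed the result to the binaural renderer.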

From the first mapping of sound sources' virtual spacings to the correlation of these with the listener's actual position, a large amount of data must be processed and used to form a live output mix for each individual listener in the space. Additionally, as listeners enter the space at different times, each hears the composed work from its beginning, at a separate starting time.

The "performance" begins at the moment the listener puts on their headset: the music is broadcast to their headset and rendered according to their position. Many listeners partake of the experience at once, each joining at will and listening to the performance in their own way, governing its character by the way in which they move investigatively around the physical space.

The experience of the listener is at once private, unique and non-replicable. Whilst engaging with the same broadcast sounds at the same time in a single public space, the members of the audience are each engaged in an act of solitary examination of the music.

The motion-capture software relays data about the listener's position back to the computer transmitting the 3-D sound, which processes, as they move about the space and listen, a unique version of the sound according to where they stand in the space. The listener experiences sound as though walking between the performing musicians of an orchestra, into streets, amongst machines, from one glimpsed conversation to another, through vast caverns into tiny rooms and into the open air.

A vast variety of sound sources can be heard in one space. Unlike the experience either of surround relay of recorded sound or listening to a live performance, where one is stationary and impassive, the listener moves around the sonic environment, lingering or moving through zones at will.

Distances are distorted, amplified, warped: movement of a metre may create the impression of having moved twenty away from or towards a sound source. At particular points in the space it is possible to identify and separate certain less distinguishable middle parts of complex sonic textures and listen to the relation of those parts to musically more prominent, but now perceptually backgrounded, elements.
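One simple way to realise this distortion is to scale physical displacement before applying a distance-attenuation law. The warp factor and the 1/d attenuation below are my illustrative assumptions, not the project's actual mapping:

```python
# Illustrative distance warping: physical movement is scaled so
# that one real metre feels like twenty. Both the warp factor and
# the inverse-distance attenuation law are assumptions for this
# sketch, not the project's actual mapping.

WARP = 20.0  # 1 m of real movement = 20 m of perceived movement

def perceived_gain(physical_distance_m: float) -> float:
    """Amplitude gain under 1/d attenuation, after warping.
    Warped distances are clamped to 1 m to avoid infinite gain."""
    d = max(1.0, physical_distance_m * WARP)
    return 1.0 / d

print(perceived_gain(0.05))  # standing very close: full gain, 1.0
print(perceived_gain(1.0))   # one real metre away: gain 0.05
```

Under such a mapping a source one physical metre away is already attenuated as though twenty metres distant, producing exactly the exaggerated perceptual separation described above.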

The musical composition is based on simultaneous elaborations of a centrally developing theme - in their entirety an indistinguishable, incomprehensible mass. In separation, the fragmentary elements comprising the whole may be inspected and absorbed by each listener in unique combinations of more or less audible parts at a given point in time and space.

Auditory distance between sound producing objects is amplified such that when the listener moves a metre, they have crossed thresholds of buildings, traversed a street, shifted to an entirely different acoustic environment. This permits an amplified, magnified inspection of individual detail, untainted by surrounding activity: it is as though the instrument whose coordinates are notionally one metre away were playing on the other side of a high wall.

Music producers were initially concerned at the advent of stereo: the vividly distinct timbres essential in mono production for a listener to distinguish between instruments and voices might now be crudely supplanted by mere spatial separation. The advent of quadraphonic, and later surround, sound was met with enthusiasm by practitioners and audiophiles whilst the majority of listeners remained unaware of them.

These formats were influential upon electro-acoustic composers seeking sonic spatialisation as an additional textural layer to their work. Nonetheless, their limitations were similar: at best, a ring of moving sound was made around the static listener. Listeners could walk, ride or drive around the space but they would never obtain the asomatous wonder of walking through force-fields of invisible music in a space empty of all but other wanderers, each at a different stage and position in the illusory cycle of a ghostly, androidal band.

The vision of Varèse referenced in the title is now, through combined ingenuity, possible to achieve: "In the moving masses you would be conscious of their transmutations when they pass over different layers, when they penetrate certain opacities, or are dilated in certain rarefactions."

The process requires new forms of musical notation and close collaboration between composer, acoustical engineer and software developer. The conventionally-notated score is the basis for a graphic restructuring by instrument, each of which will be output on 8-12 tracks, dependent on the complexity of multi-sample simulation of the acoustic or real-world counterpart.

This secondary score describes the physical distribution and movement of the virtual sound sources. Individual tracks are recorded with a dummy head, a mannequin with microphones where the ears should be: each head is placed in a particular position and orientation, and individual tracks are recorded to it in its separate position while the sound is projected from speakers arranged on all axes around it.

A final mix of the binaurally recorded tracks is created to regenerate the perceptual position of the listener to a sound emanating from multiple sources for each individual track. The sound may then be reproduced via headphones such that it appears to emanate from every dimension and distance within and beyond the limits of the auditorium.

Realisation of this performance will be achieved through an integration of motion capture with 3-D sound virtualisation, an area of rapidly expanding expertise which deploys systems of "cross-talk cancellation". These take account of the head-related transfer function (HRTF), the filtering through the head by which each ear perceives certain signals directed at the other, and remove this additional perception. By accounting for the listener's HRTFs it is possible to overcome the limitations of a pair of (left and right) sound sources and simulate the emanation of sound from all points in a hemisphere.
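A full HRTF is a measured pair of filters for every direction; as a minimal stand-in for one of its component cues, the classic Woodworth spherical-head formula approximates only the interaural time difference (ITD). The head radius and speed of sound below are standard textbook values, used here purely for illustration:

```python
import math

# Woodworth's spherical-head approximation of the interaural time
# difference (ITD), one component cue of the HRTF. Head radius and
# speed of sound are standard textbook values, not measurements.

HEAD_RADIUS = 0.0875    # metres
SPEED_OF_SOUND = 343.0  # metres per second

def itd_seconds(azimuth_deg: float) -> float:
    """Approximate arrival-time difference between the ears for a
    distant source at the given azimuth (0 = ahead, 90 = right)."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

print(itd_seconds(0.0))   # source dead ahead: no time difference
print(itd_seconds(90.0))  # source hard right: roughly 0.66 ms
```

Delays of well under a millisecond are enough for the brain to localise a source, which is why the renderer must recompute such cues continuously as the listener's head turns.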

Digital signal processors capture data about each listener's coordinates and orientation, reproducing a continually responsive production of the music. The music is output as a static, spatially separated series of sound sources: as the listener moves amongst them, she is treated by the data processor as though static herself and the virtual music is moved in relation to her changing position, thus creating the powerful illusory sensation of moving between static sound producing instruments and objects.

I am working on a system for the integration of extant technologies: the conversion of motion-capture data to MIDI instructions which command real-time sculpting of the spatialised sonic output, driven by Logic Pro software on Apple Mac computers.
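One possible shape of that conversion is a mapping from room coordinates to MIDI control-change values. This is a hypothetical sketch only: the room dimensions, the choice of controller numbers and the linear scaling are all my assumptions, not the system's actual design:

```python
# Hypothetical mapping of motion-capture coordinates to MIDI
# control-change (CC) values in the 0-127 range. Room size,
# controller numbers and scaling are illustrative assumptions.

ROOM_W, ROOM_D = 10.0, 10.0  # room width and depth in metres
CC_PAN, CC_DEPTH = 10, 91    # assumed CC numbers for pan / depth

def position_to_cc(x: float, y: float) -> dict:
    """Scale a listener position in the room to 0-127 CC values."""
    clamp = lambda v: max(0, min(127, v))
    return {
        CC_PAN: clamp(round(x / ROOM_W * 127)),
        CC_DEPTH: clamp(round(y / ROOM_D * 127)),
    }

# Listener at centre-width, near the entrance:
print(position_to_cc(5.0, 2.5))
```

Streams of such CC messages, one per tracked listener, could then drive automation lanes in the workstation, sculpting the spatialised output in real time.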

As the composer-originator of this project I am excited by the richness of possibility for increasingly direct, interactive communication with the listener and for future possible uses of this technology as we rethink the experience both of public and private music listening.

A significant outcome to the project may be the creation of a commercial system for the realisation of virtual performance spaces with reactive three-dimensional sound projection, a versatile production software suite through which multi-channel binaural output can be controlled by motion-capture data to reproduce music and sound with unprecedented distance, direction and clarity.

Benjamin Louis Mawson
January 2011

©Copyright Benjamin Mawson 2011
http://music.benmawson.com