COMPOSING FOR VIDEOGAMES
By James Hannigan (www.jameshannigan.co.uk)
Issue: Audio - June 2007

In writer/director Woody Allen's film, "The Purple Rose of Cairo", actor Jeff Daniels' leading-man character in a Depression-era action-adventure film suddenly steps out of the picture - literally: he takes three-dimensional form and leaps from the silver screen and on to the grimy chewing-gum-and-popcorn-encrusted linoleum floor of an American Midwest movie theatre. As what has happened slowly dawns on him, we see that he is both a character in a film and a character in a film within a film. Woody Allen milks the set-up for plenty of sight gags and double entendres.

The duality of "The Purple Rose of Cairo" is becoming a more common philosophical conundrum as media art and technology forms overlap each other faster than Wired or Vibe can keep up with them. Thus, it is natural to look to the previous generation of anything to make sense of the generation coming next. It certainly makes sense in the still-evolving arena of graphics, sound and music for video gaming, for example. One accepted paradigm has been to look to film as the model for gaming. This works to some degree, but increasingly it raises the question: are games their own art form, quite apart from film, television and other linear media forms?

Before trying to answer that, a little historical perspective on sound and music may be useful. Film music has been with us for so long, it's easy for us to forget how it first came along. In the late 1920s, when film sound was introduced, music - and sound in general - was of interest to audiences simply because of its novelty, regardless of its relationship with events onscreen. It was over a decade before film composers as we think of them today began emerging; the language of film music we're familiar with was slowly introduced by composers and filmmakers seeking to explore the possibilities of a new and unique medium. 

The novelty of any digitally recorded music in games has long since passed, replaced by consumer expectations for high quality and some degree of stylistic appropriateness. Yet, when players encounter large-scale, ornate music (like that heard in the most intense moments of epic films) placed somewhat arbitrarily in games or while characters stand idly around in-game, they may well ask what this music is for, over and above sounding good or 'setting the scene'. To some, the music may seem capricious, rather than complementary -- demonstrating an unconvincing relationship with what players see taking place before them.

History also suggests a precedent for where a more appropriate approach to game music might come from. In 1941, Orson Welles - an outsider from the world of radio who knew little about the technology or conventions of filmmaking - transformed the film industry. With the soundtrack to the classic Citizen Kane, music and picture became more organically intertwined and mutually supportive. Dedicated, forward-looking games composers also try to bend the rules by writing and preparing music with games in mind. However, what they're up against is the conventions of scoring for picture in general and the expectations of game developers in particular, who often want to experience the music complete, before it is placed against the game itself. Game composers are rarely able to audition music in context during the composition stage, apart from in cases of composing to picture, and are often forced to work in a vacuum.

Most games could loosely be divided into two distinct types of scenario: the "filmic," in which the player is perhaps as much audience as participant, or a participant in a scripted narrative with limited ability to change plots, locations or characterizations; and the simulation, or "Sim," scenario, in which the player can proactively affect not only the outcome of a game but its nature and his or her environment. Putatively, each type of game scenario would have its own music, with the filmic variety looking to the traditional Hollywood model in which music supports the often scripted visuals existing as if for an audience, and the Sim relying more on literal sound effects and realism. Crudely speaking, at its most emotionally manipulative, a very filmic game could be likened to a schmaltzy Hollywood melodrama, and the coldest Sim to a clinical, factual and emotionless documentary. Both are examples of 'moving pictures', but each results in a profoundly different viewing experience.

In films, characters are sealed in the film's 'story space', forever separated from the audience by the silver screen and, were they to exist independently, we could think of this as an 'actual' or 'virtual' reality for them. Thus, the conventional film score (classed as "non-diegetic") exists solely for the benefit of an audience, outside the story world depicted on screen, and is inaudible to characters. Conversely, what is known as "diegetic" sound - which originates within the story space and is audible to characters (radios, dialogue, footsteps and so on) - contributes to the illusion that the world depicted on screen is a real, self-contained place. Looking at it this way brings into focus questions relating to what music and sound really exist for in games, such as: are music and sound for the player, the game's protagonist or someone passively watching someone else play a game? Indeed, is the player supposed to be in the game, outside it or both at once?

The answer may be that games borrow from both the "filmic" and the "3D gameworld" models simultaneously. Players are neither sealed in a game nor solely watching it. The dichotomy of music and sound supporting the "reality" depicted on the screen, as in film, and the gameworld model in which the participant/player can motivate changes in the soundtrack dulls the distinction. The 'two-way traffic' of information between player and game renders it difficult to identify the boundaries of the gameworld in the way we might a film's story world. Metaphorically speaking, in games, the screen itself ceases to be a barrier between the world of screen characters and that of the audience.

This might seem academic, but the outcome of such a discourse will determine how game audio is conceived and executed in the future. Should sound and music be created together from the outset, or merely brought together in a final mix?  

At this point, it is worth bringing to mind that the elements of a film soundtrack (such as music, sound effects and dialogue) are distinct as a result of how they support the conventions of the film viewing experience and, although why this is so may now be long forgotten by many filmmakers, it is by no means a given outside the realm of film. In directly importing the same approach to the soundtrack of games, we import also the values of film itself -- some of which are welcome, while others merely confuse matters.

The games industry sometimes doesn't grasp the nature of the reality it strives to present to players, which often swings between hot and cold, emotional and clinical, within the same game. And unlike in film, the player's point of view in games can sometimes be determined as often by the player as by the designer. There are "first person" games, in which the player is the centre of the action, and "third person," in which the player's role leans slightly more towards the role of viewer, observing their own actions in-game -- often complete with a 'camera' in pre-selected locations or simply following them around. First person perhaps calls less for music that comments on the player's physical actions, and tends to have more to do with the effect the world is having on the player as a participant as the game unfolds and events take place around them. But such games mostly treat the player as a set of 'ears and eyes' - usually without any kind of perceptual filter between the objective sights and sounds of the 3D landscape and the player.

Film scholars have suggested another class of sound, termed "meta-diegetic", which originates in the minds of characters in films -- such as in dream sequences, memories and hallucinations. But this 'space,' which is neither diegetic nor non-diegetic, is thus far largely unexplored in games -- most probably because the gameworld is still clinically presented as if through a 'camera,' with an emphasis on realism. There have been 'internal objective' sounds, such as heartbeat and breathing in some games, but less in the way of subjective sounds such as memories or thoughts echoing around the mind of the protagonist (who we are often led to believe is 'us' in some way, and not merely someone to watch). This space would further allow the blurring of lines between the player/protagonist's internal state and the objective reality of the 3D gameworld, and might be useful in communicating information helpful in playing the game -- with less reliance on visual cues.

Some games challenge the boundaries of where sound effects begin and music ends. Linear music, for example, still often exists in games for narrative support, reinforcing the idea we belong outside the story world as we play and rendering the player as 'audience' to some degree -- rather than being a part of the game. If games put the player somewhere in between, then the underlying role for sound and music is to support neither a cold realistic 'virtual life' nor an entirely manipulated experience, but something in the middle. When we recognize that the player is both audience and participant in games, we can start thinking of the gameworld as a kind of 'fusion reactor' for sound and music, free of the tyranny of the old-school film model. Put another way, in order for sound and music to support the fact that players exist in a grey area between watching and being in games, sound and music gravitate to one another -- finding some middle-ground between being emotionally resonant (or even 'musical') and merely literal.

On the whole, the games industry still tends to think of music and sound as separate aspects of a production until mixed in-game, largely supporting games as they might a passive viewing experience. This is the paradigm that will change, slowly but surely, as games establish themselves as their own art form. When music and sound work together in games and centre on the player's emotional and/or perceptual inner world to some degree, it forms a unique concept for the soundtrack and does away with the difficulty sound designers and composers currently have in deciding from which space sound and music originate -- causing them instead to consider from the outset how content interrelates and serves a range of experiences. Whether something is diegetic, non-diegetic or strictly adheres to the film model in general may eventually cease to be a major concern as games evolve.

It is now that the notion of looking at games as ranging along a scale rather than occupying disparate camps makes a lot of sense. The yardstick for determining an approach to sound and music for a game could be to calculate where on the continuum between hot and cold, emotional and clinical, it sits. Hot would be games offering a dramatic, filmic experience, catering more for the player-as-audience, and cold would be those leaning towards simulation and virtual reality, attempting to draw the player into the reality of the gameworld as much as possible -- to 'seal them in'. Hot games seek to manipulate the emotions of the player and to deliver a universal 'one size fits all' experience, whereas colder games allow a greater level of control over events, allowing players to form emotional responses to the actions they influence or bring about -- and, in such a situation, non-diegetic music can feel as out of place as 'canned laughter' in a Sit-Com. That's clearly counter-intuitive in a very competitive landscape of media business that, as it does in music and movies, demands quick and simple "handles" on concepts to make them easier to market (often on the basis of being like something else already successful). But, as we are realising all the time, games are not limited to being simply films with multiple possible outcomes (or 'interactive movies').

If we can measure the depth of immersion, so to speak, by deciding where between the poles of audience and participant the player is to exist within a game, we can then consider how to keep them fixed at that 'depth' in order to suspend their disbelief and create an engaging experience. Once that is established, a balanced approach to sound and music (and anything else) can be adopted consistently throughout. Films show us how content supports something leaning towards a 'hotter,' passive viewing experience or, conversely, to imagine being someone 'sealed' in a film -- but these are mutually exclusive states within the film model. In games, where the line is blurred, content which would otherwise support a film very well can result in a paradox as players receive conflicting messages about who they are supposed to be (an inhabitant of the gameworld with music existing in the 'ether' or merely an onlooker being told a story?). Yet, when players are treated as both audience and participant at once, an engaging and coherent playing experience can emerge. Some developers already succeed in this balancing act, but there may need to be industry-wide recognition of how content brings this experience about.

Implementing music written to picture, for 'cutscenes' and so on, presents few new difficulties - but, in the game itself, a lot of 'baked' music can stand out as detached or at odds with the open-endedness of the experience.

For many years, the industry's answer to the problem of integration has been to tackle it on a technological, rather than musical level, which is no surprise when we bring to mind its software industry roots.  An assumption was made that it was mostly playback technology, and not music, which needed to change for games and that getting music as we already think of it to flow took priority over the question of why it was required in the first place or what it conveyed to players. So long as music operated within the framework of an 'imaginary film,' it was considered sufficient for many games. In this way, a lot of music has been conceived of almost as 'film music without a film' in games. Historically, had filmmakers and film composers adopted a similar approach, borrowing only from earlier forms, games would have no music to copy in the first place!

"New inventions often mimic the forms available at the time of their inception. The first automobiles did look like 'horseless carriages'; the first electric light fittings resembled gaslight fixtures; our current computers are a hybrid between the typewriter and television. Similarly, the content of new technological art forms often mimics earlier forms.

Early films were theatrical performances played to an unmoving camera; recordings were souvenirs of performances, trying to capture (in classical music, at least) the acoustic world of the best seat in the concert hall; and early television was radio with pictures. In most cases (classical music being an interesting exception), eventually the form begins to influence the content."  - Music for Interactive Moving Pictures, Stephen Deutsch

Just as early films first attempted to replicate the experience of stage productions on screen, many games will probably continue to be filmic experiences on some level. But if sound and music are to be integrated in games, there may be a widespread feeling that they ought to be at least as interactive and open-ended as the rest of the game. Shouldn't the implicit linear nature of scoring music become more non-linear, like the game itself? Scores tend to be linear and complete within themselves. But does music need to be 'complete' for games? One selling point of games is that players complete them and we wouldn't say that an audience similarly 'completes' a film. Many games are intentionally left open for players wishing to create a narrative for themselves. Inherent in the tools generally in use for music production today (the linear sequencers and tracklaying software) is the idea that music can be entirely composed and rendered before it reaches the point of application. Yet games have the effect of re-ordering or 'triggering' musical segments in unpredictable ways and it is often unclear to what extent games are driven by players, or vice versa.

Although the industry is getting proficient in the field of music production and in making audio 'elastic' enough to cope with the indeterminacy of time it takes to complete tasks in-game, this doesn't help answer the question of what unique experiences music and sound are setting out to support in the first place -- or why they may need to be mutually supportive at times (for reasons other than just being something 'cool' to try out because it is fashionable to talk about blurring the lines between music and sound!).

At the moment, technologists still sometimes control musical (and a lot of other) content going into many games. A film industry analogy would be if the makers of cameras had exclusive rights to determine the content of films.

There is an aphorism that sums up this relationship well: 'Those who control the technology of a new medium control its content as well.' Then there is Professor Deutsch's reciprocal corollary, 'As the technology spreads, the control of its content dissipates.' We need to be aware of how the use of music in games is progressing so that we can proactively guide it through that process. On the other hand, the games industry will have a natural evolution into niches, some of which will be less the domain of the technologists, so a synergy between music and action (and interaction) can also come about organically.

Some day, rather than the elements of a game working against each other, developed independently, perhaps everything will be working towards one vision or design goal, and designers, like architects, will have accurate blueprints before the foundations are laid. Creating a film or a game is not a production line proposition but rather a process of integration that is hopefully as inclusive at the beginning as it is at the end. Maybe the art of games is partly to recognize that those who develop them and those who play them will be looking at them from a range of perspectives beyond even the relationship between filmmaker and audience. Some like it hot. Some like it cold. But understand that Goldilocks is your real audience because they want it "just right" throughout. That happy medium can be achieved if all of the game's elements are developed to provide appropriate flexibility for the player.