[Editors' note: Copy provided by Rockstar.]
Answering the questions are Matthew Smith, Craig Conner and Will Morton, three senior members of the Rockstar North Audio team.
What effect did the shift to more powerful hardware have on in-game audio?
MS: Perhaps the biggest change was the number of environmental effects we can apply, using digital signal processing that was too expensive last-gen. For example, every sound in the game has its own filter, so it can appear muffled, and its own amount and size of reverb - so a car horn coming from a tunnel 100m away sounds very different to a car horn right next to the player, instead of just being quieter as it might have been previously.
It also let us scale up the complexity and number of sounds that play at any one time, letting us flesh out a world as varied and detailed as GTA IV – at times we’re playing thousands of individual sounds simultaneously. Stand in the street and you’ll hear three different radio stations at any one time coming from passing vehicles, all sounding appropriately tinny or boomy, depending on the type of vehicle and whether its doors are open, its windows broken, etc.
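To make the car horn example concrete, here is a minimal sketch of how a per-sound filter and reverb amount might be derived from distance. It is not Rockstar's implementation: the function name, the rolloff curve, the cutoff range and the `enclosed` flag are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentalMix:
    gain: float        # linear volume, 0..1
    cutoff_hz: float   # low-pass filter cutoff; lower means more muffled
    reverb_wet: float  # share of reverberated signal, 0..1


def environmental_mix(distance_m, enclosed=False, max_distance_m=150.0):
    """Map a sound's distance to mix parameters (all constants hypothetical)."""
    # Normalised distance, clamped to 0..1.
    t = min(max(distance_m / max_distance_m, 0.0), 1.0)
    # Simple volume rolloff with an assumed 10 m reference distance.
    gain = 1.0 / (1.0 + distance_m / 10.0)
    # Slide the cutoff logarithmically from 20 kHz (near) to 800 Hz (far).
    cutoff_hz = 20000.0 * (800.0 / 20000.0) ** t
    # More reverb as the sound moves away.
    reverb_wet = 0.1 + 0.5 * t
    if enclosed:
        # An enclosed source such as a tunnel: duller and more reverberant.
        cutoff_hz *= 0.5
        reverb_wet = min(1.0, reverb_wet + 0.3)
    return EnvironmentalMix(gain, cutoff_hz, reverb_wet)


horn_nearby = environmental_mix(1.0)
horn_in_tunnel = environmental_mix(100.0, enclosed=True)
```

With these assumed curves the distant tunnel horn ends up quieter, more muffled and more reverberant than the nearby one, rather than only quieter.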
The ambient sound in GTA IV is one of the key features in making it seem like a believable place. Do you use a generic city sound, or is the background thrum made up of individual sounds from individual, identifiable sources? In other words, if I hear a police siren, does that mean there is a police car in the next street, or is it just generic city noise?
MS: Most of the time what you hear is really happening - what’s so great about GTA IV from an audio team’s perspective is that you hardly ever need to fake anything – the world’s so rich and so busy, that if you make individual things sound right, the whole ambience pretty much just appears by itself. But we do also have a system of ambient sources to make the more distant city feel alive - what makes the two fit together so well is that they use exactly the same sounds, with exactly the same environmental effects, so it’s very hard to tell which is which. The ambient effects change dramatically based on your location in the city, the time of day, and so on – so it’s very dynamic, and obviously not just a pre-canned loop of ‘city ambience’. It was our aim that you could place the player in a random position on the map, shut your eyes and listen, and be able to tell where you are, and what time of day it is, and I think we’ve achieved that pretty well.
Do you know how many individual sound effect files are in GTA IV?
MS: We have around four-and-a-half thousand individual sfx in the game, which are combined into around 19 thousand different combinations. That’s ignoring the insane amount of dialogue, cutscenes and radio content. Rain sounds alone take up around a third of the entire audio budget for San Andreas!
I remember hearing distant gunfire and the noises subtly changing as I got closer to the battle. How do you create the effect of distance on each sound?
MS: Gunfire is a good example of how complex videogame audio has become; a single gunshot typically consists of around 10 individual components, whose volume and placement we control independently with distance, to create a bright, wide, punchy sound up-close, and a more reverberant, muffled sound at a distance. For multiplayer, we balance this so that guns only really sound dangerously beefy when they’re at a distance you can be hit from, so it’s intuitive what’s threatening and what’s someone else’s private war.
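The layered gunshot described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the game's audio code: the layer names, gain values and fade distances are invented, and only four layers are shown where the interview mentions around 10. Placement (panning) is left out; only per-layer volume is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    near_gain: float   # gain at point-blank range
    far_gain: float    # gain at or beyond fade_end_m
    fade_end_m: float  # distance at which the layer reaches far_gain


# Hypothetical layers for one gunshot.
GUNSHOT_LAYERS = [
    Layer("transient_crack", 1.0, 0.0, 60.0),
    Layer("body_punch", 0.9, 0.1, 120.0),
    Layer("mechanical_click", 0.5, 0.0, 15.0),
    Layer("distant_tail", 0.1, 0.8, 200.0),
]


def layer_gains(distance_m, layers=GUNSHOT_LAYERS):
    """Return each layer's gain at the given distance.

    Every layer fades linearly from near_gain to far_gain on its own
    schedule, so the balance between layers changes with distance.
    """
    gains = {}
    for layer in layers:
        t = min(max(distance_m / layer.fade_end_m, 0.0), 1.0)
        gains[layer.name] = layer.near_gain + (layer.far_gain - layer.near_gain) * t
    return gains


up_close = layer_gains(0.0)
far_away = layer_gains(200.0)
```

Up close the bright, punchy layers dominate; at 200m they have faded out and the reverberant tail carries the sound, which mirrors the near and far characters described in the answer.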
How does the team work with the written script? Do changes and edits happen throughout the development process?
WM: Changes can happen all the time, particularly with the pedestrian dialogue. In the case of the peds, each character is given a back-story, his or her behaviors are sorted out, and then using this information a script is written. This can be anything from 20 lines of dialogue (for peds who are used only in very specific situations) up to 300 lines. An average ped will have about 200 lines of dialogue. While the script is being written, actors are cast and studio time is booked.
As we are dealing with hundreds of voices at a time when it comes to the peds, the recordings are done in three or four large sessions, rather than a couple here and a couple there. It’s quite amazing to witness it from start to finish – we’ll have three studios booked for a week and will usually record about 100 peds in that time. Most peds take about an hour to record, so at any one time there are three new actors arriving at the studio and three others leaving. Franceska, the producer of the VO sessions in NY, will be running between the three studios making sure everything runs smoothly, ferrying the talent in (and out) on time, and waving her magic wand at anything else that needs attention. Each studio will have a director from Rockstar Games to take care of getting the right character and performance from the actors, plus Craig and I from the audio team at Rockstar North are on hand to ensure that everything recorded is exactly what is required for the game. As manic as it is, everything runs like clockwork and not a second is wasted.
Most changes will be made to the script during recording. Lines may change for many reasons: they may be too long, or too funny for a serious situation, and sometimes lines change simply because an actor performs them differently. We always get multiple takes of lines, and the directors work hard to give us more choice when it comes to putting the best takes in the game.
Once the peds are recorded, the best takes of each line are selected, and the lines are edited, mastered, and implemented in the game. As there are so many peds in the game, we have to start recording them months before the game is complete. Sometimes new features are put into the game, which means that peds who have already been recorded need more speech to deal with them. The peds continue to grow and evolve until the game is finished.