
Ritsu Mizutani and Vincent Diamante explain how audio shaped the experience of thatgamecompany's latest project.
Posted on September 16, 2019
Thatgamecompany are renowned for emotionally inspiring games. The artistry of their creations has a global appeal. For them, the sound design is the storyteller.
Their latest project imagines worlds in harmony with the music and sound design. In Sky: Children of the Light, players socialize while exploring a kingdom above the clouds. In this interview, Lead Audio Designer Ritsu Mizutani, and Audio Director and Composer Vincent Diamante, explore the ways in which the audio design has influenced the art, ambience, and narrative of Sky.
Can you tell us a bit about your background? How did you end up working on this project?
Ritsu: I’ve been working as a sound designer and sound director in Tokyo for more than 12 years. I jumped into this project 3 years ago because I was a big fan of Journey and Flower, so I wanted to work with this creative team on their new project.

Vincent: I am a music composer and audio designer, living and working in Southern California for the last few decades. I have worked on various projects in a freelance capacity, including thatgamecompany’s previous PS3 game Flower as a music and sound contractor, and I was excited to join the company as in-house audio director.

How many people worked on the audio? How was the team structured?
Ritsu: One audio director/composer, one designer (me), and one programmer.
To engage with people who haven’t played games before, what sorts of things did you have to consider? How did audio play a part?
Ritsu: When designing sound effects for Sky, I paid close attention to the background story in each realm. This includes information that we wanted to hint at for players, and the reasons why specific sounds are highlighted at each specific moment. I tried not to lean on video game clichés during these times.
Sky is unique because it is minimalistic. So regardless of whether players have played another video game or not, it’s important for us to make it possible for players to instantly get a sense of a new object, creature or character through its sounds. The sound design should be the storyteller instead of text.
What were the unique challenges when developing audio for Sky?
Ritsu: Ambience sound design was crafted in such a way that we could immerse players into this fantastical world above the skies, and move away from real life expectations or passively watching the character onscreen.
For example, you can fly and enter clouds in the sky. We know clouds are intangible vapour in reality. But touching clouds is also a dream that all of us had when we were children. I took on the challenge of adding a subtle fizzing sound to give the clouds substance as players fly through them.

Were sounds recorded in-house?
Ritsu: Most of the foley sounds were recorded in our studio, but some of them, including footsteps, water streams and rainfall, were recorded at multiple outdoor locations.
How do you achieve thematic and sonic consistency with multiple people working on the project?
Ritsu: We wrote guideline documentation for sound design and composition to share with the director and team. We updated it whenever the game design or user story was revised, and we referred back to it when we tried new sounds for new expressions.
How did you communicate with the programming team, especially in terms of defining events and parameters?
Ritsu: Our audio programmer has vast knowledge of both low-level audio frameworks and the audio design pipeline. As a result, he has provided us with the best functions, optimized and used in our source code. We also had another programmer who developed many Maya plugins to enable a version in real-time 3D audio.
I could use the full functionality of FMOD without any communication concerns. As for events and parameters, I defined them myself and coordinated things thanks to the easy parameter settings in FMOD. We’re a small team, and all developers had C++ coding skills and access to the source code. The composer and the audio designers were no exception. It was a great asset.
Did you use the live update feature in FMOD? Can you describe the workflow?
Ritsu: Live update and the profiler were essential features for our audio mixing and bug fixing. No matter how much I polished a sound in a DAW, there was no way to judge whether or not the sound had the perfect balance in the game with a lot of concurrent sounds. Once I had implemented a sound effect in the game, I adjusted its volume, pitch, and event parameter settings. Adjusting spatializers, effects, and playlists was also a crucial part of polishing the sound. The real-time live update feature saved a lot of time and iteration.
The profiler was also important for my bug fixing workflow. If I hadn't been able to solo and mute each event in real time, I would have had to spend more time finding the exact event with an issue.
Does FMOD Studio help with collaboration? Do any team members work remotely on the FMOD project?
Ritsu: In this project, the composer and the audio designer were working in different locations using a common project file. Thanks to source control it worked well in most cases.

Thatgamecompany games are renowned for their music. Can you describe your approach to this aspect of the experience?
Vincent: As music development happens very early in the game development cycle, music is often tasked with helping to form the larger game design ethos. Even before a narrative, game loop, world area, or character design is crafted, music is evaluated and reflected on with those ideas in mind. This effect can manifest as something like artists listening to newly composed musical tracks as inspiration for new architectural structures, or as direct a connection as level designers specifically using the timing of musical movements to determine the length of sections of a new level.
In what ways does the music adapt to what the player is doing? Can you share any details on how this was done?
Vincent: Music does a lot of obvious and subtle adaptive moves throughout the game. The more obvious ones vary a lot over the course of the game and include different types of layering, branching points, entrances, and exits for different cinematic beats. Not every music cue utilizes all of these, but as the player dives deeper into the game, the music uses more and more of these techniques in larger pieces that have a lot more to offer. The more subtle adaptive moves use layering and DSP effects driven by things like level and player state (standing vs. running vs. flying, position, altitude, orientation). While there are some simpler music stingers happening, there are quite a few music events featuring many layers of tracks and extremely large numbers of markers that are exported along with the WAV files and help dictate looping and transition regions for the various FMOD parameters. Reaper's robust marker functionality and a few FMOD scripts written by TGC programmer Yang Liu really flex their muscle here.
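The subtle, state-driven layering described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the layer names (`pad`, `rhythm`, `high_strings`), the gain values, and the idea of a normalized altitude are hypothetical and are not taken from Sky's music events.

```python
# Hypothetical mapping from player state to per-layer gains (0.0 to 1.0).
# Layer names and numbers are illustrative assumptions only.
MOVEMENT_INTENSITY = {"standing": 0.0, "running": 0.5, "flying": 1.0}


def layer_gains(movement: str, altitude_norm: float) -> dict:
    """Return a gain for each music layer given movement state and altitude.

    altitude_norm is the player's altitude scaled to the 0.0..1.0 range.
    """
    intensity = MOVEMENT_INTENSITY[movement]
    altitude_norm = max(0.0, min(1.0, altitude_norm))
    return {
        "pad": 1.0,                                # always audible
        "rhythm": intensity,                       # fades in as the player moves faster
        "high_strings": intensity * altitude_norm, # only while moving high up
    }


if __name__ == "__main__":
    print(layer_gains("flying", 0.5))
```

In a real project, values like these would more likely be sent to the audio middleware as event parameters, with the layer curves authored by the composer rather than hard-coded.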
Which sound was the most challenging to implement?
Ritsu: Avatar navigation is ergonomically optimized for touch screen control, so we needed physical-momentum-driven playback instead of animation-based triggers. To play footsteps in this game, many physics calculations are carried out. FMOD has a deep affinity with this system since it can handle a lot of parameters at the same time. Additionally, it’s very easy to blend or transition between multiple events in accordance with the status of parameters. Depending on the state of the avatar (altitude, velocity, wetness and rain intensity), footstep sounds vary in texture. But I didn’t need to have separate events for them. The status of footsteps is managed in a single event.
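To make the single-event idea concrete, here is a minimal Python sketch of how avatar state might be normalized into one parameter set for a single footstep event. The four parameters come from the answer above. The `AvatarState` class, the units, and the `max_speed` and `max_altitude` ranges are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AvatarState:
    altitude: float        # assumed to be in metres
    velocity: float        # assumed to be in metres per second
    wetness: float         # 0.0 (dry) to 1.0 (soaked)
    rain_intensity: float  # 0.0 (no rain) to 1.0 (downpour)


def clamp01(value: float) -> float:
    """Clamp a value to the 0.0..1.0 range."""
    return max(0.0, min(1.0, value))


def footstep_parameters(state: AvatarState,
                        max_speed: float = 8.0,
                        max_altitude: float = 100.0) -> dict:
    """Normalize avatar state into parameters for one footstep event."""
    return {
        "altitude": clamp01(state.altitude / max_altitude),
        "velocity": clamp01(state.velocity / max_speed),
        "wetness": clamp01(state.wetness),
        "rain_intensity": clamp01(state.rain_intensity),
    }


if __name__ == "__main__":
    print(footstep_parameters(AvatarState(50.0, 4.0, 0.25, 0.0)))
```

Each key would correspond to a parameter on the one footstep event, so the blending between textures stays inside the audio tool instead of being spread across several events.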

Was there a standout feature of FMOD for implementation?
Ritsu: Timeline transitions are a useful feature for adaptive audio design, not only for music composition but also for sound effects. The ambience in the first level, the one players visit first, adapts to the time of day. The visual character of the level changes with real-world time, so it needed a lot of audio resources. On the other hand, because it is the first level, it had a strict asset size limit, and we couldn’t move the resources to the additional download.
Transition regions made it possible for only one ambience event to adapt to different visuals without having many looping sound assets. In this event, the basic looping sound is common regardless of world time, but it changes its acoustic character by using parameter control. Additional scattering sounds such as bird chirps come in and out according to the time. It would have been impossible to do without timeline transitions.
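As a rough sketch of the game-side half of this setup, the Python below maps a clock time to a normalized parameter that a single ambience event could read, and decides whether a scattered sound such as bird chirps should be active. The function names and the `dawn` and `dusk` hours are hypothetical. How Sky actually feeds world time into its event is not described in the interview.

```python
def time_of_day_parameter(hour: int, minute: int = 0) -> float:
    """Map a 24-hour clock time to a parameter in the 0.0..1.0 range."""
    minutes_since_midnight = (hour % 24) * 60 + minute
    return minutes_since_midnight / (24 * 60)


def birds_active(hour: int, dawn: int = 5, dusk: int = 19) -> bool:
    """Assumed rule: bird chirps are scattered in only between dawn and dusk."""
    return dawn <= hour % 24 < dusk


if __name__ == "__main__":
    print(time_of_day_parameter(12), birds_active(12))
```

With a value like this driving one event, the looping bed stays the same asset all day while the parameter reshapes its character, which is what keeps the asset size small.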

How did you go about balancing the overall mix? How does multiplayer factor into this?
Ritsu: Considering the typical listening environment for the customer, I used the Sony MDR-CD900ST, Apple EarPods, Audio-Technica’s consumer earbuds and internal device speakers for the mix during the regular audio design process. Large speakers were used only occasionally, for low-end and high-end testing. At the end of the development period, we went to a post-production studio for the final mix using large monitor speakers.
To provide the best multiplayer experience, the music is synchronized between players who join the same game session at different times.
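One common way to get this kind of synchronization is to derive the playback position from a clock that all players share. The Python sketch below assumes a session start timestamp known to every client and a music loop of fixed length. Both are assumptions for illustration, not a description of how Sky does it.

```python
def synced_music_position_ms(session_start_ms: int,
                             now_ms: int,
                             loop_length_ms: int) -> int:
    """Return where in the music loop a client should start playback.

    All clients compute this from the same shared timestamps, so a player
    who joins late lands on the same position as everyone already there.
    """
    if loop_length_ms <= 0:
        raise ValueError("loop_length_ms must be positive")
    elapsed_ms = max(0, now_ms - session_start_ms)
    return elapsed_ms % loop_length_ms


if __name__ == "__main__":
    # A player joining 250 seconds into a session with a 120-second loop.
    print(synced_music_position_ms(1_000, 251_000, 120_000))
```

The returned offset would then be used as the starting timeline position for the music event on the joining client.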
Depending on the type of device, bus volumes and the properties of some effects including EQs and reverb have different settings applied by snapshots.

Which technical detail are you most proud of?
Ritsu: The tone of many sound effects is pitch-shifted in real time according to the music key. This avoids the subconscious discomfort players would feel from discord between the music and the sound effects.
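The interview does not say how the shift is computed, so the following Python is only one plausible sketch. It finds the smallest semitone shift that moves a sound effect's root note onto a major scale in the current music key, then converts that shift to a playback ratio. The use of MIDI note numbers and a major scale are assumptions.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone steps above the key root


def semitone_shift_to_key(sfx_note: int, key_root: int,
                          scale=MAJOR_SCALE) -> int:
    """Smallest shift (in semitones) that puts a MIDI note on the scale.

    Ties are resolved upward, so a note one semitone from two scale tones
    is shifted up rather than down.
    """
    allowed = {(key_root + step) % 12 for step in scale}
    for distance in range(0, 7):
        for shift in (distance, -distance):
            if (sfx_note + shift) % 12 in allowed:
                return shift
    raise ValueError("scale contains no notes")


def pitch_ratio(semitones: float) -> float:
    """Convert a semitone shift to a playback-rate ratio (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    shift = semitone_shift_to_key(61, 60)  # C#4 effect while the music is in C major
    print(shift, pitch_ratio(shift))
```

The ratio could be applied through whatever pitch control the audio engine exposes for an event, and recomputed each time the music changes key.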
Are there any subtle sonic details that you felt had a big impact on the feeling of the game?
Ritsu: As I mentioned above, my goal for the sound of avatar navigation was to emphasize the sense of unity between players and avatars. Footsteps make sound on impact but also when the sole lifts from the surface. You can hear very subtle sounds as your foot kicks and pushes the ground. That’s the boon of the physics-based trigger system. Please try walking and running on the various surfaces at various speeds!
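The two sounds per step can be pictured as two transitions of a foot-contact signal. The Python sketch below is a simplified, hypothetical stand-in for the physics-based trigger system: it takes one boolean per frame saying whether the foot touches the ground and reports an `impact` when contact begins and a `lift` when it ends.

```python
def footstep_triggers(contact_samples):
    """Return (frame_index, kind) pairs for each change in foot contact.

    contact_samples is a sequence of booleans, one per physics frame,
    where True means the sole is touching the surface.
    """
    events = []
    if not contact_samples:
        return events
    previous = contact_samples[0]
    for index, touching in enumerate(contact_samples[1:], start=1):
        if touching and not previous:
            events.append((index, "impact"))  # foot lands on the surface
        elif previous and not touching:
            events.append((index, "lift"))    # sole leaves the surface
        previous = touching
    return events


if __name__ == "__main__":
    print(footstep_triggers([False, True, True, False, True]))
```

An animation-based system would normally fire only at authored keyframes, whereas a contact-driven approach like this follows whatever the physics is actually doing at any speed.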

Sky uses FMOD for adaptive audio. Learn more about the workflow and get creative.