Scenes I Love (From Movies I Don't): 'Babel' (2006)

Eight sublime minutes of inspired sound-image interplay nearly save a shitty film from itself.

Nov 01, 2024

Apologies for the long stretch between posts. October was jam-packed: a brief illness, my son’s birthday, a long weekend away with friends, co-organizing a big neighborhood event. Life, in other words. We now return to our irregularly scheduled but still relatively more frequent programming.

I’ll be direct: I am of the opinion that Mexican director Alejandro González Iñárritu, perhaps best known for Amores Perros (2000) and the Oscar-winning Birdman, or The Unexpected Virtue of Ignorance (2014), is a shitty filmmaker. Or more generously, he’s a technically proficient but ultimately shallow filmmaker. He snazzily dresses the windows of storefronts with no inventory.

Iñárritu’s output is characterized by a certain faux profundity. Perhaps more than any other filmmaker of that new millennium “puzzle film” era, he embraced complex structures as a way to cover over his paucity of actual ideas. In each of his first four films, he deployed what critics have called the “network narrative” structure, wherein seemingly unrelated characters and disparate story threads, often across great expanses of space and time, are revealed, whether by fate or chance, to be intertwined.1

That’s not to say Iñárritu doesn’t have his moments, though. If viewed as a soapy melodrama rather than a portentous meditation on grief, 21 Grams (2003) can be a satisfying sit. And for all my gripes about Iñárritu, he managed with Babel (2006) to commit to film one sequence that is positively sublime. It’s a scene I love from a movie I hate, and in a span of less than eight minutes, it nearly redeems an otherwise pretentious turd of a movie. If you’re unfamiliar with the film, here’s the trailer.

Now, on to the good part.

II.

Adolescence is difficult for everyone, especially Chieko (Rinko Kikuchi), who is grieving her mother’s death by suicide while at the same time being roiled by burgeoning sexual desires. She struggles to find an outlet for these longings, as boys who are attracted to her from a distance are put off to learn that she is both deaf and mute.2 But she yearns to be desired, going so far as to make a bold pass at her dentist while in his chair and later spreading her skirted legs, Sharon Stone-style, to provoke some young men who earlier rejected her.

Rinko Kikuchi as Chieko in *Babel* (2006).

The scene in question begins in a Tokyo park. Chieko and her friends meet a group of boys, one of whom is cousin to one of Chieko’s compatriots, also deaf and mute, and thus he is minimally equipped to communicate with the girls. Whatever social barriers separate the teens all but disintegrate, though, when the boys introduce first a bottle of whisky and then tablets of ecstasy into the mix. Here, Chieko and Babel take flight.

Almost immediately, awkwardness gives way to giggles and glances and flirty, teasing touches as the teens cavort in the splash pad and take to the playground equipment. And just as they do, Iñárritu lowers the volume on their world, replacing the diegetic sound with an ambient, droning synthesizer.3

It’s quite the contrast, their spirited play against the airy, languorous tone of the music. But this unconventional use of sound does something even more significant: by removing the din of the city, the thrum of the fountains, and the teenagers’ giddy yelps, the film aligns us with Chieko. Like her, the viewer, too, hears little to nothing of the world.

Seconds later, we see Chieko, eyes closed, head back, rapturous on the swing set. Rather than take in this moment from a distance, the camera is tethered to the swing and ever so close to Chieko. The camera swings with her, and as a result we feel something of her woozy, floating release.

Generally speaking, the point-of-view shot is the favored device for depicting a character’s subjective experience, but Iñárritu achieves a similar end through its aural equivalent, point of audition. We viewers hear as if we were occupying Chieko’s position in space, yet she remains in our field of view. Paradoxically, the sound track is subjective, but our viewpoint is not. The film has placed us on uncertain middle ground.4

But even that peculiar position is not fixed or stable. Take, for instance, when the teenagers, still tripping, enter a disco. Faintly, tinnily, Earth, Wind & Fire’s “September” enters the mix, and it subtly rises in volume as Chieko and friends make their way through the club’s long corridor and up its winding staircase. Is the sound here still subjective? Does it suggest that Chieko’s deafness is not total, that the disco’s booming music is sufficiently loud to be within her threshold of audibility?

No, as it turns out. For as the group spills onto the dancefloor (in a shot that calls to mind Dorothy’s arrival in Oz), we see on Chieko’s face her bewilderment — and perhaps a bit of fear — at the sensory assault of strobing lights and the throng of bodies. Off her glance, the film presents us with the reverse shot (what she sees), and with it, all sound, save a low rumble, is abruptly removed. It’s one of the few instances in which point of view and point of audition coincide, unambiguously from her perspective.

Dorothy crosses the threshold from black-and-white to color in *The Wizard of Oz* (1939).

Befuddled, Chieko scans the room, taking it all in, before locking in to the rhythm and falling into a circle with her friends to dance. The uncertainty fades from her face, replaced with a smile. The revelry resumes. For nearly two minutes, the music remains at full volume as the film cuts freely between Chieko’s glances and the details onto which her gaze latches. There is no sensory deprivation; only abundance.

In these two minutes, Iñárritu lets us believe the earlier abrupt sonic shifts and the sensory distress they signal are behind us. But then Chieko catches a glimpse of her best friend Mitsu (Yuko Murata), deeply kissing the boy she herself had fancied, and with her recognition, diegetic sound is again withdrawn, grounding the viewer in her perspective and, more importantly, in her disappointment. The toggling of diegetic sound jarringly returns, underscoring the young girl’s sense of having been betrayed. The rug has been pulled from beneath Chieko’s feet, and with the replacement of buoyant song with flat, mechanical thrum, we, too, feel the ground beneath us shift.

Chieko waves goodbye to her friend and heads for the exit as the music slowly fades. When next we see her, she is among strangers on the Tokyo streets, coming down from her high(s), and the world of the film is yet again quieted. Chieko trudges slowly along the sidewalk, passing at one point a rock band performing for tips on a street corner. Neither she nor we hear the melody.

III.

This sequence is a movie unto itself, taking us through a full arc from unease to rapture to community to treachery, all in a wordless — but by no means silent — eight minutes. In that the sequence doesn’t rely on dialogue, it would be tempting to say it harkens back to silent cinema, where filmmakers conveyed the emotions and desires of characters without recourse to words. But such an assessment would miss just how central sound is to the sequence, for it’s the interplay of sound and image that moves us in and out of Chieko’s perspective, that puts us in and out of her figurative shoes.

These eight minutes showcase some adroit, compelling filmmaking on the part of Iñárritu. It’s a rare bit of weight in an otherwise shiny, empty movie.

Thanks for reading! If you enjoyed this essay, consider subscribing to Material Ghosts or sharing this post with like-minded friends.

The network narrative was not an invention of this era, however. D. W. Griffith utilized the same conceit back in 1916 with Intolerance, as did Robert Altman in Nashville (1975) and Short Cuts (1993). Hell, one could probably trace it back to The Canterbury Tales, from the late 14th century.

According to the National Center on Disability and Journalism’s style guide, the use of the term “deaf-mute” is problematic. Here I follow the lead of the character, who refers to herself as such in the movie.

In film nerd terms, “diegetic sound” refers to sounds sourced in the fictional world. If a character listens to music in her car, it’s diegetic. If it’s orchestral score meant to, say, heighten emotion, it’s non-diegetic.

The film nerd term for this particular sort of sound-image relationship is “free indirect.” A POV shot would be direct (i.e., directly showing what the character sees), while an objective shot with the character in frame would be indirect. The movement between these seemingly fixed poles is what characterizes the “free indirect” mode.
