Audio Recording Methodologies

Tools and methods for recording interviews with enough audio fidelity, integrity and selectivity to support social research may seem relatively mundane or matter-of-fact. And for some research purposes, almost any digital or tape-based audio recorder will suffice. But choices among different methods and technologies can also shape important theoretical priorities and concerns. That’s also true for choices about how a recorded interview will be processed, converted, examined, analyzed or reproduced.

Traditionally, the first step in the post-interview workflow was to make a written transcript of the audio recording, a verbatim text that could then be edited or analyzed. That’s still a very common practice, but recent software and hardware developments make possible some intriguing alternatives. Choices between these alternative “methods” involve matters of personal preference and technical skill, but they also reflect and support different kinds of theorizing about the substance of culture and social life in general and the “content” of interviews in particular.

[Figure: Alternative Audio Analysis Strategies]

Some of these choices are displayed in the figure above as four different strategies for working with an audio recording of an interview. Each strategy appears as a vertical column (labeled at the bottom as A, B, C and D) that starts with the same audio recording “stream.” In column ‘A,’ the strategy involves listening to the recording and making more or less detailed notes with a standard word processing program. The product of this method could be a log of the interview (in which a few details or themes are indexed to sequence or duration), a narrative or thematic summary of the interview, or a verbatim transcript (that might also include some features of a log or summary). When the process is complete, the researcher has in hand a text that can stand in for the audio recording in any subsequent rounds of analysis or reporting.

The second and third columns (‘B’ and ‘C’) suggest two different ways of building computer database functions into the analysis of the same audio recording. In column ‘B,’ a written transcript is prepared, much as it might be in column ‘A,’ but the transcript is then broken into chunks that are imported as individual records in a database program. Each chunk of text, or record, can be coded, annotated, retrieved, and reassembled according to different themes. This “code and retrieve” approach enables a researcher to bring together related comments from the same or different interviews for further analysis. It entails the same kind of conversion from audio to text that takes place in strategy ‘A,’ but the “text” product itself is enriched to include not only a sequential summary or transcript, but also a database of text “chunks” drawn from it.
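
For readers who think more easily in code, here is a minimal sketch of the “code and retrieve” idea in Python. It is only an illustration under stated assumptions: the names TextChunk, split_transcript and retrieve are hypothetical, chunking by paragraph is just one possible rule, and a plain in-memory list stands in for whatever database program a researcher actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TextChunk:
    """One database record: a passage of transcript plus its codes and notes."""
    interview_id: str
    sequence: int          # position of the chunk within its own transcript
    text: str
    codes: set[str] = field(default_factory=set)
    note: str = ""


def split_transcript(interview_id: str, transcript: str) -> list[TextChunk]:
    """Break a transcript into chunks (assumption: one chunk per paragraph)."""
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    return [TextChunk(interview_id, i, p) for i, p in enumerate(paragraphs)]


def retrieve(chunks: list[TextChunk], code: str) -> list[TextChunk]:
    """Bring together every chunk tagged with a code, across all interviews."""
    return [c for c in chunks if code in c.codes]


# Usage: code two chunks from different interviews, then retrieve them together.
chunks = split_transcript("int-01", "We moved here in 1990.\n\nWork was hard to find.")
chunks += split_transcript("int-02", "Jobs were scarce back then.")
chunks[1].codes.add("employment")
chunks[2].codes.add("employment")
related = retrieve(chunks, "employment")
```

Note that the audio itself plays no part here; once the transcript exists, every later step works on text alone.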

In strategy ‘C,’ the same database features appear that were part of strategy ‘B,’ but with an interesting twist. Rather than first converting the audio recording into a text, and then breaking up the text into meaningful chunks, the audio recording itself is broken into chunks, with each chunk then identified with particular themes, questions or issues. Once again each chunk appears as an individual database record, but the records themselves include a section of the audio recording. In contrast to ‘B,’ strategy ‘C’ allows analysis of the interview to be based on the audio recording itself (not a text translation of the interview) and leaves open the option of selective transcription after analysis is concluded. That said, both ‘B’ and ‘C’ split the continuous, coherent “stream” of the audio recording (or its text transcription) into discrete chunks, which may or may not make sense for a particular line of inquiry.
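
A rough sketch of what an audio “chunk” record might look like, again in Python and again only as an illustration. It assumes the interview is an uncompressed WAV file and uses the standard library wave module. The AudioChunk record and the cut_chunk helper are hypothetical names of my own, not features of any particular analysis package.

```python
import io
import wave
from dataclasses import dataclass, field


@dataclass
class AudioChunk:
    """One database record that carries a section of the recording itself."""
    interview_id: str
    start_s: float         # where the chunk began in the source recording
    end_s: float
    audio: bytes           # a standalone WAV file holding only this span
    codes: set = field(default_factory=set)


def cut_chunk(wav_bytes: bytes, interview_id: str,
              start_s: float, end_s: float) -> AudioChunk:
    """Copy the span between start_s and end_s out of a WAV recording."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        src.setpos(int(start_s * rate))                      # jump to the start
        frames = src.readframes(int((end_s - start_s) * rate))
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)      # same channels, sample width and rate
        dst.writeframes(frames)    # frame count in the header is corrected on close
    return AudioChunk(interview_id, start_s, end_s, out.getvalue())
```

Each chunk can then be coded and retrieved much like a text record, but what comes back is a playable piece of the interview rather than a transcription of it.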

In the far right column I have suggested a fourth approach that combines features of the preceding three. Strategy ‘D’ starts with the same audio recording as ‘A,’ ‘B’ and ‘C,’ but preserves that recording intact through subsequent rounds of analysis or transcription. In contrast to the other three approaches, text transcriptions, codes and annotations are attached directly to the audio recording as another “layer” of a digital file. This transforms the audio stream into an audio-text database; text is segmented and indexed to different sections of the audio recording without fragmenting the recording. Working with strategy ‘D,’ a researcher could listen to the entire recording or locate audio segments, within or across interviews, by searching for the code words or summaries assigned to them. This strategy thus preserves all the information of the source audio recording throughout the process of analysis.
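
One way to picture the “layer” in strategy ‘D’ is as a list of time-indexed annotations that point into a recording without ever cutting it. The Python sketch below is an assumption-laden illustration, not a description of any existing tool: Annotation, AnnotatedRecording and search are hypothetical names, and the recording is represented only by a file path that is never modified.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Text and codes indexed to a span of the intact recording."""
    start_s: float
    end_s: float
    text: str = ""
    codes: frozenset = frozenset()


@dataclass
class AnnotatedRecording:
    interview_id: str
    audio_path: str        # the source recording, left whole
    layer: list = field(default_factory=list)

    def annotate(self, start_s, end_s, text="", codes=()):
        """Attach a transcription, summary or code to a span of the audio."""
        self.layer.append(Annotation(start_s, end_s, text, frozenset(codes)))

    def find(self, term):
        """Return annotations whose text contains, or whose codes match, the term."""
        term = term.lower()
        return [a for a in self.layer
                if term in a.text.lower() or term in {c.lower() for c in a.codes}]


def search(recordings, term):
    """Locate matching audio spans within or across interviews."""
    return [(r.interview_id, a.start_s, a.end_s)
            for r in recordings for a in r.find(term)]
```

Because annotations only refer to positions in the audio, they are free to overlap, and a search returns places to listen rather than excerpts detached from their surroundings.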

These four different approaches present somewhat different technical challenges, but they also support different kinds of interview-based studies and different kinds of theorizing about culture and social life. To understand the implications of these contrasts, it’s useful to consider four related distinctions: data “chunks” and “streams”; analytical “annotation” and “coding”; “audio” and “text” representations; and the boundaries between “informants,” “colleagues,” and “audiences.”

Field Recordings as Literature

As a listener, reader, social researcher, and citizen, I’m a great fan of good audio recordings and interviews and of field recordings in particular. There’s something about listening to what people have to say, recorded cleanly and fairly in their own voices, that I find stimulating, entertaining and enlightening. I don’t think of audio interviews as short stories, poems or novels, but as documents that are nevertheless akin to literature and, as such, objects that are worthy of cultural appreciation and critique. Audio recordings can also provide valuable data for social and cultural analysis, either on their own or when converted to written transcripts, which can then be annotated, indexed and coded. That makes them well worth attending to as a medium and method of social research.

As both documentary literature and data, audio recordings provide a distinctive way of depicting the interplay of voice, meaning and situation. Audio recordings allow us to feel that we’re listening to another person, for example, not just “encountering a text.” And in some sense we are, just not at the same time and place in which that person spoke.

Audio recordings enable us to discern deliberation, word choices and self-consciousness (or the lack thereof) in how someone speaks. These are reminders that talk is dynamic, flowing and performed, that one word does not follow another until someone uses her or his voice to make that happen. Audio recordings can also offer a leg up on understanding what people mean by what they say. That’s important to social researchers who want to understand what people think and do, not just the words they use, and it’s also important when we want to document forms of narrative and storytelling that both illustrate and rely on subtleties of the spoken word.

Realizing the special virtues of audio recordings for both literary-documentary and social scientific purposes depends in part on the technical quality of the recording itself. When recording quality is so poor that there’s no audible difference between one voice and another, for example, we can forget that the words are coming from someone in particular. When a conversation is submerged by unwelcome ambient sounds, phrases become unintelligible and we can lose track of ideas and meaning. When an otherwise clear recording is fractured by bursts of static, or precipitous volume swings, we’re distracted from the flow and cadence of what a person says.

Problems such as these frame three key challenges in making “good audio recordings.” First is the challenge of fidelity, or the level of acoustic detail and accuracy provided by the audio recording and how well this corresponds to the original sound source. Second is the challenge of integrity, ensuring that no additional or unwanted sounds are introduced by the recording equipment itself. Third is the challenge of selectivity, or the degree to which recorded sounds are inclusive of what we are interested in and exclusive of everything else.

Thoughtful efforts to address these challenges depend on an appropriate alignment of ideas and purposes with techniques and equipment. If the purpose is to create a set of personal voice memos or a written transcript, for example, trying to achieve broadcast quality standards of fidelity will be wasted effort. On the other hand, for the scientific analysis of audio signals and spectra, a routine practice among ornithologists, even broadcasting equipment can fall short of what’s required.

As these last comments suggest, there’s no such thing as a perfect audio recording, and there are different ideas about what’s good enough. These ideas reflect personal preferences for different kinds of equipment or effects, but they also reflect the fact that people make audio recordings for quite different reasons. Audio recording accounts and guides rarely tease these reasons apart in precisely the way someone would need in order to inform their next project (or understand the shortcomings of their last). However, as described in the attached document (Recording Interviews), some general principles do apply in matching purposes to equipment, equipment to method, and method to methodology.