The Method in our Madness: Data Collection and Analysis for Our Study of Higher Education, Part I
by Wendy Fischman and Howard Gardner
When hearing about our ambitious national study of higher education, colleagues often ask us how we went about carrying out the study and how we will analyze the various kinds of data and draw conclusions. At first blush, the decision to carry out approximately 2000 semi-structured hour-long interviews across ten deliberately disparate campuses, to record and transcribe them, and then to analyze the resulting “data” seems overwhelming—and not just to others! Moreover, when asked for the “hypotheses” being tested, we always reply that we did not have specific hypotheses—at most, we knew what general issues we wanted to probe (e.g. academic life, campus life, and general perspectives on the means and goals of higher education). Additionally, we wanted to discover approaches and programs that seemed promising and to probe them sufficiently so that we could write about them convincingly and—with luck—evocatively.
An Earlier Model
We did not undertake this study entirely free of expectations. Our colleague Richard Light, now a senior adviser to the project, spent decades studying higher education in the United States; he provided much valuable background information, ideas about promising avenues to investigate, and some intriguing questions to include in our interview. Both of us (Wendy and Howard) had devoted over a decade to an empirical study of “good work” across the professions. In that research, planned and carried out with psychologists Mihaly Csikszentmihalyi and William Damon and their colleagues, we had interviewed well over 1200 workers drawn from nine distinct professions. The methods of interviewing—and the lack of guiding hypotheses—were quite similar. Because we were frequently asked about our methodological approach, we prepared a free-standing paper on the “empirical basis” of good work. In addition, reports on our findings yielded ten books and close to 100 articles; moreover, this project led to several other lines of research (see TheGoodProject.org). Our prior work on “good work” served as a reasonable model as we undertook an equally ambitious study of higher education.
In this and the succeeding two blogs, we seek to convey the “method” behind our undertaking.
Part I. The Nuts and Bolts of our Research
Interview Questionnaire
As in the earlier study, we developed an interview questionnaire, which typically takes an hour to administer (we could easily have spent three hours if we or the participants had the time!). Most of the interview consists of open-ended questions, except for two rank-order questions in which we ask participants to prioritize specific items. (We created these “forced choice” questions because clear patterns of preferred responses emerged in pilot work.) The interview questionnaire covers wide-ranging topics in four main sections focused on: 1) goals for the student college experience; 2) academic curriculum; 3) campus life; and 4) broad questions about the value of higher education.
Once the interview questionnaire was firmly established, we adhered quite closely to it for the remainder of the study (and we trained researchers to administer it consistently). When we tweaked the protocol slightly as the study evolved, we did so in ways that did not invalidate the data gleaned from the earlier questionnaires—for example, by adding new questions at the end of the interview. The interview questionnaire is quite similar across constituencies (which are outlined below). For ease of communication, in this trio of blogs we focus almost entirely on our methods with, and responses from, the student population.
Selection of Sites and Participants
Before beginning the study in earnest, we carried out pilot work at two campuses (which eventually became “full sites”). We selected these two initial campuses because they were geographically close to us and because they differed from each other—specifically, a public state university in a rural area, and a private university in the Boston area. At these campuses, we had the luxury of interviewing nearly every participant in person. Eventually, as we selected campuses farther away from our home base, we interviewed some participants in person and others, including students, via Skype (or another platform).
From the beginning, we set out to recruit 2000 participants across seven major constituencies—incoming students, graduating students, faculty, administrators, parents, young alums, and trustees. We aimed for half of the interviews to be with students (approximately 1000) and the rest to be with the other constituencies (approximately 1000). (We also interviewed an eighth constituency—job recruiters—when it was convenient to do so.)
After the initial pilot sites had been selected, we chose the other eight campuses, one at a time. We sought to include campuses that represented different categories (e.g. private/public, large/small, urban/rural, residential/commuter), and those with distinctive cultures (e.g. a special focus on religion, athletics, community service, etc.). At the same time, each of the campuses offers a liberal arts form of education (at the larger universities, we interviewed individuals associated with the schools that offer the traditional liberal arts and sciences curricula). We often refer to the campus selection process as a “chess game”: once one campus had been chosen, we carefully considered what we still needed and wanted before making our next “move.” Obviously, no ten campuses can capture the great variety of the several thousand institutions of higher learning in the United States, but we believe that our ten campuses represent an impressive range.
From the initial pilot schools, we learned a great deal about strategies for recruiting participants for the study. At all schools, we carefully and strategically selected faculty and administrators in order to ensure that we spoke with individuals who were knowledgeable about the school (e.g. those who had been in their positions for more than a few years) and who represented various academic and administrative departments throughout the school. We scoured the schools’ websites and Google for background information on each of these individuals. For most of the other constituency groups (students, parents, young alums), we took a more opportunistic approach to recruiting, including fliers, emails, tabling, and advertisements on social media. Overall, we recruited a “convenience sample,” while checking to make sure that the students we recruited reflected the general demography of each school (e.g. gender, as well as involvement in student government, religious organizations, athletics, Greek life, etc.). When we were undersubscribed with a given constituency, we made extra efforts, usually successful, to recruit subjects from that constituency. As a group, trustees were difficult to recruit and difficult to schedule and re-schedule, but with the help of each school’s president and secretary of the board, we secured robust groups across the campuses. The ease or difficulty of recruiting subjects on a given campus turns out to be quite revealing (perhaps a topic for a different blog!).
Coding
In a nutshell, our coding scheme is divided into two major sections. The first section requires a researcher (or “coder”) to read an entire interview transcript and respond to a variety of questions about “holistic” concepts we have developed—concepts that can’t be inferred unless one has reviewed the entire interview. For example, with respect to each student participant, we ask coders to consider the primary “driver” that individual has for college (i.e. what exactly seems to motivate that student in college) and the value of “liberal arts and sciences” to each participant.
The second section requires coders to think about a participant’s responses to specific questions throughout the interview and to categorize those responses. For example, we ask coders to categorize a participant’s response when asked to recommend a book for a graduating student (e.g. by title, genre, how the participant came to know about the book, and why he/she recommended it).
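For readers who like to think in terms of data structures, the two-part scheme can be imagined roughly as the kind of record sketched below (in Python). The field names and category labels here are placeholders invented for illustration, not our actual codebook.

```python
# A purely illustrative sketch of how one coded interview might be represented.
# Field names and labels are hypothetical, not the study's actual coding scheme.
from dataclasses import dataclass, field

@dataclass
class CodedInterview:
    participant_id: str
    constituency: str        # e.g. "graduating student"
    # Section 1: holistic codes, assigned only after reading the full transcript.
    primary_driver: str      # what seems to motivate the student in college
    liberal_arts_value: str  # the value of "liberal arts and sciences" to the participant
    # Section 2: codes tied to specific interview questions.
    book_recommendation: dict = field(default_factory=dict)

example = CodedInterview(
    participant_id="S-0042",            # hypothetical ID
    constituency="graduating student",
    primary_driver="learning",          # illustrative label
    liberal_arts_value="high",          # illustrative label
    book_recommendation={
        "genre": "nonfiction",
        "how_known": "assigned in a course",
        "why_recommended": "changed how the participant thinks",
    },
)
```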
To ensure “reliability” across coders (i.e. to make sure that independent coders interpret the data in the same way), we have each transcript reviewed twice, by two different coders. The first coder reads the transcript and responds to the holistic section of the coding scheme; the second coder “shadows” the first (i.e. reads the entire transcript independently and then reviews the first coder’s coding to see whether he/she agrees). The second coder notes any disagreements and discusses them with the first coder, with the aim of reaching a decision that satisfies both. If a disagreement remains unresolved after this discussion, the two coders ask a third coder to join the conversation and help settle the matter. In addition, after reviewing the holistic section of the coding scheme, the second coder also completes the second half of the scheme, mainly categorizing a participant’s specific responses to particular interview questions. Because this coding requires straightforward categorization, this section is not systematically reviewed by another coder. However, if at a later point (i.e. when analyzing the categorizations for patterns and themes) researchers come across a categorization that does not make sense, they can correct the mistake. We use Cohen’s Kappa to calculate inter-coder reliability, consistently achieving values above 0.80 (a level typically considered “excellent”).
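For the statistically curious, Cohen’s Kappa compares the agreement two coders actually achieve against the agreement expected by chance, given how often each coder uses each code. The minimal sketch below (in Python, on invented toy labels) illustrates the calculation only; it is not our actual analysis pipeline.

```python
# Minimal illustration of Cohen's Kappa for two coders; labels are invented.
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa for two equal-length lists of categorical codes."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)

    # Observed agreement: proportion of transcripts coded identically.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement, from each coder's marginal distribution of codes.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical "driver" codes assigned independently by two coders.
coder_1 = ["career", "learning", "social", "career", "learning", "career"]
coder_2 = ["career", "learning", "social", "career", "career", "career"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # 0.71 for this toy data
```

In this toy example, the coders agree on five of six transcripts; correcting for chance agreement yields a Kappa of roughly 0.71, which would fall just short of the 0.80 level we report.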
We invest a lot of resources in our coding because it is the crux of our study. For an hour-long interview, we estimate that it takes about three hours to code and shadow (1.5 hours to code and 1.5 hours to shadow). To create an environment in which researchers could focus without distractions, we mandated “coding blitzes”: according to the norms of a blitz, researchers can work anywhere they want for days at a time, but have the responsibility of reaching certain coding goals. During these coding blitzes, we come together as a team once per week to assess progress, check in with each other, and talk about challenges—for example, to make sure we agree on how to code a particular concept or how to handle a dilemma about a particular participant. We record these meetings in case a researcher can’t attend. To say that we take the coding seriously is an understatement—in fact, we often have to stop ourselves from “over-processing” tiny details and remind ourselves to focus on the big picture of what we are finding and what it might mean.
These topics are covered in the next two blogs.
© 2018 Wendy Fischman and Howard Gardner