When I read paper abstracts, I have noticed that two kinds of opening discourage me from reading further: a) those that "present a novel data set" and b) those that present an experiment or quasi-experiment as their main claim to originality.
My first impulse was to think that I'm biased against empiricists for academic tribal reasons. But then I tried to recall what had happened in the cases where I had kept reading: in general, my initial reaction had rarely been disproved.
I discussed this the other day with my crazy empiricist office mate. My problem with the "credibility revolution" in applied social science, which I prefer to call "the data-centric obsession", is not just that I find Rubin-style identification an incomplete account of causality. My biggest beef is with the academic incentives it induces.
Consider the emphasis on data. For a PhD student, one way of finding a dissertation topic seems too often to be the following:

1) Find a topic where the questions face measurement issues due to a lack of public data.
2) Spend a large part of your PhD source-mining and "collecting data".
3) Construct an index or a database.
4) Push the Stata button until you get the right number of stars and claim that you have provided an improved answer.
5) Brand yourself on the market as an expert in the field.

Naturally, there is nothing wrong with this in the abstract. Data collection is a fundamental part of modern research, and someone ought to do it. The problem is that it induces bad incentives: it means that you can go on the market and get published doing what is essentially a form of extended RA'ing. It provides no incentive to acquire methodologically sophisticated or substantively rich training; that will probably be crowded out by the time you spend collecting data. Moreover, the exercise is plagued with all sorts of problems, since measurement and index building are usually more involved than classifying items on 1-to-4 scales, aggregating, and taking the average.
A similar problem arises with the "experimentalist" zeitgeist. It seems to me that many research questions are increasingly motivated not by their intrinsic interest but by the availability of some quasi-experiment. The mindset induced by Rubin-style identification, as detailed in the first chapter of Angrist and Pischke's book, is to keep thinking about what kind of experiment could answer a given question. The effect is perverse, however, because the pressure to publish pushes you down the inverse road: find "natural" experiments and then think about what questions could be answered with them. This naturally results in an overabundance of work whose only interest is the cleverness of the experimental design, but whose relevance, or connection to any relevant question, is far from clear.
An obvious counter-criticism is to ask how such lame work manages to find its way through. Shouldn't someone show up at a seminar and point out what I'm pointing out here? The fact is that this does not seem to happen, and I have nothing more than a tentative answer for why.
In my view, it has to do with the fact that academia is a peer-monitored organization. In the case of (bad) data collection papers, issues related to measurement are typically boring. They are relegated to appendices, and no one really has an incentive to monitor them seriously. The problem is similar in formal theory: no one really goes through the algebra in detail, but it is in principle feasible to do so, and such errors are sometimes actually detected. If discussing the algebra of a proof is almost unthinkable in a seminar, going into the details of data collection, measurement, and aggregation is not only hard to imagine but probably intrinsically infeasible.
Something different happens with the experimentalist crowd. As I was saying, I feel we have reached a point where many papers are evaluated on the cleverness and originality of the research design ("Using the World Cup qualifiers as an instrument for patriotism!? Wow! How cool/crazy is that! I wish I had had that idea"). The sexiness of the identification strategy has too often become a goal in itself. When your peers monitor you by paying more attention to the originality of the identification strategy than to the research question, you have an incentive to mine reality for ever crazier discontinuities. It is true that methodologists have been criticized in the past for analogous reasons, such as being driven to increase mathematical complexity without any clear benefit. But if you work in pure formal theory or statistical theory, your work is not meant to answer questions about the real world immediately; it is meant to serve other researchers in their quest. The same cannot, in general, be said of applied CI work.
I would not like to seem too harsh. There is excellent work out there that marries carefully crafted theoretical hypotheses with clever identification strategies, and, of course, I think data collection is absolutely fundamental. What I'm highly critical of is the naive empiricist ethos that so often lies at the root of "data-centric" applied work. One problem, of course, is that your conclusions can only be as precise as your theoretical assumptions. But what concerns me here is the set of academic incentives it induces. I've always considered myself a philosophical (and epistemological) pragmatist, and any account of a method should treat it as a norm structuring a social group: academia, in this case.