What if the rich could not escape democracy

My reflex as a political economist is to look at democracy mainly as a device for solving public goods and distributional problems. For class, I read a small book by Gabriel Zucman, The Hidden Wealth of Nations, which tells the story of tax havens. It gathers data and suggests realistic measures, although one can feel that the author would eventually like to invade Luxembourg and Switzerland. It is very well written and I strongly recommend it.

The ability to tax is at the heart of democracy. Without it, there is little purpose left for democracy: we would have no real state capacity, or no state for that matter, and thus not much to decide about. So my natural inclination on reading Zucman's evidence was to think that better tax capacity is necessary for democracy. If the rich can opt out of democracy, democratic decisions are constrained to those that do not harm the rich, and in very unequal societies that makes democracy pointless.

If you are familiar with the literature on democratization (Acemoglu & Robinson, Boix, partly Przeworski), this may ring some bells. In this research, one key problem to solve is how the poor can credibly commit not to 'abuse' their power under democracy. If we operate under the tyranny of the majority, democracy will result in bad policy and social conflict. Historically, this is at the heart of the 'class compromise' idea of social democracy. In Boix's account, capital mobility plays a key role: if the rich can take their assets and leave, then the power to expropriate them is limited. As a result, a low-tax, stay-at-home equilibrium is feasible, as is a high-tax, capital-flight one, depending on parameter values. Dictatorship occurs when the rich hold their wealth in assets that can easily be taxed, like land.

The provocative conclusion seemed hard to escape: perhaps tax havens are good for democracy. I like to think about this as an exit-voice-arms idea. In the democratization literature, the dilemma is between exit and arms. The idea is similar to the 'make the world safe for (former) dictators' argument and to the problem posed by universal transitional justice.

What is left unexplored, and what I find more interesting, is the voice dimension. If the rich could not escape domestic taxation, then perhaps they would become more involved in domestic politics. They may lobby for looser regulation of money in politics that would give them a stronger voice. They may contribute more heavily to political campaigns. They may run for office. Perhaps this involvement would not go further than fighting against taxation. In that case, going after tax havens would do no harm, since this wealth is currently not taxed anyway. However, getting the rich involved in domestic politics may have other negative consequences. They may become allies for other segments of society (the upper middle class, say), shifting the balance of influence. Or, if participation in politics has a fixed cost (setting up a lobbying operation, building a political network), participation in tax politics may extend to other areas.

All of the above is highly speculative. My professor asked me, 'So, you think that if we attack tax havens, then democracy will suffer?' I was reluctant to push my argument that far and say yes. But if we believe, like many in the literature, that asset mobility affects democratization in a quantitatively relevant way, we cannot ignore that tax havens are just a form of asset mobility, so they must have some effect. And the same is true for the more generic dimensions of participation. What seems totally unlikely to me is that the rich will just wait and see while they are taxed.


Audible in NC

I started Amazon Audible a while ago. My experience so far has been pretty positive. For certain books, it works really well. These are mainly non-fiction but non-technical books, to which I can pay attention intermittently. As a result, I have started to 'read' history again. I listened to the Oxford History of the United States (volumes II and III) this summer, and to Lindert and Williamson's great book on the economic history of the US.

I have been consistently unable to purchase certain books. For example, The Curse of Cash and Eric Foner's Reconstruction show up as not available in my area. This sounded weird to me, since my Amazon account is registered in the United States, but I talked to their customer service this morning and this seems indeed to be the case. I was told that certain books have restrictions even within the United States.

I couldn't help but speculate* about the reasons why people in North Carolina are prevented from buying these books. Perhaps it is some form of boycott over the bathroom bill.

*Note: I tend to think the person in customer service just told me this because she did not know how to fix it.


My experience being a terrible writer in academia

I have recently become aware of a very unpleasant fact: I am a terrible writer. At least, I'm a terrible academic writer. This awareness has built up progressively, but it reached a peak last weekend, when I had to write the final draft of an essay for class.

I have been aware of this weakness for a while. I just find it difficult to be succinct and synthetic (I guess this is obvious from my writing style on this blog), to avoid grammar mistakes, and to keep my sentences readable. I face this 'far too long sentence' bias even when writing emails. For me, it is just much more natural to write in French or Spanish.

I also think that most successful research is, at least 30 percent of it, about framing, about making it attractive and selling it properly. I recently attended a talk at a workshop. The speaker did not do anything technically spectacular; most of the contribution was just the data collection and a small conjoint experiment. But I came out of the talk feeling that it was, by far, one of the best talks I had seen in the last year and a half. Why? The framing, the structure of the talk, and the speaking skills were great (it did not hurt that the topic was of substantial interest to me, of course).

Becoming aware of these two facts simultaneously (the importance of writing well and what a terrible writer I am) is of course very depressing. My girlfriend said she thought, from reading my blog, that I am not that bad a writer (I think she is sincere on this one; she has a rather tough-love style of communicating these things in general). But that was hardly a reason to feel better, since a) she is also a non-native speaker and finds my transplanted Spanish structures much more natural, and b) the tone and structure of a blog are very different from those of an academic paper. One of the areas I have problems with is controlling how formal I am being, how many words I use to communicate a single idea, and so on. Here, you see, I have now been writing for 20 minutes without really saying anything.

I guess I should practice. That is what people suggest: you should practice. But in grad school, at least if you do the kind of work I do, there are fewer opportunities for writing than you may think. My workflow involves writing only at the very end, which arguably is also a problem I should correct. Consider this back-of-the-envelope calculation. Say that I take three classes per semester and have, at most, two writing assignments per class. For each paper, until now, I have started writing up the results at most five days before the deadline (yeah, working under stress and so on). That means that in two years I will have gone through, at the very best, 24 writing pieces, with an average of three days of writing per piece, which makes an overly optimistic upper bound of 72 days, that is, about two and a half months out of 24, in the period of my academic life in which I am most likely to be writing most. I believe, however, that given that many of my classes have been methods/statistics classes, and that two essays per class is a much too optimistic prospect, the real figure is probably about half of that. Given that, as I was saying, I believe that at the very least your presentation skills (not just writing, but storytelling, thinking hard about your idea) account for about 30% of what makes your research quotable, this seems very little time for a non-native speaker to catch up.
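Just to spell out the arithmetic (a trivial sketch in code; the numbers are the rough guesses above, not real data):

```python
# Back-of-the-envelope: days spent writing during two years of coursework.
classes_per_semester = 3
essays_per_class = 2            # already an optimistic upper bound
semesters = 4                   # two years
days_writing_per_essay = 3

pieces = classes_per_semester * essays_per_class * semesters    # 24 pieces
upper_bound_days = pieces * days_writing_per_essay              # 72 days, roughly 2.5 months
realistic_days = upper_bound_days / 2                           # methods classes, fewer essays

print(pieces, upper_bound_days, realistic_days)                 # 24 72 36.0
```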

I would like to take advantage of any writing opportunity to try to do it well. This means editing my writing as much as possible, trying to pay attention to how I write, whether it is easy to read, and so on. And, above all, trying to be more synthetic. I used the essay last weekend to start doing this. The result was horrible. I finished my first draft on Friday and edited it every day until the deadline on Wednesday. Every single day I felt I had to do minor to medium editing. While analyzing my writing and trying to pay attention to what I do right and wrong, it was particularly painful to discover that I was falling into the very same vices and biases that I usually criticize. It took an awfully large amount of time. Eventually, I missed a deadline for another class that was completely crowded out of my attention.

I have a similar problem with reading speed. This is particularly obvious when I read books in French or Spanish and my speed doubles or triples, and this in spite of the fact that I hardly read anything not written in English these days.

Moreover, there seems to be little hope for improvement and catching up. I have reached a stage where I am completely blind to my mistakes and left on my own. People are polite and do not correct you, so I am not aware of my mistakes. And when they do, it takes the form of a general comment about my writing or language skills, which does not help me improve any particular point.

I am determined to start using this blog to practice my writing, and to do it more often; this post is a form of commitment. I am reading a lot these days, so I may just post some comments on pieces I read or perhaps report some data results. Hopefully something will come of it.


Empirical and formal models are just models

One of the most stupid divides in the social sciences is that between those who put the emphasis on empirical issues and those who put it on theory. I know, I know that this is a discussion that has largely been overcome, that 'big data', causal identification, and formal modeling are to be seen as complementary, and so on.

However, it is a division that can be found in seminars, classrooms and, to some extent, fields: some people are more inclined toward elegant formal models and theories, while others emphasize data and empirical work. I tend to believe this comes down to decisions to acquire one set of skills over another. After all, you may decide to take your classes in the computer science, statistics, or economics department and, given that your time is limited and you may like one more than another, your appreciation of each is likely to depend on your skill endowment.

As I said, I believe this is a false dilemma, of course. But the most stupid instance of it is the criticism of the 'unrealistic' assumptions of formal models alone. I learned to live in peace with the lack of realism of assumptions long ago, when I understood that I had learned the most from 'toy models': extreme simplifications of reality, yet simplifications that can teach you something, to the extent that that something does not depend directly on the lack of realism.

I've always found it incredibly hard to understand how harsh people tend to be on the assumptions of, say, rationality, while they seem to be perfectly fine with most standard linear models. I saw the light the day I understood that everything that can be said about formal models can be said about empirical models. Both are simplifications. Both are about making things tractable. If anything, regression models, to the extent that they assume exogeneity, linearity, and so on, are even harder to believe!

What is a formal model, after all, if not a tractable way of keeping track of how parameters (typically, behavioral parameters) relate to each other? In this spirit, I like Thomas Sargent's definition of a model as a restriction on the data generating process. A model tells you how variables are related. Is the relation linear? Is one variable affected by another but not the other way around? A totally unrestricted model would, if anything, be a gigantic correlation matrix, which would be extremely limited.
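A minimal illustration of the point (my own toy example, not Sargent's): an 'unrestricted' description of three variables is just their correlation matrix, while a regression restricts y to be a linear function of x1 and x2, and not the other way around.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)

# "Unrestricted" summary: the correlation matrix only says how variables co-move.
print(np.corrcoef(np.column_stack([y, x1, x2]), rowvar=False))

# A restricted model: y is a linear function of x1 and x2 plus noise.
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # the restrictions (linearity, direction) are what make it interpretable
```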

No one disputes this, of course. People regularly type the Stata command to run a regression with their right hand while their left hand expresses outrage at the lack of realism of behaviorally grounded models of how one variable affects another. For some reason, they are often happy to use higher-order polynomials as a functional form, but extremely averse to giving any behavioral meaning to their parameters.


Causal inference is a decision sport

I spent last week at a workshop in Chicago on the topic of causal inference. It was a really great experience and I learned a lot, especially in terms of putting my ideas in order. There is, however, something that constantly made me feel uneasy.

Standard errors were one of the big topics people talked about. People have strong opinions about standard errors. Josh Angrist, for example, suggested that one good reason to prefer regression to matching strategies is that standard errors are easier to obtain and interpret. In the frequentist worldview that is standard in the causal inference community, this makes a lot of sense: you definitely want to know whether the effects you are estimating are just noise created by a small sample size or something real. That is, you want some sense of how the uncertainty derived from the sampling process affects your estimates.

Why are we interested in uncertainty? The way I was first taught statistical inference was as a statistical decision problem: a game played between the statistician and nature. This is true under either a Bayesian or a frequentist paradigm; only the kind of risk you try to minimize changes (informed or uninformed by a prior). Coming from economics, it was pretty natural to think of the mean as the estimator derived under a quadratic loss function. Still, there is nothing special about quadratic loss: under absolute loss, for instance, the median is the optimal estimator. In this decision-theoretic framework, uncertainty makes a whole lot of sense because you want to understand how much you should trust your estimate.
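A quick numerical check of that claim (a toy sketch, nothing more): minimizing average squared loss over a sample picks out the mean, while minimizing average absolute loss picks out the median.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
y = rng.lognormal(size=500)   # a skewed sample, so the mean and the median differ

# Best single guess under quadratic loss: argmin_c mean((y - c)^2)
c_quad = minimize_scalar(lambda c: np.mean((y - c) ** 2)).x
# Best single guess under absolute loss: argmin_c mean(|y - c|)
c_abs = minimize_scalar(lambda c: np.mean(np.abs(y - c))).x

print(c_quad, y.mean())      # matches the sample mean
print(c_abs, np.median(y))   # matches the sample median
```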

From the point of view of the advancement of science, and particularly from a Bayesian point of view, it seemed very relevant to me to evaluate discoveries by a) how big they are and b) how certain we can be about them. This is even more important in the case of causal inference, which is framed explicitly in the language of policy and treatment analysis: the textbook case of the potential outcomes framework is the analysis of a drug or a policy. If you want to evaluate a labor market program, you want to know whether it is worth the cost, and that means we should be able to present the policy maker with an estimate of the effect and how sure we are about it.

Although I would rather see all this talk be about posterior distributions of parameters, worrying about statistical inference is certainly good. Yet what left me scratching my head was the total absence of discussion of the uncertainty coming from identification assumptions, i.e. sensitivity analysis. The discussion happens in two phases: first there is some totally informal discussion of the assumptions, then, conditional on the assumptions being true, the estimation occurs. Conclusions eventually follow from this last step, often recommending one treatment over another.

In practice, identification assumptions are not simply believed or rejected; the truth is probably somewhere in between. Think of the popular RDD that uses close elections to estimate the effect of holding office. For the identification strategy to be valid, we need close losers and close winners to be similar in every other respect. Yet, as critics have pointed out, it may be that close losers systematically lose close elections because close winners can manipulate the results (for example, they have some kind of connection with other politicians) and are thus not comparable (this matters if you try to estimate the returns to holding office and you think career politicians tend to win and businessmen tend to lose: their performance in office and in the private sector is not comparable). These are the two extreme cases: valid or invalid.

Should we choose? No! The truth is probably somewhere in the middle, and that is how we should estimate. In particular, we can formulate our beliefs as a probability distribution encompassing all the intermediate scenarios. From this distribution we can derive a measure of uncertainty that should be combined with the uncertainty derived from the estimation to understand the final estimate. Discussing them jointly is particularly important since, in practice, there is a tradeoff between the credibility and the precision of inference: assuming more decreases credibility but increases the effective sample size.
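Here is a minimal sketch of what I have in mind (the numbers and the bias parameter are entirely hypothetical): instead of reporting only the sampling interval computed under perfect validity, put a distribution over how badly the assumption fails and propagate it into the reported interval.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical output of the 'assume the design is valid' analysis.
beta_hat, se = 2.0, 0.6

# Beliefs about the bias induced by a failing identification assumption:
# 0 means the assumption holds exactly; larger values mean more sorting of close winners.
bias_draws = rng.uniform(0.0, 1.5, size=10_000)

# Combine sampling uncertainty with identification uncertainty.
sampling_draws = rng.normal(beta_hat, se, size=10_000)
effect_draws = sampling_draws - bias_draws

lo, hi = np.percentile(effect_draws, [2.5, 97.5])
print(f"interval reflecting both sources of uncertainty: [{lo:.2f}, {hi:.2f}]")
```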

Charles Manski has spent a large part of his career recommending this approach: bounds, or identification sets, should be derived first, and then progressively narrowed by making ever stronger assumptions, thus making the tradeoff between credibility and precision explicit. Coming back from the workshop, I re-read a JEP piece from some years ago by Ed Leamer in which he makes a case very close to mine:

The range of models economists are willing to explore creates ambiguity in the inferences that can properly be drawn from our data, and I have been recommending mathematical methods of sensitivity analysis that are intended to determine the limits of that ambiguity.

[extreme bounds analysis] It is a solution to a clearly and precisely defined sensitivity question, which is to determine the range of estimates that the data could support given a precisely defined range of assumptions about the prior distribution. It's a correspondence between the assumption space and the estimation space. Incidentally, if you could see the wisdom in finding the range of estimates that the data allow, I would work to provide tools that identify the range of t-values, a more important measure of the fragility of the inferences.
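To make the extreme-bounds idea concrete, here is a minimal sketch (simulated data and arbitrary variable names, not Leamer's code): estimate the coefficient of interest under every subset of the 'doubtful' controls and report the range of estimates and t-values.

```python
import itertools
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 500
treatment = rng.normal(size=n)
controls = rng.normal(size=(n, 4))   # the 'doubtful' covariates
y = 1.0 + 0.8 * treatment + controls @ np.array([0.5, -0.2, 0.0, 0.3]) + rng.normal(size=n)

estimates, tvalues = [], []
for k in range(controls.shape[1] + 1):
    for subset in itertools.combinations(range(controls.shape[1]), k):
        X = sm.add_constant(np.column_stack([treatment] + [controls[:, j] for j in subset]))
        fit = sm.OLS(y, X).fit()
        estimates.append(fit.params[1])   # coefficient on the treatment variable
        tvalues.append(fit.tvalues[1])

print("range of estimates:", min(estimates), max(estimates))
print("range of t-values: ", min(tvalues), max(tvalues))
```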

What is said here about identification assumptions seems to me equally valid for other phases of the data analysis process, in particular preprocessing. Preprocessing involves assumptions and choices, and the uncertainty coming from them is typically ignored.

The bottom line is that if we look at data analysis from the perspective of a decision problem, every single step involves a choice, and every choice involves uncertainty that is relevant. This is particularly true of causal inference, since we are trying to choose between alternative scientifically grounded relationships. This uncertainty should be somehow quantifiable, and reflected, as an interval or a distribution, in the estimates we report.


History and the historical turn

When I think of the book that has taught me the most in political science, a candidate that always comes to mind is Luebbert's 'Liberalism, Fascism, or Social Democracy'. There are others, of course. But few of them are quantitatively sophisticated. Perhaps, after all, I should have become a historian or an ethnographer.

If you read some of the great comparativists of the 1980s, most of their research is highly historically informed. In academia, people were scholars, professors whose competence spanned a large number of fields.

After a conversation with one of those persons today, I was somewhat shocked to find myself surprised. It has in fact become rare to encounter people in academia with such a vast breadth of knowledge. Most of the time, I think this is for the best: if you are going to do American politics, it is perfectly fine to ignore, say, the role the civil service examination system played in selecting elites in imperial China (just an example of something highly irrelevant to my research). In fact, many of those wide-ranging conversations work against parsimony of any sort, whereas specialization allows the conversation to concentrate on relevant, technical points instead of getting randomly distracted.

Yet I sometimes feel nostalgic for that time, and even if it is highly impractical, I would like everyone to be able to maintain an educated conversation. More paradoxical is the fact that this remains true nowadays even amid the "historical turn". More and more, people seem interested in using exogenous historical accidents as an identification strategy. But these accidents do not seem to have pushed people to become more historically cultivated.


Shrinking your theory

I am currently reading a paper. It's basically an empirical piece. The 'theory' it tries to test has the following structure: things tend to be like A, which implies B, but then sometimes C, which is why we observe D, except if we observe E, in which case it no longer holds because X believes F, and so on.

The empirical test is operationalized as a partition of the outcome space into a sequence of propositions: 'Hypothesis 1 [fancy name such as the "self-interested voter effect"]: Agent A believes S, which is why we should observe B', followed by an informal discussion of the mechanism. Several hypotheses are enumerated. Then the empirical test claims to find support for one of them.

I'm sure you have encountered papers like these. Empirical research is full of them. Surely, nothing about this approach inevitably leads you to be wrong. However, the style is painful to read, and there are so many contingencies to keep track of. Moreover, the mechanisms discussed under each hypothesis are sometimes unidentified, even when endogeneity and the like are taken care of, because ATEs and LATEs are about effects rather than mechanisms. Perhaps it is a matter of intellectual taste, but I can't stand this sort of amateur-psychology-fed, ad hoc typology-building style that is particularly common in political behavior papers.

Something that was immediately obvious to me was that there would be no formal model in the paper, and I was right. I have the feeling that formal modeling induces parsimony, which translates into a certain clarity about what is being claimed. It imposes a penalty on complexity and, to push the parallel, works as a 'shrinkage method' on your theory, restraining your degrees of freedom.
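The statistical analogue I have in mind is something like ridge regression (my analogy, not anything in the paper): adding a penalty for complexity pulls the fitted coefficients toward zero, trading a bit of fit for a lot of discipline, much as a formal model penalizes theories with too many moving parts.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 100, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [1.0, -0.5, 0.8]          # only a few 'real' mechanisms
y = X @ beta + rng.normal(size=n)

def ridge(X, y, lam):
    # Closed-form ridge estimator: (X'X + lam * I)^{-1} X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

for lam in [0.0, 10.0, 100.0]:
    b = ridge(X, y, lam)
    print(f"lambda={lam:>5}: size of the fitted 'theory' (L2 norm) = {np.linalg.norm(b):.2f}")
```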
