Turns out, not as much as you — and they — would like to think
A journalist once asked me how many jobs NAFTA had created or destroyed. I told him I had no reliable idea. Certainly jobs had been lost when factories closed and moved to Mexico, but other jobs had been gained because Americans now had more resources and increased their demand for products that would not be easy to identify. Why not? Because thousands and thousands of jobs are created every month and it is very difficult, perhaps impossible, to know which ones are related to NAFTA allowing Americans to buy less expensive goods from Mexico. I also told him that I believed that trade neither destroyed nor created jobs on net. Its main impact was to change the kinds of jobs and what they paid.
The journalist got annoyed. “You’re a professional economist. You’re ducking my question.” I disagreed. I am answering your question, I told him. You just don’t like the answer.
A lot of professional economists have a different attitude. They will tell you how many jobs will be lost because of an increase in the minimum wage or that an increase in the minimum wage will create jobs. They will tell you how many jobs have been lost because of increased trade with China and the amount that wages fell for workers with a particular level of education because of that trade. They will tell you that inequality lowers health or that trade with China reduces the marriage rate or encourages suicide among manufacturing workers. They will tell you whether smaller classrooms improve test scores and by how much. And they will tell you things that are much more complex — what caused the financial crisis and why its aftermath led to a lower level of employment and by how much.
Claims about these issues fill our newspapers, our Twitter feeds, and the seminar rooms in economics departments around the country. How should we evaluate these claims? How can we know which of the estimates are true and which are false? When two well-trained, really smart economists are on opposite sides of one of these issues and have the empirical analysis to back up their claims, how do we judge which one is right?
A crucial question we would like to understand, for example, is how trade with China has affected the US labor market. Since 2000, a time when trade with China increased dramatically, the number of manufacturing jobs in the United States has fallen by about 5 million. That’s a huge decrease. It’s also a fact or something close to a fact. Of course it’s not a straightforward fact. Some of those jobs disappeared because of trade with other countries. And some or maybe many of those jobs have disappeared because innovation and technology have been applied effectively to the manufacturing process, leading to a reduced demand for workers in factories.
Another fact — since 2000, the US has added about 12 million jobs overall. While that fact is relevant, it does not tell us how many of the manufacturing workers who lost their jobs found some of those 12 million jobs. There seems to be evidence that a number of them did not. Employment among men 25–54 has fallen. Deaths from opioids and heroin have increased dramatically. Is that because of trade with China? Or is it related to changes in disability payments? Or the aging of the population? The eagerness of doctors to prescribe painkillers? The world is a complicated place and these are not easy questions to answer.
Over the last few years, work by David Autor of MIT and a number of co-authors (examples here and here) has shown that trade with China has had larger costs than many economists might have anticipated. This work, which is statistically sophisticated and very thorough and which tries to eliminate or control for the various factors that might complicate the analysis, has been widely cited and widely praised, and seems to suggest that the increasing popularity of protectionism makes sense.
But are those conclusions accurate?
A new paper has just been released by Jonathan Rothwell suggesting that the main findings of Autor et al are essentially wrong. He argues that their attribution of economic hardship in certain parts of the country and the economy to increased competition with Chinese production is an artifact of the time period they examined, and that the results simply do not hold up when a fuller time frame is taken into account.
Is Rothwell correct?
I have no idea. Here is what I do know. There is likely no way of knowing which view is correct with anything close to reliability or certainty. It is possible that David Autor and his co-authors will read Rothwell’s paper and decide that their work is fatally flawed and announce that to the world. I suspect that will not happen. It could happen but it is extremely rare in the academic world unless there is fraud or something akin to a coding error, such as data mis-entered into the spreadsheet or statistical package. That kind of error is not what is under consideration here. This is about interpretation. As with most of the questions I listed above (questions about the impact of the minimum wage, or the cause of the financial crisis), there is no simple way to resolve differences in analysis done by professional economists.
It’s also possible that Autor et al will explain that Rothwell’s interpretation is simply incorrect and provide an explanation that will convince an impartial observer. (Update: Autor, Dorn and Hanson have released an eight-page response to Rothwell, dismissing it. They could be right, and maybe Rothwell’s challenge will quickly be forgotten.) But that is usually not what happens either. What usually happens is that very smart, well-trained people on both sides of the issue argue. They argue over the structures of the models and the assumptions that enable the empirical conclusions, or the quality of the data, or whether a finding is generalizable or not. Sometimes a consensus eventually emerges, but that consensus can be reversed by further empirical analysis. This consensus is something like a market in ideas. It’s something like the two sides in a trial: one hopes the process yields truth more often than not.
But there is no way of knowing reliably if the consensus reflects the truth. It may rely instead on the underlying biases of the prosecutors and defendants in the intellectual trial of ideas. Or where they received their PhD degrees. Or the fashionability of certain positions over time as society changes. Unlike product markets where poorly made products are punished by low prices or fewer and fewer consumers, there are no clear feedback loops in the world of academic economics. You can say something that is wrong and the price you pay may be zero. In fact you may be rewarded.
And that is because of what does not happen. There is never a clean empirical test that ultimately settles these issues. There is no reliable scientific experiment where each side is forced to make a prediction and the results settle the matter.
I saw in my Twitter feed today that a paper from Owen Zidar of Chicago’s Booth School of Business shows that cutting tax rates for poorer taxpayers is better for the growth of the economy than tax cuts for the rich. According to Zidar’s website, his findings have been reported in Forbes, WSJ, Washington Post, Financial Times, Washington Post, Marketwatch, Congressional Quarterly, International Business Times, Washington Post, Reuters, Huffington Post, International Business Times, The New York Times (Economix), Capital Ideas. The paper is under consideration at the Journal of Political Economy, one of the top journals. According to Zidar, the editor has asked for revision and resubmission, which often means it will survive the peer review process. This suggests Zidar’s claims are true.
Zidar’s claims about cuts in taxes certainly could be true. Certainly a story can be told using economic ideas that makes the claims seem true. That story would talk about incentive effects of lower tax rates being more important for lower-income workers than higher-income workers. There is also the possibility that the increase in disposable income for low-wage workers is such that they are more likely to spend it and that spending more is better for the economy than investing it. But you don’t have to be an economist to realize that these claims, while possibly true, are also possibly false. It is easy to imagine a story on the other side. So how would you know which is true? Zidar’s statistical analysis of how workers responded to past cuts in tax rates shows that his story is indeed the correct one.
But how do we know that statistical analysis is true? Wouldn’t it be nice if the author of every claim about trade or tax cuts or monetary policy had to state how confident they were in their analysis and its likelihood of being true? But when you stop and think about it, you realize there is no way to test that confidence. You might think you could test it by actually cutting taxes for the poor or the rich and seeing what happens. But that won’t work. You’d also have to simultaneously cut taxes for only one group and run the two experiments side by side. That can’t be done. But that’s not the real problem. The real problem is that even if you could run the experiment of cutting taxes and seeing what happens, the reduction in tax rates isn’t the only thing that is going to be going on. There are many other things happening at the same time.
And then you realize that many other things were happening at the same time when the tax cuts were put in place in the past. To the extent that your statistical analysis does not control for most or all of those other changes, your estimate of their impact in the past is not accurate. So your estimate of what will happen in the future will not be accurate either.
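A toy simulation can make this concrete. What follows is only a sketch with invented numbers, not an estimate of anything real: the names (`policy`, `other`, `outcome`), the true effect of 1.0, and the strength of the other factor are all assumptions chosen for illustration. The policy change happens to coincide with something else that also moves the outcome. A regression that ignores that something else attributes its influence to the policy; one that controls for it comes close to the true effect.

```python
import random


def slope(xs, ys):
    """OLS slope of ys on xs (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def residuals(xs, ys):
    """What is left of ys after removing the part linearly explained by xs."""
    n = len(xs)
    b = slope(xs, ys)
    a = sum(ys) / n - b * sum(xs) / n
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def simulate(n=20000, true_effect=1.0, seed=0):
    """Return (naive, controlled) estimates of the policy effect.

    All numbers are hypothetical and exist only for illustration.
    """
    rng = random.Random(seed)
    # "Other things happening at the same time."
    other = [rng.gauss(0, 1) for _ in range(n)]
    # The policy tends to be adopted when the other factor is high.
    policy = [0.8 * z + rng.gauss(0, 1) for z in other]
    # The outcome depends on both the policy and the other factor.
    outcome = [true_effect * p + 2.0 * z + rng.gauss(0, 1)
               for p, z in zip(policy, other)]

    # Naive estimate: regress the outcome on the policy alone.
    naive = slope(policy, outcome)
    # Controlled estimate: remove the other factor from both sides first,
    # which is equivalent to including it as a second regressor.
    controlled = slope(residuals(other, policy), residuals(other, outcome))
    return naive, controlled


if __name__ == "__main__":
    naive, controlled = simulate()
    print(f"true effect:         1.00")
    print(f"ignoring the rest:   {naive:.2f}")
    print(f"controlling for it:  {controlled:.2f}")
```

The controlled estimate works here only because the simulation created the other factor and handed it to the regression. With real tax cuts, nobody hands you the list of everything else that changed.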
Most economic claims are really not verifiable or replicable. (And if you are interested in the related crisis of statistical reliability and replicability in psychology and elsewhere, follow Brian Nosek on Twitter and listen to him here). Most economic claims rely on statistical techniques that try to simulate a laboratory experiment that holds all relevant factors constant. That is the hope. My claim is that in general, holding all relevant factors constant cannot be done in a way that is reliable or verifiable. And that is why so many empirical issues, such as the minimum wage, immigration, fiscal policy, monetary policy and so on, have smart people on both sides of the issue, each with their own sophisticated analysis to bolster their claim.
Let me make it clear what I mean about verifiability. In the movie Hidden Figures (slight spoiler alert), a crucial complex calculation is made about where NASA or John Glenn need to take a particular set of actions in order for him to reliably land in the ocean where the Navy is waiting to pick him up and to avoid death via the heat of the re-entry process. (I think I have that right but either way, you get the idea). Math and science work together to predict where John Glenn’s capsule will be at a certain time.
Were John Glenn’s capsule not to show up at the time and latitude and longitude that were predicted, that would call into question the math and science that predicted the details of his arrival. Of course the arrival or non-arrival of Glenn would neither confirm nor refute the calculations with certainty. Other things might have played a role (a flock of birds, a mechanical failure, randomness), but the empirical reality of Glenn’s position, and the existence or non-existence of those other factors, would tell us a lot about the reliability of the calculations. There is rarely if ever an equivalent test in the world of economic predictions.
I am arguing that the math and science of economic predictions and assessments are nothing like the math and science of space travel. Economics provides the illusion of science, the veneer of mathematical certainty. An American president considering an invasion of a country in the Middle East might ask a wise historian about past military adventures in that part of the world. Such an historian might even be bold enough to make a prediction about how an invasion today might turn out, based on her knowledge of the region and past military experiences there. But no matter how wise, there would be great uncertainty around such a prediction and an historian who attempted to forecast the number of military deaths in an upcoming invasion would not be doing science but fake science.
An American president considering a trade war with China would be wise to consult an economist. I think economists understand a lot about the benefits and costs of trade and who those benefits and costs fall on. An economist would remind a president considering a trade war that the short term and long term impacts might be very different. An economist would suggest that there is evidence that reducing trade will make the nation as a whole somewhat poorer. An economist would explain how trade is like innovation and we can learn something about how trade affects labor markets from what we have seen happen in the past. (My attempt to discuss these issues using analysis and simple evidence is here.) But such claims would not be ironclad or precise. They would be nothing like an Oval Office conversation with a mathematician or an engineer considering the potential for NASA to send someone to Mars.
There is nothing new about the statistical problems I am highlighting here. Every economist knows about these issues. Every statistician knows about them. Every statistical analysis comes with caveats from the author. Humility may be scarce, but most serious academic economists don’t believe in the absolute reliability of their results. But they publish them and inevitably wave them about. Dramatic claims and findings earn attention, and sometimes fame and fortune. If you ask people about how reliable their findings are, they will concede that they could be wrong. But they also presume that the court of intellectual opinion will sort out the good studies from the bad ones.
One response to the questions I am raising is that we have new techniques that solve a lot of the problems I’m talking about. We’ve had what’s called (by the creators of the new techniques) the credibility revolution. These new empirical techniques have allowed us to run quasi-experiments that while not perfect, eliminate many of the problems of complexity and multi-causal reality. I remain a skeptic. But maybe the champions of the new empirical techniques will convince the skeptics. We’ll see.
Young economists are enthusiastic about these quasi-experiments. As one economist once told me: I don’t rely on theory, I just listen to what the data tells me. But numbers don’t speak on their own. There are too many of them. We need some kind of theory to help us decide which numbers to listen to. Inevitably, our biases and incentives influence which numbers we think speak the loudest.
Economists generally believe that incentives are very powerful. I think non-economists believe this too, up to a point, which is really how far good economists should take it as well. But Luigi Zingales has pointed out that economists struggle to apply their ideas about incentives to themselves. In every other profession or area of life, economists believe incentives influence behavior. But academic economists struggle to see themselves as something other than truth-seekers, unaffected by the rewards or penalties associated with success or failure. I think Zingales is onto something here. Zingales’s observation reminds us that economists have a conflict of interest in assessing how reliable their empirical techniques are. They have an incentive to overstate the reliability of their techniques. Unfortunately, the techniques are only accessible to economists and other applied statisticians. That’s awkward. The defendant is also the judge in the trial. That is not ideal.
Where does that leave us?
First let me make it clear that facts and evidence matter. I am not saying that measurement is irrelevant.
It is useful to know that 40% of the American work force was in agriculture in 1900 and now the number is 2%. It is useful to understand that that transition (which was much faster in the first half of the 20th century than the last half) did not lead to mass unemployment and starvation. There are indeed roughly 5 million fewer manufacturing jobs today than in 2000. I recently saw a TED talk that claimed that median household income worldwide is about the same level as the cost or maybe the price of an iPhone. That claim surprised me. A quick search of the web suggested that a respectable estimate of global median household income is something like $10,000. I stopped listening to the talk. I don’t think it is straightforward to measure the global distribution of household income but $10,000 is not in the ballpark with $700.
Facts matter, but some facts are extremely difficult to measure and I am open to the possibility that my critique of that TED talk is unfair. Maybe there are potentially reliable estimates of global income patterns that would yield a number for the median below $1000. Some facts are quite difficult to pin down and prone to extreme misinformation and even deception. The fate of the American middle class over the last few decades is much in the news and is complicated by changes in family structure, the measurement of inflation, and decisions by the speaker as to whom to include in the sample and what to define as income, all of which make a seemingly simple calculation surprisingly tricky. (I have the first episodes in a forthcoming video series on the question coming out in the next few months.) But as economists, we understand something about how these issues matter and the range of estimates that are reasonable. Facts on these kinds of issues are helpful and we can even make some progress in understanding why some price indices overstate inflation, say, and get a rough estimate of by how much, perhaps.
What I am saying in this essay is that while economic fundamentals like income or even changes in income over time are measurable with some precision, we are notoriously unreliable at the things the world really cares about and asks of our profession: why did income for this or that group go up by this or that amount? What will happen if this or that policy changes? Should the subsidy to college education be increased or decreased and if so, by how much? These much-demanded questions, which call for precision and an understanding of the complex forces that shape the world around us, are precisely the questions we are not very good at answering.
The fact that economists relentlessly and cheerfully do their best to answer them anyway might be because we are tenacious and optimistic that we are getting closer to the truth over time. A simpler answer uses the economics of the kind Luigi Zingales talks about: those who purport to “know” the answers to causal policy questions get attention, money, and influence. That doesn’t prove that they are wrong, of course. Sometimes incentives encourage good outcomes. But as I suggested earlier, I’m not convinced that the feedback loops of profit, loss, and competition in product markets are as effective in the market for ideas. Or as Brian Nosek, Jeffrey Spies and Matt Motyl put it:
Published and true are not synonyms.
So where does that leave us?
I think there are two things to think about. The first is how economists share their empirical findings. The second is the value of economics.
Here is how economists currently share their empirical findings: selectively. There is a nice table or two in the paper summarizing the findings that are described at the beginning and the end of the paper. What is not included is how many statistical forays were made that found nothing or that found results that were the opposite of the final set of conclusions. We need a track record, a log of statistical exploration. How many equations did you estimate? What proportion of them found something different than what you reported? Can you share them as well as the ones that you now feel are the “right” ones? And please explain why the ones you think are the right ones are not wrong. The ideal would be an econometric GoPro that would allow consumers of econometric analysis to see all the equations that were run and how many were significant and how many were of the “wrong sign.”
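To see why such a log matters, here is a minimal sketch of one in Python. Everything in it is hypothetical: the function names (`fit`, `explore`, `summarize`), the 200 specifications, and the data, which is pure noise by construction. Every "equation" regresses the same made-up outcome on a fresh made-up regressor that has nothing to do with it, and every result is logged, not just the ones worth reporting.

```python
import math
import random


def fit(xs, ys):
    """OLS slope and its t-statistic for ys on xs (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2) / sxx)
    return b, b / se


def explore(n_specs=200, n_obs=200, seed=1):
    """Run every specification and log every result (hypothetical data)."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    log = []
    for spec in range(n_specs):
        # Each regressor is unrelated to the outcome by construction.
        regressor = [rng.gauss(0, 1) for _ in range(n_obs)]
        coef, t = fit(regressor, outcome)
        log.append({
            "spec": spec,
            "coef": coef,
            "t": t,
            # 1.96 is the usual large-sample cutoff for the 5% level.
            "significant": abs(t) > 1.96,
        })
    return log


def summarize(log):
    """The numbers a reader would want alongside the published table."""
    sig = [row for row in log if row["significant"]]
    return {
        "estimated": len(log),
        "significant": len(sig),
        "significant_positive": sum(row["coef"] > 0 for row in sig),
        "significant_negative": sum(row["coef"] < 0 for row in sig),
    }


if __name__ == "__main__":
    print(summarize(explore()))
```

Since there is no true relationship anywhere in this data, about one specification in twenty should clear the conventional threshold by chance alone, and typically with both signs represented. A paper that reported only one of those would look like a finding. The full log is what lets a reader tell the difference.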
So if you are a reporter interviewing an economist or an economist in a seminar or the editor of a journal, ask these questions. Of course, there is some incentive to shade the truth when answering. But some people shade badly and awkwardly and that awkwardness will give you information.
The other aspect of sharing findings is accessibility of the data. Rothwell’s analysis purporting to refute Autor et al was done with Autor et al’s data. Sharing data is honorable because data often takes work before you can use it and sharing allows others to profit without having to do the work. But all empirical claims should be accompanied by sharing of data along with all of the steps that were necessary to get the data into shape. The kitchen of econometrics can be an unsightly place. But everything that goes on in the kitchen should be publicly available in order to reassure the diners. Again, people can lie or hide things, but at least we should make clear what is the ideal.
If I am right, economists are mostly dangerous. At least economists as the world perceives them. But most of the people I am talking about are not economists. They are really applied statisticians. Economics is primarily a way of organizing one’s thinking in considering incentives and costs and the interactions between individuals that we call a market but that are really emergent behavior with feedback loops. Studying economics sensitizes you to these things and others and helps you appreciate complexity and the various outcomes produced by economic actors: consumers, producers, entrepreneurs.
Economists understand that many things are more complicated than they seem. I recently heard someone who is quite smart say that because autonomous vehicles are likely to eliminate all taxi driving and truck driving jobs, then maybe the Trump Administration should make autonomous vehicles illegal on the grounds that low-skilled workers have a hard enough time as it is. I too worry about low-skilled workers in a world of artificial intelligence. It’s why I think we need to try some radical experimentation in education.
But an economist, when considering a policy of banning autonomous vehicles, can think of a lot of other impacts besides the jobs saved and the continuing deaths from human-driven cars if such a ban is put in place. One of the things we would think about is how such a ban will affect the incentives to discover future innovation that might also put people out of work. We would think about how putting more power in Washington would encourage lobbying for protection. We would think about the children and grandchildren of today’s workers and how restricting technology and changing incentives would affect things. These ideas are not rocket science. But they come easily to economists and not so easily to non-economists. Thinking like an economist is very useful.
What I am arguing here is that the combination of economics with statistics in a complex world promises a lot more than it delivers. We economists should be more humble and honest about the reliability and precision of statistical analysis.
Update: Here is John Cochrane’s response to this essay at The Grumpy Economist. Here is Don Boudreaux at Cafe Hayek responding to Cochrane and me. And here is Adam Ozimek at Data Points responding. Cochrane and Boudreaux are generally sympathetic to my claims but add their own very interesting takes. Ozimek thinks I very much underestimate the value of empirical economics. I will respond to his points in a separate essay.