Working/conference paper: The value of schooling in traditional sectors, with special reference to Indian fisheries

Because I’ve been slow in getting this into a better state for publication, I’m posting it here now. Trying to light a fire, as it were.

This paper was presented at the symposium Understanding and Eradicating Poverty in South Asia: Lessons and Options at the University of Rajasthan (Jaipur, India, Oct. 17, 2013). This remains in a draft format; please contact me before citing.

ABSTRACT: The international community has enshrined formal education as one of the key tools necessary to alleviate poverty, on par with ending hunger and fighting disease. In addition, education is often considered a key component of the “modern” geographic, demographic and economic transition off the land, out of the village and into wage jobs in cities. But what does education mean within the rural or traditional economy? What does education mean for the legions of villagers who remain poor farmers and fishers in developing countries such as India? This paper examines the relationship between education and poverty theoretically and empirically in traditional economic sectors. First, the paper sketches an outline of neoclassical economic growth theory, with specific attention to the basic Cobb-Douglas production function. Next, the paper reviews literature on the economic returns to education or human capital, with special attention to traditional sectors when possible. Finally, the paper conducts a quantitative analysis of marine fishery census data from India, testing the empirical relationship between poverty and education within a traditional sector.

The paper ultimately finds evidence to support the idea of returns to education even within India’s coastal fishery economies; in other words, education need not simply be a ticket out of the village. In line with much development literature, female education may have a stronger inverse relationship to poverty than male education does. Furthermore, the effect of education can rival that of mechanized capital, often thought to be the key to reducing poverty among fishers. However, the results may be attenuated both by the structure of the economy and by socio-political institutions. Finally, the findings have a spatial quality to them. Some relationships shift when controlling for the fixed or unobserved effects of place, and the effects of education are not uniform across geographies. Taken together, these findings suggest the need for education that is locally tailored, decentralized and relevant specifically for traditional economies.

Click here to download.


Sherlock Holmes and the nature of data(mining)

Holmes… thinking.

I spend many days sitting at my desk just thinking, reading, writing and then thinking some more about survey method, instruments, data and analysis. It’s all great fun, because while I’m comfortable with qualitative work, I’m also quantitatively minded.

But after a solid day of switching from one spreadsheet to another (fisher socioeconomics and mobility preferences of SE DC), my mind is drifting off and I’ve randomly recalled quotes from Sir Arthur Conan Doyle’s Sherlock Holmes, on the subject of data, research and hypothesizing.

Holmes rather ingeniously contradicts at least some of our ideas of scientific method and hypothesis testing. This is hardly just Holmes being fanciful; he actually does a rather good job of showing us why we need to be careful about putting too much stock in our brilliant hypotheses.

It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.
A Scandal in Bohemia

Our scientific method generally tells us that we should do the opposite. We should theorize and then design experiments or collect data to analyze/test our hypothesis. My master’s thesis, for example, examined a macroeconomic theory — of productivity conditioned upon physical and human capital — using socioeconomic census data of coastal fishers in India.
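For readers who haven't met that theory, here is a sketch of one common textbook form of the production function with human capital. This is my assumption of a standard specification, not necessarily the exact one used in the thesis; the symbols Y (output), A (total factor productivity), K (physical capital), L (labor), h (human capital per worker) and α (capital's share) are introduced here purely for illustration.

```latex
\begin{align}
Y &= A\,K^{\alpha}\,(hL)^{1-\alpha}, \qquad 0 < \alpha < 1 \\
\ln Y &= \ln A + \alpha \ln K + (1-\alpha)\ln(hL)
\end{align}
```

The log form is what makes this kind of theory testable with a linear regression: output (or, inversely, poverty) is modeled as a function of physical capital and schooling.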

My own hypothesis, based on political economy/ecology literature, was that the basic elements of the production function theory were inadequate to explain poverty. I tested data and largely confirmed my hypothesis. Holmes suggests we do the opposite, on multiple occasions.

It is a capital mistake to theorise in advance of the facts.
The Adventure of the Second Stain

It is a capital mistake to theorize before you have all the evidence. It biases the judgment.
A Study in Scarlet

It seems that Holmes would advocate the kind of research that is castigated by some scientists as “data mining” or “data dredging.” Described negatively, data dredging involves looking at a whole range of statistics and picking obscure ones to form a thin hypothesis about any observed patterns and relationships. I suspect that some folks who dismiss statistical analysis (“Anyone can say anything with statistics”) may be thinking of data mining. More generously described, however, data dredging is simply post-hoc analysis or looking at data after the fact for trends or patterns that were unknown or inconceivable prior to the experiment/data collection.

The critical view has some merit; the more one looks at the data, the more one finds connections that may have no logic or good theoretical basis; in short, one may find trends that just don’t make sense and may only be artifacts of the data rather than descriptions of reality.
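To make the critics' point concrete, here is a minimal simulation sketch. It is not drawn from any of my actual data: the function name, sample size and variable count are all made up for illustration. It generates an outcome and a pile of predictors that are pure noise, then counts how many predictors clear a rough 5 percent significance cutoff anyway.

```python
import numpy as np

def count_spurious(n_obs=100, n_vars=200, seed=0):
    """Count pure-noise predictors that look 'significant' at roughly the 5% level.

    Everything here is random by construction, so any 'finding' is an artifact.
    """
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(n_obs)            # outcome: pure noise
    X = rng.standard_normal((n_obs, n_vars))  # predictors: pure noise

    # Pearson correlation of each predictor column with the outcome
    y_std = (y - y.mean()) / y.std()
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    r = X_std.T @ y_std / n_obs

    # Approximate two-sided 5% cutoff for |r| when there is no true relationship
    cutoff = 1.96 / np.sqrt(n_obs)
    return int(np.sum(np.abs(r) > cutoff)), r

count, r = count_spurious()
print(f"{count} of {r.size} noise variables look 'significant'")
```

With a 5 percent cutoff, one should expect something on the order of 1 in 20 unrelated variables to pass by chance alone, which is exactly why a pattern found by dredging needs a sensible theoretical reason behind it.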

However, hypothesis-experiment designs can be just as flawed. They rely on the researcher’s own judgment to get the conditions/variables of the experiment/observation right. One might incorrectly reject the null hypothesis (Type I error) if, for example, the variables or purported causal chain of the hypothesis don’t actually relate but instead happen to proxy a real-but-untested relationship; at the same time, one might incorrectly fail to reject the null hypothesis (Type II error) if the proper model isn’t specified in, say, a regression.

Example from my own work: I’m interested in power, social organization and political economy (of natural resources); along those lines I read literature often originating from political economists and political ecologists. My thesis attempted to show that a supposedly apolitical macroeconomic hypothesis simply didn’t fit the facts when one dug (dredged?) a little deeper into the data. I included sociopolitical variables that began to control away the effects of the neoclassical macroeconomic predictors.

I had some theory to back me up, but at the outset, I did not hypothesize the power that geography would have on my model as well. Only when I also controlled for fixed geographic effects or removed geographic outliers did I really begin to see the macroeconomic model break apart. Another variable I found to matter a great deal: the presence of a post office. This really starts to seem like data mining, but by looking deeper at the statistics, I could see that post offices proxy overall levels of development in a broader economy, so the variable actually made sense. That’s a bit of post-hoc analysis, but without the social, geographic and post office variables, my research would have actually supported the overarching macroeconomic theory.
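The way a coefficient can shift once place is controlled for can be sketched with simulated data. This is a toy illustration under assumptions I've invented for the purpose (five hypothetical regions, a made-up "true" schooling effect of -0.5, and an unobserved regional development level that drives both schooling and poverty); it is not my thesis model or data.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y on X (X must include an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(42)
n_regions, per_region = 5, 200
region = np.repeat(np.arange(n_regions), per_region)

# Hypothetical setup: each region has an unobserved development level that
# raises schooling and lowers poverty at the same time.
region_effect = np.linspace(-2.0, 2.0, n_regions)
schooling = region_effect[region] + rng.standard_normal(region.size)
poverty = (-0.5 * schooling                # assumed "true" effect of schooling
           - 1.5 * region_effect[region]   # effect of place
           + rng.standard_normal(region.size))

intercept = np.ones(region.size)

# Pooled model: ignores place entirely
pooled = ols(np.column_stack([intercept, schooling]), poverty)

# Fixed-effects model: one dummy per region, with region 0 as the baseline
dummies = (region[:, None] == np.arange(1, n_regions)).astype(float)
fixed = ols(np.column_stack([intercept, schooling, dummies]), poverty)

print("pooled schooling coefficient:", round(pooled[1], 2))
print("fixed-effects schooling coefficient:", round(fixed[1], 2))
```

In this toy setup the pooled model credits schooling with much of what is really the effect of place, so its coefficient comes out far more negative than the assumed -0.5, while the model with region dummies lands close to it. A post office variable would play a similar role to the dummies if it proxies the same underlying development.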

What’s more: Even my best models explained less than two-thirds of the variation in my dependent variable (poverty). A first-order question: What other variables might the theory (macroeconomic or other) be missing? One of the first steps toward answering that: Looking closer at the data for unexpected interrelationships.

Says Holmes:

“Data! Data! Data! I can’t make bricks without clay.”
The Adventure of the Copper Beeches

Indeed, most of Holmes’ genius comes as Conan Doyle invents scenarios where the seemingly obvious hypothesis is wrong; only upon dredging up more data and observations does Holmes typically arrive at the correct conclusion.

And, in reality, hypotheses are rarely designed in a purely a priori fashion. In practice, we look at some data, consider some experience, examine results of other research, design our hypothesis accordingly and go out and look at data. After a first-pass analysis, we may alter our thinking on the fly, which perhaps approaches data dredging but gets us closer to describing a real relationship or explaining a real trend.

My own statistics professor, for whom I have great respect, told me that looking deeply at the data wasn’t wrong — I do tend to nerd out on my spreadsheets — as long as I had good theoretical, logical (sensible) reasons for seeing relationships.

And now, back to work and survey method/instrument design.


Semester research: The effects of agriculture on India’s forest cover

One giant poster

I spent the semester conducting a statistical analysis to explain variation across Indian states in forest cover change from 2000 to 2009. After a preliminary literature review, looking at deforestation across the world, I compiled a database of more than 200 relevant variables. From that I computed and tested more than 100 variables (averages, percentage change, raw change, etc.) before narrowing my regression to several key indicators of an individual Indian state’s economic reliance on agriculture and the presence of alternative lifestyles and livelihoods.

This culminated in a series of univariate, bivariate and multivariate analyses; I presented the research in a spring quantitative analysis symposium at American University. The poster is viewable here.

The ultimate conclusion from the research: Agricultural output value is strongly and negatively associated with forest expansion, coinciding with slow forest cover growth or even powering forest cover loss. By contrast, a number of other variables, all of which represent diversity in economic opportunity, livelihoods and lifestyles, have positive associations with forest cover growth. These associations appear across several models of an OLS regression.

I’ve written a draft paper of the analysis that needs to be refined, edited and combined with an introduction, abstract and the results of my literature review — a summer project to be sure.

Anyone who wants some heavier reading can read that draft here. I’d welcome any and all feedback, even from complete strangers. (Forgive the writing. This was done in pieces and certainly is repetitive in phrasing.)
