Monday, April 16, 2012

A Unifying Thread

Note: Long time no see here on the blog. Such is PhD student life.

I had a conversation with my advisor yesterday in which I asked him what the unifying idea that runs through his research is. Roy has studied math, writing, science, and a host of other topics in his research, and has been at times a philosopher, at times a psychologist, and at times something resembling an ethnographer. He's a guru of video data analysis, but that's far from his only analytical methodology. In short, he's as multi-faceted as they come, and among his current advisees alone there is a huge range of interests and projects.

It should come as no surprise, however, that he could condense his work into a single sentence. He studies the way that emerging symbols and representations of knowledge can change student learning and understanding (that's not exactly what he said, but it's close). This got me thinking. One of my biggest struggles as an early PhD student has been defining my research interests. In short, I have too many. Almost no interesting question is unappealing to me, in a range of areas.

There are, however, some ideas that have started to rise to the top of my focus. In particular, I'm becoming very interested in the possibilities of two branches of what gets called "computational social science." The first is natural language processing (NLP), and the second is agent-based modeling (ABM). I won't go into detail about what those entail here, as a cursory web search will give at least a decent idea. Rather, I want to talk a little about how I've started to use these techniques, and to observe something about them that I think is salient to defining what my unifying thread is and will be.

Both NLP and ABM are ways of reconceptualizing what counts as data and how to analyze it. NLP takes text - a form of data which we usually look at according to some theory of reading (like hermeneutics) - and redefines it as a "big data" set. That is, instead of looking at specific meaning in specific places in a text, NLP can help uncover aggregate trends in the use of language. For example, a recent paper I wrote for a doctoral proseminar included an analysis of how Salman Khan uses language in his first four lessons about fractions. Watching those videos certainly tells you a lot, but even something as simple as word counts uncovers surprising information, like Khan's almost non-existent use of interrogative words like "what" or "why." That alone says nothing about his quality as a teacher, but it does say something about how he teaches, and what the technology he uses affords him from a linguistic standpoint. Needless to say, this is a project I'm hoping to expand upon.
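To make the word-count idea concrete, here is a minimal Python sketch of that kind of tally. It is not the analysis from the proseminar paper: the list of interrogative words, the function name, and the sample sentence are all assumptions made up for illustration, and a real transcript would need more careful tokenizing.

```python
import re
from collections import Counter

# Hypothetical word list: the post only names "what" and "why".
INTERROGATIVES = {"what", "why", "how", "who", "when", "where", "which"}

def interrogative_rate(transcript):
    """Count interrogative words and their share of all tokens."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(t for t in tokens if t in INTERROGATIVES)
    rate = sum(counts.values()) / len(tokens) if tokens else 0.0
    return counts, rate

# Invented sample text, not a quotation from any real lesson.
sample = "So we have one half. Why is that the same as two fourths? Let's draw it."
counts, rate = interrogative_rate(sample)
print(dict(counts), rate)
```

A count like this says nothing about whether a word is actually used to ask a question ("what" can also introduce a clause), so it is a starting point for closer reading rather than a finding in itself.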

ABM, on the other hand, is less about analyzing data and more about performing complex thought experiments that you would not otherwise be able to undertake. ABM is known in the social sciences and hard sciences for overthrowing misconceived paradigms of both human and animal activity. The classic example is bird flocking behavior. For a long time, it was believed that birds followed a "leader" bird while flocking, and that they traded off being the leader. Essentially, the hypothesis was that each individual bird just did as it was told by the leader via some arcane communication system. Enter ABM. A new hypothesis suggested that, instead of a leader, each bird followed a very small set of simple rules (namely, get closer to other birds, but not too close; move away from another bird if too close; turn to try to face the same direction as nearby other birds). Creating hundreds of randomized agents in a computer program and giving them these rules to follow produces, in a short amount of time, an almost perfect analogue to the flocking behavior we see in nature.
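The three rules are simple enough to sketch in a few lines of Python. What follows is a rough illustration under stated assumptions, not a reproduction of any published flocking model: the parameter names and values (radius, too_close, turn, speed), the flock size, and the way the "get closer" and "face the same direction" rules are blended are all choices made here for the example.

```python
import math
import random

def step(birds, radius=10.0, too_close=2.0, turn=0.1, speed=1.0):
    """Advance every bird one tick. Each bird is [x, y, heading in radians]."""
    moved = []
    for i, (x, y, h) in enumerate(birds):
        near = [b for j, b in enumerate(birds)
                if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        target = h  # with no neighbors, keep flying straight
        if near:
            nearest = min(near, key=lambda b: math.hypot(b[0] - x, b[1] - y))
            if math.hypot(nearest[0] - x, nearest[1] - y) < too_close:
                # Rule: move away from another bird if too close.
                target = math.atan2(y - nearest[1], x - nearest[0])
            else:
                n = len(near)
                # Rule: face the same direction as nearby birds.
                ax = sum(math.cos(b[2]) for b in near) / n
                ay = sum(math.sin(b[2]) for b in near) / n
                # Rule: get closer to other birds (aim at their center).
                cx = sum(b[0] for b in near) / n - x
                cy = sum(b[1] for b in near) / n - y
                dist = math.hypot(cx, cy) or 1.0
                target = math.atan2(ay + cy / dist, ax + cx / dist)
        # Turn at most `turn` radians toward the target heading.
        diff = (target - h + math.pi) % (2 * math.pi) - math.pi
        h += max(-turn, min(turn, diff))
        moved.append([x + speed * math.cos(h), y + speed * math.sin(h), h])
    return moved

def alignment(birds):
    """Length of the mean heading vector: 1.0 means everyone faces the same way."""
    n = len(birds)
    return math.hypot(sum(math.cos(b[2]) for b in birds) / n,
                      sum(math.sin(b[2]) for b in birds) / n)

random.seed(0)
flock = [[random.uniform(0, 20), random.uniform(0, 20),
          random.uniform(-math.pi, math.pi)] for _ in range(30)]
before = alignment(flock)
for _ in range(200):
    flock = step(flock)
print("alignment before:", before, "after:", alignment(flock))
```

Note that no bird is designated as a leader anywhere in the code; each one reads only the positions and headings of its neighbors. Comparing the alignment score before and after the run is one crude way to see whether a shared direction has emerged.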

There has been little work using ABM in education, but another student and I are adopting and adapting some of the work of Paulo Blikstein as a part of our research assistantship. We've been working on a model that represents the process of sharing a technological artifact in a classroom task. One of Paulo's papers using ABM alongside empirical data suggests that collaborative classroom tasks often lead to the assigning of roles based upon ability, which means that more advanced students end up doing most of the cognition unless specific role sharing is assigned and enforced. Similarly, our model viewed alongside video data suggests that it is vitally important that students share representations of data while doing scientific inquiry, lest the student holding the representation do all of the cognitive work.

These two projects - running Khan's teaching through some NLP paces and analyzing inquiry data through ABM - have little in common on the surface, except that both are ways of reconceptualizing data, as I said above, and both are computational. This, then, forms an important part of the thread that I see running through both the work that I'm doing now and the work that I've done in the past. It's not data, per se, that I'm oriented towards, but rather re-imagining the bounds of what we can do with a given idea.

The thread that unifies my thinking is this: I hope to find and use new ways of thinking about and representing ideas and questions such that seemingly oppositional knowledge structures can be synthesized so that they can become more meaningful. Essentially, I'm describing a dialectical process. However, I don't believe that it's an entirely phenomenological one. That is, I don't follow Hegel the whole way: there's more than interpretation here, there's also creativity.

Necessarily, the synthesis of oppositional knowledge structures requires pulling away from the depths of either. Hyper-specialization in the pejorative sense of knowing more and more about less and less has always been counterproductive, but here it is particularly problematic. Too often, increasingly detailed knowledge about a given area does not move the area forward, except by a very narrow definition of "forward" that applies only within the field. Rather, the most important innovations in the history of almost any science come from moving backward so far that previously separate knowledge structures collide, get blown apart, and reintegrate facing a new and altogether more productive direction.

That's all oversimplified, on the one hand, and hopelessly unclear on the other. Nevertheless, I think it's a good starting point for defining an academic identity for myself that incorporates both my wide range of passions and the need to have a clear self-definition.

On a somewhat different note, I also hope that I won't be away from the blog for two full months this time. I've been struggling with where this blog fits in my academic life, whether to continue it at all, whether to start a new one, or what. For now, anyway, I've decided that this one can stay. In fact, "Nicht Diese Tone," while not the most marketable slogan, ultimately does capture very well the core of my intellectual project.