Going Backwards into Coding

I have always heard that for anything related to computer programming, learning by doing is not only the best way, but the only way. Certainly, in the foundational Java and C courses I took long ago, we started coding right away, printing lines like “Hello, world!” in the baby programs’ outputs. I’ve tried to pick up programming or database-related concepts outside of classes a few times, always by doing.

Sometimes I’ve started from the beginning, because I didn’t have a project in mind and just wanted to learn the basics of, say, JavaScript or SQL. So, I worked my way through various tutorials and dutifully did the exercises, which built one one another and very gradually got more complex. I respect this way of teaching and learning, but I actually find it terribly dull for computer-related things. I like to learn human languages this way, for example, but not programming languages. One of the items on my blog list, recommended by a reader, was “learn to code, if you haven’t already.” I decided to tackle some of the lessons on the website The Programming Historian, which offers tutorials on digital tools and techniques for humanities research. I didn’t want to do the same thing I always do when I work my way through programming tutorials with no goal in mind except learning -- while I love to learn, doing it this way means I inevitably get too bored to continue.

On reflection, I realized that the times I’ve really sunk my teeth into learning a computer skill were all when I had a project I wanted to accomplish or a problem I wanted to solve. I’ve had some good experiences using tools designed for beginners that also allow you to fold in more complex skills. I’ve created games and museum interactives in the interactive story and text-based game creation platform Twine, for example. Twine doesn’t require any coding skills to get started, but I needed to learn how to call certain JavaScript commands (and to learn enough about JavaScript to figure out how to look them up) in order to achieve particular things I had in mind.

This is decidedly learning by doing, as much or more so than working through tutorials is, but it has its limitations as a method. Sometimes I don’t know enough about a tool to know what I want to learn. What about when I don’t know what projects I want to do with a particular tool? What about when I don’t have any sense of where to start, or how to tell whether a project is too complex for me to fumble through as a beginner by looking things up as I go?

In order to make The Programming Historian lessons work for me when I don’t know enough to know what I want to learn, I decided to use them like theoretical introductions rather than tutorials. The lessons are grouped into broad categories of what you are doing to data: “acquire,” “transform,” “analyze,” “present,” and “sustain.” I selected several lessons in the “analyze” category and read them carefully, reading but not doing the sections where the reader is invited to try things out, and started to explore what the tools mean.

Each lesson on The Programming Historian has a list of prior skills that you’ll need in order to implement the skill being taught. One fun thing about doing it backwards is that I can safely ignore this section -- even though generally, it’s a very useful section to have! -- and choose lessons I don’t necessarily have the prerequisites for. I do read it, though, so I get a sense of which skills I might want to spend time learning based on the more advanced techniques that appeal to me.

One of the lessons I read through was on analyzing documents with tf-idf, or Term Frequency - Inverse Document Frequency. Essentially, it’s a way to identify the words that are used disproportionately often in a document, as compared with a set of texts that document is from. One of the examples in the lesson uses tf-idf to determine what’s basically a list of keywords for a given obituary in the set of obituaries the New York Times has put online. Extrapolating a bit, I can see how computer text analysis can be a shortcut for non-technical tasks. Having keywords for one text may not be that useful, but when you are looking for patterns in documents, reading hundreds of lists of keywords is a lot simpler than reading hundreds of obituaries, and from there you can identify the research questions you want to hone in on and read the obituaries that matter to those questions.

Here’s a screenshot of part of this lesson.

Even without doing the exercises along with the lesson, this has been a kind of intense learning process. Each lesson definitely takes me a lot longer to read than an article of the same length would. Perhaps because the content is very new to me, and not just because the content is dense, a lesson the length of a New Yorker article might take me an hour. Each lesson also has plenty of useful further reading suggestions. This is important, but it can start to become a rabbithole -- if I wanted to fully explore one article, the amount of reading I’d have in front of me would be maybe five times the length of the article. Helpfully, the articles occasionally link words that might be new to the reader to Wikipedia articles, which is very useful for getting the relevant background. For example, some of the lessons I did were on various forms of text processing, and text processing as a field incorporates a number of terms and concepts from linguistics. These terms weren’t part of the programming lesson, but as a historian I hadn’t encountered them before, so I’d read at least the summary of the Wikipedia article.

There were definitely moments when I wished I was taking this as a class, or talking through it with someone knowledgeable. It came up when the actual next step or a process or the actual result of a given transformation was different than I thought it would be and it made me realize I didn’t understand something. For an extremely simple example, at one point a particular value was clearly being multiplied by another, when I expected it to be subtracted.

The lessons on The Programming Historian are contributed by a variety of authors and peer-reviewed before being published. I do wish that the website had some sort of “start here” guide, and some of the otherwise very well-written articles have a bunch of typos in them, but I have very few complaints. It’s a really great resource. I stuck to English but they also have a smaller set of lessons in Spanish, French, and Portuguese. Overall, whether you take my intentionally backwards approach, do the lessons with the exercises as they are designed, or seek out particular skills that will help you more forward in an existing project, The Programming Historian is good for anyone interested in expanding their qualitative research tool repertoire.