Computational literacy for the humanities and social sciences

By Eetu Mäkelä, associate professor in Human Sciences–Computing Interaction at the University of Helsinki.

This content is not yet complete, in the sense that some sections have not yet been converted from their original lecture slide format into self-contained texts for self-study. Each such section has a header similar to this at the top noting its draft status, as well as a 🏗 mark in the table of contents below.

Target audience

People of all levels in the humanities and interpretive social sciences (henceforth abbreviated as human sciences) interested in whether computational methods might help them in their own work.

Prerequisites: Absolutely none.

Aside: Why should you be interested in computational methods? Two reasons:

  1. they may allow you yourself to do your work more efficiently, and

  2. they may lead to completely new and powerful ways of addressing questions in your field.

How likely either of these is depends very much on what you are interested in, but not in any way that can be briefly enumerated. Instead, this course aims to enable you to discover that for yourself.

Course concept and learning goals

This course is an introductory course on applying modern data processing to complex social and historical data. As such, it doesn't target the whole wide world of the digital humanities and computational social sciences. Instead, it hews closely to our local focus in Helsinki, which itself aligns with the long tradition of humanities computing.

On the other hand, with regard to subfields of the humanities or social sciences, the course draws no boundaries. On the contrary, it argues that taking examples from different fields yields a deeper understanding of the possibilities afforded by computation. For more details, see the introduction.

As a signposting course, the course describes the landscape of computational human sciences. It provides students with the knowledge they need to choose their own focus within it, and to decide where to go for further knowledge.

After this course:

Yet most importantly, after this course and utilising all of the above, the student is able to:

  1. make informed decisions on which computational approaches will be of use to herself, and

  2. understand, follow and discuss the development of computational approaches within her field in general.


This course is meant both for independent self-study (reading up on only certain sections of the course) and for completion as either a contact learning course or a MOOC with a group of like-minded students. For material relating to particular instances of this latter mode of study, see here.

Workload-wise, the full course is rated at 5 ECTS, which officially translates to ~135 hours of study. However, ECTS workload ratings have always diverged from both reality and student expectations. In practice, I expect the load to be some 60-70 hours, or roughly half of the official norm. Generally, students seem to evaluate courses at this workload level as "moderate to heavyish" (because sometimes you can get 5 ECTS even for something like 25 hours, or ⅕ of the official norm, for example from just sitting in lectures for 14 x 1½ hours and then doing a couple of hours of work on top of that!).

Course contents

(🏗 marks parts of the course not yet fully converted out of lecture slide format)

  1. Introduction: three approaches to methods for digital humanists

    • Easy, ready-made tools for data cleanup, visualisation and exploration

    • Fundamentals of programming for data processing

    • Data analysis method literacy

  2. Data 🏗

General note

"At times the course felt like being hit by a bus, the way we were forced to figure out many things on our own. It did at times result in an awful lot of stress, but it actually was the best way to learn how to do these things and more importantly, how to find info on how different things work and should be done." - course feedback

There's a lot to take in during the course, and much of it may be unfamiliar and at first confusing. A major principle of the course is that you should not try to understand everything fully the first time around. While I have made an effort to keep the language and concepts as simple as I could, and to order them sensibly with regard to each other, there was often no way to arrange everything into a neat linear learning progression.

For example, to really understand easy-to-use end-user tools, one needs to know how they relate to the possibilities of computational analyses in general, as well as to different types of data and different types of preprocessing of that data. Further, to properly contextualise them, one also needs to understand how their affordances differ from those available to users of programmatic analysis libraries. However, ready-to-use tools are still presented before programming, data transformations and computational analyses, because I feel that having tried them in practice provides a good springboard for understanding these more abstract and complex topics.

Thus, when going through the course and doing the assignments, try not to be bothered by not understanding everything at the first go. Instead, it is enough at each point to have even a vague general notion or gist of things, and to trust that it will all make sense in the end, once you've gone through all the subtopics.

Practical matters

  • The course has a Slack workspace, used both for returning some assignments and for peer and teacher support. Please join it.

  • For linking to quotes in their original context, the course uses To be able to use this, you must join the CLIT4HSS group (as well as register in general if you don't already have an account). You also naturally need access to the sources (most commonly through accessing them from a university network / VPN. For example for Helsinki, see this guide).

  • If you use the material for self-study and it ends up being useful for you, I'd appreciate a note about this. Feel free to send it through Slack, e-mail, Twitter or wherever you find me.


The text of this course is licensed under a Creative Commons Attribution 4.0 International License. This means that you are free to use, embed, remix and further develop any part of this course for use in your own course or other material. The only requirement is that you give appropriate credit for this material, provide a link to the license, and indicate if changes were made (see the license for more details).

If you do make use of this material, I'd naturally also appreciate a ping, as well as the possibility to merge any improvements to this version, even if neither of those is actually required by the license.

For access to the source code of this GitBook, please see this GitHub repository.