
28 Apr 2017

Open data in educational research

My work is in digital collaborative learning, and I am very interested in how standards for educational data are developing. I was hoping to find discussions about this at the RDA Plenary, but there were no specific sections on educational data. It is possible that the field is too young and needs more time to come to an internal understanding before we begin the process of formalizing standards and protocols.

In the field, there are growing calls for standardizing access to learning data, some of which are coming from the burgeoning Learning Analytics and Knowledge (LAK) community. I worked for several years as a MOOC (Massive Open Online Course) coordinator at the University of Toronto, and saw first-hand how researchers spent hundreds of hours cleaning up and reformatting MOOC data, with very little documentation about formats and the meaning of values, before they could begin the actual analysis. And because all of this work was done ad hoc, other researchers wishing to do similar work would have to start all over again.

There have been some efforts at standardizing MOOC data, such as MOOCdb from MIT, as well as efforts to lobby MOOC vendors to better document their formats. Some groups have also begun releasing tools that the community can use to process MOOC data, but we have still not made much progress. Another standard that is currently gaining a lot of steam is the Experience API, or xAPI, a very lightweight protocol for reporting activity data based on subject-verb-object statements, for example { peter, watched, video2 }. In the tradition of linked data, however, peter would be a student ID, watched would be a specific verb backed by a definition and specified as a URI, and video2 would be an ID or a URI for the video.
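
To make the subject-verb-object idea concrete, here is a minimal sketch of how the { peter, watched, video2 } example might look as an xAPI-style statement with actor, verb, and object fields. The student ID, the example.org URIs, and the helper name make_statement are all hypothetical, chosen only for illustration; a real deployment would use verb URIs agreed on by the community.

```python
import json


def make_statement(student_id, verb_uri, verb_label, object_uri):
    """Build a minimal actor/verb/object statement as a plain dict."""
    return {
        "actor": {
            "objectType": "Agent",
            # Hypothetical account home page; the student is identified by ID, not by name.
            "account": {"homePage": "https://example.org/students", "name": student_id},
        },
        # The verb is a URI backed by a definition, plus a human-readable label.
        "verb": {"id": verb_uri, "display": {"en-US": verb_label}},
        # The object is identified by a URI for the video.
        "object": {"objectType": "Activity", "id": object_uri},
    }


statement = make_statement(
    student_id="student-1234",                      # hypothetical ID standing in for "peter"
    verb_uri="https://example.org/verbs/watched",   # hypothetical verb URI
    verb_label="watched",
    object_uri="https://example.org/videos/video2", # hypothetical video URI
)
print(json.dumps(statement, indent=2))
```

Even this stripped-down statement is noticeably longer than the three-word tuple it represents, which is the price of making each statement self-describing.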

A key idea behind xAPI is to track learning across multiple platforms and media, and to encourage interoperability. For this to happen, however, the community will need to agree on "recipes" for how to represent various types of data. Take a video-playing activity: whether I am using YouTube, Vimeo, or a custom video player, the event "beginning to watch a video" should be represented by a single unique activity verb, and not interchangeably by "open", "start", "view", "begin", "play", etc. Community efforts to address this are ongoing.
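
As a rough illustration of what such a recipe buys us, the sketch below maps the different names a platform might use for "beginning to watch a video" onto one canonical verb URI. The URI, the synonym list, and the function name canonical_verb are assumptions made up for this example, not part of any published recipe.

```python
# Hypothetical canonical verb URI for "beginning to watch a video".
CANONICAL_PLAY = "https://example.org/verbs/played"

# Platform-specific event names that should all collapse to the same verb.
VIDEO_START_SYNONYMS = {"open", "start", "view", "begin", "play"}


def canonical_verb(platform_event):
    """Return the single agreed-upon verb URI for a platform's event name."""
    name = platform_event.strip().lower()
    if name in VIDEO_START_SYNONYMS:
        return CANONICAL_PLAY
    # Unknown events are rejected rather than silently given a new verb.
    raise ValueError(f"no recipe covers the event {platform_event!r}")


# Three players, three event names, one verb.
verbs = {canonical_verb(event) for event in ["play", "Start", "begin"]}
print(verbs)
```

With a mapping like this applied at the source, an analysis that counts video starts does not need to know which player produced the event.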

One exciting aspect of xAPI is related to privacy and student ownership of data. Event statements are sent to learning record stores, which can federate data, and the same data can be sent to multiple stores. One approach I just heard about is to send data on student learning to the student's private learning record store, leaving it up to the student to choose whether to share it with the course team or researchers. All the statements are digitally signed, however, so that if the student chooses to share the event data, the instructors/course team can be assured that the data has not been tampered with.
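
The tamper-evidence idea can be sketched in a few lines. Note the assumptions: this example uses an HMAC with a shared secret from the Python standard library as a stand-in, whereas the scenario above would more plausibly rely on public-key signatures so that the course team can verify statements without being able to forge them. The key, the simplified flat statement, and the function names are all hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(statement):
    """Serialize with sorted keys so the same statement always yields the same bytes."""
    return json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_statement(statement, key):
    """Return a hex signature over the statement (HMAC-SHA256 stand-in)."""
    return hmac.new(key, canonical_bytes(statement), hashlib.sha256).hexdigest()


def is_untampered(statement, signature, key):
    """Check that the statement still matches the signature it was issued with."""
    return hmac.compare_digest(sign_statement(statement, key), signature)


key = b"hypothetical-platform-secret"
statement = {  # simplified, flat stand-in for a full statement
    "actor": "student-1234",
    "verb": "https://example.org/verbs/watched",
    "object": "https://example.org/videos/video2",
}
signature = sign_statement(statement, key)

# A student edits the statement before sharing it; the signature no longer matches.
tampered = dict(statement, object="https://example.org/videos/video3")
print(is_untampered(statement, signature, key), is_untampered(tampered, signature, key))
```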

One challenge we face with implementing xAPI as a general standard for our own research platform is that each event is quite verbose: the idea is that each statement should be able to stand on its own (to enable easier aggregation and federation across Learning Record Stores). However, some of the data we will collect might be very granular, for example every single keypress in a collaborative editing situation (to be able to analyze collaborative strategies). Storing 1 KB per keypress seems excessive, and infeasible at MOOC scale.
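
A back-of-the-envelope calculation shows why this adds up quickly. Only the 1 KB per keypress comes from the estimate above (taken here as 1,000 bytes); the enrolment and keypress counts are hypothetical round numbers picked purely for illustration.

```python
BYTES_PER_STATEMENT = 1_000  # roughly 1 KB per keypress statement, as estimated above


def storage_bytes(students, keypresses_per_student, bytes_per_statement=BYTES_PER_STATEMENT):
    """Total bytes needed if every keypress becomes one standalone statement."""
    return students * keypresses_per_student * bytes_per_statement


# Hypothetical course: 10,000 active students typing 5,000 keypresses each.
total = storage_bytes(students=10_000, keypresses_per_student=5_000)
print(f"{total / 1e9:.0f} GB")  # prints "50 GB"
```

Under these assumed numbers a single course produces 50 million statements, and almost all of the stored bytes are repeated context rather than the keypress itself.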

In conclusion, our field faces many challenges, but it also has some exciting new initiatives and burgeoning collaborations. It would be interesting to hear about other fields that face similar challenges and to exchange ideas about solutions. I will certainly keep the RDA community in mind while participating in this process, to see if we can eventually produce standards that could be codified through a working group process.
