In this post I continue my exploration of knowledge technologies as tools for knowledge building (and occasionally as tools useful for teachers).
Literature reviews are ubiquitous in teaching and research. They come in many shapes: from 500 words as part of an essay assignment, to substantive sections in Master's and PhD theses, to the systematic reviews written by career academics. While much has been written on the literature review and the literature search, I want to focus here on how to turn the references themselves into knowledge, and on how students can learn about knowledge building in this way.
I mentioned before that knowledge building is not identical with individual learning but that it builds on the notion of idea advancement and involves the creation of shared—ideally public—knowledge artefacts. The key artefact in our case is a kind of database: students build and contribute to databases that capture meaning, are distributed over the Internet, and can be queried by humans and machines. I prefer the term “knowledge base” for this kind of artefact.
In a nutshell, I am suggesting that one way of implementing knowledge building pedagogy is to have students build a knowledge base from (a) the bibliographic references and (b) the thinking used for writing a literature review.
What more can be done with bibliographies? The next step is to make the relations between the references, and between the references and students’ reasoning, explicit and machine-readable. Once semantic relations are described in a format computers can process, computers can do more than search for references: they can now also search for how students think about the relations between references. They can, for instance, answer requests such as “Show me all references that support the claim made in study s” or “What are the categories used for classifying reference r?”
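To make this concrete, here is a minimal sketch of two such queries, written with the Python rdflib library rather than a full triple store; the namespace, the file name and the supportsClaim predicate are hypothetical placeholders for whatever vocabulary a class agrees on.

```python
# A minimal sketch using rdflib. The ex: namespace, the file name and the
# supportsClaim predicate are hypothetical placeholders, not part of any standard.
from rdflib import Graph

g = Graph()
g.parse("references.ttl", format="turtle")   # the students' triples, as a Turtle file

# "Show me all references that support the claim made in study s"
supports_query = """
PREFIX ex: <http://example.org/bibliography/>
SELECT ?ref WHERE {
    ?ref ex:supportsClaim ex:claimFromStudyS .
}
"""
for row in g.query(supports_query):
    print(row.ref)

# "What are the categories used for classifying reference r?"
categories_query = """
PREFIX ex: <http://example.org/bibliography/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?category WHERE {
    ex:referenceR dc:relation ?category .
}
"""
for row in g.query(categories_query):
    print(row.category)
```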
From a teacher’s perspective, the same queries can be asked on a per-student basis, for instance to see how an individual student has classified and related the references.
But most importantly from the knowledge building perspective, bibliographic materials can now be linked to other types of data available on the Internet, such as people, arguments, and ideas. Next I describe how to get there.
A student creates a semantic bibliography using standardized notations for describing references, concepts and relations. Even though at this stage the student may work individually (it makes as much if not more sense as a group task), the requirement that the reference database be useful for others, and for machines, means that the references need to be described with a controlled vocabulary and related to each other in an agreed manner.
Information scientists have developed standard forms for describing content. The most frequently used one is Dublin Core, a set of fifteen generic, widely used elements such as Creator, Contributor, Publisher, Title, Subject, and Description. These were first drafted in 1995 at a meeting in Dublin, Ohio, to facilitate information discovery on an explosively growing Internet. The most recent version extends this core vocabulary considerably.
For a student, describing a bibliographic resource using the DC standard means writing statements in a subject-predicate-object format, with Dublin Core providing the predicates, like so: <myRef> dc:creator <author>. Given that many bibliographic materials have a DOI, it is practical to link to the respective webpage: <myRef> dc:identifier <http://doi...>.
The subject-predicate-object statements, simple as they are, constitute the backbone of the semantic web. They are part of the Resource Description Framework (RDF) specification, a web standard. RDF statements can be about anything, not just web resources. For instance, to express that I see a particular reference as belonging to a specific content category, I could state: <myRef> dc:relation <myCategory>.
I may find the relation predicate that is part of Dublin Core too generic for my purposes and therefore create my own predicates. For instance: <myRef> :supportsClaim <myClaim>. The supportsClaim predicate is not part of Dublin Core; it is a predicate that I invented. And myClaim is not necessarily something that exists on the web; it can simply be something I make statements about, like so: <myClaim> dc:description <some text>. A good heuristic, though, is to avoid creating one’s own vocabularies as much as possible and to re-use existing ones; it makes it much easier to find related content.
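For readers who prefer code over spreadsheets, here is a minimal sketch of writing such statements with rdflib; the base namespace, the reference URI and the supportsClaim predicate are invented placeholders, and only the dc: terms come from Dublin Core.

```python
# A minimal sketch (using rdflib) of the statements discussed above. The base
# namespace and the supportsClaim predicate are invented placeholders; only the
# dc: terms come from the Dublin Core standard.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DC

MY = Namespace("http://example.org/myvocab/")   # my own, invented vocabulary

g = Graph()
g.bind("dc", DC)
g.bind("", MY)

my_ref = URIRef("http://example.org/refs/myRef")
my_claim = URIRef("http://example.org/claims/myClaim")
my_category = URIRef("http://example.org/categories/myCategory")

g.add((my_ref, DC.relation, my_category))        # <myRef> dc:relation <myCategory>
g.add((my_ref, MY.supportsClaim, my_claim))      # <myRef> :supportsClaim <myClaim>
g.add((my_claim, DC.description, Literal("Some text describing the claim.")))

print(g.serialize(format="turtle"))
```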
Here is an example describing a journal paper, Baker et al. (2013), mainly in Dublin Core terms:
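A sketch of what such a description could look like in Turtle notation, loaded here with rdflib; the author names, title, DOI and concept URIs are placeholders rather than the paper’s actual metadata.

```python
# A hypothetical sketch of a Dublin Core description of the Baker et al. (2013)
# reference. Author names, title, DOI and concept URIs are placeholders, not the
# paper's actual metadata.
from rdflib import Graph

ttl = """
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/bibliography/> .

ex:Baker2013
    dc:creator    "Baker, A." , "Second-Author, B." ;        # placeholder names
    dc:date       "2013" ;
    dc:title      "Placeholder title of the paper" ;
    dc:identifier <https://doi.org/10.xxxx/placeholder> ;    # placeholder DOI
    dc:relation   ex:ArtElicitation , ex:KnowledgeOutcomes . # classification
"""

g = Graph()
g.parse(data=ttl, format="turtle")
print(g.serialize(format="turtle"))
```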
The point is not to create complete metadata for a reference; rather, the task for the student should be to add classifications and relations. With such a database in place, concept maps can be created automatically. For the Baker et al. (2013) reference, it might look like this:
I used the free version of GraphDB™ for this example. This is a database engine that stores data in the form of subject-predicate-object triples. These triples can be searched, ordered and filtered with the database query language SPARQL (mentioned before), and they can also be visualised in a “clickable” graphical form.
The visualisation shows on the right-hand side what I mean by connecting references to concepts and ideas. The reference Baker et al. (2013) describes how art elicitation is used as an assessment method for knowledge outcomes. Shown on the right-hand side is a small kind of thesaurus for Learning outcomes, with Knowledge outcomes as a narrower category, and Climate science knowledge as an even narrower category. In this manner, the Baker et al. (2013) reference is placed into a context of educational concepts. Not only that: all other references that might be linked to Knowledge outcomes would now also be linked to Baker et al. (2013). The query engine is able to find and follow such relations, which means in effect that it can retrieve semantically related references (and other resources). And those can sit anywhere on the Web; they don’t have to sit on the student’s computer.
(If you look closely, the little thesaurus on the right is nothing other than subject-predicate-object statements, using a standard language for describing thesauri called SKOS, the Simple Knowledge Organization System. You see, RDF is not only good for describing knowledge; with it we can also describe knowledge about knowledge. This is the link to Automated Reasoning and Artificial Intelligence; more on this in another post.)
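To illustrate, the little thesaurus can be written down as SKOS statements, and a SPARQL property path then retrieves every reference linked to Learning outcomes or to any narrower concept. This is only a sketch; the concept and reference URIs are made up.

```python
# A sketch of the thesaurus from the visualisation expressed in SKOS, plus a
# query that follows skos:narrower chains. Concept and reference URIs are
# invented placeholders.
from rdflib import Graph

ttl = """
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix ex:   <http://example.org/bibliography/> .

ex:LearningOutcomes   skos:narrower ex:KnowledgeOutcomes .
ex:KnowledgeOutcomes  skos:narrower ex:ClimateScienceKnowledge .

ex:Baker2013          dc:relation   ex:KnowledgeOutcomes .
ex:SomeOtherReference dc:relation   ex:ClimateScienceKnowledge .
"""

g = Graph()
g.parse(data=ttl, format="turtle")

# Find all references classified under Learning outcomes or any narrower concept
query = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX ex:   <http://example.org/bibliography/>
SELECT ?ref ?concept WHERE {
    ex:LearningOutcomes skos:narrower* ?concept .
    ?ref dc:relation ?concept .
}
"""
for row in g.query(query):
    print(row.ref, row.concept)
```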
Firstly, teachers have to specify the task of creating an annotated bibliography, perhaps in combination with a literature review assignment. Ideally, this task involves making these resources publicly available (in text form and as a reference file in a widely used format, such as RIS), not just uploading them to a Canvas assignment. Of course, students need to agree to this.
Secondly, teachers need to introduce students to the idea of creating and curating a bibliographic knowledge base. A convenient way of creating the knowledge base is with a spreadsheet of three columns: subject, predicate, object (see the example above). The entries need to obey the (minimal) RDF syntax, and students need to learn a bit about the Dublin Core predicates. Dublin Core is quite generic and does not really express relationships well. The good news is that more nuanced RDF thesauri/ontologies exist for any academic discipline and field, and there is nothing wrong with familiarising students with at least one of these. For instance, this site lets one search for existing ontologies in the life sciences.
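As an illustration, and assuming the spreadsheet is exported as a CSV file without a header row and with full URIs in the cells, a few lines of Python with rdflib can turn it into a proper RDF file; the file names are placeholders.

```python
# A sketch for turning the three-column spreadsheet (exported as CSV, no header
# row) into an RDF file. It assumes subjects and predicates are full URIs and
# treats objects that do not start with "http" as plain text literals.
import csv
from rdflib import Graph, Literal, URIRef

g = Graph()

with open("triples.csv", newline="", encoding="utf-8") as f:
    for subject, predicate, obj in csv.reader(f):
        o = URIRef(obj) if obj.startswith("http") else Literal(obj)
        g.add((URIRef(subject), URIRef(predicate), o))

g.serialize(destination="references.ttl", format="turtle")
print(f"Wrote {len(g)} triples to references.ttl")
```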
With a well-formed spreadsheet in place, the triple statements can be imported into pretty much any triple store server on the market, such as the already mentioned GraphDB™ (a commercial product, but free and fully functional in a desktop version) or the open source Jena Fuseki server.
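For the Fuseki route, here is a sketch of the upload step over the SPARQL Graph Store HTTP Protocol, assuming a locally running server and a dataset called “bibliography”; adjust names and port to your own setup.

```python
# A sketch (not a recipe) for pushing the Turtle file into a locally running
# Fuseki server via the SPARQL Graph Store HTTP Protocol. The dataset name
# "bibliography" and the default port 3030 are assumptions about your setup.
import requests

with open("references.ttl", "rb") as f:
    response = requests.post(
        "http://localhost:3030/bibliography/data?default",  # default graph of the dataset
        data=f,
        headers={"Content-Type": "text/turtle"},
    )

response.raise_for_status()
print("Upload accepted with status", response.status_code)
```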