All posts by Thijs van Beek

VU Humanities Graduate Seminar January 30th, 2014

Today we attended and presented at the Faculty of Humanities at the VU during the afternoon program of the graduate seminar. The afternoon, themed Revolutions in the Humanities, consisted of several project presentations and ended with a panel discussion between professors from the university whose work contributes to the Faculty of Humanities.

The discussion was triggered by a themed edition of De Groene Amsterdammer, an independent Dutch weekly magazine, titled Humanities, alive and kicking. Part of the emphasis in this edition was on embedding knowledge and resources from the beta sciences, which was proposed as one reason for the current successes within the field. The search for scientific relevance in the humanities has been an important topic of discussion ever since the quest for valorization that followed the economic crisis and the subsequent budget cuts and reorganizations in scientific institutions.

The leading idea is that technology and the widespread accessibility of information should play an integral part in the social sciences. Teaming up with the beta sciences, and information scientists in particular, is a start for this scientific revolution. This idea is not new, and a relevant question is why the humanities are late to the game. The panel discussion exposed a lack of interest, a lack of understanding, or both.

While there is much ground to cover, we believe that our project, INVENiT, is a good example of how close interaction between the alpha and beta fields, combined with collaborative research goals, leads to a fascinating new research approach and holds great potential for acquiring new knowledge and insights. The reactions we received indicate that our brief presentation already inspired people from different fields and backgrounds to rethink how technology and information can influence their research efforts. Take a look at the presentation below and if you have any questions, please get in touch.

Links

Delpher

The Koninklijke Bibliotheek launched Delpher, a search engine giving access to millions of historical text resources, ranging from magazines and newspapers to books. The data is available via a public API and can also be downloaded as a complete dataset.

To us, Delpher is a great example of the opportunities that lie in making data publicly available. It serves a wide variety of users, from researchers in historical fields (art history, general history, anthropology, etc.) to a more general public who would like to look up something from the past of their own lives.

It is a tremendous effort to digitize, store and make available these enormous amounts of information. To make it accessible without annotation, Delpher uses Optical Character Recognition to build full-text indexes.
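To make the idea of a full-text index concrete, here is a minimal sketch in Python. It is not Delpher's implementation, which we have not seen; the page identifiers, the sample texts and the function names are all made up for illustration. It maps every word found in the OCR output to the pages it occurs on, and it also shows why OCR quality matters: a misread word is simply not found.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: token -> set of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return the ids of pages that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical OCR output; page-3 simulates a misread character ("1" for "l").
pages = {
    "page-1": "Philips opens a new factory in Eindhoven",
    "page-2": "The factory in Eindhoven closes",
    "page-3": "Phi1ips lamps on sale",
}
index = build_index(pages)
```

With this toy data, `search(index, "factory eindhoven")` finds the first two pages, while `search(index, "Philips")` finds only `page-1`: the misread `Phi1ips` on `page-3` stays hidden from the user.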

While the technological feats are great, it is important to be critical as well. First, Delpher states that it chooses quantity over quality and that the technology it uses is not yet capable of precise OCR, let alone of recognizing context or meaning. How these challenges are currently addressed is unclear.

Playing around with Delpher quickly shows it is slow. This can be fixed by scaling up resources, by using different or better search algorithms, or both. Because Delpher is not open about either, it is unclear which approach would pay off more in terms of search times and investment cost.

A quick test also shows that Delpher can return a very large number of results. Querying Philips returns almost a million of them. It is doubtful that these are all relevant, and in such a case, without filtering, prioritizing and ranking the results, the search engine becomes difficult to use and understand.
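As an illustration of what ranking could look like, the sketch below scores documents with a simple TF-IDF weighting and returns the best matches first. This is one common textbook technique, not something Delpher is known to use, and the documents and function names are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(documents, query):
    """Return ids of matching documents, ordered by descending TF-IDF score."""
    counts_by_doc = {doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()}
    n_docs = len(counts_by_doc)
    scores = {}
    for doc_id, counts in counts_by_doc.items():
        total = sum(counts.values())
        score = 0.0
        for term in tokenize(query):
            # Term frequency: how prominent the term is within this document.
            tf = counts[term] / total if total else 0.0
            # Document frequency: in how many documents the term occurs.
            df = sum(1 for c in counts_by_doc.values() if term in c)
            # Smoothed inverse document frequency: rarer terms weigh more.
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
            score += tf * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical documents, made up for this example.
documents = {
    "a": "Philips Philips lamps",
    "b": "A long report about the harbour that mentions Philips once",
    "c": "Weather in Rotterdam",
}
```

Here `rank(documents, "Philips")` puts document `a`, which is mostly about Philips, ahead of document `b`, which only mentions the name in passing, and leaves out document `c` entirely.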

We also argue that the interaction and presentation of the front end are not very modern and lack general UX quality. The aim of the project is to make historical texts available to the public, and with that comes the responsibility to make the data consumable.

At this time, the website still carries the BETA label and we are curious to see improvements over time. Delpher is a great tool for anyone interested in historical text and for data scientists. Head over and take a look at http://www.delpher.nl.

PiLOD 2.0

PiLOD 2.0 is the second iteration of the Linked Open Data think tank, a joint effort to make government data open and linkable by addressing legislative and technical problems. These issues are addressed in the form of seven cases, each of them focussing on a different aspect of the problem space and worked on by experts from public and private institutions. The cases look at many aspects of LOD, for instance provenance of laws and court rulings, valorization of publicly funded scientific efforts within the LOD community, awareness among ministers and legislators, etc.

The case Frontend, initiated by Waag Society, is particularly interesting as it has similar objectives to what we are doing for INVENiT. Waag has recently published a beautiful interactive map with information about all buildings in the Netherlands, colored by their age. The map data comes from the Kadaster and is made available via the public CitySDK. We urge you to take a look if you haven’t already.

The example embodies our mission: making linked data useful. Of course we can debate what usefulness means, and it depends on many aspects of an application, such as the nature of the data, the audience and the goal of a project. Still, the example shows us that good application design, in which LOD is an integral part of the design and development process, can lead to stunning results.

We are very eager to see if there is enough common ground to team up with Waag and others in their case and see if we can use the Rijksmuseum collection to prove to the world the benefits and potential of LOD. The next PiLOD meeting will be held at the VU on January 29th (subject to change) and might be the perfect opportunity to get better acquainted with the other parties and the project.

We would also like to take this opportunity to thank the NWO for hosting the event and the other case leaders and speakers for their inspiring talks and demonstrations.

The PiLOD project has a website that is publicly accessible to anyone interested and a newsletter that is sent out by Geonovum, one of the more visible participants.