Book Review: Learning SPARQL
I've been experimenting with SPARQL for some time and was lucky enough to have had some training on it at work, but on several occasions while reading Bob DuCharme's Learning SPARQL I discovered something new that this very powerful language could do. The book provides quite a detailed overview of the capabilities of the language and takes the reader right from their first steps in constructing a query through to using it as a data source for programs. The capabilities of both SPARQL 1.0 and 1.1 are covered, with warnings when commands only work in 1.1. If you are looking to take your first steps in learning SPARQL, or you can already write queries and would like to enhance your skillset, perhaps by exploring topics such as creating, updating and validating data, then you may well find this book very useful.
The book opens by getting the reader to dive straight into writing some SPARQL queries. I thought this was rather a good approach: learning about SPARQL and Linked Data means learning a lot of new terms and concepts, and having to wade through these at the start could put people off. So instead the first chapter explains the very basics, gets the reader set up with the software they will need to run the examples in the book (it is Java-based, so it works on Linux, Mac and Windows) and then goes through some sample queries to give a taste of what is possible. At the end of the first chapter the book moves on to answering a real question with real data from DBpedia – which albums have been produced by Timbaland? It is a fun query that, with the rest of the chapter, leaves the reader with some sense of what is possible.
If the reader is wavering over whether to commit to learning SPARQL, the preface, which outlines the reasons for doing so, and the first chapter, which gives a sense of what is possible, may well be persuasive and ease the reader into the more detailed coverage that follows in the rest of the book. The second chapter tackles the terminology but somehow avoids making this a dry topic. The author also mentions that semantic web technology is popular in the medical research and intelligence communities. I was unaware of this, and it cements the idea that SPARQL is potentially a very useful skill to have!
Chapter three takes a much more detailed look at SPARQL queries. A wide range of topics is covered, with a lot of very useful information. At a few points in the book, and particularly in this chapter, I wondered if it might have been better to split the chapters into smaller chunks. Each chapter gives a lot of information and covers quite a wide topic area. If you were working through the book and following the examples, each chapter could take quite some time to get through, which might give a misleading impression that you are not making progress.
The next few chapters move on from the read-only world and cover topics such as creating, copying and updating data. Usefully, they also cover how to validate data, not just for errors but also for compliance with business rules (the book gives an example of an expenses report dataset and shows how to find instances where the wrong grade of employee had approved an expenses report). Creating and updating data are, of course, mainly SPARQL 1.1 features; they are quite new and might not always be available. However, thanks to the sample data provided for the book and the software it suggests, the reader can gain experience in using these features of the language.
The book finishes with a quick introduction to using SPARQL in programs. The last chapter provides quite a lot of useful information, including some discussion of production performance and integration considerations. It was nice to see some sample scripts in Perl and Python that could be built upon, and while this part of the book is not exhaustive (it would be possible to write another book on using SPARQL in programs) it gives some very useful information and a solid starting point.
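As an illustration only (this is not code from the book), a minimal Python sketch of calling a SPARQL endpoint from a program might look like the following. It assumes the public DBpedia endpoint at https://dbpedia.org/sparql and the dbo:producer property for the Timbaland question from chapter one; the function names are hypothetical, and only the Python standard library is used.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the public DBpedia endpoint; the review does not name a URL.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Assumption: albums are linked to their producer via dbo:producer.
# The book's own query may be worded differently.
TIMBALAND_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT DISTINCT ?album WHERE {
  ?album a dbo:Album ;
         dbo:producer dbr:Timbaland .
}
""".strip()


def build_request(endpoint, query):
    """Build (but do not send) a GET request carrying the query text
    in the `query` parameter and asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def extract_values(results, variable):
    """Return the values bound to `variable` in a parsed
    SPARQL JSON results document, skipping rows where it is unbound."""
    return [
        row[variable]["value"]
        for row in results["results"]["bindings"]
        if variable in row
    ]


def run_query(endpoint, query, timeout=30):
    """Send the query over HTTP and return the parsed JSON results."""
    with urlopen(build_request(endpoint, query), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Needs network access; results depend on the current state of DBpedia.
    for album in extract_values(run_query(DBPEDIA_ENDPOINT, TIMBALAND_QUERY), "album"):
        print(album)
```

Keeping the request building and result parsing in separate functions means both can be checked without touching the network, which is handy when the endpoint is slow or unavailable.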
Novice computer users do not seem to be the target audience. A certain level of knowledge about topics such as XML and SQL is assumed, but you do not need to know much about the semantic web. This might make it suitable for people such as developers, librarians and analysts who may have used XML and SQL databases but want to make use of data sources that offer SPARQL endpoints. The author avoids overly academic language and has an approachable writing style that keeps practical considerations in focus. I learnt a lot while reading this book and can see myself going back to it as a reference and also to explore further some of the topics it covers.
The review copy of the book was obtained through the O'Reilly Blogger Review program.