Ordnance Survey UK Improve Their Open Data

When the UK’s data sharing website data.gov.uk launched I was pretty unimpressed. I mentioned a few things that annoyed me: Where were the examples? What were the ontologies used? Without this information the provision of a SPARQL endpoint is fairly meaningless.

Well it turns out that one section of the government is getting stuck in. Maybe I should have remembered that the marketeers love a launch without a product, and that the people doing the real work are up late, slaving away cursing their managers, trying to get the stuff out the door. Just saying; it’s not like I’ve ever seen anything like that in my job :)

Anyway, I already liked the efforts the UK’s Ordnance Survey were making and, defying the normal stereotype of public sector computing, they have not been content with their first or even their second stab at presenting a linked data interface to their info-sets.

http://data.ordnancesurvey.co.uk/ presents examples, a SPARQL endpoint, and the ontologies used, including standard ontologies like FOAF. Nice!

Now what can you do with any of this?

Well last week I was in the UK, in Kingham. If I create a SPARQL query like this:

CONSTRUCT {
?Place a <http://data.ordnancesurvey.co.uk/ontology/50kGazetteer/NamedPlace> .
?Place a ?Type .

?Place <http://www.w3.org/2004/02/skos/core#broader> ?BiggerPlace .
?BiggerPlace a <http://data.ordnancesurvey.co.uk/ontology/50kGazetteer/NamedPlace> .
?BiggerPlace <http://www.w3.org/2000/01/rdf-schema#label> ?BiggerPlaceName .
?BiggerPlace <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains> ?Place .
}
WHERE
{
?Place <http://www.w3.org/2000/01/rdf-schema#label> 'Kingham' .
?Place <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?Type .
?BiggerPlace <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains> ?Place .
?BiggerPlace <http://www.w3.org/2000/01/rdf-schema#label> ?BiggerPlaceName
}

and enter it on the endpoint page (http://api.talis.com/stores/ordnance-survey/services/sparql) then I get back a graph of information about the places that say they contain Kingham. I also get a URL for Kingham (http://data.ordnancesurvey.co.uk/id/7000000000008699*) which I can use from now on as a unique identifier in my code for the Civil Parish that is Kingham.
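For anyone who would rather script this than paste it into a form, here is a minimal Python sketch of how the request could be put together. It assumes the endpoint follows the standard SPARQL protocol, where the query travels in a `query` URL parameter. The function name `build_sparql_url` is made up for illustration, the query is a cut-down SELECT version of the one above, and the sketch only builds the URL rather than fetching anything.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://api.talis.com/stores/ordnance-survey/services/sparql"

# Cut-down SELECT version of the Kingham query from the post.
QUERY = """
SELECT ?Place ?BiggerPlace ?BiggerPlaceName
WHERE {
  ?Place <http://www.w3.org/2000/01/rdf-schema#label> 'Kingham' .
  ?BiggerPlace <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/contains> ?Place .
  ?BiggerPlace <http://www.w3.org/2000/01/rdf-schema#label> ?BiggerPlaceName
}
"""


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_sparql_url(ENDPOINT, QUERY)

# Decode the URL again to check the query survived the encoding intact.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

The resulting URL could then be handed to whatever HTTP client you prefer; what comes back depends on the endpoint and on the result format you ask for.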

This to me is exactly what government should be lending to the data world. The administrative levels in the Ordnance Survey data can be linked through to election results, the provision of services, etc. A commitment on the part of an authority to maintain a high level of integrity for such data can provide a genuinely valuable resource.

Technology and governments do not usually go well together. The thing about data though is that it really isn’t about technology. The only criterion for success is availability.

It’s the business of governments to supply services to all their citizens and with a fair degree of equality (hopefully). To assess the success of governance requires a lot of categorization and correlation: the number of doctors per 1,000 people; the average wealth in a given district; employment levels, etc. So the work is already being done. Making it open means we get more value for our taxes, accountability increases and we get a data set that allows us to talk authoritatively of entities within a state.

http://data.ordnancesurvey.co.uk/id/7000000000008699 refers to the Civil Parish of Kingham, not the village, or some other nebulous form. The Kingham link describes the nature of this relationship by giving its type as a Civil Parish. Another graph might describe the village and also form a relationship such as:

<http://example.com/ukplaces/villages/Kingham> <http://www.w3.org/2004/02/skos/core#broader> <http://data.ordnancesurvey.co.uk/id/7000000000008699> .

letting us know that the village and the civil parish have a strong relationship.
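In code, a statement like that is just three URIs in a row. The following is a minimal Python sketch, using only the standard library, that writes it out as one N-Triples line. The example.com village URI is the made-up one from the text, and the helper name `ntriple` is an assumption for illustration.

```python
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"
VILLAGE = "http://example.com/ukplaces/villages/Kingham"  # made-up URI from the text
PARISH = "http://data.ordnancesurvey.co.uk/id/7000000000008699"


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Format three URIs as a single N-Triples statement."""
    return f"<{subject}> <{predicate}> <{obj}> ."


line = ntriple(VILLAGE, SKOS_BROADER, PARISH)
print(line)
```

Because the object is the Ordnance Survey identifier itself, anything else that uses that identifier is automatically talking about the same civil parish.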

Sure, there are things wrong with the OS data but, bucking my usual nature, I’m not going to complain about them. Why? Because I trust them to make their data even better in future. That’s a rare enough thing for me to expect in a commercial product and almost unheard of in the public sector.

Tim (not the other Tim)

*If you don’t have an RDF plugin, look at these by prefixing the RDF URLs with http://demo.openlinksw.com/ode/?uri=


How To Improve Data.gov.uk

OK, so the Data.gov.uk stuff hit a raw nerve with me yesterday. In itself, it was pretty disappointing and the reporting also touched on one of my pet hates: Why do a lot of journalists not ask questions any more? They just seem to repost what they’re told. But, I’m past all that now, so here are a few thoughts on how a government data website could be better implemented.

The W3C put a bit of effort into a data browser extension for Firefox called Tabulator. It’s nice and deserves to be better. If I look at a page on Wikipedia with Firefox, say for example

http://en.wikipedia.org/wiki/Ystrad_Meurig

Then if there is data I am interested in as a starting point, I can go to DBpedia and at http://dbpedia.org/resource/Ystrad_Meurig my data browser extension will find a graph of data that it can understand.

This is really nice.

The Data.gov.uk site has a query page for looking at its semantic data, but no clues as to what I can ask for. DBpedia has a query page and from my Ystrad Meurig example I already know a lot about how DBpedia might be storing its data. I know that I can ask for latitudes and longitudes using the W3C positioning predicates (using Tabulator I can browse from the data page to those predicates and find out more about them – it’s all linked). I know DBpedia has a thing called ‘distance to Cardiff’ so I could query DBpedia for all things with a distance to Cardiff that have latitudes and longitudes and then I could plot them on a map.
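A rough Python sketch of that idea follows, under some loudly stated assumptions. The latitude and longitude predicates are the W3C WGS84 ones, but the ‘distance to Cardiff’ property URI is a placeholder because the real one is not given here, and the response is an invented sample in the standard SPARQL JSON results layout rather than anything DBpedia actually returned.

```python
import json

# W3C WGS84 vocabulary for latitude and longitude.
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
# Placeholder: the real DBpedia property URI is not given in the post.
DISTANCE_TO_CARDIFF = "http://example.org/distanceToCardiff"

QUERY = f"""
SELECT ?thing ?lat ?long
WHERE {{
  ?thing <{DISTANCE_TO_CARDIFF}> ?distance .
  ?thing <{GEO}lat> ?lat .
  ?thing <{GEO}long> ?long
}}
"""

# Invented sample in the SPARQL JSON results layout, standing in for a real response.
SAMPLE_RESPONSE = """
{"head": {"vars": ["thing", "lat", "long"]},
 "results": {"bindings": [
   {"thing": {"type": "uri", "value": "http://example.org/place/A"},
    "lat": {"type": "literal", "value": "52.0"},
    "long": {"type": "literal", "value": "-4.0"}}
 ]}}
"""


def to_points(response_text: str) -> list[tuple[str, float, float]]:
    """Turn SPARQL JSON results into (uri, latitude, longitude) tuples."""
    bindings = json.loads(response_text)["results"]["bindings"]
    return [
        (b["thing"]["value"], float(b["lat"]["value"]), float(b["long"]["value"]))
        for b in bindings
    ]


points = to_points(SAMPLE_RESPONSE)
print(points)
```

Each tuple is enough to drop a marker on a map with whichever mapping library you like.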

This is properly linked data. This is what a government data site should be like.

I mentioned the Ordnance Survey ontology in yesterday’s rant. I like it, but it could be better. It has a solid structure for an administrative geography of the UK (not including Northern Ireland). However, the current version, version 2, is already out of date. A number of the unitary authorities were merged last year. This information is already reflected on the corresponding pages in DBpedia, along with nice tagging to link the new authorities to the data on the authorities they have replaced.

The OS version 2 ontology replaced version 1 in a fairly unhelpful way, but that was OK because they said they were still playing around with how to work with ontologies. Will the next version play well with the current one?

The DBpedia way of doing things means that not only do we end up with an up-to-date administrative structure but we also maintain a history of that structure. That history can be useful if we have to consider people and not administrations. A person might get around to acknowledging a change in the administrative make-up of his area – eventually – but it won’t happen immediately. The online structures need to be able to link them from old knowledge to new concepts. Here is the advantage in all these new ideas: nothing leads to a dead end, everything is given more meaning by its connections.

The other Tim


It’s The Data, Stupid.

Data.gov.uk

Oh, the excitement of it! Tim Berners-Lee is getting governments to listen to his cry, to set data free. Oh, the disappointment of our first look at the UK’s efforts! Where is the semantic data? Where are the ontologies to link concepts across datasets?

[For those of you not interested in the technical side of things skip over the next paragraph if you like - it's just technical ranting...]

This being a first pass, the semantic data and the ontologies may be in there, but if they are they’re well hidden. There is a SPARQL page but no indication of which values are searchable. All the data sets I looked for were available in CSV and XLS; hardly linked. Turning one of the CSV data sets into RDF using well-known namespaces took me about 30 minutes, so it shouldn’t be too hard for the site to get better, and quickly. Will it?
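As an illustration of how little is involved (this is not the conversion actually used for this post), here is a minimal Python sketch that turns CSV rows into N-Triples with the standard library. The column names, the example.org namespace, the `count` predicate and the sample rows are all invented; only rdfs:label is a real, well-known predicate.

```python
import csv
import io

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/dataset/"      # hypothetical namespace for row URIs
COUNT = "http://example.org/terms/count"  # hypothetical predicate

# Invented sample standing in for a downloaded CSV file.
SAMPLE_CSV = "id,name,count\n1,Sutton,12\n2,Kingham,7\n"


def escape(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def csv_to_ntriples(csv_text: str) -> list[str]:
    """Emit a label triple and a count triple for every CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}>"
        triples.append(f'{subject} <{RDFS_LABEL}> "{escape(row["name"])}" .')
        triples.append(f'{subject} <{COUNT}> "{escape(row["count"])}" .')
    return triples


triples = csv_to_ntriples(SAMPLE_CSV)
print("\n".join(triples))
```

The real effort is in choosing predicates that other people already use, not in the loop itself.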

OK, that bit aside, the point is that the launch of this site seems to have been a deadline-achieving exercise rather than an announcement of anything actually being ready. That being the case, somebody needs to put up their hand and say “That’s rubbish”.

It’s especially hard for me because I’m as excited about the possibilities of the semantic web as my more illustrious name-sake, but this ain’t those dreams, not even close. I was hoping to be able to complain loudly in the pub about my own useless government here in Ireland and how they weren’t doing anything to make their data available. “Look at how good the UK is”, I could have said. Oh well, another day.

So why am I so disappointed? What is this stuff anyway?
Well the semantic web can be a way to connect… well everything.

When I talk about a thing, say Sutton, I can link it to a description and eliminate any possibility that I am talking about a different Sutton. “Which Sutton?” you ask. “Sutton, Peterborough; Sutton, Craven etc…” and I answer “Exactly my point”. I can link it though:

http://www.ordnancesurvey.co.uk/ontology/AdministrativeGeography/v2.0/AdministrativeGeography.rdf#osr7000000000001643

[N.B. Don't go to the above URL unless you want to download a massive file!]

which is a link to the Ordnance Survey UK’s ontology for administrative places in the UK. If this excellent data (if a little limited, and a little out of date) is on the new data.gov site, then it’s hard to find. It would be a good way to tie geographic data sets together. An arbitrarily named field in an XLS file is, on the other hand, not a good way to link data together.

Nice idea Tim, now you need them to actually do the work.

The other Tim

© 2010 WhatClinic.com Blog