Have you even visited your own "Open Data" portals?
In Satyajit Ray's classic film "Jana Aranya", the character played by Utpal Dutt asks a banana seller whether he has ever tasted the bananas that he sells.
Better hear the voice of the great Utpal Dutt yourself, and keep the tone in your mind -
We can then ask the officials and consultants of India's various urban missions in the same tone, if they have ever visited their own "open data" portals.
A few months ago, at a conference on the Smart Cities Mission, a senior official of the mission said that he was very glad that so much data was now freely available to the public through open data platforms such as the Open Government Data (OGD) platform, the Indian Urban Data Exchange (IUDX) platform etc.
One wonders how he could say something like that at a public forum, and how none of the die-hard supporters and critics of the mission in the audience had any questions about such a statement.
Open data and my neighbour's laundry list
I have made multiple visits to both of the above platforms and downloaded various data files. Never have I found anything more useful or relevant to my work than... let's say... my neighbour's laundry list. It seems as if the professionals tasked with uploading data to these portals found every scrap of Excel spreadsheet lying around in their respective offices and dumped them all in these digital bins. Maybe they are rewarded for the sheer number of files they upload rather than for what those files contain.
It is also amusing to discover that individuals who talk passionately about these platforms have never visited them or downloaded any data from them. A large part of the problem is also an unfamiliarity with the basic standards of data storage and a lack of clarity regarding the tasks that the data should be used for.
The fact that this useless data is available in a range of file types such as CSV, JSON, ODS etc. further compounds the irony of the situation.
And of course, let's not forget that entering the website and accessing the data are not always the same thing.
Often you will encounter this at some point -
Or this -
Strangely, when I checked the portal some time back, many of the "private" buttons were "open" and coloured a welcoming green. Of course, in the name of smart traffic signal data, they often contained something as amazing as a column containing the names of certain squares (all the rest is left to the Sherlock Holmesian powers of imagination and deduction of the website visitor).
Consider the following csv (comma separated value) file available on the OGD portal -
This is all the information that this downloaded CSV file contains. The file contains no metadata (which means there is no data on the data itself), such as: when was it uploaded, who uploaded it, which period is represented in the data, what do the fields mean (does "Nos. of IHHL" mean the number of individual household toilets under construction, already constructed, or targeted?), does the data correspond to the Swachh Bharat Mission (SBM) or some other project... and how on earth does a person not dealing daily with Indian development lingo know what "IHHL" stands for in the first place?... etc. etc. etc.
The incompleteness of the data renders it even more useless. Even if we were to assume that the data shows how many toilets have been constructed, what is the use of that if it cannot be compared against the total number of toilets that were supposed to be constructed? Even if that data were available in another file on the portal, the two would not be comparable due to the lack of metadata.
The hard fact regarding any data management process is that any data that does not come with metadata is garbage data. And considering the inevitable thing that happens when garbage data is fed into any analytical or decision-making system (Garbage In, Garbage Out... aka GIGO), none of this data should be used by anyone actually trying to do something useful.
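To make the point concrete, here is a minimal sketch in Python of the kind of check a portal could run before accepting an upload. It is only an illustration: the metadata key names (`uploaded_on`, `uploaded_by`, `period_covered`, `programme`, `column_descriptions`) are my own assumptions that mirror the questions asked above, not the schema of the OGD or IUDX platforms, and the sample row is invented.

```python
import csv
import io

# Hypothetical metadata keys, chosen to mirror the questions raised above.
REQUIRED_KEYS = [
    "uploaded_on",
    "uploaded_by",
    "period_covered",
    "programme",
    "column_descriptions",
]


def missing_metadata(metadata):
    """Return the required keys that are absent or empty in the metadata."""
    missing = []
    for key in REQUIRED_KEYS:
        value = metadata.get(key)
        if isinstance(value, str):
            value = value.strip()
        if not value:
            missing.append(key)
    return missing


def undocumented_columns(csv_text, metadata):
    """Return the CSV header fields that have no description in the metadata."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    described = metadata.get("column_descriptions") or {}
    return [name for name in header if name not in described]


# Illustrative input only: "City" and the data row are invented for the example.
sample_csv = "City,Nos. of IHHL\nExampleville,100\n"
bare = {}  # a file uploaded with no metadata at all

print(missing_metadata(bare))
print(undocumented_columns(sample_csv, bare))
```

A file like the one above would fail on every count: all five keys are missing and neither column is described. Rejecting such uploads at the door would probably do more for these portals than any number of additional file formats.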
Genuine open data portals
It is normal for the "defenders" of these portals to wax apologetic when confronted with these issues, with predictable statements such as "Yes... but it also contains things which are useful... with time it will improve... it takes time to build something like this" etc. etc.
There is no need for all that. Just a quick visit to any of the following would give one a clear idea of what serious open data portals should be like.
Bhuvan portal of the Indian Space Research Organisation (ISRO) and the National Remote Sensing Centre (NRSC).
Automatic Weather Station portal of the India Meteorological Department.
USGS EarthExplorer of the United States Geological Survey.
AND not to forget the wonderful....
At the end it all boils down to this - if you have something serious to do and you know what you are talking about, your data won't have to be a pile of BS...open or not.