Tuesday, July 11, 2023

Open Data...or Open Disdain ? Cognitive dissonance and Urban India's Open (GIGO) Portals

The challenge of cognitive dissonance

It seems that the biggest challenge facing the urban development sector today is a severe case of cognitive dissonance. While at one time people lamented the mismatch between "planning" and "implementation", the problem now seems to be a far deeper malaise: people simply not registering the difference between what they are saying and what they are doing.

To be honest, I always found the "planning-is-ok-but-implementation-is-poor" thingy to be utterly ludicrous and misleading. Even a little child would know that making a time-table and following it are two different things. There was no need for professional adults to have gone on parroting this truism and diverting attention from the real technical challenges of effective plan making and execution.

But this recent cognitive dissonance is dead-on devastating in its ability to nurture a collective dumbness among urban sector officials and professionals and then reward those who manage to dumb down at exponential rates.

A prime example of this phenomenon is the recent trumpeting of the online Open Data portals by the urban sector.

Have you even visited your own "Open Data" portals ?

In Satyajit Ray's classic film "Jana Aranya", the character played by Utpal Dutt asks a banana seller if he had ever himself tasted the bananas that he sells. 

Better hear the voice of the great Utpal Dutt yourself, and keep the tone in your mind -


We can then ask the officials and consultants of India's various urban missions in the same tone, if they have ever visited their own "open data" portals. 

A few months ago, at a conference on the Smart Cities Mission, a senior official of the mission said that he was very glad that so much data was now available freely to the public through open data platforms such as the Open Government Data (OGD) platform, the India Urban Data Exchange (IUDX) platform etc.

One wonders how he could say something like that at a public forum and how none of the die-hard supporters and critics of the mission in the audience had any questions regarding such a statement.

Open data and my neighbour's laundry list

I have made multiple visits to both the above platforms and downloaded various data files. Never have I found anything more useful or relevant to my work than ... let's say ... my neighbour's laundry list. It seems as if the professionals tasked with uploading data to these portals found every scrap of Excel spreadsheet lying around in their respective offices and dumped them in these digital bins. Maybe they are rewarded for the sheer number of files that they upload rather than what those files contain.

It is also amusing to discover that individuals who talk passionately about these platforms have never visited them or downloaded any data from them. A large part of the problem is also the unfamiliarity with the basic standards of data storage and a lack of clarity regarding the tasks that the data should be used for.

The fact that this useless data is available in a range of file types such as CSV, JSON, ODS etc. further compounds the irony of the situation.

And of course, let's not forget that entering the website and accessing the data are not always the same thing. 

Often you will encounter this at some point -


Or this -


Strangely, when I had checked the portal some time back, many of the "private" buttons were "open" and coloured a welcoming green. Of course, in the name of smart traffic signal data, they often contained something as amazing as a column containing the names of certain squares (all the rest is left to the Sherlock Holmesian powers of imagination and deduction on the part of the website visitor).

Consider the following csv (comma separated value) file available on the OGD portal -


This is all the information that this downloaded csv file contains. The file contains no metadata (which means there is no data on the data itself) such as - when was it uploaded, who uploaded it, which period is represented in the data, what do the fields mean (does "Nos. of IHHL" mean number of individual household toilets under construction or already constructed or targeted ?), does the data correspond to the Swachh Bharat Mission (SBM) or some other project...and, how on earth does a person not dealing daily with Indian development lingo know what "IHHL" stands for in the first place ??...etc. etc. etc.
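
The questions above can be turned into a simple checklist. What follows is only a minimal sketch in Python, not something the OGD portal provides: the field names in REQUIRED_FIELDS, the function missing_metadata and the "City" column are all hypothetical, chosen purely to mirror the questions asked in the paragraph above. Only "Nos. of IHHL" comes from the actual file.

```python
# Hypothetical minimal set of metadata fields, mirroring the questions above:
# who uploaded it, when, which period it covers, which programme it belongs to,
# and what each column means. These names are assumptions, not a portal standard.
REQUIRED_FIELDS = [
    "title",
    "publisher",
    "uploaded_on",
    "period_covered",
    "source_programme",
    "field_definitions",
]


def missing_metadata(metadata, csv_columns):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not metadata.get(field):
            problems.append(f"missing metadata field: {field}")
    # Every column in the data file must have a plain-language definition.
    definitions = metadata.get("field_definitions") or {}
    for column in csv_columns:
        if column not in definitions:
            problems.append(f"undefined column: {column}")
    return problems


# A file shipped with no metadata at all fails every single check.
for problem in missing_metadata({}, ["City", "Nos. of IHHL"]):
    print(problem)
```

A file would pass only if someone had bothered to write down, among other things, what "Nos. of IHHL" actually counts.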

The incompleteness of the data further renders it useless. Even if we were to assume that the data shows how many toilets have been constructed, what is the use of that if not compared against the total toilets that were supposed to be constructed ? Even if that data were available in another file on the portal, they would not be comparable due to the lack of metadata.
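
To see why the lack of metadata blocks even this basic comparison, here is a rough Python sketch. The record layout (programme, period, area, count), the function completion_rate and all the numbers are invented for illustration; none of them come from the portal.

```python
# Keys that must match before two figures can be compared (assumed, illustrative).
COMPARABILITY_KEYS = ("programme", "period", "area")


def completion_rate(constructed, target):
    """Return constructed/target as a fraction, or None if not comparable."""
    for key in COMPARABILITY_KEYS:
        value = constructed.get(key)
        # Unknown or mismatching context means the ratio would be meaningless.
        if value is None or value != target.get(key):
            return None
    if not target.get("count"):
        return None
    return constructed["count"] / target["count"]


# Made-up figures: both records state programme, period and area.
with_metadata = completion_rate(
    {"programme": "SBM", "period": "2022-23", "area": "City X", "count": 750},
    {"programme": "SBM", "period": "2022-23", "area": "City X", "count": 1000},
)

# The same numbers stripped of their context, as in a file without metadata.
without_metadata = completion_rate({"count": 750}, {"count": 1000})

print(with_metadata, without_metadata)
```

The arithmetic is trivial. What makes the ratio meaningful is knowing that both numbers refer to the same programme, period and place, and that is precisely what a file without metadata cannot tell you.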

The hard fact regarding any data management process is that any data that does not contain metadata is garbage data. And considering the inevitable thing that happens when garbage data is fed into any analytical or decision-making system (Garbage In - Garbage Out, aka GIGO), none of this data should be used by anyone actually trying to do something useful.

Genuine open data portals

It is normal for the "defenders" of these portals to wax apologetic when confronted with these issues with predictable statements such as - "Yes...but it also contains things which are useful...with time it will improve...it takes time to build something like this" etc etc. 

There is no need for all that. Just a quick visit to any of the following would give one a clear idea of what serious open data portals should be like.

Bhuvan portal of the Indian Space Research Organisation (ISRO)+National Remote Sensing Centre (NRSC)

Automatic Weather Station portal of the India Meteorological Department.

USGS EarthExplorer of the United States Geological Survey.

AND not to forget the wonderful....

Census of India.

At the end it all boils down to this - if you have something serious to do and you know what you are talking about, your data won't have to be a pile of BS...open or not.
