Enigma Labs

What is Public Data?


The Federal Communications Commission's net neutrality repeal has resulted in heated debate on the future of a free and open Internet. While this is a pertinent conversation, it often fails to address broader questions about the exact nature of open information:

What does it mean for information to be open?

What is open data?

How does open data differ from public data?

The answers become more nuanced when we consider factors such as access, redistribution, maintenance and structure.

According to the Open Knowledge Foundation definition, “Open data and content can be freely used, modified, and shared by anyone and for any purpose.” While this provides some helpful insight, it does little to hold open data to a technical standard. For that, we turn to the inventor of the World Wide Web, Tim Berners-Lee, who developed a 5-star scale for the quality of open data. His scale is as follows:

  1. Make data available online and under an open license
  2. Make it available in a structured format (e.g., Excel)
  3. Make it available in an open structured format (e.g., CSV)
  4. Use URIs for denotation
  5. Link data to other data to offer context
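To make the scale concrete, here is a minimal sketch of how a dataset might be scored against it. It assumes the stars are cumulative (each star requires all the ones below it), and the `Dataset` class, its field names, and the `star_rating` function are hypothetical names chosen for illustration, not part of any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (field names are illustrative)."""
    online_with_open_license: bool  # star 1
    structured: bool                # star 2 (e.g., Excel)
    open_format: bool               # star 3 (e.g., CSV)
    uses_uris: bool                 # star 4
    linked_to_other_data: bool      # star 5


def star_rating(ds: Dataset) -> int:
    """Return 0-5 stars, assuming each star requires every star below it."""
    criteria = [
        ds.online_with_open_license,
        ds.structured,
        ds.open_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break  # stop at the first unmet criterion
        stars += 1
    return stars


# Illustrative examples: a scanned PDF, an Excel file, and a CSV file,
# all published online under an open license.
scanned_pdf = Dataset(True, False, False, False, False)
excel_file = Dataset(True, True, False, False, False)
csv_file = Dataset(True, True, True, False, False)

print(star_rating(scanned_pdf), star_rating(excel_file), star_rating(csv_file))
# prints: 1 2 3
```

Under this reading, a dataset with no open license scores zero stars no matter how well structured it is.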

The Open Data Institute adds further color by providing an open data certificate to verify a data publisher uses best practices to uphold data dependability. These practices include timely data updates, the presence of a data maintainer who provides metadata on changes, and the availability of historical data.

Today, there is an implied standard for open data: it is often structured, machine-readable, openly licensed and well maintained. Additionally, open data is free. The same does not necessarily hold true for public data.

Public data can be defined as all information in the public domain, encompassing anything from a dataset updated monthly on a government data portal to PDF files that are only accessible via Freedom of Information requests (and everything in between).

Figure: A larger circle representing public data contains a smaller circle representing open data. According to the Open Data Barometer's Global Report 2017, only 7% of key datasets across 115 countries were considered open, so the open data circle is drawn at 7% of the size of the public data circle.

Open data is, by definition, easy to access. Public data, on the other hand, can be trickier to obtain, sometimes requiring a Freedom of Information Act (FOIA) request. For those unfamiliar, submitting a FOIA request to a government agency can be a real test of patience: a response can take months and sometimes costs a fair amount of money.

Datasets that do not require a FOIA request but must be purchased from government agencies may also be public data. However, they are certainly not open, as they are not free. In one case, an open data activist in Virginia purchased the state’s corporate registration data for two years before turning around and publishing it for free. His pressure to make this information more widely available eventually led the state to publish the data for free for everyone.

Enigma Public continually files FOIA requests for politically relevant or otherwise interesting datasets. We offer all our datasets in a machine-readable format (downloadable as a CSV or accessible via our API), even when the data at the source is anything but.

The conversation about what data transparency means and its pertinence to public knowledge extends beyond the Enigma offices. As the Inter-Parliamentary Union prepares its 2018 World e-Parliament Report, we look forward to seeing legislation change as governments strive to raise the standard of their public data.
