Big Data – open data records of New York City

We asked ourselves which practical example would best illustrate the collection, processing and use of Big Data.

Of course, Facebook, IBM and Google provide widely quoted and well-examined use cases. Nevertheless, we have chosen another global player which, in our opinion, offers an even more vivid example because it reflects "real life" more closely than pure online business does:

New York City

What Big Data can mean for companies and businesses will presumably only become apparent over the coming months and years. Nevertheless, we think that the City of New York offers some perspective on:
  • How insights can be gained from Big Data and put to use
  • How value-added products can result from Big Data
  • What purpose the release of data sets can serve

New York City has been releasing data collections from its agencies and other so-called urban institutions for several years. In March 2012, Mayor Bloomberg approved the introduction of law no. 29-A, which is initially to standardize the collection of city data by 2018 and also governs its further handling within the framework of the US-wide Data.gov program, which is intended to become a worldwide platform for open data of all kinds. The objective of this initiative is to make the work of governments of all kinds and sizes more transparent, to encourage citizen participation, and to provide developers with raw data on which new applications and insights can be built.

What can come of it?

So far this all sounds very abstract. We will therefore start with a few examples that grew out of the released data as part of the programming competition "BIG APPS", in which developers combined various data sets and turned them into real end-user applications of public benefit:

  • Embark NYC - http://2011.nycbigapps.com/submissions/5738-embark-nyc - Have you ever taken the subway in New York? It is not only tourists who are sometimes overwhelmed by it. Embark NYC uses current subway data to send delay notifications, and it provides directions through subway stations and timetables for configurable routes, even without a mobile signal. Within a very short time, more than 100,000 New Yorkers were using the app.
  • Clean Streets of New York - http://2011.nycbigapps.com/submissions/5886-the-clean-streets-of-new-york - This app is aimed especially at environmentally conscious citizens. It determines the user's location and shows all green spaces, public gardens and parks in the vicinity. To help keep these green spaces clean, it can also show recycling points and garbage collection schedules.
  • My NYC Running Tracks - http://2011.nycbigapps.com/submissions/5871-my-nyc-running-tracks - This is an app for the city's athletic residents: users can put together particularly suitable running routes in their neighborhood.
  • Sage: Pre-K and Elementary Schools Search - http://2011.nycbigapps.com/submissions/5837-sage-pre-k-and-elementary-schools-search - Choosing the right school is a life-shaping decision. In addition to hearsay and recommendations, we would also like to rely on data and facts. http://nysage.com/ offers a map of the schools around the user's location and an evaluation of their average results and grade trends.
  • NYC Smoke Out - http://2011.nycbigapps.com/submissions/5587-nycsmokeout - New York is a non-smoking city with numerous smoke-free areas. Ignoring them can cost up to $50. The app uses geo-data to display these zones on a map and shows users where lighting up might become expensive.

What is required?

First of all, one obviously needs raw data. As mentioned, this is provided by the urban institutions (for example the Office of Consumer Protection) in the form of more than 850 data tables. But can this data be accessed in a meaningful way as it stands? It has to be described, combined and visualized sensibly before connections that might be useful for further processing become recognizable. This is where the winner of the 2012 BIG APPS competition comes into play:

NYC Facets has developed an online tool that lets even non-specialist citizens relate the various data sets to one another, filter them and display them as diagrams. Confusing tables thus acquire meaning; they can be described with metadata or linked to internal company data via APIs.

For example, the OpenDataSet contains lists of all electronics stores (http://nyc.pediacities.com/facets/index.php/Electronic_store_competitors_%28r9jk-nd53%29#tab=Columns) and cafes, including their floor space (http://nyc.pediacities.com/facets/index.php/Map_View_%28b8ak-guj9%29). In combination with the sales data for owner-occupied apartments and residential houses in the corresponding area (http://nyc.pediacities.com/facets/index.php/DOF_Summary_of_Neighborhood_Sales_for_Manhattan_for_Class_1-_2-_and_3-Family_homes_-_2009_%285yay-3jd5%29), a company can evaluate locations and then decide on new openings or on its range of services.
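
To make this kind of location evaluation a little more tangible, here is a minimal sketch in Python. It is only an illustration under our own assumptions: the sample rows, the field names (neighborhood, number_of_sales) and the scoring rule are hypothetical and are not taken from the actual data sets, whose columns would have to be looked up first. The idea is simply to join the store list and the home sales summary on the neighborhood and to rank neighborhoods by home sales per existing competitor.

```python
from collections import Counter

# Hypothetical sample rows; the real data sets use their own column names.
stores = [
    {"name": "Store A", "neighborhood": "Chelsea"},
    {"name": "Store B", "neighborhood": "Chelsea"},
    {"name": "Store C", "neighborhood": "Harlem"},
]
home_sales = [
    {"neighborhood": "Chelsea", "number_of_sales": 40},
    {"neighborhood": "Harlem", "number_of_sales": 90},
    {"neighborhood": "Inwood", "number_of_sales": 30},
]


def rank_locations(stores, home_sales):
    """Join both tables on the neighborhood and rank by sales per competitor."""
    # Count how many competing stores already exist in each neighborhood.
    competitors = Counter(row["neighborhood"] for row in stores)

    ranked = []
    for row in home_sales:
        count = competitors.get(row["neighborhood"], 0)
        # Assumed scoring rule: many home sales and few competitors is good.
        # Adding 1 avoids a division by zero where no competitor exists yet.
        score = row["number_of_sales"] / (count + 1)
        ranked.append(
            {
                "neighborhood": row["neighborhood"],
                "competitors": count,
                "sales_per_competitor": score,
            }
        )

    # Most promising neighborhood first.
    ranked.sort(key=lambda r: r["sales_per_competitor"], reverse=True)
    return ranked


ranking = rank_locations(stores, home_sales)
for entry in ranking:
    print(entry["neighborhood"], entry["competitors"], entry["sales_per_competitor"])
```

In practice, the score would of course include further factors such as floor space or a company's own sales figures; the sketch only shows the basic pattern of relating two public tables through a shared key.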

What potential do these data have for a company?

The example of New York City shows what exciting solutions can arise from the mere release of Big Data, solutions that nobody had come close to imagining beforehand. Most of the resulting apps tend to serve a social purpose. Nevertheless, it is already evident what tremendous momentum is created when open data from various sources is released for combined use, giving rise to new value-added products and individual solutions.

For companies, using and participating in similar programs can mean:
  • Making better decisions thanks to broader insights
  • Recognizing new markets
  • Developing innovative products
  • Improving services
  • Defining target groups more precisely based on observed user behavior

If publicly accessible data is combined with internal company data, entirely new insights can result, along with a basis for decisions about new business areas or customer groups.

If we go one step further and imagine companies releasing their own data (anonymized and competitively safe, of course), the potential becomes even greater. Today we can only guess at the information treasures still slumbering in secret.