OpenGov Hack Night: Upcoming Events and Sustainable Data

There was no presentation at this week’s OpenGov Hack Night, but that doesn’t mean there isn’t anything going on.

Here are a few events that are coming up!

  • National Day of Civic Hacking: Save the date! On June 1st and 2nd, Chicago will be joining civic hackers across the country to Hack for Change! We’ll have three events.
    • Immigration Hackathon at Cibola
    • Youth Hackathon at Adler Planetarium
    • Hack for Chicago at 1871

    More information about these events will be released soon.

  • OpenStreetMap Hack Weekend: If you know your way around a compiler, feel comfortable with JSON and XML, or know the difference between an ellipsoid and a geoid, then the Hack Weekend is for you. We’re looking for those with technical know-how to help make a difference in OpenStreetMap’s core software by writing patches and new software to help make mapping faster and easier.
  • Safe Communities Hackathon at Google: The City of Chicago is partnering with Google to host a hackathon centered around community safety on May 11th.
  • Work for Chicago’s Department of Innovation and Technology – Make the awesome happen: The City of Chicago is hiring a new Managing Deputy Chief Information Officer to help run the city’s enterprise applications. The City of Chicago’s efforts in releasing data and leveraging technology have been the keystone to the entire civic innovation effort in Chicago. If you’ve got the chops, drop what you’re doing and apply now.

Datasets of the week: Energy Usage and alternative fuel locations

In honor of Earth Day, the City of Chicago released two new data sets.

The first is a new data set that lets users see energy usage throughout the city. This data set uses data aggregated by ComEd and Peoples Gas to display energy use by census tract pairs. (For privacy reasons, the City doesn’t want to release data that can point out energy use for just one building. Grouping the data by census tract pairs protects privacy while still giving great information on the city’s energy usage.)

This data set also comes with an API. As with all new APIs released by the City, this API is well documented: it tells developers what all the fields are, explains what the error messages mean, and gives samples of code that use the API.
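For developers who want a feel for what calling an API like this looks like, here is a minimal sketch that only builds a query URL and does not send it. It assumes the portal lives at data.cityofchicago.org and follows Socrata's usual `$where` and `$limit` query parameters. The dataset identifier `xxxx-xxxx` and the field name `total_kwh` are placeholders made up for illustration, so check the City's documentation for the real ones.

```python
from urllib.parse import urlencode

# Assumption: the city's portal address. It is not given in this post.
PORTAL = "https://data.cityofchicago.org"
# Placeholder only. This is NOT a real dataset identifier.
DATASET_ID = "xxxx-xxxx"


def build_query_url(dataset_id, where=None, limit=10):
    """Build a Socrata-style query URL without sending a request."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    return f"{PORTAL}/resource/{dataset_id}.json?{urlencode(params)}"


# "total_kwh" is a hypothetical field name used for illustration.
url = build_query_url(DATASET_ID, where="total_kwh > 100000", limit=5)
print(url)
```

Pasting a URL like this into a browser, with a real dataset identifier and field name, is usually the quickest way to explore a data set before writing any app code.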

The other data set that was released is alternative fuel locations. This data set will be particularly important to companies that want to make electric cars more viable in the city.

Come join us next week at OpenGov Hack Night! Every Tuesday at 6:00pm inside 1871.

Behind the Scenes: Foodborne Chicago

Earlier this month the Smart Chicago Collaborative, in partnership with local developers Cory Nissen, Joe Olson, and Scott Robbin, and the Chicago Department of Public Health (CDPH), launched Foodborne Chicago, an innovative application that trawls Twitter for mentions of food poisoning in Chicago, enabling a team of administrators to connect with affected people and encourage them to report details of their food poisoning to the CDPH.

The Foodborne Chicago application is a collection of different services that make up a complex workflow. This post explains the overall architecture of the application and the direction that development is headed.

Backend analysis

Foodborne searches Twitter for all tweets near Chicago containing the string “food poisoning”. The ingestion service consumes thousands of tweets, storing them in a large MongoDB instance. A collection of classification servers, running R, churn through the collected tweets, applying a series of filters. The tweets are classified using a model that was trained via supervised learning, which determines if the tweets are related to a food poisoning illness or not. The Twitter crawler, classification machines, and MongoDB instance are all virtual EC2 instances running on the Smart Chicago Collaborative Amazon Web Services account.

Here is a sample of actual tweets and the determination of the classifier:

food poisoning tweets:

  • Knocked down by food poisoning for the second day. Not a good way to start the week 🙁
  • Stomach flu/food poisoning is like eating gas station sushi without the joys of eating gas station sushi
  • I think I ate my food too quick, either that or I sense food poisoning
  • Food poisoning at the first chapter meeting. Awesome..
  • My stomach keeps making the weirdest noise. Possibly food poisoning from Golden Nugget!

not food poisoning tweets:

  • I read that over six million people will get food poisoning this year with 100,000 requiring hospitalization. This is entirely preventable.
  • It’s really hard to snack while watching Honey Boo Boo. It’s the second best diet to food poisoning.
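Notice that every tweet in both lists contains the phrase "food poisoning", which is why a simple text search is not enough and a trained classifier is needed. As a rough sketch of just the search stage (not the classifier, which runs in R and is not shown in this post), a keyword pre-filter could look like the following. The function name is hypothetical, and the last sample tweet is made up to show a non-match.

```python
import re

# Case-insensitive match for the search phrase, tolerating extra whitespace.
PHRASE = re.compile(r"food\s+poisoning", re.IGNORECASE)


def mentions_food_poisoning(text):
    """Return True when the tweet text contains the search phrase."""
    return PHRASE.search(text) is not None


tweets = [
    "Knocked down by food poisoning for the second day.",
    "Food poisoning at the first chapter meeting. Awesome..",
    "It's the second best diet to food poisoning.",
    "Great lunch downtown today!",  # made-up example that should not match
]

# Candidates still include jokes and news items; the classifier sorts those out.
candidates = [t for t in tweets if mentions_food_poisoning(t)]
print(len(candidates), "of", len(tweets), "tweets mention the phrase")
```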

The Foodborne web application, a standard Ruby on Rails application, runs on Heroku, and has a scheduled job that loads classified tweets from the MongoDB instance every few minutes. This administrative interface shows the admin team, a partnership between Smart Chicago and the CDPH, a list of previously classified food poisoning tweets. For each tweet, the application shows if the tweet has been replied to, and if not, a simple mechanism for sending an @-reply to the tweet. The reply can use one of a standard set of replies, or a custom message, depending on context.

Public interaction

When users respond to the Twitter @-reply, they fill out a simple food poisoning report form on Foodborne. This form is submitted to the City of Chicago via its Open311 interface. This submission is equivalent to the person calling Chicago 311 to report their food poisoning. The 311 software routes the submission to the Chicago Department of Public Health, where investigators review the submission and take action, including conducting inspections, based on the report.
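To give a sense of what an Open311 submission contains, here is a small sketch that assembles the form fields for a service request without sending anything. The field names (`api_key`, `service_code`, `description`, `lat`, `long`, `email`) come from the Open311 GeoReport v2 specification. The service code, API key, and coordinates below are placeholders, and this is not necessarily how Foodborne builds its requests.

```python
from urllib.parse import urlencode


def build_service_request(api_key, service_code, description, lat, lon, email=None):
    """Return a form-encoded body for an Open311 GeoReport v2 service request."""
    fields = {
        "api_key": api_key,
        "service_code": service_code,
        "description": description,
        "lat": lat,
        "long": lon,  # GeoReport v2 names the longitude field "long"
    }
    if email:
        fields["email"] = email
    return urlencode(fields)


body = build_service_request(
    api_key="YOUR-API-KEY",                  # placeholder
    service_code="FOOD-POISONING-EXAMPLE",   # hypothetical service code
    description="Got sick after eating out on Friday night.",
    lat=41.8781,
    lon=-87.6298,
)
print(body)
```

In a real integration this body would be sent as a POST to the city's Open311 requests endpoint, which responds with an identifier that can be used to track the request.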

Development roadmap

Foodborne has a number of exciting development goals ahead. The backend infrastructure, while adequate, can be optimized and made far more efficient. Joe and Cory are exploring how to use EC2 spot instances and queuing tools to perform the classification work when computing resources are less expensive. The administrative interface will be extended to show more information about suspected food poisoning tweets, including if a person has submitted a request to 311. Scott and Cory are also working on building a feedback loop to the classifier; eventually administrators will be able to flag tweets that are incorrectly classified as relating to food poisoning illness and the classifier model will then learn to ignore similar tweets in the future.

Foodborne is an exciting addition to the collection of applications hosted by the Smart Chicago Collaborative. We’re proud of the work the entire Foodborne team has done, and look forward to supporting future development. If you’re a developer working with open data in Chicago, you may qualify for free hosting, too!

Transit Night at the OpenGov Hack Night



At this week’s OpenGov Hack Night, we had presentations from Ed Zotti and Joe Iacobucci about transit and data in Chicago.

Joe Iacobucci is a Chicago transit enthusiast and gave a presentation on the link between public transit and economic development.

As an example, Iacobucci used a site called Mapnificent to show a location’s access to transit by calculating how far a traveller could go using public transit. Mapnificent uses transit data to calculate all the areas that can be reached by public transit in a given period of time.

For example, a rider who starts in the Loop could go as far as Wilmette, Maywood, Homewood, or Calumet City in just 30 minutes. (Even without the Metra, a rider could still reach Evanston, Jefferson Park, or Englewood.) In comparison, a rider who starts in Cicero would only be able to reach as far as Greektown or Melrose Park in 30 minutes.
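The idea behind this kind of reachability map can be sketched as a shortest-path search with a time budget. The example below is not Mapnificent's actual code. It is a minimal illustration using Dijkstra's algorithm on a toy network, where the stop names and travel times are made up and do not represent real CTA or Metra schedules.

```python
import heapq


def reachable_within(graph, start, budget_minutes):
    """Return {stop: fastest_minutes} for every stop reachable within the budget."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        elapsed, stop = heapq.heappop(queue)
        if elapsed > best.get(stop, float("inf")):
            continue  # stale queue entry; a faster route was already found
        for neighbor, minutes in graph.get(stop, []):
            total = elapsed + minutes
            if total <= budget_minutes and total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor))
    return best


# Toy network with made-up travel times in minutes between stops.
toy_network = {
    "A": [("B", 10), ("C", 25)],
    "B": [("C", 10), ("D", 25)],
    "C": [("D", 15)],
    "D": [],
}

result = reachable_within(toy_network, "A", 30)
print(result)
```

Here stop D is left out because every route to it takes 35 minutes. A real tool would also have to account for waiting times, transfers, and walking.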

When you compare this information to real estate prices on sites like Zillow, you can see that the areas with greater transit options tend to have higher home values.

We also had a presentation from Ed Zotti, editor of The Straight Dope and assistant to the legendary Cecil Adams. Zotti ran through a recent history of Chicago transit in terms of ridership capacity. Zotti has written extensively about transit in Chicago, including a Chicago Reader feature on How to fix the El.

Resources for developers interested in transit issues.

Chicago has 43 sets of transit data available to developers on the city’s data portal. This data includes everything from transit routes and bus ridership to annual boarding totals going all the way back to 1988. The CTA also has three APIs for developers to use in their own apps: bus tracker, train tracker, and a feed of all customer alerts.


The CTA even has instructions on how to build your own CTA Transit Info Display.

The next OpenGov Chicago Meetup:

The next OpenGov Chicago meetup will be this Thursday at the Chicago Community Trust. We’ve invited three speakers to give thoughtful critiques of the open government movement in Chicago. These speakers include:

  • Ramsin Canon, political editor at GapersBlock.com, will provide perspective on the extent to which the movement benefits local communities.

  • Terry Pastika, Executive Director of the Citizen Advocacy Center, will give a view of the current state of democracy in Illinois, with a focus on the Western and Far Western suburbs of Chicago.

  • Mike Stringer, Managing Partner at Datascope Analytics and organizer of the Data Science Chicago meetup group, will talk about whether we’re asking the questions most worth answering.

If you aren’t able to attend in person, the meeting will be live streamed on the Smart Chicago Collaborative blog.

Schoolcuts.org: Open Data and Civil Discourse


Last week, Chicago Public Schools announced that it was closing 61 schools due to budget constraints. Even before the list was announced, the plan to shut down schools was and still is generating lots of heated debate.

CPS has released data on each school, but it isn’t always organized in a way that makes it easy for parents to see what is going on at the school. To find information on school utilization, you would first visit a separate 19-page PDF file to see how CPS determines utilization. You then have to download an Excel file and search through it to find the school you are interested in. This is a particularly thorny problem for parents and community members who care deeply about their schools as community anchors.

Schoolcuts.org Screenshot

What schoolcuts.org does is pull out all the available data on every school that is either closing or receiving students and put it in one place that’s easy for parents and community members to see.

https://soundcloud.com/morningshiftwbez/130322-morning-shift-seg-c

Listen to Schoolcuts.org’s Jeanee Olson talk about the site on WBEZ Morning Shift

Getting the data out there to the community in a format that’s easy to understand is extremely valuable. Not only is it important for parents to know what kind of schools their children are being sent to, but having the data readily available makes for a better debate about school closings for all those involved.

One of the points of contention is that Chicago Public Schools has stated that children would only be moved to higher performing schools. CPS places schools into three tiers with regard to academic performance, with the first tier being the best performing. However, there are several receiving schools that are Tier 3, meaning they are among the worst academically performing schools in the district. Because this data is open, and is being presented in plain language, community members can use this data to advocate for their schools.

Open data can and does aid in civil discourse.

Another point of contention is the role of charter schools and how they affect the neighborhood schools. One side states that charter schools do a better job of teaching our children, while the other side states that opening additional charter schools robs resources from struggling neighborhood schools.

The Chicago Tribune wrote an editorial supporting charter schools, stating that there were 19,000 students on waiting lists for charter schools in Chicago. This number was then disputed by WBEZ.

https://twitter.com/WBEZeducation/status/316361105410764800

WBEZ’s point was that the figure was generated by adding together the waiting list of each school, so students who had applied to multiple charter schools were counted more than once.
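The difference between adding up lists and counting students is easy to see in a few lines of code. This is a toy illustration only. The school names and student IDs below are invented and have nothing to do with the actual waiting lists.

```python
# Invented waiting lists: each set holds the IDs of students on one school's list.
waiting_lists = {
    "Charter A": {"s1", "s2", "s3"},
    "Charter B": {"s2", "s3", "s4"},
    "Charter C": {"s3", "s5"},
}

# Adding up list lengths counts a student once per school they applied to.
summed_entries = sum(len(students) for students in waiting_lists.values())

# Taking the union of the sets counts each student exactly once.
unique_students = len(set().union(*waiting_lists.values()))

print(f"{summed_entries} waiting list entries, {unique_students} unique students")
```

In this made-up case, eight waiting list entries belong to only five students, which is the kind of gap WBEZ was pointing out.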

Instead of just rhetoric, we’re now seeing debates in the public domain about the data. And that’s a good thing. This isn’t the only example of this being done. WBEZ’s Day by Datum blog recently provided a detailed explanation of the recent data spat between the Chicago Sun-Times and the CTA over crime data.

Sometimes the best civic apps are not the ones that give us the answers, but the ones that bring up the hard questions – David Eads

As we talk about open data and the ability of civic apps to solve problems and help us answer questions about civic life, it’s important to realize the potential that open data has to improve civic discourse. Schoolcuts.org has helped to steer the course of the debate back to the data and that’s a powerful thing.

Data Potluck: 7 Million Rows of Data

There were a lot of people at this week’s data potluck

Data Potluck is a monthly event held on the last Tuesday of every month at 6:00pm inside 1871. Like the OpenGov Hack Nights, these events focus on how open data and civic apps can help improve the citizen experience. However, these events have more of a non-profit focus. Data Potluck was inspired by last year’s DataKind Data Drive, which helped gather data for the Chicago area Red Cross. In order to keep the effort moving forward, Young-Jin Kim, Matt Gee and Nicholas Mader started the DataPotluck Meetup group.

DataPotluck’s other advantage? People bring food.

Rayid Ghani, Chief Scientist for Obama for America

At this month’s Data Potluck we had two presentations. The first was from Rayid Ghani, former Chief Scientist for the Obama for America 2012 Campaign. Rayid explained how the Obama for America campaign used the power of predictive analysis and social media to help win the election.

Rayid announced that the same model that made the Obama team so effective at their outreach efforts would be made available to non-profits.

Historical Traffic Congestion Data

The second presentation was by the City of Chicago’s Chief Analytics Officer, who announced the release of a seven million row dataset: Chicago has just released data on traffic congestion by segment.


To get an idea of just how big this data set is: a traffic segment is about half a mile long, and the city keeps real-time traffic data for 300 miles of road. The city refreshes the database that lives on the portal every ten minutes.
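Those numbers add up quickly. Here is a back-of-the-envelope calculation using only the figures above. It assumes one row is stored per segment per refresh, which the announcement did not spell out, so treat the result as a rough estimate rather than a description of how the data set is actually built.

```python
ROAD_MILES = 300        # miles of road with real-time traffic data
SEGMENT_MILES = 0.5     # approximate length of one traffic segment
REFRESH_MINUTES = 10    # how often the portal data is refreshed
TOTAL_ROWS = 7_000_000  # size of the released data set

segments = int(ROAD_MILES / SEGMENT_MILES)
refreshes_per_day = (24 * 60) // REFRESH_MINUTES

# Assumption: every refresh writes one row for every segment.
rows_per_day = segments * refreshes_per_day
days_to_reach_total = TOTAL_ROWS / rows_per_day

print(f"{segments} segments x {refreshes_per_day} refreshes = {rows_per_day} rows per day")
print(f"about {days_to_reach_total:.0f} days to reach {TOTAL_ROWS:,} rows")
```

Under that assumption, seven million rows would accumulate in under three months.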

The city first released the real-time data in December, but civic developers wanted to take a look at historical data.

So, the city worked with Socrata to enable the city’s data portal to handle such a massive volume of data. Now, civic developers can dig into all of Chicago’s traffic data.

To help developers dig into the data, the city has created a very well documented API.

This documentation includes code samples in multiple languages showing how to access the data, as well as definitions of all the different fields in the data set and the possible errors you could get.

Now that this data has been released, we’re excited to see what cool, useful, and interesting things people will do with it.

If you want to work with civic data:

For people who are interested in working with civic data, there are two opportunities that they should look into.

The first is the Chicago Data Science Fellowship. The University of Chicago and Argonne National Laboratory are recruiting people with statistics, programming, and data skills to work with real world data to make an impact on social issues.

The second is at the City of Chicago, which is hiring a new data scientist to join its team and help ensure that Chicago has the very best civic data team in the world. If you are interested, you should apply on the city’s website.

Improving Adopt-a-sidewalk

TL;DR: Adopt-a-sidewalk is a flawed, under-utilized application with enormous potential. By refocusing the user experience on addressing actual needs of people in Chicago and showing meaningful activity, it could be a powerful tool for engaging citizens in supporting and improving the civic infrastructure in their community.

Winter is officially in Chicago’s rearview mirror, although you would not notice from the chilly temperatures outside. This post is a reflection on one of Chicago’s winter-weather civic applications, Adopt-a-sidewalk, an application I helped bring online over a year ago, and how it can evolve to improve the lives of Chicago residents year-round.


Adopt-a-sidewalk is a Chicago-based version of the Adopt-a-hydrant web application built by Code for America in Boston back in 2011. Developed by Code for America fellow Erik Michaels-Ober, Adopt-a-hydrant lets residents of Boston volunteer to clear fire hydrants when there is a snow storm.

In the fall of 2011, City of Chicago officials, acutely aware of the severity and importance of swift snow removal, saw an opportunity to repurpose the code, and invited a group of civic developers to customize the application for use in Chicago. The key functional difference between the applications is that in Chicago, residents can request help clearing their sidewalk. Adopt-a-sidewalk first went live as part of ChicagoShovels.org in January 2012, and generated a bit of fanfare in local and national media:

  • New York Times: Snow Site Lets Chicago See if Plows Are Really in a Rut
  • ABC7 News: Mayor’s office launches ChicagoShovels.org
  • Chicagoist: City’s Adopt-A-Sidewalk Website Launches

Adopt-a-sidewalk saw moderate adoption, but quickly fell out of use due to a very mild winter, and the fast arrival of spring a few months later. In the fall of 2012, the City of Chicago asked the Smart Chicago Collaborative to assume the responsibility of hosting the application, and development responsibilities were handed over to the Code for America Chicago brigade.

To date, Adopt-a-sidewalk has seen very little adoption in Chicago. There are 557,793 individual sidewalk segments available for adoption, but only 75 registered users. 153 sidewalks have been claimed, either by volunteer shovelers, or people asking for help. That means that only 0.027% of all sidewalk segments in Chicago have been adopted. At its busiest, only 200 people visited the site in a given day.

There are three major issues that impact the usability and adoption of Adopt-a-sidewalk.

First, plainly speaking, the application is boring. In the case of a snow storm, there is a sense of urgency to responding and cleaning up the mess. The City deploys a fleet of snowplows to clear the streets, and neighborhoods are abuzz with residents scraping cars, shoveling steps, and snow-blowing their sidewalks and alleys. On Adopt-a-sidewalk, there is absolutely no perception of activity, urgency, or community. There is no mechanism to show users where activity is happening, or if there is a need for activity. On their first visit to the site, users are presented with a featureless, generic Google map of the city of Chicago, and no clear call to action. If the user does decide to register and adopt a sidewalk, there is little incentive to return or to refer friends to the site.

Second, the path to participating is laden with friction. Users must search using a real Chicago street address and register for an account before they may participate. Registering an account involves giving a name, email address, a password, and completing a captcha. There’s no mechanism to invite your neighbors to join you in shoveling, nor is there a mechanism to share your activity with your social network.

Third, the application is useless when there is no snow on the ground. Adopt-a-sidewalk is irrelevant in the summertime and for most of the winter, which is spent between snow storms. There is no incentive to return to the site, and there is no meaningful action to take in between snow storms.

On a conceptual level, the premise of Adopt-a-sidewalk is flawed. Chicago residents are already expected, and by ordinance required, to shovel their sidewalks. Adopt-a-sidewalk provides no benefit to users who adopt the sidewalk in front of their house and dutifully shovel it each time snow falls. The steps to register and adopt their sidewalk are busywork.

The real work

Instead of asking users to do monotonous work, Adopt-a-sidewalk should focus on providing a real service: matching people in need of help with people willing to help. In that scenario, there are two key classes of users: people who cannot clear their sidewalks and people who are willing to help shovel sidewalks near them.

By shifting the interaction model from navigating a half million rectangles on a map to a focused, needs-based one, many of the core usability issues can be alleviated. It’s far easier to show activity, in the form of the most recent or most urgent requests for help, and the reward for participating is much more immediate and meaningful. Instead of highlighting what’s expected of people, the focus can be on enabling and rewarding people who want to help their neighbors.
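As a rough sketch of what a needs-based model could look like in code, the example below matches a request for help with the closest volunteer within a short walking distance. Nothing here comes from the Adopt-a-sidewalk codebase. The function names, the half-mile cutoff, and the coordinates are all assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def nearest_volunteer(request, volunteers, max_miles=0.5):
    """Return the name of the closest volunteer within max_miles, or None."""
    best_name, best_distance = None, max_miles
    for name, (lat, lon) in volunteers.items():
        distance = haversine_miles(request[0], request[1], lat, lon)
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name


# Made-up coordinates: one request for help and two volunteers.
request = (41.8800, -87.6300)
volunteers = {
    "near": (41.8810, -87.6300),  # a block or so away
    "far": (41.9800, -87.6300),   # several miles north
}

match = nearest_volunteer(request, volunteers)
print(match)
```

A real service would also need to sort requests by urgency and let volunteers accept or decline, but the core idea is this simple: start from the people who need help, not from the map.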

The natural extension of this concept is to move beyond simple sidewalks and instead enable neighborhood adoption of any civic infrastructure. Adopting sidewalks could easily give way in the spring and summer to adopting parks and community gardens. In the fall, communities could band together to adopt a local school and fix it up before students return. A baseball team can adopt its ball field and organize events to maintain and improve it.

Fostering community around shared civic infrastructure is not a new concept. However, using technology, it is possible to integrate the real-world thing with an online community, and with the vast network of people and data that exists there. With the rise of open government data, civic infrastructure is not only a physical object or place; it is also a continuous stream of data and interactions. The baseball diamond around the corner is not just a sandlot for shagging fly balls; it is a collection of data points: tweets, photos, and events created by community members, and crime reports, 311 requests, and park facilities data from the local government.

I look forward to seeing where Adopt-a-sidewalk goes from here, especially if Code for America or one of the brigades takes some of the concepts from Adopt-a-sidewalk and pulls them back into the mainline repository. Adopt-a-sidewalk is, despite its flaws and low adoption, one very small step on a long path to building, enabling, and merging real life and online communities.