I’m working on a secret project to get something installed on the public way. The process to find out how to do it is as arduous as getting it done because you never finish learning the process. Every time you think you’ve figured something out, there’s something else.
To get the secret project installed I need a licensed contractor. Not only do a need a licensed contractor, but they must have the license to do work in the public way (versus doing work at your private property).
The Chicago Department of Buildings publishes a continually updated list of licensed contractors on its website but it’s annoying to use. There’s no search, no permanent links, and if you leave the window open long enough this weird session manager kicks in and stops you from browsing to the next page of results.
I asked my followers on Twitter the best way to scrape the data. The ever-amusing Dan O’Neill, who leads the Smart Chicago Collaborative (which hosts the Chicago Crash Browser), recommended just copying and pasting all 10 pages. That would work fine for the first time, but I might need to do it a second time when the data updates. Nick Bennett jumped in and used Selenium, a tool that automates web browsers. He said, “it’s inefficient but for a small job like that I figured why bother with something faster”.
I imported the data into a MySQL table and ran through some of my “standard” data cleaning methods (like trimming leading and trailing spaces, removing odd characters, and extracting good information into other columns, like phone numbers and ZIP codes).
With PHP – my favorite web language – I created a single page website that loads all 3,930 licensed general contractors extremely fast, loads the DataTables JavaScript library to enhance the table with search and sort. I used Bootstrap to make a responsive design meaning it adjusts to fit multiple screen sizes including smartphones and tablets.
I call it LicensedChicagoContractors.com.
The new website still doesn’t solve my problem of finding a company that can do work in the public way – I’m still working on this. The last online dataset I could find is on the city’s old http://egov.cityofchicago.org domain, and was cached by the Internet Archive’s Wayback Machine on January 25, 2010. Ideally this information – plumbers, public way, and general contractors – should be posted on the City’s data portal.