Category: Data

How I used ST_ClusterDBSCAN to locate clusters of multiple, similar parcels

Alternative headline: A practical example of how to use ST_ClusterDBSCAN to find similar real estate properties.

Oftentimes a developer wants to acquire several adjacent lots for a single redevelopment. Each standard sized lot in Chicago is about 3,125 square feet (25 feet wide and 125 feet deep). Because of downzoning in 2004, and since, the zoning rules for many lots allow only about 3-4 dwelling units each. Multiple lots are required to develop buildings with 6-9 dwelling units, which is a sweet spot in Chicago for design and avoiding having to get an upzone.

Chicago Cityscape has long had Property Finder, a tool to locate parcels that meet exacting specifications given existing lot size, current zoning district, distance to transit, and other criteria.

Now, Chicago Cityscape can locate parcels that are adjacent or near each other that all meet the user’s specified criteria (what the website calls “filters”). This is possible because of the PostGIS function ST_ClusterDBSCAN.

ST_ClusterDBSCAN considers all geospatial features in your result set (whatever matches the WHERE clause) and assigns them to a cluster ID according to two inputs: minimum cluster size, and maximum distance each feature can be from any other feature in order to be considered in the same cluster as that other feature.

The function can also assign a feature with a cluster ID of NULL, indicating that the feature did not meet the clustering criteria and is alone.

Show me what that looks like

Chicago Cityscape gives the user three options to cluster: Small, compact clusters with at least 3 properties each; small, compact clusters with at least 5 properties each; large, loose clusters with at least 10 properties each.

Additionally, Chicago Cityscape lets the user choose between showing parcels that weren’t found in a cluster, or hiding parcels that weren’t found in a cluster. The reason to show parcels that weren’t found in a cluster is to visualize where there are and aren’t clusters of parcels in the same map.

A map of Chicago’s Near West Side community area is shown with clusters of vacant lots. The “show all properties” mode is used, which shows clusters with a thick, black outline. Properties that were not in a cluster are still shown but without the thick black outline (enlarge the photo to see the difference).

Sample query

This query looks at all of the vacant lots within 1 mile of the intersection of Washington Boulevard and Karlov Avenue in the West Garfield Park community area of Chicago. The query looks for clusters of at least 3 features (“minpoints”) that are no more than 25 feet apart (“eps”). (The data are projected in Illinois StatePlane East Feet, rather than a projection that’s in meters because it’s easier for me to work with feet.)

I posted another sample query below that’s used to exclude all of the features that were not assigned to a cluster.

SELECT pin14, ST_ClusterDBSCAN(geom, eps := 25, minpoints := 3) over () AS cid, geom
FROM parcels
WHERE property_class = '1-00'
	AND ST_DWithin(geom,
        ST_Transform(
            ST_GeomFromText('POINT(-87.7278 41.8819)', 4326), 3435),
           5280)

The screenshot below shows clusters of vacant lots that resulted from the query above. The parcels symbolized in a gray gradient were not assigned to a cluster. Notice how clusters will form across the alleys but not across streets; this is because the streets are wider than 25 feet but most alleys are only 16 feet wide.

The map shows various groups (clusters) of vacant properties in West Garfield Park. Each cluster is symbolized in QGIS using a different color. Properties that are not in a cluster are symbolized by a gray gradient.

Exclusion sample query

This query is the same as above except that a Common Table Expression (CTE) is used (CTEs have the “WITH” keyword at the beginning) to create a subquery. The “WITH” subquery is the one that clusters the parcels and the following query (“SELECT *”) throws out any features returned by the subquery that don’t have a cluster ID (the “cid” field).

with parcels as (
SELECT pin14, ST_ClusterDBSCAN(geom, eps := 25, minpoints := 3) over () AS cid, geom
FROM parcels
WHERE property_class = '1-00'
	AND ST_DWithin(geom,
        ST_Transform(
            ST_GeomFromText('POINT(-87.7278 41.8819)', 4326), 3435),
           5280)
) select * 
from parcels where cid is not null;

I would also recommend Dan Baston’s blog post from six years ago which has more commentary and explanation, and additional examples of how to use the function.

Chicago Crash Browser updates with stats, filters, and news article links

Today I’m adding a bunch of new features to the Chicago Crash Browser, which lives on Chicago Cityscape.

But first…special access is no longer required. Anyone can create a free Cityscape account and access the map. However, only those with special access or a Cityscape Real Estate Pro account will be able to download the data.


Five new features include:

  • Statistics that update weekly to summarize what happened in the past week: the number of crashes, the number of people killed in crashes, and the number of people with the two worst tiers of injuries. The statistics are viewable to everyone, including those without access to the crash browser.
screenshot of the new weekly statistics
The statistics will update every Sunday. The numbers may change throughout the week as Chicago police officers upload crash reports.
  • For data users, the crash record ID is viewable. The crash record ID links details about the same crash across the Chicago data portal’s three tables: Crashes, Vehicles, and People. My Chicago Crash Browser is currently only using the Crashes table. Click on the “More details” arrow in the first table column.
screenshot of the data table showing the crash record ID revealed.
The crash record ID is hidden by default but can be exposed. Use this ID to locate details in the data portal’s Vehicles and People tables.
  • Filter crashes by location. There are currently two location filters: (1) on a “Pedestrian Street” (a zoning designation to, over time, reduce the prevalence of car-oriented land uses and improve building design to make them more appealing to walk next to); (2) within one block of a CTA or Metra station, important places where people commonly walk to. Select a filter’s radio button and then click “Apply filters”.
  • Filter crashes by availability of a news article or a note. I intend to attach news articles to every crash where a pedestrian or bicyclist was killed (the majority of these will be to Streetsblog Chicago articles, where I am still “editor at large”. Notes will include explanations about data changes [1] (the “map editor” mentioned in some of the notes is me) and victims’ names.
screenshot of the two types of filters

After choosing a filter’s radio button click “Apply filters” and the map and data table will update.
  • Filter by hit and run status. If the officer filling out the crash report marked it as a hit and run crash, you can filter by choosing “Yes” in the options list. “No” is another option, as is “not recorded”, which means the officer didn’t select yes or no.
  • Search by address. Use the search bar inside the map view to center the map and show crashes that occurred within one block (660 feet) of that point. The default is one block and users can increase that amount using the dropdown menu in the filter.
screenshot of the map after the search by address function has been used
Use the search bar within the map view to show crashes near a specific address in Chicago.

Footnotes

[1] The most common data change as of this writing is when a crash’s “most severe injury” is upgraded from non-fatal to fatal, but the crash report in the city’s data portal does not receive that update. This data pipeline/publishing issue is described in the browser’s “Crash data notes” section.

The “map editor” (me) will change a crash’s “most severe injury” to fatal to ensure it appears when someone filters for fatal crashes. This change to the data will be noted.

Chicagoland’s massive parking footprint – as measured on September 16, 2018

Using the footprints of parking lots and garages drawn into OpenStreetMap as a data source, the area of land in Chicagoland occupied by parking lots and garages is 247,539,968 square feet. (The data was exported using HOT Export Tool; you can replicate my export.)

That converts to:

  • 5,682.71 acres
  • 8.88 mi^2 (square miles)
  • 22.99 km^2 (square kilometers)
  • ≈ 0.26 × area of Manhattan (≈ 87 km^2 )
  • 3.9% area of Chicago is parking (Chicago is ~589.56 km^2 )

(I forgot to measure the portion of this within Chicago, and now the data snapshot is gone. I fixed this in the 2019 report.)

Why Jefferson Park residents should allow more housing

Short answer: To provide more shoppers for the local businesses. Read on for the longer answer. 

Over on Chicago Cityscape I added a new feature called “market analysis” which measures the number of people who live within specific walking areas (measured by time) and driving areas (measured by distance). 

I am in favor of removing apartment & condo bans in Chicago, especially in areas where they were previously allowed and near train stations.

Jefferson Park is centered around two co-located train stations, serviced by CTA and Metra respectively. There have been multiple proposals for multi-family housing near the stations (collectively called the Jefferson Park Transit Center) and some have been approved. 

Always, however, there are residents who resist these proposals and the number of originally proposed apartments or condos gets reduced in the final version (classic NIMBYism). 

There’re four reasons – at least – why more housing should be allowed near the Jefferson Park Transit Center:

  • Locally owned businesses require a significant amount of shoppers who live nearby and walk up traffic
  • More people should have the opportunity to live near low-cost transportation
  • It will include more affordable housing, through Chicago’s inclusionary zoning rules (the Affordable Requirements Ordinance, ARO)
  • There will be less driving, and therefore lower household transportation costs and less neighborhood pollution

To support the first reason, I used the “market analysis” tool to see just how many people live in a walkable area centered around Veterans Square, a mixed-use office and retail development adjacent to the train stations. 

Only 9,368 people live within a 10 minute walk to Veterans Square (get the Address Snapshot). 

Comparatively, 19,707 people live within a 10 minute walk to The Crotch, or the center of Wicker Park, at the intersection of Milwaukee/North/Damen (get the Address Snapshot). The Blue Line station is about 75 feet south of the center point.

I would grant the low Veterans Square number a small discount based on the proximity to the Kennedy Expressway, which severely truncates walking areas up and down the northwest side. Still, even with that discount, ending up with less than half the amount as the one in Wicker Park, is disturbing. Wicker Park is hardly characterized by high-density housing. In fact, all of the new high-rises are just outside the 10 minute walk shed!

Don’t ban apartments on this vacant lot if you want more affordable housing – a case study

A vacant lot is for sale near the 606’s Bloomingdale Trail, a popular amenity that’s now known to have an effect in increasing home values. It’s zoned RS-3, which means it bans apartments. If the zoning stays the same, then the vacant lot will only allow a rich family to move in here. If the lot’s zoning is changed to allow apartments or condos, then the vacant lot could welcome families that earn median incomes.

You can build multi-family housing on the lot if you can get a zoning change, but you’ll have to pay the city a fee, convince your future neighbors that they shouldn’t oppose it, convince the alder that he should support it, and you’ll have to hire a lawyer.

Let’s say that zoning changes in Chicago were free and frictionless*. What should be built on this lot?

If the lot would allow multi-family housing, we can build several units for less money per unit than if we built a single-family house. That means that three families (let’s stick with three, which requires a zoning change to RM-4.5) could be housed for less money per family than the cost of one family.

How’s that? The sticker price for this lot is $425,000 right now, and if one family is paying for that plus the cost of building a house, then your minimum investment is pretty massive. (I suspect the lot will sell for something closer to $400,000.)

I looked at new construction costs on Chicago Cityscape, as indicated on building permits issued within 1 mile of the vacant lot, took the average, and added it to the cost of land per unit.

Construction costs

The average new construction single-family house, from the 10 most recent permits, is $304,052.78.

The average new construction multi-family housing, from the 10 most recent permits, is $230,192.13 per unit.

Total cost per unit (land + construction)

Add in the land cost per unit ($425,000 for the single-family house and $141,666.67 per unit for the 3-flat) and you end up with the total costs of:

  • $729,052.78 for the single-family house
  • $371,858.80 per unit in the 3-flat

Add in the profit or “cap rate” that a builder wants to make and the price is even higher, but the people who would buy in the multi-family house would be paying much less for their homes.

Takeaways

The city can generate more affordable housing if it “upzones” vacant land and stops banning multi-family housing. (Much of the city’s parcels have been “downzoned” to ban multi-family housing in a process that creates “exclusionary zoning” and allows only – expensive – single-family housing.)

The city and the Chicago Transit Authority will earn more real estate transfer taxes (RPTT) from the sales of the units as condos than from a single-family house.

Three families instead of one would enjoy living to the wonderful amenity that the Bloomingdale Trail and the parks that the 606 offers.

Want this kind of analysis for a property in Chicago? You can order a zoning report from me.

* The City of Chicago charges a zoning change fee of $1,025, and you will most likely have to hire a lawyer, and it will take about 3-6 months, depending on the complexity of the proposal that requires the zoning change. You can use Chicago Cityscape to see actual approval times (excluding the time meeting the alder for the ward of the proposed project).