The project began with a mix of frustration and curiosity. Almost every single store, venue, grungy dive bar, and cheap restaurant I visited as a starry-eyed punk rock teenager and questing twentysomething was gone. Some were replaced with chain retail and luxury brands (and banks, so many banks) but an increasing number were sitting empty. My neighborhood began to look decrepit. As much as I bemoan national chains and the homogenization of the city, you can argue to me that the nationals are beneficial: they can weather economic downturns, they create jobs, they provide cheap goods and services to people who can't afford much. I'll probably still think you're wrong, but I'll listen to the arguments. Vacant storefronts provide value to nobody.
I saw articles linked from the incredible Vanishing New York blog, Tim Wu's New Yorker piece on high-rent blight, and several others, but most were focused on a specific neighborhood. I wanted to know what the whole picture looked like.
What's included and what's not:
This is specifically about vacant, for-lease storefronts. I've done my best to exclude empty but not obviously for-rent businesses, storefronts with plywood/kraft paper indicating potential construction, and spaces with existing businesses up for lease. All data came from either brokers' official websites or old-fashioned pavement-pounding. Aggregator and MLS sites like LoopNet made it difficult to tell whether a listing was outdated or covered a business that was still operating, so they've been excluded.
Unfortunately, there's a very long tail. Large brokers control the majority of retail in ultra-expensive Midtown, but storefronts in areas like the LES, the East Village, and Harlem are for rent by owner, by individual brokers, or by smaller brokerage firms that don't list their available properties online. I combed the especially hard-hit East Village and LES on foot, so the map is fairly accurate for those two neighborhoods. I found more than 100 properties that the online data collection had missed, which means the rest of the city is almost certainly worse than it looks on the map.
One of the hardest parts of this project was tying all the pieces together. The open source ecosystem for GIS is excellent, but I frequently found myself unsure which tool to use, how the tools tie together, and where to go for the next step of the project. While not comprehensive (and probably not the most efficient), here's a process from A to Z for anyone wanting to do similar work.
New York has put out a wide array of datasets for public consumption. DCP has a GIS shapefile with the 1M+ buildings in NYC, which was the basis of the map. I specifically chose this set because, unlike the PLUTO data, it goes down to the individual building, identified by the Building Identification Number (BIN), rather than stopping at the lot, identified by the Borough-Block-Lot number (BBL).
Getting the BIN from an address requires a two-step process. The Street Name Dictionary (SND) maps street names to a street code, which is necessary in a city where '6 Avenue', 'Sixth Avenue', and 'Avenue of the Americas' all refer to the same street. Armed with a street code and an address number, the Property Address Dictionary (PAD) yields the BIN. There are plenty of address quirks, combined buildings, defunct addresses, and other wrinkles to keep it from being 100% streamlined, but I wrote a tool to parse the PAD and SND and map the address to the BIN, available on my GitHub. There is an official geocoding API, but I needed to feed hundreds of addresses through it, and hammering on someone else's API feels rude.
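To make the two-step lookup concrete, here's a minimal sketch of the idea in Python. This is not the tool from my GitHub and it doesn't read the real files: the SND and PAD below are tiny hand-made stand-ins, the street codes and BINs are invented, and the names (PadRecord, address_to_bin, and so on) are hypothetical. It also ignores the wrinkles mentioned above, such as odd/even sides of the street and hyphenated house numbers.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for the SND: every spelling of a street maps to one street code.
# The codes are invented for illustration only.
SND = {
    "6 AVENUE": "S001",
    "SIXTH AVENUE": "S001",
    "AVENUE OF THE AMERICAS": "S001",
    "EAST 4 STREET": "S002",
}


@dataclass(frozen=True)
class PadRecord:
    """Simplified stand-in for one PAD row: a house-number range on a street."""
    street_code: str
    low: int    # lowest house number covered by this building
    high: int   # highest house number covered by this building
    bin: str    # Building Identification Number (invented values below)


PAD = [
    PadRecord("S001", 100, 110, "1000001"),
    PadRecord("S001", 112, 120, "1000002"),
    PadRecord("S002", 85, 85, "1000003"),
]


def normalize(street: str) -> str:
    """Uppercase the name and collapse repeated whitespace."""
    return " ".join(street.upper().split())


def street_code(street: str) -> Optional[str]:
    """Step 1: street name -> street code via the SND stand-in."""
    return SND.get(normalize(street))


def address_to_bin(number: int, street: str) -> Optional[str]:
    """Step 2: (house number, street code) -> BIN via the PAD stand-in."""
    code = street_code(street)
    if code is None:
        return None  # unknown street name
    for record in PAD:
        if record.street_code == code and record.low <= number <= record.high:
            return record.bin
    return None  # street is known, but no building covers this number


if __name__ == "__main__":
    # Both spellings resolve to the same street code, so they print the same BIN.
    print(address_to_bin(104, "Sixth Avenue"))
    print(address_to_bin(104, "Avenue of the Americas"))
```

A linear scan is fine for a toy list like this. For the full dataset you'd want to index the records by street code first.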
Mapping was all through free tools, mostly open-source. OpenGeo Suite has a great set of admin tools for the numerous PostGIS databases I used to manage the data. I rolled everything together into one shapefile via QGIS, then imported it into Mapbox Studio Classic to generate the vector tiles. Street data came from OpenStreetMap.
This project has been difficult, educational, depressing, and fascinating. I couldn't have done it alone. Thanks to:
- Jeremiah's Vanishing New York, for getting me interested and upset about the changes happening in the city I call home.
- David Elner's blog post that finally made sense of the city's property data.
- Boundless Geo and Mapbox, who have good tools and some of the best damn documentation I've seen in the OSS community.
- All my friends and loved ones who have listened (with varying degrees of enthusiasm) to me complain about this in bars and pushed me to Do Something.
Questions? Comments? Email me at email@example.com.