The team wanted to build a tool to make it easier for the public to find suitable trees for Friends of the Urban Forest (FuF) to plant in their area. San Francisco has many microclimates, and not all trees thrive in each one. FuF has an online species directory, but it offers so many choices that it can be overwhelming. Our idea was to create a recommendation system for the FuF website that would suggest several candidate trees tailored to the conditions of a person’s specific location.
To kick off our work, FuF provided us with an inventory of tree health data. This inventory was conducted by the city government over the past year and contained metrics such as tree species, location and a subjective health score. Below is an example of the trees with various health scores mapped out in San Francisco.
After we had explored the data, we created some website mock-ups and several candidate models to test the tree recommender concept. We experimented with collaborative filtering and regression models, but ended up developing a simple k-nearest neighbors model, since it best fit the problem setup. This model recommends the best-performing trees (based on the condition score) within close proximity to a specified latitude and longitude. Additionally, we filtered out any trees that FuF has not approved for planting in San Francisco. Our final mock-up concept (the third image in the set of images below) was the basis for the website we went on to develop.
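To make the idea concrete, here is a minimal sketch of a nearest-neighbors recommender along these lines. It is not the code from our repository: the function names, the dictionary keys (`species`, `lat`, `lon`, `condition`), the default `k`, and the assumption that the condition score has already been converted to a number (higher is healthier) are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def recommend(trees, approved, lat, lon, k=5, top_n=3):
    """Return up to top_n species, ranked by mean condition among the k nearest approved trees.

    trees    -- list of dicts with hypothetical keys: species, lat, lon, condition
    approved -- set of species names approved for planting
    """
    # Drop trees whose species is not on the approved list.
    candidates = [t for t in trees if t["species"] in approved]

    # Keep the k trees closest to the requested location.
    nearest = sorted(
        candidates,
        key=lambda t: haversine_km(lat, lon, t["lat"], t["lon"]),
    )[:k]

    # Group the neighbors' condition scores by species.
    scores = defaultdict(list)
    for t in nearest:
        scores[t["species"]].append(t["condition"])

    # Rank species by mean condition (highest first); ties break alphabetically.
    ranked = sorted(scores, key=lambda s: (-sum(scores[s]) / len(scores[s]), s))
    return ranked[:top_n]
```

Calling `recommend(trees, approved, 37.76, -122.44)` would return a short list of species names for that point. Averaging the scores per species is just one reasonable way to define "best-performing"; a distance-weighted average or a minimum number of neighbors per species would be equally plausible variants.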
Once we’d developed a model and some mock-ups, we decided to put the pieces together in a proof of concept. We created a very basic website (code on GitHub) that lets you click on a map location to view a list of recommended trees. Since we only have data for San Francisco, we only recommend trees for locations within the city boundaries. A screenshot of the site is below.
We feel good about our recommender website. The basic idea is feasible and likely to be useful. There are, however, a few remaining challenges. The first hurdle is getting the recommender onto the FuF website, which involves some WordPress finagling. Additional work is also needed to ensure that the data is clean enough to provide recommendations accurate enough for FuF to use on their site. For example, the condition score text requires some additional cleaning, and more complex matching between the tree inventory and the approved trees list may be needed to ensure that all species are mapped correctly (we’ve already extracted the approved trees list to a spreadsheet and added it to the recommender site’s assets). If anyone is interested in taking on these challenges, come get involved at Code for San Francisco!