In this notebook we're going to develop a geo-spatial visualisation using Bokeh. Let's load the modules we'll need to begin with:

import pandas as pd
import matplotlib.pyplot as plt
import os
import numpy as np


Let's load the dataframe we developed in part 2:

actives_df = pd.read_pickle('actives_df_part2.pkl')


We are going to look at the data geo-spatially by post-code. This is the most convenient reference because every property address has a post-code, and it's also intuitive for most people. The following image is the post-code map of the "EH" (Edinburgh) area. We are going to focus our analysis on the city and its immediate surrounds: post-codes EH1 through EH32, inclusive.

EH_postcode_area_map.svg.png

Attribution: Contains Ordnance Survey and Royal Mail data © Crown copyright and database right (2023)


For the geo-location visualisation we are going to use the Bokeh package. Bokeh has built-in support for the Google Maps API and for rendering open-source map tiles (which we will use here). Let's import everything we'll need (the "output_notebook()" call renders the map inside the notebook instead of opening another browser window).

from bokeh.io import show
from bokeh.plotting import gmap
from bokeh.models import GMapOptions
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook
from bokeh.models import Scatter
from bokeh.plotting import figure
output_notebook()


As we'll be using OpenStreetMap tiles, which use the Web Mercator coordinate system, we'll need to convert the lon/lat coordinates in the database to Web Mercator eastings / northings (in metres). We'll add these as two additional features.

actives_df['longitude'] = actives_df['longitude'].astype('float')
actives_df['latitude'] = actives_df['latitude'].astype('float')
origin_shift = np.pi * 6378137
actives_df['eastings'] = actives_df['longitude'] * origin_shift / 180.0
actives_df['northings'] = np.log(np.tan((90 + actives_df['latitude']) * np.pi / 360.0)) * origin_shift / np.pi
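
As a sanity check on the conversion above, the same formulas can be wrapped in a small scalar function and tried on a rough Edinburgh coordinate. This is only a sketch: the function name `lonlat_to_web_mercator` and the sample coordinate (-3.19, 55.95) are assumptions for illustration and are not taken from the dataset.

```python
import math

EARTH_RADIUS_M = 6378137.0                # same radius as used in the cell above
ORIGIN_SHIFT = math.pi * EARTH_RADIUS_M   # half the equatorial circumference (m)

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert lon/lat in degrees to Web Mercator eastings/northings in metres."""
    easting = lon_deg * ORIGIN_SHIFT / 180.0
    northing = (math.log(math.tan((90.0 + lat_deg) * math.pi / 360.0))
                * ORIGIN_SHIFT / math.pi)
    return easting, northing

# Rough coordinate for central Edinburgh (an assumption, for illustration only)
easting, northing = lonlat_to_web_mercator(-3.19, 55.95)
print(round(easting), round(northing))
```

If the conversion is right, the printed point should fall inside the map window centred on (-340000, 7540000) that is used for the plots.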


One of the great things about Bokeh is how easy it is to get things going. It's wrapped a tremendous amount of functionality into a few simple calls.

# Render a map at an Eastings / Northings centered on Edinburgh
# EN = lnglat_to_meters(lon, lat)  # not run: the helper and lon/lat are not defined in this notebook
EN = (-340000, 7540000)
dE, dN = 35000, 35000

x_range = (EN[0]-dE, EN[0]+dE) # (m) Easting  x_lo, x_hi
y_range = (EN[1]-dN, EN[1]+dN) # (m) Northing y_lo, y_hi

p = figure(x_range=x_range, y_range=y_range,
           x_axis_type="mercator", y_axis_type="mercator",
           tools=['pan','box_zoom','wheel_zoom', 'save', 'reset'], active_drag = 'pan',
           width=900, height=400)

# Render the osm map
p.add_tile("OpenStreetMap Mapnik")

# Define a data source to plot all the For-Sale properties
source = ColumnDataSource(actives_df[actives_df['current_status'] == 'Active'])

glyph = Scatter(x='eastings', y='northings', size=6, fill_color='blue', line_color='blue', marker='circle', 
                line_alpha=0.2, fill_alpha=0.2)

p.add_glyph(source, glyph)

show(p)


That's a pretty nice overview of all the properties for sale in the EH1-EH32 area. We've turned the alpha down quite a bit, so where there's a high density of properties the markers overlap and show up as a darker region. We've also enabled the pan and mouse-wheel zoom tools (on the upper right of the plot) so you can move the map around and zoom into specific areas of interest.

With all my visualisations I try to think through what would be an effective way to additionally communicate some of the underlying data. It's interesting to see the overview but we can convey a lot more in a single plot by utilising other attributes like marker shape, colour, size and so on. Let's start with thinking through what would be interesting to convey about the underlying data.

Asking price is a pretty fundamental attribute. If we coded the markers in some way with the price information we would be able to see where, in general, the more expensive vs cheaper properties were. We'd also be able to zoom in on interesting individual properties that stand out from the neighbouring ones. We could encode this information with:

  • Colour - use a darker (colder) colour for cheaper, lighter (hotter) colour for more expensive.
  • Size - use a smaller size for cheaper and larger for more expensive.
  • Shape - use different shapes for different price bands and have a legend that relates them.

I don't like the shape idea - I think that for it to carry information it would need to have a lot of price-band categories and I think it would be very busy. Shapes aren't ordinal!

Size would be OK but it kind of assumes that you want to see, say, more expensive properties more clearly than cheaper ones, and that may not necessarily be the case. Even with the alpha turned down we will have problems getting the dynamic range right, so that we can clearly see all the properties (cheap or expensive) and the marker size is still meaningful. Recall from part 2 that we have almost two orders of magnitude difference between the cheapest and most expensive properties. This is hard to convey meaningfully with size without the larger markers completely dominating the visualisation.

Colour, it seems, would be the most appropriate way to convey the price range. By using an appropriate heat-map palette, we can have a very fine resolution on the scale.
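
To make the idea concrete, here is a minimal pure-Python sketch of what a linear colour mapping does: a price is scaled linearly between a low and a high bound and turned into a palette index, with anything outside the bounds clamped to the end colours. The function name `linear_colour_index` and the four-colour `demo_palette` are assumptions for illustration; this approximates the behaviour and is not Bokeh's own implementation.

```python
def linear_colour_index(value, low, high, n_colours):
    """Map value linearly onto a palette index in [0, n_colours - 1], clamping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    index = int(fraction * n_colours)
    return max(0, min(n_colours - 1, index))

# Four illustrative colours, dark to light (a stand-in for a 256-colour palette)
demo_palette = ["#0d0887", "#9c179e", "#ed7953", "#f0f921"]

for price in (50_000, 450_000, 900_000, 2_500_000):
    idx = linear_colour_index(price, 0, 900_000, len(demo_palette))
    print(price, demo_palette[idx])
```

Note that every price at or above the upper bound shares the top colour, so the choice of upper bound decides how much of the palette is spent on the bulk of the properties rather than on a few very expensive ones.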

Let's try it:

from bokeh.transform import linear_cmap
from bokeh.palettes import Plasma256 as palette
from bokeh.models import ColorBar

p = figure(x_range=x_range, y_range=y_range,
           x_axis_type="mercator", y_axis_type="mercator",
           tools=['pan','box_zoom','wheel_zoom', 'save', 'reset'], active_drag = 'pan',
           width=900, height=400)

# Render the osm map
p.add_tile("OpenStreetMap Mapnik")

# Define a colour mapper based on asking price and a data source to plot all the For Sale properties
mapper = linear_cmap('current_price', palette, 0, 900000)
source = ColumnDataSource(actives_df[actives_df['current_status'] == 'Active'])

glyph = Scatter(x='eastings', y='northings', size=6, fill_color=mapper, line_color=mapper, marker='circle', 
                line_alpha=0.6, fill_alpha=0.6)
p.add_glyph(source, glyph)

# Put the color bar "legend" on the right
color_bar = ColorBar(color_mapper=mapper['transform'], location=(0,0))
p.add_layout(color_bar, 'right')

show(p)


I think that looks pretty good! Next, we'd like to find an intuitive way to communicate to the viewer which are the new properties that have just come to market. This is very useful as it guides the viewer to the new information available in the visualisation vs what they may already have seen and studied before.

We can use shape or marker size for this and to not over-complicate the visualisation we will just have two categories - properties that have come to market within the last 7 days and everything else.
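
The two-category rule can be written down as a small plain-Python function. This is a sketch only: the name `marker_style` and its default shapes and sizes are assumptions chosen for illustration.

```python
NEW_LISTING_DAYS = 7  # threshold from the text: on the market for 7 days or fewer

def marker_style(days_on_market, new_shape="star", old_shape="circle",
                 new_size=10, old_size=6):
    """Return (shape, size) for a listing, given its days on market."""
    is_new = days_on_market <= NEW_LISTING_DAYS
    return (new_shape, new_size) if is_new else (old_shape, old_size)

for dom in (0, 7, 8, float("nan")):
    print(dom, marker_style(dom))
```

One detail worth noticing: a missing days-on-market value (NaN) fails the `<=` comparison, so such a listing falls into the "everything else" category rather than being flagged as new.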

Let's try shape first. We will need to add a categorical feature to our dataframe that indicates whether the property is new or not. To avoid a second mapping step from category to shape, we can store the marker shape string directly in this column and then reference that column from the column data source when rendering the glyph.

# Make the shape a star for new properties, a circle otherwise
actives_df['marker_shape'] = np.where(actives_df['dom'] <= 7, "star", "circle")
p = figure(x_range=x_range, y_range=y_range,
           x_axis_type="mercator", y_axis_type="mercator",
           tools=['pan','box_zoom','wheel_zoom', 'save', 'reset'], active_drag = 'pan',
           width=900, height=400)

# Render the osm map
p.add_tile("OpenStreetMap Mapnik")

# Define a colour mapper based on asking price and a data source to plot all the For Sale properties
mapper = linear_cmap('current_price', palette, 0, 900000)
source = ColumnDataSource(actives_df[actives_df['current_status'] == 'Active'])

glyph = Scatter(x='eastings', y='northings', size=6, fill_color=mapper, line_color=mapper, marker='marker_shape', 
                line_alpha=0.6, fill_alpha=0.6)
p.add_glyph(source, glyph)

# Put the color bar "legend" on the right
color_bar = ColorBar(color_mapper=mapper['transform'], location=(0,0))
p.add_layout(color_bar, 'right')

show(p)


Hmmm, it's ok-ish. I can see some of the stars zoomed-out and they're definitely distinct if you zoom-in. Let's try the marker size though and see if that's a little clearer.

# Make the size 10 for new properties, 6 otherwise
actives_df['marker_size'] = np.where(actives_df['dom'] <= 7, 10, 6)
p = figure(x_range=x_range, y_range=y_range,
           x_axis_type="mercator", y_axis_type="mercator",
           tools=['pan','box_zoom','wheel_zoom', 'save', 'reset'], active_drag = 'pan',
           width=900, height=400)

# Render the osm map
p.add_tile("OpenStreetMap Mapnik")

# Define a colour mapper based on asking price and a data source to plot all the For Sale properties
mapper = linear_cmap('current_price', palette, 0, 900000)
source = ColumnDataSource(actives_df[actives_df['current_status'] == 'Active'])

glyph = Scatter(x='eastings', y='northings', size='marker_size', fill_color=mapper, line_color=mapper, marker='circle', 
                line_alpha=0.4, fill_alpha=0.4)
p.add_glyph(source, glyph)

# Put the color bar "legend" on the right
color_bar = ColorBar(color_mapper=mapper['transform'], location=(0,0))
p.add_layout(color_bar, 'right')

show(p)


It looks a little busy to me. I think we're going to want to use both marker shape and size to communicate the new listings: the star stands out against the circles, but it's a little small when drawn at the same size as a circle. If we just use larger circles then the new listings tend to swamp the chart a bit (as we argued earlier about marker size). So, here's the combined version that I think is the best of both worlds:

p = figure(x_range=x_range, y_range=y_range,
           x_axis_type="mercator", y_axis_type="mercator",
           tools=['pan','box_zoom','wheel_zoom', 'save', 'reset'], active_drag = 'pan',
           width=900, height=400)

# Render the osm map
p.add_tile("OpenStreetMap Mapnik")

# Define a colour mapper based on asking price and a data source to plot all the For Sale properties
mapper = linear_cmap('current_price', palette, 0, 900000)
source = ColumnDataSource(actives_df[actives_df['current_status'] == 'Active'])

glyph = Scatter(x='eastings', y='northings', size='marker_size', fill_color=mapper, line_color=mapper, marker='marker_shape', 
                line_alpha=0.6, fill_alpha=0.6)
p.add_glyph(source, glyph)

# Put the color bar "legend" on the right
color_bar = ColorBar(color_mapper=mapper['transform'], location=(0,0))
p.add_layout(color_bar, 'right')

show(p)


That's pretty good zoomed-out or zoomed-in - we can clearly see the new listings and the colour clearly indicates the prices. Wouldn't it now be cool to be able to hover over any particular property and find out its listing details? In Bokeh, we can do that quite easily with the hover tool as follows:

from bokeh.models import HoverTool

# Define the fields of the hover tool
hover = HoverTool(tooltips = [
                              # ('Address', '@address'),  # omitted for privacy
                              ('Description', '@dwelling_type'),
                              ('Price', '£@current_price'),
                              ('Beds', '@beds'),
                              ('Baths', '@baths'),
                              ('Days on Market', '@dom'),
                              ])

p = figure(x_range=x_range, y_range=y_range,
           x_axis_type="mercator", y_axis_type="mercator",
           tools=[hover, 'pan','box_zoom','wheel_zoom', 'save', 'reset'], active_drag = 'pan',
           width=900, height=400)

# Render the osm map
p.add_tile("OpenStreetMap Mapnik")

# Define a colour mapper based on asking price and a data source to plot all the For Sale properties
mapper = linear_cmap('current_price', palette, 0, 1400000)
source = ColumnDataSource(actives_df[actives_df['current_status'] == 'Active'])

glyph = Scatter(x='eastings', y='northings', size='marker_size', fill_color=mapper, line_color=mapper, marker='marker_shape', 
                line_alpha=0.6, fill_alpha=0.6)
p.add_glyph(source, glyph)

# Put the color bar "legend" on the right
color_bar = ColorBar(color_mapper=mapper['transform'], location=(0,0))
p.add_layout(color_bar, 'right')

show(p)

Note: I have deliberately commented out the address for privacy reasons.


Try hovering the mouse over one of the glyphs and you will see some of the property details come up.
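
A small helper can keep the address out of the tooltip unless it is explicitly requested, and can format the price with thousands separators. This is a sketch under assumptions: `build_tooltips` and its `include_address` flag are hypothetical names, and the `{0,0}` suffix relies on the number-format syntax that Bokeh tooltips accept for `@field` values.

```python
def build_tooltips(include_address=False):
    """Return (label, template) pairs in the form HoverTool's `tooltips` argument accepts."""
    tooltips = [
        ("Description", "@dwelling_type"),
        ("Price", "£@current_price{0,0}"),  # {0,0}: thousands separators, no decimals
        ("Beds", "@beds"),
        ("Baths", "@baths"),
        ("Days on Market", "@dom"),
    ]
    if include_address:
        # Only add the address when the caller opts in
        tooltips.insert(0, ("Address", "@address"))
    return tooltips

print(build_tooltips())
```

The result can be passed straight to the hover tool, for example `HoverTool(tooltips=build_tooltips())`.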

actives_df.to_pickle('actives_df_part3.pkl')