Skip to content

EtzionR/Cumulative-Heatmap-Calculation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

63 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Cumulative-Heatmap-Calculation

Fast calculation of heatmap from given 2D points.

Overview

This code is a follow-up project for calculating a heat map using a recursive algorithm: recursive-HeatMap-calculation. Calculating a heat map is a complex task, because since the user selects a more detailed resolution, the runtime of the calculation increases accordingly. The main difficulty in the calculation is in the sum of all the coordinates for the boundaries of each cell in the heatmap.

The recursive method has managed to result a significant improvement in the runtime of the code (see here). But, there are still modifications that can be made even better results. To calculate the heatmap, the code chm.py applied different approach from the previous project, and use Flooring Division to improve the runtime of the code:

  • Step One: Define the division parameter for the points 2D space, so that we know the length of a side of each square we want to calculate.

  • Step Two: We will use a loop through all the points, and calculate for each point its the row and column, using flooring division, when side is used as a denominator:

    and:

  • Step Three: We will use the row and column we have calculated as KEY in the dictionary, so that we can accumulate for each KEY the amount of points associated with it.

  • Step Four: now we get heatmap dictionary, that maintain for each row and column the number of intersected point to its area.

An illustration of the HeatMap calculation process can be seen in the following plot. As you can see, each dot is added to a specific square on the map, so in the end of the process we get the cumulative result:

cumulative_gif

This method allows a runtime of O(n) and produces only squares that overlap to the given points. This implementation allows extremely fast runtime, even for a large amount of given points. As can be seen, the runtime of the calculation using the cumulative algorithm is significantly faster, relative even to the recursive algorithm:

runtime

The resolution of the heatmap can be adjusted using the "division" variable. This variable determines to how many parts should the points 2D space should divide. The larger the "division" variable, the higher the resolution we get. You can see a diagram describing the split into squares, by each division:

square size

As mentioned, it is important to make sure that a suitable resolution is chosen for the calculation, since different values for the "division" variable will lead to different results. A simple example based on the file Athens.kml, illustrates how different values resulted different outputs:

resolution

The heat map calculation performed using the HeatMap object. This object receives a Python list consisting of tuples of X and Y coordinates. Also, the desired division also must be set for the object. Using the data about the coordinates, the HeatMap calculates the intersections. The object procces the coordinates to heatmap result, such as this example, based on the file data.npy, that contain 1,000,000 points!:

input output

The final squares can be accessed as one of the data of the object: HeatMap(xy_tpl_lst, division = 100).get_map(). The function return list of dictionaries(one for each square), each one contain two value:

  • "xy": conatian the X and Y coordinate values of the square
  • "count": conatian the number of the points intersect with the square

It is also possible to create a plot based on the results of the heat map using the function built into the object, as in the following example:

from chm import HeatMap, loadkml

xy = loadkml(r'examples\Athens.kml')
hm = HeatMap(xy, division=80)
hm.plot()

hm.plot() example

In addition, the calculation results can be exported and saved as files in various formats using the save_map function of the object, as can be seen in the implementation.py file. The function allows to save the results in SHP, KML and CSV format. It should be noted that saving the results as layers is automatically set to geographic coordinate system of WGS_1984_DD.

Also, the data that saved to a KML file gets a color corresponding to the count of its intersection. Example for such output you can see here as interactive MYMAPS page. This output based on the Athens.kml example file and the output also available here: Athens_output.kml. Also, for convenience, the code contains three XY-tuple-list load functions from SHP, KML and CSV files:

  • loadshp
  • loadkml
  • loadcsv (this function required also the X & Y fields names)

The Cumulative-Heatmap-Calculation code was originally written as a standard Python script, but now it rebuild by using Numpy implementation. This new vrsion increase the speed, over the previous version significantly (see here for runtime comparsion). The old version of the code is also available, in the following path: old version

Libraries

The code uses the following libraries in Python:

matplotlib

shapefile

simplekml

pandas

numpy

Application

An application of the code is attached to this page under the name:

implementation.py

the examples outputs are also attached here.

Example for using the code

To use this code, you just need to import it as follows:

# import
from chm import HeatMap, loadshp

# load data
xy = loadshp(r'examples\feature_class.shp')

# define division-resolution variable
divison = 5

# application
hm = HeatMap(xy, divison)

# plot
hm.plot()

# save data as csv
hm.save_map('filename','csv')

When the variables displayed are:

xy: the given xy coordiantes list.

division the required divisions for the 2D space, define the output resolution.

License

MIT © Etzion Harari

Releases

No releases published

Packages

No packages published

Languages