Leuven.MapMatching’s documentation

Align a trace of coordinates (e.g. GPS measurements) to a map of road segments.

The matching is based on a Hidden Markov Model (HMM) with non-emitting states. The model can deal with missing data and you can plug in custom transition and emission probability distributions.

Reference:

Meert Wannes, Mathias Verbeke, “HMM with Non-Emitting States for Map Matching”, European Conference on Data Analysis (ECDA), Paderborn, Germany, 2018.
example

Installation

Dependencies

Required:

Optional (only loaded when methods are called that rely on these packages):

Using pip

If you want to install the latest released version using pip:

$ pip install leuvenmapmatching

If you want to install the latest non-released version (add @develop) for the latest development version:

$ pip install git+https://github.com/wannesm/leuvenmapmatching

From source

The library can also be compiled and/or installed directly from source.

Examples

Example 1: Simple

A first, simple example. Some parameters are given to tune the algorithm. The max_dist and obs_noise are distances that indicate the maximal distance between observation and road segment and the expected noise in the measurements, respectively. The min_prob_norm prunes the lattice in that it drops paths that drop below 0.5 normalized probability. The probability is normalized to allow for easier reasoning about the probability of a path. It is computed as the exponential smoothed log probability components instead of the sum as would be the case for log likelihood. Because the number of possible paths quickly grows, it is recommended to set the max_lattice_width argument to speed up the algorithm (available from version 1.0 onwards). It will only continue the search with this number of possible paths at every step. If no solution is found, this value can be incremented using the increase_max_lattice_width method.

from leuvenmapmatching.matcher.distance import DistanceMatcher
from leuvenmapmatching.map.inmem import InMemMap

map_con = InMemMap("mymap", graph={
    "A": ((1, 1), ["B", "C", "X"]),
    "B": ((1, 3), ["A", "C", "D", "K"]),
    "C": ((2, 2), ["A", "B", "D", "E", "X", "Y"]),
    "D": ((2, 4), ["B", "C", "F", "E", "K", "L"]),
    "E": ((3, 3), ["C", "D", "F", "Y"]),
    "F": ((3, 5), ["D", "E", "L"]),
    "X": ((2, 0), ["A", "C", "Y"]),
    "Y": ((3, 1), ["X", "C", "E"]),
    "K": ((1, 5), ["B", "D", "L"]),
    "L": ((2, 6), ["K", "D", "F"])
}, use_latlon=False)

path = [(0.8, 0.7), (0.9, 0.7), (1.1, 1.0), (1.2, 1.5), (1.2, 1.6), (1.1, 2.0),
        (1.1, 2.3), (1.3, 2.9), (1.2, 3.1), (1.5, 3.2), (1.8, 3.5), (2.0, 3.7),
        (2.3, 3.5), (2.4, 3.2), (2.6, 3.1), (2.9, 3.1), (3.0, 3.2),
        (3.1, 3.8), (3.0, 4.0), (3.1, 4.3), (3.1, 4.6), (3.0, 4.9)]

matcher = DistanceMatcher(map_con, max_dist=2, obs_noise=1, min_prob_norm=0.5, max_lattice_width=5)
states, _ = matcher.match(path)
nodes = matcher.path_pred_onlynodes

print("States\n------")
print(states)
print("Nodes\n------")
print(nodes)
print("")
matcher.print_lattice_stats()

Example 2: Non-emitting states

In case there are less observations that states (an assumption of HMMs), non-emittings states allow you to deal with this. States will be inserted that are not associated with any of the given observations if this improves the probability of the path.

It is possible to also associate a distribtion over the distance between observations and the non-emitting states (obs_noise_ne). This allows the algorithm to prefer nearby road segments. This value should be larger than obs_noise as it is mapped to the line between the previous and next observation, which does not necessarily run over the relevant segment. Setting this to infinity is the same as using pure non-emitting states that ignore observations completely.

from leuvenmapmatching.matcher.distance import DistanceMatcher
from leuvenmapmatching.map.inmem import InMemMap
from leuvenmapmatching import visualization as mmviz

path = [(1, 0), (7.5, 0.65), (10.1, 1.9)]
mapdb = InMemMap("mymap", graph={
    "A": ((1, 0.00), ["B"]),
    "B": ((3, 0.00), ["A", "C"]),
    "C": ((4, 0.70), ["B", "D"]),
    "D": ((5, 1.00), ["C", "E"]),
    "E": ((6, 1.00), ["D", "F"]),
    "F": ((7, 0.70), ["E", "G"]),
    "G": ((8, 0.00), ["F", "H"]),
    "H": ((10, 0.0), ["G", "I"]),
    "I": ((10, 2.0), ["H"])
}, use_latlon=False)
matcher = DistanceMatcher(mapdb, max_dist_init=0.2, obs_noise=1, obs_noise_ne=10,
                          non_emitting_states=True, only_edges=True , max_lattice_width=5)
states, _ = matcher.match(path)
nodes = matcher.path_pred_onlynodes

print("States\n------")
print(states)
print("Nodes\n------")
print(nodes)
print("")
matcher.print_lattice_stats()

mmviz.plot_map(mapdb, matcher=matcher,
              show_labels=True, show_matching=True
              filename="output.png"))

Map from OpenStreetMap

You can download a graph for map-matching from the OpenStreetMap.org service. Multiple methods exists, we illustrate two.

Using requests, osmread and gpx

You can perform map matching on a OpenStreetMap database by combing leuvenmapmatching with the packages requests, osmread and gpx.

Download a map as XML

You can use the overpass-api.de service:

from pathlib import Path
import requests
xml_file = Path(".") / "osm.xml"
url = 'http://overpass-api.de/api/map?bbox=4.694933,50.870047,4.709256000000001,50.879628'
r = requests.get(url, stream=True)
with xml_file.open('wb') as ofile:
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:
            ofile.write(chunk)

Create graph using osmread

Once we have a file containing the region we are interested in, we can select the roads we want to use to create a graph from. In this case we focus on ‘ways’ with a ‘highway’ tag. Those represent a variety of roads. For a more detailed filtering look at the possible values of the highway tag.

from leuvenmapmatching.map.inmem import InMemMap
import osmread

map_con = InMemMap("myosm", use_latlon=True, use_rtree=True, index_edges=True)
for entity in osmread.parse_file(str(xml_file)):
    if isinstance(entity, osmread.Way) and 'highway' in entity.tags:
        for node_a, node_b in zip(entity.nodes, entity.nodes[1:]):
            map_con.add_edge(node_a, node_b)
            # Some roads are one-way. We'll add both directions.
            map_con.add_edge(node_b, node_a)
    if isinstance(entity, osmread.Node):
        map_con.add_node(entity.id, (entity.lat, entity.lon))
map_con.purge()

Note that InMemMap is a simple container for a map. It is recommended to use your own optimized connecter to your map dataset.

If you want to allow transitions that are not following the exact road segments you can inherit from the Map class and define a new class with your own transitions. The transitions are defined using the nodes_nbrto and edges_nbrt methods.

Perform map matching on an OpenStreetMap database

You can create a list of latitude-longitude coordinates manually. Or read a gpx file.

from leuvenmapmatching.util.gpx import gpx_to_path

track = gpx_to_path("mytrack.gpx")
matcher = DistanceMatcher(map_con,
                         max_dist=100, max_dist_init=25,  # meter
                         min_prob_norm=0.001,
                         non_emitting_length_factor=0.75,
                         obs_noise=50, obs_noise_ne=75,  # meter
                         dist_noise=50,  # meter
                         non_emitting_states=True,
                         max_lattice_width=5)
states, lastidx = matcher.match(track)

Using osmnx and geopandas

Another great library to interact with OpenStreetMap data is the osmnx package. The osmnx package can retrieve relevant data automatically, for example when given a name of a region. This package is build on top of the geopandas package.

import osmnx
graph = ox.graph_from_place('Leuven, Belgium', network_type='drive', simplify=False)
graph_proj = ox.project_graph(graph)

# Create GeoDataFrames (gdfs)
# Approach 1
nodes_proj, edges_proj = ox.graph_to_gdfs(graph_proj, nodes=True, edges=True)
for nid, row in nodes_proj[['x', 'y']].iterrows():
    map_con.add_node(nid, (row['x'], row['y']))
for eid, _ in edges_proj.iterrows():
    map_con.add_edge(eid[0], eid[1])

# Approach 2
nodes, edges = ox.graph_to_gdfs(graph_proj, nodes=True, edges=True)
nodes_proj = nodes.to_crs("EPSG:3395")
edges_proj = edges.to_crs("EPSG:3395")
for nid, row in nodes_proj.iterrows():
    map_con.add_node(nid, (row['lat'], row['lon']))
# We can also extract edges also directly from networkx graph
for nid1, nid2, _ in graph.edges:
    map_con.add_edge(nid1, nid2)

Visualisation

To inspect the results, a plotting function is included.

Simple plotting

To plot the graph in a matplotlib figure use:

from leuvenmapmatching import visualization as mmviz
mmviz.plot_map(map_con, matcher=matcher,
               show_labels=True, show_matching=True, show_graph=True,
               filename="my_plot.png")

This will result in the following figure:

Plot1

You can also define your own figure by passing a matplotlib axis object:

fig, ax = plt.subplots(1, 1)
mmviz.plot_map(map_con, matcher=matcher,
               ax=ax,
               show_labels=True, show_matching=True, show_graph=True,
               filename="my_plot.png")

Plotting with an OpenStreetMap background

The plotting function also supports a link with the smopy package. Set the use_osm argument to true and pass a map that is defined with latitude-longitude (thus use_latlon=True).

You can set zoom_path to true to only see the relevant part and not the entire map that is available in the map. Alternatively you can also set the bounding box manually using the bb argument.

mm_viz.plot_map(map_con, matcher=matcher,
                use_osm=True, zoom_path=True,
                show_labels=False, show_matching=True, show_graph=False,
                filename="my_osm_plot.png")

This will result in the following figure:

Plot2

Or when some GPS points are missing in the track, the matching is more visible as the matched route deviates from the straight line between two GPS points:

Plot3

Dealing with Latitude-Longitude

The toolbox can deal with latitude-longitude coordinates directly. Map matching, however, requires a lot of repeated computations between points and latitude-longitude computations will be more expensive than Euclidean distances.

There are three different options how you can handle latitude-longitude coordinates:

Option 1: Use Latitude-Longitude directly

Set the use_latlon flag in the Map to true.

For example to read in an OpenStreetMap file directly to a InMemMap object:

from leuvenmapmatching.map.inmem import InMemMap

map_con = InMemMap("myosm", use_latlon=True)

for entity in osmread.parse_file(osm_fn):
    if isinstance(entity, osmread.Way) and 'highway' in entity.tags:
        for node_a, node_b in zip(entity.nodes, entity.nodes[1:]):
            map_con.add_edge(node_a, node_b)
            map_con.add_edge(node_b, node_a)
    if isinstance(entity, osmread.Node):
        map_con.add_node(entity.id, (entity.lat, entity.lon))
map_con.purge()

Option 2: Project Latitude-Longitude to X-Y

Latitude-Longitude coordinates can be transformed two a frame with two orthogonal axis.

from leuvenmapmatching.map.inmem import InMemMap

map_con_latlon = InMemMap("myosm", use_latlon=True)
# Add edges/nodes
map_con_xy = map_con_latlon.to_xy()

route_latlon = []
# Add GPS locations
route_xy = [map_con_xy.latlon2yx(latlon) for latlon in route_latlon]

This can also be done directly using the pyproj toolbox. For example, using the Lambert Conformal projection to project the route GPS coordinates:

import pyproj

route = [(4.67878,50.864),(4.68054,50.86381),(4.68098,50.86332),(4.68129,50.86303),(4.6817,50.86284),
         (4.68277,50.86371),(4.68894,50.86895),(4.69344,50.86987),(4.69354,50.86992),(4.69427,50.87157),
         (4.69643,50.87315),(4.69768,50.87552),(4.6997,50.87828)]
lon_0, lat_0 = route[0]
proj = pyproj.Proj(f"+proj=merc +ellps=GRS80 +units=m +lon_0={lon_0} +lat_0={lat_0} +lat_ts={lat_0} +no_defs")
xs, ys = [], []
for lon, lat in route:
    x, y = proj(lon, lat)
    xs.append(x)
    ys.append(y)

Notice that the pyproj package uses the convention to express coordinates as x-y which is longitude-latitude because it is defined this way in the CRS definitions while the Leuven.MapMatching toolbox follows the ISO 6709 standard and expresses coordinates as latitude-longitude. If you want pyproj to use latitude-longitude you can use set the axisswap option.

If you want to define both the from and to projections:

import pyproj

route = [(4.67878,50.864),(4.68054,50.86381),(4.68098,50.86332),(4.68129,50.86303),(4.6817,50.86284),
         (4.68277,50.86371),(4.68894,50.86895),(4.69344,50.86987),(4.69354,50.86992),(4.69427,50.87157),
         (4.69643,50.87315),(4.69768,50.87552),(4.6997,50.87828)]
p1 = pyproj.Proj(proj='latlon', datum='WGS84')
p2 = pyproj.Proj(proj='utm', datum='WGS84')
xs, ys = [], []
for lon, lat in route:
    x, y = pyproj.transform(lon, lat)
    xs.append(x)
    ys.append(y)

Option 3: Use Latitude-Longitude as if they are X-Y points

A naive solution would be to use latitude-longitude coordinate pairs as if they are X-Y coordinates. For small distances, far away from the poles and not crossing the dateline, this option might work. But it is not adviced.

For example, for long distances the error is quite large. In the image beneath, the blue line is the computation of the intersection using latitude-longitude while the red line is the intersection using Eucludean distances.

Latitude-Longitude mismatch
Latitude-Longitude mismatch detail

Custom probability distributions

You can use your own custom probability distributions for the transition and emission probabilities. This is achieved by inheriting from the BaseMatcher class.

Examples are available in the SimpleMatching class and DistanceMatching class. The latter implements a variation based on Newson and Krumm (2009).

Transition probability distribution

Overwrite the logprob_trans() method.

For example, if you want to use a uniform distribution over the possible road segments:

def logprob_trans(self, prev_m, edge_m, edge_o, is_prev_ne, is_next_ne):
    return -math.log(len(self.matcher.map.nodes_nbrto(self.edge_m.last_point())))

Note that prev_m.edge_m and edge_m are not necessarily connected. For example if the Map object returns a neighbor state that is not connected in the roadmap. This functionality is used to allow switching lanes.

Emission probability distribution

Overwrite the logprob_obs() method for non-emitting nodes. These methods are given the closest distance as dist, the previous Matching object in the lattice, the state as edge_m, and the observation as edge_o. The latter two are Segment objects that can represent either a segment or a point. Each segment also has a project point which is the point on the segment that is the closest point.

For example, a simple step function with more tolerance for non-emitting nodes:

def logprob_obs(self, dist, prev_m, new_edge_m, new_edge_o, is_ne):
    if is_ne:
        if dist < 50:
            return -math.log(50)
    else:
        if dist < 10:
            return -math.log(10)
    return -np.inf

Note that an emission probability can be given for a non-emitting node. This allows you to rank non-emitting nodes even when no observations are available. It will then insert pseudo-observations on the line between the previous and next observations. To have a pure non-emitting node, the logprob_obs method should always return 0 if the is_ne argument is true.

Custom lattice objects

If you need to store additional information in the lattice, inherit from the Matching class and pass your custom object to the Matcher object.

from leuvenmapmatching.map.base import BaseMatching

class MyMatching(BaseMatching):
    ...

matcher = MyMatcher(mapdb, matching=MyMatching)

Incremental matching

Example: Incremental matching

If the observations are collected in a streaming setting. The matching can also be invoked incrementally. The lattice will be built further every time a new subsequence of the path is given.

from leuvenmapmatching.matcher.distance import DistanceMatcher
from leuvenmapmatching.map.inmemmap import InMemMap

map_con = InMemMap("mymap", graph={
    "A": ((1, 1), ["B", "C", "X"]),
    "B": ((1, 3), ["A", "C", "D", "K"]),
    "C": ((2, 2), ["A", "B", "D", "E", "X", "Y"]),
    "D": ((2, 4), ["B", "C", "D", "E", "K", "L"]),
    "E": ((3, 3), ["C", "D", "F", "Y"]),
    "F": ((3, 5), ["D", "E", "L"]),
    "X": ((2, 0), ["A", "C", "Y"]),
    "Y": ((3, 1), ["X", "C", "E"]),
    "K": ((1, 5), ["B", "D", "L"]),
    "L": ((2, 6), ["K", "D", "F"])
}, use_latlon=False)

path = [(0.8, 0.7), (0.9, 0.7), (1.1, 1.0), (1.2, 1.5), (1.2, 1.6), (1.1, 2.0),
        (1.1, 2.3), (1.3, 2.9), (1.2, 3.1), (1.5, 3.2), (1.8, 3.5), (2.0, 3.7),
        (2.3, 3.5), (2.4, 3.2), (2.6, 3.1), (2.9, 3.1), (3.0, 3.2),
        (3.1, 3.8), (3.0, 4.0), (3.1, 4.3), (3.1, 4.6), (3.0, 4.9)]

matcher = DistanceMatcher(map_con, max_dist=2, obs_noise=1, min_prob_norm=0.5)
states, _ = matcher.match(path[:5])
states, _ = matcher.match(path, expand=True)
nodes = matcher.path_pred_onlynodes

print("States\n------")
print(states)
print("Nodes\n------")
print(nodes)
print("")
matcher.print_lattice_stats()

Debug

Increasing the verbosity level

To inspect the intermediate steps that the algorithm take, you can increase the verbosity level of the package. For example:

import sys
import logging
import leuvenmapmatching
logger = leuvenmapmatching.logger

logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(sys.stdout))

Inspect the best matching

The best match is available in matcher.lattice_best. This is a list of Matching objects. For example after running the first example in the introduction:

>>> matcher.lattice_best
[Matching<A-B-0-0>,
 Matching<A-B-1-0>,
 Matching<A-B-2-0>,
...

A matching object summarizes its information as a tuple with three values if the best match is with a vertex: <label-observation-nonemitting>. And a tuple with four values if the best match is with an edge: <labelstart-labelend-observation-nonemitting>.

In the example above, the first observation (with index 0) is matched to a point on the edge A-B. If you want to inspect the exact locations, you can query the Segment objects that express the observation and map: matching.edge_o and matching.edge_m.

>>> match = matcher.lattice_best[0]
>>> match.edge_m.l1, match.edge_m.l2  # Edge start/end labels
('A', 'B')
>>> match.edge_m.pi  # Best point on A-B edge
(1.0, 1.0)
>>> match.edge_m.p1, match.edge_m.p2  # Locations of A and B
((1, 1), (1, 3))
>>> match.edge_o.l1, match.edge_o.l2  # Observation
('O0', None)
>>> match.edge_o.pi  # Location of observation O0, because no second location
(0.8, 0.7)
>>> match.edge_o.p1  # Same as pi because no interpolation
(0.8, 0.7)

Inspect the matching lattice

All paths through the lattice are available in matcher.lattice. The lattice is a dictionary with a LatticeColumn object for each observation (in case the full path of observations is matched).

For each observation, you can inspect the Matching objects with:

>>> matcher.lattice
{0: <leuvenmapmatching.matcher.base.LatticeColumn at 0x12369bf40>,
 1: <leuvenmapmatching.matcher.base.LatticeColumn at 0x123639dc0>,
 2: <leuvenmapmatching.matcher.base.LatticeColumn at 0x123603f40>,
 ...
>>> matcher.lattice[0].values_all()
{Matching<A-B-0-0>,
 Matching<A-B-0-1>,
 Matching<A-C-0-0>,
 ...

To start backtracking you can, for example, see which matching object for the last element has the highest probability (thus the best match):

>>> m = max(matcher.lattice[len(path)-1].values_all(), key=lambda m: m.logprob)
>>> m.logprob
-0.6835815469734807

The previous matching objects can be queried with. These are only those matches that are connected to this matchin the lattice (in this case nodes in the street graph with an edge to the current node):

>>> m.prev  # Best previous match with a connection (multiple if equal probability)
{Matching<E-F-20-0>}
>>> m.prev_other  # All previous matches in the lattice with a connection
{Matching<C-E-20-0>,
 Matching<D-E-20-0>,
 Matching<F-E-20-0>,
 Matching<Y-E-20-0>}

matcher

BaseMatcher

This a generic base class to be used by matchers. This class itself does not implement a working matcher. Use a matcher such as SimpleMatcher, DistanceMatcher, …

class leuvenmapmatching.matcher.base.BaseMatcher(map_con, obs_noise=1, max_dist_init=None, max_dist=None, min_prob_norm=None, non_emitting_states=True, max_lattice_width=None, only_edges=True, obs_noise_ne=None, matching=<class 'leuvenmapmatching.matcher.base.BaseMatching'>, non_emitting_length_factor=0.75, **kwargs)[source]

Initialize a matcher for map matching.

This a generic base class to be used by matchers. This class itself does not implement a working matcher.

Distances are in meters when using latitude-longitude.

Parameters:
  • map_con – Map object to connect to map database
  • obs_noise – Standard deviation of noise
  • obs_noise_ne – Standard deviation of noise for non-emitting states (is set to obs_noise if not give)
  • max_dist_init – Maximum distance from start location (if not given, uses max_dist)
  • max_dist – Maximum distance from path (this is a hard cut, min_prob_norm should be better)
  • min_prob_norm – Minimum normalized probability of observations (ema)
  • non_emitting_states – Allow non-emitting states. A non-emitting state is a state that is not associated with an observation. Here we assume it can be associated with a location in between two observations to allow for pruning. It is advised to set min_prob_norm and/or max_dist to avoid visiting all possible nodes in the graph.
  • max_lattice_width – Only continue from a limited number of states (thus locations) for a given observation. This possibly speeds up the matching by a lot. If there are more possible next states, the states with the best likelihood so far are selected. The other states are ‘delayed’. If the matching is continued later with a larger value using increase_max_lattice_width, the algorithms continuous from these delayed states.
  • only_edges – Do not include nodes as states, only edges. This is the typical setting for HMM methods.
  • matching – Matching type
  • non_emitting_length_factor – Reduce the probability of a sequence of non-emitting states the longer it is. This can be used to prefer shorter paths. This is separate from the transition probabilities because transition probabilities are averaged for non-emitting states and thus the length is also averaged out.

To define a custom transition and/or emission probability distribtion, overwrite the following functions:

copy_lastinterface(nb_interfaces=1)[source]

Copy the current matcher and keep the last interface as the start point.

This method allows you to perform incremental matching without keeping the entire lattice in memory.

You need to run match_incremental() on this object to continue from the existing (partial) lattice. Otherwise, if you use match(), it will be overwritten.

Open question, if there is no need to keep track of older lattices, it will probably be more efficient to clear the older parts of the interface instead of copying the newer parts.

Parameters:nb_interfaces – Nb of interfaces (columns in lattice) to keep. Default is 1, the last one.
Returns:new Matcher object
get_matching_path(start_m)[source]

List of Matching objects that end in the given Matching object.

get_node_path(start_m, only_nodes=False)[source]

List of node/edge names that end in the given Matching object.

inspect_early_stopping()[source]

Analyze the lattice and try to find most plausible reason why the matching stopped early and print to stdout.

lattice_dot(file=None, precision=None, render=False)[source]

Write the lattice as a Graphviz DOT file.

Parameters:
  • file – File object to print to. Prints to stdout if None.
  • precision – Precision of (log) probabilities.
  • render – Try to render the generated Graphviz file.
logprob_obs(dist, prev_m, new_edge_m, new_edge_o, is_ne=False)[source]

Emission probability.

Note: In contrast with a regular HMM, this cannot be a probability density function, it needs
to be a proper probability (thus values between 0.0 and 1.0).
Returns:probability, properties that are passed to the matching object
logprob_trans(prev_m, edge_m, edge_o, is_prev_ne=False, is_next_ne=False)[source]

Transition probability.

Note: In contrast with a regular HMM, this cannot be a probability density function, it needs
to be a proper probability (thus values between 0.0 and 1.0).
Returns:probability, properties that are passed to the matching object
match(path, unique=False, tqdm=None, expand=False)[source]

Dynamic Programming based (HMM-like) map matcher.

If the matcher fails to match the entire path, the last matched index is returned. This index can be used to run the matcher again from that observation onwards.

Parameters:
  • path – list[Union[tuple[lat, lon], tuple[lat, lon, time]]
  • unique – Only retain unique nodes in the sequence (avoid repetitions)
  • tqdm – Use a tqdm progress reporter (default is None)
  • expand – Expand the current lattice (delayed matches)
Returns:

Tuple of (List of BaseMatching, index of last observation that was matched)

match_gpx(gpx_file, unique=True)[source]

Map matching from a gpx file

node_path_to_only_nodes(path)[source]

Path of nodes and edges to only nodes.

Parameters:path – List of node names or edges as (node name, node name)
Returns:List of node names
path_all_distances()[source]

Return a list of all distances between observed trace and map.

One entry for each point in the map and point in the trace that are mapped to each other. In case non-emitting nodes are used, extra entries can be present where a point in the trace or a point in the map is mapped to a segment.

path_bb()[source]

Get boundig box of matched path (if it exists, otherwise return None).

path_distance()[source]

Total distance of the observations.

path_pred

The matched path, both nodes and/or edges (depending on your settings).

path_pred_distance()[source]

Total distance of the matched path.

path_pred_onlynodes

A list with all the nodes (no edges) the matched path passes through.

SimpleMatcher

class leuvenmapmatching.matcher.simple.SimpleMatcher(*args, **kwargs)[source]

A simple matcher that prefers paths where each matched location is as close as possible to the observed position.

Parameters:
  • avoid_goingback – Change the transition probability to be lower for the direction the path is coming from.
  • kwargs – Arguments passed to BaseMatcher.
logprob_obs(dist, prev_m, new_edge_m, new_edge_o, is_ne=False)[source]

Emission probability.

Note: In contrast with a regular HMM, this is not a probability density function, it needs
to be a proper probability (thus values between 0.0 and 1.0).
logprob_trans(prev_m: leuvenmapmatching.matcher.base.BaseMatching, edge_m: leuvenmapmatching.util.segment.Segment, edge_o: leuvenmapmatching.util.segment.Segment, is_prev_ne=False, is_next_ne=False)[source]

Transition probability.

Note: In contrast with a regular HMM, this is not a probability density function, it needs
to be a proper probability (thus values between 0.0 and 1.0).

DistanceMatcher

class leuvenmapmatching.matcher.distance.DistanceMatcher(*args, **kwargs)[source]

Map Matching that takes into account the distance between matched locations on the map compared to the distance between the observations (that are matched to these locations). It thus prefers matched paths that have a similar distance than the observations.

Inspired on the method presented in:

P. Newson and J. Krumm. Hidden markov map matching through noise and sparseness. In Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems, pages 336–343. ACM, 2009.

The options available in :class:BaseMatcher are inherited. Additionally, this class offers:

  • Transition probability is lower if the distance between observations and states is different
  • Transition probability is lower if the the next match is going back on an edge or to a previous edge
  • Transition probability is lower if two neighboring states represent not connected edges
  • Skip non-emitting states if distance between states and observations is close to each other

Create a new object.

Parameters:
  • map_con – Map object to connect to map database
  • obs_noise – Standard deviation of noise
  • obs_noise_ne – Standard deviation of noise for non-emitting states (is set to obs_noise if not given)
  • max_dist_init – Maximum distance from start location (if not given, uses max_dist)
  • max_dist – Maximum distance from path (this is a hard cut, min_prob_norm should be better)
  • min_prob_norm – Minimum normalized probability of observations (ema)
  • non_emitting_states – Allow non-emitting states. A non-emitting state is a state that is not associated with an observation. Here we assume it can be associated with a location in between two observations to allow for pruning. It is advised to set min_prob_norm and/or max_dist to avoid visiting all possible nodes in the graph.
  • non_emitting_length_factor – Reduce the probability of a sequence of non-emitting states the longer it is. This can be used to prefer shorter paths. This is separate from the transition probabilities because transition probabilities are averaged for non-emitting states and thus the length is also averaged out.
  • max_lattice_width – Restrict the lattice (or possible candidate states per observation) to this value. If there are more possible next states, the states with the best likelihood so far are selected.
  • dist_noise – Standard deviation of difference between distance between states and distance between observatoins. If not given, set to obs_noise
  • dist_noise_ne – If not given, set to dist_noise
  • restrained_ne – Avoid non-emitting states if the distance between states and between observations is close to each other.
  • avoid_goingback – If true, the probability is lowered for a transition that returns back to a previous edges or returns to a position on an edge.
  • args – Arguments for BaseMatcher
  • kwargs – Arguments for BaseMatcher
logprob_obs(dist, prev_m=None, new_edge_m=None, new_edge_o=None, is_ne=False)[source]

Emission probability for emitting states.

Exponential family: \(P(dt) = exp(-d_o^2 / (2 * obs_{noise}^2))\)

with \(d_o = |loc_{state} - loc_{obs}|\)

logprob_trans(prev_m, edge_m, edge_o, is_prev_ne=False, is_next_ne=False)[source]

Transition probability.

The probability is defined with a formula from the exponential family. \(P(dt) = exp(-d_t^2 / (2 * dist_{noise}^2))\)

with \(d_t = |d_s - d_o|, d_s = |loc_{prev\_state} - loc_{cur\_state}|, d_o = |loc_{prev\_obs} - loc_{cur\_obs}|\)

This function is more tolerant for low values. The intuition is that values under a certain distance should all be close to probability 1.0.

Note: We should also smooth the distance between observations to handle outliers better.

Parameters:
  • prev_m – Previous matching / state
  • edge_m – Edge between matchings / states
  • edge_o – Edge between observations
  • is_prev_ne – Is previous state non-emitting
  • is_next_ne – Is the next state non-emitting
  • dist_o – First output of distance_progress
  • dist_m – Second output of distance_progress
Returns:

BaseMatching

class leuvenmapmatching.matcher.base.BaseMatching(matcher: leuvenmapmatching.matcher.base.BaseMatcher, edge_m: leuvenmapmatching.util.segment.Segment, edge_o: leuvenmapmatching.util.segment.Segment, logprob=-inf, logprobema=-inf, logprobe=-inf, logprobne=-inf, dist_obs: float = 0.0, obs: int = 0, obs_ne: int = 0, prev: Optional[Set[BaseMatching]] = None, stop: bool = False, length: int = 1, delayed: int = 0, **_kwargs)[source]

Matching object that represents a node in the Viterbi lattice.

Parameters:
  • matcher – Reference to the Matcher used to generate this matching object.
  • edge_m – Segment in the given graph (thus line between two nodes in the graph).
  • edge_o – Segment in the given observations (thus line in between two observations).
  • logprob – Log probability of this matching.
  • logprobema – Exponential Mean Average of Log probability.
  • logprobe – Emitting
  • logprobne – Non-emitting
  • dist_obs – Distance between map point and observation
  • obs – Reference to path entry index (observation)
  • obs_ne – Number of non-emitting states for this observation
  • prev – Previous best matching objects
  • stop – Stop after this matching (e.g. because probability is too low)
  • length – Lenght of current matching sequence through lattice.
  • delayed – This matching is temporarily stopped if >0 (e.g. to first explore better options).
  • dist_m – Distance over graph
  • dist_o – Distance over observations
  • _kwargs
classmethod first(logprob_init, edge_m, edge_o, matcher, dist_obs)[source]

Create an initial lattice Matching object.

key

Key that indicates the node or edge, observation and non-emitting step. This is the unique key that is used in the lattice.

next(edge_m: leuvenmapmatching.util.segment.Segment, edge_o: leuvenmapmatching.util.segment.Segment, obs: int = 0, obs_ne: int = 0)[source]

Create a next lattice Matching object with this Matching object as the previous one in the lattice.

prune_value

Pruning the lattice (e.g. to delay) is based on this key.

shortkey

Key that indicates the node or edge. Irrespective of the current observation.

update(m_next)[source]

Update the current entry if the new matching object for this state is better.

Parameters:m_next – The new matching object representing the same node in the lattice.
Returns:True if the current object is replaced, False otherwise

map

BaseMap

class leuvenmapmatching.map.base.BaseMap(name, use_latlon=True)[source]

Abstract class for a Map.

Simple database wrapper/stub.

all_edges(bb=None)[source]

All edges.

Returns:[(key_a, loc_a, key_b, loc_b)]
all_nodes(bb=None)[source]

All node keys and coordinates.

Returns:[(key, (lat, lon))]
bb()[source]

Bounding box.

Returns:(lat_min, lon_min, lat_max, lon_max)
edges_closeto(loc, max_dist=None, max_elmt=None)[source]

Find edges close to a certain location.

Parameters:
  • loc – Latitude, Longitude
  • max_dist – Maximal distance that returned nodes can be from lat-lon
  • max_elmt – Maximal number of elements returned after sorting according to distance.
Returns:

list[tuple[dist, label, loc]]

edges_nbrto(edge)[source]

Return all edges that are linked to edge.

Defaults to nodes_nbrto.

Parameters:edge – Edge identifier
Returns:list[tuple[label1, label2, loc1, loc2]]
labels()[source]

Labels of all nodes.

node_coordinates(node_key)[source]

Coordinates for given node key.

nodes_closeto(loc, max_dist=None, max_elmt=None)[source]

Find nodes close to a certain location.

Parameters:
  • loc – Latitude, Longitude
  • max_dist – Maximal distance that returned nodes can be from lat-lon
  • max_elmt – Maximal number of elements returned after sorting according to distance.
Returns:

list[tuple[dist, label, loc]]

nodes_nbrto(node)[source]

Return all nodes that are linked to node.

Parameters:node – Node identifier
Returns:list[tuple[label, loc]]
size()[source]

Number of nodes.

InMemMap

class leuvenmapmatching.map.inmem.InMemMap(name, use_latlon=True, use_rtree=False, index_edges=False, crs_lonlat=None, crs_xy=None, graph=None, linked_edges=None, dir=None, deserializing=False)[source]

In-memory representation of a map.

This is a simple database-like object to perform experiments with map matching. For production purposes it is recommended to use your own derived class (e.g. to connect to your database instance).

This class supports:

  • Indexing using rtrees to allow for fast searching of points on the map. When using the rtree index, only integer numbers are allowed as node labels.
  • Serializing to write and read from files.
  • Projecting points to a different frame (e.g. GPS to Lambert)
Parameters:
  • name – Map name (mandatory)
  • use_latlon – The locations represent latitude-longitude pairs, otherwise y-x coordinates are assumed.
  • use_rtree – Build an rtree index to quickly search for locations.
  • index_edges – Build an index for the edges in the map instead of the vertices.
  • crs_lonlat – Coordinate reference system for the latitude-longitude coordinates.
  • crs_xy – Coordiante reference system for the y-x coordinates.
  • graph – Initial graph of form Dict[label, Tuple[Tuple[y,x], List[neighbor]]]]
  • dir – Directory where to serialize to. If given, the rtree index structure will be written to a file immediately.
  • deserializing – Internal variable to indicate that the object is being build from a file.
add_edge(node_a, node_b)[source]

Add new edge to the map.

Parameters:
  • node_a – Label for the node that is the start of the edge
  • node_b – Label for the node that is the end of the edge
add_node(node, loc)[source]

Add new node to the map.

Parameters:
  • node – label
  • loc – (lat, lon) or (y, x)
all_edges(bb=None)[source]

Return all edges.

Parameters:bb – Bounding box
Returns:(key_a, loc_a, nbr, loc_b)
all_nodes(bb=None)[source]

Return all nodes.

Parameters:bb – Bounding box
Returns:
bb()[source]

Bounding box.

Returns:(lat_min, lon_min, lat_max, lon_max) or (y_min, x_min, y_max, x_max)
classmethod deserialize(data)[source]

Create a new instance from a dictionary.

dump()[source]

Serialize map using pickle.

All files will be saved to the dir directory using the name as filename.

edges_closeto(loc, max_dist=None, max_elmt=None)[source]

Return all nodes that are on an edge that is close to the given location.

Parameters:
  • loc – Location
  • max_dist – Maximal distance from the location
  • max_elmt – Return only the most nearby nodes
edges_nbrto(edge)[source]

Return all edges that are linked to edge.

Defaults to nodes_nbrto.

Parameters:edge – Edge identifier
Returns:list[tuple[label1, label2, loc1, loc2]]
find_duplicates(func=None)[source]

Find entries with identical locations.

classmethod from_pickle(filename)[source]

Deserialize map using pickle to the given filename.

labels()[source]

All labels.

node_coordinates(node_key)[source]

Get the coordinates of the given node.

Parameters:node_key – Node label/key
Returns:(lat, lon)
nodes_closeto(loc, max_dist=None, max_elmt=None)[source]

Return all nodes close to the given location.

Parameters:
  • loc – Location
  • max_dist – Maximal distance from the location
  • max_elmt – Return only the most nearby nodes
nodes_nbrto(node)[source]

Return all nodes that are linked to node.

Parameters:node – Node identifier
Returns:list[tuple[label, loc]]
serialize()[source]

Create a serializable data structure.

size()[source]

Number of nodes.

to_xy(name=None, use_rtree=None)[source]

Create a map that uses a projected XY representation on which Euclidean distances can be used.

SqliteMap

class leuvenmapmatching.map.sqlite.SqliteMap(name, use_latlon=True, crs_lonlat=None, crs_xy=None, dir=None, deserializing=False)[source]

Store a map as a SQLite instance.

This class supports:

  • Indexing using rtrees to allow for fast searching of points on the map. When using the rtree index, only integer numbers are allowed as node labels.
  • Serializing to write and read from files.
  • Projecting points to a different frame (e.g. GPS to Lambert)
Parameters:
  • name – Name of database file
  • use_latlon – The locations represent latitude-longitude pairs, otherwise y-x coordinates are assumed.
  • crs_lonlat – Coordinate reference system for the latitude-longitude coordinates.
  • crs_xy – Coordiante reference system for the y-x coordinates.
  • dir – Directory where to serialize to. If not given, a temporary location will be used.
  • deserializing – Internal variable to indicate that the object is being build from a file.
add_edge(node_a, node_b, loc_a=None, loc_b=None, speed=None, edge_type=None, path=None, pathnum=None, no_index=False, no_commit=False)[source]

Add new edge to the map.

Parameters:
  • node_a – Label for the node that is the start of the edge
  • node_b – Label for the node that is the end of the edge
  • no_commit – Do not commit to database (remember to commit later)
add_edges(edges, no_index=False)[source]

Add list of nodes to database.

Parameters:edges – List[Tuple[node_key, node_key]] or List[Tuple[node_key, node_key, path_key, int]]
add_node(node, loc, ignore_doubles=False, no_index=False, no_commit=False)[source]

Add new node to the map.

Parameters:
  • node – label
  • loc – (lat, lon) or (y, x)
  • ignore_doubles – When trying to add the same node, ignore it
  • no_commit – Do not commit to database (remember to commit later)
add_nodes(nodes)[source]

Add list of nodes to database.

Parameters:nodes – List[Tuple[node_key, Tuple[lat, lon]]]
all_edges(bb=None)[source]

Return all edges.

Parameters:bb – Bounding box
Returns:(key_a, loc_a, nbr, loc_b)
all_nodes(bb=None)[source]

Return all nodes.

Parameters:bb – Bounding box (minY, minX, maxY, maxX)
Returns:
bb()[source]

Bounding box.

Returns:(lat_min, lon_min, lat_max, lon_max) or (y_min, x_min, y_max, x_max)
edges_closeto(loc, max_dist=None, max_elmt=None)[source]

Return all nodes that are on an edge that is close to the given location.

Parameters:
  • loc – Location
  • max_dist – Maximal distance from the location
  • max_elmt – Return only the most nearby nodes
edges_nbrto(edge)[source]

Return all edges that are linked to edge.

Defaults to nodes_nbrto.

Parameters:edge – Edge identifier
Returns:list[tuple[label1, label2, loc1, loc2]]
find_duplicates(func=None)[source]

Find entries with identical locations.

classmethod from_file(filename)[source]

Read from an existing file.

labels()[source]

All labels.

node_coordinates(node_key)[source]

Get the coordinates of the given node.

Parameters:node_key – Node label/key
Returns:(lat, lon)
nodes_closeto(loc, max_dist=None, max_elmt=None)[source]

Return all nodes close to the given location.

Parameters:
  • loc – Location
  • max_dist – Maximal distance from the location
  • max_elmt – Return only the most nearby nodes
nodes_nbrto(node)[source]

Return all nodes that are linked to node.

Parameters:node – Node identifier
Returns:list[tuple[label, loc]]
size()[source]

Number of nodes.

to_xy(name=None)[source]

Create a map that uses a projected XY representation on which Euclidean distances can be used.

util

Segment

class leuvenmapmatching.util.segment.Segment(l1, p1, l2=None, p2=None, pi=None, ti=None)[source]

Segment in the graph and its interpolated point.

Create a new segment.

Parameters:
  • l1 – Label of the node that is the start of the segment.
  • p1 – Point (coordinate) of the start node.
  • l2 – Label of the node that is the end of the segment.
  • p2 – Point (coordinate) of the end node.
  • pi – Interpolated point. The point that is the best match and can be in between p1 and p2.
  • ti – Position of pi on the segment [p1,p2], thus pi = p1+t1*(p2-p1).

Indices and tables