Documente Academic
Documente Profesional
Documente Cultură
Hannah Bast
Patrick Brosi
Sabine Storandt
University of Freiburg
79110 Freiburg, Germany
geOps
Kaiser-Joseph-Str. 263
79098 Freiburg, Germany
University of Freiburg
79110 Freiburg, Germany
bast@informatik.unifreiburg.de
patrick.brosi@geops.de
storandt@informatik.unifreiburg.de
ABSTRACT
1.
INTRODUCTION
General Terms
Algorithms, Data Structures, Visualization
Keywords
Public Transit Network, Spatio-Temporal Grid, Real-Time
Visualization
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear
this notice and the full citation on the first page. Copyrights for components
of this work owned by others than ACM must be honored. Abstracting with
credit is permitted. To copy otherwise, or republish, to post on servers or to
redistribute to lists, requires prior specific permission and/or a fee. Request
permissions from Permissions@acm.org.
SIGSPATIAL 14, November 04 - 07 2014, Dallas/Fort Worth, TX, USA
Copyright 2014 ACM 978-1-4503-3131-9/14/11...$15.00
http://dx.doi.org/10.1145/2666310.2666404
http://transportation.stanford.edu/marguerite/
http://www.uofubus.com/
3
swisstrains.ch
4
http://s-bahn-muenchen.hafas.de
5
http://bahn.de/zugradar
6
http://zugradar.oebb.at
7
http://traintimes.org.uk/map
2
1.1
Related Work
2.
8
9
http://www.flightradar24.com/
http://www.marinetraffic.com/de/
2.1
2.2
Interpolated Schedules
Figure
1:
Map
zoomed in on New
York early in the
morning.
The high
density of public transit vehicles (buses,
light rail, subways,
ferries) displayed
by our system as
coloured
circles
2.3
We would like to have the best of both worlds: coverage, fall-back, complete schedule information, compact data
and real-time information. We achieve this by using a combined approach. The static timetable data provides all of
the basic information, like station locations, envisioned arrival and departure times of vehicles at stations, sequences
of stations and so on. By interpolating this data we can
continuously compute the positions for all relevant vehicles.
If live delay information is available, we will incorporate it
by updating the respective vehicle positions in a suitable
way. This approach adds minimal additional traffic (only a
delay value for a time point) and falls back seamlessly to
the static schedule if no real-time information is available.
There exists an extension to GTFS, called GTFS-realtime,
which was developed to specify delay informations in a neat
manner. Unfortunately, the coverage of GTFS-realtime is
still far behind the general GTFS coverage, but for some
countries, like the Netherlands, such information is already
available nationwide.
Even with this combined approach the amount of data that
has to be managed by the server is huge. Therefore an ef-
Server
Client
draws vehicles
fires requests
provides user interaction
bounding box requests
3.
CLIENT/SERVER INFRASTRUCTURE
We will focus now on a client/server architecture similar to the one sketched in Figure 2. The server manages
transit data, receives requests and outputs information that
enables the client to display vehicles on the screen. A client
can either be a web application, a desktop application, a
smartphone app, or something completely different. In this
section, we present the two most common interface designs
currently used in transit visualization maps.
3.1
Periodic Updates
3.2
4.
structure, which allows to extract all relevant trajectories inside a spatio-temporal bounding box efficiently. Finally, we
point out how live delay information can be incorporated
into our model.
4.1
Vehicle Trajectories
tcur t
t0 t
4.2
20
20
20
20
:
:
:
:
bus
streetcar, cable car, funicular
subway, ferry
rail
We could use multiple R-trees for the different zoom levels. But then we would have to traverse multiple R-tree in
order to display vehicles from multiple levels. We therefore
propose another data structure, which allows for spatial and
temporal indexing, and an easy hierarchical representation.
Multi-Layer Spatio-Temporal Grids. A classical grid
divides the spatial plane in grid cells of equal size. It can
simply be stored as a two-dimensional array. A trajectory
is then assigned to all grid cells it traverses. To model the
hierarchy of transportation modes, we use a multi-layer grid.
Like an R-tree, it groups multiple bounding rectangles into
a bigger rectangle on the next level. Figure 3 gives an example of such a multi-layer grid. Each layer is visible on a
certain zoom level. The side lengths of a cell on level i are
two times the lengths of a cell on level i + 1. This resembles the way tiles are generated for web map services like
OpenStreetMap or Google Maps, which will turn out to be
beneficial for the visualization by the client. More complicated grid-based data structures as described in [1] produce
different sized cells on the same hierarchical level and are
therefore not as easy applicable here. In Figure 3, a rectangle request for zoom level 15 is highlighted along with the
grid cells that have to be checked for trajectories traversing
the rectangle. The obvious disadvantage of this approach is
that some trajectories have to be indexed for multiple cells.
zreq
Bmin (T1 )
T2
T1
4.3
4.4
5.
5.1
0
0s
0s
13:10
13:11
1
0s
0s
13:17
13:19
2
60s
120s
13:24
13:25
3
180s
180s
13:40
13:45
4
240s
120s
13:51
13:52
5
60s
60s
13:58
13:59
FURTHER APPLICATIONS
Combination with Route Planning
5.2
5.3
box area
66.84
1,952
98,965
2.5107
108
105
lr
102
1.5
101
number of Tpot
#arr/dep
122,184
2,660,027
11,665,443
12,221,953
76,287,281
#stops
338
5,357
34,948
73,293
298,535
#|T |
6,041
147,556
300,417
548,007
2,507,566
GTFS feed
Vitoria-Gasteiz
Budapest
New York Area
Netherlands
Combined feed
(22 GTFS feeds)
0.5
6.
EXPERIMENTAL EVALUATION
6.1
Grid-Layer Construction
6.2
Server Performance
We tested the server performance by running several spatiotemporal queries against the datasets described above. Each
100
0
0.5
1
1.5
grid size l
2.5
106
Figure 6: Effects of spatial grid size l. We measured the number of trajectory index items and the
number of output potential trajectories Tpot inside
a spatio-temporal bounding box with side length
500.000 (ca. 45 km). Data is taken from the complete Netherlands GTFS.
query requested the partial trajectories for the next 60 minutes and was run twice: once in the morning rush hour with
tb = 8:00:00 and once in the late evening with tb = 23:00:00.
In a first step, we did a total area scan that requested the
partial trajectories of the entire dataset (z = 20). We then
proceeded with a spatio-temporal bounding box request that
resembles the requests fired by real clients. The request box
with area A 10 km2 was hand-picked from the center of
the dataset. The request zoom level was z = 20. A third
request covered an area of about 60,000 km2 at zoom level
z = 9.
All of these queries were answered using two different approaches. First, we ran the test using a naive approach
where trajectories were stored as an unsorted list. We then
did the same request on a layered grid with base-cell side
length l = 500,000 (approximately 40 km in central Europe.)
Note that even in the naive approach, we implemented various simple optimizations. Consider, for example, a query
that requests all partial trajectories between 8:00 and 9:00.
If the naive approach finds a trajectory that leaves the first
station at 9:20 and arrives at the last station at 10:50, the
trajectory is skipped.
We measured the total query time, the number of output
partial trajectories and the number of affected trajectories.
We say a trajectory T is affected during a query if a nontrivial computation has to be executed on T . Non-trivial
computations include, for example, clipping and a call to
the trajectorys activity function. Calls to getter functions
are considered trivial. A good measurement for the scalability of an approach is the ratio of affected trajectories and
partial trajectories in the output. If, for example, the server
outputs 500 partial trajectories and only had to look at 700
trajectories at all, we consider this a good result.
For small networks like Vitoria-Gasteiz, the naive approach
yields reasonable good results. However, even for this small
dataset, request times are 5 to 20 times higher than for the
grid layer approach. Despite the fact that Vitoria-Gasteiz
does not have any vehicles that are displayed at zoom level
9, the naive approach still has to check 6,000 trajectories for
the z = 9 box request. For bigger datasets like Budapest,
box requests at ground level are still 20 times faster with
Naive
8am
11pm
Grid
8am
11pm
Naive
8am
11pm
500
3.3 k
147.6 k
1.9 k
480
1.6 k
147.6 k
1.9 k
100
3.3 k
3.8 k
1.9 k
50
1.6 k
1.8 k
1.9 k
1.3
11.5
300
98
Box request
time in ms
#partial trajectories
#affected trajectories
request area in km2
457
1k
147.6 k
9.9
460
434
147.6 k
9.9
27
1k
1.5 k
9.9
12
434
628
9.9
Box request
time in ms
#partial trajectories
#affected trajectories
request area in km2
328
73
147.6 k
60.2 k
326
39
147.6 k
60.2 k
1.8
73
73
60.2 k
<1
39
39
60.2 k
the grid layer than with the naive approach, see Table 3.
In Table 4, the z = 9 box request for the New York City
area at 11pm yields 161 partial trajectories. To output this
(small) number of T par , the naive approach has to look at
300,000 trajectories, while the grid only has to look at 227.
Similar results can be seen in Table 5 for the Netherlands
feed.
To the best of our knowledge, the Netherlands GTFS is
(by far) the biggest transit feed available. It can be considered as complete, meaning that every public transit vehicle in the country is included with its full polyline. Still,
despite the fact that the total number of trajectory vertices
is 16 times as high as in the Budapest feed (the number of
arrival/departure events is nearly 5 times as high), the time
to output about 1,000 partial trajectories at ground level in
the city center of Amsterdam is only slightly bigger than
the time to output about 1,000 T par in the city center of
Budapest. The higher time for the box request in Amsterdam can be explained by the higher network density in the
city and mainly because of the higher vertex density in the
Netherlands feed. Note that the z = 9 box requests in Table 5, which nearly covers the whole country of the Netherlands, only takes 23 ms during the morning rush hour. At
this zoom level, only trains are returned by the server.
To show that our back-end has the potential to handle the
public transit network of the whole world, we ran the queries
against a dataset consisting of 22 feeds from around the
world. These feeds include three entire countries (Switzerland, Sweden and the Netherlands). The other feeds are:
public transit in the cities of Albuquerque, Boston, Los Angeles, Miami, San Francisco, Portland, Chicago (all USA),
Quebec, Montreal (both Canada), Manchester (UK), Budapest (Hungary), Rennes (France), Turin (Italy), VitoriaGasteiz (Spain), Auckland, Wellington (both New Zealand),
Adelaide (Australia) and the public transit in the areas of
Freiburg (Germany) and New York (USA). The parameters
of this combined feed are listed in Table 2.
Table 6 shows that even when run against the combined
dataset, the grid layer approach is still able the handle requests very fast. For the box requests, computation times
are nearly the same (1 ms) as for the Netherlands. This
k
k
k
k
1.1
3.6
300
98
Grid
8am
11pm
k
k
k
k
478
11.5 k
17.2 k
98 k
155
3.6 k
3.8 k
98 k
1.1 k
1.1 k
300 k
10.7
1k
353
300 k
10.7
87
1.1 k
3.4 k
10.7
26
353
783
10.7
676
485
300 k
60.8 k
670
161
300 k
60.8 k
13
485
559
60.8 k
5
161
227
60.8 k
Naive
8am
11pm
Grid
8am 11pm
2.1 k
11.5 k
548 k
25 M
1.9 k
5.1 k
548 k
25 M
706
11.5 k
18.1 k
25 M
310
5.1 k
8.2 k
25 M
Box request
time in ms
#partial trajectories
#affected trajectories
request area in km2
1.82 k
911
548 k
10.7
1.82 k
556
548 k
10.7
51
911
2k
10.7
33
556
1.2 k
10.7
1.28 k
707
548 k
60 k
1.28 k
451
548 k
60 k
23
707
986
60 k
14
451
533
60 k
Grid
8am
11pm
8k
40.5 k
2,5 M
9k
1.3 k
40.3 k 40.5 k
2,5 M 67.7 k
100 M
1.7 k
40.3 k
67.1 k
Box request
time in ms
#partial trajectories
#affected trajectories
request area in km2
7.5 k
911
2,5 M
8.2 k
51
556
911
2,5 M
2k
10.7
33
556
1.2 k
Box request z = 9
time in ms
#partial trajectories
#affected trajectories
request area in km2
5.9
707
2,5 M
5.9
23
451
707
2,5 M
986
60 k
15
451
533
proves that loading more feeds in our system does not affect
the runtime of spatio-temporal queries covered by a single
feed. With that observation, we can conclude that our implementation (given enough memory) can provide real-time
vehicle movements all around the whole world efficiently. In
combination with a client that visualizes this movements in
an appealing manner, we now have a framework for a fully
functional world-wide live public transit map.
We encourage the reader to visit our live demo at http:
//tracker.geops.ch . At the time of this writing, over 80
feeds have been incorporated, which is even more than in
our experiments.
7.
8.
REFERENCES