
2018 IEEE 16th Int. Conf. on Dependable, Autonomic & Secure Comp., 16th Int. Conf. on Pervasive Intelligence & Comp., 4th Int. Conf. on Big Data Intelligence & Comp., and 3rd Cyber Sci. & Tech. Cong.

Data-Driven Travel Itinerary with Branch and Bound Algorithm

Du Jiaoman
Graduate School of Engineering, Hosei University, Tokyo, Japan 184–8584
Email: du.jiaoman.23@stu.hosei.ac.jp

Li Lei
Faculty of Science and Engineering, Hosei University, Tokyo, Japan 184–8584
Email: lilei@hosei.ac.jp

Li Xiang
School of Economics and Management, Beijing University of Chemical Technology, Beijing, China 100029
Email: lixiang@mail.buct.edu.cn

Abstract—In this paper, we study a novel self-driving travel planning problem, where the tourist aims to minimize the total cost. The idea is to use a mathematical model to plan a route-time scheme for travel spots and hotels. Specifically, this planning determines the tour for travel spots, considers the hotel selection under the rest break constraint, and schedules the routes and time arrangement for the trip. Meanwhile, based on real-time and multi-resource demand, we use multi-resource data extracted from multiple websites. We use two algorithms to solve the proposed problem and compare them: one is an exact branch and bound scheme and the other is a branch and bound based heuristic algorithm. In the proposed heuristic algorithm, the travel spots in the problem are decomposed by the K-means algorithm, then each group of travel spots is bounded by the greedy algorithm and the Hungarian method for the upper bound and lower bound, respectively. Each branch node branches using the Hungarian method, and each branch can be treated as an assignment problem solved by the Hungarian method. Finally, we give numerical examples and discuss the results.

Index Terms—Travel Itinerary Problem, Branch and Bound Algorithm, Data-Driven.

978-1-5386-7518-2/18/$31.00 ©2018 IEEE
DOI 10.1109/DASC/PiCom/DataCom/CyberSciTec.2018.00149
Authorized licensed use limited to: Middlesex University. Downloaded on August 31,2020 at 21:05:38 UTC from IEEE Xplore. Restrictions apply.

1. INTRODUCTION

Travel plays an important role in modern transportation systems, thanks to the improvement of life quality. A travel itinerary is a necessary part of travel preparation. Before the trip, tourists need to choose travel spots and hotels and plan routes. Nowadays, a large number of travel websites such as Airbnb, TripAdvisor, and Agoda can provide the location and reservation cost of hotels, travel time, and recommended viewing time. With the integrated application of multi-resource data in the self-driving travel itinerary problem (STIP), travel has become freer and more customized. It has become easier to plan a trip in the data age.

This study introduces a mathematical model for data-driven travel planning and exploits a branch-and-bound based heuristic for self-driving travel itinerary planning. The model is based on the classic traveling salesman problem (TSP) [1], [2], which has no hotel choice or time arrangement under realistic constraints, whereas the STIP has both. Similar to the TSP, each city is visited exactly once. By contrast, in the STIP hotels are chosen to stay in for a rest. Three kinds of decisions are made in the STIP: hotel choice, the tour arrangement of travel spots and hotels, and the time arrangement of the ongoing travel. First, the tour of travel spots is determined; the next step is to determine the hotel choice and travel routes under the rest break constraint. The objective is to minimize the total cost.

Data in the travel field, enabled by the rapid development of travel behavior [3], [6], travel demand [4], [5], and travel choice [7] in the past few years, is often collected from a variety of sources. In addition, the real-time nature of the data can enhance the model's accuracy and provide better service for tourists. Therefore, in this study, a mathematical optimization model for self-driving travel planning that integrates multiple sources of real-time data is developed to better match the real world.

The TSP, introduced by [1], is the problem of finding a tour in which each city is visited exactly once by a tourist so as to minimize the total travel cost. Several variations of the TSP that originated from various real-life or potential applications have been studied, such as the bottleneck TSP [8], the m-salesmen TSP [9], the time-dependent TSP [10], and the TSP with time windows [11]. However, these problems consider only route planning, which is too simple for many real-life applications. Typically, in the STIP, the hotel choice under the rest break constraint must be considered during the trip, since it affects the efficiency and satisfaction of a trip. The rest break constraint needs to be taken into account when choosing hotels. In this paper, the hotel choice under the rest break constraint is studied.

A travel itinerary is a plan for travel events, such as sightseeing spots, flights, and hotel timetable arrangements, in chronological order of date and time. When a trip uses public transportation, e.g., airplane, train, or bus, the tourist needs to plan the schedule according to the mode of public transportation. Nowadays, because it is free and convenient, self-driving travel has drawn more and more tourists' attention. A self-driving travel itinerary consists of a route with one or more stops that a tourist takes. These destinations are visited one after another sequentially in time by self-driving. Traditional travel decision-making problems use route choice behavior analysis [16], travel forecasting [12], air-travel itinerary share [13], intersection movement-based dynamic route choice [14], and walk-ride itinerary optimization [15], among others. However, to our knowledge, no research on data-driven self-driving travel itinerary problems formulated as mathematical programming with hotel choice has been presented in previous studies.

The main contributions of this study are the following. We introduce the STIP with hotel choice under the rest break constraint. It is a schedule plan provided for tourists, which involves hotel choice, routing decisions, and time arrangement for each visited destination. We offer a valuable new model for data-driven travel planning research. We develop a branch-and-bound based heuristic to solve this problem. Finally, we show the results of numerical experiments to provide insight into the performance of the model and algorithm.

The remainder of the paper is organized as follows. Section 2 provides an overview of the STIP. The mathematical model is shown in Section 3, followed by the heuristic algorithm in Section 4. Computational results are presented in Section 5. Concluding remarks are presented in Section 6.

2. SELF-DRIVING TRAVEL ITINERARY PROBLEM

2.1. Data Description in Travel Planning

Multi-resource data in the travel field comes not from a single source but from many. The advantages of multi-resource data involve direct and indirect aspects: an accurate and reasonable travel scheme is the direct benefit, and the indirect benefit is realized through the enhancement of travel planning modeling during model development.

There are three kinds of costs in self-driving trip planning: entrance tickets for travel spots, reservation cost for hotels, and driving cost. The costs differ across travel spots, hotels, and paths. Even for the same travel spot, prices may vary according to seasonality, age, and occupation. Hotel prices vary more irregularly, where the impact factors include demand, seasonality, rest days, and so on. The driving cost depends on the distance of the route. Furthermore, the recommended viewing time differs significantly between travel spots. The above information shares a common characteristic, which is variability. Therefore, real-time information can provide precise planning. Meanwhile, because a single website cannot provide all the needed data, the data need to be collected from multiple source websites. Data-driven travel planning can offer valuable real-time data from many sources to make a satisfactory travel plan. The above cost and location information can be obtained from multiple travel websites: Google Map¹, TripAdvisor², and Tuniu³.

1. https://www.google.co.jp/maps
2. https://www.tripadvisor.com/
3. http://www.tuniu.com/

2.2. Background

In this study, the itinerary planning problem for the STIP includes the following categories of decisions: (i) identification of the hotel choice; (ii) determination of the tour (the sequence of stops (travel spots, hotels) visited by tourists); (iii) specification of the routing and time arrangement for each section of the itinerary. The latter two decisions are closely dependent on the former, since for each given sequence of stops, the time and the number of visited spots each day are different, and the total cost is different. Because the STIP concerns travel from the origin to places of interest by self-driving, there is no choice of transport mode.

The STIP is defined on a road graph (N, A), where N is the set of nodes (e.g., travel spots, hotels, and the origin) and A is the set of links connecting any two nodes in the road graph (N, A). In the following, we assume that there are multiple travel spots and that they are surrounded by many hotels. The tourist visits each travel spot and returns to the origin. In this process, the tourist finds suitable hotels to stay in so as to satisfy the rest break constraint. The origin and destination are the same in this paper. Assume that the costs of travel spots and hotels and the recommended viewing times are known in advance; they can be obtained from multi-resource data websites. Likewise, there is a transportation cost between any two nodes. Visitors, having a regular rest time, can find a hotel near the travel spot to guarantee their rest time. They can rest at the hotel at night and depart from it the next day.

[Figure 1: network diagram with nodes 1 to 5. Legend: Origin, Travel Spot, Hotel, Selected Hotel, Arrival/Departure Time.]
Fig. 1. Modeling network for self-driving travel itinerary problem

Fig. 1 illustrates the specific modeling network, which includes 1 origin, 4 travel spots, and multiple hotels around the travel spots. The numbers written near each node denote time ranges, namely the arrival time and departure time of the node. Each link has two attributes, namely the transportation cost and the arrival and departure time. The visitors' expected rest time at night is before 20:00. The travel itinerary in Fig. 1 is as follows: A visitor departs from the origin at 08:00 and takes 9 hours to drive to spot 2. It takes 2 hours to finish viewing spot 2, and the visitor completes viewing at
19:00, which is before the rest time. However, if he/she chooses to continue traveling, the visitor will arrive at spot 3 at 21:00, which violates the rest time. The visitor therefore selects a satisfactory hotel near spot 2 to stay in and departs from it the next day. The above description is only a small part of the whole trip. The modeling network aims to illustrate the processes of hotel choice, routing strategy, and time arrangement.

3. MODEL

We formulate the self-driving travel itinerary problem involving the travel tour, hotel selection, routing strategy for spots and hotels, and time arrangement under the rest break constraint.

Notation systems

$I$: number of travel spots
$i$: travel spot index, $i = 1, 2, \dots, I$
$D$: travel days
$d$: travel day index, $d = 1, 2, \dots, D$
$p_i^{\varepsilon}$: ticket price of spot $i$; the binary indicator $\varepsilon = 1$ denotes boom season, while $\varepsilon = 0$ denotes slack season
$H_i$: number of available hotels around spot $i$, $i = 1, 2, \dots, I$
$h_i$: index of available hotels around spot $i$, $h_i = 1, 2, \dots, H_i$, $i = 1, 2, \dots, I$
$x_{id}$: decision variable that takes value 1 if spot $i$ is visited on the $d$th day and 0 otherwise, $i = 1, 2, \dots, I$, $d = 1, 2, \dots, D$
$c_{i h_i}$: reservation cost of hotel $h_i$ around spot $i$, $h_i = 1, 2, \dots, H_i$, $i = 1, 2, \dots, I$
$y_{i h_i d}$: decision variable that takes value 1 if hotel $h_i$ around spot $i$ is selected on the $d$th day and 0 otherwise, $h_i = 1, 2, \dots, H_i$, $i = 1, 2, \dots, I$, $d = 1, 2, \dots, D$
$G$: set of travel spots and selected hotels as well as the origin node
$\rho_{abd}$: travel cost from $a$ to $b$ on the $d$th day, $a, b \in G$, $d = 1, 2, \dots, D$
$z_{abd}$: decision variable that takes value 1 if the arc from $a$ to $b$ is active, with $a, b \in G$, $a \neq b$, and 0 otherwise, $d = 1, 2, \dots, D$
$M_d$: number of visited travel spots on the $d$th day
$m_{id}$: the order index in which travel spot $i$ is visited on the $d$th day, $m_{id} = 1, 2, \dots, M_d$, $i = 1, 2, \dots, I$, $d = 1, 2, \dots, D$
$Time_d$: total travel time on the $d$th day, $d = 1, 2, \dots, D$
$T_d^{s}$: travel start time on the $d$th day, $d = 1, 2, \dots, D$
$TE_a$: departure time at node $a$, $a \in G$
$t_{ab}$: travel time from node $a$ to $b$, $a, b \in G$
$S_a, S_b$: the stay time at node $a$ and $b$, $a, b \in G$
$Q_d$: resting break time on the $d$th day, $d = 1, 2, \dots, D$

The model is

\[
\min \; \sum_{d=1}^{D}\sum_{i=1}^{I} p_i^{\varepsilon} x_{id}
 + \sum_{d=1}^{D}\sum_{i=1}^{I}\sum_{h_i=1}^{H_i} c_{i h_i}\, y_{i h_i d}
 + \sum_{d=1}^{D}\sum_{a,b \in G} \rho_{abd}\, z_{abd}
\]

subject to

\begin{align*}
& \sum_{d=1}^{D} x_{id} = 1, && i = 1, 2, \dots, I \tag{1}\\
& \sum_{d=1}^{D}\sum_{h_i=1}^{H_i} y_{i h_i d} = 1, && i \in N \tag{2}\\
& \sum_{d=1}^{D}\sum_{h_i=1}^{H_i} y_{i h_i d} = 0, && i \notin N \tag{3}\\
& \sum_{d=1}^{D} z_{i h_i d} + \sum_{d=1}^{D} z_{h_i i d} = 1, && (h_i, i) \in (E, F) \tag{4}\\
& \sum_{d=1}^{D}\sum_{b \in G,\, b \neq a} z_{abd} = 1, && a \in G \tag{5}\\
& \sum_{d=1}^{D}\sum_{a \in G,\, a \neq b} z_{abd} = 1, && b \in G \tag{6}\\
& \sum_{d=1}^{D}\sum_{a,b \in S,\, a \neq b} z_{abd} \le |S| - 1, && \forall S \subset G,\ 2 \le |S| \le |G| - 1 \tag{7}\\
& T_d^{s} + Time_d \le Q_d, && d = 1, 2, \dots, D \tag{8}\\
& z_{abd}\,(TE_a + t_{ab} + S_b - TE_b) = 0, && a, b \in G,\ d = 1, 2, \dots, D \tag{9}\\
& x_{id} \in \{0, 1\}, && i = 1, 2, \dots, I,\ d = 1, 2, \dots, D \tag{10}\\
& y_{i h_i d} \in \{0, 1\}, && h_i = 1, 2, \dots, H_i,\ i = 1, 2, \dots, I,\ d = 1, 2, \dots, D \tag{11}\\
& z_{abd} \in \{0, 1\}, && a, b \in G,\ d = 1, 2, \dots, D \tag{12}
\end{align*}

where

\begin{align*}
N &= \{\, i \mid m_{id} = M_d,\ y_{id} = 1,\ d = 1, 2, \dots, D \,\},\\
(E, F) &= \{\, (h_i, i) \mid y_{i h_i d} = 1,\ d = 1, 2, \dots, D \,\},\\
Time_d &= \max\Big\{ \sum_{a,b \in G,\, a \neq b} t_{ab}\, z_{abd} + \sum_{a \in G,\, a \notin E} S_a \Big\}, \quad d = 1, 2, \dots, D.
\end{align*}

The objective of the above model is to minimize the total cost of the trip, where the first term represents the entrance tickets for travel spots, the second term is the reservation cost for hotels, and the third is the driving cost. Constraint (1) ensures that each travel spot is chosen exactly once during the trip. Constraints (2)-(3) determine whether a hotel around travel spot $i$ is chosen to stay in. Constraint (4) expresses that the route between travel spot $i$ and the hotel selected by $y_{i h_i d}$ is used. Constraints (5)-(7) are the general constraints of the TSP. The rest break constraint is guaranteed by Constraint (8). Constraint (9) ensures time continuity each day. Constraints (10)-(12) define the domains of the variables.

4. ALGORITHM

We describe the branch and bound based heuristic algorithm for the travel itinerary problem. Our algorithm decomposes the problem into three parts: partition the large-scale network into small-scale groups of travel spots by the

K-means algorithm; solve the traveling salesman problem for the tour of each group of travel spots by the branch and bound scheme; and select the hotel locations by an efficient constraint inspection procedure. Meanwhile, in the branch and bound scheme, we use the greedy algorithm to obtain the upper bound, and the lower bound is computed from the assignment problem solved by the Hungarian algorithm. Fig. 2 shows a diagram of the STIP architecture, which has three layers. The first layer is the data layer, including the tourist's preferred travel spots and related data from multiple travel websites, such as location, recommended viewing time, etc. The second layer provides the itinerary planning process by integrated algorithms. The third layer is the itinerary recommendation system. In the following sections, we describe the K-means algorithm, the Hungarian algorithm, and the branch and bound scheme.

[Figure 2: three-layer diagram. Database layer: travel websites (TripAdvisor, Google Map, Airbnb), tourist spots choice, tourist spots data. Itinerary planning: K-means cluster (tourist spots 1, ..., k, ..., n); branch-and-bound algorithm (upper bound: greedy algorithm; lower bound: Hungarian algorithm; branch: breadth search strategy); constraint inspection procedure (hotel choice, time arrangement). Itinerary recommendation layer: users.]
Fig. 2. System architecture of STIP

4.1. K-means Algorithm

Given a set of numeric objects $R = \{R_1, R_2, \dots, R_\eta\}$ and an integer $k$ ($\le \eta$), the K-means algorithm searches for a partition of $R$ into $k$ clusters that minimizes the sum of squared errors within groups. This process is often formulated as the following mathematical program [17], [18]:

\begin{align*}
\min \quad & \sum_{u=1}^{k}\sum_{r=1}^{\eta} w_{ur}\, d(R_r, L_u)\\
\text{s.t.} \quad & \sum_{u=1}^{k} w_{ur} = 1, && 1 \le r \le \eta\\
& w_{ur} \in \{0, 1\}, && 1 \le r \le \eta,\ 1 \le u \le k
\end{align*}

where $d(\cdot, \cdot)$ is the squared Euclidean distance between two objects; $L_u$ denotes the center of cluster $u$; and $w_{ur} = 1$ if $R_r$ belongs to cluster $u$ and 0 otherwise. The detailed process of the K-means algorithm for the STIP is as follows.
(1) Each spot $i$ is initially assigned to its closest cluster.
(2) Each cluster center $L_u$ is updated to be the mean of its constituent spots.
(3) The algorithm converges when there is no further change in the assignment of spots to clusters.

4.2. Hungarian Algorithm

The Hungarian method [19], [20] is an algorithm which finds an optimal assignment for a given cost matrix, that is, it solves the assignment problem. In this section, we employ the Hungarian method to compute the lower bound of the STIP. Assume the $n \times n$ distance matrix is known in advance. The solution procedure is as follows.
Step 1. Subtract the smallest element in each row of the distance matrix from all the elements of that row.
Step 2. Subtract the smallest element in each column from all the elements of that column.
Step 3. Draw lines through appropriate rows and columns so that all the zero elements of the distance matrix are covered and the minimum number of such lines is used.
Step 4. Test optimality: (i) If the minimum number of covering lines is $n$, an optimal assignment of zeros is possible and the optimal assignment is obtained. (ii) If the minimum number of covering lines is less than $n$, an optimal assignment of zeros is not yet possible. In that case, proceed to Step 5.
Step 5. Determine the smallest element not covered by any line. Subtract this element from each uncovered row, and then add it to each covered column. Return to Step 3.

4.3. Branch and Bound Algorithm

The branch and bound algorithm [21], [22] comes in many shapes and forms and covers a large class of algorithms for solving hard optimization problems optimally. The scheme works as follows. In each step, the set of all possible solutions is split into two or more subsets, which are represented by branches in a decision tree. Including or excluding an edge is the common criterion for splitting into subsets. For each subset, a lower bound on the length of the tour is calculated and compared with some previously computed upper bound. A branch whose lower bound exceeds the known upper bound is discarded; in this way, an entire branch of the tree can be discarded. Eventually, a subset is found which contains a single tour whose length is less than or equal to some lower bound for every tour.

The main body of the branch and bound algorithm in this study has three steps: (i) determination of the upper bound using the greedy algorithm; (ii) determination of the lower bound according to the assignment problem solved by the Hungarian algorithm; (iii) the branch and bound scheme. Besides that, a constraint inspection procedure is embedded into the solving algorithm to ensure the integrity of the generated

travel scheme. In what follows, we present a formal statement of the above solution methods.

4.3.1. Upper Bound. In this algorithm, we use a greedy algorithm to find the upper bound. The main idea behind a greedy algorithm is local optimization. That is, the algorithm picks what seems to be the best thing to do at the particular time, instead of considering the global situation. Hence it is called "greedy". The process by which the greedy algorithm determines the upper bound is as follows: Assume there are n travel spots that tourists want to visit. Pick an arbitrary travel spot and call it travel spot 1. Find the spot with the smallest distance from spot 1, and call it spot 2. Find the spot among the remaining n-2 spots with the smallest distance from spot 2. Repeat the above steps until a complete tour is found.

4.3.2. Lower Bound. The branch and bound scheme is an exact method which solves the traveling salesman problem as a state-space search. The lower bound represents the smallest cost sum in the assignment problem. However, the assignment solution does not have to be a complete tour in the traveling salesman problem. If this lower bound is higher than the upper bound, the node may be pruned.

In this study, we use the assignment problem as a lower-bound cost function. In the assignment problem, each travel spot i is assigned to another travel spot j, with c_ij as the cost of the assignment, and the total assignment cost is minimized. Because the solution obtained from the assignment problem need not form a complete tour, the assignment problem is a relaxation of the traveling salesman problem. In our study, the initial lower bound is given by an assignment problem (AP), which is solved by the Hungarian algorithm. The solution method is presented in Section 4.2.

4.3.3. Branching. The splitting of the set of all tours into disjoint subsets is represented by the branching of a tree. In this study, we use the breadth-first search strategy to find the branch node. The branch containing (i, j) represents all tours which include travel spot pair (i, j). The branch excluding (i, j) represents all tours which do not include travel spot pair (i, j). In all the procedures designed, both the branching of nodes and the calculation of the subproblem for every branched node follow the same outline. Regarding the branching, there are two different situations to manage throughout the process:
Step 1. Find a zero element (i, j) in the reduced matrix and construct two subproblems: one in which element (i, j) is forced into the solution (remove row i and column j from the matrix and set element (j, i) to ∞), and one in which element (i, j) is excluded from the solution (set element (i, j) to ∞).
Step 2. Solve the above two subproblems using the Hungarian method and obtain their solutions.
Step 3. If a solution is less than the upper bound, continue to explore the subproblem, namely the next branch node, and return to Step 1. Otherwise, terminate the branch.
Step 4. Stop when only one branch survives.

The whole process of the proposed algorithm is summarized as follows:
1) Cluster the travel spots by the K-means clustering algorithm.
2) Solve the TSP for each group of travel spots by the branch and bound algorithm.
Step I. Get the upper bound by the greedy algorithm.
Step II. Solve the problem as an assignment problem using the Hungarian method to get the lower bound.
Step III. Choose the branch element and construct two subproblems, where one includes the branch element and the other does not.
Step IV. Use the Hungarian method to get the distance of each branch. If the distance is worse than the upper bound, terminate the branch; if the distance is less than the upper bound, the branch can continue to branch.
Step V. If there exist other branches at the same level that have not been branched, apply Step III to them. Otherwise, create branches for the next level.
Step VI. Stop when only one branch survives.
3) Connect the travel routes of all clusters of travel spots by the giant-tour representation.
4) Calculate the time arrangement in the tour by time continuity according to formula (9).
5) Check the rest break constraint and determine the rest location, namely the hotel, by formula (8).
6) Stop the algorithm.

5. NUMERICAL EXPERIMENTS

In this section, we provide one case study and three examples with different scales of travel networks. First, the illustrative case includes 10 travel spots from Yunnan, China, shown in Fig. 3, in which A is the origin and B-K are the travel spots. The location information, recommended viewing time, ticket expenses, and reservation cost are obtained from multiple data websites: Google Map, TripAdvisor, and Tuniu. To validate our proposed algorithm, we use another 3 examples with different network scales and different departure times to test the algorithms and give management insights. Example 1 is a small-scale network solved by the original branch and bound scheme, which includes 1 origin and 29 travel spots. For Example 2, 50 travel spots and 250 hotels near them are generated randomly. For Example 3, 100 travel spots and 500 hotels near them are generated randomly. Example 2 and Example 3 are solved by the proposed branch and bound based heuristic algorithm. The locations of travel spots and hotels in the above 3 examples are generated randomly. The recommended sightseeing time and entrance ticket price of the spots in the three examples are randomly generated in [2, 5] h and [0, 4000] ¥, respectively. The reservation cost is generated from the range [8000, 20000] ¥. The number of clusters in the K-means algorithm is set to 3 and 6 in Example 2 and Example 3, respectively.

The experiment environment and parameters are set as follows: The driving cost and driving time between two nodes use the following computational formulation,

cost=distance*fuel, time=distance/speed, in which the Euclidean distance is used to represent the distance between two nodes, fuel denotes the fuel cost per kilometer (fuel=9.5¥/km), and speed is the driving speed of the vehicle (43mile/h). We assume that the rest break time is 22:00. The departure times from the origin are assumed to be 06:00 and 09:00.

[Figure 3: map with labeled nodes A:Lugu Lake, B:Old Town of Lijiang, C:Lanyuegu, D:Stone Forest Scenic, E:Minzucun, F:Shuanglang, G:Dali Old City, H:Erhai, I:Pudacuo National Park, J:Tiger Leaping Gorge, K:Meili Snow Mountain.]
Fig. 3. Map of Yunnan's travel spots

[Figure 4: the same map and node labels as Fig. 3, with the resulting route and selected hotels.]
Fig. 4. Optimal result for Yunnan's travel spots

Fig. 4 and Table 1 show the results of the Case study and Examples 1-3. Fig. 4 provides the information of the Case study about the selected hotels and the routing strategy for travel spots and hotels, where the black squares denote selected hotels. Table 1 gives specific itinerary schemes for different network scales and different departure times. The first column denotes the different networks, including the Case study, Example 1, Example 2, and Example 3. For each network, two departure times from the origin are considered: 06:00 and 09:00. AT and DT are the arrival and departure times for the corresponding nodes. Columns 4-24 show the specific travel itinerary, where the first row for each itinerary represents the tour for travel spots and hotels and the bold font numbers denote hotels. For example, in the Case study, B(5) represents the fifth hotel near travel spot B, and E(5) represents the fifth hotel near travel spot E. The last four columns are the total cost, the travel time in terms of hours and days, and the computation time in seconds.

Several observations can be made from the results. Different departure times lead to different itineraries, different costs, and different travel times. With regard to the itinerary, the tours of travel spots are almost the same; however, the selected hotels differ due to the different departure times. For example, in the Case study, the tour of travel spots A-K is the same, but the selected hotels are different. B(5) and E(5) are selected when DT=06:00 at origin A, and hotels C(1), H(4), and D(4) are selected when DT=09:00 at origin A. The total costs also differ significantly. Taking Example 2 as an example, the cost difference can reach 31% ((191670-146580)/146580). In addition, the travel time in the Case study can be reduced by 17.83 h (71.63-53.80). Therefore, selecting a suitable departure time from the origin can effectively reduce Cost and Time. This model can provide alternative itinerary schemes, which include the hotel decisions, the travel tour of nodes, the time arrangement, and the corresponding total Cost, Time(h), and Time(D), so that tourists can plan according to their preference. Furthermore, we solve the small-scale Case study by the classic branch and bound algorithm. With the expansion of the network scale, the branch and bound based heuristic algorithm is used to solve Examples 1-3. According to the last column, the computation time increases with the network's scale, and the computation time is acceptable. The proposed branch and bound based heuristic algorithm can effectively deal with large-scale networks.

6. CONCLUSIONS AND FURTHER STUDY

To our knowledge, there has been limited research on modeling the self-driving travel itinerary problem. Our research extracts useful information from multiple data websites and provides a more accurate model result for advanced travel planning and travel management applications.

This paper uses a branch and bound based heuristic to solve the data-driven travel itinerary model for travel planning from a tourist's perspective. The key feature of the model is its comprehensiveness in making a specific tour plan, which gives the scheme of the tour, hotel choice, and time arrangement under the rest break constraint. Moreover, real-time multi-resource data is utilized to support the consolidation of the model in the real world. Overall, this study offers a valuable new model for data-driven travel planning.

We employ a branch and bound based heuristic to explore the solution of the problem, in which the upper bound is solved by the greedy algorithm, and the lower bound is obtained by the assignment problem solved by Hungarian
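The two bounds named here (a greedy upper bound and an assignment-problem lower bound, Sections 4.3.1 and 4.3.2) can be illustrated on a toy instance. The Python sketch below is not the authors' implementation: the function names and the 4-node distance matrix are hypothetical, and the assignment problem is solved by brute-force enumeration as a stand-in for the Hungarian method, which is only workable for very small n.

```python
from itertools import permutations

INF = float("inf")

# Hypothetical symmetric distance matrix for 4 travel spots (diagonal is forbidden).
DIST = [
    [INF, 2, 9, 10],
    [2, INF, 6, 4],
    [9, 6, INF, 3],
    [10, 4, 3, INF],
]

def greedy_upper_bound(dist, start=0):
    """Nearest-neighbour tour: always drive to the closest unvisited spot."""
    tour = [start]
    unvisited = set(range(len(dist))) - {start}
    total = 0
    while unvisited:
        here = tour[-1]
        # sorted() makes tie-breaking deterministic
        nxt = min(sorted(unvisited), key=lambda j: dist[here][j])
        total += dist[here][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
    total += dist[tour[-1]][start]  # close the tour at the origin
    return total, tour

def assignment_lower_bound(dist):
    """Cheapest assignment of each spot to a different spot (subtours allowed).

    Brute force over permutations; a stand-in for the Hungarian method.
    """
    n = len(dist)
    best = INF
    for perm in permutations(range(n)):
        if any(perm[i] == i for i in range(n)):
            continue  # a spot may not be assigned to itself
        best = min(best, sum(dist[i][perm[i]] for i in range(n)))
    return best

upper, tour = greedy_upper_bound(DIST)
lower = assignment_lower_bound(DIST)
print(lower, upper, tour)  # prints: 10 18 [0, 1, 3, 2]
```

On this instance the assignment optimum consists of two 2-cycles, so it is not a tour, which is why it only bounds the tour length from below.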

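For reference, steps (1) to (3) of the K-means procedure in Section 4.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: spots are hypothetical 2-D coordinates, the first k spots seed the cluster centers (the paper does not state an initialization rule), and the function name `kmeans` is illustrative.

```python
def kmeans(points, k, max_iter=100):
    """Plain K-means on 2-D points using squared Euclidean distance."""
    centers = [points[u] for u in range(k)]  # assumption: first k points as seeds
    labels = None
    for _ in range(max_iter):
        # Step (1): assign every spot to its closest cluster center.
        new_labels = [
            min(
                range(k),
                key=lambda u: (p[0] - centers[u][0]) ** 2 + (p[1] - centers[u][1]) ** 2,
            )
            for p in points
        ]
        # Step (3): stop when the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step (2): move each center to the mean of its member spots.
        for u in range(k):
            members = [p for p, lab in zip(points, labels) if lab == u]
            if members:
                centers[u] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centers

# Hypothetical spot coordinates forming two well-separated groups.
SPOTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(SPOTS, 2)
print(labels)  # prints: [0, 0, 0, 1, 1, 1]
```

Each resulting group would then be handed to the branch and bound step as its own small TSP.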

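The AT and DT rows of Table 1 rest on the driving formulas of Section 5 (cost=distance*fuel, time=distance/speed) and on time continuity, formula (9), which gives the departure from node b as the departure from node a plus the travel time plus the stay at b. The Python sketch below illustrates that propagation for a single arc. It is an assumption-laden example, not the authors' procedure: coordinates are taken to be in kilometers, the stated 43 mile/h is converted to km/h, and the function names, sample points, and calendar date are hypothetical.

```python
import math
from datetime import datetime, timedelta

FUEL_YEN_PER_KM = 9.5           # fuel cost per kilometer, as stated in Section 5
SPEED_KM_PER_H = 43 * 1.609344  # 43 mile/h from Section 5, converted to km/h

def drive(a, b):
    """Return (driving cost in yen, driving time in hours) between points a and b."""
    distance_km = math.hypot(b[0] - a[0], b[1] - a[1])  # Euclidean distance
    return distance_km * FUEL_YEN_PER_KM, distance_km / SPEED_KM_PER_H

def visit(depart_a, a, b, stay_b_hours):
    """Propagate times along arc a -> b: AT_b = DT_a + t_ab, DT_b = AT_b + S_b."""
    cost, hours = drive(a, b)
    arrive_b = depart_a + timedelta(hours=hours)
    depart_b = arrive_b + timedelta(hours=stay_b_hours)
    return cost, arrive_b, depart_b

# Hypothetical arc of 50 km, departing at 06:00 with a 2-hour stay at b.
cost, at, dt = visit(datetime(2018, 1, 1, 6, 0), (0.0, 0.0), (30.0, 40.0), 2.0)
print(cost, at.strftime("%H:%M"), dt.strftime("%H:%M"))  # prints: 475.0 06:43 08:43
```

A rest break check in the spirit of formula (8) would compare the propagated times with the rest break time (22:00 in the experiments) before committing to the next arc.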
Table 1. Results for Different Scale Networks

Network: Case study | Departure: 06:00 | Spot Number: 10 | Cost(¥): 16562 | Time(h): 53.80 | Time(D): 3 | CT(s): 0.69
Itinerary: A I K J C B B(5) F H G E E(5) D A
AT: 07:08 10:10 13:25 16:02 18:52 21:15 08:47 11:56 14:04 19:00 22:00 08:41 11:48
DT: 06:00 09:57 12:47 15:52 18:38 21:06 08:00 11:33 13:57 16:32 21:53 08:00 11:10

Network: Case study | Departure: 09:00 | Spot Number: 10 | Cost(¥): 27182 | Time(h): 71.63 | Time(D): 4 | CT(s): 0.57
Itinerary: A I K J C(1) C B F H H(4) G E D D(4) A
AT: 10:08 13:10 16:25 19:02 08:10 11:01 14:02 17:11 19:23 08:07 13:02 16:36 19:16 08:38
DT: 09:00 12:57 15:47 18:52 08:00 10:47 13:15 16:48 19:12 08:00 10:35 15:56 19:06 08:00

Network: Example 1 | Departure: 06:00 | Spot Number: 29 | Cost(¥): 85637 | Time(h): 229.80 | Time(D): 10 | CT(s): 0.61
Itinerary: 1 5 3 7 19(4) 19 22 25 27(4) 27 30 24 21(5) 21 8 15 26 26(1) 2 29
AT: 06:54 12:23 16:02 20:33 08:12 11:42 16:49 20:44 08:12 12:03 15:57 21:04 08:10 11:20 14:50 18:30 20:51 10:51 14:32
DT: 06:00 09:47 15:00 18:29 08:00 10:36 14:03 18:57 08:00 10:24 14:15 17:57 08:00 10:13 13:22 17:00 20:42 08:00 13:05 16:48
Itinerary: 16 16(3) 10 4 28 28(2) 11 14 6(5) 6 9 12 12(4) 23 13 20 20(5) 17 18 1
AT: 18:36 21:09 09:25 13:23 18:19 21:09 10:37 15:07 19:49 08:15 13:34 17:12 20:08 09:10 13:06 16:47 19:52 10:26 15:39 19:48
DT: 21:00 08:00 11:52 15:52 20:55 08:00 13:14 17:52 08:00 11:01 16:23 20:00 08:00 12:00 16:01 19:40 08:00 13:22 18:34

Network: Example 1 | Departure: 09:00 | Spot Number: 29 | Cost(¥): 91388 | Time(h): 245.44 | Time(D): 11 | CT(s): 0.55
Itinerary: 1 5 3 7(5) 7 19 22 25(4) 25 27 30 24(5) 24 21 8 15(5) 15 26 2 2(3)
AT: 09:54 15:23 19:02 08:12 12:43 16:13 21:20 08:09 12:04 15:56 19:50 08:14 13:22 16:32 20:02 08:08 11:49 16:52 19:20
DT: 09:00 12:47 18:00 08:00 10:40 15:08 18:34 08:00 10:17 14:16 18:08 08:00 10:15 15:25 18:33 08:00 10:19 14:01 19:06 08:00
Itinerary: 29 16 10 10(4) 4 28 11(1) 11 14 6 6(5) 9 12 23 23(1) 13 20 17 17(1) 18 1
AT: 09:27 13:31 17:20 19:55 09:31 14:27 19:40 08:08 12:39 17:20 20:21 10:33 14:12 18:09 21:10 09:07 12:48 18:07 21:13 10:18 14:26
DT: 11:43 15:55 19:47 08:00 12:00 17:03 08:00 10:45 15:23 20:06 08:00 13:22 16:59 20:58 08:00 12:03 15:42 21:03 08:00 13:13

Network: Example 2 | Departure: 06:00 | Spot Number: 50 | Cost(¥): 146580 | Time(h): 399.25 | Time(D): 17 | CT(s): 0.72
Itinerary: 1 5 41 18 17(1) 17 20 47 45(2) 45 37 37(1) 2 29 16 36(5) 36 50 39 39(1)
AT: 07:54 12:24 17:30 20:34 10:04 14:07 18:38 21:30 10:03 14:44 20:01 08:10 12:53 16:58 19:33 09:03 13:17 16:03 19:19
DT: 06:00 10:48 15:14 20:25 08:00 13:00 17:00 21:20 08:00 12:53 17:35 08:00 10:24 15:10 19:22 08:00 11:28 15:29 18:16 08:00
Itinerary: 15 8 31 31(1) 21 24 25 25(2) 27 30 26 26(4) 3 28 11 14(5) 14 6 13(4) 13
AT: 08:15 12:34 15:25 19:04 08:09 12:34 15:30 20:34 08:11 11:05 14:40 19:37 08:09 14:01 18:15 21:07 09:09 14:09 19:10 08:13
DT: 10:25 14:35 17:26 08:00 10:12 14:35 17:39 08:00 10:23 13:17 16:52 08:00 10:45 16:37 20:52 08:00 11:54 16:55 08:00 11:09
Itinerary: 34 9(3) 9 23 12 12(4) 32 43 40 46 46(3) 10 4 33 48(4) 48 7 19 22 22(5)
AT: 13:26 19:10 08:08 12:37 16:33 19:33 08:28 11:34 15:00 18:11 20:50 09:27 13:07 16:23 21:13 08:14 13:22 16:06 18:59 21:33
DT: 16:22 08:00 10:57 15:26 19:21 08:00 11:13 14:15 17:41 20:41 08:00 11:54 15:36 18:50 08:00 10:40 15:49 18:31 21:20 08:00
Itinerary: 42 49 44(1) 44 35 38 1
AT: 10:22 14:38 19:31 08:15 12:05 15:52 21:15
DT: 12:23 16:56 08:00 10:38 14:33 18:24

Network: Example 2 | Departure: 09:00 | Spot Number: 50 | Cost(¥): 191670 | Time(h): 413.71 | Time(D): 18 | CT(s): 1.06
Itinerary: 1 5 41 18(1) 18 17 20 20(3) 47 45 37(2) 37 2 29 29(4) 16 36 50 50(2) 39
AT: 10:54 15:24 20:30 08:09 13:08 17:11 20:17 09:37 14:23 19:04 08:14 13:31 18:14 20:43 09:48 13:16 17:30 19:49 08:35
DT: 09:00 13:48 18:14 08:00 11:04 16:04 20:04 08:00 12:20 17:13 08:00 11:05 15:45 20:30 08:00 12:13 15:41 19:41 08:00 10:47
Itinerary: 15 8 31(5) 31 21 24 25(4) 25 27 30 26(3) 26 3 28(4) 28 11 14 6(1) 6 13
AT: 11:50 16:09 19:01 08:12 11:51 16:16 19:12 08:11 13:15 16:09 19:44 08:13 13:09 19:01 08:13 12:28 16:14 21:14 08:09 13:11
DT: 14:01 18:10 08:00 10:13 13:54 18:17 08:00 10:19 15:27 18:21 08:00 10:25 15:45 08:00 10:49 15:04 18:58 08:00 10:55 16:06
Itinerary: 34 34(1) 9 23 12(3) 12 32 43 40 40(5) 46 10 4 33(5) 33 48 7 7(1) 19 22
AT: 18:23 21:30 10:48 15:17 19:13 08:09 11:24 14:30 17:57 20:45 08:30 12:28 16:08 19:24 08:09 12:58 18:06 20:43 08:17 11:10
DT: 21:19 08:00 13:38 18:06 08:00 10:56 14:09 17:11 20:37 08:00 11:00 14:55 18:37 08:00 10:35 15:24 20:33 08:00 10:41 13:31
Itinerary: 42 49(1) 49 44 35 35(4) 38 1
AT: 15:53 20:09 08:08 13:02 16:52 19:32 09:20 14:42
DT: 17:54 08:00 10:27 15:25 19:20 08:00 11:51

Network: Example 3 | Departure: 06:00 | Spot Number: 100 | Cost(¥): 292510 | Time(h): 845.22 | Time(D): 36 | CT(s): 1.43
Itinerary: 1 82 86 5 73(5) 73 12 56 56(3) 79 47 32 32(2) 65 9 2 2(5) 29 72 49(2)
AT: 06:06 10:27 15:39 20:29 08:14 12:48 17:50 20:42 09:40 14:21 18:36 21:29 09:45 14:07 17:31 19:56 09:54 15:16 19:27
DT: 06:00 09:03 13:25 18:33 08:00 11:04 15:36 20:32 08:00 12:22 17:04 21:21 08:00 12:34 16:57 19:44 08:00 12:10 17:34 08:00
Itinerary: 49 85 91 91(3) 24 99 42 42(1) 31 92 21 21(1) 8 50 30 30(1) 15 51 39 39(1)
AT: 08:11 13:31 17:46 20:05 10:30 13:29 18:04 20:16 10:09 14:29 18:12 20:24 09:42 13:29 17:58 20:21 09:15 13:38 18:14 20:36
DT: 10:29 15:42 19:54 08:00 12:30 15:33 20:05 08:00 12:10 16:30 20:16 08:00 11:43 15:40 20:10 08:00 11:26 15:50 20:26 08:00
Itinerary: 3 80 60 60(3) 76 28 70(5) 70 77 83 35(1) 35 78 57 69(1) 69 97 87 26(3) 26
AT: 10:08 14:00 18:23 21:12 10:32 14:51 19:30 08:12 11:00 15:44 20:11 08:13 11:15 13:49 19:25 08:15 11:11 16:32 20:28 08:09
DT: 12:45 16:37 20:59 08:00 13:14 17:27 08:00 10:44 13:33 18:15 08:00 10:41 13:41 16:12 08:00 10:35 13:28 18:48 08:00 10:21
Itinerary: 27 25 68(2) 68 95 44 10(5) 10 48 19 19(4) 16 7 61 61(3) 4 33 36(4) 36 67
AT: 12:35 16:47 20:46 08:12 10:57 14:21 19:20 08:10 12:40 17:05 19:44 09:48 12:52 17:49 20:33 10:38 15:53 20:04 08:14 12:02
DT: 14:47 18:55 08:00 10:29 13:15 16:44 08:00 10:37 15:05 19:30 08:00 12:12 15:19 20:19 08:00 13:08 18:20 08:00 10:39 14:23
Itinerary: 22 88(2) 88 55 94 53(5) 53 75 98 46(4) 46 11 38 38(1) 54 74 13(2) 13 17 89
AT: 16:09 21:25 08:09 11:29 14:49 19:02 08:08 11:45 16:10 19:49 08:15 12:47 18:08 20:54 08:38 14:16 19:34 08:10 13:07 16:05
DT: 18:30 08:00 10:24 13:38 17:01 08:00 10:26 14:07 18:38 08:00 10:45 15:24 20:40 08:00 11:10 16:50 08:00 11:06 16:03 18:57
Itinerary: 37(3) 37 41 14 14(5) 43 40 90(5) 90 23 84 84(4) 34 100 100(4) 6 96 64 64(5) 66
AT: 20:21 08:13 12:43 18:11 21:10 09:26 14:36 19:25 08:12 13:36 17:48 20:50 10:14 16:7 19:16 09:54 15:27 19:00 21:49 08:34
DT: 08:00 11:04 15:33 20:55 08:00 12:07 17:16 08:00 10:56 16:25 20:41 08:00 13:10 19:07 08:00 12:40 18:07 21:38 08:00 11:13
Itinerary: 52 71 71(2) 63 59 62 62(4) 18 20 93 93(5) 58 45 81 81(5) 1
AT: 13:33 18:28 21:22 09:30 12:50 17:43 20:45 09:50 14:8 17:37 20:43 09:16 15:01 18:03 21:00 11:13
DT: 16:14 21:12 08:00 12:19 15:41 20:37 08:00 12:45 17:01 20:31 08:00 12:08 17:51 20:51 08:00

Network: Example 3 | Departure: 09:00 | Spot Number: 100 | Cost(¥): 294890 | Time(h): 842.22 | Time(D): 36 | CT(s): 1.58
Itinerary: 1 82 86 5 5(1) 73 12 56(3) 56 79 47 47(1) 32 65 9 9(2) 2 29 72 72(4)
AT: 09:06 13:27 18:39 21:42 09:57 14:31 19:33 08:09 12:32 17:12 20:03 09:32 14:02 18:24 21:22 08:34 12:42 18:04 20:31
DT: 09:00 12:03 16:25 21:33 08:00 12:47 17:18 08:00 10:51 15:13 19:55 08:00 12:17 16:51 21:14 08:00 10:48 14:58 20:22 08:00
Itinerary: 49 85 91(4) 91 24 99 42(1) 42 31 92 21(1) 21 8 50 30(1) 30 15 51 39(1) 39
AT: 09:53 15:13 19:28 08:14 12:52 15:52 20:27 08:11 12:21 16:41 20:24 08:08 11:53 15:40 20:9 08:12 11:39 16:02 20:38 08:09
DT: 12:11 17:24 08:00 10:23 14:53 17:55 08:00 10:12 14:22 18:42 08:00 10:12 13:55 17:52 08:00 10:24 13:49 18:14 08:00 10:22
Itinerary: 3 80 60(3) 60 76 28 28(3) 70 77 83 83(1) 35 78 57 69(1) 69 97 87 26(3) 26
AT: 12:30 16:22 20:45 08:13 13:21 17:39 20:27 10:03 12:52 17:35 20:18 09:56 12:57 15:31 21:07 08:15 11:11 16:32 20:28 08:09
DT: 15:06 18:59 08:00 10:49 16:03 20:16 08:00 12:35 15:25 20:07 08:00 12:24 15:24 17:54 08:00 10:35 13:28 18:48 08:00 10:21
Itinerary: 27 25 68(2) 68 95 44 10(5) 10 48 19 19(4) 16 7 61 61(3) 4 33 36(4) 36 67
AT: 12:35 16:47 20:46 08:12 10:57 14:21 19:20 08:10 12:40 17:05 19:44 09:48 12:52 17:49 20:33 10:38 15:53 20:04 08:14 12:02
DT: 14:47 18:55 08:00 10:29 13:15 16:44 08:00 10:37 15:05 19:30 08:00 12:12 15:19 20:19 08:00 13:08 18:20 08:00 10:39 14:23
Itinerary: 22 88(2) 88 55 94 53(5) 53 75 98 46(4) 46 11 38 38(1) 54 74 13(2) 13 17 89
AT: 16:09 21:25 08:09 11:29 14:49 19:02 08:08 11:45 16:10 19:49 08:15 12:47 18:08 20:54 08:38 14:16 19:34 08:10 13:07 16:05
DT: 18:30 08:00 10:24 13:38 17:01 08:00 10:26 14:07 18:38 08:00 10:45 15:24 20:40 08:00 11:10 16:50 08:00 11:06 16:03 18:57
Itinerary: 37(3) 37 41 14 14(5) 43 40 90(5) 90 23 84 84(4) 34 100 100(4) 6 96 64 64(5) 66
AT: 20:21 08:13 12:43 18:11 21:10 09:26 14:36 19:25 08:12 13:36 17:48 20:50 10:14 16:7 19:16 09:54 15:27 19:00 21:49 08:34
DT: 08:00 11:04 15:33 20:55 08:00 12:07 17:16 08:00 10:56 16:25 20:41 08:00 13:10 19:07 08:00 12:40 18:07 21:38 08:00 11:13
Itinerary: 52 71 71(2) 63 59 62 62(4) 18 20 93 93(5) 58 45 81 81(5) 1
AT: 13:33 18:28 21:22 09:30 12:50 17:43 20:45 09:50 14:8 17:37 20:43 09:16 15:01 18:03 21:00 11:13
DT: 16:14 21:12 08:00 12:19 15:41 20:37 08:00 12:45 17:01 20:31 08:00 12:08 17:51 20:51 08:00
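As a reading aid for Table 1, the Time(h) column appears to be the elapsed time between the first departure and the final arrival, where each backward jump of the clock marks an overnight hotel stay. The minimal Python sketch below checks this reading against the Case study row with the 06:00 departure. The function name `elapsed_hours` and the interpretation itself are assumptions; the paper does not state how Time(h) is computed.

```python
def elapsed_hours(departures, arrivals):
    """Elapsed hours from the first departure to the last arrival.

    Assumes departures[k] is the DT of node k and arrivals[k] is the AT of
    node k + 1, and that any clock time earlier than the previous one means
    a new day has started (an overnight stay).
    """
    def minutes(hhmm):
        hours, mins = hhmm.split(":")
        return int(hours) * 60 + int(mins)

    # Interleave the clock times in travel order: DT0, AT1, DT1, AT2, ...
    events = []
    for dt, at in zip(departures, arrivals):
        events += [minutes(dt), minutes(at)]

    # Count day rollovers, then add the clock difference of the end points.
    days = sum(1 for prev, cur in zip(events, events[1:]) if cur < prev)
    return (days * 24 * 60 + events[-1] - events[0]) / 60


# DT and AT rows of the Case study, departure 06:00, copied from Table 1.
dt_row = "06:00 09:57 12:47 15:52 18:38 21:06 08:00 11:33 13:57 16:32 21:53 08:00 11:10".split()
at_row = "07:08 10:10 13:25 16:02 18:52 21:15 08:47 11:56 14:04 19:00 22:00 08:41 11:48".split()

case_study_hours = round(elapsed_hours(dt_row, at_row), 2)
print(case_study_hours)  # 53.8, matching Time(h) = 53.80 in Table 1
```

Under the same assumption, the two overnight stays found in this row give three travel days, which agrees with the Time(D) value of 3.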

algorithm. We first cluster the travel spots into groups by the K-means algorithm. Then, the branch and bound algorithm is used for each group of travel spots to solve the traveling salesman problem. Finally, the hotel choice and time arrangement are obtained from the corresponding constraints. We test our algorithm on instances of different scales. A real-world instance with multiple data sources and large-scale instances are solved to test the performance of the algorithm. As the instances of different scales show, the method and framework developed are readily applicable to other networks.

In future studies, it would be interesting to further investigate more realistic constraints, for example, best viewing time windows and departure times. In addition, travel spot choice is also a valuable problem.
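The bounding idea summarized above (a greedy upper bound, an assignment-based lower bound, and branch and bound in between) can be illustrated with a minimal Python sketch on a small tour problem. This is not the authors' implementation. The 4-node cost matrix is made up, the function names are hypothetical, the assignment relaxation is solved by brute force over permutations instead of the Hungarian algorithm to keep the sketch dependency-free, and the search prunes with a simpler cheapest-outgoing-arc bound.

```python
from itertools import permutations
from math import inf


def greedy_upper_bound(cost, start=0):
    """Nearest-neighbour tour from `start`; its cost is an upper bound."""
    tour, total = [start], 0
    unvisited = set(range(len(cost))) - {start}
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda j: (cost[here][j], j))
        total += cost[here][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
    return total + cost[tour[-1]][start], tour


def assignment_lower_bound(cost):
    """Cheapest way to give every node one successor (no self-loops).

    Brute force stand-in for the Hungarian algorithm; small inputs only.
    """
    n = len(cost)
    best = inf
    for succ in permutations(range(n)):
        if all(succ[i] != i for i in range(n)):
            best = min(best, sum(cost[i][succ[i]] for i in range(n)))
    return best


def branch_and_bound(cost, start=0):
    """Depth-first branch and bound seeded with the greedy upper bound."""
    n = len(cost)
    best_cost, best_tour = greedy_upper_bound(cost, start)
    # Cheapest arc leaving each node: a simple bound used inside the search.
    min_out = [min(cost[i][j] for j in range(n) if j != i) for i in range(n)]

    def extend(path, visited, so_far):
        nonlocal best_cost, best_tour
        if len(path) == n:
            total = so_far + cost[path[-1]][start]
            if total < best_cost:
                best_cost, best_tour = total, list(path)
            return
        remaining = [j for j in range(n) if j not in visited]
        # The current node and every unvisited node must still be left once.
        bound = so_far + min_out[path[-1]] + sum(min_out[j] for j in remaining)
        if bound >= best_cost:
            return  # prune: this branch cannot beat the incumbent
        for j in remaining:
            extend(path + [j], visited | {j}, so_far + cost[path[-1]][j])

    extend([start], {start}, 0)
    return best_cost, best_tour


# Made-up symmetric travel costs between four nodes (illustration only).
cost = [
    [0, 1, 2, 4],
    [1, 0, 1, 3],
    [2, 1, 0, 10],
    [4, 3, 10, 0],
]
upper, _ = greedy_upper_bound(cost)
lower = assignment_lower_bound(cost)
optimum, tour = branch_and_bound(cost)
print(lower, optimum, upper)  # prints: 10 10 16
```

On this toy matrix the greedy tour costs 16, the assignment relaxation gives 10, and the search closes the gap at the optimal tour 0, 2, 1, 3 with cost 10. For realistic sizes the brute-force relaxation would have to be replaced by a proper assignment solver.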

References

[1] M. Dorigo and L. M. Gambardella, "Ant colony system: A cooperative learning approach to the traveling salesman problem," IEEE Transactions on Evolutionary Computation, vol. 1, pp. 53-66, 1997.
[2] N. Agatz, P. Bouman, and M. Schmidt, "Optimization approaches for the traveling salesman problem with drone," Transportation Science, 2018.
[3] C. Chen, J. Ma, Y. Susilo, Y. Liu, and M. Wang, "The promises of big data and small data for travel behavior (aka human mobility) analysis," Transportation Research Part C, vol. 68, pp. 285-299, 2016.
[4] J. L. Toole, S. Colak, B. Sturt, L. P. Alexander, A. Evsukoff, and M. C. Gonzalez, "The path most traveled: Travel demand estimation using big data resources," Transportation Research Part C, vol. 58, pp. 162-177, 2015.
[5] A. Vij and K. Shankari, "When is big data big enough? Implications of using GPS-based surveys for travel demand analysis," Transportation Research Part C, vol. 56, pp. 446-462, 2015.
[6] S. J. Miah, H. Q. Vu, J. Gammack, and M. McGrath, "A big data analytics method for tourist behaviour analysis," Information & Management, vol. 54, pp. 771-785, 2017.
[7] G. Fusco, A. Bracci, T. Caligiuri, C. Colombaroni, and N. Isaenko, "Experimental analyses and clustering of travel choice behaviours by floating car big data in a large urban area," IET Intelligent Transport Systems, vol. 12, pp. 270-278, 2018.
[8] J. LaRusic and A. P. Punnen, "The asymmetric bottleneck traveling salesman problem: Algorithms, complexity and empirical analysis," Computers & Operations Research, vol. 43, pp. 20-35, 2014.
[9] R. Bolanos, M. Echeverry, and J. Escobar, "A multiobjective non-dominated sorting genetic algorithm (NSGA-II) for the Multiple Traveling Salesman Problem," Decision Science Letters, vol. 4, pp. 559-568, 2015.
[10] D. Taş, M. Gendreau, O. Jabali, and G. Laporte, "The traveling salesman problem with time-dependent service times," European Journal of Operational Research, vol. 248, pp. 372-383, 2016.
[11] M. T. Godinho, L. Gouveia, and P. Pesneau, "Natural and extended formulations for the time-dependent traveling salesman problem," Discrete Applied Mathematics, vol. 164, pp. 138-153, 2014.
[12] D. E. Boyce and H. Bar-Gera, "Validation of multiclass urban travel forecasting models combining origin-destination, mode, and route choices," Journal of Regional Science, vol. 43, pp. 517-540, 2003.
[13] G. M. Coldren, F. S. Koppelman, K. Kasturirangan, and A. Mukherjee, "Modeling aggregate air-travel itinerary shares: Logit model development at a major US airline," Journal of Air Transport Management, vol. 9, pp. 361-369, 2003.
[14] J. Long, H. J. Huang, Z. Gao, and W. Y. Szeto, "An intersection-movement-based dynamic user optimal route choice problem," Operations Research, vol. 61, pp. 1134-1147, 2013.
[15] B. de Jonge and R. H. Teunter, "Optimizing itineraries in public transportation with walks between rides," Transportation Research Part B, vol. 55, pp. 212-226, 2013.
[16] Y. Iida, T. Akiyama, and T. Uchida, "Experimental analysis of dynamic route choice behavior," Transportation Research Part B, vol. 26, pp. 17-32, 1992.
[17] L. Bobrowski and J. C. Bezdek, "C-means clustering with the l1 and l∞ norms," IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, pp. 545-554, 1991.
[18] S. Z. Selim and M. A. Ismail, "K-means-type algorithms: A generalized convergence theorem and characterization of local optimality," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 1, pp. 81-87, 1984.
[19] R. Jonker and T. Volgenant, "Improving the Hungarian assignment algorithm," Operations Research Letters, vol. 5, pp. 171-175, 1986.
[20] B. Cao, J. Wang, J. Fan, J. Yin, and T. Dong, "Querying similar process models based on the Hungarian algorithm," IEEE Transactions on Services Computing, vol. 10, pp. 121-135, 2017.
[21] W. P. Coutinho, R. Q. D. Nascimento, A. A. Pessoa, and A. Subramanian, "A branch-and-bound algorithm for the close-enough traveling salesman problem," INFORMS Journal on Computing, vol. 28, pp. 752-765, 2016.
[22] P. Stanojevic, M. Maric, and Z. Stanimirovic, "A hybridization of an evolutionary algorithm and a parallel branch and bound for solving the capacitated single allocation hub location problem," Applied Soft Computing, vol. 33, pp. 24-36, 2015.
