ABSTRACT:--
Mining the most influential k-location set finds k locations, traversed by the maximum number of unique trajectories, in a given spatial region. These influential locations are valuable for resource allocation applications, such as selecting charging stations for electric automobiles and suggesting locations for placing billboards. This problem is NP-hard and usually calls for an interactive mining process, e.g., changing the spatial region and k, or removing some locations (from the results in the previous round) that are not eligible for an application according to the domain knowledge. Thus, efficiency is the major concern in addressing this problem. In this paper, we propose a system by using greedy heuristics to expedite the mining process. The greedy heuristic is efficient with a performance guarantee. We evaluate the performance of our proposed system based on a taxi dataset of Tianjin and provide a case study on selecting the locations for charging stations in Beijing.
Keywords:-- Location Selection, Trajectory Data Mining, Maximum Coverage.
INTRODUCTION:--
Advances in location acquisition technology have resulted in massive trajectories, representing the mobility of a diversity of moving objects, e.g., human, vehicles, and animals. Finding k locations traversed by the maximum number of unique trajectories in a given spatial region is vital to many resource allocation problems: The first application is selecting charging stations for electric vehicles according to their GPS trajectories. As shown in Fig. 1(a), the location is defined as road intersection. Intersections n1 and n3 form the most influential 2-location set by covering 5 unique trajectories. n2 and n3 are not the most influential set, as they only
Mining the most influential k-location set finds k locations, traversed by the maximum number of unique trajectories, in a given spatial region. These influential locations are valuable for resource allocation applications, such as selecting charging stations for electric automobiles and suggesting locations for placing billboards. This problem is NP-hard and usually calls for an interactive mining process, e.g., changing the spatial region and k, or removing some locations (from the results in the previous round) that are not eligible for an application according to the domain knowledge. Thus, efficiency is the major concern in addressing this problem. In this paper, we propose a system by using greedy heuristics to expedite the mining process. The greedy heuristic is efficient with a performance guarantee. We evaluate the performance of our proposed system based on a taxi dataset of Tianjin and provide a case study on selecting the locations for charging stations in Beijing.
Keywords:-- Location Selection, Trajectory Data Mining, Maximum Coverage.
INTRODUCTION:--
Advances in location acquisition technology have resulted in massive trajectories, representing the mobility of a diversity of moving objects, e.g., human, vehicles, and animals. Finding k locations traversed by the maximum number of unique trajectories in a given spatial region is vital to many resource allocation problems: The first application is selecting charging stations for electric vehicles according to their GPS trajectories. As shown in Fig. 1(a), the location is defined as road intersection. Intersections n1 and n3 form the most influential 2-location set by covering 5 unique trajectories. n2 and n3 are not the most influential set, as they only
cover 4 unique trajectories. Though they individually cover the
most number of trajectories (i.e., 3 for each).
The second application is to select locations for placing billboards
based on users’ check-in histories. As shown in Fig. 1(b), a
location can be defined as a uniform grid covering a few points of
interests (POIs). g1 and g2 form a most influential 2-location set,
traversed by 4 unique trajectories, i.e., visited by 4 users.
The third application is to place observation stations for migratory
birds, where a location can be a cluster of birds’ stay points. As
shown in Fig. 1(c), c2 and c4 form the most influential 2-location
set that covers all birds’ trajectories.
There are three major challenges in mining the most influential
location set from massive trajectories: i) this problem can be
mapped to the MAX-k-COVER problem, which is NP-hard and
computational intensive; ii) different users may be interested in
mining k locations in different spatial areas. For instance, as shown
in Fig. 2(a), two local business owners may want to place different
numbers of billboards in different areas. As a result, it is not possible
to pre-compute one location set to serve all requests with different
mining parameters; and iii) users, e.g., domain experts, may
need to interact with our system several times. For example, as depicted
in Fig. 2(b), c4 is located in a lake where we cannot find land
to place an observation station, we need to remove it from the returned
set. Then, c1 and c5 become the most influential 2-location
set. Although the MAX-k-COVER problem has been studied [1,
2, 3], existing methods are off-line approaches that find a one-time
result for a given dataset. Different from existing works, our problem
setting allows users: i) to specify a spatial region R and value
k, and ii) to refine the returned result interactively and iteratively.
To this end, we propose a system to find the most influential
k-location set efficiently in this paper. Our system contains two
main modules: i) pre-processing module, which creates the spatial
networks from different types of trajectory data and builds a set of
index structures to speed up the mining process; and ii) location
set mining module, which finds k locations by taking spatial region
R, value k, and choices made during the user’s interaction as the
input. Our main contributions are summarized as follows:
We introduce a novel problem, i.e., mining the most influential
k-location set, with many potential applications.
• We propose an efficient algorithm to find the k-location set.
Even the solution is based on greedy heuristics, it can provide
the performance guarantee.
• Evaluation results on real datasets demonstrate the efficiency
of our proposed solution.
We also provide a case study to
show the effectiveness of our proposed system.
No comments:
Post a Comment