Understanding travellers’ preferences for different types of trip destination based on mobile internet usage data

https://doi.org/10.1016/j.trc.2018.03.009

Highlights

  • Mobile internet usage data is used to distinguish users’ preferred destination types.

  • People who like different content categories prefer different destination types.

  • People who use more mobile internet prefer to visit more commercial areas.

  • People who use less mobile internet prefer to visit more residential areas.

Abstract

New mobility data sources like mobile phone traces have been shown to reveal individuals’ movements in space and time. However, socioeconomic attributes of travellers are missing in those data. Consequently, it is not possible to partition the population and gain an in-depth understanding of the socio-demographic factors influencing travel behaviour. To fill this gap, we use mobile internet usage behaviour, including one’s preferred type of website and application (app) visited through mobile internet as well as the level of usage frequency, as a distinguishing element between different population segments. We compare the travel behaviour of each segment in terms of the preference for types of trip destinations. Point of interest (POI) data are used to cluster the grid cells of a city according to the main function of each cell, serving as a reference to determine the type of trip destination. The method is tested for the city of Shanghai, China, using a special mobile phone dataset that includes not only the spatial-temporal traces but also the mobile internet usage behaviour of the same users. We identify statistically significant relationships between a traveller’s favourite category of mobile internet content and the types of trip destinations that he/she visits more frequently. For example, compared to others, people whose favourite type of app/website is in the “tourism” category showed a significantly stronger preference for visiting touristy areas. Moreover, users with different levels of internet usage intensity show different preferences for types of destinations as well. We found that people who used mobile internet more intensively were more likely to visit more commercial areas, and people who used it less preferred to have activities in predominantly residential areas.

Introduction

There is a recent trend towards complementing or even replacing traditional travel survey data with new mobility-related data sources, such as GPS data, mobile phone traces and smart card transaction data (Chen et al., 2016, Demissie et al., 2013a, Iqbal et al., 2014, Ni et al., 2018, Toole et al., 2015, Wang et al., 2017, Wolf, 2006, Yue et al., 2014, Zhao et al., 2018). These trajectory-based data are becoming popular for travel analysis because (1) they are inexpensive to collect; (2) they are usually up to date; and (3) most of them contain a large sample with observations that are longitudinal in time (Calabrese et al., 2013, Demissie et al., 2013b, Morency et al., 2007).

However, despite the potential advantages, these sources of information only include the spatial-temporal traces describing people’s movements. If the aim is to understand travel behaviour from an activity-based perspective (Chen et al., 2016, Rasouli and Timmermans, 2014, Zhao and Zhang, 2017), the information in these data sets is usually very limited. For example, the activity purpose of trips is typically missing (Calabrese et al., 2013). Moreover, in traditional travel demand models, socioeconomic information is used to segment the population and to better explain the heterogeneity of activity-travel behaviour, including, but not limited to, activity patterns (Balmer et al., 2008) and location choice (Sivakumar and Bhat, 2007). In anonymous big data, by contrast, socioeconomic information is unavailable, mainly for privacy reasons (Calabrese et al., 2014).

To deal with such problems, researchers have tried to combine different types of data in order to fill the gaps (Anda et al., 2017). In attempting to derive activity purpose information from trajectory data, there have been several applications fusing trajectory data with land use data, OpenStreetMap data or point of interest (POI) data (Dashdorj et al., 2013, Demissie et al., 2015, Wolf et al., 2004, Yuan et al., 2012). This geo-coded background knowledge can help estimate the function of an area, which can tentatively be connected to the type of activity that a visitor performed in that area (Furletti et al., 2013, Jiang et al., 2015, Phithakkitnukoon et al., 2010, Wolf et al., 2001). We refer to the main function of an area being visited as the “type of trip destination” in this paper. The left chain in Fig. 1 shows how we derive the dependency of one’s preference for destination types on socioeconomic attributes, based on a literature review. Intuitively, such a dependency exists in most cities. For example, it is common for some specific urban areas to be frequented more by young people.

To partition the population using mobile phone data, Arai et al. (2014) and Bwambale et al. (2017) suggested utilizing calling behaviour, such as calling frequency and duration, to predict one’s personal attributes. However, mobile phones are used less for calls today, making calling behaviour less useful, while people are spending more time on services provided by mobile internet, such as mobile apps (Richmond, 2012). Therefore, mobile internet usage behaviour, if available, could have a greater potential to reflect individuals’ traits, such as gender and age (Seneviratne et al., 2015, Seneviratne et al., 2014). The right chain in Fig. 1 shows the dependency of mobile internet usage behaviour on socioeconomic attributes.

As a whole, Fig. 1, which can be regarded as a conceptual framework, shows the relationship between mobile internet usage behaviour and preference for types of trip destination. Since both depend on socioeconomic attributes, they are likely to be correlated with each other even if those attributes are unobserved. Based on this hypothesis, our study aims to understand travellers’ preferences for types of trip destination by fusing mobile phone traces with mobile internet usage data and segmenting travellers by their preferred type of sites and applications visited through mobile internet and by their level of visiting frequency. This study was made possible by the data provided by the Shanghai Unicom WO+ Open Data Application Contest.
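The reasoning behind this hypothesis can be illustrated with a small simulation. The sketch below is not taken from the study: it assumes a single unobserved latent attribute and two observed variables that each equal that attribute plus independent noise. All names and distributions are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical unobserved socioeconomic attribute (standard normal).
latent = [rng.gauss(0, 1) for _ in range(5000)]

# Two observed behaviours, each driven by the latent attribute plus
# independent noise; neither one influences the other directly.
internet_usage = [z + rng.gauss(0, 1) for z in latent]
destination_pref = [z + rng.gauss(0, 1) for z in latent]

# Under these assumptions the theoretical correlation is 0.5.
r = pearson(internet_usage, destination_pref)
print(f"correlation without observing the latent attribute: {r:.2f}")
```

The two observed variables are correlated only through the shared latent attribute, which is the situation the conceptual framework describes.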

Furthermore, mobile internet usage behaviour might sometimes reflect even more information about a person, such as one’s specific interests and lifestyle, than the traditional socioeconomic attributes do. At the same time, one’s interests and lifestyle are regarded as determinants of location choice through the preference for different types of non-work activities (Wen and Koppelman, 2000). A more specific interest or lifestyle might be related to a more specific travel preference, especially for non-work activities. For example, a foodie is likely to visit more sites and applications about food and, at the same time, to visit more restaurants in real life. We see the potential to explore such relationships by fusing mobile internet usage data and mobile phone traces, and we especially focus on the types of destinations for out-of-home non-work activities, designated herein as secondary activities for simplicity. Many studies have used mobile phone data to analyse users’ home and workplace locations as well as commuting trips (Ahas et al., 2010, Alexander et al., 2015, Calabrese et al., 2011, Isaacman et al., 2011). However, trips for secondary activities have rarely been analysed using this type of data, with only a few exceptions (e.g., Järv et al., 2014, Huang and Levinson, 2015). This does not mean that they are an unimportant part of urban travel demand: they account for a larger share than ever before, especially in large metropolitan areas (Wang et al., 2017).

The rest of this paper is organized as follows. First, we introduce the data used in our research. Next, we explain our research method. Then, the results are presented. Finally, we draw conclusions, discuss the usefulness and limitations of our research, and point out directions for future research.

Section snippets

Case study

In this paper, the case study is conducted in Shanghai, China. As one of the four directly-controlled municipalities of China, Shanghai is world famous for being a global financial centre and transport hub. The total area of Shanghai is 6340 square kilometres, and its population has exceeded 24 million. The city is divided into 16 districts. Except for the Chongming district, which is composed of three islands in the Yellow Sea, the other 15 districts lie on China’s east coast. They

Methodology

In Fig. 3, we present a flowchart of the proposed research method in this study. First, trip destinations chosen by the users for secondary activities can be extracted from mobile phone traces. Second, each trip destination can be labelled by the cluster of the grid cell calculated based on the POI data, and we can discover the users’ preferences regarding the types of trip destinations for secondary activities. Third, the favourite categories of mobile internet contents and the total usage
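The first two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: trip ends are assumed to be given in projected coordinates (metres), the grid is assumed to be square with a 500 m side, and the function names, the cell-to-cluster mapping and the cluster labels are hypothetical.

```python
import math
from collections import Counter

CELL_SIDE_M = 500.0  # assumed grid side length in metres


def cell_of(x_m, y_m, side=CELL_SIDE_M):
    """Map projected coordinates (metres) to an integer (column, row) cell."""
    return (math.floor(x_m / side), math.floor(y_m / side))


def destination_type_shares(destinations, cell_cluster):
    """Share of one user's secondary-activity trips ending in each cluster.

    destinations: list of (x_m, y_m) trip ends (hypothetical input).
    cell_cluster: dict mapping a grid cell to its cluster label
                  (hypothetical input, e.g. from POI-based clustering).
    Trip ends that fall in unlabelled cells are skipped.
    """
    counts = Counter()
    for x_m, y_m in destinations:
        label = cell_cluster.get(cell_of(x_m, y_m))
        if label is not None:
            counts[label] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Toy example: three labelled cells and four trip ends for one user.
cell_cluster = {(0, 0): "residential", (1, 0): "commercial", (0, 1): "touristy"}
trips = [(120.0, 80.0), (640.0, 300.0), (990.0, 10.0), (700.0, 499.0)]
shares = destination_type_shares(trips, cell_cluster)
print(shares)  # {'residential': 0.25, 'commercial': 0.75}
```

The resulting shares per destination type are the kind of per-user preference profile that can then be compared across user segments.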

Results and discussion

We processed the mobile phone traces using the method explained in Section 3.1. As a result, we obtained 26,535 target users meeting the specified criteria, and we detected their trips for secondary activities. Next, we clustered the grid cells using the method explained in Section 3.2. As shown in Fig. 4, the Dunn index indicates that the clusters are best distinguished when the number of clusters is set to 6 or 7 and the side length is set to 500 m in our case. We chose the
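For readers unfamiliar with the Dunn index, the sketch below computes its classic form: the smallest distance between points of different clusters divided by the largest distance between points of the same cluster. Higher values indicate compact, well-separated clusters. The variant and the feature vectors used in the study are not stated in this excerpt, so the Euclidean distance and the toy points here are assumptions.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Classic Dunn index: min between-cluster distance / max cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest distance between two members of the same cluster.
    max_diameter = max(
        (
            math.dist(a, b)
            for members in clusters.values()
            for a, b in combinations(members, 2)
        ),
        default=0.0,
    )
    # Smallest distance between members of two different clusters.
    min_separation = min(
        math.dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    if max_diameter == 0.0:
        return math.inf  # every cluster is a single point
    return min_separation / max_diameter


# Toy feature vectors (hypothetical): two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = dunn_index(points, [0, 0, 1, 1])  # groups match the labels
bad = dunn_index(points, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)  # 5.0 0.2
```

Comparing the index across candidate numbers of clusters and grid sizes, and keeping the setting with the highest value, is one way to read a plot like Fig. 4.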

Conclusions and recommendations

This paper proposes a method to segment the population and understand travellers’ preferences for types of trip destinations by fusing mobile internet usage data and mobile phone traces. The results of a case study, using a dataset from Shanghai, China, show that given one’s favourite category of mobile internet content, the proportions of visiting some types of destinations were significantly higher, and the proportions of visiting some others were significantly lower. Many of these observed
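One common way to check whether a segment visits a destination type significantly more or less often than other users is a two-proportion z-test. The sketch below assumes that test and independent samples; the test actually used in the study is not stated in this excerpt, and the counts in the example are invented for illustration only.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two independent proportions.

    Returns the z statistic and the two-sided p-value under the normal
    approximation with a pooled standard error.
    """
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Invented counts: trips to touristy cells by a hypothetical "tourism"
# segment (300 of 1000 trips) versus all other users (2000 of 10000 trips).
z, p_value = two_proportion_z_test(300, 1000, 2000, 10000)
print(f"z = {z:.2f}, p = {p_value:.3g}")
```

A positive z with a small p-value would indicate a significantly higher proportion for the segment, and a negative z a significantly lower one.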

Acknowledgment

We would like to express our gratitude to the Shanghai Unicom WO+ Open Data Application Contest for making the mobile phone data available for this research. We are grateful to the Yanxishe (a Chinese urban data research organization), who provided us with the POI data extracted from the Gaode Maps service. Thanks also go to the TRAIL research school and the Dutch Organization for Scientific Research (NWO) for sponsoring the first author for his PhD study.

References (61)

  • O. Järv et al.

    Understanding monthly variability in human activity spaces: A twelve-month study using mobile phone call detail records

    Transp. Res. Part C Emerg. Technol.

    (2014)
  • S. Jiang et al.

    Mining point-of-interest data from social networks for urban land use classification and disaggregation

    Comput. Environ. Urban Syst.

    (2015)
  • F. Liu et al.

    Annotating mobile phone location data with activity purposes using machine learning algorithms

    Expert Syst. Appl.

    (2013)
  • C. Morency et al.

    Measuring transit use variability with smart-card data

    Transp. Policy

    (2007)
  • L. Ni et al.

    A spatial econometric model for travel flow analysis and real-world applications with massive mobile phone data

    Transp. Res. Part C Emerg. Technol.

    (2018)
  • T.H. Rashidi et al.

    Exploring the capacity of social media data for modelling travel behaviour: Opportunities and challenges

    Transp. Res. Part C Emerg. Technol.

    (2017)
  • J.L. Toole et al.

    The path most traveled: Travel demand estimation using big data resources

    Transp. Res. Part C Emerg. Technol.

    (2015)
  • F. Wang et al.

    On data processing required to derive mobility patterns from passively-generated mobile phone data

    Transp. Res. Part C Emerg. Technol.

    (2018)
  • Y. Wang et al.

    Using metro smart card data to model location choice of after-work activities: An application to Shanghai

    J. Transp. Geogr.

    (2017)
  • Y. Yue et al.

    Zooming into individuals to understand the collective: A review of trajectory-based travel behaviour studies

    Travel Behav. Soc.

    (2014)
  • S. Zhao et al.

    Observing individual dynamic choices of activity chains from location-based crowdsourced data

    Transp. Res. Part C Emerg. Technol.

    (2017)
  • Z. Zhao et al.

    Individual mobility prediction using transit smart card data

    Transp. Res. Part C Emerg. Technol.

    (2018)
  • A. Abbasi et al.

    Utilising location based social media in travel survey methods

  • R. Ahas et al.

    Using mobile positioning data to model locations meaningful to users of mobile phones

    J. Urban Technol.

    (2010)
  • C. Anda et al.

    Transport modelling in the age of big data

    Int. J. Urban Sci.

    (2017)
  • A. Arai et al.

    Understanding user attributes from calling behavior

  • Balmer, M., Meister, K., Nagel, K., 2008. Agent-based simulation of travel demand: Structure and computational...
  • V.D. Blondel et al.

    A survey of results on mobile phone datasets analysis

    EPJ Data Sci.

    (2015)
  • A. Bwambale et al.

    Modelling trip generation using mobile phone data: a latent demographics approach

    J. Transp. Geogr. In Press.

    (2017)
  • F. Calabrese et al.

    Estimating origin-destination flows using mobile phone location data

    IEEE Pervasive Comput.

    (2011)