Characterizing and Modeling Microblog Traffic in Cellular Data Network Based on Massive Data Analysis

Abstract:

In this paper, we present an approach to characterizing and modeling microblog traffic in cellular data networks. In contrast to previous methods, our approach is built on a cloud computing platform and a cluster system, including the Hadoop Distributed File System (HDFS) and the parallel processing framework MapReduce. Moreover, we focus on comparing the Sina and Tencent microblogs. We analyze the features of microblog traffic in four aspects of increasing detail: (i) the diurnal traffic pattern, (ii) a model of the traffic distribution, (iii) the user distribution, and (iv) the diversity of microblog usage. This comprehensive analysis of microblog traffic is probably the most important contribution of this paper. Furthermore, our approach has two important features. First, the massive mobile subscriber data used in our experiments was collected from a commercial Internet Service Provider (ISP) covering an entire province in Southern China, so the results reflect the real characteristics of microblog traffic in the network. Second, we find that microblog traffic fits a power-law distribution. We demonstrate the effectiveness of our approach on three real datasets. Our results are important for cellular network operators seeking to understand user behavior and to optimize future microblog application designs.
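To make the power-law claim concrete, the following is a minimal sketch of how one might check whether a set of per-user traffic volumes follows a power law. It is not the authors' procedure, which the abstract does not describe. The function names, the continuous maximum-likelihood estimator, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def sample_power_law(n, alpha, x_min, seed=0):
    """Draw n synthetic values from a continuous power law.

    Density is proportional to x ** (-alpha) for x >= x_min.
    The data are purely synthetic (an assumption for illustration),
    not measurements from the paper.
    """
    rng = random.Random(seed)
    # Inverse-transform sampling: x = x_min * (1 - u) ** (-1 / (alpha - 1))
    exponent = -1.0 / (alpha - 1.0)
    return [x_min * (1.0 - rng.random()) ** exponent for _ in range(n)]


def fit_power_law_exponent(values, x_min):
    """Maximum-likelihood estimate of alpha using only values >= x_min."""
    tail = [v for v in values if v >= x_min]
    if not tail:
        raise ValueError("no values at or above x_min")
    log_sum = sum(math.log(v / x_min) for v in tail)
    if log_sum <= 0.0:
        raise ValueError("all tail values equal x_min; exponent undefined")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    # Hypothetical per-user traffic volumes with a known exponent of 2.5.
    volumes = sample_power_law(20000, alpha=2.5, x_min=1.0, seed=1)
    print("estimated alpha:", fit_power_law_exponent(volumes, x_min=1.0))
```

On real traces, the per-user volumes would first have to be aggregated (for example by a MapReduce job keyed on user), and the choice of `x_min` and a goodness-of-fit test would need separate care.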

Info:

Periodical:

Advanced Materials Research (Volumes 926-930)

Pages:

2781-2785

Online since:

May 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

