Skip to main content
Log in

Exploring characteristics of suspended users and network stability on Twitter

  • Original Article
  • Published:
Social Network Analysis and Mining Aims and scope Submit manuscript

Abstract

Social media is rapidly becoming a medium of choice for understanding the cultural pulse of a region; e.g. for identifying what the population is concerned with and what kind of help is needed in a crisis. To assess this cultural pulse, it is critical to have an accurate assessment of who is saying what. Unfortunately, social media is also the home of users who engage in disruptive, disingenuous, and potentially illegal activity. A range of users, both human and non-human, carry out such social cyber-attacks. We ask, to what extent does the presence or absence of such users influence our ability to assess the cultural pulse of a region? Our prior research on this topic showed that Twitter-based network structures and content are unstable and can be highly impacted by the removal of suspended users. Because of this, statistical techniques can be established to differentiate potential types of suspended and non-suspended users. In this extended paper, we develop additional experiments to explore the spatial patterns of suspended users, and we further consider how these users affect structural and content concentrations via the development of new metrics and new analyses. We find significant evidence that suspended users exist on the periphery of social networks on Twitter and consequently that removing them has little impact on network structure. We also improve prior attempts to distinguish among different types of suspended users by using a much larger dataset. Finally, we conduct a temporal sentiment analysis to illustrate differences between suspended users and non-suspended users on this dimension.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21

Similar content being viewed by others

Notes

  1. http://www.politico.com/story/2014/06/twitter-politicians-107672.

References

  • Amleshwaram AA, Reddy N, Yadav S, Gu G, Yang C (2013) CATS: characterizing automation of twitter spammers. In: Communication systems and networks (COMSNETS), 2013 fifth international conference on, IEEE, pp 1–10

  • Anthonisse JM (1971) The rush in a directed graph. Stichting Mathematisch Centrum Mathematische Besliskunde (BN 9/71):1–10

  • Bíró I, Szabó J, Benczúr AA (2008) Latent dirichlet allocation in web spam filtering. In: Proceedings of the 4th international workshop on adversarial information retrieval on the web, ACM, pp 29–32

  • Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022

    MATH  Google Scholar 

  • Bolton RJ, Hand DJ (2002) Statistical fraud detection: a review. Stat Sci 17:235–249

  • Borgatti SP, Carley KM, Krackhardt D (2006) On the robustness of centrality measures under conditions of imperfect data. Soc Netw 28(2):124–136

    Article  Google Scholar 

  • Bosagh Zadeh R, Goel A, Munagala K, Sharma A (2013) On the precision of social and information networks. In: Proceedings of the first ACM conference on Online social networks, pp 63–74

  • Carley KM, Pfeffer J, Morstatter F, Liu H (2014) Embassies burning: toward a near-real-time assessment of social media using geo-temporal dynamic network analytics. Soci Netw Anal Min 4(1):1–23

    Google Scholar 

  • De Lathauwer L, De Moor B, Vandewalle J, by Higher-Order BSS (1994) Singular value decomposition. In: Proceedings of the EUSIPCO-94, Edinburgh, Scotland, UK, vol 1, pp 175–178

  • Diao Q, Qiu M, Wu CY, Smola AJ, Jiang J, Wang C (2014) Jointly modeling aspects, ratings and sentiments for movie recommendation (jmars). In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 193–202

  • Dumais ST (2004) Latent semantic analysis. Ann Rev Inf Sci Technol 38(1):188–230

    Article  Google Scholar 

  • Esuli A, Sebastiani F (2006) Sentiwordnet: a publicly available lexical resource for opinion mining. In: Proceedings of LREC, Citeseer, vol 6, pp 417–422

  • Frantz TL, Cataldo M, Carley KM (2009) Robustness of centrality measures under uncertainty: examining the role of network topology. Comput Math Organ Theory 15(4):303–328

    Article  Google Scholar 

  • Freeman LC (1979) Centrality in social networks conceptual clarification. Soc Netw 1(3):215–239

    Article  Google Scholar 

  • Gelman A (2008) Scaling regression inputs by dividing by two standard deviations. Stat Med 27(15):2865–2873

    Article  MathSciNet  Google Scholar 

  • Golder SA, Macy MW (2011) Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures. Science 333(6051):1878–1881. doi:10.1126/science.1202775, http://www.sciencemag.org/content/333/6051/1878

  • Griffiths TL, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci 101(suppl 1):5228–5235

    Article  Google Scholar 

  • Heise DR (1987) Affect control theory: concepts and model. J Math Sociol 13(1–2):1–33

    Article  MathSciNet  MATH  Google Scholar 

  • Hern A (2015) Twitter CEO: we suck at dealing with trolls and abuse. http://www.theguardian.com/technology/2015/feb/05/twitter-ceo-we-suck-dealing-with-trolls-abuse

  • Hong L, Ahmed A, Gurumurthy S, Smola AJ, Tsioutsiouliklis K (2012) Discovering geographical topics in the twitter stream. In: Proceedings of the 21st international conference on world wide web, ACM, pp 769–778

  • Hong L, Davison BD (2010) Empirical study of topic modeling in twitter. In: Proceedings of the First Workshop on Social Media Analytics, ACM, pp 80–88

  • Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Eighth international AAAI conference on weblogs and social media

  • Jordan MI (1998) Learning in Graphical Models: [proceedings of the NATO Advanced Study Institute...: Ettore Mairona Center, Erice, Italy, September 27-October 7, 1996], vol 89. Springer Science & Business Media

  • Joseph K, Carley KM (2015) Culture, networks, twitter and foursquare: testing a model of cultural conversion with social media data. In: Proceedings of the 7th international AAAI conference on weblogs and social media (ICWSM)

  • Joseph K, Tan CH, Carley KM (2012) Beyond local, categories and friends: clustering foursquare users with latent topics. In: Proceedings of the 2012 ACM conference on ubiquitous computing, ACM, pp 919–926

  • Le QV, Mikolov T (2014) Distributed representations of sentences and documents. arXiv preprint arXiv:1405.4053

  • Lim KH, Datta A (2013) A topological approach for detecting twitter communities with common interests. In: Atzmueller M, Chin A, Helic D, Hotho A (eds) Ubiquitous social media analysis. Springer, Berlin Heidelberg, pp 23–43

  • Lin C, He Y (2009) Joint sentiment/topic model for sentiment analysis. In: Proceedings of the 18th ACM conference on information and knowledge management, ACM, pp 375–384

  • Liu B (2012) Sentiment analysis and opinion mining. Synth Lect Hum Lang Technol 5(1):1–167

    Article  Google Scholar 

  • Luxton DD, June JD, Fairall JM (2012) Social media and suicide: a public health perspective. Am J Public Health 102(S2):S195–S200. doi:10.2105/AJPH.2011.300608

    Article  Google Scholar 

  • Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781

  • Miller Z, Dickinson B, Deitrick W, Hu W, Wang AH (2014) Twitter spammer detection using data stream clustering. Inf Sci 260:64–73

    Article  Google Scholar 

  • Moh TS, Murmann AJ (2010) Can you judge a man by his friends?-enhancing spammer detection on the twitter microblogging platform using friends and followers. In: Information systems, technology and management. Springer, pp 210–220

  • Monmarché N, Slimane M, Venturini G (1999) Antclass: discovery of clusters in numeric data by an hybridization of an ant colony with the kmeans algorithm

  • Newman ME (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582

    Article  Google Scholar 

  • Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. In: LREC, vol 10, pp 1320–1326

  • Pang B, Lee L, Vaithyanathan S (2002) Thumbs up?: Sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on empirical methods in natural language processing-volume 10, association for computational linguistics, pp 79–86

  • Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830

    MathSciNet  MATH  Google Scholar 

  • Pennebaker JW, Booth RJ, Francis ME (2007) Linguistic inquiry and word count: Liwc. Liwc net, Austin

    Google Scholar 

  • Ratkiewicz J, Conover M, Meiss M, Gonçalves B, Flammini A, Menczer F (2011) Detecting and tracking political abuse in social media. In: ICWSM

  • Reynolds D (2009) Gaussian mixture models. In: Encyclopedia of biometrics. Springer, pp 659–663

  • Romero DM, Tan C, Kleinberg J (2013) On the interplay between social and topical structure. In: Proceedings of the 7th International AAAI Conference on weblogs and social media (ICWSM)

  • Santos I, Miambres-Marcos I, Laorden C, Galn-Garca P, Santamara-Ibirika A, Bringas PG (2014) Twitter content-based spam filtering. In: International joint conference SOCO13-CISIS13-ICEUTE13. Springer, pp 449–458

  • Thomas K, Grier C, Song D, Paxson V (2011) Suspended accounts in retrospect: an analysis of twitter spam. In: Proceedings of the 2011 ACM SIGCOMM conference on internet measurement conference, ACM, pp 243–258

  • Thomas K, McCoy D, Grier C, Kolcz A, Paxson V (2013) Trafficking fraudulent accounts: the role of the underground market in twitter spam and abuse. Presented as part of the 22nd USENIX security symposium (USENIX Security 13). USENIX, Washington, D.C., pp 195–210

  • Titov I, McDonald RT (2008) A joint model of text and aspect ratings for sentiment summarization. In: ACL, Citeseer, vol. 8, pp 308–316

  • Wang AH (2010) Don’t follow me: spam detection in twitter. In: Security and cryptography (SECRYPT), proceedings of the 2010 international conference on, IEEE, pp 1–10

  • Wang C, Blei DM (2011) Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 448–456

  • Wei W, Carley K (2014) Real time closeness and betweenness centrality calculations on streaming network data.

  • Wei W, Carley KM (2015) Measuring temporal patterns in dynamic social networks. ACM Trans Knowl Discov Data (TKDD) 10(1):1–27. doi:10.1145/2749465

  • Wei W, Joseph K, Liu H, Carley KM (2015a) The fragility of twitter social networks against suspended users. In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining 2015, ACM, pp 9–16

  • Wei W, Joseph K, Lo W, Carley KM (2015b) A bayesian graphical model to discover latent events from twitter. In: Ninth international AAAI conference on web and social media

  • Wei W, Pfeffer J, Reminga J, Carley KM (2011) Handling weighted, asymmetric, self-looped, and disconnected networks in ora. Tech. rep., DTIC Document

  • Xia P, Jiang H, Wang X, Chen C, Liu B (2014) Predicting user replying behavior on a large online dating site. In: Proceedings of 8th international AAAI conference on weblogs and social media

  • Xia P, Liu B, Sun Y, Chen C (2015) Reciprocal recommendation system for online dating. arXiv preprint arXiv:150106247

  • Xie Y, Yu F, Achan K, Panigrahy R, Hulten G, Osipkov I (2008) Spamming botnets: signatures and characteristics. In: ACM SIGCOMM computer communication review, ACM 38:171–182

  • Xu R, Wunsch D et al (2005) Survey of clustering algorithms. Neural Netw IEEE Trans 16(3):645–678

    Article  Google Scholar 

  • Yin J, Ho Q, Xing EP (2013) A scalable approach to probabilistic latent space inference of large-scale networks. In: Advances in neural information processing systems, pp 422–430

  • Yuan J, Zheng Y, Xie X (2012) Discovering regions of different functions in a city using human mobility and pois. In: Proceedings of the 18th ACM SIGKDD international conference on kowledge discovery and data mining, ACM, pp 186–194

Download references

Acknowledgments

This work was supported in part by the Office of Naval Research (ONR) through a MURI N000140811186 on adversarial reasoning, DTRA HDTRA11010102, by the Department of Defense under the MINERVA initiative through the ONR N000141310835 on Multi-Source Assessment of State Stability, and by Center for Computational Analysis of Social and Organization Systems (CASOS). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Office of Naval Research, the Department of Defense, or the United States Government.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wei Wei.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wei, W., Joseph, K., Liu, H. et al. Exploring characteristics of suspended users and network stability on Twitter. Soc. Netw. Anal. Min. 6, 51 (2016). https://doi.org/10.1007/s13278-016-0358-5

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s13278-016-0358-5

Keywords

Navigation