Abstract
Increased data-gathering capacity, together with the spread of data analytics techniques, has prompted an unprecedented concentration of information related to individuals’ preferences in the hands of a few gatekeepers. In the present paper, we show how platforms’ performance still appears astonishing in relation to some unexplored properties of data and networks, which are capable of enhancing the platforms’ capacity to implement steering practices by means of an increased ability to estimate individuals’ preferences. To this end, we rely on network science, whose analytical tools allow data representations capable of highlighting relationships between subjects and/or items, and thus of extracting a great amount of information. We therefore propose a measure called Network Information Patrimony, which considers the amount of information available within the system, and we look into how platforms could exploit data stemming from connected profiles within a network with a view to obtaining competitive advantages. Our measure takes into account the quality of the connections among nodes, such as that of a hypothetical user in relation to its neighbourhood, detecting how users with a good neighbourhood (hence with a superior set of connections) obtain better information. We tested our measure on Amazon instances, obtaining evidence that confirms the relevance of information extracted from nodes’ neighbourhoods in order to steer targeted users.
Notes
In the light of the existing literature, the value of a network can be inferred from Metcalfe’s Law (Gilder 1993), which defines the network value as proportional to the square of its size. Some other laws have been proposed: Sarnoff’s and Reed’s laws (Reed 1999) and Odlyzko’s law (Briscoe et al. 2006). However, despite its simplicity and some of its limits (Swann 2002; Briscoe et al. 2006), Metcalfe’s law remains a reliable tool (Madureira et al. 2013; Van Hove 2016), which has been used, for example, to estimate Facebook’s network value (Metcalfe 2013; Zhang et al. 2015).
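To make the difference between these laws concrete, the following minimal sketch compares how the four scaling rules grow with the number of users n. The function name and the choice of setting every proportionality constant to 1 are assumptions made for illustration only; the functional forms are the ones usually attributed to each law (linear for Sarnoff, \(n \log n\) for Odlyzko, \(n^2\) for Metcalfe, \(2^n\) for Reed).

```python
import math


def network_value_laws(n: int) -> dict[str, float]:
    """Return unnormalised network values for n users under four scaling laws.

    All proportionality constants are set to 1 (an illustrative assumption).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return {
        "sarnoff": float(n),          # value grows linearly with the audience
        "odlyzko": n * math.log(n),   # n log n, as argued by Briscoe et al. (2006)
        "metcalfe": float(n ** 2),    # square of the network size
        "reed": float(2 ** n),        # exponential in the network size
    }


# Compare the four laws for a small and a larger network.
for size in (10, 100):
    print(size, network_value_laws(size))
```

Only the relative growth rates are meaningful here; the absolute numbers depend on constants that the laws themselves leave unspecified.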
A collaborative filtering algorithm implements a recommender system (Lü et al. 2012) in order to offer goods or services to a user: starting from the analysis of his/her behaviours (and those of similar users), it ends up advising what the same consumer might find useful, on the basis of the expressed preferences (Resnick and Varian 1997; Sarwar et al. 2001; Breese et al. 2013) and of the choices made by similar consumers, which allows them to be clustered (Xue et al. 2005). Relying on similar forms of user profiling techniques (such as Collaborative Filtering methods) may indeed allow an efficient matching of people and relevant purchase opportunities (Levin 2011), although it could also bring about some distortions and disequilibria, up to market failures (Gertz 2002). For a different take on the functioning of digital markets, see Fuller (2019).
Amazon’s profiling technique, its so-called item-to-item collaborative filtering, does not focus on grouping similar customers on the basis of their individually analysed behaviours, but rather on clustering them via the similarities between the items that they have chosen: on the basis of a user’s purchased and/or rated items, the algorithm attempts to find similar items not yet chosen, then aggregates them in order to come up with purchase recommendations (Linden et al. 2003).
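The item-to-item logic described in this note can be sketched as follows. This is only an illustrative toy, not Amazon’s implementation: the purchase data, the function names and the use of cosine similarity between binary purchase vectors are assumptions introduced here.

```python
import math
from collections import defaultdict

# Hypothetical purchase histories: user -> set of purchased items.
purchases = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "desk"},
    "u3": {"book", "desk"},
    "u4": {"pen"},
}


def item_buyers(purchases):
    """Invert the user -> items map into an item -> set of buyers map."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    return buyers


def cosine(a, b):
    """Cosine similarity between two binary item vectors given as buyer sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(user, purchases, top_n=3):
    """Score items not yet chosen by their similarity to the user's own items."""
    buyers = item_buyers(purchases)
    owned = purchases[user]
    scores = defaultdict(float)
    for candidate in buyers:
        if candidate in owned:
            continue
        # Aggregate the similarity of the candidate to every owned item.
        for item in owned:
            scores[candidate] += cosine(buyers[candidate], buyers[item])
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


print(recommend("u1", purchases))
```

Note that similarity is computed between items (through their sets of buyers), not between customers, which is the distinguishing feature highlighted above.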
Collaborative filtering offers consumption prediction based on the customers’ purchase history (Linden et al. 2003).
In this respect, it is worth recalling that the economic literature identifies three types of price discrimination: first-degree price discrimination, which occurs when a seller is in a position to charge a different price to each buyer (personalised pricing); second-degree price discrimination, which occurs when a consumer self-selects by choosing the specific package, whose per-unit price depends on the amount purchased, that best fits his/her needs (think about the Netflix package that each individual is free to choose according to his/her specific needs); and third-degree price discrimination, which arises when sellers propose different prices to different socio-demographic groups (Cabral 2000; Acquisti 2008; Arpetti 2018).
Such goods could be perfect substitutes, and an algorithm could decide to show only a part of them, proposing to the consumer only the goods that reflect the Willingness To Pay (WTP) identified for that profiled user.
The two best-known classification and clustering algorithms used by CF are, respectively, the k-nearest neighbour algorithm (k-nn) (not to be confused with the “Nearest Neighbour Degree”, \(k_{nn}\), which originates from the network science field and which we will use later in this paper) and k-means clustering (Sarwar et al. 2001; Mobasher et al. 2001). The k-nn algorithm proposes a classification by measuring the distance between similar items or users in order to suggest which of them are closest to the targeted user (in terms of the items already purchased, or in terms of users brought together by similar purchase histories). k-means algorithms, instead, are designed to cluster items (and thus individuals, on the basis of their choices and purchase histories, when they are “user-item”): they suggest to the targeted subject the items surrounding those he/she has already purchased, thus defining the centre of the temporary cluster and suggesting the closest nodes (Paterek 2007; Katarya and Verma 2016).
It is worth specifying that the term social network has a double meaning: it may refer both to a social structure in which actors interact through a web platform and to the scientific discipline that measures actor interaction on a network as a mathematical object.
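As an illustration of the k-nn step only, the sketch below finds the k users whose purchase histories are closest to those of a targeted user and suggests the items they bought. The data, the function names and the choice of Jaccard similarity as the closeness measure are assumptions made for this example; other distance measures are equally possible.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two purchase histories."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_users(target, histories, k=2):
    """Return the k users most similar to the target (ties broken by name)."""
    others = [
        (jaccard(histories[target], items), user)
        for user, items in histories.items()
        if user != target
    ]
    others.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in others[:k]]


def suggest(target, histories, k=2):
    """Suggest items bought by the k nearest users but not by the target."""
    suggestions = set()
    for user in nearest_users(target, histories, k):
        suggestions |= histories[user] - histories[target]
    return sorted(suggestions)


# Hypothetical purchase histories.
histories = {
    "ann": {"book", "lamp"},
    "bob": {"book", "lamp", "desk"},
    "cat": {"book", "pen"},
    "dan": {"sofa"},
}

print(nearest_users("ann", histories))
print(suggest("ann", histories))
```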
In order, of course, to suggest them to the platform’s users.
i.e. behavioural biases to which consumers are generally subject.
i.e. purchase and internet history, GPS data stemming from mobile devices, etc.
In this regard, the UK Competition and Markets Authority stated that: «Firms could also seek to discriminate between customers using competitive variables other than price. [...] The collection of consumer data may enable firms to make judgements about the lowest level of quality needed by consumers/groups of similar consumers. This may enable a firm to engage in quality discrimination where quality differences are not reflected in the prices of goods or services. Firms may do this by restricting the products that are displayed to consumers or by varying the order in which products are listed on their website to display relatively poorer or better quality products first depending on the information they collect about consumers» (Competition and Markets Authority—CMA 2015, p. 93).
A platform should be able to estimate each consumer’s reservation price, a variable that is not directly observable but can only be inferred. Moreover, not all platforms have the means to constantly observe every move made online by each consumer, while none of them can estimate consumers’ WTP for each item. As stated by Ezrachi and Stucke in “Virtual Competition”: «One impediment to perfect discrimination is insufficient data. Although the algorithm has a lot more personal data than the brick-and-mortar retailer of twenty years ago, the algorithm still has insufficient data for any particular customer: the customer may never have bought the item before; and the customer’s behavior may never have signaled how much he or she is willing to spend. To accurately predict an individual’s reservation price would require sufficient data to identify and measure each of many variables that affect the reservation price» (Ezrachi and Stucke 2016b, pp. 96–97).
In order to implement a perfect price discrimination practice, it would be necessary to predict how individuals are subject to biases and heuristics, which can change their preferences and affect their reservation prices: an algorithm whose predictions are based on pre-existing reservation prices cannot ignore how such elements could change (Arrow 1958; Kahneman and Tversky 1986; Simon 1955, 1990).
An algorithm must have an adequate amount of data for a perfect price discrimination practice to be performed. To this end, it is presumable that algorithms have the necessary data sources to infer information, and hence to make predictions, on individuals’ purchase preferences for products routinely bought. However, as pointed out by Ezrachi and Stucke, platforms do not have sufficient data about sporadic purchases: when a purchase is not cyclically performed, it is more complex to elaborate forecasts on individual behaviour, because there is not a sufficient amount of «trial-and-error» data to detect every variable needed to work out the reservation price of each individual for a corresponding good [think, for instance, about the rate at which individuals buy a PC screen (Ezrachi and Stucke 2016b, p. 99)].
In order for this to happen, segmentation strategies and individuals’ clustering practices are put in place to include each individual in a specific group of consumers who share similar preferences and price sensitivity, assuming that grouping persons with analogous tastes and purchase behaviours in small clusters allows a better approximation of their reservation prices (Ezrachi and Stucke 2016a).
Nevertheless, the same report further explains how much more complex it is to infer information about Internet users when one has access to their data (such as the user’s IP address or the kind of operating system) but is unable to define their willingness to pay.
To be understood as the possibility of deciding what to show to individuals and of guiding their choices accordingly.
In such a case the network can be generated by exploiting analytical methodologies (Zhou et al. 2014).
Therefore, each of the n nodes is connected to the other \(n-1\) nodes, and thus the value is proportional to \(n^2\).
In fact, in this case each node has degree \(d_i = n-1\), hence \(IP_i = d_i/2m = (n-1)/[n(n-1)] = 1/n\), while \(\langle d^2 \rangle = (n-1)^2\) and \(\langle d \rangle = n-1\), thus \(NIP_i = (1/n) (1 + (n-1)^2/(n-1)) = 1\).
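The arithmetic in this note can be checked numerically. The sketch below evaluates the expression exactly as it is written here, \(NIP_i = IP_i\,(1 + \langle d^2 \rangle / \langle d \rangle)\) with \(IP_i = d_i/2m\), for complete networks of different sizes; the function name is an assumption, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def nip_complete_network(n: int) -> Fraction:
    """Evaluate IP_i * (1 + <d^2>/<d>) for any node of a complete network."""
    if n < 2:
        raise ValueError("a complete network needs at least two nodes")
    degrees = [n - 1] * n                  # every node is linked to the other n-1
    two_m = sum(degrees)                   # 2m = n(n-1), twice the number of links
    ip = Fraction(degrees[0], two_m)       # IP_i = d_i / 2m = 1/n
    mean_d = Fraction(sum(degrees), n)     # <d> = n-1
    mean_d2 = Fraction(sum(d * d for d in degrees), n)  # <d^2> = (n-1)^2
    return ip * (1 + mean_d2 / mean_d)


for n in (2, 5, 50):
    print(n, nip_complete_network(n))
```

For every size tested the result is exactly 1, in agreement with the derivation above.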
For instance, when neutral with respect to degree correlation, it does not present assortative mixing (\(r = 0\))
The degree centrality of a node is merely its degree, i.e. the number of links connected to it. In the Social Network Analysis literature it is emphasised as a measure since it is meaningful in many circumstances. For an extensive discussion of this measure and many other centrality measures, please refer to Scott and Carrington (2011) or Newman (2018).
For the sake of completeness, the computational complexity of our measures depends on the calculation of \(k_{nn}\), which corresponds to the product of a square matrix of size n by a vector of size n. It is well known that this calculation is polynomial in the size of the matrix, i.e. it is \(O(n^2)\).
The density of a network is defined as the number of its actual links divided by the number of potential links, which is \(n(n-1)/2\). It can assume values from 0 (empty network) to 1 (complete network).
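The matrix-by-vector product mentioned in this note can be sketched as follows, assuming the standard definition of the nearest neighbour degree for an undirected, unweighted network, \(k_{nn,i} = (1/d_i)\sum_j A_{ij} d_j\), where A is the adjacency matrix. The function names and the toy star network are illustrative assumptions.

```python
def degrees(adj):
    """Degree of each node: the row sums of the adjacency matrix."""
    return [sum(row) for row in adj]


def knn(adj):
    """Average degree of each node's neighbours, via one matrix-vector product."""
    d = degrees(adj)
    n = len(adj)
    result = []
    for i in range(n):
        # Row i of A times the degree vector gives the sum of the
        # neighbours' degrees; the double loop over i and j is O(n^2).
        neighbour_degree_sum = sum(adj[i][j] * d[j] for j in range(n))
        result.append(neighbour_degree_sum / d[i] if d[i] else 0.0)
    return result


# Star network: node 0 is the hub, nodes 1-3 are leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

print(knn(star))
```

In the star, the hub sees only neighbours of degree 1, while each leaf sees a neighbour of degree 3: a small illustration of how a node’s neighbourhood can be better connected than the node itself.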
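As a small worked example of this definition, the sketch below computes the density of an undirected network from its number of nodes and links; the function name and the argument checks are assumptions added for illustration.

```python
def density(n_nodes: int, n_links: int) -> float:
    """Actual links divided by the n(n-1)/2 potential links."""
    if n_nodes < 2:
        raise ValueError("density needs at least two nodes")
    potential = n_nodes * (n_nodes - 1) / 2
    if not 0 <= n_links <= potential:
        raise ValueError("number of links outside the feasible range")
    return n_links / potential


print(density(4, 0))  # empty network
print(density(4, 3))  # e.g. a star on four nodes
print(density(4, 6))  # complete network
```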
References
Acquisti, A. (2008). Identity management, privacy, and price discrimination. IEEE Security and Privacy, 6(2), 46–50.
Acquisti, A., Taylor, C., & Wagman, L. (2016). The economics of privacy. Journal of Economic Literature, 54(2), 442–492.
Akerlof, G. A. (1970). The market for “Lemons” quality uncertainty and the market mechanism. The Quarterly Journal of Economics, 84(3), 488–500.
Arpetti, J. (2018). Economia della privacy: Una rassegna della letteratura (in Italian). Rivista di diritto dei media, 2, 267–297.
Arrow, K. J. (1958). Utilities, attitudes, choices: A review note. Econometrica: Journal of the Econometric Society, 26, 1–23.
Bakshy, E., Rosenn, I., Marlow, C., & Adamic, L. (2012). The role of social networks in information diffusion. In Proceedings of the 21st international conference on World Wide Web, (pp. 519–528). ACM
Barabási, A. L. (2013). Network science. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 371(1987), 20120375.
Barabási, A. L. (2016). Network science. Cambridge: Cambridge University Press.
Birke, D. (2013). Social networks and their economics: Influencing consumer choice. Chichester: Wiley.
Bollobás, B. (2013). Modern graph theory (Vol. 184). New York: Springer Science & Business Media.
Breese, J. S., Heckerman, D., & Kadie, C. (2013). Empirical analysis of predictive algorithms for collaborative filtering. Tech. rep. Microsoft Research.
Briscoe, B., Odlyzko, A., & Tilly, B. (2006). Metcalfe’s law is wrong. IEEE Spectrum, 43(7), 34–39.
Cabral, L. M. B. (2000). Introduction to industrial organization. Cambridge: MIT Press.
Castillejo, E., Almeida, A., & López-de Ipina, D. (2012). Social network analysis applied to recommendation systems: alleviating the cold-user problem. In International Conference on Ubiquitous Computing and Ambient Intelligence, Springer, (pp. 306–313)
Catanzaro, M., Boguñá, M., & Pastor-Satorras, R. (2005). Generation of uncorrelated random scale-free networks. Physical Review E, 71(2), 027103.
Cerqueti, R., Ferraro, G., & Iovanella, A. (2018a). A new measure for community structures through indirect social connections. Expert Systems with Applications, 114, 196–209.
Cerqueti, R., Rotundo, G., & Ausloos, M. (2018b). Investigating the configurations in cross-shareholding: A joint copula-entropy approach. Entropy, 20(2), 134.
Competition and Markets Authority–CMA. (2015). The commercial use of consumer data: Report on the CMA’s call for information. Tech. rep. Competition and Markets Authority.
Council of Economic Advisers–CEA. (2015). Big data and differential pricing. Tech. rep. Council of Economic Advisers (CEA), Executive Office of the President of the United States.
Csardi, G., Nepusz, T., et al. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695(5), 1–9.
D’Agostino, G., Scala, A., Zlatić, V., & Caldarelli, G. (2012). Robustness and assortativity for diffusion-like processes in scale-free networks. EPL (Europhysics Letters), 97(6), 68006.
Erdős, P., & Gallai, T. (1960). Graphs with prescribed degrees of vertices (in Hungarian). Matematikai Lapok, 11, 265–274.
Ezrachi, A., & Stucke, M. E. (2016a). The rise of behavioural discrimination. European Competition Law Review, ECLR, 37(12), 485–492.
Ezrachi, A., & Stucke, M. E. (2016b). Virtual competition: The promise and perils of the algorithm-driven economy. Cambridge: Harvard University Press.
Feld, S. L. (1991). Why your friends have more friends than you do. American Journal of Sociology, 96(6), 1464–1477.
Firdaus, S., & Uddin, M. A. (2015). A survey on clustering algorithms and complexity analysis. International Journal of Computer Science Issues, 12(2), 62–85.
Fuller, C. S. (2019). Is the market for digital privacy a failure? Public Choice.
Fuller, C. S. (2018). Privacy law as price control. European Journal of Law and Economics, 45(2), 225–250.
Galati, F., Bigliardi, B., Petroni, A., Petroni, G., & Ferraro, G. (2019). A framework for avoiding knowledge leakage: Evidence from engineering to order firms. Knowledge Management Research & Practice, 17(3), 340–352.
Gertz, J. D. (2002). The purloined personality: Consumer profiling in financial services. San Diego L Rev, 39, 943.
Gilder, G. (1993). Metcalfe’s law and legacy. Forbes ASAP, 13, 1993.
Hakimi, S. L. (1962). On realizability of a set of integers as degrees of the vertices of a linear graph. Journal of the Society for Industrial and Applied Mathematics, 10(3), 496–506.
Hannak, A., Soeller, G., Lazer, D., Mislove, A., & Wilson, C. (2014). Measuring Price Discrimination and Steering on E-commerce Web Sites. In Proceedings of the 2014 conference on internet measurement conference–IMC ’14 (pp. 305–318). New York: ACM Press
Jentzsch, N. (2017). Secondary use of personal data: A welfare analysis. European Journal of Law and Economics, 44(1), 165–192.
Kahneman, D., & Tversky, A. (1986). Rational choice and the framing of decisions. Journal of Business, 59(4), 251–278.
Kamishima, T., & Akaho, S. (2011). Personalized pricing recommender system. In Proceedings of the 2nd international workshop on information heterogeneity and fusion in recommender systems–HetRec ’11 (pp. 57–64). New York: ACM Press
Katarya, R., & Verma, O. P. (2016). A collaborative recommender system enhanced with particle swarm optimization technique. Multimedia Tools and Applications, 75(15), 9225–9239.
Konstas, I., Stathopoulos, V., & Jose, J. M. (2009). On social networks and collaborative recommendation. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (pp. 195–202). ACM
Krämer, A., & Kalka, R. (2017). How digital disruption changes pricing strategies and price models. In Phantom Ex Machina (pp. 87–103). Springer
Kshetri, N. (2014). Big data’s impact on privacy, security and consumer welfare. Telecommunications Policy, 38(11), 1134–1145.
Lam, C. P., & Goeksel, M. (2010). System and method for utilizing social networks for collaborative filtering. US Patent 7,689,452
Leskovec, J., Adamic, L. A., & Huberman, B. A. (2007). The dynamics of viral marketing. ACM Transactions on the Web (TWEB), 1(1), 1–39.
Levin, J. (2011). The economics of internet markets. Tech. rep. National Bureau of Economic Research, Cambridge, MA.
Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76–80.
Liu, F., & Lee, H. J. (2010). Use of social network information to enhance collaborative filtering performance. Expert Systems with Applications, 37(7), 4772–4778.
Lü, L., Medo, M., Yeung, C. H., Zhang, Y. C., Zhang, Z. K., & Zhou, T. (2012). Recommender systems. Physics Reports, 519(1), 1–49.
Lu, J., Wu, D., Mao, M., Wang, W., & Zhang, G. (2015). Recommender system application developments: A survey. Decision Support Systems, 74, 12–32.
Madureira, A., den Hartog, F., Bouwman, H., & Baken, N. (2013). Empirical validation of Metcalfe’s law: How internet usage patterns have changed over time. Information Economics and Policy, 25(4), 246–256.
Mattioli, D. (2012). On Orbitz, Mac users steered to pricier hotels. Wall Street Journal, 23, 2012.
Mavlanova, T., Benbunan-Fich, R., & Koufaris, M. (2012). Signaling theory and information asymmetry in online commerce. Information & Management, 49(5), 240–247.
Metcalfe, B. (2013). Metcalfe’s law after 40 years of ethernet. Computer, 46(12), 26–31.
Mikians, J., Gyarmati, L., Erramilli, V., & Laoutaris, N. (2012). Detecting price and search discrimination on the internet. In Proceedings of the 11th ACM workshop on hot topics in networks (pp. 79–84). ACM
Mobasher, B., Dai, H., Luo, T., & Nakagawa, M. (2001). Improving the effectiveness of collaborative filtering on anonymous web usage data. In Proceedings of the IJCAI 2001 workshop on intelligent techniques for web personalization (ITWP01) (pp. 53–61).
Newman, M. E. (2002). Assortative mixing in networks. Physical Review Letters, 89(20), 208701.
Newman, M. E. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256.
Newman, M. (2018). Networks. Oxford: Oxford University Press.
Nguyen, A. T., Denos, N., & Berrut, C. (2007). Improving new user recommendations with rule-based induction on cold user data. In Proceedings of the 2007 ACM conference on Recommender systems (pp. 121–128). ACM
Pagallo, U. (2014). Il diritto nell’età dell’informazione: il riposizionamento tecnologico degli ordinamenti giuridici tra complessità sociale, lotta per il potere e tutela dei diritti (in Italian). G. Giappichelli
Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Tech. rep. Stanford InfoLab
Pastor-Satorras, R., Vázquez, A., & Vespignani, A. (2001). Dynamical and correlation properties of the internet. Physical Review Letters, 87(25), 258701.
Paterek, A. (2007). Improving regularized singular value decomposition for collaborative filtering. Proceedings of KDD Cup and Workshop, 2007, 5–8.
Peel, L., Larremore, D. B., & Clauset, A. (2017). The ground truth about metadata and community detection in networks. Science Advances, 3(5), e1602548.
Reed, D. P. (1999). That sneaky exponential–beyond Metcalfe’s law to the power of community building. Context Magazine, 2(1).
Regner, T., & Riener, G. (2017). Privacy is precious: On the attempt to lift anonymity on the internet to increase revenue. Journal of Economics & Management Strategy, 26(2), 318–336.
Resnick, P., & Varian, H. R. (1997). Recommender systems. Communications of the ACM, 40(3), 56–58.
Rotundo, G., & D’Arcangelis, A. M. (2014). Network of companies: An analysis of market concentration in the Italian stock market. Quality & Quantity, 48(4), 1893–1910.
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web (pp. 285–295). ACM
Scott, J., & Carrington, P. J. (2011). The SAGE handbook of social network analysis. Thousand Oaks: SAGE Publications.
Shiller, B. R. (2014). First-degree price discrimination using big data. Tech. rep. Brandeis University.
Shiller, B. R. (2015). Approximating reservation prices from broad consumer tracking. Department of Economics, Brandeis University.
Simon, H. A. (1990). Bounded rationality. In Utility and probability (pp. 15–18). Springer
Simon, H. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69(1), 99–118.
Swann, G. P. (2002). The functional form of network effects. Information Economics and Policy, 14(3), 417–429.
R Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
The Economist. (2010). Clicking for gold: How internet companies profit from data on the web. The Economist, A Special Report on Managing Information.
Tsai, J. Y., Egelman, S., Cranor, L., & Acquisti, A. (2011). The effect of online privacy information on purchasing behavior: An experimental study. Information Systems Research, 22(2), 254–268.
Van Hove, L. (2016). Testing Metcalfe’s law: Pitfalls and possibilities. Information Economics and Policy, 37, 67–76.
Wang, X. F., & Chen, G. (2003). Complex networks: Small-world, scale-free and beyond. IEEE Circuits and Systems Magazine, 3(1), 6–20.
Xu, R., & Wunsch, D. C. (2005). Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3), 645–678.
Xue, G. R., Lin, C., Yang, Q., Xi, W., Zeng, H. J., Yu, Y., & Chen, Z. (2005). Scalable collaborative filtering using cluster-based smoothing. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 114–121). ACM
Zhang, X. Z., Liu, J. J., & Xu, Z. W. (2015). Tencent and Facebook data validate Metcalfe’s law. Journal of Computer Science and Technology, 30(2), 246–251.
Zhao, Q., Zhang, Y., Zhang, Y., & Friedman, D. (2016). Recommendation based on multiproduct utility maximization. Tech. rep. WZB Discussion Paper
Zhou, W., Duan, W., & Piramuthu, S. (2014). A social network matrix for implicit and explicit social network plates. Decision Support Systems, 68, 89–97.
Acknowledgements
We would like to thank the anonymous reviewers for all their useful suggestions, as they helped us improve the quality of our paper.
Cite this article
Arpetti, J., Iovanella, A. Towards more effective consumer steering via network analysis. Eur J Law Econ 50, 359–380 (2020). https://doi.org/10.1007/s10657-019-09637-2
DOI: https://doi.org/10.1007/s10657-019-09637-2