Design of data science algorithms and techniques, central to the Internet and on-line media, needs to be revolutionized. Current designs ignore participants’ strategic incentives. We are establishing an entirely new repertoire of incentive-compatible data science algorithms and techniques, obtained through pioneering the application of game-theoretic mechanism design for data science. Game theory is the branch of mathematics that models multi-agent interactions. Mechanism design is the part of game theory that designs protocols/algorithms for environments consisting of self-motivated participants. Mechanism design has been central in bridging computer science and game theory; including wide application to electronic commerce, advertising and routing networks. At the same time, data science is flowering, with major applications in search and information retrieval, on-line recommendation systems, clustering and segmentation, and social networks analysis. Quite surprisingly, although the incentives of publishers/firms/customers in such data science contexts are of great importance, mechanism design in the related settings has been almost completely neglected. Our group aims is creating theoretical foundations and developing algorithms, validated through experiments, to build a fundamental bridge between mechanism design and data science.
Adversarial Information Retrieval
Search is probably the most powerful tool used in on-line media. Search algorithms are based on estimating, and then presenting, content most relevant to a query. The current approach today does not account for the strategic nature of publishers who wish to promote their own content. In fact, publishers’ strategies are the documents they write, and the publishers’ payoffs are determined by their documents’ ranking for different queries. Our major observation is that classical information retrieval (IR) approaches fail in adversarial settings, where publishers act as search engine optimizers (SEOs) who optimize their utility. It can be shown that in equilibrium publishers tend to write on similar (popular) subjects, ignoring other topics, decreasing users’ social welfare compared to the case where publishers are not strategic. Our group develops game-theoretic studies of IR, with the aim of creating relevance ranking algorithms that address strategic SEO settings, yielding significantly higher expected social welfare.
Machine Learning in Competition
Fundamental tools provided by data analysts are machine learning (ML) algorithms, such as clustering and regression. In on-line media, clustering users into segments is a common way to present users with targeted promotions and services. Similarly, regression algorithms are central to the task of prediction. However, the above ML tasks are typically done by interested competing parties. For example, one may wish to cluster users such that a service will be selected by most users despite competing offers, and one may like to offer predictions that in most cases will outperform those provided by competitors. The competing ML setup has been completely neglected in current clustering and regression algorithms. Our group pioneers the study of competing ML, focusing on the tasks of segmentation/clustering and regression.
Incentive Compatible Explore and Exploit
The wisdom-of-the-crowd is one of the most appealing concepts that emerged in the on-line economy. The idea is that by learning from each other’s actions, we may be able to act extremely efficiently as a group over time. For my daily commute to work I may know the a-priori distribution of traffic on the two possible main roads I may take, but I do not know the current instantiation. Ideally I would explore the different routes first and then exploit the better one and indeed; this explore & exploit technique is what on-line recommendation systems implement. When some information is unknown or uncertain, be it a new type of service or a known type of service which has not been used recently, a mediator which offers on-line recommendations would need to recommend a-priori suboptimal actions from time to time. This way it can aggregate desired information and optimize the social welfare of the population over time. Evidently, such local exploration is unavoidable. Until recently such on-line learning has ignored self-motivated agents who may definitely decline or deviate from a recommendation they expect to be non-optimal given their knowledge of the system. In fact, the simple mediator which behaves as if he is in control of the explore & exploit process may fail to satisfy the following incentive constraint: a recommended action should be optimal from the agent’s perspective. A major aim of our group is finding a provably incentive-compatible approximately-optimal explore & exploit algorithm.
Incentive-Compatible diffusion and strategic communication
Social networks have become the main source of on-line user interaction. An important feature of social networks is that they embody natural and intuitive social choice within a group. That is, whenever one decides to follow a user in Tweeter, he in fact endorses/votes for her. The same logic applies to e.g. ranking pages based on an algorithm such as PageRank. Accordingly when a message is initiated or endorsed by an individual, the result is that the more users follow/connect to that individual the message will more effectively propagate through the network. This phenomenon is the basis for much of the work in network analysis, focusing on identifying users who are influencers, i.e. users who can maximally promote a message in the social network because of their large network. As influencers have become a target for businesses who want their promotions to reach and influence as many potential customers as possible, being recognized as influencer will enable a user to draw attractive personal offers with high monetary/social value. As a result, and due to the fact that only a subset of the users would be treated as influencers, strategic behavior can emerge. In its simplest form it might change the endorsements one makes, with an individual preferring to minimize the endorsements of others in order to increase his own chances of being selected as an influencer. Using social choice terminology, current influencers’ selection algorithms in social networks are not strategy-proof. We aim at finding a strategy-proof influencers’ selection algorithm, which approximates the best selection of influencers while ensuring that users do not have an incentive to change their endorsements/votes. We also study voting in such networks, and the strategic communication of information when propagated through networks.