

Detecting Social Capitalists on Twitter with Graph Databases

04.17.2013

Curator's Note: The content of this article is based on the original post on the Sparsity Technologies blog.

Nicolas Dugué and Anthony Perez from the University of Orleans introduce new techniques to detect social capitalists on Twitter.

Social capitalists are users who try to gain visibility by following other users regardless of their content. They are unhealthy for social networks because they help spammers gain visibility and can mislead influence detection.

In their article, the authors show that social capitalists can be detected using similarity measures computed on the graph topology alone, with no need to analyze the users' tweets.

Another aim of the research was to find efficient, high-level techniques for storing and handling very large graphs. After unsuccessfully evaluating SQL and NoSQL technologies such as Cassandra, the authors moved to graph databases, which are better suited to quickly answering queries such as retrieving the neighborhood of a node, an operation that is essential to their algorithms. Their research uses the Twitter graph, a spam graph and a list of 100,000 potential social capitalists. Using the DEX high-performance graph database, they were able to store a graph containing about 15M vertices and 1B arcs.

Social capitalists rely on techniques such as “follow me and I follow you” or “I follow you, follow me”, so most of the users they follow should follow them back (overlap). Spammers, on the other hand, want to accumulate followers and then spread spam links. An earlier paper by Ghosh et al. on link farming on Twitter, which focused on spammers, introduced social capitalists as the users who most often respond to requests from spammers. Dugué and Perez compare their results against those earlier ones and, using their new and faster detection techniques, obtain an even larger list of social capitalists.
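To make the overlap idea concrete, here is a minimal Python sketch. It is not the authors' implementation and does not use DEX: the graph is held in plain in-memory sets, and the exact measure (shared neighbors divided by the smaller of the two neighborhoods), the function names, the 0.8 threshold and the minimum degree are all illustrative assumptions rather than values taken from the paper.

```python
from collections import defaultdict


def build_neighborhoods(arcs):
    """Index arcs (follower, followee) by both endpoints.

    Returns two dicts of sets: followers[u] and followees[u].
    """
    followers = defaultdict(set)
    followees = defaultdict(set)
    for src, dst in arcs:
        followees[src].add(dst)  # src follows dst
        followers[dst].add(src)  # dst is followed by src
    return followers, followees


def overlap_index(followers, followees, user):
    """Share of the smaller neighborhood that is reciprocated.

    Assumed definition: |followers & followees| / min(|followers|, |followees|).
    Returns 0.0 when either neighborhood is empty.
    """
    in_n = followers.get(user, set())
    out_n = followees.get(user, set())
    smaller = min(len(in_n), len(out_n))
    if smaller == 0:
        return 0.0
    return len(in_n & out_n) / smaller


def flag_candidates(arcs, threshold=0.8, min_degree=2):
    """Return users whose overlap reaches the (hypothetical) threshold.

    min_degree filters out users with too few neighbors for the
    ratio to be meaningful; both parameters are illustrative.
    """
    followers, followees = build_neighborhoods(arcs)
    users = set(followers) | set(followees)
    flagged = []
    for user in users:
        in_n = followers.get(user, set())
        out_n = followees.get(user, set())
        if len(in_n) < min_degree or len(out_n) < min_degree:
            continue
        if overlap_index(followers, followees, user) >= threshold:
            flagged.append(user)
    return sorted(flagged)


# Toy graph: "a" follows back everyone it follows, "d" follows but is not followed.
toy_arcs = [
    ("a", "b"), ("b", "a"),
    ("a", "c"), ("c", "a"),
    ("d", "a"), ("d", "b"), ("d", "c"),
]
candidates = flag_candidates(toy_arcs)
```

The only graph operation the sketch needs is fetching the in- and out-neighborhood of a node, which is the kind of query the authors cite as their reason for choosing a graph database. At the scale they report, the sets above would not fit comfortably in a naive in-memory structure, so this is only a way to reason about the measure on small samples.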

This is a good step toward healthier and more reliable social networks. To learn more about the detection of social capitalists, we strongly encourage you to read their article: http://link.springer.com/chapter/10.1007/978-3-642-36844-8_1


The DEX graph database is available free of charge for research use through its research license program.

Published at DZone with permission of its author, Damaris Coll.
