SNA techniques are derived from sociological and
social-psychological theories and take into account the whole
network (or, in the case of very large networks such as Twitter, a
large segment of the network). As a result, we may arrive at
conclusions that seem counter-intuitive: for example, that Justin
Bieber (7.5 million followers) and Lady Gaga (7.2 million followers)
have relatively little actual influence despite their celebrity
status, while a middle-of-the-road blogger with 30K followers is able
to generate tweets that "go viral" and result in millions of
impressions.

In this tutorial, we will conduct a social network analysis of a real
dataset, from gathering and cleaning the data to analyzing and
visualizing the results. We will use Python and a set of
open-source libraries, including NetworkX, NumPy and
Matplotlib.

Outline:

1. Introduction. Why should we do this? What is the data like? Why is
   this different from other techniques? What can we learn?
2. Centralities: degree, closeness, betweenness, PageRank, Klout
   Score
3. Beyond Klout Score: finding communities of interest, finding
   clusters in networks
4. Information diffusion in networks: how do things go viral?
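
As a small taste of the centrality measures listed in the outline, the sketch below computes degree, closeness and betweenness centrality with NetworkX. The graph is a toy example invented purely for illustration (it is not the tutorial's dataset), and the node names `a1`..`a4`, `b1`..`b4` and `bridge` are hypothetical. PageRank is also available in NetworkX as `nx.pagerank`; it is left out here only to keep the sketch short.

```python
import itertools

import networkx as nx

# Toy undirected graph (invented for illustration, not real Twitter data):
# two tightly knit groups of four accounts, joined only through "bridge".
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]

G = nx.Graph()
G.add_edges_from(itertools.combinations(group_a, 2))  # everyone in A knows everyone in A
G.add_edges_from(itertools.combinations(group_b, 2))  # everyone in B knows everyone in B
G.add_edges_from([("a4", "bridge"), ("bridge", "b1")])  # the only route between A and B

# Each function returns a dict mapping node -> normalized score.
degree = nx.degree_centrality(G)
closeness = nx.closeness_centrality(G)
betweenness = nx.betweenness_centrality(G)

for node in sorted(G.nodes):
    print(
        f"{node:>6}  degree={degree[node]:.2f}  "
        f"closeness={closeness[node]:.2f}  "
        f"betweenness={betweenness[node]:.2f}"
    )

# The account with the fewest direct links, and the account that sits on
# the largest share of shortest paths between other accounts.
fewest_links = min(G.nodes, key=lambda n: G.degree[n])
top_broker = max(betweenness, key=betweenness.get)
print("fewest links:", fewest_links)
print("highest betweenness:", top_broker)
```

In this toy graph the `bridge` account has the fewest direct links of any node, yet it ranks highest on betweenness, because every shortest path between the two groups passes through it. It is a miniature version of the point made above: a raw connection count and a node's position in the whole network can tell quite different stories.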