Gephi Exercise | GNU Horizons

Screenshot from 2014-07-29 13:53:27

An extremely interesting feature of the SNA package Gephi is the ability to work on a single network in several workspaces at the same time.

Using Gephi, for instance, we might analyze what social network data we have for Gnu.org — a likely place to start given its own stated principles and procedures. Nothing like targeting networks of the networked as practiced by networks people to see how that networking is done.

The analysis includes sorting nodes into communities, or clusters, according to a modularity algorithm that I will not pretend to understand (Mrvar has an excellent tutorial). That makes this post somewhat useless as a tutorial, but useful to me as a benchmark.


The illustration shown here is the result of taking a network of [x] nodes and [x] edges, crawled and stored by the WIRE bot in Pajek's .net format, and applying Force Atlas. Like some of the circular layouts, Force Atlas tends to sort nodes into a maplike configuration, segregating them into non-overlapping territories.
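For readers who have not met the .net format, here is a minimal sketch of how such a file can be read. It assumes only the basic layout of a Pajek file (a `*Vertices` section with numbered, quoted labels, followed by `*Arcs` or `*Edges` lines with an optional weight). The function name `parse_pajek` and the sample hostnames other than gnu.org are made up for illustration; this is not the WIRE crawler's output.

```python
import shlex

def parse_pajek(text):
    """Parse a minimal Pajek .net string into (labels, edges)."""
    labels = {}   # vertex id -> label
    edges = []    # (source, target, weight, directed)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):    # skip blanks and comment lines
            continue
        if line.startswith("*"):
            section = line.split()[0].lower()   # "*vertices", "*arcs" or "*edges"
            continue
        fields = shlex.split(line)              # keeps quoted labels in one piece
        if section == "*vertices":
            vid = int(fields[0])
            labels[vid] = fields[1] if len(fields) > 1 else str(vid)
        elif section in ("*arcs", "*edges"):
            weight = float(fields[2]) if len(fields) > 2 else 1.0
            edges.append((int(fields[0]), int(fields[1]), weight, section == "*arcs"))
    return labels, edges

# Hypothetical three-vertex sample, not the crawled data.
SAMPLE = '''*Vertices 3
1 "gnu.org"
2 "fsf.org"
3 "example.org"
*Arcs
1 2 1
2 3
'''

labels, edges = parse_pajek(SAMPLE)
print(len(labels), "vertices,", len(edges), "arcs")
```

Real crawls may carry extra columns (coordinates, colors) after the label; this sketch simply ignores them.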


In this case, I calculated the modularity statistic with its default value and was left with what always looks to me like a human brain, with its modules, folds, nodes and nodules and so on.
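As a rough aid to understanding what the modularity statistic measures, the sketch below scores a partition that is already given, using the standard Newman formula for an undirected, unweighted graph. It does not reproduce the search Gephi performs to find the clusters, and the toy graph and the name `modularity` are assumptions for illustration only.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> cluster id.
    """
    m = len(edges)
    degree_sum = defaultdict(int)   # total degree of the nodes in each cluster
    internal = defaultdict(int)     # edges with both ends inside the cluster
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over clusters of (share of internal edges - expected share)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single bridge edge.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
PARTITION = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

q = modularity(EDGES, PARTITION)
print(round(q, 4))
```

A higher score means more edges fall inside clusters than a random wiring with the same degrees would produce; lumping everything into one cluster scores zero.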

Screenshot from 2014-07-29 17:19:05

Next, I created a filter based on modularity clusters.

Using this filter, I created three modules, moved each to its own workspace, and applied the Concentric Circular Layout to them one by one.


That is, I filtered out clusters 1 and 2 and then applied Circular Layout to cluster 3, as shown above.

I repeated the procedure for the other two clusters, then removed the filters from all the subnetworks to obtain the Homer Simpson Configuration: mmmmm, donuts.
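The geometry behind the donuts can be sketched in a few lines. This is only an approximation of the idea, assuming each modularity cluster is spread evenly around its own ring; it is not the code of Gephi's layout plugin, and `donut_layout` and `ring_gap` are hypothetical names.

```python
import math
from collections import defaultdict

def donut_layout(community, ring_gap=100.0):
    """Place each cluster on its own ring, with rings nested in cluster order."""
    groups = defaultdict(list)
    for node, cluster in community.items():
        groups[cluster].append(node)
    positions = {}
    for ring, cluster in enumerate(sorted(groups), start=1):
        members = sorted(groups[cluster])
        radius = ring * ring_gap                    # one radius per cluster
        for i, node in enumerate(members):
            angle = 2 * math.pi * i / len(members)  # even spacing around the ring
            positions[node] = (radius * math.cos(angle),
                               radius * math.sin(angle))
    return positions

# Hypothetical partition: node -> modularity class.
PARTITION = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 2}
positions = donut_layout(PARTITION)
for node in sorted(positions):
    x, y = positions[node]
    print(node, round(x, 1), round(y, 1))
```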


Now what can we do? I am still in the learning phase, but with the data laid out as above, we do have the ability to download the data from the visible layer, and only the visible layer. What sort of cohesion is present?

More on that in a bit.

The Brokerage Roles of Advertising Servers?


Above, a yEd diagram of the ego-network of Estadao.com.br, the good old-fashioned Estado de S. Paulo. The tool provides a modest array of data analysis modules (centrality, mostly), a wealth of layouts and a relatively easy-to-use interface for reference purposes.

Above, for example, I have labeled the highest-ranked group Ad Servers because sites I know to be ad servers display high centrality indices in that group.

The next step is to use Pajek to derive a k-neighbor subnetwork from the most influential of the ad servers and study what kind of relations they maintain with the supercomputers.
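For reference, a k-neighbor subnetwork amounts to a breadth-first search that stops after k steps. The sketch below follows outgoing links only, which is an assumption on my part, and the vertex names and function names are hypothetical rather than taken from the crawl.

```python
from collections import deque

def k_neighbors(adjacency, start, k):
    """Return {node: distance} for every node within k steps of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == k:          # do not expand beyond depth k
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance

def induced_subnetwork(adjacency, keep):
    """Keep only the listed vertices and the links among them."""
    return {u: [v for v in adjacency.get(u, ()) if v in keep] for u in keep}

# Hypothetical toy network: vertex -> list of vertices it links to.
ADJACENCY = {"adserver": ["a", "b"], "a": ["c"], "b": [], "c": ["d"]}

within_two = k_neighbors(ADJACENCY, "adserver", 2)
subnetwork = induced_subnetwork(ADJACENCY, within_two)
print(sorted(within_two.items()))
```

The distances double as a partition, so the vertices at each step from the chosen server can be inspected separately.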

Screenshot from 2014-08-01 14:07:13

Screenshot from 2014-08-01 14:07:58

The groups can be opened up and, assumed to reflect corporate relations, studied as rosters of network players. There is a neat little feature in which clicking on a group or node brings up its ego network.

In this example, I color-coded the groups to reflect an emerging pattern happening simultaneously in the media market — in which strategic alliances in keyword search pit Globo-MSFT (blue) against Google, with Yahoo, R7 and iG bringing up the rear.


The sector seems unsettled. In February, a partnership linking the state government of S. Paulo, Google and MSN … to do what together, in the education field? Visitors to the site of the program are informed that it has been taken down in accordance with limits on governmental publicity during election seasons.


Screenshot from 2014-07-31 09:52:01

Anyway, by pure happenstance, I devised the exercise above while trying to solve another research problem, hazily understood, in social network analysis.

To put it very simply, advertising servers are omnipresent in the pages served to users, but there are cases in which we might wish to exclude them to see what then happens to the rest of the network. What precautions should we take with the Facebook Effect? Does the world end if they go extinct?

Above, the latest step, drawn in Pajek as a k-neighbors graph of, I believe, Civicus.org.

I have begun by identifying Social Network Effect vertices — not just the social networks, but AdWords, DoubleClick, Bing Ads, or in other words a herd of advertising elephants I am observing in the wild.

Screenshot from 2014-07-31 09:52:01

Here I have stripped most of the includes I am concerning myself with and laid the network out along a y-axis in Pajek Draw.

One of the neat things about Pajek, despite its stubborn ugliness, is that it displays a description of the operation just completed at the top of a Draw window.

You can see below that the network view was a result of deleting several nodes as well as calculating a k-node.

Screenshot from 2014-07-31 09:50:26

Screenshot from 2014-07-31 09:49:01

The removal of the other virtual ATMs seems to affect the structural prestige of the survivors, no?
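One way to check that impression is to compare a prestige index before and after the removal. The sketch below uses plain indegree as a stand-in for structural prestige, which is a simplifying assumption, and the arcs and vertex names are invented toy data rather than the Civicus.org network.

```python
from collections import Counter

def indegree(arcs):
    """Count incoming arcs per vertex (a simple popularity-style index)."""
    return Counter(target for _source, target in arcs)

def remove_vertices(arcs, removed):
    """Drop every arc that touches a removed vertex."""
    removed = set(removed)
    return [(u, v) for u, v in arcs if u not in removed and v not in removed]

# Hypothetical toy data: two content sites and two ad servers.
ARCS = [
    ("site1", "adserver1"),
    ("site2", "adserver1"),
    ("adserver1", "adserver2"),
    ("site1", "adserver2"),
    ("site2", "site1"),
]

before = indegree(ARCS)
after = indegree(remove_vertices(ARCS, ["adserver1"]))
for vertex in sorted(after):
    print(vertex, before[vertex], "->", after[vertex])
```

In this toy case the surviving ad server loses an incoming arc when its neighbor is deleted, which is the kind of shift the question above is asking about.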

Screenshot from 2014-07-31 09:41:12

Here is a view of the original k-neighbor map of Civicus.org.

Can we go ahead and call the ad servers the K-Mart components of the subnetwork?

Screenshot from 2014-07-31 11:07:33

I found no ad server playing the role of a cut vertex. In fact, this is a textbook Hubs and Authorities configuration. Civicus is part of a community of blue nodes that brokers between clients and customers, and a look at its ego-network suggests it enjoys higher prestige.
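The cut vertex test can be stated very simply: a vertex is a cut vertex if deleting it disconnects the network. Below is a brute-force sketch of that definition, assuming an undirected, connected network stored as a symmetric adjacency dictionary. The toy graph and function names are hypothetical, and this is not how Pajek computes it.

```python
def is_connected(adjacency, skip=None):
    """True if all vertices other than `skip` are mutually reachable."""
    nodes = [n for n in adjacency if n != skip]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt != skip and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)

def cut_vertices(adjacency):
    """Vertices whose removal disconnects an otherwise connected graph."""
    return sorted(n for n in adjacency if not is_connected(adjacency, skip=n))

# Hypothetical toy graph: "b" hangs off "hub" alone, so "hub" is a cut vertex.
ADJACENCY = {
    "a": ["hub", "c"],
    "b": ["hub"],
    "c": ["hub", "a"],
    "hub": ["a", "b", "c"],
}

cuts = cut_vertices(ADJACENCY)
print(cuts)
```

Removing each vertex in turn is slow on large crawls, but it is easy to verify by eye on a small ego-network.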


Homework: Apply the Donut Macro to this egocentric view of the Niemem.

Or, see if we can use the modularity tool to understand its structure and implied workflow, unless you are a dyed-in-the-wool “the end of work as we know it” apostle …