Pajek Trek | Or, The 7 Million Points of Light



Above, memory usage on the machine I am using in a focused Web-crawling exercise targeting the point of contact between the philanthropy and tech sectors. Turns out I can pretty much run the entire process, crawling and processing, on a tiny 350G USB device, if I wanted to. (My data and the Stanford-Berkeley dataset are comparable in a number of ways.)

(I keep a copy of Pup Linux handy for children I might meet along the way.)

Meanwhile, does anyone remember the OLPC? After early apocalyptic expectations fell through, the program has reached only 2 million users worldwide — and its marketing pitch seems crafted more for the procurement side of the program than the end user.

Unique to the XO-4 Touch is an easy-to-repair touchscreen that does not compromise the readability of the XO’s sunlight-readable display. 5 GHz Wifi, Bluetooth, and HDMI support also have been added, along with an accelerometer which can be queried by programs to interact with the child.

With the added wireless connectivity, is the device of greater or lesser use to its owner?

I for one have always wondered: Why only one? Why only children? Why not tightly networked berimbau circles in which the evanescent dancers circulate in and out of the circle? (So much of slum culture has been influenced by the mythos of the criminal dandy, the malandro, a figure who has given way to the more brutal violence of the drug business on an industrial scale.)

Why not imagine a world in which, empowered by projects like the Samba program, our cybernetic Oliver Twists and Artful Dodgers could strike back at the cruel, greedy martinets who keep them locked away in misery dreaming up their artful dodges?


In any event, the moment recorded in this screenshot is an especially precious one for me. The SNA client, Pajek, not only loads the seven million points of light I have gathered through WIRE but also lets them be operated on meaningfully, and to a satisfactory degree, in tools like Gephi and yEd.

Screenshot from 2014-03-31 13:17:42

The crawl data were generated from such sources (above) as the encyclopedic Arts & Letters Daily, which is, after all, already a dynamic network of minds that differ as much as they defer to one another.

Remember that scene in The Social Network in which the guy (what’s his face?) is gushing about how easy it is to just pull down data off the admissions office server at Bryn Mawr, or some other Ivy, with a quick wget?

In WIRE, a project of young programmers at the University of Chile, the crawler iterates until all the essential information (URL and the like) is known. The collection can then be analyzed along various lines. It allows one to be a little more than just an infotainment infoconsumer. To put a dime in the bucket now and again.

The online network of the World Association of Newspapers, meanwhile, is both thorough and informative: its membership runs to 3,000 newspapers from Norway to Namibia, and most provide a URL to their Web sites. The site’s information design therefore lent itself to crawling, and its member sites were added to the crawled collection. The media sector can be studied broadly or narrowly; SIPIAPA represents the Southern Americas, for example. Irate opponents say it serves as a shill for U.S. foreign policy ….
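Harvesting member URLs from a directory page of that kind might look something like the sketch below. It is only an illustration under my own assumptions: the page is taken to be plain HTML with ordinary anchor tags, and the names `LinkCollector`, `seed_urls` and `own_host` are hypothetical, not part of WIRE or of any site mentioned here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def seed_urls(html, own_host):
    """Return one root URL per external host linked from the page.

    own_host is the (hypothetical) hostname of the directory itself,
    so that its internal navigation links are left out.
    """
    collector = LinkCollector()
    collector.feed(html)
    seen = set()
    seeds = []
    for href in collector.hrefs:
        parsed = urlparse(href)
        host = (parsed.hostname or "").lower()
        # Keep only absolute http(s) links that point off the directory.
        if parsed.scheme not in ("http", "https") or not host:
            continue
        if host == own_host or host in seen:
            continue
        seen.add(host)
        seeds.append(f"{parsed.scheme}://{host}/")
    return seeds


if __name__ == "__main__":
    page = '<a href="http://www.example.no/">A paper</a> <a href="/about">About</a>'
    for url in seed_urls(page, "directory.example"):
        print(url)
```

Keeping one root URL per host is a choice made for the sketch; a crawl that cares about individual sections of a newspaper site would keep the full paths instead.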

Screenshot from 2014-03-31 13:09:41

I have a ways to go before being able to apply these new-found pastimes to Unicode, that blacksmith of Anglo-European reality. WIRE is very sensitive, but with care it produces a useful result. Nodes and edges can be correlated with the simple commands:

wire-info-extract --links
wire-info-extract --sites
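To get from two listings like these to something Pajek will open, a small conversion script along the following lines could help. This is a rough sketch, not WIRE's own tooling: I am assuming the site listing has been reduced to lines of `id hostname` and the link listing to lines of `source_id destination_id`, which may not match the raw output of the commands above. The function name `to_pajek` is hypothetical. The output follows Pajek's plain-text .net layout, with a `*Vertices` section followed by an `*Arcs` section.

```python
def to_pajek(sites_text, links_text):
    """Build Pajek .net text from two whitespace-separated listings.

    Assumed input (an assumption, not WIRE's documented format):
      sites_text: lines of "<numeric id> <hostname>"
      links_text: lines of "<source id> <destination id>"
    """
    labels = {}
    for line in sites_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip headers and blank lines
        labels[int(parts[0])] = parts[1]

    # Pajek vertex numbers run 1..N, so renumber the site ids.
    index = {site_id: n for n, site_id in enumerate(sorted(labels), start=1)}

    arcs = []
    for line in links_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        src, dst = int(parts[0]), int(parts[1])
        # Drop links whose endpoints are missing from the site listing.
        if src in index and dst in index:
            arcs.append((index[src], index[dst]))

    out = [f"*Vertices {len(index)}"]
    for site_id, n in sorted(index.items(), key=lambda kv: kv[1]):
        out.append(f'{n} "{labels[site_id]}"')
    out.append("*Arcs")
    out.extend(f"{s} {d}" for s, d in arcs)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sites = "0 example.org\n5 example.com\n"
    links = "0 5\n5 0\n"
    print(to_pajek(sites, links), end="")
```

With seven million nodes the whole graph sits in memory at once in this version; streaming the arcs straight to a file would be the obvious next step.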



Screenshot from 2014-03-31 13:25:28


This method is exceedingly wasteful of useful data uncovered by Pajek, but much of this analysis work can be repeated in the Data Laboratory section of Gephi, that Java-based SNA desktop.

Anyhow. By the way, I wanted to express my appreciation for the cartoonist responsible for the bathetic satire on political rhetoric (at least I think it is meant as a joke), but I cannot trace the source.