Arts and Letters Spatially | Drafts

Screenshot from 2013-11-01 11:49:27

Good old Arts & Letters Daily is a lovingly curated museum of press sources of all kinds, dating back to the late 1990s. It resembles one of those kiosks on the Avenida Paulista that sell everything that is fit to print, from everywhere, and that may also keep a few Romeo y Julietas in a drawer somewhere for those who appreciate the legendary thighs of mulato women.

The site makes a good basis for a meaningful and useful Web crawl, and that is what I have done with it. I obtained a sampling of its network neighborhood with

mkdir ALD
cd ALD
wget -rH http://aldaily.com

and then listed what wget had downloaded (it creates one directory per host), as raw material for a seed list readable by the Web crawler WIRE:

cd ..
ls ALD > ALD.list

I then used some simple regular expressions to format that list as the input file WIRE requires.
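The formatting step might look something like the sketch below. It assumes that each line of ALD.list is a bare hostname (the directory names wget creates) and that WIRE wants one absolute URL per line. The function name format_seeds is made up for illustration, and this is not necessarily the exact expression I used.

```shell
# Hypothetical helper: read hostnames on stdin, write seed URLs on stdout.
# Assumption: the seed file should hold one absolute URL per line.
format_seeds() {
    # Keep only lines shaped like a hostname (letters, digits, dots, hyphens,
    # ending in a dotted suffix), then wrap each one as http://HOST/
    grep -E '^[A-Za-z0-9.-]+\.[A-Za-z]{2,}$' | sed -E 's|^(.*)$|http://\1/|'
}
```

Running format_seeds < ALD.list > ALD.tmp && mv ALD.tmp ALD.list would rewrite the list in place. Blank lines and stray entries without a dotted suffix are dropped rather than turned into bogus URLs.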

This list is then used to seed the Web crawl:

wire-bot-seeder --start ALD.list

The bot proceeds to catalog the literary and intellectual wealth of the site. A second run within the same topic can be performed, beginning with

wire-info-extractor --seeds > next.seeds

ALDAILYESQ

Here is a pretty picture showing the central point of reference set among partitions of various thematic groupings. I am basically putting myself through the Coursera course on social network analysis, which calls for Gephi rather than yEd, the diagrammer I normally use even though it crashes a lot.