MCSN Tuesday, 8-Nov-11

From CCE wiki archived
Revision as of 11:14, 8 November 2011 by Michaelf (talk | contribs) (Chapter 5: Affiliation networks)

Schedule

  • office hrs Wed Nov 9 from 2:30 to 3:15 (I can stay until just before 4. No office hours next week however...)
  • no class Thursday Nov 10 (Remembrance Day)
  • short class Tuesday Nov 15 (new drafts of proposals due; intro to chapter 6 and network game)
  • self-guided class Thurs Nov 17
  • course evaluation on Tuesday Nov 22
  • quiz on Thursday November 24 (to cover material up to that point)
  • presentations on Tuesday November 29 (presentations: 10 minutes each)
  • self-guided class Dec 1 (polish your music compositions for possible performance)
  • class on Dec 6 (last class - more presentations)

Project discussions

For those who didn't talk about their projects last time and would like class feedback, suggestions, or discussion (but briefly, so we can wrap up chapter 5 today).

Chapter 5: Affiliation networks

Concepts

Basic ideas

  • People affiliate to groups (often defined by space, like the University of Alberta), and events (typically defined by space-time, like this class session), whether by choice or circumstance.
  • Such affiliations define bipartite networks comprising two kinds of vertex, which we can call actors and events (don't be confused - events could be more like groups).
  • In a bipartite network there are two kinds of vertex, type A and type B. All lines connect a type A vertex to a type B vertex - there are no direct connections between vertices of type A, nor are there direct connections between vertices of type B.
  • A bipartite network is also called "two-mode", since there are two kinds of vertex, and is represented by a rectangular matrix rather than a square one (see this in Excel).
  • Affiliation networks are bipartite (or two-mode), but the reverse doesn't hold (e.g. a network linking keys to locks is bipartite, but is not an affiliation network in the usual sense). An affiliation network is a special interpretation of a bipartite network.
  • Affiliations define social circles (the term comes from sociologist Georg Simmel) which overlap.
  • Network representation of identity as a model for social belonging:
    • Culture model (common in traditional ethnomusicology, and applicable to small-scale societies): each individual belongs to one "complex whole", as Tylor put it in 1871. This "complex whole" might be identified as the sum of many social circles, but they heavily overlap: everyone in a small community belongs to many or most of them (work circle, religious circle, etc.), with divisions primarily on the basis of age and gender.
    • Identity model (more common in sociology and contemporary ethnomusicology, and more applicable to large-scale societies): individuality is the sum total of multiple "simple parts", each person summing them in a slightly different way. These "parts" can be viewed as social circles whose intersection is the individual.
    • Note: social identity can't be captured in a single Pajek partition... why? A partition assigns each vertex to exactly one class, so the concept of partition is closer to the traditional "culture" model of exclusive, all-encompassing identities.
  • Social circles may also imply power circles with critical implications for relationships among "events" (groups). Example: Interlocking directorates
  • Degree of a vertex indicates the scope of the corresponding social circle:
    • Degree of an event (group): size of the event (group)
    • Degree of an actor: rate of participation of the actor
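
The two degree readings above can be sketched on a toy affiliation matrix. The names and data are hypothetical, invented only for illustration; rows are assumed to be actors and columns events.

```python
# Toy affiliation matrix (hypothetical data): rows are actors, columns are events.
actors = ["Ann", "Bob", "Cat"]
events = ["choir", "band", "class"]
affiliation = [
    [1, 1, 0],  # Ann: choir, band
    [0, 1, 1],  # Bob: band, class
    [0, 1, 0],  # Cat: band
]

# Degree of an actor = row sum = number of events the actor takes part in.
actor_degree = {a: sum(row) for a, row in zip(actors, affiliation)}

# Degree of an event = column sum = number of participating actors.
event_degree = {e: sum(col) for e, col in zip(events, zip(*affiliation))}

print(actor_degree)  # {'Ann': 2, 'Bob': 2, 'Cat': 1}
print(event_degree)  # {'choir': 1, 'band': 3, 'class': 1}
```

Because every line joins an actor to an event, the two sets of degrees add up to the same total (the number of lines).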

Typical assumptions about affiliation networks

  • The book states them as facts (see p. 101), but you should critique them in theory and test them in your projects!
  1. Affiliations are institutional or structural - less personal than friendships or sentiments. [What do you think? How could we test this?]
  2. "Although membership lists do not tell us exactly which people interact, communicate, and like each other, we may assume that there is a fair chance that they will." [what factors might impact the chances of actual dyadic interaction?]
  3. Actors at the intersection of multiple social circles...
    1. tend to interact even more
    2. become bridges enabling indirect communication/control between the circles as a whole.
  4. "Joint membership in a social circle often comes with similarities in other social domains." (i.e. the homophily principle: "birds of a feather flock together". Is this a cause of common affiliation, or an effect? Telling the difference might require different methods: (a) temporal network analysis, or (b) qualitative (interview, observation) analysis.)

Representations via matrix or edge list

Matrices are useful conceptual tools, but Pajek relies on lists of edges.

  • Matrices represent one or two mode networks very naturally
  • One-mode networks are naturally represented using
    • upper triangular matrix, no diagonal (undirected simple)
    • upper triangular matrix (undirected with loops)
    • square matrix (directed with loops)
  • Two-mode networks are naturally represented using rectangular matrices
    • Rows represent first mode (e.g. actors)
    • Columns represent second mode (e.g. events)
  • Deriving one-mode network from two-mode network.
    • Mapping the "hidden networks" implied by two-mode network (under assumptions above) can be highly significant
    • One-mode network derived from rows (e.g. actors)
    • One-mode network derived from columns (e.g. events)
  • One can also represent two-mode networks with lists of edges
    • Actors and events must be clearly differentiated
    • Simply listing edges may violate condition that actors can't link to actors, or events to events
    • Thus it's necessary to provide a simple means of identifying which vertices are rows (or, conversely, which vertices are columns)
    • If we number the vertices, it's easiest to separate rows and columns by assuming that the first N vertices are rows (and so the rest are columns). This is the approach taken in Pajek...
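
A minimal sketch of the numbering convention just described, in plain Python rather than Pajek. The function names `matrix_to_edges` and `is_valid_two_mode` and the toy matrix are assumptions made for illustration: rows get vertex numbers 1..N, columns get N+1 onward, and an edge list is valid only if every edge crosses that boundary.

```python
def matrix_to_edges(matrix):
    """Return (n_rows, edges) with 1-based vertex numbers; rows are numbered first."""
    n_rows = len(matrix)
    edges = []
    for i, row in enumerate(matrix):
        for j, cell in enumerate(row):
            if cell:
                # Row i becomes vertex i+1; column j becomes vertex n_rows+j+1.
                edges.append((i + 1, n_rows + j + 1))
    return n_rows, edges


def is_valid_two_mode(n_rows, edges):
    """True if every edge joins a row vertex (<= n_rows) to a column vertex (> n_rows)."""
    return all((u <= n_rows) != (v <= n_rows) for u, v in edges)


# Hypothetical 3 actors x 3 events affiliation matrix.
matrix = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
n_rows, edges = matrix_to_edges(matrix)
print(n_rows, edges)  # 3 [(1, 4), (1, 5), (2, 5), (2, 6), (3, 5)]
print(is_valid_two_mode(n_rows, edges))     # True
print(is_valid_two_mode(n_rows, [(1, 2)]))  # False: an actor linked to an actor
```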

Applications: creating and manipulating two mode networks

  • Two-mode network in Pajek
    • The *Vertices line is followed by two numbers: (a) the total number of vertices; (b) the number of rows, i.e. vertices of the first mode (whether actors or events)
    • When Pajek sees two numbers instead of one, it generates an affiliation partition to match.
  • Using txt2pajek to generate a sample two-mode network
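
As a rough illustration of the two-number vertex header, the following sketch writes a small two-mode network as Pajek-style text. The helper `to_pajek_two_mode` and the labels are hypothetical, and the output shows only the minimal layout (vertex header, numbered labels, edge list); it is not a description of what txt2pajek produces.

```python
def to_pajek_two_mode(row_labels, col_labels, edges):
    """Build Pajek-style text for a two-mode network (rows are numbered first)."""
    n_rows = len(row_labels)
    n_total = n_rows + len(col_labels)
    # Two numbers on the header: total vertices, then vertices in the first mode.
    lines = [f"*Vertices {n_total} {n_rows}"]
    for number, label in enumerate(row_labels + col_labels, start=1):
        lines.append(f'{number} "{label}"')
    lines.append("*Edges")
    for u, v in edges:
        lines.append(f"{u} {v}")
    return "\n".join(lines) + "\n"


text = to_pajek_two_mode(
    ["Ann", "Bob", "Cat"],            # first mode: actors (vertices 1-3)
    ["choir", "band", "class"],       # second mode: events (vertices 4-6)
    [(1, 4), (1, 5), (2, 5), (2, 6), (3, 5)],
)
print(text)
```

Saving `text` to a file with a `.net` extension should give something Pajek can read as a two-mode network, though it is worth checking the result with Info->network.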

Corporate interlocks in Scotland, 1904-5

  • Early 20th century: joint stock companies began to form
    • owned by shareholders
    • represented by boards of directors
  • Interlocking directorates linked the companies (and companies linked the directors)
  • Data: 136 multiple directors for 108 largest joint stock companies, of various types:
    • non-financial firms (64)
    • banks (8)
    • insurance companies (14)
    • investment and property companies (22)
  • Partition: indicates industry type
  1. oil & mining
  2. railway
  3. engineering & steel
  4. electricity & chemicals
  5. domestic products
  6. banks
  7. insurance
  8. investment
  • Vector: indicates total capital in 1,000 pounds sterling

Analyzing Scotland.paj

  • Info->network
    • Number of vertices
    • Number of lines
  • Affiliation partition separates firms and directors (examine)
  • Drawing and energizing. Note bipartite property.
  • Degree partition (size of events and rates of participation), can be displayed as vertex size (convert to vector)
  • Components
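
To make the counts and the component step concrete, here is a small plain-Python sketch on a made-up two-mode network (it does not use the Scotland data). The function name `components` and the toy edges are assumptions for illustration; Pajek's own commands do this work in the menus listed above.

```python
from collections import deque


def components(n_vertices, edges):
    """Return the connected components as sorted lists of 1-based vertex numbers."""
    neighbours = {v: set() for v in range(1, n_vertices + 1)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen = set()
    result = []
    for start in range(1, n_vertices + 1):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            v = queue.popleft()
            members.append(v)
            for w in neighbours[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        result.append(sorted(members))
    return result


# Hypothetical toy network: vertices 1-3 are firms, 4-7 are directors.
n_vertices = 7
edges = [(1, 4), (1, 5), (2, 5), (3, 6)]
parts = components(n_vertices, edges)
print("vertices:", n_vertices)  # vertices: 7
print("lines:", len(edges))     # lines: 4
print(parts)                    # [[1, 2, 4, 5], [3, 6], [7]]
```

Firms 1 and 2 land in the same component only because they share director 5, which is the kind of indirect link the derived one-mode networks make explicit.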

Deriving one mode nets from affiliation nets

  • Derived networks: Each two-mode network induces two one-mode networks: (a) by events (groups), (b) by actors, as follows:
    • By events (groups): events are linked by one line per shared actor
    • By actors: actors are linked by one line per shared event (group)
    • Note: loops represent the size of events and the participation rates of actors:
      • in the network of events, each event (group) shares each of its actors with itself, so every participating actor induces one loop on that event (number of loops = size of the event)
      • in the network of actors, each actor shares each of its events (groups) with itself, so every event attended induces one loop on that actor (number of loops = participation rate)
  • Derived networks are typically not simple, but one can replace multiple lines by a single line with value = number of lines replaced. This value is called line multiplicity and the resulting network is called a valued network.
  • We can convert Scotland.net into one-mode network of firms (settings: no loops, no multiple lines).
    • Lines between firms now represent the number of shared directors.
    • View line values (info->network->line values)
    • Add degree information from the original network (create a degree partition, then extract using the affiliation partition)
    • m-slices
      • m-slice is derived by deleting all lines of multiplicity less than m, and then deleting all isolated vertices. Detect isolated vertices by running component analysis with minimum component size = 2, then deleting the zero cluster.
      • vertices of the m-slice are precisely those that are attached to at least one line of multiplicity m or more.
      • m-slices are therefore nested (like k-cores): a vertex in the 3-slice (m=3) is automatically in the 2-slice (m=2), and a line in the 3-slice is automatically in the 2-slice.
        • 1-slice contains the 2-slice
        • 2-slice contains the 3-slice
        • 3-slice contains the 4-slice
        • etc.
      • Note: m-slices need not be connected - so after finding an m-slice, run component analysis to find its "pieces"
    • valued core
      • useful when we want to derive a partition from values on incident lines
      • first we decide if we want to derive this value from the maximum or sum of incident lines
      • then we sort each vertex's maximum or sum into one of several "bins", each representing a range of values
      • finally we have our vertex partition
    • valued core for m-slices
      • note that the highest m-slice to which a vertex belongs is simply the highest incident line multiplicity
      • so we can use valued core on the "max" setting, with bins defined as one per multiplicity (this is the default)
      • vertices are now partitioned by m-slice, but in order to find a particular m-slice we need to both (a) extract clusters, and (b) delete lines with value under m.
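
The derivation, the m-slice definition, and the "max" reading of valued core can be sketched in plain Python as follows. This is an illustration under stated assumptions, not Pajek's implementation: the function names and the firm-by-director matrix are hypothetical, vertices are numbered from 0, and loops are left out (matching the "no loops" setting above).

```python
from itertools import combinations


def project_rows(matrix):
    """One-mode network of the rows: line value = number of shared columns.

    Loops and pairs sharing nothing are left out.
    """
    values = {}
    for i, j in combinations(range(len(matrix)), 2):
        shared = sum(a * b for a, b in zip(matrix[i], matrix[j]))
        if shared:
            values[(i, j)] = shared
    return values


def m_slice(values, m):
    """Keep lines of multiplicity >= m, then drop vertices left isolated."""
    lines = {pair: v for pair, v in values.items() if v >= m}
    vertices = sorted({v for pair in lines for v in pair})
    return vertices, lines


def highest_slice(values, n_vertices):
    """Highest m-slice of each vertex = largest multiplicity on its incident lines."""
    best = [0] * n_vertices
    for (i, j), v in values.items():
        best[i] = max(best[i], v)
        best[j] = max(best[j], v)
    return best


# Hypothetical data: 4 firms (rows) x 5 directors (columns).
firms_by_directors = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
]
values = project_rows(firms_by_directors)
print(values)              # {(0, 1): 2, (0, 2): 1, (1, 2): 1}
print(m_slice(values, 1))  # ([0, 1, 2], {(0, 1): 2, (0, 2): 1, (1, 2): 1})
print(m_slice(values, 2))  # ([0, 1], {(0, 1): 2})
print(highest_slice(values, 4))  # [2, 2, 1, 0]
```

In this toy case the 2-slice sits inside the 1-slice, and firm 3, which shares no director with any other firm, gets value 0 and belongs to no slice.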

Analyzing Scotland.paj

  • Derive network of firms
    • Energize with line values = similarities
    • Derive the 2-slice directly, and look at its components
  • Run valued core and examine results with "info" and "draw"
    • Derive the 2-slice again from valued core result
    • Use valued core partition to derive Industrial Categories partition and Capital vector.
    • Examine resulting sociogram
  • 3D views
    • Note: SVG is no longer supported; VRML may work on PCs
    • Two techniques will work:
      • Energize in 2D, then apply Layers command in Draw menu
        • Draw network of firms, with valued core partition selected
        • Select Layers>Type of layout>3D
        • Run Layers>In Z direction
        • Energize in 3D and rotate or spin
      • Energize with Fruchterman Reingold 3D