MCSN Tuesday, 11-Oct-11

From CCE wiki archived
  • Musical compositions?
  • Mac hint: System preferences, trackpad, secondary click: enables right-click functions in Pajek

Cohesive subgroups

  • Cohesive subgroups may be suggested by:
    • Interaction networks: Dyadic relations (one mode network) - e.g. communication - vs.
    • Non-interaction networks: Attributes and affiliations (two mode network) - e.g. gender, class, identity, norms, values, subculture. Note: a partition also defines a two mode network, one in which everyone is affiliated with exactly one group and the groups do not overlap. (In the general case, some people might not affiliate at all, and affiliation groups may overlap.)
    • Homophily hypothesis: "birds of a feather flock together": the two forms of cohesion (interaction and attribute/affiliation) are related
    • General method: look for cohesive subgroups, and analyze their relation to one another. But how can we find cohesive subgroups amidst the complexity of network data?
  • Data from Turrialba, Costa Rica
    • Interaction network: family visits
    • Non-interaction network: family friendship groupings (basis?)
    • How well do these match up?
  • Measures of cohesion. (Everyone take a sheet of paper and create your own examples for discussion.)
    • Network density
      • How many possible lines are there in a network with 60 vertices (graph of order 60), if...
        • loops are possible?
        • loops are not possible?
      • Density: number of lines (size of graph) as a percentage of the number of possible lines
      • Complete network: density = 100% (all possible lines are present)
      • Problem: density tends to decrease rapidly as graph order increases, since the number of ties each person can maintain is usually limited while the number of possible lines keeps growing. (Assuming a person can have only 3 friends, create a network with 5 people and another with 10, and see for yourself...)
      • Application: Attiro.paj
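The counting behind the density questions above can be checked with a few lines of Python. This is a minimal sketch, not Pajek's own computation: the function names `max_lines`, `density` and `density_cap` are made up for illustration, and multiple lines are assumed to be excluded. The last function uses the bound that follows from the "3 friends" exercise: if every vertex has degree at most k in an undirected network without loops, there are at most n*k/2 lines, so density cannot exceed k/(n-1).

```python
def max_lines(n, directed=False, loops=False):
    """Number of possible lines among n vertices (no multiple lines)."""
    if directed:
        # Every ordered pair is a possible arc; loops add n more.
        return n * n if loops else n * (n - 1)
    # Every unordered pair is a possible edge; loops add n more.
    return n * (n + 1) // 2 if loops else n * (n - 1) // 2


def density(num_lines, n, directed=False, loops=False):
    """Observed lines as a fraction of the possible lines."""
    return num_lines / max_lines(n, directed, loops)


def density_cap(n, k):
    """Upper bound on density if no vertex has more than k ties
    (undirected, no loops): (n*k/2) / (n*(n-1)/2) = k/(n-1)."""
    return min(1.0, k / (n - 1))


print(max_lines(60))                             # 1770 (undirected, no loops)
print(max_lines(60, loops=True))                 # 1830
print(max_lines(60, directed=True))              # 3540
print(max_lines(60, directed=True, loops=True))  # 3600
print(density_cap(5, 3))                         # 0.75
print(round(density_cap(10, 3), 3))              # 0.333
```

Doubling the number of people from 5 to 10 while keeping the cap of 3 friends cuts the highest attainable density by more than half, which is why density is hard to compare across networks of different order.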
    • Average degree
      • Adjacent vertices: two vertices connected by a line
      • Degree of a vertex: number of incident lines. Note that arcs are counted whether they point in or out, and each of several multiple lines is counted separately. (see f88) A loop counts twice (once in, once out).
      • Indegree, outdegree: for digraphs
      • Number of adjacent vertices = degree in a simple undirected graph (no loops or multiple lines are allowed)
      • However, it is not true that indegree + outdegree = number of adjacent vertices in a simple digraph (the same vertex might connect in both directions, and there can be loops)
      • Applications on Attiro.paj (Symmetrize, Info->partition, Partition->Make vector, Info->vector)
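The difference between degree and number of adjacent vertices in a digraph is easy to see on a toy example. The sketch below is illustrative only: `degree_stats` is a hypothetical helper, and the four arcs are invented data, not taken from Attiro.paj.

```python
from collections import defaultdict


def degree_stats(arcs):
    """Return {vertex: (indegree, outdegree, number of adjacent vertices)}."""
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    neighbors = defaultdict(set)
    for u, v in arcs:
        outdeg[u] += 1
        indeg[v] += 1
        if u != v:  # a loop does not make a vertex adjacent to another vertex
            neighbors[u].add(v)
            neighbors[v].add(u)
    vertices = set(indeg) | set(outdeg)
    return {v: (indeg[v], outdeg[v], len(neighbors[v])) for v in vertices}


# Invented example: a and b choose each other, a chooses c, c has a loop.
arcs = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "c")]
stats = degree_stats(arcs)
for v in sorted(stats):
    i, o, adj = stats[v]
    print(v, "in:", i, "out:", o, "in+out:", i + o, "adjacent:", adj)
# a in: 1 out: 2 in+out: 3 adjacent: 2
# b in: 1 out: 1 in+out: 2 adjacent: 1
# c in: 2 out: 1 in+out: 3 adjacent: 1

# Average degree: every line contributes 2 to the degree sum.
avg_degree = sum(i + o for i, o, _ in stats.values()) / len(stats)
print(round(avg_degree, 3))  # 2.667, i.e. 2 * 4 lines / 3 vertices
```

Unlike density, average degree does not depend on the order of the network, which makes it easier to compare across networks of different sizes.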
    • Components
      • Subnetwork: a subset of vertices, together with lines connecting them
      • Maximal subnetwork according to any property: (a) the subnetwork has the property; (b) adding any other vertex destroys the property.
      • undirected network: walk, path
      • directed network:
        • walk, semiwalk
        • path, semipath
      • connected undirected network: any two vertices (a,b) are connected by a path from a to b
      • connected directed network:
        • strongly connected: any two vertices (a,b) are connected by a path from a to b, and a path from b to a
        • weakly connected: any two vertices (a,b) are connected by a semipath from a to b
      • Components:
        • Undirected graph: a component is a maximal connected subnetwork ("hangs together")
        • Directed graph:
          • Strong component: maximal strongly connected subnetwork
          • Weak component: maximal weakly connected subnetwork (equivalent to components of symmetrized network)
          • Note that a weak component is always at least as large as the strong components it contains: every strong component lies entirely within one weak component, but a weak component may split into several strong components.
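The definitions of weak and strong components can be turned into a short Python sketch. It follows the definitions directly rather than using an efficient algorithm, and it is not a description of how Pajek computes components: a weak component is found by searching the symmetrized network, and the strong component of a vertex is the set of vertices it can reach that can also reach it. The function names and the five-vertex example are assumptions made for illustration.

```python
from collections import defaultdict


def reachable(start, adj):
    """All vertices reachable from start by following adj (includes start)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def components(vertices, arcs, strong):
    """Return the strong or weak components as a sorted list of sorted lists."""
    fwd, bwd, sym = defaultdict(set), defaultdict(set), defaultdict(set)
    for u, v in arcs:
        fwd[u].add(v)   # follow arcs forwards
        bwd[v].add(u)   # follow arcs backwards
        sym[u].add(v)   # symmetrized network: direction ignored
        sym[v].add(u)
    assigned, result = set(), []
    for v in vertices:
        if v in assigned:
            continue
        if strong:
            # Path from v to w AND path from w back to v.
            comp = reachable(v, fwd) & reachable(v, bwd)
        else:
            # A semipath is a path in the symmetrized network.
            comp = reachable(v, sym)
        assigned |= comp
        result.append(sorted(comp))
    return sorted(result)


# Invented example: a cycle a -> b -> c -> a, an arc c -> d, and an isolate e.
vertices = ["a", "b", "c", "d", "e"]
arcs = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
weak = components(vertices, arcs, strong=False)
strong = components(vertices, arcs, strong=True)
print(weak)    # [['a', 'b', 'c', 'd'], ['e']]
print(strong)  # [['a', 'b', 'c'], ['d'], ['e']]
```

In the example, d can be reached from the cycle but cannot reach it back, so it belongs to the same weak component as a, b and c but forms a strong component of its own.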
      • Test your understanding: can components overlap? What is the relation between 'component' and 'partition'?
      • Application on Attiro.paj: Net->Components->strong/weak (note: eliminate lots of one-vertex components by setting minimum component size = 2), then Draw-partition. Use Info->partition to get stats on what happens in each case.
      • Strong or weak? Semantic and pragmatic considerations:
        • Semantic: Does line direction matter? If not, use weak (symmetrize).
        • Pragmatic: First look at weak components, then subdivide using strong components (which are typically smaller). Further subdivide using the notion of k-connected components, e.g. bi-components (2-connected), in which every pair of vertices must be joined by at least two paths that share no intermediate vertex.
      • Homophily hypothesis: What is the relation between components (interaction network based on visits) and friendship groupings (partition)? Generate Cross table stats: Set partition 1 and 2, and run Partitions->info->Cramer's V, Rajski
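The cross table comparison of two partitions can be sketched as follows. This uses the textbook formula for Cramér's V, V = sqrt(chi² / (n · (min(rows, columns) − 1))), and is not claimed to reproduce Pajek's output exactly; Rajski's measure is not covered. The function name `cramers_v` and the two small partitions are invented for illustration and are not the Attiro data.

```python
from collections import Counter
from math import sqrt


def cramers_v(part1, part2):
    """Cramer's V for two partitions given as equal-length lists of class labels."""
    n = len(part1)
    cells = Counter(zip(part1, part2))  # cross table: joint class counts
    rows = Counter(part1)               # row totals
    cols = Counter(part2)               # column totals
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            chi2 += (cells[(r, c)] - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    # V is undefined when one partition has only a single class.
    return sqrt(chi2 / (n * k)) if k > 0 else float("nan")


# Invented example 1: components coincide with friendship groupings.
component = [1, 1, 1, 2, 2, 2]
friendship = [1, 1, 1, 2, 2, 2]
v_match = cramers_v(component, friendship)
print(round(v_match, 3))  # 1.0

# Invented example 2: friendship groupings cut evenly across components.
v_none = cramers_v([1, 1, 2, 2], [1, 2, 1, 2])
print(round(v_none, 3))  # 0.0
```

A value near 1 would support the homophily hypothesis for these two classifications, while a value near 0 would indicate that visiting components and friendship groupings are unrelated.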