Notes on Cohesive Subgroups
Cohesive subgroups
- Cohesive subgroups may be suggested by:
- Interaction networks: Dyadic relations (one mode network) - e.g. communication - vs.
- Non-interaction networks: Attributes and affiliations (two mode network) - e.g. gender, class, identity, norms, values, subculture. Note: a partition also defines a two mode network, one in which everyone is affiliated with exactly one group, so affiliation groups do not overlap. (In the general case, some people might not affiliate, and affiliation groups may overlap.)
- Homophily hypothesis: "birds of a feather flock together": the two forms of cohesion (interaction and attribute/affiliation) are related
- General method: look for cohesive subgroups, and analyze their relation to one another. But how can we find cohesive subgroups amidst the complexity of network data?
- Data from Turrialba, Costa Rica
- Interaction network: family visits
- Non-interaction network: family friendship groupings (basis?)
- How well do these match up?
- Measures of cohesion. (Everyone take a sheet of paper and create your own examples for discussion.)
- Network density
- How many possible lines are there in a network with 60 vertices (graph of order 60), if...
- loops are possible?
- loops are not possible?
- Density: number of lines (size of graph) as % of possible number of lines
- Complete network: density = 100% (all possible lines are present)
- Problem: density tends to decrease rapidly with graph order, since the number of ties a person can maintain is usually limited. (Assuming a person can have only 3 friends, create a network with 5 people and another with 10, and see for yourself...)
- Application: Attiro.paj
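The counting behind density can be sketched in a few lines of Python. This is only an illustration, not how Pajek computes it: the function names (`possible_lines`, `density`, `max_density_with_degree_cap`) are made up for this note, and the degree cap of 3 is the "only 3 friends" assumption from the exercise above, read as an undirected friendship network without loops.

```python
def possible_lines(n, directed=False, loops=False):
    """Number of distinct lines a network of order n could contain."""
    pairs = n * (n - 1)          # ordered pairs of distinct vertices
    if not directed:
        pairs //= 2              # an edge {a, b} is the same as {b, a}
    return pairs + (n if loops else 0)


def density(num_lines, n, directed=False, loops=False):
    """Lines present as a proportion of the lines that are possible."""
    return num_lines / possible_lines(n, directed, loops)


def max_density_with_degree_cap(n, cap):
    """Upper bound on density when no vertex may exceed degree `cap`
    (undirected, no loops). Each edge uses up two units of degree."""
    max_lines = min((n * cap) // 2, possible_lines(n))
    return density(max_lines, n)


print(possible_lines(60))                             # 1770
print(possible_lines(60, loops=True))                 # 1830
print(possible_lines(60, directed=True))              # 3540
print(possible_lines(60, directed=True, loops=True))  # 3600
print(max_density_with_degree_cap(5, 3))
print(max_density_with_degree_cap(10, 3))
```

With the cap held at 3, the bound on density drops from 7/10 for five people to 15/45 for ten, which is the effect the exercise asks you to see for yourself.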
- Average degree
- Adjacent vertices: two vertices connected by a line
- Degree of a vertex: number of incident lines. Note that in a directed network both incoming and outgoing arcs are counted, and each of several multiple lines is counted separately (see f88). A loop counts twice (once in, once out).
- Indegree, outdegree: for digraphs
- Number of adjacent vertices = degree in a simple undirected graph (no loops or multiple lines are allowed)
- However, in a simple digraph indegree + outdegree need not equal the number of adjacent vertices (the same two vertices might be connected in both directions, and there can be loops)
- High degree nodes are central, though the nature of that centrality depends on the interpretation of the links.
- Applications on Attiro.paj (Symmetrize, Info->partition, Partition->Make vector, Info->vector)
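The degree definitions above can be checked on a tiny directed network. The sketch below is an illustration only: the arc list and the names `degree_table` and `average_degree` are hypothetical, and a loop is counted once as incoming and once as outgoing, following the convention in these notes.

```python
from collections import defaultdict

# Hypothetical toy digraph: a <-> b, a -> c, and a loop on c.
ARCS = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "c")]


def degree_table(arcs):
    """Return {vertex: (indegree, outdegree, number of adjacent vertices)}."""
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    neighbors = defaultdict(set)
    for tail, head in arcs:
        outdeg[tail] += 1
        indeg[head] += 1
        if tail != head:              # a loop adds no new neighbor
            neighbors[tail].add(head)
            neighbors[head].add(tail)
    vertices = set(indeg) | set(outdeg)
    return {v: (indeg[v], outdeg[v], len(neighbors[v])) for v in vertices}


def average_degree(arcs):
    """Mean of indegree + outdegree over all vertices that appear in arcs."""
    table = degree_table(arcs)
    return sum(i + o for i, o, _ in table.values()) / len(table)


for vertex, (i, o, adj) in sorted(degree_table(ARCS).items()):
    print(vertex, "in:", i, "out:", o, "degree:", i + o, "adjacent:", adj)
print("average degree:", average_degree(ARCS))
```

In this example vertex b has degree 2 but only one adjacent vertex (the reciprocated tie with a), and vertex c has degree 3 but one adjacent vertex (because of the loop), which is the mismatch noted above for digraphs.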
- Components
- Subnetwork: a subset of vertices, together with lines connecting them
- Maximal subnetwork according to any property: (a) the subnetwork has the property; (b) adding any other vertex destroys the property.
- undirected network: walk, path
- directed network:
- walk, semiwalk
- path, semipath
- connected undirected network: any two vertices (a,b) are connected by a path from a to b
- connected directed network:
- strongly connected: any ordered pair of vertices (a,b) is connected by a path from a to b. (This condition implies that there must also be a path from b to a, since we can equally consider the ordered pair (b,a). Note that in a network that is not strongly connected there could be a path from a to b but not the reverse!)
- weakly connected: any two vertices {a,b} are connected by a semipath from a to b (order doesn't matter - if there's a semipath in one direction, there's a semipath in the other direction)
- Components:
- Undirected graph: a component is a maximal connected subnetwork ("hangs together")
- Directed graph:
- Strong component: maximal strongly connected subnetwork
- Weak component: maximal weakly connected subnetwork (equivalent to components of symmetrized network)
- Note that weak components are always at least as "big" as strong components: every strong component lies inside a single weak component, so a weak component may comprise several strong components, but not the other way 'round.
- Test your understanding: can components overlap? What is the relation between 'component' and 'partition'?
- Application on Attiro.paj: Net->Components->strong/weak (note: eliminate lots of one-vertex components by setting minimum component size = 2), then Draw-partition. Use Info->partition to get stats on what happens in each case.
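To make the difference between weak and strong components concrete, here is a minimal sketch in plain Python. It is not Pajek's algorithm: it uses simple reachability searches, the names `reachable` and `components` are invented for this note, and the example network is hypothetical. A strong component of x is found as the set of vertices that x can reach and that can also reach x; a weak component is found by ignoring arc direction (the symmetrized network).

```python
from collections import defaultdict


def reachable(start, adj):
    """All vertices reachable from `start` by following `adj` (includes start)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def components(vertices, arcs, strong=False, min_size=1):
    """List of components (as sets), in order of first vertex encountered."""
    fwd, bwd, sym = defaultdict(set), defaultdict(set), defaultdict(set)
    for tail, head in arcs:
        fwd[tail].add(head)
        bwd[head].add(tail)
        sym[tail].add(head)   # symmetrized: direction ignored
        sym[head].add(tail)
    assigned = set()
    result = []
    for x in vertices:
        if x in assigned:
            continue
        if strong:
            comp = reachable(x, fwd) & reachable(x, bwd)
        else:
            comp = reachable(x, sym)
        assigned |= comp
        if len(comp) >= min_size:
            result.append(comp)
    return result


# Hypothetical example: a <-> b, b -> c, d -> e.
VERTICES = ["a", "b", "c", "d", "e"]
ARCS = [("a", "b"), ("b", "a"), ("b", "c"), ("d", "e")]

print(components(VERTICES, ARCS))                            # weak
print(components(VERTICES, ARCS, strong=True))               # strong
print(components(VERTICES, ARCS, strong=True, min_size=2))   # drop singletons
```

The `min_size` argument mimics the idea of setting a minimum component size of 2 to discard one-vertex components. Every vertex ends up in exactly one component of each kind, which bears on the "test your understanding" question about components and partitions.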
- Strong or weak? Semantic and pragmatic considerations:
- Semantic: Does line direction matter? If not, use weak (symmetrize).
- Pragmatic: First look at weak, then subdivide using strong (strong components are never larger, and typically smaller). Further subdivide using the notion of k-connected components, e.g. bi-components (2-connected), which stay connected when any single vertex is removed, so that vertices are linked by more than one path.
- Homophily hypothesis: What is the relation between components (interaction network based on visits) and friendship groupings (partition)? Generate Cross table stats: Set partition 1 and 2, and run Partitions->info->Cramer's V, Rajski
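As a rough illustration of the cross-table comparison, the sketch below computes Cramér's V for two partitions of the same vertices from the usual chi-square statistic. It is an assumption-laden sketch: the function name `cramers_v` and the toy partitions are hypothetical, Pajek's own output may be computed or reported differently, and Rajski's measure is not covered here.

```python
from collections import Counter
from math import sqrt


def cramers_v(part1, part2):
    """Cramér's V between two partitions given as equal-length lists of
    class labels (position i holds the class of vertex i)."""
    if len(part1) != len(part2):
        raise ValueError("partitions must cover the same vertices")
    n = len(part1)
    cells = Counter(zip(part1, part2))   # cross table of observed counts
    rows = Counter(part1)
    cols = Counter(part2)
    chi2 = 0.0
    for r, row_total in rows.items():
        for c, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (cells[(r, c)] - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical data: component number vs. friendship grouping per family.
component = [1, 1, 2, 2]
grouping_same = ["x", "x", "y", "y"]    # groupings match components exactly
grouping_mixed = ["x", "y", "x", "y"]   # groupings cut across components

print(cramers_v(component, grouping_same))    # 1.0
print(cramers_v(component, grouping_mixed))   # 0.0
```

Values near 1 would indicate that components and friendship groupings largely coincide, as the homophily hypothesis suggests; values near 0 would indicate that the two classifications are unrelated.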