# MCSN Tuesday, 15-Nov-11

# General

- I've written up lots of new Pajek help - see Pajek help. Please be sure to master these techniques, especially:
  - how to extract a subnetwork based on a partition, using partitions menu
  - how to extract a subpartition or subvector based on a partition, using vector menu
  - how to compare two partitions
  - how to compare two vectors

- Note: I'm away on Thursday, but class will be held and will be both important and fun, so please do not miss it! (Someone please take attendance.)
  - class affiliation partition
  - network closeness game

- Draft proposals: keep adding and correcting; I will give you feedback.
- We're skipping the chapters on brokerage (analysis of critical information-passing points) and diffusion (how information propagates through networks over time), but please look at them if these concepts pertain to your project.
- Sign up for your presentation slots using Google Calendar.

# Exercise 5.8

- Movies.paj contains
  - a 2-mode network linking 62 producers (mode 1, rows) to 40 composers (mode 2, columns) based on co-production during 1964-76
  - a partition indicating top composers
- questions:
  - do top composers work for relatively few producers (i.e. work repeatedly with the same producers)?
  - do top composers all work for the same producers?

- procedure
  - examine the network, looking at top composers. Any patterns?
  - compare partitions (partitions>info): (1) degree, and (2) being a top composer. What do the cross-tabs tell us?
  - now extract the one-mode net of composers only
  - draw (line values = similarity)
  - examine line value stats
  - run valued core partition (use "input" and "max") and look at stats
  - extract the corresponding "top composers" partition.
  - compare partitions: valued core and "top composers". Where do top composers lie?
  - delete low-valued lines, extract a higher m-slice, and extract the corresponding "top composers". Are they connected? If so, this tells us the top composers work for many of the same producers.
  - try extracting only the top composers from the one-mode network. How are they connected?
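
To make the one-mode step concrete outside Pajek, here is a minimal Python sketch. It assumes the line value between two composers is simply the number of producers they share (Pajek's projection options may weight lines differently), and the data in `works_with` is a made-up toy example, not the contents of Movies.paj. The helper names `project_composers` and `m_slice` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data (NOT from Movies.paj): producer -> composers hired.
works_with = {
    "P1": ["C1", "C2"],
    "P2": ["C1", "C2", "C3"],
    "P3": ["C2", "C3"],
    "P4": ["C4"],
}

def project_composers(two_mode):
    """One-mode projection onto composers.

    Returns {(composer_a, composer_b): number of shared producers}.
    Assumption: line value = count of producers the pair has in common.
    """
    line_values = defaultdict(int)
    for composers in two_mode.values():
        # Every pair of composers hired by the same producer gets +1.
        for a, b in combinations(sorted(set(composers)), 2):
            line_values[(a, b)] += 1
    return dict(line_values)

def m_slice(line_values, m):
    """Keep only the lines with value >= m (the lines that make up an m-slice)."""
    return {pair: value for pair, value in line_values.items() if value >= m}

one_mode = project_composers(works_with)
strong = m_slice(one_mode, 2)
print(one_mode)
print(strong)
```

Composers that remain linked after low-valued lines are dropped are the ones who share several producers, which is the pattern the last steps of the procedure look for.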

# Chapter 6: center and periphery

- Questions:
  - centrality (egocentric): how central is a particular vertex? (the "energize" command enables visualization, but not precise measurement).
    - Intuitively: a vertex is central if it's well-connected.
    - Maximum centrality: center of a star network
  - centralization (sociocentric): how centralized is the network as a whole?
    - Intuitively: a network is centralized if a well-defined segment is central, and the rest is not.
    - Maximum centralization: a star network

- Three approaches to centrality and centralization

## Degree centrality

- Intuition: Let's start with the local view. Those who are well-connected get information faster, and send it out faster. They're more central. So, let's measure centrality simply based on number of neighbors.
- (This is a naive intuition, since I might have lots of neighbors who are not well-connected, but it's a starting point!)
- In a simple network of N vertices, number of neighbors = degree.
- Star formation provides maximum centrality
- Centralization:
  - first compute degree variation as follows:
    - find the most central vertex, and use its centrality as a comparison point, C
    - compute the difference between C and every other vertex's centrality
    - add these differences
  - then divide by the maximum degree variation on N vertices. (Note: this is the variation on a star network, with one vertex at the center and (N-1) around the edge; the former has degree (N-1) and the latter all have degree 1. Each of the (N-1) outer vertices differs from the center by (N-1)-1, so the max is (N-1) * [(N-1)-1] = (N-1) * (N-2). Thus for N=5, the maximum variation is 4 * 3 = 12.)
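
The degree centralization recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is simple and undirected, and it is stored as a dict mapping each vertex to the set of its neighbors (a representation chosen here, not something Pajek uses). The function name `degree_centralization` is hypothetical.

```python
def degree_centralization(adjacency):
    """Degree centralization of a simple undirected network.

    adjacency: dict mapping each vertex to the set of its neighbors.
    """
    n = len(adjacency)
    if n < 3:
        raise ValueError("need at least 3 vertices, else the maximum variation is 0")
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    c = max(degrees)                         # comparison point C
    variation = sum(c - d for d in degrees)  # add up the differences from C
    max_variation = (n - 1) * (n - 2)        # variation of a star on n vertices
    return variation / max_variation

# A star on 5 vertices: center 0, outer vertices 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
# A ring on 5 vertices: everyone has degree 2, so there is no variation.
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

print(degree_centralization(star))
print(degree_centralization(ring))
```

For the star, the variation is 3 + 3 + 3 + 3 = 12, matching the maximum for N=5 worked out above, so the centralization is 1. For the ring it is 0.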

## Closeness centrality

- Intuition: Let's take a more global view. Instead of looking at how well-connected a vertex is at one hop (neighbors), let's look at how well-connected it is to the entire network. So we want to check on how long the paths are connecting it to everyone else. Those with short paths are closer to the network as a whole - they're more central.
- Geodesic: shortest path between 2 vertices
- Distance from vertex A to B (call it d(A,B)) is the length of the geodesic from A to B.
- Note that d(A,B) = d(B,A) for undirected networks (symmetry), since all lines are bidirectional.
- In a star network, the center is most closeness-central - it's at one hop from everyone. So we compute closeness centrality of any vertex by comparing to this star-center.
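- More precisely: for each vertex in a network of size N
  - compute the sum of the distances to every other vertex
  - divide the number of other vertices (N-1), which is the sum of distances for the star-center, by this sum. So closeness = (N-1) / (sum of distances), which is 1 for the star-center and smaller for vertices farther from the rest of the network.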

- Compute centralization as before:
  - compute closeness variation
    - find the most central vertex, and use its closeness centrality as a comparison point, C
    - compute the difference between C and every other vertex's closeness centrality
    - add all these differences
  - divide by the closeness variation of a star network of the same size (the maximum possible)

- Note that this technique can only work if the network is connected! (one component)
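
Here is a rough Python sketch of the closeness steps, assuming an undirected, unweighted network stored as a dict of neighbor sets (my choice of representation). Distances come from a breadth-first search. Rather than plugging in a closed formula for the maximum, the sketch follows the notes literally and builds a star of the same size to get the comparison variation. All function names are hypothetical.

```python
from collections import deque

def distances_from(adjacency, source):
    """Geodesic distance from source to every reachable vertex (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(adjacency):
    """Closeness of each vertex: (N-1) / (sum of distances to all others)."""
    n = len(adjacency)
    scores = {}
    for v in adjacency:
        dist = distances_from(adjacency, v)
        if len(dist) < n:
            # Some vertex is unreachable, so the sum of distances is undefined.
            raise ValueError("network is not connected")
        scores[v] = (n - 1) / sum(dist.values())
    return scores

def variation(scores):
    """Sum of differences between the top score C and every score."""
    c = max(scores.values())
    return sum(c - s for s in scores.values())

def star(n):
    """Star on n vertices: center 0, outer vertices 1..n-1."""
    adjacency = {0: set(range(1, n))}
    for i in range(1, n):
        adjacency[i] = {0}
    return adjacency

def closeness_centralization(adjacency):
    n = len(adjacency)
    if n < 3:
        raise ValueError("need at least 3 vertices")
    return variation(closeness(adjacency)) / variation(closeness(star(n)))

path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(closeness(path))
print(closeness_centralization(path))
```

In the 5-vertex path, the middle vertex has distances 2, 1, 1, 2 to the others, so its closeness is 4/6; an end vertex has distances 1, 2, 3, 4, giving 4/10. The `ValueError` is the code version of the connectedness warning above.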

## Betweenness centrality

- Intuition: a vertex is central if lots of geodesics pass through it
- Define the betweenness centrality of a vertex as the proportion of all geodesics between pairs of other vertices that pass through it.
- Betweenness centralization is the variation in this centrality, divided by the maximum possible variation.
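
A small Python sketch of betweenness, under some assumptions that go beyond the notes: the network is undirected and unweighted (dict of neighbor sets); when a pair of vertices has several geodesics, each pair contributes the fraction of its geodesics that pass through the vertex; the total is divided by the number of other pairs, (N-1)(N-2)/2; and the maximum variation is taken to be that of a star (center 1, everyone else 0, so N-1). This is a brute-force illustration for small networks, not an efficient algorithm, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def geodesic_counts(adjacency, source):
    """BFS from source: distance and number of geodesics to each reachable vertex."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                # Every geodesic reaching v extends to a geodesic reaching w.
                count[w] += count[v]
    return dist, count

def betweenness(adjacency):
    """Share of geodesics between other pairs that pass through each vertex."""
    vertices = list(adjacency)
    n = len(vertices)
    if n < 3:
        raise ValueError("need at least 3 vertices")
    info = {v: geodesic_counts(adjacency, v) for v in vertices}
    other_pairs = (n - 1) * (n - 2) / 2
    scores = {}
    for v in vertices:
        dist_v, count_v = info[v]
        total = 0.0
        for s, t in combinations(vertices, 2):
            if v in (s, t):
                continue
            dist_s, count_s = info[s]
            if t not in dist_s or v not in dist_s:
                continue  # s cannot reach t or v: no geodesic to count
            # v lies on an s-t geodesic exactly when the distances add up.
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += count_s[v] * count_v[t] / count_s[t]
        scores[v] = total / other_pairs
    return scores

def betweenness_centralization(adjacency):
    scores = betweenness(adjacency)
    c = max(scores.values())
    # Assumed maximum variation: a star, where it equals N-1.
    return sum(c - s for s in scores.values()) / (len(adjacency) - 1)

star5 = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(betweenness(star5))
print(betweenness_centralization(star5))
```

In the star, every geodesic between two outer vertices runs through the center, so the center scores 1 and the outer vertices score 0, giving centralization 1.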