Pajek help

From CCE wiki archived
Revision as of 11:34, 10 November 2011 by Michaelf (talk | contribs) (Pajek tips)

Installing Pajek on your computer

See these instructions.

Pajek data files

Pajek reads and saves network data by using special files with extensions such as "net", "clu", "vec", and "paj". The "net" file stores a network; "clu" and "vec" store cluster (integer) and vector (real) arrays, assigning one value per node. A "paj" file can combine other files, saving the time of reading them all in separately. You may create networks from within Pajek, in which case Pajek will create these files for you when you save. Or you may wish to create the files yourself. But be careful: these files must be pure text, without the additional formatting information that word processors such as Word usually add, or they won't work.

Creating Pajek files in Pajek

Pajek is capable of creating and editing networks, which can then be saved in the form of Pajek files. The easiest way to create a network from Pajek is to use the command:

Net->Random Network->Total number of arcs

Tell Pajek how many vertices you want, and then ask it to create zero arcs. The result is a graph with no arcs. Draw the graph, and you can add the arcs yourself. Or you can edit the graph in Pajek's main screen under "Network".

Editing Pajek files

Pajek files are pure text, and easily readable or writable. The trick is to keep them pure text.

Do not use tab characters! Formatting using spaces is much safer!!
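
As a quick check before loading a file into Pajek, the following minimal Python sketch reports which lines contain tab characters and replaces them with spaces. The function names and the width of four spaces are illustrative assumptions, not anything Pajek prescribes.

```python
def find_tab_lines(text):
    """Return the 1-based numbers of the lines that contain a tab character."""
    return [n for n, line in enumerate(text.splitlines(), start=1) if "\t" in line]


def tabs_to_spaces(text, width=4):
    """Replace every tab with a run of spaces (the width is an arbitrary choice)."""
    return text.replace("\t", " " * width)


# A small sample in which line 2 was typed with a tab after the vertex number.
sample = '*Vertices 2\n1\t"Doc1"\n2 "Doc2"\n'
print(find_tab_lines(sample))
print(tabs_to_spaces(sample))
```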

On Windows, create data files using a simple text editor, such as Wordpad, that saves text only.

For Mac, follow these instructions:

  • Create a folder on your Desktop called "AAA Pajek Data", with as many subfolders as you like. You can create an alias to this folder and keep it anywhere else, but having this folder on the Desktop will save time, because the Mac version of Pajek requires that you navigate from the top level of your directory hierarchy.
  • I suggest that you create your Pajek files using Word. Be sure to turn off "Smart Quotes" (so that quotation marks appear "straight"); in MS Word 2008 for Mac this option appears under menus Tools/AutoCorrect/AutoFormat as You Type (other versions may place it elsewhere). Be sure to enter a carriage return ("return") after every line, and leave no blank lines. Save your files using output format “Text only with line breaks”, or "Plain text" (with option "MS DOS"). This will force the file into the required plaintext PC format (where line breaks contain both a carriage return and a newline, unlike unix which uses newline only). Word wants to add the ".txt" extension to such files - delete that and replace with the required file extension instead. Other text editors may also be able to create readable Pajek files - try and see what works.
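
The clean-up steps described above (straight quotes, no blank lines, PC-style line breaks) can also be applied after the fact. Here is a minimal Python sketch, assuming the text has already been read into a string; the function name `to_pajek_text` is hypothetical.

```python
# Map "smart" (curly) quotes to their straight equivalents.
SMART_QUOTES = {
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
}


def to_pajek_text(text):
    """Straighten quotes, drop blank lines, and end every line with CR+LF."""
    for curly, straight in SMART_QUOTES.items():
        text = text.replace(curly, straight)
    # splitlines() accepts Unix, old Mac, and PC line endings alike.
    lines = [line.rstrip() for line in text.splitlines()]
    lines = [line for line in lines if line]          # leave no blank lines
    return "".join(line + "\r\n" for line in lines)   # the last line is terminated too


print(repr(to_pajek_text('*Vertices 1\n\n1 \u201cDoc1\u201d')))
```

When writing the result to disk from Python, open the file with `newline=""` so that the carriage returns already present in the string are written unchanged.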

Renaming file extensions

For either Mac or Windows, saving as pure text inevitably adds a ".txt" file extension. You'll need to change this extension to what Pajek wants (e.g. from "txt" to "net", "clu", "vec", or "paj", as needed), or it won't read the files. This is simply a matter of editing the file name. The problem is that operating systems like to hide the extension from you (thinking you don't want to see it!). Here's what to do:

For Windows: from the Start menu, go to Control Panel, select Folder Options, select the View Tab, and then uncheck Hide extensions for known file types.

On the Mac, go to Finder, select Preferences, then Advanced, and check Show all file extensions.

Once you can see the file extensions, you can change them by renaming the file (usually a right or control-click will do it). Delete txt and insert the correct Pajek extension instead.
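
If you have many files to rename, a script may be quicker than renaming by hand. The following Python sketch uses the standard `pathlib` module; the function name `rename_to_pajek` and the handling of doubled extensions such as "mynet.net.txt" are assumptions made for illustration.

```python
from pathlib import Path

PAJEK_EXTENSIONS = {".net", ".clu", ".vec", ".paj"}


def rename_to_pajek(path, new_ext):
    """Rename e.g. 'mynet.txt' to 'mynet.net' and return the new path."""
    if new_ext not in PAJEK_EXTENSIONS:
        raise ValueError(f"not a Pajek extension: {new_ext}")
    path = Path(path)
    if path.suffix == ".txt" and Path(path.stem).suffix == new_ext:
        # 'mynet.net.txt' -> 'mynet.net': just drop the trailing '.txt'
        target = path.with_name(path.stem)
    else:
        # 'mynet.txt' -> 'mynet.net': swap the extension
        target = path.with_suffix(new_ext)
    path.rename(target)
    return target
```

For example, `rename_to_pajek("mynet.txt", ".net")` renames the file in the current directory to "mynet.net".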

Pajek file formats

Pajek uses primarily three file types: .net, .clu, .vec (network, partition, vector). The .paj type allows you to combine multiple networks, partitions, and vectors in one file. You'll generally generate .paj files using Pajek. However you may be creating .net, .clu, and .vec files using spreadsheets and word or text processors.

Network files

Network files (extension ".net") define a network, as in the following:

*Vertices 3
1 "Doc1" 0.0 0.0 0.0 ic Green bc Brown
2 "Doc2" 0.0 0.0 0.0 ic Green bc Brown
3 "Doc3" 0.0 0.0 0.0 ic Green bc Brown
*Arcs
1 2 3 c Green
2 3 5 c Black
*Edges
1 3 4 c Green

This example defines 3 vertices (Doc1, Doc2 and Doc3) denoted by numbers 1, 2 and 3. The (fill) color of these nodes is Green and the border color is Brown. The initial layout location of the nodes is (0,0,0). Note that the (x,y,z) values can be changed interactively after drawing.

There are two arcs (directed edges). The first goes from node 1 (Doc1) to node 2 (Doc2) with a weight of 3, colored Green. The second goes from node 2 (Doc2) to node 3 (Doc3) with a weight of 5, colored Black.

There is one edge (undirected) from node 1 (Doc1) to node 3 (Doc3) with a weight of 4, colored Green.

Note that you must include a carriage return/new line after the last line of a Pajek file!
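
To make the structure of a network file concrete, here is a minimal Python sketch that reads one into plain data structures. It assumes the standard Pajek section headers `*Vertices`, `*Arcs`, and `*Edges`, ignores coordinates and color parameters, and treats a missing weight as 1 (an assumption of this sketch). The function name `parse_net` is hypothetical, and unquoted labels containing apostrophes are not handled.

```python
import shlex


def parse_net(text):
    """Return (vertices, arcs, edges) read from the text of a .net file.

    vertices maps vertex number -> label; arcs and edges are lists of
    (from_vertex, to_vertex, weight) tuples.
    """
    vertices, arcs, edges = {}, [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # A header such as "*Vertices 3" or "*Arcs" starts a new section.
            section = line.split()[0].lower()
            continue
        fields = shlex.split(line)   # keeps a quoted label such as "Doc 1" together
        if section == "*vertices":
            vertices[int(fields[0])] = fields[1]
        elif section in ("*arcs", "*edges"):
            weight = float(fields[2]) if len(fields) > 2 else 1.0
            line_data = (int(fields[0]), int(fields[1]), weight)
            (arcs if section == "*arcs" else edges).append(line_data)
    return vertices, arcs, edges


sample = (
    '*Vertices 3\n'
    '1 "Doc1" 0.0 0.0 0.0 ic Green bc Brown\n'
    '2 "Doc2" 0.0 0.0 0.0 ic Green bc Brown\n'
    '3 "Doc3" 0.0 0.0 0.0 ic Green bc Brown\n'
    '*Arcs\n'
    '1 2 3 c Green\n'
    '2 3 5 c Black\n'
    '*Edges\n'
    '1 3 4 c Green\n'
)
vertices, arcs, edges = parse_net(sample)
print(vertices, arcs, edges)
```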

Partition files

Partition files (extension ".clu") divide the vertices into classes, called partitions. These partitions cannot overlap, and yet must cover all the vertices. In other words, every vertex is assigned to one and only one partition. Here's an example:

*Vertices 3
4
8
8

In this example, the three vertices are assigned to two different partitions: vertex #1 in partition #4, and the other two in partition #8 (note that the vertex numbers do not appear, but are simply inferred by ordering).
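
If the class numbers already exist as a list (for instance, pasted from a spreadsheet column), a partition file can be generated rather than typed. This is a minimal Python sketch; the function name `format_clu` is hypothetical, and the output uses the PC-style line endings recommended earlier on this page.

```python
def format_clu(classes):
    """Build the text of a .clu file from one integer class number per vertex."""
    for value in classes:
        if not isinstance(value, int):
            raise TypeError(f"partition values must be integers, got {value!r}")
    lines = [f"*Vertices {len(classes)}"] + [str(value) for value in classes]
    # Every line, including the last one, ends with a line break.
    return "".join(line + "\r\n" for line in lines)


# Vertex 1 in class 4, vertices 2 and 3 in class 8, as in the example above.
print(format_clu([4, 8, 8]))
```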

Vector files

Vector files (extension ".vec") assign a number to every vertex.

Here's an example:

*Vertices 3
0.35
8.9
100.0

In this example, the first vertex is assigned a value of .35, the second 8.9, and the third 100.0. Again, the vertex numbers do not appear, but are simply inferred by ordering.
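
Because vertex numbers are implied by position, a missing or extra line silently shifts every later value. The following Python sketch reads a vector file and checks that the number of values matches the `*Vertices` header; the function name `parse_vec` is hypothetical.

```python
def parse_vec(text):
    """Return the list of real values from the text of a .vec file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    if header[0].lower() != "*vertices":
        raise ValueError("first line must be '*Vertices n'")
    expected = int(header[1])
    values = [float(line) for line in lines[1:]]
    if len(values) != expected:
        raise ValueError(f"header announces {expected} values, found {len(values)}")
    return values   # values[0] belongs to vertex 1, values[1] to vertex 2, ...


print(parse_vec("*Vertices 3\r\n0.35\r\n8.9\r\n100.0\r\n"))
```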

What's the difference between vec and clu?

Numerically, vector values are continuous (real numbers), while partition values are discrete (integers). More importantly, they have different interpretations. A vector assigns a real value to each vertex - these values are not expected to repeat, or define classes. A partition divides the vertices into classes, most typically with more than one vertex per class.

For instance, if people are vertices, a vector might define their weights or heights. We don't expect these values to repeat exactly. A partition might divide them into classes based on educational level, assigning the value 0 for those who haven't finished elementary school, 1 for those who have completed only elementary school, 2 for high school, and 3 for college.

But Pajek can convert partitions to vectors (Partition->Make Vector) easily enough (integer values are also real numbers). Slightly more complex is converting a vector to a partition (Vector->Make Partition) since real values must first be "rounded" to integers. Ensuring that these conversions make sense is up to the user!
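
The idea behind the two conversions can be sketched in a few lines of Python. This only illustrates the concept with simple rounding; it is not a description of how Pajek's own Make Partition command computes its result, and the function names are hypothetical.

```python
def partition_to_vector(classes):
    """Every integer class number is already a valid real value."""
    return [float(value) for value in classes]


def vector_to_partition(values):
    """Round each real value to the nearest integer.

    Python's round() sends exact halves to the nearest even integer
    (2.5 -> 2), which is one of several possible rounding conventions.
    """
    return [int(round(value)) for value in values]


print(partition_to_vector([4, 8, 8]))
print(vector_to_partition([0.35, 8.9, 100.0]))
```

Note how 0.35 and 8.9 lose information when rounded: whether the resulting classes are meaningful depends entirely on what the vector measured.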

Pajek datasets for ESNAP

Download these sample data sets for use with the textbook, ESNAP. Store them in your Pajek Data directory for future use.

Pajek tips

  • To edit vertex labels, use the Partition section of the Pajek main screen. Edit any partition (or create a null partition using Partition->Create null partition). Click the edit button (small hand holding pencil). You can then edit vertex labels.
  • Creating networks:
    • Create networks quickly using txt2Pajek, or create a random network (Net>Random network>total number of arcs) with no lines then add them yourself by clicking with the mouse in the Draw window
    • Or edit the .net file yourself - use any of the book's sample files as a starting point to get the format right
  • Creating partitions and vectors:
    • Create partitions and vectors quickly by using commands Partition>Create null partition or Vector>Create identity vector, then editing these (by clicking the hand/pencil icon) to insert the values you want.
    • Or, for bigger networks, you can simply edit the file yourself (.clu for partitions; .vec for vectors), using any of the book's sample files as a starting point and model. If you can cut and paste the data from a website, this way may be faster.
    • Remember that the partition must be the same size as the network, so that every vertex gets a value
  • Comparing multiple partitions. Typically you want to find the relationship between two or more partitions - one derived from attribute data external to the network itself (for instance, gender, age, musical genre, etc.) and the other resulting from a Pajek analysis of network connections (for instance, specifying, for each vertex, which component it's in - or its degree, or which component of the 4-core (or 4-slice) it's in, or how many triads it participates in). How to do this?
    • Make sure you have both partitions available in the drop-down list.
    • Select one of the two.
    • Choose Partitions>First partition
    • Select the second of the two
    • Choose Partitions>Second partition
    • Run Partitions>Info>Cramers V/Rajski
    • Examine the resulting cross-table and interpret the statistical results, as per ESNAP

  • After analyzing with Pajek, to save all your work, use
  • To create a partition
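
To see what the cross-table and the Cramér's V statistic from the partition comparison tip above represent, here is a minimal Python sketch that computes both for two partitions given as lists of class numbers. It uses the textbook chi-square definition of Cramér's V, does not compute the Rajski coefficients, and is not taken from Pajek itself; the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def cross_table(first, second):
    """Count how many vertices fall into each (first class, second class) cell."""
    if len(first) != len(second):
        raise ValueError("both partitions must cover the same vertices")
    return Counter(zip(first, second))


def cramers_v(first, second):
    """Cramér's V: 0 means no association, 1 means perfect association."""
    table = cross_table(first, second)
    n = len(first)
    row_totals = Counter(first)
    col_totals = Counter(second)
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0   # undefined with a single class; 0 is a convention of this sketch
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = table.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical data: an attribute partition and a component partition
# that line up perfectly for four vertices.
attribute = [1, 1, 2, 2]
component = [5, 5, 7, 7]
print(dict(cross_table(attribute, component)))
print(cramers_v(attribute, component))
```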

Converting lists of vertex labels to Pajek files

Pajek files require you to number vertices, and define arcs and edges by vertex number, not label, requiring you to keep track of the number/label relationship. The following two programs are very useful for generating Pajek files out of data comprising a list of label pairs. Each label automatically defines a vertex. Be careful of misspellings! Note also the difference between 1-mode and 2-mode, and between arc and edge networks.
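
The core idea behind such converters can be sketched in Python as follows. This is an illustration only, not the code of either program: it handles 1-mode networks, writes every line with weight 1, and uses the hypothetical function name `pairs_to_net`.

```python
def pairs_to_net(pairs, directed=True):
    """Build the text of a 1-mode .net file from (label, label) pairs."""
    pairs = list(pairs)
    numbers = {}                      # label -> vertex number, in order of first appearance
    for source, target in pairs:
        for label in (source, target):
            if label not in numbers:
                numbers[label] = len(numbers) + 1
    lines = [f"*Vertices {len(numbers)}"]
    lines += [f'{number} "{label}"' for label, number in numbers.items()]
    lines.append("*Arcs" if directed else "*Edges")
    lines += [f"{numbers[s]} {numbers[t]} 1" for s, t in pairs]
    return "".join(line + "\r\n" for line in lines)


print(pairs_to_net([("Ann", "Bob"), ("Bob", "Cy")]))
```

Labels are matched exactly, so "Bob" and "bob" would become two different vertices; this is why misspellings matter.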



Documentation and Tutorials