ikercamarab

How to prepare your data to build a network chart with Vega

Updated: Apr 26, 2020

Your design calls for a network chart, but you realize that your dataset is not ready to use in Vega. Don't worry: it is easy to arrange your data in the required nodes-and-links format. In this post, we explain how the R packages igraph and d3r can solve your problem.


Your data can be stored in many different ways, and we are going to explore a few that we found useful. For example, you could have an adjacency matrix, a square matrix commonly used to represent a graph, where each cell indicates whether the row node and the column node are connected. In that case, the function to use is graph_from_adjacency_matrix().

# Example of an adjacency matrix (random, so it differs between runs)
data <- matrix(sample(0:1, 16, replace=TRUE), nrow=4)
colnames(data) <- c("Brad Pitt", "Meryl Streep",
                    "Martin Scorsese", "Quentin Tarantino")
rownames(data) <- c("Brad Pitt", "Meryl Streep",
                    "Martin Scorsese", "Quentin Tarantino")
# Transform it into an igraph object (directed by default)
library(igraph)
network <- graph_from_adjacency_matrix(data)
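The random matrix above is usually not symmetric and may contain ones on the diagonal, which graph_from_adjacency_matrix() reads by default as directed edges and self-loops. If you want an undirected network without self-loops, one option is to clean the matrix first. The following is a minimal base R sketch; the names `people` and `adj` and the seed are our own illustrative choices, not part of the original example.

```r
# Illustrative names (assumption): people, adj
people <- c("Brad Pitt", "Meryl Streep",
            "Martin Scorsese", "Quentin Tarantino")

set.seed(42)  # arbitrary seed, only to make the example reproducible
adj <- matrix(sample(0:1, 16, replace = TRUE), nrow = 4,
              dimnames = list(people, people))

# Keep only the upper triangle (this drops the diagonal, i.e. self-loops) ...
adj[lower.tri(adj, diag = TRUE)] <- 0
# ... and mirror it so that the matrix becomes symmetric
adj <- adj + t(adj)
```

The cleaned matrix could then be passed to graph_from_adjacency_matrix() with mode = "undirected".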

Alternatively, your dataset may list all the connections in two columns: one with the origin of each link and the other with its end. In that case, we use graph_from_data_frame().

links <- data.frame(
              source = c("Brad Pitt","Brad Pitt","Meryl Streep"),
              target = c("Martin Scorsese", "Quentin Tarantino", 
                         "Martin Scorsese")
)
# Transform it into an igraph object
library(igraph)
network <- graph_from_data_frame(d=links, directed=FALSE)

You can also extend the previous case with additional information about the nodes. This is useful if, for example, you want to distinguish between groups and later give them different colours.

links <- data.frame(
              source = c("Brad Pitt","Brad Pitt","Meryl Streep"),
              target = c("Martin Scorsese", "Quentin Tarantino", 
                         "Martin Scorsese")
)
nodes <- data.frame(
              name=c("Brad Pitt", "Meryl Streep", 
                     "Martin Scorsese", "Quentin Tarantino"),
              group=c("Actor", "Actor", "Director", "Director")
)
# Turn it into an igraph object
library(igraph)
network <- graph_from_data_frame(d=links, vertices=nodes, directed=FALSE)
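When a node table is supplied, graph_from_data_frame() expects every name used in the links to appear in the first column of that table, without duplicates, and stops with an error otherwise. A quick base R check before building the graph can save some debugging. This is only a sketch: the variables `missing_nodes` and `duplicated_nodes` are hypothetical helpers introduced here.

```r
links <- data.frame(
  source = c("Brad Pitt", "Brad Pitt", "Meryl Streep"),
  target = c("Martin Scorsese", "Quentin Tarantino", "Martin Scorsese"),
  stringsAsFactors = FALSE  # keep names as character, also on R < 4.0
)
nodes <- data.frame(
  name  = c("Brad Pitt", "Meryl Streep",
            "Martin Scorsese", "Quentin Tarantino"),
  group = c("Actor", "Actor", "Director", "Director"),
  stringsAsFactors = FALSE
)

# Names that appear in the links but not in the node table
missing_nodes <- setdiff(unique(c(links$source, links$target)), nodes$name)

# Node names listed more than once
duplicated_nodes <- nodes$name[duplicated(nodes$name)]
```

For the example data both vectors are empty, so the node table is consistent with the links.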

The last step is to convert the igraph object to JSON using d3_igraph() from the d3r package.

library(d3r)
data_json <- d3_igraph(network)
write(data_json, "example.json")

We have just generated a JSON file containing the nodes and links of the network.


As a final comment, we had some trouble using the generated file in force-directed graphs because the nodes were not zero-indexed. For our example, this can be solved by using the variable "name" as the id in the link force.

...
"transform": [
        {
          "type": "force",
          "iterations": 100,
          "forces": [
            {"force": "center", "x": {"signal": "width / 2"}, "y": {"signal": "height / 2"}},
            {"force": "collide", "radius": 20},
            {"force": "nbody", "strength": -10},
            {"force": "link", "links": "linkWidth", "id": "datum.name", "distance": 10}
          ]
        }
      ]
...
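An alternative that we have not tested in this post would be to rewrite the links in R so that they refer to nodes by zero-based position instead of by name. As far as we understand, this is how the link force interprets source and target when no id is given. The sketch below uses base R only; `links_indexed` is a hypothetical name, and writing the result back to JSON is not shown.

```r
nodes <- data.frame(
  name = c("Brad Pitt", "Meryl Streep",
           "Martin Scorsese", "Quentin Tarantino"),
  stringsAsFactors = FALSE
)
links <- data.frame(
  source = c("Brad Pitt", "Brad Pitt", "Meryl Streep"),
  target = c("Martin Scorsese", "Quentin Tarantino", "Martin Scorsese"),
  stringsAsFactors = FALSE
)

# match() returns 1-based row positions in the node table;
# subtracting 1 gives zero-based indices
links_indexed <- data.frame(
  source = match(links$source, nodes$name) - 1L,
  target = match(links$target, nodes$name) - 1L
)
```

With the example data, the three links become 0-2, 0-3 and 1-2.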

More information on this issue can be found in the next post.


