Import of Excel Files

This feature allows for the import of files from spreadsheet software like Microsoft Excel®, OpenOffice.org Calc, and Google Docs. At the moment, yEd can open the following file formats directly:

  • MS Excel 97/2000/XP (*.xls)
  • MS Excel 2007 (*.xlsx)

By using one of the above mentioned applications as intermediate converter, a large number of other file formats is also supported, for example text files containing comma-separated values (*.csv).

In addition to the plain graph structure, i.e., the nodes and edges, an arbitrary number of custom properties can be imported. At the moment, the following means of representation are supported: adjacency matrix, edge list, and node list.

Basics

The import dialog consists of the settings panel on the left-hand side and the spreadsheet panel on the right-hand side.

The Spreadsheet Panel

The spreadsheet panel shows the file content in one table per sheet similar to other tools. The relevant data ranges can be selected using the mouse and adopted into the matching settings item.

The Settings Panel

The settings panel consists of the sections Data and Presentation. Data contains separate groups of settings for every means of representation. These settings are needed for correctly opening the file. The adjacency matrix and edge list representations are mutually exclusive and a node list can be used in combination with both of them.

In Presentation, there are optional settings to control the details of the presentation of the graph. The same results can be achieved by using the corresponding layout algorithms and tools retroactively.

Means of Representation

The most important properties of the means of representation available in yEd are discussed in dedicated sections. Additionally, we exemplarily import a small graph of 5 nodes, 8 edges, and some custom properties there:

For more general information, the Wikipedia article on adjacency matrix is a good starting point: http://en.wikipedia.org/wiki/Adjacency_matrix.

Supplemental Information

Adjacency Matrix

This representation is based on a matrix in which each node is related to one row and one column. Thus, a graph of n nodes requires a matrix of size n x n. The cells of the matrix define the edges of the graph and an edge connecting the nodes i and j exists if and only if the cell at position (i,j) is not empty and its value is not equal to 0.

The values of cells which yield edges are provided in a custom edge property. Optionally, yEd can treat the values of the first row and the first column of the matrix as node IDs and import them as custom property. Typically, the sequence of nodes implied by the rows and columns is the same. However, if node IDs are present in both the first row and the first column, the mapping to nodes respects these IDs. Thereby, yEd handles differing mappings for rows and columns correctly, which is useful for example for bipartite representations. To allow additional custom node properties besides the node ID, this representation can be combined with a node list. In this case, nodes from the matrix and from the list are mapped according to the specified node IDs.

Example

The above illustration shows the representation of the example graph as an adjacency matrix. The cells A4:A8 in column A and B3:F3 in row 3, which are highlighted in blue, contain node IDs, and the other cells in the range B4:F8 represent potential edges. Thus, the resulting graph contains 8 edges, defined by the cells of value 1. For example, the cells C4, D4, and E4 establish edges from Anna to Ben, to Chris, and to Dan, respectively, and there is no edge from Anna to Elena since the value of cell F4 is 0. The following screenshot shows the settings for the import of the example matrix including the node IDs from the first row and the first column. Please note that the input field Data Range contains the cell range and a sheet name (here ‘Matrix 1’). Since the node IDs in rows and columns are the same, using only one of them results in the same graph. For example, it is sufficient to specify the range A4:F8 and enable the setting Node IDs in First Column.

Additional Options

Typically, the sequence of nodes represented by rows and by columns is the same, i.e., row i and column i represent the same node. If node IDs are present simultaneously for rows and for columns, the mapping respects the IDs and the sequences can be different. Thus, it is possible to import bipartite matrix representations, for example. Additionally, empty cells are supported and are handled like cells of value 0. The following example represents a graph which is equal to the example above except for the values of the edge property.

Edge List

This means of representation describes the graph in a list of edges. Every edge is related to a separate row which contains the IDs of the two end nodes of the edge and an arbitrary number of additional values which are provided as custom edge properties. The sequence of the types of the entries of all rows must be identical, that is, for example, if the ID of the source node is the second entry of the first row, it must be the second entry of all other rows, too. The node IDs are provided as custom (node) property. To allow additional custom node properties besides the node ID, this representation can be combined with a node list. In this case, nodes from the matrix and from the list are mapped according to the specified node IDs.

Example

The table above shows the representation of the example graph as edge list. The cells of row 3 which are highlighted in blue contain the names of custom edge properties and each following row corresponds to one edge. In each row, the columns A and B indicate the IDs of source and target nodes followed by the values of the edge properties Weight and Label. For example, the rows 4, 5, and 6 describe edges from Anna to Ben, to Chris, and to Dan, respectively. If the imported graph does not contain the edges you expected, please check if you selected the correct columns for the IDs.

Node List

A node list represents only nodes and no edges. Its main use case is the combination with one of the other means to specify additional node properties, however, it can also be used on its own. The format is similar to edge lists, but every row is related to a node. Besides the ID of the respective node, it can contain an arbitrary number of additional values which are provided as custom node properties. Just like for edge lists, the sequence of the types of the entries of all rows must be identical, that is, for example, if the ID of the node is the second entry of the first row, it must be the second entry of all other rows, too.

Example

The above illustration shows a node list that is compatible with the example graph. The cells of row 3 which are highlighted in blue contain the names of custom node properties and each following row corresponds to one node. The values in column A (the property Name) can be used as node IDs to match each row and its corresponding node to a node from another type of representation. If there are IDs in the node list which do not match to nodes of another representation, additional nodes are created for them. Accordingly, if a node list is used without an additional representation, the resulting graph will contain unconnected nodes only.

Settings in Section ‘Data’

Edge Representation

Determines the means of representation of the edges. Available options are adjacency matrix and edge list. Note that a node list can be combined with any of these.

Data Range of Adjacency Matrix

Determines the range of cells which contains an adjacency matrix. If the aspect ratio of the specified range is invalid, empty cells are used to pad the shorter direction.

Node IDs in First Row / Column

Specifies whether the first row (column) of the data range contains node IDs. Such header rows (columns) are ignored when creating edges.

Data Range of Edge List

Determines the range of cells which contains an edge list. In order to create a graph, the columns of source and target IDs are also required. Besides these IDs, each row may contain an arbitrary number of custom properties.

Column of Source / Target ID

Determines which column contains the ID of the source node (target node). For convenience, ‘Adopt’ utilizes the column of the left cell of the current selection. Therefore, it is not necessary to select exactly the respective cells.

Property Names in First Row

Specifies whether the first row of the data range contains property names. Such a header row is ignored when creating edges.

Data Range of Node List

Determines the range of cells which contains a node list. If a node list is used in combination with another representation, the column of node IDs is required to match corresponding nodes of the two representations. Besides the ID, each row may contain an arbitrary number of custom properties.

Column of Node ID

Determines which column contains the ID of the node. For convenience, ‘Adopt’ utilizes the column of the left cell of the current selection. Therefore, it is not necessary to select exactly the respective cells.

Property Names in First Row

Specifies whether the first row of the data range contains property names. Such a header row is ignored when creating nodes.

Settings in Section ‘Presentation’

The settings on this tab enable you to determine the visual representation of newly created nodes and edges. Additionally, you can also choose the type of automatic layout that is applied to the graph that results from importing the spreadsheet data.

Nodes

The following settings allow you to specify aspects of the node representation. You can use the values of a custom property as adopted from the spreadsheet data as the label text of newly created nodes.

Alternatively, and more powerfully, you can use a node configuration as defined in the Properties Mapper dialog to turn custom properties as adopted from the spreadsheet data into visual properties of the newly created nodes.

Use Configuration
If enabled, one of the node configurations as defined in the Properties Mapper dialog will be applied to the nodes after importing the spreadsheet data. You can choose the configuration from the drop-down combo box.
Label Text
Allows you to select a custom property whose values are displayed as node labels. It is also possible to do without any labels.
Fit Size to Label
Specifies whether the size of the nodes is adjusted to fit the length of its label. Internally, the tool Fit Node to Label is used. Thus, it is possible to fine-tune the node size adjustment by changing the settings of the tool.
Template
Determines the template for the visual representation of the nodes. Alternatives are the currently selected sample representation from the palette and a rectangular shape.

The latter options are only available when ‘Use Configuration’ is disabled.

Edges

The following settings allow you to specify aspects of the edge representation. You can use the values of a custom property as adopted from the spreadsheet data as the label text of newly created edges.

Alternatively, and more powerfully, you can use a edge configuration as defined in the Properties Mapper dialog to turn custom properties as adopted from the spreadsheet data into visual properties of the newly created edges.

Use Configuration
If enabled, one of the edge configurations as defined in the Properties Mapper dialog will be applied to the edges after importing the spreadsheet data. You can choose the configuration from the drop-down combo box.
Label Text
Allows you to select a custom property whose values are displayed as edge labels. It is also possible to do without any labels.
Template
Determines the template for the visual representation of the edges. Alternatives are the currently selected sample representation from the palette and polyline edges with or without target arrows.

The latter options are only available when ‘Use Configuration’ is disabled.

Layout

Determines the initial layout. Alternatives are circular, organic, and hierarchical layout. The computation takes the settings of the respective layout into account. Thus, their controls can be used to fine-tune the result.

Frequently Asked Questions

There is an additional node (edge) in my imported graph. Or: One node (edge) is missing in my imported graph.

Please check if your data ranges contain header rows and the settings ‘Property Names in First Row’ for node list and edge list are set accordingly.