Export the structure of the graph (not the layout) as XML

Question

Hi,

This request is about adding an "Export graph as XML without layout" option to yEd.

This new file format would save only nodes, relations, groups and labels. Icons could be saved by using their filenames and/or "src" attributes (so no more embedding them as Base64).

Rationale: yEd generates large GraphML files because they contain both "structure" and "layout" (in the yWorks namespace), whereas programmers want only the "structure" part. Writing an XSL stylesheet is already possible, but it would be easier to transform a file containing only useful (that is, structural) information.

Use case: a programmer wants to use yEd as a tool to model something as a graph. Once he's happy with the model, he saves it in this new "structural XML" format, and starts building something around it.

EDIT: replaced "save" with "export" (this would not be an in/out file format, only an "out" one)

asked Sep 11, 2013 in Feature Requests by geceo (600 points)
edited Sep 11, 2013 by geceo

1 Answer

thomas.behr · Answer 1 · 2013-09-11T12:25:22+0000

You realize that it is very easy to parse only the structure from a GraphML file, don't you? Simply ignore all <data> elements and you are done. Any decent XML framework will allow you to do that. No need to use XSLT at all.
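As a rough illustration of "ignore all <data> elements", here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The sample document and the function name read_structure are made up for this example; the namespace URI is the standard GraphML one. Nested graphs (groups) are flattened here, which may or may not be what you want.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace, written in ElementTree's "{uri}tag" form.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"

# Hypothetical, heavily shortened GraphML document used only for this sketch.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key for="node" id="d0"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="d0">ignored</data></node>
    <node id="n1"/>
    <edge id="e0" source="n0" target="n1"/>
  </graph>
</graphml>"""


def read_structure(xml_text):
    """Return (node ids, (source, target) pairs); <data> is never looked at."""
    root = ET.fromstring(xml_text)
    nodes = [n.get("id") for n in root.iter(GRAPHML_NS + "node")]
    edges = [(e.get("source"), e.get("target"))
             for e in root.iter(GRAPHML_NS + "edge")]
    return nodes, edges


nodes, edges = read_structure(SAMPLE)
print(nodes)
print(edges)
```

Because the code only asks for <node> and <edge> elements, any <data> children are skipped without special handling.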

That said, you actually do not want "structure only". Otherwise you would not want labels and icons. Fortunately, that information is included in yEd's GraphML, too. (Admittedly, the GraphML format used by yEd does a poor job of separating geometry data from other visualization data such as label text or label icons, but it is possible to extract text and icons nonetheless.)
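For the label text, a sketch along the following lines might be enough. It assumes that node labels appear as y:NodeLabel elements somewhere inside a <data> child of the <node>, and that the yWorks namespace URI is http://www.yworks.com/xml/graphml; check both against a file saved by your yEd version. The sample document and the function name node_labels are hypothetical, and icons are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes for ElementTree path queries.
# The "y" URI is an assumption about yEd's output; verify it in your files.
NS = {
    "g": "http://graphml.graphdrawing.org/xmlns",
    "y": "http://www.yworks.com/xml/graphml",
}

# Hypothetical, shortened yEd-style document used only for this sketch.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
         xmlns:y="http://www.yworks.com/xml/graphml">
  <key for="node" id="d6" yfiles.type="nodegraphics"/>
  <graph id="G" edgedefault="directed">
    <node id="n0">
      <data key="d6">
        <y:ShapeNode>
          <y:Geometry height="30.0" width="30.0" x="0.0" y="0.0"/>
          <y:NodeLabel>Start</y:NodeLabel>
        </y:ShapeNode>
      </data>
    </node>
    <node id="n1"/>
  </graph>
</graphml>"""


def node_labels(xml_text):
    """Map node id -> label text for every node that has a non-empty label."""
    root = ET.fromstring(xml_text)
    labels = {}
    for node in root.iterfind(".//g:node", NS):
        # Search only this node's own <data> children, not nested group content.
        label = node.find("g:data//y:NodeLabel", NS)
        if label is not None and label.text and label.text.strip():
            labels[node.get("id")] = label.text.strip()
    return labels


print(node_labels(SAMPLE))
```

The geometry element sits right next to the label in the same <data> block, which is the mixing of concerns mentioned above, but it does not get in the way of reading the text.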

By the way, "storing" icons by storing their filenames is basically the most fragile and unportable way there is to store such data.

You need to realize that your requirements are very specific. If we created a dedicated export format for each and every such requirement, there would be hundreds of different export formats in yEd. That is simply not feasible. Having one post-processing friendly format which includes all information is much more sensible.

answered Sep 11, 2013 by thomas.behr [yWorks] (163k points)

Hi,

The current GraphML format (as generated by yEd) is not "friendly": as you say, it mixes layout/graphical elements with structural/information elements, just like HTML did prior to CSS, when there were <font> tags etc.

Removing <data> tags, as you suggest, leads to a loss of "information" (what I previously called "structure"). For example, the text of the labels and the paths to the icons are information (these should be parts of the GraphML grammar but unfortunately are only extensions in another namespace).

My requirements are not specific: I'm requesting a new XML format similar to yEd's GraphML format, but without all the geometry and clutter. I want to look at the file and consider it "clean".

commented Sep 11, 2013 by geceo (600 points)
edited Sep 11, 2013 by geceo

First, GraphML is an XML format and as such it is post-processing friendly compared to proprietary (binary) formats. It is post-processing friendly in the sense that there are lots of standard technologies and tools to process XML.

Second, when talking about graph exchange formats, "structure" means the structure of the graph. In GraphML, <data> elements do not contain information related to the graph structure. That information is stored in <graph>, <node>, and <edge> (and <port>) elements only. Label text is not structural information but visualization data. The same is true for icons.

Third, your requirements are highly specific. Geometry data in a diagram is not clutter. Quite the opposite, geometry data is the most important data besides the graph structure. yEd is all about arranging diagrams. You cannot do that without geometry.

As I mentioned in my previous post, we are aware of the fact that the way data is organized in yEd's GraphML format is far from perfect. (That is why we improved the format for other products.) Unfortunately, changing the format now would break compatibility for a lot of yEd users and many yWorks customers as well. That is not something we will do lightly just to make it look "clean".
That does not mean we will never change the format. Actually, we regularly discuss changing the format - however, up to now the cons always outweighed the pros.
Finally, even if we did change the format, I doubt it would be in a way that you would consider "clean". There would still be <data> children of <node> and <edge> elements. Actually, the change would be breaking up the single "nodegraphics" and "edgegraphics" <data> elements into several dedicated <data> elements for geometry, labels, colors, etc., resulting in more <data> elements than before but still inside the <node> and <edge> elements. (That kind of XML structure is actually specified in the GraphML standard, so that will never change.)

commented Sep 11, 2013 by thomas.behr [yWorks] (163k points)

"First, GraphML is an XML format and as such is post-processing friendly compared to proprietary (binary) formats. It is post-processing friendly in the sense that there are lots of standard technologies and tools to process XML."

In that perspective of "XML vs binary", I agree.

"Second (...) structure means the structure of the graph"

Agreed again, but from the mathematical point of view. From the point of view of the user using yEd:
- a label is some text that he/she associated with a node or edge,
- an icon is an image that he/she associated with a node.
Why does the user add labels and icons? Well, to add information. If the user sets a blue icon on some nodes, and a red icon on other nodes, then there's a chance that this means something to him, that this carries valuable information to him.

Now, looking at the current "GraphML+yWorks" file format, the same cannot be said for data that are only used by yEd to re-open and re-display the graph at a later stage, like the various coordinates.

"Third, your requirements are highly specific. Geometry data in a diagram is not clutter"

I realize I made a major mistake with this request: I wrote "save" when I was thinking "export". The feature request is not about replacing the current in/out file format (GraphML+yWorks). The idea is to add another "export" format, which would export only the "structure + information introduced by the user" without the layout/coordinates elements. What you describe, with the <data> tag being a child of <edge> and <node>, is precisely what I'm thinking of. Add texts and icons inside <data> and that is it.

Then yEd could be used as a general tool to "conceive" a graph. Once the graph is "perfect", then a programmer could export the structure (nodes, edges, groups) and "information" (labels, icons) in a clean XML structure.
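Until such an export exists, something close to it can probably be approximated with a small post-processing script. The sketch below is only an illustration: the output vocabulary (plain graph, node and edge elements with a label attribute) is invented for this example, the function names are hypothetical, and it assumes labels are stored as y:NodeLabel / y:EdgeLabel elements in the yWorks namespace http://www.yworks.com/xml/graphml. Groups are kept by recursing into nested <graph> elements; icons are left out.

```python
import xml.etree.ElementTree as ET

# The "y" namespace URI is an assumption about yEd's output; verify it.
NS = {
    "g": "http://graphml.graphdrawing.org/xmlns",
    "y": "http://www.yworks.com/xml/graphml",
}

# Hypothetical, shortened yEd-style input used only for this sketch.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
         xmlns:y="http://www.yworks.com/xml/graphml">
  <graph id="G" edgedefault="directed">
    <node id="n0">
      <data key="d6"><y:ShapeNode>
        <y:Geometry height="30.0" width="30.0" x="0.0" y="0.0"/>
        <y:NodeLabel>Start</y:NodeLabel>
      </y:ShapeNode></data>
    </node>
    <node id="n1"/>
    <edge id="e0" source="n0" target="n1">
      <data key="d10"><y:PolyLineEdge>
        <y:EdgeLabel>next</y:EdgeLabel>
      </y:PolyLineEdge></data>
    </edge>
  </graph>
</graphml>"""


def _copy_label(src, dst, label_tag):
    """Copy the first non-empty label under src's <data> children to dst."""
    label = src.find("g:data//" + label_tag, NS)
    if label is not None and label.text and label.text.strip():
        dst.set("label", label.text.strip())


def _copy_graph(src_graph, dst_parent):
    """Copy nodes, edges and nested graphs (groups); drop everything else."""
    for node in src_graph.findall("g:node", NS):
        el = ET.SubElement(dst_parent, "node", id=node.get("id"))
        _copy_label(node, el, "y:NodeLabel")
        nested = node.find("g:graph", NS)
        if nested is not None:  # a group node: recurse into its children
            _copy_graph(nested, el)
    for edge in src_graph.findall("g:edge", NS):
        el = ET.SubElement(dst_parent, "edge",
                           source=edge.get("source"),
                           target=edge.get("target"))
        _copy_label(edge, el, "y:EdgeLabel")


def export_clean(xml_text):
    """Return a geometry-free XML string in the invented output vocabulary."""
    root = ET.fromstring(xml_text)
    out = ET.Element("graph")
    _copy_graph(root.find("g:graph", NS), out)
    return ET.tostring(out, encoding="unicode")


print(export_clean(SAMPLE))
```

This keeps the normal GraphML file as the single source of truth and treats the "clean" XML as a derived artifact, so nothing is lost if the graph needs to be reopened in yEd later.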

I guess this won't make its way into yEd but anyway, thanks for your comments.

commented Sep 11, 2013 by geceo (600 points)
edited Sep 11, 2013 by geceo
