Can We Do Better Than XML and JSON?

FtanML looks for the best of both

Today’s Balisage conference got off to a great start. After years of discussing the pros and cons of XML, HTML, JSON, SGML, and more, it was refreshing to see Michael Kay (creator of the Saxon processor for XSLT and XQuery) take a fresh look at what a markup language should be.

Many recent efforts have been reductions. JSON was an extraction from JavaScript. XML was a simplification from SGML. MicroXML pushes simplification much further. Reductions are great for cleaning up past practice and (usually) making tools more accessible, but genuinely new features come later, if at all. The JSON and XML camps mostly stare at each other warily, and though people mix them, there’s little real “best of both worlds.”

Kay, with the benefit of time in Ftan, Switzerland and a group of students, took a look at reconciling the two approaches. Compatibility with the past was unimportant, but the result had to be as good as JSON for typed data and as good as XML for documents. The (relative) human-readability of XML remained important, as did JSON’s ease of yielding data structures.

This is, of course, risky:

…the implicit decision that we would not compromise technical quality in the interests of market acceptance. The aim was to do it right, and we would not measure success by the level of adoption.

The result was three pieces: a markup language, a schema language, and a scripting/transformation language. The markup language supports a few key types—boolean, numbers (all decimal), string, list, element, null, and rich text (also known as mixed content). The syntax uses a mix of JSON and XML symbols and conventions, offering more flexibility than JSON but a much more concise system than XML. There are no namespaces, something that cheered me immensely. One of Kay’s examples provides a good sense:

FtanGram is the schema language, if you’re into such things. It is one of the simplest I’ve seen, and relatively easy for humans to read directly. The scripting language, FtanSkrit, was exciting in a different way, as it uses an extra type in FtanML: functions. I keep having conversation after conversation about data flows through functions replacing object hierarchies, so FtanSkrit seems to be in exactly the right place.

(Later in the day, Dimitre Novatchev looked at programming, especially functional programming, in XPath 3.0. Functional programming is a constant and rising theme in my technical life.)

Is it too late for something new in markup? Can a study project make a difference? I’m not sure, but I’m cheering for FtanML anyway.


  • russnelson

    Looks like it was devised by somebody who likes JSON but has an allergy to braces.

    • John Cowan

      FtanSkrit uses braces to delimit functions.

      The main lexical problem, I think, is the use of | at both the beginning and the end of rich text. Michael said he was considering guillemets, only they are hard to type on US and UK keyboards; I suggested ; we agreed that having both would be a Good Thing.

  • Eric Elliott

    How is this better than JSON? Yeah, you got rid of colons and commas, but replaced them with angle brackets. Is it really more terse? Maybe. Is it more readable? Not really.

    • Eric Elliott

      Oh. Nevermind. I get it.

  • Randal L. Schwartz

    Seems not much more efficient than YAML. Why not YAML, which is (nearly) a superset of JSON?

    • Simon St.Laurent

      I love YAML, and should have mentioned it, but there’s a key problem: no mixed content. There’s no rich text, something I use every day.

      If you work with documents, even if you work with data-rich documents, mixed content is a key feature. For the past few years, I’ve used it as the boundary for deciding what format to use, recommending XML pretty much only in cases where rich text was necessary.

      The function type is another key differentiator. I don’t know that it makes that much difference to a YAML audience, as I haven’t heard of people wanting to process YAML with YAML, but it opens a lot of doors I’d like to see more people explore.
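To make the mixed-content point in the reply above concrete, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The sample paragraph and the `flatten` helper are hypothetical illustrations, not anything from Kay's talk or from FtanML itself. It shows how text and child elements interleave inside a single XML element, which is the structure a plain map-and-list model has no direct slot for.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a paragraph whose text is interleaved with child elements.
doc = ET.fromstring("<p>Use <code>XML</code> when you need <em>rich</em> text.</p>")


def flatten(elem):
    """Return an element's mixed content as an ordered list.

    Plain text runs become strings; child elements become
    (tag, children) tuples, preserving document order.
    """
    parts = []
    if elem.text:
        parts.append(elem.text)           # text before the first child
    for child in elem:
        parts.append((child.tag, flatten(child)))
        if child.tail:
            parts.append(child.tail)      # text that follows this child
    return parts


mixed = flatten(doc)
print(mixed)
```

The ordered list mixes strings and elements at the same level. Representing that in JSON or YAML is possible by convention (for example, an array of strings and objects), but neither format gives it dedicated syntax.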

  • TatuSaloranta

    It seems to me that this is actually just an evolutionary step based on XML: with mixed content, and (unless I am wrong here) the lack of an actual Object as the main structure (using sequences instead), it is conceptually more different from JSON and YAML than from XML. Is this a correct interpretation? Or did I misread this part:

    billTo = <country="US" [


    which looked a bit odd; I am guessing it is either a sequence or denotes a difference between concepts, similar to the Element/Attribute separation in XML.

    If so, it’d be more like borrowing syntax from JSON, and logical structure from XML.

  • Samuel Falvo II

    I still miss those halcyon days of EA-IFF-85a.

  • Richard Stanford

    One major issue I have with it is this line from the overview: “An extensible mechanism for data types is needed: for example, representing dates as values.”. This is, quite frankly, crap. The largest issue I have in the Real World with JSON is that it has no built-in representation for a date literal; pretending that Date is just as custom as PurchaseOrder is unforgivable these days.
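The date complaint above is easy to reproduce. This is a small sketch using Python's standard `json` module; the `order` record and its field names are made up for illustration. It shows that a date cannot be serialized without a custom hook, and that what comes back after a round trip is only a string the application must know to reinterpret.

```python
import json
from datetime import date

# Hypothetical record; the field names are illustrative only.
order = {"id": 42, "shipped": date(2013, 8, 6)}

# JSON has no date type, so the encoder rejects the value outright.
try:
    json.dumps(order)
    failed = False
except TypeError:
    failed = True

# A common workaround (a convention, not part of JSON): emit ISO 8601 strings.
encoded = json.dumps(order, default=lambda value: value.isoformat())

# On the way back, the "date" is just a string until the application converts it.
decoded = json.loads(encoded)
restored = date.fromisoformat(decoded["shipped"])

print(failed, encoded, type(decoded["shipped"]).__name__, restored)
```

Both sides have to agree out of band that `shipped` holds a date, which is the gap the comment is objecting to.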

  • Seth Willimas