New XML Grammar

Can you help me? I’m about to reinvent the wheel for this project I’m working on. The requirements are as follows. I need an app that:

  • Allows users to define their own report layouts
  • Allows users to customize existing report layouts
  • Draws data from objects in our existing application
  • Quietly sneaks $.001 off of each order transaction processed and deposits that amount into a Swiss bank account which only I have access to.

The last line item is what you would call an optional requirement or a “nice to have” feature. The product would function just fine without it but if you could help me implement it then all the better off I’d be. The truth is that I’ve been asked to create a report engine. More accurately, I’ve been asked to rewrite a very restrictive homegrown report engine, driven by templates stored in database tables, to include such features as XML serialization of our existing application objects. I’ve also been asked repeatedly to fix the broken hot water heater that sits in the basement of my house and requires frequent re-igniting of the pilot light, but it is the rewriting of this engine that has me more concerned. The application code that we use now is restrictive because for input it builds a fat object that knows all about the current order, depends on and involves every object known to order entry (and practically every object known to the default JVM), and uses rows in a DB table to drive the output. It has become apparent over the years that a DB table offers little flexibility for defining templates and thus we want to shift to a new Golden Hammer, XML/XSL.

That’s where I come in. I was involved because… well because I kinda like XSL transforms. Also because I am the author of one of the most complicated XSL stylesheets known to man: the one that powers our sales analysis reports. (I’d go into detail about why it is so complicated and ridiculous but that’s food for another post.) The powers that be (that is my boss of course, a rather nice gentleman and very sharp with data driven design) have coerced me into building what would be a report designer. (Well it would be if I had a year or two to develop, a staff of my own developers, a separate office building, funding, a jacuzzi, advertising and a team mascot. Don’t worry about what the jacuzzi is for. Just know that it would be necessary.) I have now embarked on a quest that has led me on trails through object-to-XML conversion, the Java TrAX API and now custom XML schema definition.

What I’ve come up with

I hereby present to you my progress. I have an object-to-XML converter that uses reflection to generate SAX events so that it can be dropped into the TrAX API as an XMLReader. Also, I have the buddings of what is to become the greatest programming language known to man, SSML. What the heck is SSML? It’s a cool acronym I just made up moments ago to describe my over-engineered solution to what probably should just be a simple parameterized report. SSML embodies everything I would ever want an end user to know about the inner workings of our report generator. It is the language which will revolutionize programming in the future. It may also be used to bring our struggling troops home from Iraq once it becomes formalized as a W3C recommendation. It is the end to world famine and the cure for infectious disease. Honestly, it’s just a stoopid XML grammar that I’m inventing to capture the intentions of a user that is trying to print out invoices from our system. (See what I mean about over-engineering?)
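For the curious, here is a minimal sketch of what such a reflection-driven XMLReader could look like. It is not the actual converter: the class name ObjectXmlReader, the sample Order bean and the "one element per getter" mapping are all assumptions made up for illustration. It leans on XMLFilterImpl only to avoid hand-writing the XMLReader plumbing, and uses an identity transform to show the events coming out the other end.

```java
import java.io.StringWriter;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: presents a plain Java object as a stream of SAX events.
public class ObjectXmlReader extends XMLFilterImpl {

    // Made-up sample bean, standing in for a real application object.
    public static class Order {
        public int getId() { return 42; }
        public String getCustomer() { return "Acme"; }
    }

    private final Object bean;

    public ObjectXmlReader(Object bean) { this.bean = bean; }

    // There is no real parser underneath, so accept and ignore configuration requests.
    @Override public void setFeature(String name, boolean value) { }
    @Override public void setProperty(String name, Object value) { }

    // "Parsing" means walking the public getters and firing one element per property.
    @Override
    public void parse(InputSource ignored) throws SAXException {
        String root = bean.getClass().getSimpleName();
        Method[] methods = bean.getClass().getMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName)); // stable output order

        startDocument();
        startElement("", root, root, new AttributesImpl());
        for (Method m : methods) {
            String n = m.getName();
            if (!n.startsWith("get") || n.length() <= 3
                    || m.getParameterCount() != 0 || n.equals("getClass")) {
                continue;
            }
            String tag = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            char[] text;
            try {
                text = String.valueOf(m.invoke(bean)).toCharArray();
            } catch (ReflectiveOperationException e) {
                throw new SAXException(e);
            }
            startElement("", tag, tag, new AttributesImpl());
            characters(text, 0, text.length);
            endElement("", tag, tag);
        }
        endElement("", root, root);
        endDocument();
    }

    // Runs the reader through a TrAX identity transform to get the XML as text.
    public static String toXml(Object bean) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new SAXSource(new ObjectXmlReader(bean), new InputSource()),
                           new StreamResult(out));
        return out.toString();
    }
}
```

Calling ObjectXmlReader.toXml(new ObjectXmlReader.Order()) should give an Order element with customer and id children. A real converter would also have to deal with nested objects, collections, nulls and cycles, none of which this sketch attempts.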

Enter the SSML (…and the dragon while you’re at it)

Simple Stylesheet Markup Language: My harebrained idea is to create a simple XML grammar that will function as a stripped down version of XSL-FO. I mean really stripped down to the bare bone marrow. It will provide a means of defining an output document. The document will consist of an optional first and/or last page, middle pages, and sections within those pages. A section may be positioned anywhere within a page. A page may be sequenced anywhere within the document, such as first-page, last-page, page23, etc. Within a section you can define data (which is just literal values), variable data (which is data derived from an XML-serialized object), and possibly tables. I have not yet determined how I’ll use the tables. These are very early ideas that you can play with at your own leisure. (Early as in I just thought of them as I keyed them into this here blog.) However I am hereby copyrighting them so that if you decide to use them for your own evil schemes you will be required to pay royalties to me or else suffer the wrath of 738 attorneys showing up at your doorstep. If you do develop something I can use please email me the source code so that I can claim it as my own and charge $259 per copy for my shrink-wrapped version of your hard work. (Don’t worry, I’ll cut you in on some of the profits, I promise.)
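To make the shape of that a little more concrete, here is one way a tiny SSML document might look, wrapped in a Java class that just checks it is well-formed and counts the pieces. Nothing here is settled: every element and attribute name (document, page, sequence, section, x, y, data, variable, select) is a placeholder invented for this sketch, and tables are left out since their role is still undecided.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch of an SSML document; all names below are invented placeholders.
public class SsmlSample {

    static final String SAMPLE =
          "<document>"
        // optional first page: a literal heading plus a value pulled from the serialized order
        + "<page sequence='first-page'>"
        +   "<section x='20' y='15'><data>INVOICE</data></section>"
        +   "<section x='20' y='40'><variable select='order/customer'/></section>"
        + "</page>"
        // middle pages
        + "<page sequence='middle'>"
        +   "<section x='20' y='15'><variable select='order/id'/></section>"
        + "</page>"
        // optional last page
        + "<page sequence='last-page'>"
        +   "<section x='20' y='250'><data>Thank you for your business.</data></section>"
        + "</page>"
        + "</document>";

    // Parses the sample with the stock JAXP DOM parser (fails if it is not well-formed).
    public static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
    }

    // Counts how many elements with the given tag name the sample contains.
    public static int count(String tag) throws Exception {
        return parse().getElementsByTagName(tag).getLength();
    }
}
```

The point of keeping it this small is that a stylesheet could later expand each page and section into the much wordier XSL-FO equivalent, so the end user never has to see FO at all.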

If you have any ideas on how any of this can be done better I’m open to suggestions. I’m still not telling you what the jacuzzi is for, but please hit me off on the comments and how-tos and I’ll holla…

(The phrase “hit me off” is a ghetto-slick way of requesting your thoughts and input on the subject matter. Please do not assume aggression as the author does not promote violence. Comments, criticism, and code examples are all welcome contributions but will become property of the site master and ultimately used for world domination.)


  1. That’s a hard cookie you’re eating!!

    So if I get this correctly, the data is going through many transformations: from DB to a POJO, then to an XML, then transformed to another XML with a stylesheet that is dynamically generated from the user input, then sent to the printing module.

    Do you know FreeMarker? You could have one single template that uses the user input for formatting and the POJO data for content. The output from the processed template can be any text you like: HTML, XML, etc.

    It’s not a piece of cake, but looks leaner to me. Especially if you get the formatting options as Java objects.

    Good luck!

  2. I’ve chewed hard cookies before. There was this time I bought a pack of tollhouse and left it open while I went away on vacation. Then I came back and, well that’s not the point. By the way, you left out a step. The data makes a stop at the rest room after being transformed to XML but before being converted to another XML by the stylesheet that is created from my custom XML which I could probably generate from… Neva’mind I’ll leave that step out. I can add it after the beta release.

    Actually, it’s not as bad as it sounds. Most of the transformations are never materialized as it’s all SAX powered. What I mean is the object is presented to the TrAX API as XML by a SAX event generator. Those SAX events are piped into a transform process that involves a pregenerated stylesheet. So it would be like a normal transform involving an XML input and a stylesheet. The pregenerated stylesheet will produce FO which also is never materialized. Instead the SAX event chain is piped through to the FOP engine, which produces the final output. No XML is ever written out in between transformations so the performance impact should be minimal. (Or I hope!)

    I did look at FreeMarker a while ago when I first started on my XML madness here. I turned away from it because it was stream based. In other words each transformation is materialized as a stream which needs to be fed into something else. With FreeMarker wouldn’t I need to create separate templates for the different outputs? My initial attempts at FreeMarker involved generating XSL-FO for the FOP engine. I wasn’t happy because I needed a separate thread to start generating the stream while the main thread pushed the generated stream into the FOP engine. The alternative was writing the generated stream to disk and reading it back into the FOP engine. That’s when I got deep with the TrAX API and SAX event chaining. It’s really cool how you can chain filters together which could be XSLT stylesheets, POJOs or anything.

    I do think I’m overboard here. I just can’t see past my own big nose on this one.

  3. I thank you for your comment.

  4. Rosie,

    You’re welcome, I guess. I’m not sure if you’re thanking me for my response to Tiago’s cookie drama or if you were thanking me in advance for the comment I’m leaving right now. (That implies that somehow you knew I was going to comment which, when considering your comment, would seem highly probable since nobody leaves a thank you for a comment that doesn’t exist.) Anyway, I thank you for your thanking me for my comment that I’m unsure of. And before you thank me for that I’ll spit out an immediate, “you’re welcome”. So now you don’t have to post the extra thanks. Thanky… err… you’re welcome, or whatever!
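For anyone following the SAX chaining discussion in the comments above, here is a rough sketch of the idea using only the JDK’s TrAX classes. It is a stand-in, not the real pipeline: the stylesheet, the element names (order, customer, report, line) and the class name SaxChainSketch are made up, the input comes from a stock parser rather than an object-backed XMLReader, and a plain serializer sits where the FOP engine’s content handler would go.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical sketch: SAX events flow reader -> XSLT stage -> serializer,
// with no intermediate XML text written out between the stages.
public class SaxChainSketch {

    // Made-up stylesheet standing in for the pregenerated one.
    static final String TO_REPORT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/order'>"
        +   "<report><line><xsl:value-of select='customer'/></line></report>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    public static String run(String orderXml) throws Exception {
        // Assumes the default factory supports SAX chaining, as the JDK's built-in one does.
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();

        // Stage 1: the XSLT transform, exposed as a ContentHandler.
        TransformerHandler stage1 =
                stf.newTransformerHandler(new StreamSource(new StringReader(TO_REPORT)));

        // Stage 2: identity handler that serializes whatever events it receives.
        // In the setup described above, the FOP content handler would take this place.
        TransformerHandler stage2 = stf.newTransformerHandler();
        stage2.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        stage1.setResult(new SAXResult(stage2)); // stage 1 output feeds stage 2 as events
        stage2.setResult(new StreamResult(out));

        // Event source: a stock parser here; an object-backed XMLReader would slot in the same way.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(stage1);
        reader.parse(new InputSource(new StringReader(orderXml)));
        return out.toString();
    }
}
```

Feeding it an order document with a customer child element should come back as a report element wrapping a single line. More stages can be added the same way, each one a TransformerHandler or a hand-written ContentHandler whose result is the next link in the chain.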


