XML
Not Just a Good Idea.
It's the Law

As the programmer of the SharpHog model and Spot-Growth Database project, I have trouble conveying to the scientists who consume that information the magnitude of one simple act: gathering and defining all their biological data, both field and computational, in Xml.

I had to write a special tool for one particular user to query the SharpHog SPBModel Input-Output dataset and render a few selected values from it as a series of numbers in comma-separated-values (CSV) format. Without asking about the intended purpose, I spent a lot of time creating the third tab of the SharpHog Model Desktop Application Interface to get this done flexibly. The user, it turns out, would then run my Xslt query on the Xml data, copy the resulting CSV to the clipboard, paste that into an Excel spreadsheet, laboriously walk Excel through splitting the commas into columns, and then copy the result and feed it to a statistical analysis package.

I still have trouble explaining the inelegance of that. I want to say: here is your data. It is a file in a folder on a computer, and that file is formatted to the international standard for data markup, Xml. Furthermore, a definition of the structure of this data, called a schema, is available as well; it specifies the particular use of Xml for the data in this file. But even without that literal definition and description of the data, simply by structuring the file we have gained innumerable tools for querying and transforming that information. This Xml data file is independent of however you may choose to consume it, and it won't be altered by the mere act of examining it.
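To make the "innumerable tools" point concrete, here is a minimal sketch using Python's standard library rather than Xslt. The element and attribute names (ModelRun, Spot, InfestedTrees) are invented for illustration and are not the actual SharpHog schema. It reads a small Xml document and pulls out selected values without changing the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical model-run data; these element names are assumptions,
# not the real SharpHog SPBModel schema.
SAMPLE_XML = """
<ModelRun>
  <Spot id="1"><Day>1</Day><InfestedTrees>12</InfestedTrees></Spot>
  <Spot id="2"><Day>1</Day><InfestedTrees>30</InfestedTrees></Spot>
  <Spot id="1"><Day>2</Day><InfestedTrees>15</InfestedTrees></Spot>
</ModelRun>
"""

def infested_trees_for_spot(xml_text, spot_id):
    """Return the InfestedTrees values recorded for one spot, in document order."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a small XPath subset, including attribute predicates.
    matches = root.findall(f"./Spot[@id='{spot_id}']")
    return [int(spot.findtext("InfestedTrees")) for spot in matches]

print(infested_trees_for_spot(SAMPLE_XML, "1"))  # prints [12, 15]
```

The source text is only parsed and read, so examining it leaves it untouched.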

One existing tool for working with data organized as Xml is called Xslt. Xslt is many things, but for our purposes here it can be thought of as an extremely sophisticated query language. Scientists are used to "old" Access databases, SQL databases, or SAS projects, each of which has its own particular version of a query language to administer and access the information.

Xslt is an international standard (a W3C Recommendation) for transforming, and therefore for querying, data organized in the international standard format, Xml.
And when I say international standard, I mean this is the way things are done.

XSLT
eXtensible Stylesheet Language Transformations files
are commonly called stylesheets or transforms
(because they style and / or transform data)

So, if a user wanted to work with data from SharpSpb in an Excel spreadsheet, they would use Xslt to transform the SharpHog SPBModel Schema Xml data into Excel Workbook Schema Xml data, not create a CSV and copy it to the clipboard. Actually, I'm not an Excel whiz, but I'm pretty sure you could just show the SharpHog SPBModel or SpotGrowth Database file to Excel, which has a tool for checking off the key values you'd like to import into your spreadsheet (not too dissimilar from the Xslt tool in the SharpHog SPBModel Desktop Administration Application). As far as I know, Excel uses Xslt internally to achieve this result.

 

If a consumer of SpotGrowth schema-formatted information wanted to compare it using some old routines in their SQL (or Access) toolbox, they would use Xslt to transform the interesting items from the SpotGrowth Xml into a new SQL query that inserts the information as a table into their SQL (or Access) database for use by the stored procedures.
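The shape of that Xml-to-database step can be sketched as follows. This is Python's standard library (ElementTree plus an in-memory SQLite database), not Xslt, and the Observation element, its attributes, and the table name are hypothetical stand-ins for whatever the SpotGrowth schema actually defines.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical SpotGrowth-style data; the names are assumptions for illustration.
SAMPLE_XML = """<SpotGrowth>
  <Observation spot="A" day="1" trees="12"/>
  <Observation spot="A" day="2" trees="15"/>
  <Observation spot="B" day="1" trees="30"/>
</SpotGrowth>"""

def load_observations(xml_text, connection):
    """Insert each Observation as one row and return the number of rows added."""
    root = ET.fromstring(xml_text)
    rows = [
        (obs.get("spot"), int(obs.get("day")), int(obs.get("trees")))
        for obs in root.iter("Observation")
    ]
    connection.execute(
        "CREATE TABLE IF NOT EXISTS observation "
        "(spot TEXT, day INTEGER, trees INTEGER)"
    )
    # Parameterized inserts avoid building SQL text by hand.
    connection.executemany("INSERT INTO observation VALUES (?, ?, ?)", rows)
    connection.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
print(load_observations(SAMPLE_XML, conn))  # prints 3
query = "SELECT spot, MAX(trees) FROM observation GROUP BY spot ORDER BY spot"
print(conn.execute(query).fetchall())  # prints [('A', 15), ('B', 30)]
```

Once the rows are in a table, existing SQL routines can run against them unchanged.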

 

If a user wanted to view their model-run output as a summarizing table in a web page, they would use Xslt to extract the values of interest and mark them up as an Html table.
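As a rough illustration of extracting values and marking them up as a table, here is a sketch in Python's standard library rather than Xslt. The Observation element and its attributes are hypothetical; a real stylesheet would select whatever the model-run schema provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical model-run output; the names are assumptions for illustration.
SAMPLE_XML = """<SpotGrowth>
  <Observation spot="A" day="1" trees="12"/>
  <Observation spot="B" day="1" trees="30"/>
</SpotGrowth>"""

COLUMNS = (("Spot", "spot"), ("Day", "day"), ("Trees", "trees"))

def to_html_table(xml_text):
    """Build an Html table with one header row and one row per Observation."""
    root = ET.fromstring(xml_text)
    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for title, _ in COLUMNS:
        ET.SubElement(header, "th").text = title
    for obs in root.iter("Observation"):
        row = ET.SubElement(table, "tr")
        for _, attribute in COLUMNS:
            ET.SubElement(row, "td").text = obs.get(attribute)
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(table, encoding="unicode")

print(to_html_table(SAMPLE_XML))
```

Building the table as elements, rather than concatenating strings, keeps the output well-formed and escapes any special characters in the values.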

 

If a user wanted to provide input for a statistical analysis package (which is not my arena), I think they would want to use Xslt to get the values and present them however SAS expects them, as opposed to pasting them into a spreadsheet and copying the columns to a text file.
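A minimal sketch of going straight from Xml to a delimited text file, with no clipboard or spreadsheet in between, might look like this. It uses Python's standard library rather than Xslt, it assumes the statistics package can read comma-separated text (which may not be the format SAS prefers), and the Observation element and its attributes are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical data; the element and attribute names are assumptions.
SAMPLE_XML = """<SpotGrowth>
  <Observation spot="A" day="1" trees="12"/>
  <Observation spot="B" day="1" trees="30"/>
</SpotGrowth>"""

FIELDS = ("spot", "day", "trees")

def to_csv_text(xml_text):
    """Return the selected attributes as CSV text with a header row."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(FIELDS)
    for obs in root.iter("Observation"):
        writer.writerow([obs.get(name) for name in FIELDS])
    return buffer.getvalue()

print(to_csv_text(SAMPLE_XML))
```

The returned text can be written directly to a file for the analysis package to read.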

 

It should be noted that Xslt is certainly capable of, and perhaps even better suited to, performing analyses of the data itself. Xslt isn't just a query language; it is a Turing-complete programming language, so in principle it can express any computation.

 

To be sure, it takes time to write an Xslt query. But it takes time to prepare any parsing of information. Given how easy Xslt is to follow just by reading it, its simplicity, the myriad existing tools for writing it, its incredible flexibility for reuse, and its sheer computational power, it is easy to understand why it has become so popular worldwide. New users may have to learn Xslt, but it will be the only language of its type they'll ever need again.

spb.xanderlih.com Copyright © Xander Lih 2000-2012