A common and valid observation about XML is that it takes significantly more bytes to convey the same information.

Consider this data file from over at JavaHog.

60199 93 74 74 13 13 24 24
0.22 25.3 13.8 21.1 86.5 32.7

Now, that will certainly take up less room on the floppy disc, but you still have to find a place (disc label maybe) to list what the numbers mean.


startDate daysDuration dead infested underAttack withEggs withImm withBrood
loblollyRatio pineBA hardwoodBA standAvgDBH longitude latitude
The first line is integers; the second line is double-precision reals.

Add the fact that important information is contained in the filenames of each of the dozens of two-line files, and suddenly something like this starts to look pretty good:

<DayZero>
  <Spot>AL0017</Spot>
  <Date>1999-06-01</Date>
  <RunDuration>93</RunDuration>
  <Dead>74</Dead>
  <Infested>74</Infested>
  <Attack>13</Attack>
  <Egg>13</Egg>
  <Immature>24</Immature>
  <Brood>24</Brood>
  <Loblolly>0.22</Loblolly>
  <PineBasalArea>25.3</PineBasalArea>
  <HardwoodBasalArea>13.8</HardwoodBasalArea>
  <MeanDiameter>21.1</MeanDiameter>
  <Longitude>86.5</Longitude>
  <Latitude>32.7</Latitude>
</DayZero>
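
Getting from the two-line file to that markup is mostly a matter of pairing each number with its label. Here is a minimal sketch in Python, not the converter actually used. The element names come from the sample above. The function name `day_zero_xml`, the idea that the spot name is taken from the file name (AL0017.dat), and the reading of 60199 as month-day-year are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Element names follow the <DayZero> sample, in the order given by the legend.
INT_FIELDS = ["Date", "RunDuration", "Dead", "Infested",
              "Attack", "Egg", "Immature", "Brood"]
REAL_FIELDS = ["Loblolly", "PineBasalArea", "HardwoodBasalArea",
               "MeanDiameter", "Longitude", "Latitude"]


def day_zero_xml(spot, text):
    """Build a <DayZero> element from the two-line data file contents."""
    int_line, real_line = text.strip().splitlines()
    ints = int_line.split()
    reals = real_line.split()

    # Assumption: the start date is stored as MDDYY or MMDDYY (60199 is 6/01/99).
    ints[0] = datetime.strptime(ints[0].zfill(6), "%m%d%y").date().isoformat()

    root = ET.Element("DayZero")
    # Assumption: the spot name is carried by the file name, e.g. AL0017.dat.
    ET.SubElement(root, "Spot").text = spot
    for name, value in zip(INT_FIELDS + REAL_FIELDS, ints + reals):
        ET.SubElement(root, name).text = value
    return root


DATA = "60199 93 74 74 13 13 24 24\n0.22 25.3 13.8 21.1 86.5 32.7\n"
element = day_zero_xml("AL0017", DATA)
print(ET.tostring(element, encoding="unicode"))
```

The two field lists are exactly the "disc label" from earlier, except that now they travel with the data instead of sitting beside it.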

That is one thing. But then again, this is how the same data looks in the SPBModel schema:

<?xml version="1.0" standalone="yes"?>
<SPBModel>
  <Batch>
    <BatchId>B32ccf8c5</BatchId>
    <BatchName>JavaHogVerify</BatchName>
    <BatchDescription>Batch Created: 7/16/2009 6:11:17 PM</BatchDescription>
    <BatchXslt>&lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" /&gt;</BatchXslt>
    <Frame>
      <BatchId>B32ccf8c5</BatchId>
      <FrameId>687dafb5</FrameId>
      <IsModelInputFrame>true</IsModelInputFrame>
      <FrameName>AL0017</FrameName>
      <FrameDescription>Frame Created: 7/16/2009 6:11:21 PMBased on: AL0017.dat</FrameDescription>
      <FrameConfig>&lt;FrameConfig&gt;
  &lt;RunDuration&gt;93&lt;/RunDuration&gt;
  &lt;ReportHour&gt;2&lt;/ReportHour&gt;
  &lt;TemperatureOffset&gt;0.0&lt;/TemperatureOffset&gt;
  &lt;CohortCount&gt;
    &lt;Attacking&gt;50&lt;/Attacking&gt;
    &lt;Parent&gt;50&lt;/Parent&gt;
    &lt;Egg&gt;200&lt;/Egg&gt;
    &lt;Immature&gt;500&lt;/Immature&gt;
    &lt;Brood&gt;250&lt;/Brood&gt;
    &lt;Emerging&gt;1&lt;/Emerging&gt;
  &lt;/CohortCount&gt;
&lt;/FrameConfig&gt;</FrameConfig>
      <StartDate>1999-06-01T00:00:00-07:00</StartDate>
      <Latitude>32.7</Latitude>
      <Longitude>-89.5</Longitude>
      <LoblollyRatio>0.22</LoblollyRatio>
      <ShortleafRatio>0</ShortleafRatio>
      <PineBasalArea>25.3</PineBasalArea>
      <HardwoodBasalArea>13.8</HardwoodBasalArea>
      <DiameterBreastHeight>53.59</DiameterBreastHeight>
      <MeanPineAge>25</MeanPineAge>
      <FiveYearGrowth>5</FiveYearGrowth>
      <PreviouslyInfested>0</PreviouslyInfested>
      <CurrentlyInfested>74</CurrentlyInfested>
      <TreesAttacking>13</TreesAttacking>
      <TreesEgg>13</TreesEgg>
      <TreesImmature>24</TreesImmature>
      <TreesBrood>24</TreesBrood>
    </Frame>
  </Batch>
</SPBModel>

Isn't that a bit much?

Well, maybe. We went for long, descriptive element names. The increased size is offset by the increased information, the improved human readability, and, now that it is XML, the ability to read it into the Spb Administrative applications or into a statistical package, spreadsheet, or database.
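
To give a feel for that last point, here is a rough sketch of pulling frames out of an SPBModel document and flattening them into spreadsheet-ready rows, using only the Python standard library. It is not the Spb application code. The function names `frame_rows` and `to_csv` are made up for this example, and `SAMPLE` is a trimmed-down copy of the document above. Note that FrameConfig holds escaped XML as text, so the sketch parses it a second time and keeps only its simple (childless) settings.

```python
import csv
import io
import xml.etree.ElementTree as ET

# A trimmed-down copy of the SPBModel sample, kept short for illustration.
SAMPLE = """<SPBModel>
  <Batch>
    <BatchId>B32ccf8c5</BatchId>
    <Frame>
      <FrameName>AL0017</FrameName>
      <FrameConfig>&lt;FrameConfig&gt;&lt;RunDuration&gt;93&lt;/RunDuration&gt;&lt;/FrameConfig&gt;</FrameConfig>
      <Latitude>32.7</Latitude>
      <PineBasalArea>25.3</PineBasalArea>
      <CurrentlyInfested>74</CurrentlyInfested>
    </Frame>
  </Batch>
</SPBModel>"""


def frame_rows(xml_text):
    """Return one flat dict per <Frame>, keyed by element name."""
    rows = []
    for frame in ET.fromstring(xml_text).iter("Frame"):
        row = {}
        for child in frame:
            if child.tag == "FrameConfig":
                # The config is escaped XML stored as text, so parse it again
                # and keep only the leaf settings.
                for item in ET.fromstring(child.text):
                    if len(item) == 0:
                        row[item.tag] = item.text
            else:
                row[child.tag] = child.text
        rows.append(row)
    return rows


def to_csv(rows):
    """Write the rows as CSV text, using the first row's keys as the header."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = frame_rows(SAMPLE)
print(to_csv(rows))
```

Nothing in the sketch needs to know which column is which; the element names supply the header row.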

Really, the argument against verbosity (and in support of two lines of numbers) has been rendered moot by Moore's Law. Computers aren't running on 256 kilobytes of RAM, portable media isn't confined to a few hundred kilobytes, and no one even knows what baud rate is anymore.

There's no reason not to add a bit of markup to data these days.

Today, as the saying goes,
Memory is Cheap, Storage is Cheap, Bandwidth is Cheap, & XML Compresses Well.
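
The "compresses well" part is easy to check for yourself. The sketch below generates a few dozen synthetic records (random numbers, not real spot data), writes them both as bare two-line files and as DayZero-style markup, and compares sizes before and after zlib compression. The record count, the seed, and the helper names are all assumptions for illustration, and the exact byte counts will vary with the data.

```python
import random
import zlib

# Element names from the <DayZero> sample: 8 integer fields, then 6 real fields.
NAMES = ["Date", "RunDuration", "Dead", "Infested", "Attack", "Egg",
         "Immature", "Brood", "Loblolly", "PineBasalArea",
         "HardwoodBasalArea", "MeanDiameter", "Longitude", "Latitude"]


def make_records(count, seed=0):
    """Generate synthetic records: 8 integers and 6 one-decimal reals each."""
    rng = random.Random(seed)
    records = []
    for _ in range(count):
        ints = [str(rng.randint(0, 99)) for _ in range(8)]
        reals = [f"{rng.uniform(0, 90):.1f}" for _ in range(6)]
        records.append(ints + reals)
    return records


def as_plain(records):
    """Two lines of bare numbers per record, like the original data files."""
    return "".join(" ".join(r[:8]) + "\n" + " ".join(r[8:]) + "\n"
                   for r in records)


def as_xml(records):
    """One <DayZero> block per record, one element per value."""
    parts = []
    for r in records:
        body = "".join(f"  <{n}>{v}</{n}>\n" for n, v in zip(NAMES, r))
        parts.append(f"<DayZero>\n{body}</DayZero>\n")
    return "".join(parts)


def sizes(text):
    """Return (raw byte count, zlib-compressed byte count)."""
    data = text.encode("utf-8")
    return len(data), len(zlib.compress(data, 9))


records = make_records(50)
for label, text in (("plain", as_plain(records)), ("xml", as_xml(records))):
    raw, packed = sizes(text)
    print(f"{label}: {raw} bytes raw, {packed} bytes compressed "
          f"({packed / raw:.0%} of raw)")
```

The markup is far larger uncompressed, but the tags are the same in every record, so they are exactly what a compressor squeezes hardest; the numbers themselves are what remains.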


spb.xanderlih.com Copyright © Xander Lih 2000-2012