Overview of my Server-Side Processing

August 2, 2005

The base of this blog is a hierarchy of XML files. For the most part they are all named index.xml, and all of them live in the “http://ersoft.org/blog” namespace. In addition, all the files are saved as UTF-8. A file can consist of several parts; here is an example that shows all the possibilities:

<index title='Pretty Title' time='22:20:00' author='Eric' name='realname'
       xmlns="http://ersoft.org/blog" type='root|group|year|month|day|post'>
	<header>
	   <!-- Nodes will be placed in the header of an XHTML document. -->
	</header>
	<sidebar>
	   <!-- Nodes will be placed in the sidebar of an XHTML document. -->
	</sidebar>

	<listing>
	   <!-- A listing is used to refer to another child xml document -
		such as those used to form the date, time, etc. of a post.
	     -->
		<index>name</index>
		   <!-- The name here is the folder name that the index.xml
			exists in.
		     -->
	</listing>

	<include>/sudo/path/to/file.xml</include>
	   <!-- This is what is used for the recent posts. It includes the xml
		from the file into the current document. No automation to
		determine the most recent posts currently exists.
	     -->

   <!-- Any other nodes will be assumed to be XHTML and be copied to the output
	when needed.
     -->
</index>

This format is very raw and not very imaginative, but it seems to work well. I plan to eventually remove the need for the ‘name’ attribute, but other than that, it should stay pretty consistent.
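
To make the listing part concrete, here is a minimal XSLT 1.0 sketch of how a listing entry could be followed to its child document. It is not the actual blog.xsl: the `b` prefix, the ul/li/a output markup, and the use of the child's title attribute as link text are all assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="http://ersoft.org/blog"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="b">

  <!-- Render a listing as an unordered list (output markup is an assumption). -->
  <xsl:template match="b:listing">
    <ul>
      <xsl:apply-templates select="b:index"/>
    </ul>
  </xsl:template>

  <!-- Each entry holds a folder name; load that folder's index.xml,
       resolved relative to the document that contains the entry. -->
  <xsl:template match="b:listing/b:index">
    <xsl:variable name="folder" select="normalize-space(.)"/>
    <xsl:variable name="child"
        select="document(concat($folder, '/index.xml'), .)/b:index"/>
    <li>
      <a href="{$folder}/">
        <xsl:value-of select="$child/@title"/>
      </a>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Passing the current node as the second argument to document() makes the relative path resolve against the source file rather than against the stylesheet, which is what a folder-based hierarchy like this one needs.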

So that brings me to the XSLT. It is split up into two main files: xhtml.xsl and blog.xsl. xhtml.xsl is the file that has the formatting for my site while blog.xsl has the templates, etc., for any blog. So, in general, when I make infrastructure improvements I only need to modify blog.xsl and when I make presentation improvements I only have to modify xhtml.xsl.

Main Problem Encountered

The main problem I ran into is that XSLT 1.0 does not allow a variable to be treated as a node-set if its contents were created on the fly. Such a variable holds a result tree fragment, which can only be copied or used as a string. What this meant for me is that each time I needed the root of my blog, I had to recursively find the root and then perform the needed operation. This made me cringe, and although I could live with it, I looked for a way to avoid it.

The way I now avoid the problem is by saving the path to the root instead of the root node itself. This works because setting a variable to a string is easy. It is still not perfect, but it produces less duplicate “code.”
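
A rough sketch of the path-as-string idea in XSLT 1.0 follows. The template name root-path, the variable of the same name, and the ten-level cutoff are hypothetical, and the sketch assumes that the current file and every folder between it and the root contain an index.xml. What document() does when a file is missing depends on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="http://ersoft.org/blog">

  <!-- Walk up one folder at a time until an index.xml with type='root'
       is found, and return the relative path to it as a string. -->
  <xsl:template name="root-path">
    <xsl:param name="path" select="''"/>
    <xsl:variable name="doc"
        select="document(concat($path, 'index.xml'), /)"/>
    <xsl:choose>
      <xsl:when test="$doc/b:index/@type = 'root'">
        <xsl:value-of select="$path"/>
      </xsl:when>
      <!-- Give up after 10 levels ('../' is 3 characters) so that a
           missing root cannot recurse forever; returns an empty path. -->
      <xsl:when test="string-length($path) &gt;= 30"/>
      <xsl:otherwise>
        <xsl:call-template name="root-path">
          <xsl:with-param name="path" select="concat($path, '../')"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- The template result is a result tree fragment, but XSLT 1.0
       still allows it to be used as a string. -->
  <xsl:variable name="root-path">
    <xsl:call-template name="root-path"/>
  </xsl:variable>

  <!-- Because the path is a plain string, document() can turn it back
       into a real node-set wherever the root is needed. -->
  <xsl:template match="/">
    <xsl:value-of
        select="document(concat($root-path, 'index.xml'), /)/b:index/@title"/>
  </xsl:template>

</xsl:stylesheet>
```

The recursive search runs once, when the variable is first evaluated; every later use only pays for a document() call on a known path.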

The best part about this problem, however, is that it is temporary: XSLT 2.0 allows a variable whose contents were built on the fly to be used as a node-set directly.

Nice Features

There are just a few nice features that I have implemented. The nicest, in my opinion, is how acronyms and abbreviations work. There are two files, acronyms.xml and abbrs.xml, that contain a list of the acronyms and abbreviations I use on my blog. In my XML I only place the correct acronym or abbr tag around the text that needs it; I do not declare the title or any other attribute. When the XSLT comes across either tag, it fills in the attributes from the two XML files.

Such a feature is commonly done in PHP; however, that approach has to compare every string against every possible acronym or abbreviation. In this case, the XSLT only has to compare every denoted acronym or abbreviation against the known values. Of course, this benefit comes at the writer's expense, since each acronym has to be marked up by hand, and doing it differently would not be easy without XSLT 2.0.
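
The lookup described above might look roughly like this in XSLT 1.0. The layout of acronyms.xml is not shown in this post, so the one in the comment is hypothetical, as is the assumption that acronym tags in a post inherit the blog namespace. The abbr case would be analogous, using abbrs.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="http://ersoft.org/blog"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="b">

  <!-- Hypothetical layout of acronyms.xml, with no namespace:
         <acronyms>
           <acronym title="Extensible Markup Language">XML</acronym>
         </acronyms>
       A one-argument document() resolves relative to this stylesheet. -->
  <xsl:variable name="acronyms"
      select="document('acronyms.xml')/acronyms/acronym"/>

  <xsl:template match="b:acronym">
    <xsl:variable name="text" select="normalize-space(.)"/>
    <!-- Only the marked-up text is compared against the known list. -->
    <xsl:variable name="entry"
        select="$acronyms[normalize-space(.) = $text][1]"/>
    <acronym>
      <!-- Add the title only when the list has a matching entry. -->
      <xsl:if test="$entry/@title">
        <xsl:attribute name="title">
          <xsl:value-of select="$entry/@title"/>
        </xsl:attribute>
      </xsl:if>
      <xsl:apply-templates/>
    </acronym>
  </xsl:template>

</xsl:stylesheet>
```

An acronym that is missing from the list is still emitted, just without a title, so a forgotten entry degrades quietly instead of breaking the page.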

The only other nice feature is really that any document can modify the header and sidebar when needed, without having to use PHP in each document. This came in handy on the Recent Posts page.
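
As an illustration of how the header and sidebar parts could be routed into a page, here is a minimal XSLT 1.0 sketch. It is not the real xhtml.xsl: the div ids, the overall page skeleton, and the assumption that content elements sit in the blog namespace (because of the default xmlns on index) are mine.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="http://ersoft.org/blog"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="b">

  <xsl:template match="/b:index">
    <html>
      <head>
        <title><xsl:value-of select="@title"/></title>
        <!-- Per-document additions to the XHTML head. -->
        <xsl:apply-templates select="b:header/*"/>
      </head>
      <body>
        <div id="sidebar">
          <xsl:apply-templates select="b:sidebar/*"/>
        </div>
        <div id="content">
          <!-- Everything except the structural elements is body content. -->
          <xsl:apply-templates
              select="*[not(self::b:header or self::b:sidebar
                            or self::b:listing or self::b:include)]"/>
        </div>
      </body>
    </html>
  </xsl:template>

  <!-- Re-create the remaining elements in the XHTML namespace,
       keeping their attributes and children. -->
  <xsl:template match="b:*">
    <xsl:element name="{local-name()}"
        namespace="http://www.w3.org/1999/xhtml">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

A page that needs an extra stylesheet link or sidebar box simply adds it to its own header or sidebar element, and pages that leave those elements out get the plain skeleton.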

Eric Anderson
