The Anchor Website: .py in the Sky: PART 3

Written by: James Gregory on 12 February 2004

Putting snakes in angle-brackets

So now you understand why we're using Python, and how we avoid talking to the database. But this is a website, right? Where's the HTML? Well, I can't stand HTML. I've got chronic tendinitis and I attribute that to all the angle brackets I've written in my 23 years. Once again, we've implemented a library to do all the work relating to HTML. Before I explain what's going on, I'll leap straight in with an example of what happens:

>>> import html
>>> table_elements = [ [ 'a', 'b' ],
...                    [ 'c', 'd' ] ]
>>> table = html.Table(table_elements)
>>> print table.htmlRender()

<table  summary="Autogenerated table">
        <tr>
                <td>a</td>
                <td>b</td>
        </tr>
        <tr>
                <td>c</td>
                <td>d</td>
        </tr>
</table>

Before we move on, I'd like to mention that you just saw a demonstration of another of Python's charming features -- the interactive interpreter. To do that, I just typed 'python' at my command prompt and went and bashed that code straight in, and it ran exactly as it would had I been feeding it input from a file. Very convenient for testing that crazy hypothesis before you waste a day coding it up.

Now, what did the example show? Well, I fed this library a set of cells arranged in a two-dimensional array, and the library went off and gave me back some nice, well-formatted HTML to draw that table in your web browser. The more observant of you will notice that there's a little more at play, but we'll come back to that.
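To make the idea concrete, here is a minimal sketch of how a class like this could be written. It is not the library's actual code: the class body, the `rows` attribute and the escaping of cell text are all assumptions made for illustration, and it is written for Python 3 (the `html` imported here is Python 3's standard-library module, not the article's own `html` library).

```python
from html import escape  # standard-library helper for escaping <, > and &


class Table:
    """Hypothetical stand-in for the article's html.Table (not the real code)."""

    def __init__(self, rows, summary="Autogenerated table"):
        # Copy into a list of lists so individual cells can be changed later.
        self.rows = [list(row) for row in rows]
        self.summary = summary

    def htmlRender(self):
        lines = ['<table summary="%s">' % escape(self.summary)]
        for row in self.rows:
            lines.append("\t<tr>")
            for cell in row:
                # Assumption: cell text is escaped; the real library may not do this.
                lines.append("\t\t<td>%s</td>" % escape(str(cell)))
            lines.append("\t</tr>")
        lines.append("</table>")
        return "\n".join(lines) + "\n"


table = Table([["a", "b"], ["c", "d"]])
table.rows[1][0] = "C & co"  # editing a cell is an ordinary list assignment
rendered = table.htmlRender()
```

Because the cells live in a plain list of lists until the last moment, changing one is a single assignment rather than surgery on a string.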

The 2D array representation of the table is extremely convenient. For starters it's easy to go back and change cells (try doing that if you're just working with strings). It's also a good analogue to the data we're actually presenting, which has its share of benefits too. For example, our database engine is capable of spitting out data in a similar format, so we could glue this module to the database module and spit out beautiful HTML representations of our database. And it's really easy, look:

>>> import db
>>> c = db.handle.cursor()
>>> c.execute('select * from country_codes')
>>> table = html.Table(c.fetchall())
>>> print table.htmlRender()

<table  summary="Autogenerated table">
        <tr>
                <td>AD</td>
                <td>Andorra</td>
                <td>1</td>
        </tr>
        <tr>
                <td>AE</td>
                <td>United Arab Emirates</td>
                <td>1</td>
        </tr>
        <tr>
                <td>AF</td>
                <td>Afghanistan</td>
                <td>1</td>
        </tr>

        ...

</table>

But why stop at tables? We didn't. There are classes for numbered lists, unordered lists, div blocks, images; the list goes on. It's not quite complete enough to build a web page out of (at least not comfortably), but it's worked out very well so far. Perhaps most important is the Page class. The way we managed to get all our pages looking the same was to dump a big chunk of HTML that our designer gave us into a class that just replaced the center "block" with whatever content we put in. This "Page class" then asks the content for an appropriate HTML representation of itself, which might in turn ask the elements it contains for the same. Ultimately all this content is glued together in the Page class and sent out to your web browser. I never have to write HTML, and I don't have to worry about other people writing broken HTML since there's code to take care of that for them.
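A rough sketch of that arrangement follows. Everything in it is hypothetical: the template is a placeholder rather than the designer's HTML, and the `Paragraph` class, the `TEMPLATE` attribute and the constructor arguments are invented for the example (Python 3).

```python
class Paragraph:
    """Hypothetical content element; knows only how to render itself."""

    def __init__(self, text):
        self.text = text

    def htmlRender(self):
        return "<p>%s</p>\n" % self.text


class Page:
    """Hypothetical Page: fixed designer HTML around a replaceable centre block."""

    # Placeholder template; the real one would be the designer's HTML.
    TEMPLATE = (
        "<html>\n<head><title>%(title)s</title></head>\n"
        "<body>\n<div id=\"content\">\n%(content)s</div>\n</body>\n</html>\n"
    )

    def __init__(self, title, content):
        self.title = title
        self.content = content  # a list of renderable elements

    def htmlRender(self):
        # Ask each piece of content for its own HTML, then glue it together.
        body = "".join(item.htmlRender() for item in self.content)
        return self.TEMPLATE % {"title": self.title, "content": body}


page = Page("Example", [Paragraph("Hello"), Paragraph("World")])
page_html = page.htmlRender()
```

The look of the whole site then lives in one place, and changing the template changes every page at once.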

How does it work?

I don't think I can express how disappointing it is that every time I have a good idea, I open up Gang of Four and discover that I wasn't the first to come up with it. In this case, the idea that they MERCILESSLY STOLE FROM ME has been ascribed the name "Composite". The key idea is that you create classes to represent all the "things" in your document, and all the "things" that store "things". The classes only need to know how to draw themselves.

To get a better handle on this, think about how a computer might store a vector drawing. You'd have a bunch of lines and circles and boxes and stuff, and there'd be some kind of container to store them all. So, when you're coding this up, you'll probably have a "Drawing" class (that top-level container I mentioned) to store all the lines and stuff, and it'll probably have a rasterize method, which will return a bitmap version of the drawing. Now, the Drawing class won't know anything about drawing lines -- that's not its job. It's just a container. It also won't know about drawing boxes or circles or anything else. Putting all that code into the Drawing class would just make it a massive mess. So, we'd set about putting the knowledge about drawing lines into the Line class; we'd put the knowledge about drawing boxes into the Box class (which may use several Line objects as helpers, but that's an implementation detail) and so on. Let's say that all these pieces of knowledge were encoded into the rasterize methods of each of these classes. That makes it easy -- the Drawing object just needs to visit each of the objects it contains and call the rasterize method on them, et voilà! A beautiful bitmap image of a flower comes out.
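Here is a toy version of that drawing example, assuming a "bitmap" that is just a grid of characters which every shape writes into. The shared grid, the constructor arguments and the restriction to axis-aligned lines are all simplifications chosen for the sketch (Python 3).

```python
class Line:
    """Axis-aligned line segment; assumes x0 <= x1 and y0 <= y1."""

    def __init__(self, x0, y0, x1, y1):
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1

    def rasterize(self, bitmap):
        for y in range(self.y0, self.y1 + 1):
            for x in range(self.x0, self.x1 + 1):
                bitmap[y][x] = "#"


class Box:
    """A box drawn with four Line helpers."""

    def __init__(self, left, top, right, bottom):
        self.edges = [
            Line(left, top, right, top),
            Line(left, bottom, right, bottom),
            Line(left, top, left, bottom),
            Line(right, top, right, bottom),
        ]

    def rasterize(self, bitmap):
        for edge in self.edges:
            edge.rasterize(bitmap)


class Drawing:
    """Top-level container: knows nothing about how shapes are drawn."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.shapes = []

    def add(self, shape):
        self.shapes.append(shape)

    def rasterize(self):
        bitmap = [["." for _ in range(self.width)] for _ in range(self.height)]
        for shape in self.shapes:
            shape.rasterize(bitmap)  # each shape draws itself
        return "\n".join("".join(row) for row in bitmap)


drawing = Drawing(5, 4)
drawing.add(Box(0, 0, 3, 2))
drawing.add(Line(4, 3, 4, 3))  # a single point
picture = drawing.rasterize()
```

Adding a new kind of shape means writing one new class with a rasterize method; Drawing itself never changes.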

That's skipping some of the finer details, like where these Line objects would draw to and so on. That's not so interesting here because those details are very specific to the problem of Drawings. HTML fits this mould really nicely by virtue of it being a plain-text format. That is, everything in HTML can be represented by a string. That makes writing these classes really easy. To make the point, here's the complete implementation of the code we use to render unordered lists:

class UnorderedList (HtmlElement) :
    """Class representing an html <ul> ... </ul> list of elements
    (bulleted list)."""
    def __init__ (self, content) :
        HtmlElement.__init__(self)
        self.content = content

    def htmlRender (self) :
        output = '<ul>\n'
        for i in self.content :
            output += '\t<li>' + util.toHtml(i) + '</li>\n'
        output += '</ul>\n'

        return output

Fourteen lines of code, including a comment explaining what it does! The util.toHtml function is very straightforward: if the object passed in has a htmlRender method, then run it and return the result, otherwise cast the object to a string and return it. It's there merely for convenience. The great thing is that all of the classes are this simple, but with just a little bit more magic, we can build forms and have them display any way we want without doing any work at all. Check this out:

[Form 1, default layout: fields down the page]
Name*
Age*
* Mandatory field

[Form 2, horizontal layout: fields across the page]
Name Age

The code for both those forms differed by only one line. I added this line:

f.packer = form.HorizontalPacker()

to the second variant to make it display across the page like that. The way it works is actually very simple. Unlike the UnorderedList example above, the code to actually render the HTML for forms is split out into a second class called a Packer. A Packer contains nothing more than what the form's htmlRender method would otherwise have contained. The only reason that code has been put into its own class is to allow this easy substitution.
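The sketch below shows one way the form and packer split could look. Apart from the `packer` attribute and the `HorizontalPacker` name, which appear above, every name here is hypothetical: `Form`, `VerticalPacker`, `add` and the table-based layout are guesses, and `to_html` is a stand-in for util.toHtml written from its description (Python 3).

```python
def to_html(obj):
    """Render obj via htmlRender() if it has one, else fall back to str()."""
    if hasattr(obj, "htmlRender"):
        return obj.htmlRender()
    return str(obj)


class VerticalPacker:
    """Hypothetical default packer: one field per table row."""

    def htmlRender(self, fields):
        rows = "".join(
            "\t<tr><td>%s</td><td>%s</td></tr>\n" % (label, to_html(widget))
            for label, widget in fields
        )
        return "<table>\n%s</table>\n" % rows


class HorizontalPacker:
    """Hypothetical packer: all fields side by side in a single row."""

    def htmlRender(self, fields):
        cells = "".join(
            "<td>%s</td><td>%s</td>" % (label, to_html(widget))
            for label, widget in fields
        )
        return "<table>\n\t<tr>%s</tr>\n</table>\n" % cells


class Form:
    """Holds the fields; delegates all layout decisions to its packer."""

    def __init__(self):
        self.fields = []
        self.packer = VerticalPacker()

    def add(self, label, widget):
        self.fields.append((label, widget))

    def htmlRender(self):
        return "<form>\n%s</form>\n" % self.packer.htmlRender(self.fields)


f = Form()
f.add("Name", '<input name="name">')
f.add("Age", '<input name="age">')
vertical = f.htmlRender()
f.packer = HorizontalPacker()  # the one-line change
horizontal = f.htmlRender()
```

The form never looks at how it is laid out, so a new layout is a new packer class and nothing else. This is essentially what the Gang of Four book calls the Strategy pattern.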

Of course, this is only the beginning. Both the HTML rendering code and the form code do a lot more work than this simple introduction. Indeed, they're rendering all these pages.

Why would you use this?

I've used similar systems to build whole websites before. It's worked pretty well -- there was one system I worked on (and I've just noticed that there are now references to it in Japanese and Spanish in Google) that also had the nifty ability to render different HTML depending on the browser, so as to solve all the cross-browser issues. What I've found though is that while the approach is extremely programmer-friendly, it's not very considerate to designers. They just can't work with this stuff. Ultimately these are components that you should use in building your own systems, but imposing them on others is just unproductive. In writing this code I've made sure there's always a way for raw HTML to be added to the output stream. It's actually proven to be extremely valuable even for me. This is nifty stuff, but make sure it's a hammer that fits your nail before going ahead with such a solution.
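One simple way to provide that kind of escape hatch is an element whose rendering is just the string it was given. The `RawHtml` and `Div` classes below are hypothetical names for illustration, not the library's own (Python 3).

```python
class RawHtml:
    """Hypothetical escape hatch: emits a hand-written HTML string unchanged."""

    def __init__(self, markup):
        self.markup = markup

    def htmlRender(self):
        return self.markup


class Div:
    """Hypothetical container that renders whatever children it is given."""

    def __init__(self, children):
        self.children = children

    def htmlRender(self):
        inner = "".join(child.htmlRender() for child in self.children)
        return "<div>\n%s</div>\n" % inner


banner = RawHtml('<p class="promo">Designer-supplied <em>markup</em></p>\n')
block = Div([banner]).htmlRender()
```

A designer's fragment can then sit alongside generated elements without anyone translating it into classes first.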
