2008-10-14

Serializing deeply linked data structures

While expanding the test cases for my tree classes in ZUZU.LIB.CONTAINERS, I discovered that in one degenerate case involving a pessimally balanced splay tree, attempting to serialize the tree using the default compiler-generated serialization routines resulted in a stack overflow. The culprit was a test case that accesses each element in the tree in order before attempting to clone the tree using serialization. For most self-balancing trees this is not a problem, but for splay trees it produces a tree that is as unbalanced as possible -- essentially just a long linked list. Because the compiler-generated object-serialize method recursively serializes a class's fields, serializing the tree nodes blows up the stack. This is a potential problem when serializing any linked data structure of arbitrarily large depth.

The way around this problem is to implement an explicit non-recursive object-serialize method and object-deserialize constructor for affected classes. The general algorithm is fairly simple:

  1. Iterate non-recursively over the nodes in the data structure. For each node, temporarily null out its pointers and serialize the node normally. The SerializeOutputStream will remember the objects and will not dump them out again if the same object is serialized later.
  2. If the number of nodes was not known in advance, serialize a sentinel value to delimit the end of the nodes.
  3. Iterate over the nodes again in the same order and serialize the pointer fields. Because every node was registered with the stream in the first pass, each pointer is now written as a back reference instead of triggering a recursive serialization.
When deserializing, just reverse this process.

The following example demonstrates the technique for a simple linked list data structure. Note that in the linked list case the algorithm requires only a single iteration, because the next pointer always refers to the next element to be serialized. To see the stack overflow, comment out the object-serialize and object-deserialize members.


[Embedded example applet; viewing it requires the Curl RTE.]

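Since the applet source cannot be shown inline here, the following sketch outlines the shape of the fix for a singly linked list. This is not the applet's actual code: the stream method names write-one and read-one (and the DeserializeInputStream parameter type) are assumptions for illustration, and the class-version optimization mentioned below is omitted.

{define-class public ListNode
  field public value:int
  field public next:#ListNode

  {constructor public {default value:int, next:#ListNode = null}
    set self.value = value
    set self.next = next
  }

  || Walk the chain with a loop instead of letting the default
  || compiler-generated serialization recurse through 'next'.
  {method public {object-serialize out:SerializeOutputStream}:void
    let node:#ListNode = self
    {while node != null do
      let n:ListNode = {non-null node}
      {out.write-one n.value} || ASSUMED stream method name
      set node = n.next
    }
    {out.write-one null} || sentinel marking the end of the chain
  }

  || Rebuild the chain iteratively, in the order it was written.
  {constructor public {object-deserialize in:DeserializeInputStream}
    set self.value = {in.read-one} asa int || ASSUMED method name
    let tail:ListNode = self
    {loop
      let raw:any = {in.read-one}
      {if raw == null then {break}}
      let node:ListNode = {ListNode raw asa int}
      set tail.next = node
      set tail = node
    }
  }
}
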
Note how I used the class version as an optimization to avoid serializing an extra null for each instance.

Fixing this for my tree classes was a little bit more complicated but the principle is the same. You can see my changes here.

2008-10-01

An 'unimplemented' syntax

Frequently I find that I want to quickly sketch out the interface of a function or class method and compile it without actually implementing its body. If the function does not return a value, I can simply leave the body empty, but if it does return something, I might need to write a fake return statement to make the compiler happy. In either case, I usually want to leave myself a reminder that the code still needs to be implemented. In Curl, this can easily be done using an exception:

{define-proc {foo}:String
  {error "not implemented yet"}
}

The compiler knows that the 'error' function will always throw an exception and will therefore not complain that the function lacks a return statement. To create your own function like 'error', you only need to make a procedure that always throws an exception and that has a declared return type of 'never-returns':

{define-proc {unimplemented}:never-returns
  {error "not implemented yet"}
}

I have gone one better than this by creating an 'unimplemented' syntax in the ZUZU.LIB.SYNTAX package that uses Curl's 'this-function' syntax to add the name of the unimplemented function to the error message. For example:


[Embedded example applet; viewing it requires the Curl RTE.]

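In place of the live example, here is a sketch of roughly what such a macro can look like. This is not the ZUZU.LIB.SYNTAX source; in particular, exactly how {this-function} gets formatted into the error message is an assumption.

{define-macro public {unimplemented}
  {expand-template
    || ASSUMED: splice in the enclosing procedure so the error
    || message names the function that still needs an implementation.
    {error "not implemented yet: " & {this-function}}
  }
}

Because 'error' is declared never-returns, the expansion satisfies the compiler in any procedure, whatever its declared return type.
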
You can find the source of this macro here.

The ability to extend the syntax like this makes Curl a much more expressive language than most widely used languages today.

2008-09-17

Running Applets directly from Google Code

One thing I have always liked about Curl is the lack of an independent compile/link step. You can run Curl applets directly from source code just using the Curl RTE, which will compile and link the code dynamically as needed. This gives Curl the immediacy and flexibility of scripting languages like JavaScript while retaining the performance of a compiled language. It also means that you can run Curl applets directly from a source code repository with a web interface that can be configured to return the appropriate Curl applet MIME type (text/vnd.curl). Luckily for me, Google Code provides such a repository, so I am able to configure applets in my ZUZU libraries to be run directly from the repository.

Here is an example:

[Embedded example applet; viewing it requires the Curl RTE.]

The above applet is located at the URL:

http://zuzu-curl.googlecode.com/svn/trunk/zuzu-curl/LIB/applets/example.curl

This example applet takes arguments in the "query" portion of the URL to set the title of the example and to load the initial contents of the example either from another file or from the query itself (as in this case). This allows me to use the same example applet to show different editable examples in my blog. The embedded example applet used in the training section of the Curl Developer's Site uses the same trick; for example, see here.

Look here for instructions on how to configure your Google Code repository to serve Curl applets. This trick may work on other Subversion-based code hosting services such as SourceForge, but I have not tried it.
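
For a Subversion-based host, the key step generally comes down to setting the svn:mime-type property on each applet file, roughly as follows (a sketch; the linked instructions cover the Google Code specifics):

svn propset svn:mime-type text/vnd.curl example.curl
svn commit -m "serve example.curl as a Curl applet"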

UPDATE: Unfortunately, there does not seem to be any comparable support for Mercurial-based repositories. See Google Project Hosting issues 2815 and 2920.

2008-09-08

Conditional compilation for the "bleeding edge"

If you are a typical Curl application programmer, then you only need to use one version of the API at a time. When you are ready to upgrade to a new version and make use of its features, you simply update the curl heralds of your code to refer to the new version and don't look back. Curl library developers, on the other hand, often have to support multiple versions of Curl. One way to do this is simply to create separate branches in the source code repository for the different API versions. This technique works well only if the multiple versions do not have to be supported for very long, because it is difficult to keep the code on all branches up to date. The ZUZU libraries take a different approach by listing multiple API versions in their heralds, as in
{curl 6.0, 7.0 package}
This allows the same source code to be compiled and run under either the 6.0 or 7.0 version of the Curl API. By itself, however, this only works for code that does not use any APIs that changed between 6.0 and 7.0. New Curl API versions almost never introduce changes that are incompatible with previous versions, but it still means that you cannot make use of new features without some provision for compiling the code conditionally on the API version. Such an ability is provided through Curl's built-in api-version-switch macro:
{api-version-switch
  case "6.0" do
    {do-it-the-old-way}
  case "7+" do
    {do-it-the-new-way}
}

This works great provided that you and your users only use official releases of the Curl RTE. But if you are like me and want to use a feature as soon as it becomes available in a beta or experimental release, it is not quite good enough. The problem is that beta releases use the same version number as the official release they precede, so api-version-switch cannot distinguish between a beta version and an official version. Even this is only a problem if you are afraid that some of your users may be using an earlier beta version than the one you are using. To address this, I added an extended version switch macro to the ZUZU.LIB.SYNTAX package that switches on the content of the curl-version string, which for beta releases includes a beta tag and a build number. For instance, to make use of a feature first introduced in build 35 of the 7.0 (Nitro) beta, you could write:
{curl-version-switch
  case "7 beta 35+" do
    {use-cool-new-feature}
  else
    {do-it-the-old-way}
}

This will compile the block using the new feature whenever it is compiled on a 7.0 beta with a build number of 35 or higher, and also when compiled under an official 7.0 release or any release later than 7.0.

2008-09-01

Deploying multi-library documentation

My decision to make ZUZU.TEST depend on ZUZU.LIB had the unintended consequence of breaking my library documentation. The pcurl-docs deployment targets for both libraries still worked just fine, and I was also able to install documentation from the generated libraries without error (using the "Help/Install documentation" menu action of the Curl Documentation Viewer), but when I tried to view documentation for an API in ZUZU.TEST I got an error complaining about a bad 'delegate-to' statement for the ZUZU.LIB manifest. Sure enough, the delegate path was "../LIB/manifest.mcurl", just as in the original manifest in the source code, but this does not work because documentation is always installed in a directory whose name includes the full manifest name and version, which in this case is ZUZU.LIB.0.1.

One way to fix this would be to rename the directories containing the original source files, but that approach is heavy-handed, especially since the manifest version is part of the name. Instead, I fixed this by altering the pcurl-docs targets for all the ZUZU libraries to deploy all files to directories using the manifest-name.version naming scheme. Unfortunately, this will require me to update the target directories whenever I change the manifest version, but I don't expect that to happen very often. Changing the name of the target directories does break the relative paths used to locate sibling libraries. In order to fix that, I needed to specify alternate delegate paths for ZUZU.TEST to locate ZUZU.LIB (and for ZUZU.ALL to locate both other libraries). This just required me to go to the Manifest Explorer view in the IDE, right-click on each delegate library, and modify the component target setting to use the alternate path. Hopefully, this process will get easier when the official 7.0 release comes out.
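
Schematically, the deployed ZUZU.TEST manifest ends up delegating through the versioned directory name, roughly like this (a sketch: the exact directive spellings and relative locations are assumptions, not the actual deployed manifest):

{curl 6.0, 7.0 manifest}
{manifest ZUZU.TEST, version = "0.1"}

|| The source tree's "../LIB/manifest.mcurl" path breaks once the docs
|| are installed into versioned directories, so point at ZUZU.LIB.0.1:
{delegate-to ZUZU.LIB, location = "../ZUZU.LIB.0.1/manifest.mcurl"}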

For more information on deployment, please refer to the Deploying Projects chapter of the Curl IDE User's Guide.

2008-08-24

Introducing ZUZU.LIB

I had originally intended to get ZUZU.TEST to its first release before spending much time on ZUZU.LIB, but for various reasons I have found myself adding code to ZUZU.LIB lately. In particular, I decided that my ExtendedDequeue-of class (a subclass of Dequeue-of with some extra useful methods) really should live in the ZUZU.LIB library instead of ZUZU.TEST. Of course, this means that ZUZU.TEST will need to depend on ZUZU.LIB, while ZUZU.LIB will need to use ZUZU.TEST for its tests.

This presents a minor code organization issue. In order for packages in ZUZU.TEST to be able to import packages from ZUZU.LIB, the manifest for the former has to delegate to the latter. Likewise, in order for the test packages in ZUZU.LIB to import the ZUZU.TEST.STANDARD package, the test manifest must delegate to the ZUZU.TEST library manifest. The problem is that cyclical manifest delegation is not allowed in Curl, so the ZUZU.LIB manifest may not delegate to ZUZU.TEST.

The solution is fairly simple: since the ZUZU.LIB library itself cannot depend on the ZUZU.TEST library, the tests for ZUZU.LIB can simply use a third manifest that contains entries for the test packages and delegates to both libraries. In fact, what I did was slightly different from that. Instead of having my test manifest delegate to the ZUZU.LIB manifest, I had it include it, so that the test packages and the ZUZU.LIB packages would be loaded using the same manifest. This will allow me to use certain "white-box" testing techniques should they become necessary. For instance, I can create macros that add extra testing instrumentation when compiled using the test manifest (a macro can access the contents of the manifest in which it is expanded through its 'macro-env' variable). One aspect of this arrangement that may not be immediately obvious is that even though the test manifest includes ZUZU.LIB and its package entries, it is still considered a distinct manifest, and its instances of the ZUZU.LIB packages are distinct from those loaded by the standard ZUZU.LIB manifest. In particular, this means that even though both are loaded from the same URL, the copy of the ZUZU.LIB.CONTAINERS package that is tested by the unit tests is different from the copy that is used in the implementation of the ZUZU.TEST infrastructure. For more information on manifest inclusion and delegation in Curl, see the Manifests chapter of the Curl Developer's Guide.
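
Schematically, the test manifest looks something like the following (a sketch: the manifest name ZUZU.LIB.TESTS, the relative locations, and the exact directive spellings are all assumptions):

{curl 6.0, 7.0 manifest}
{manifest ZUZU.LIB.TESTS}

|| Inclusion: the ZUZU.LIB packages and the test packages all load
|| under this single manifest, which is what enables white-box tricks.
{include "../LIB/manifest.mcurl"}

|| Delegation: ZUZU.TEST stays a separate manifest and library.
{delegate-to ZUZU.TEST, location = "../TEST/manifest.mcurl"}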

In any case, now that ZUZU.LIB is needed for ZUZU.TEST, I expect that I will do at least a skeleton release of it when I release ZUZU.TEST, but I don't expect to do much in the way of testing or documentation, especially for the components not needed by ZUZU.TEST. That is not to say there is no useful code there, so don't be afraid to make use of it if it will help your project. I expect to write some blog entries on some of the things I have dropped in there recently, so stay tuned....

2008-08-18

What I did on vacation

Despite the fact that it rained almost every day on our vacation, keeping me indoors much of the time, I did not end up getting all that much done on the Zuzu project. I mostly did some refactoring of classes in the REMOTE package in order to accommodate remote test servers with different process configurations, i.e., different Curl API versions, debuggability, or even platforms. My goal is for the test driver application to be able to dispatch to test servers with a variety of configurations so that developers can test multiple configurations from a common front end.

I definitely missed being able to connect to the source repository; working without it forced me to devise a manual system for doing check-ins. Subversion does maintain local state, so you can at least track which files you have changed since you last committed your sandbox, but there is no facility for tracking individual changes while offline. I believe I could have installed SVK and used it to mirror my repository, but it seemed like too much effort to set up, so I didn't try. Instead, I just created numbered folders for each discrete change and copied in any changed files at the points where I would otherwise have committed to the repository. When I got home, I reverted my sandbox, then copied the contents of the numbered folders back into the sandbox and submitted the changes one at a time. All that file copying did slow me down a bit, and I will revisit the issue of offline check-ins before I go on another long vacation.

2008-07-29

Small steps

Working on an open source project is difficult, even for a professional programmer. My natural inclination is to immerse myself in a project and not check in my code until large pieces of functionality are in place. But unlike at my real job, I rarely find that I can spend more than an hour or two at a time working on my open source projects. Consequently, I used to leave partially changed files around, waiting for some free time when I could get around to finishing them. Sometimes when I am especially busy, I cannot resume my work for several weeks and then have some difficulty remembering where I was.

Lately I have begun to work in smaller increments, making changes that can be completed quickly and checked in without breaking any of my existing functionality or unit tests. Sometimes these steps are in the right direction, sometimes not, but at least I am no longer leaving my project in an inconsistent state. Working this way also lets me make measurable progress even when I can only spend time in small chunks.