2009-06-27

Configuring debuggability in Curl

Debuggability is an attribute of Curl processes that controls how code is compiled and whether debugger features are enabled for an applet. Debuggability controls the following features:

  • The ability to set breakpoints, and step through code in the IDE debugger.
  • The ability to measure code coverage of running code.
  • In debuggable applets, syntax errors are reported in the Curl IDE, where the developer can easily click on an error and navigate to the offending code. In non-debuggable applets they are only reported in the applet's main window.
  • Lazy compilation is disabled in debuggable applets. In non-debuggable applets, functions are not compiled (and errors not detected) until they are first used, which may not happen until well after the application has been started.
  • Compiler optimizations like function inlining and register allocation are disabled in debuggable applets.
  • The Curl performance profiler can report information about source lines in debuggable applets; otherwise only function-level information is available. However, because optimizations are disabled, profiling debuggable applets may produce significantly different results than non-debuggable ones.

The principal disadvantage of making your applets debuggable is that the combination of disabled optimizations and the extra noop instructions inserted to support debugging will result in slower code, and in some cases debuggable applets may be dramatically slower. In such cases, the developer may want to be able to run the applet as debuggable when using debugger or coverage features, but otherwise run a non-debuggable version.

At least up through version 7.0 of the RTE, debuggability of Curl applets is controlled solely by the list of directories specified in the Debugger pane of the Curl RTE Control Panel. When the RTE starts a new applet, it consults this list to see if the applet should be made debuggable: if the applet's starting URL falls under one of the listed directories, the applet is made debuggable.

So for developers to run the same applet with different debuggability settings, they must either add and remove entries in the Control Panel's debuggability settings every time they want to run the applet differently, or find a way to run the same applet from different paths. The latter is obviously preferable. On Linux (and Mac OS X) this can easily be accomplished by creating symbolic links with the 'ln -s' command. For instance, on the Linux machines I use at work, I have started putting all of my Curl projects in a subdirectory named 'ws' (short for workspace) and have made a symbolic link named 'non-debug-ws' to the 'ws' directory. That way I can configure my debuggability settings so that paths beginning with "file:///u/cbarber/ws/" load applets with debuggability enabled, while those beginning with "file:///u/cbarber/non-debug-ws/" run non-debuggable versions.

On Windows, however, this is not so easy: there is no equivalent command, nor any way to accomplish the same thing in the Explorer UI. It turns out that it is indeed possible to create the equivalent of a Unix symbolic link on NTFS file systems -- the default file system used by Windows NT and later -- using an NTFS "junction point", but this ability is only exposed through low-level system programming APIs. Fortunately, there are a number of open-source tools that give you the ability to create them. The one I favor is an open-source shell extension called NTFS Link, which adds entries for creating NTFS hard links and junction points to Windows Explorer's "New" submenu. The one gotcha is that you must be careful not to delete a junction point in Explorer until you have unlinked it from its target directory, or else you will delete the target directory's contents as well!

P.S. In case it is not already obvious, the debuggable path is the same as the path you use in your development environment. The non-debug path does not have to be reflected in your development environment, since it is not expected to trigger breakpoints and so on.

2009-03-28

Poor man's union types

The original design of the Curl language anticipated support of user-defined "union types", that is, the ability to define an abstract data type representing values that may belong to two or more unrelated types. (For an example from another dynamic language, see Dylan's type-union syntax.)

The proposed syntax was:

{one-of T1, T2 [,...]}

In early versions of the language, one-of was defined as a macro that simply resolved to 'any', which is the supertype of all scalar types in Curl, with the intention of implementing a correct version at a later time. Of course, such an implementation did not do the desired compile-time checking and served only as an indication of the programmer's intent. However, when we started looking into implementing this concept for real, it quickly became evident that the amount of effort required to implement it was not justified by the relative infrequency of its anticipated use, so we dropped it from our development schedules and removed the syntactic placeholder.

Nevertheless, there are times when such a type would come in handy. One example can be found in Curl's built-in 'evaluate' function, whose first argument is declared as 'any', but which actually accepts one of 'CurlSource', 'StringInterface', or 'Url'. Because the argument is declared as taking 'any', you can write code that passes an unsupported type and the compiler will not generate an error; an error will not be thrown until the function is actually executed with a bad value. If the function could have been declared as taking a union type, the compiler would be able to detect such errors at compile time.
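
For instance, a call like the following compiles without complaint and only fails once it executes (a minimal sketch):

|| Accepted by the compiler, since the parameter is declared 'any', but
|| throws at run time: an int is not a CurlSource, StringInterface, or Url.
{evaluate 42}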

Fortunately, it turns out that it is possible to define a class type that, while not implementing the full semantics of a union type, still provides the most important feature you want from one: type checking of assignments. The trick is to define a class with an 'any' field to hold the value and an implicit constructor for each type in the union:

{define-class public IntOrString
    field public constant value:any
    {constructor public implicit {from-int i:int}
        set self.value = i
    }
    {constructor public implicit {from-String s:String}
        set self.value = s
    }
}
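
With such a class, assignments are checked at compile time. A minimal usage sketch (the 'show' procedure here is hypothetical):

{define-proc {show x:IntOrString}:void
    {output x.value}
}

{show 42}        || accepted: uses the implicit from-int constructor
{show "hello"}   || accepted: uses the implicit from-String constructor
|| {show 3.5}    || rejected at compile time: no conversion from double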

Each implicit constructor supports implicit conversion from its argument type when assigning to a variable or argument of the class type, and since the field is constant and can only be initialized by one of the constructors, you can safely assume that the value is a member of one of the specified types (or null, if 'uninitialized-value-for-type' would return null for one of the types).

It would be tedious to have to define such a class every time you needed this pattern, but it is straightforward to define parameterized versions of this class for different numbers of type arguments, plus a macro that picks the correct parameterized class based on the number of arguments, and this is exactly what I did last week. I added a new ZUZU.LIB.TYPES package to the ZUZU.LIB project, which contains the macros 'One-of' and 'AnyOne-of' along with the associated parameterized value classes. The 'One-of' type represents the value using a field of type '#Object' and can only be used when all of the types in the union are subtypes of 'Object'; 'AnyOne-of' uses an 'any' field and can be used with any types. Here is a small example:


[Embedded example applet -- requires the Curl RTE to view.]
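
Since the embedded example requires the RTE, here is roughly what using these macros might look like; the exact invocation syntax is an assumption based on the description above:

let v:{AnyOne-of int, String} = 42   || implicit conversion from int
set v = "hello"                      || implicit conversion from String
|| set v = 3.5                       || would be a compile-time error
{output v.value}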

2008-10-14

Serializing deeply linked data structures

While expanding the test cases for my tree classes in ZUZU.LIB.CONTAINERS, I discovered that in one degenerate case involving a pessimally balanced splay tree, attempting to serialize the tree using the default compiler-generated serialization routines resulted in a stack overflow. The problem was that I had a test case that accesses each element in the tree in order before attempting to clone the tree using serialization. For most self-balancing trees this is not a problem, but for splay trees it results in a tree that is as unbalanced as possible -- essentially just a long linked list. Because the compiler-generated object-serialize method recursively serializes the class's fields, serializing the tree nodes blows up the stack. This is a potential problem when serializing any linked data structure that may have arbitrarily large depth.

The way around this problem is to implement an explicit non-recursive object-serialize method and object-deserialize constructor for affected classes. The general algorithm is fairly simple:

  1. Iterate non-recursively over the nodes in the data structure. For each node, temporarily null out its pointers and serialize the node normally. The SerializeOutputStream will remember the objects and will not dump them out again if the same object is serialized later.
  2. If the number of nodes was not known in advance, serialize out a sentinel value to delimit the end of the nodes.
  3. Iterate over the nodes again in the same order and serialize each node's pointer fields.
When deserializing, just reverse this process.

The following example demonstrates this problem for a simple linked list data structure. Note that in the linked list case the algorithm only requires a single iteration, because the next pointer is always just the next element to be serialized. To see the stack overflow, comment out the object-serialize and object-deserialize members.


[Embedded example applet -- requires the Curl RTE to view.]


Note how I used the class version as an optimization to avoid serializing an extra null for each instance.
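
For readers without the RTE installed, here is a rough sketch of the serialization side for the linked-list case, without the class-version optimization; the method signatures and the 'write-object' name on SerializeOutputStream are assumptions from memory, not copied from the real API:

{define-class public Node
    field public value:int
    field public next:#Node
    {constructor public {default value:int, next:#Node = null}
        set self.value = value
        set self.next = next
    }
}

{define-class public LinkedList
    field public head:#Node

    || Walk the list iteratively, temporarily nulling each node's 'next'
    || pointer so the normal per-node serialization never recurses into
    || the rest of the list.
    {method public {object-serialize out:SerializeOutputStream}:void
        let node:#Node = self.head
        {while node != null do
            let n:Node = {non-null node}
            let next:#Node = n.next
            set n.next = null
            {out.write-object n}     || 'write-object' is an assumed name
            set n.next = next
            set node = next
        }
        {out.write-object null}      || sentinel marking the end of the chain
    }
}

The matching object-deserialize constructor mirrors this: read nodes until the null sentinel appears, relinking each node to the one read after it.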

Fixing this for my tree classes was a little bit more complicated but the principle is the same. You can see my changes here.

2008-10-01

An 'unimplemented' syntax

Frequently I find that I want to quickly sketch out the interface of a function or class method and compile it without actually implementing its body. If the function does not return a value, I can simply leave the body empty, but if it does return something, I might need to write a fake return statement to make the compiler happy. In either case, I usually want to leave myself a reminder that the code still needs to be implemented. In Curl, this can easily be done using an exception:

{define-proc {foo}:String
    {error "not implemented yet"}
}

The compiler knows that the 'error' function will always throw an exception and will therefore not complain that the function lacks a return statement. To create your own function like 'error', you only need to make a procedure that always throws an exception and that has a declared return type of 'never-returns':

{define-proc {unimplemented}:never-returns
    {error "not implemented yet"}
}

I have done one better than this by creating an 'unimplemented' syntax in the ZUZU.LIB.SYNTAX package that uses Curl's 'this-function' syntax to add the name of the unimplemented function to the error message. For example:


[Embedded example applet -- requires the Curl RTE to view.]
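
In lieu of the embedded applet, usage looks something like this ('parse-config' is a hypothetical stand-in):

{define-proc {parse-config url:Url}:String
    || Throws an error whose message names 'parse-config':
    {unimplemented}
}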


You can find the source of this macro here.

The ability to extend the syntax like this makes Curl a much more expressive language than most widely used languages today.

2008-09-17

Running Applets directly from Google Code

One thing I have always liked about Curl is the lack of an independent compile/link step. You can run Curl applets directly from source code just using the Curl RTE, which will compile and link the code dynamically as needed. This gives Curl the immediacy and flexibility of scripting languages like JavaScript while retaining the performance of a compiled language. It also means that you can run Curl applets directly from a source code repository with a web interface that can be configured to return the appropriate Curl applet MIME type (text/vnd.curl). Luckily for me, Google Code provides such a repository, so I am able to configure applets in my ZUZU libraries to be run directly from the repository.

Here is an example:

[Embedded example applet -- requires the Curl RTE to view.]

The above applet is located at the URL:

http://zuzu-curl.googlecode.com/svn/trunk/zuzu-curl/LIB/applets/example.curl

This example applet takes arguments in the "query" portion of the URL to set the title of the example and to load the initial contents of the example either from another file or from the query itself (as in this case). This allows me to use the same example applet to show different editable examples in my blog. The embedded example applet used in the training section of the Curl Developer's Site uses the same trick; for example, see here.
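
For illustration, such an invocation might look like the following; the query parameter names here are hypothetical, not necessarily the ones the applet actually accepts:

http://zuzu-curl.googlecode.com/svn/trunk/zuzu-curl/LIB/applets/example.curl?title=My+Example&contents=...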

Look here for instructions on how to configure your Google Code repository to serve Curl applets. This trick may work on other Subversion-based code hosting services such as SourceForge, but I have not tried it.

UPDATE: Unfortunately, there does not seem to be any comparable support for Mercurial-based repositories. See Google Project Hosting issues 2815 and 2920.

2008-09-08

Conditional compilation for the "bleeding edge"

If you are a typical Curl application programmer, then you only need to use one version of the API at a time. When you are ready to upgrade to a new version and make use of its features, you simply update the curl heralds of your code to refer to the new version and don't look back. Curl library developers, on the other hand, often have to support multiple versions of Curl. One way to do this is simply to create separate branches in the source code repository for the different API versions. This technique works well as long as the multiple versions do not have to be supported for very long, because it can be difficult to keep code on all branches up to date. The ZUZU libraries take a different approach by listing multiple API versions in their heralds, as in
{curl 6.0, 7.0 package}
This allows the same source code to be compiled and run under either the 6.0 or 7.0 version of the Curl API. By itself, however, this only works for code that does not use any APIs that changed between 6.0 and 7.0. New Curl API versions almost never introduce changes that are incompatible with previous versions, but this still means that you cannot make use of new features without some provision for compiling code conditionally on the API version. Such an ability is provided by Curl's built-in api-version-switch macro:
{api-version-switch
 case "6.0" do
     {do-it-the-old-way}
 case "7+" do
     {do-it-the-new-way}
}

This works great provided that you and your users only use official releases of the Curl RTE. But if you are like me and want to use a feature as soon as it becomes available in a beta or experimental release, this is not quite good enough. The problem is that beta releases use the same version number as the official release they precede, so api-version-switch cannot distinguish between a beta version and an official version. Even then, this is only a problem if you are afraid that some of your users may be using an earlier beta version than the one you are using. To address this problem, I added an extended version-switch macro to the ZUZU.LIB.SYNTAX package that switches on the content of the curl-version string, which for beta releases includes a beta tag and a build number. For instance, in order to make use of a feature that was first introduced in build 35 of the 7.0 (Nitro) beta, you could write:
{curl-version-switch
 case "7 beta 35+" do
     {use-cool-new-feature}
 else
     {do-it-the-old-way}
}
This will compile the block using the new feature whenever it is compiled on a 7.0 beta with a build number of 35 or higher, and also when compiled under an official 7.0 release or any release later than 7.0.

2008-09-01

Deploying multi-library documentation

My decision to make ZUZU.TEST depend on ZUZU.LIB had the unintended consequence of breaking my library documentation. The pcurl-docs deployment targets for both libraries still worked just fine, and I was also able to install documentation from the generated libraries without error (using the "Help/Install documentation" menu action of the Curl Documentation Viewer), but when I tried to view documentation for an API in ZUZU.TEST I got an error complaining about a bad 'delegate-to' statement for the ZUZU.LIB manifest. Sure enough, the delegate path was "../LIB/manifest.mcurl", just as in the original manifest in the source code, but this does not work because documentation is always installed in a directory whose name includes the full manifest name and version, which in this case is ZUZU.LIB.0.1.

One way to fix this would be to rename the directories containing the original source files, but this approach is heavy-handed, especially given that the manifest version is part of the name. Instead, I fixed it by altering the pcurl-docs targets for all the ZUZU libraries to deploy all files to directories using the manifest-name.version naming scheme. Unfortunately, this will require me to update the target directories whenever I change a manifest version, but I don't expect that to happen very often. Changing the names of the target directories does break the relative paths used to locate sibling libraries. To fix that, I needed to specify alternate delegate paths for ZUZU.TEST to locate ZUZU.LIB (and for ZUZU.ALL to locate both other libraries). This just required me to go to the Manifest Explorer view in the IDE, right-click on each delegate library, and modify the component target setting to use the alternate path. Hopefully this process will get easier when the official 7.0 release comes out.
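
To make the change concrete, here is roughly what the delegate entry looks like before and after; the exact 'delegate-to' syntax below is a sketch from memory, not copied from the actual manifests:

|| In the source tree, the sibling library is located by a relative path:
{delegate-to ZUZU.LIB, location = "../LIB/manifest.mcurl"}

|| In the deployed documentation, the sibling directory name includes the
|| manifest name and version, so the delegate path must match:
{delegate-to ZUZU.LIB, location = "../ZUZU.LIB.0.1/manifest.mcurl"}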

For more information on deployment, please refer to the Deploying Projects chapter of the Curl IDE User's Guide.