
Monday, December 10, 2012

The parallel scheduler as designed in the Falcon Organic Virtual Machine specification is complete, at least as a proof of concept.

I will merge the work in the new_engine branch within hours, and this starts the last development cycle leading directly to the release of Falcon 1.0 alpha.

The work from now on goes into the release code. As we proceed to finalize the code, it's time to start working on the test suite and documentation.

The things that must be done before issuing the alpha are mainly:

  • Review symbol/variable relationship to simplify it.

  • Add asynchronous item R/W.

  • Review the module loading API to simplify it and make it more organic with the new VM.

  • Finish the garbage collector (mostly already in place; mainly, it's a matter of finalizing the API).

  • Finish the grammar for the parser (a couple of advanced constructs are still to be ported).

  • Bring in the relational programming model (actually, it's a matter of a couple of extra grammar rules).

  • Port the old modules.

The engine as it stands now is way faster than the old 0.9 series, and it's currently on par with or superior to Lua and optimized Python 2.7 in many respects. There are still things we can work on to make it even a tad faster, but the introduction of asynchronous item R/W might reduce performance by about 10-20% on some symbol accesses (I think I can keep local variables free of this burden). In exchange, we gain transparent, simple and powerful parallel programming constructs, which counterbalance this loss hands down.

One thing that must be fixed in this final review is the way modules are loaded in the virtual machine. Up to now, the new engine's module loading scheme was loosely based on that of the old 0.9 engine, where modules could be compiled or saved through a separate set of functions. In the original Falcon, module compilation and serialization could be performed even through a separate library, and when I coded the new engine I tried to retain this separation.

However, the introduction of fully reflective code, which is the DNA of our organic machine, definitely requires a different approach. Even if none of the code usually compiled in a module currently requires it, any element that can be serialized might require the help of the virtual machine to complete some parts of its serialization. Prior to the introduction of the parallel scheduler, this help was part of an interactive process involving the host application using the Falcon engine, a VM and a single context. Now that we have multiple processes, each with its own multiple contexts, things must be handled a bit differently (and more transparently for the host application).

The following scheme shows the main entities involved in module serialization:

[Figure: Module Space Scheme]

A "module space" is a set of modules that, once deserialized or injected in the virtual machine, form a common space for global variables and exported symbols. Module spaces can be organized in a hierarchy, so that a plugin module compiled or loaded by a Falcon program can live in its own module space; once the plugin is unloaded, all the sub-modules it references can be disposed of with it, without polluting the global symbol space of the application.

However, loading a module might require running some code. Its main function must usually (though not necessarily) be run before its loader can receive it. Also, while this is currently never the case, some of the code in a module might require the virtual machine to perform some operation to help in storing or restoring an element.

For this purpose, we can now use a specific Process entity which can be executed by the module space to complete the deserialization.

The module space uses a private module loader, which can use a compiler to load source Falcon files, a FAM loader to load pre-serialized modules, or a dynamic library loader to load modules stored in system dynamic libraries.

Also, it will use a storer to serialize the module (via a ClassModule handler) to a stream, in case compiled modules are to be saved transparently as FAM modules.

Symmetrically, the FAM loader will use a restorer to read the serialized FAM modules.

Both the storer and the restorer might (at the moment only theoretically) require the virtual machine to execute some code. Previously, this was performed by configuring a VMContext with the operations to be performed, and then notifying the owner of the storer/restorer about the need to run that code. With an asynchronous, automated Process entity around, this might change into something more transparent: the invocation of code could be done directly by the storer/restorer classes, while the owner simply waits on the process for the storage/restoring (and eventually, the call of the module's main() function) to complete.

Doing this will simplify the API and, by leveraging the existing multithreaded Falcon engine, it should even turn out to be a tad faster, thanks to exploiting I/O dead times in the parallel environment.

