Archive for the ‘Software Engineering’ Category

New CEP engines for .Net

Today Microsoft published one interesting part of the upcoming SQL Server 2008 R2: StreamInsight. This is Microsoft’s first step into the area of CEP. After a quick look it seems to offer about the smallest set of features you need for a useful stream processing engine. At its core it is a typed continuous query engine which uses LINQ for basic operations like projection, joins, ranking and some more. A query is bound directly to an input and an output adapter and then assigned to a named application which is managed by an embeddable execution engine. Event types are simply POCO objects with attributes. All events are by definition assigned to a fixed model: interval, edge (open interval) or point in time. Everything is assembled programmatically; there is (currently) no support in Visual Studio, although there is an offline debugger which can replay event log traces.
As mentioned, this is a simple engine, but an interesting one because it is easy to embed and easy to handle, which makes it ideal for integration into existing applications. It will be interesting to see how well the current runtime actually scales on multi-processor systems.
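To make the idea of a typed LINQ query over point-in-time events more concrete, here is a minimal sketch. It deliberately uses plain LINQ to Objects over an in-memory sequence and not the StreamInsight API; the names SensorReading, WindowAverage and AveragePerWindow are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event payload: a plain object with typed fields.
public sealed class SensorReading
{
    public DateTime Timestamp { get; set; }   // point-in-time event
    public string SensorId { get; set; }
    public double Value { get; set; }
}

// Hypothetical result type: one average per sensor and time window.
public sealed class WindowAverage
{
    public string SensorId { get; set; }
    public DateTime WindowStart { get; set; }
    public double Average { get; set; }
}

public static class StreamQuerySketch
{
    // Filters the events, groups them into fixed-size time windows per sensor
    // and projects one average per group. A real CEP engine evaluates such a
    // query continuously; this sketch evaluates it once over a finite list.
    public static List<WindowAverage> AveragePerWindow(
        IEnumerable<SensorReading> events, TimeSpan window, double minValue)
    {
        var query =
            from e in events
            where e.Value >= minValue
            group e by new { e.SensorId, Window = e.Timestamp.Ticks / window.Ticks } into g
            orderby g.Key.Window, g.Key.SensorId
            select new WindowAverage
            {
                SensorId = g.Key.SensorId,
                WindowStart = new DateTime(g.Key.Window * window.Ticks),
                Average = g.Average(x => x.Value)
            };
        return query.ToList();
    }
}
```

The point of the sketch is only the shape of the query: filter, window, aggregate, project, all statically typed. Input and output adapters, the different event models and the continuous evaluation are what the engine adds on top.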
The second interesting development is that NEsper, the always one-step-behind .Net version of the Java CEP engine Esper, is also available in a new version and is now not so far behind the mature Java version. It provides a lot more functionality than the small Microsoft engine, although, because it is aligned with the Java version, its query language is SQL-like rather than based on LINQ.
Both engines are interesting because the .Net world finally has some nice ways to use CEP, which is more and more becoming a commodity and is drifting out of its niche of expensive but not necessarily more useful commercial CEP application servers.
It would be interesting to test the actual performance of these two new(late)comers, because in some applications a lot of home-grown code can be removed by using CEP concepts …

First impression of VS 2010

Visual Studio 2010 Beta 1 is publicly available, and besides CLR/.Net 4.0 the IDE itself changes a bit:

  • The Visual Studio Shell 2010 is based on WPF, the first time Microsoft is actually using it in one of its own products.
  • Some features available in R# are now also directly supported by the IDE, like symbol navigation and automatic implementation generation.
  • Silverlight and F# are integrated out of the box.
  • Historical debugging is an interesting new concept: the debugger now tracks certain events until you actually hit your breakpoint.
  • Some basic support for UML was added.

All in all, the switch to WPF gives the Visual Studio Shell important new graphical possibilities. WPF also has a big downside: it needs a lot more resources than the old forms, so you need a bit more graphical and computational power than with the current version. At least you will know why 2 or 4 cores are useful … The Team Foundation Server functionality will also be extended, but as it looks the new version is no improvement over the current one, which is not worth the money and effort, so it will be easier to invest in a working issue tracker, a build server, a test case management tool and a usable wiki. Integration is overrated if it cannot be used effectively in real projects …

Microsoft CHESS for managed code available!

Presented at the PDC 2008, the concurrency unit-testing tool CHESS has finally been released by Microsoft in a first version. It was already available for Win32 applications but is now also available for managed code and integrated into Visual Studio 2008. Because CHESS controls all threads and their scheduling while executing your code, it can find “Heisenbugs”, which means it is possible for the first time to build unit tests which are able to test concurrency reliably. This is a unique capability which I’ve not seen in any other test framework, so it will be interesting to run it against some code of my own as well as against 3rd party libraries…
The only downside so far is that the Visual Studio version needed is the Team System one, because of the Microsoft unit test framework, which is not available to everyone. But Microsoft provides a trial version, valid until December 2009, as a Virtual PC image, so there is a way to use CHESS.
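As an illustration of the kind of bug meant here, the following sketch shows a classic lost-update race. It does not use the CHESS API or its test attributes; Counter and RaceSketch are hypothetical names for this example only. Under a normal scheduler a plain unit test of the unsafe variant will almost always pass, and only a tool that systematically explores thread interleavings can reliably hit the failing schedule.

```csharp
using System.Threading;

public sealed class Counter
{
    private int count;

    public int Value { get { return count; } }

    // Not atomic: read, add, write back. Two threads can both read the same
    // old value, and one of the two increments is then lost.
    public void UnsafeIncrement()
    {
        int old = count;
        count = old + 1;
    }

    // Atomic alternative based on Interlocked.
    public void SafeIncrement()
    {
        Interlocked.Increment(ref count);
    }
}

public static class RaceSketch
{
    // Runs one increment on each of two threads and returns the final value.
    // With safe == true the result is always 2; with safe == false it is 2 on
    // most schedules but 1 when both threads read before either one writes.
    public static int RunTwoThreads(bool safe)
    {
        var counter = new Counter();
        ThreadStart work = () =>
        {
            if (safe) counter.SafeIncrement();
            else counter.UnsafeIncrement();
        };
        var t1 = new Thread(work);
        var t2 = new Thread(work);
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return counter.Value;
    }
}
```

A test asserting that RunTwoThreads(false) returns 2 is exactly the sort of test that stays green for months and then fails once on a build server.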

Microsoft CCR and DSS, Part 2

Using the CCR Toolkit, Microsoft also presented a new framework for distributed service environments, Decentralized Software Services (DSS). The framework provides the hosting environment for services, a new REST-oriented, SOAP-based communication protocol and a contract-based programming model not dissimilar to the WCF framework. Every service has a unique service identifier and a contract identifier; multiple instances of a specific service are distinguished by an additional UUID. Both descriptions are directly accessible via HTTP. Every service defines exactly one main CCR port, which is responsible for handling the DSS protocol commands as well as the standard HTTP commands. The state holder of the service is explicitly defined and can also be accessed via HTTP, which makes it possible to provide a web-based interface to any service instance. DSS additionally provides a publish/subscribe service which can be used to communicate state changes between services, and distributed queries based on LINQ for accessing the state as well as the actual message information of a particular service or contract. It can also use UPnP for discovering services.

I have only played a little with the tutorial examples, and these are my impressions:

  • With CCR ports as the accessors of the service, handling the protocol is very simple, even in the case of concurrent access. And debugging is actually a lot easier than with traditional concurrency structures …
  • Because services can provide their contract and description as well as their state as plain HTML pages, it is a very convenient way to have a human-readable monitoring interface available.
  • Similar to WCF the contract definition for data and the service conversation is very simple as is the rest of the service implementation
  • Hosting of services is very simple and can be easily embedded
  • Working with the CCR is a new experience, without events or callback delegates

Because DSS is used for nearly all functionality provided by the Robotics Developer Studio, there are a lot of examples of how to interact with unmanaged code and GUI applications, of complex service choreography and of the Visual Programming Language (VPL), which makes it possible to build really complex applications based on the DSS infrastructure.

So the only downside is that the CCR and DSS libraries can only be distributed if you buy the separate license, which is not really expensive, and that the feature to build actual distributable assemblies out of a VPL application is only available in the Standard Edition of the Robotics Studio. Either way, this is a very elegant and impressive new toolkit which is maybe the best solution for SOA architectures on the .Net platform today, or at least better suited than Remoting or WCF.

Microsoft CCR and DSS, Part 1

After presenting the Concurrency and Coordination Runtime (CCR) and Decentralized Software Services (DSS) Toolkit at the PDC08, Microsoft released it as a commercially available product. Originally, the toolkit was part of the Robotics tools and frameworks. This week the current version, Microsoft Robotics Developer Studio 2008, was released as a free, downloadable Express edition and a regular Standard edition. It includes the CCR and the DSS library as well as the Microsoft Visual Programming Language. The runtime libraries cannot be redistributed; for that you have to buy the separate product bundle, but for playing around with the new toolkit this is more than sufficient.
The first interesting part is the CCR. There are three main components:
Ports provide a type-safe, atomic data interface. Arbiters are responsible for coordination and for executing the user code. The last interesting pieces are the Dispatcher and the DispatcherQueue, which are responsible for actually reacting to data posted to a port. When data is posted to a port, an Arbiter schedules, with the help of the Dispatcher, the execution of a specific section of user code provided as a delegate. The important thing is that all calls are atomic and thread safe, so there is no need for locks. Conceptually it works like message passing. The CCR is very well documented and provides the basic structure for the DSS and most of the Robotics services. One interesting concept is how iterators are used to handle asynchronous program flow. A small code example which uses them for reading a file demonstrates this:

// namespaces needed by this snippet
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.Ccr.Core;

IEnumerator<ITask> FileReadEnumerator(string filename)
{
    // this is the port which is responsible for handling the
    // IAsyncResult structure
    var resultPort = new Port<IAsyncResult>();

    // open the asynchronous file read stream, and this is important: inside
    // a dispose block
    using (FileStream fs = new FileStream(filename,
        FileMode.Open, FileAccess.Read, FileShare.Read, 8192,
            FileOptions.Asynchronous))
    {
        Byte[] data = new Byte[fs.Length];

        // start reading from the file, the port post function is used
        // as callback
        fs.BeginRead(data, 0, data.Length, resultPort.Post, null);

        // the second important thing: the iterator suspends until the port
        // gets data, meaning that asynchronous processing has finished. An
        // empty delegate will be executed at this event
        yield return Arbiter.Receive(false, resultPort, delegate { });

        // get the actual asynchronous result structure to complete the read
        var ar = (IAsyncResult) resultPort;
        try
        {
            Int32 bytesRead = fs.EndRead(ar);
        }
        catch
        {
            // handle I/O exception
        }

        // data is declared inside the using block, so it is processed here
        ProcessFileData(data);
    }
}

This iterator is then scheduled by the following code snippet:

Dispatcher dispatcher = new Dispatcher();
DispatcherQueue taskQueue = new DispatcherQueue("Default", dispatcher);
Arbiter.Activate(taskQueue,
    Arbiter.FromIteratorHandler(() => FileReadEnumerator(@"C:\test.txt")));

The arbiter handles the delegation to the actual file reading logic and works in non-blocking mode, while the iterator is responsible for the actual program flow. The beauty of this code is that the program flow reads as one visually sequential flow, which has some important advantages over using anonymous delegates. For one thing, the using() ensures that the file stream actually gets closed correctly. For another, because no nested delegates are needed, parameters and local variables can be used directly, which can otherwise lead to problems.

Google Collections Framework

The Google Collections Framework has been available for some time, but I never took the time to look into it. Now there is a nice presentation about it, part 1 and 2 including slides, and as someone in the audience already said: “It sounds that it is cooler as I thought”. Currently it seems that the downloadable snapshot is not the most current one, but nevertheless it is worth having a look at the code …

Why it is hard to find good Software Developers

I am currently reading Gunter Dueck’s newest book and I found a reasonable explanation of why it is currently so hard to find good Software Developers: he describes an interesting article by George A. Akerlof, The Market for “Lemons”: Quality Uncertainty and the Market Mechanism, which, simplified, tells us why asymmetric information can destroy a market. My thought here is that this may be a reason why it is currently very hard to find good people: most companies do not know how to distinguish between different qualities of work, or why good work can sometimes have more impact than “process re-engineering”. For this reason they do not know how to evaluate the market, and on the other side they expect that there are a lot of “lemons” out there, so mostly they compare by price. The article tells us that in this situation the market loses the really valuable people, because they either have to sell themselves at the same low level or leave the market altogether (and become managers, consultants or freelancers).
Maybe this thought is a bit far-fetched, but it is definitely the best explanation I’ve found so far for why the market for better-than-substandard Software Developers is currently out of balance.

Singularity

These days a lot is happening around the Microsoft Research project Singularity: the source is finally available from CodePlex! Why is this project exciting? It is a research playground to test ideas such as using virtual machines like the CLR on the level normally occupied by C or assembler (hey, device drivers in C# are definitely more readable). Also, a lot of concepts such as contracts are inherited from Spec# and used for guarding most system services. Until now there were only interviews (the last one on Software Engineering Radio with Markus Völter and Galen Hunt), so looking at actual working code is amazing.
I’m sure anyone who is interested in novel operating system ideas and does not want to explore something like Minix should download the source. And I’m sure any other developer will also get some new ideas from the source.

A great build server

For efficient software development you need a reliable and very flexible Continuous Integration server. Most people know CruiseControl(.Net), but everyone who has used it knows that it is very flexible but has its limitations and, well, a somewhat “uncool” frontend. While searching for a new one I tried out JetBrains TeamCity once again. The first version did not provide enough new features to justify a switch, but the current version 3.0 (and the upcoming 3.1) has features not easily found elsewhere:

  • The concept of build agents: you can install small Java-based build agents on various platforms (in my case an x64 Windows, an x64 Linux and an IA64 Linux based server), all managed by a single build server. Checkout can take place on the server or on any agent host if SVN is installed there.
  • Remote builds: any user can trigger a remote build from their workstation without committing the code to SVN.
  • Targets ANT, Maven2, NAnt, MSBuild, JUnit, TestNG, NUnit, Visual Studio 2003-2008, IntelliJ projects and simple shell scripts.
  • The Web GUI is cool and workable (as well as the integration in Eclipse, Visual Studio or the Tray).
  • Can integrate third party reports and integrate with any build script.
  • And the professional version is free 😉

So give TeamCity a try if the professional version is enough for your purpose. It solves a lot of problems very elegantly and you definitely need less time to manage it.

Easily exploit your cores

Microsoft has released its Parallel Extensions to the .NET Framework 3.5 as a CTP (you can find more about it on Joe Duffy’s Blog). The library provides some interesting features: one is an easy way to execute LINQ queries and simple loops in parallel. The other is a simple-to-use Task execution library which, and this is the most interesting part, is not built on the existing ThreadPool class; instead it implements a new scheduler based on “Work-Stealing”. Why is that important? Normally you use threads, which have some overhead, although not much, in creation and management. This plays no role for long-running tasks, but it is certainly not very efficient for very small tasks such as those you get if you want, for example, to parallelize a sorting algorithm. On the other side you often have to synchronize the work of several parallel tasks, so in C# you use resettable events and wait for them. The problem is that they are not cheap because they actively use system resources, e.g. handles, which are limited. The new library does not need events for this job, so the only resources used are the threads you assign to the TaskManager class. This is certainly not a production-ready library but a very interesting one, because it makes the development of efficient multi-core aware data structures and algorithms very easy. The Java world has had a library like this for some years already: Doug Lea’s Concurrency Library has a Fork/Join framework which is also based on a “Work-Stealing” scheduler, but I doubt a lot of people knew about it. The framework will be part of Java 7 which, sorry to say, will not be available until 2009.
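The three usage styles mentioned above (parallel LINQ queries, parallel loops and fork/join style tasks) can be sketched as follows. Note the assumption: the sketch uses the API as it later shipped in System.Threading.Tasks and PLINQ with .NET 4, not the CTP for .NET 3.5, whose names differ in places (for example the TaskManager class). ParallelSketch and its method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSketch
{
    // Parallel LINQ: AsParallel() lets the query run on several cores.
    public static long SumOfSquaresPlinq(int[] numbers)
    {
        return numbers.AsParallel().Select(n => (long)n * n).Sum();
    }

    // Parallel loop: each worker keeps a local subtotal and adds it to the
    // shared total exactly once, so no lock or event is needed inside the loop.
    public static long SumOfSquaresLoop(int[] numbers)
    {
        long total = 0;
        Parallel.For(0, numbers.Length,
            () => 0L,
            (i, state, local) => local + (long)numbers[i] * numbers[i],
            local => Interlocked.Add(ref total, local));
        return total;
    }

    // Fork/join: split the range recursively into many small tasks. This is
    // the pattern that profits from a work-stealing scheduler, because idle
    // workers can take over pending halves from busy ones.
    public static long SumRange(int[] numbers, int from, int to)
    {
        if (to - from <= 1024)
        {
            long sum = 0;
            for (int i = from; i < to; i++) sum += numbers[i];
            return sum;
        }
        int mid = from + (to - from) / 2;
        long left = 0, right = 0;
        Parallel.Invoke(
            () => left = SumRange(numbers, from, mid),
            () => right = SumRange(numbers, mid, to));
        return left + right;
    }
}
```

The sequential cut-off of 1024 elements in SumRange is an arbitrary choice for this sketch; below some size the bookkeeping per task costs more than the work itself, so the best value has to be measured.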