Friday, 30 August 2013

Comparing Test Spec Frameworks

Up until now, I've been using a custom test spec framework geared towards helping write unit tests with a BDD style description: Given; When; Then, described in a previous post. The framework was built on top of NUnit, Moq, and Structuremap.AutoMocking nuget packages to make assertions, create mock objects and resolve the test target dependencies with mock objects respectively. There is no problem with the framework as it is, but I was seeking a way to reduce the language 'noise' that was incurred - a new method for each Given, When and Then step is cumbersome and gets tedious after a while.

This lead me to discover the Machine.Specifications (MSpec) framework. It allows you to write tests in the same BDD style, but does not require a full method for each step. Instead, you can just write a lambda function. This can reduce any one test down to a single line, by making sure that you stick to one assertion per test. It introduces some new types which enter the terminology: Establish,  Because and It. These are direct replacements for Given, When and Then, as the semantics are identical.

But what about the automocking feature of the old framework? This can be fulfilled by another package designed to work with MSpec: Machine.Fakes. This, too, automatically resolves any dependencies of the target with mocked objects, provided by your mocking framework of choice. I used the Machine.Fakes.Moq nuget package to do this, but since you also use the Fakes framework to get access to the mocked dependencies it has created for you, and also generate additional mocks you may need the fact that I'm using Moq becomes irrelevant - it is completely abstracted away. To get access to an mocked dependency you use The<T>, and to create a mock you use An<T>. I also like the nice feature that gives you a list of mocked objects: Some<T>.

That's the language hooks and mocks taken care of, but in the effort to reduce the syntax bloat as much as possible I brought in the Fluent.Assertions nuget package. This library is leaner than NUnit because it lets me make assertions using a set of extension methods, e.g. .ShouldEqual() or .ShouldBeTrue(), so I don't have to begin an assertion statement with NUnit, I can start with the object I want to make the assertion on. It also puts the final nail in the coffin for NUnit, as I have already done away with the [SetUp], [TearDown] or [Test] attributes.

Let's compare two test classes written for the same target class. The Target is shown below, followed by the before and after test classes.

As you will notice, the 'After' test class contains over 50% fewer lines. The tests are now more readable because there is less language noise to cut through when reviewing the code. The important statements jump right out, and they also take less time to write. The method bodies and calls to base methods in the Given and When steps have gone, replaced with single line statements. The MFakes framework has helped make each line that refers to a mock more concise be removing the need to call .Object. The Fluent.Assertions library has made the assertions simpler too.

There are a couple of downsides that I have discovered not shown in the example that I think are worth mentioning. Firstly, Tests (It field variables) cannot be run from a base class. They can be declared, but go undetected by the build runner. This means that you will have to duplicate any tests that are based on statements that come before a branch at an if statement, assuming you want to test both branches of the method. Whereas previously you could declare a test in an abstract class and have it executed in each text fixture class that inherits from it.

The other negative is that you cannot instantiate a mock of your own using the An<T> from MFakes at field level. You have to be inside the body of the Establish lambda/anonymous delegate before you can assign a value to the variable defined at field level. You aren't told about this error until run time which is annoying to begin with, but being forced to declare your mock variables on one line and assigning them on another doubles the amount of code needed to achieve what that the previous framework could do. So it goes against the purpose of this exercise somewhat.

All in all, I think the positives outweigh the negatives. The biggest factor being that there is less manual syntax to be typed to generate a test. Which is one less reason not to do TDD, which can only be a good thing!

Wednesday, 21 August 2013

There's still a need for SQL databases

Document Databases are an important part of the NoSql movement taking place. They are appealing to developers because of their intuitive Apis and also to stakeholders as they lower the boundaries to feature development that would have once been hugely difficult a Sql-only environment. Some key benefits of document databases include:

- They permit fast development of complex objects graphs, making it easy to employ DDD. Though hand-scripting stored procedures are now largely a thing of the past thanks to modern ORMs, they are difficult to retro-fit to a legacy system and can introduce new bugs of their own. I would argue that there is more confidence in storing and accessing aggregates roots in a document database as their is 'truer' representation of the model being stored on disk, while with an ORM there will always be a layer of translation.

- Are schema-less and therefore 'backwards compatible', meaning no application downtime while the database is administered. Again, ORMs have advanced enough to support deployment tasks, see Microsoft Web Deploy and RoundhousE of the Chuck Norris Dev Ops suite, but the risk still exists of losing or corrupting data with a tool or not.

- When you need to scale, document databases have been designed from the ground up to be able to distribute the data across large numbers of machines with aim of spreading system load, increasing availability and preventing failures. There is a physical limit that Sql will reach when scaling simply by throwing more resources at it (vertical scaling). Document databases deal with sharding more effectively as they don't have relationships to maintain as in a relational SQL database. There are ways in which you can scale a sql database that are not so simple.

- When you need to perform calculations over a large dataset very quickly. Document databases lend themselves to map-reduce type queries especially well because of their distributed nature. Map reduce queries are designed to be performed in parallel, which is a great fit for a shared document database. Queries are also optimized for 'eventual consistency', meaning that result accuracy is traded for low response times.

- When you need access from JavaScript. Many document databases store their data as JSON objects, so using a JavaScript client is a great way of consuming a document database service. The increase of Single Page Applications written in JavaScript and also Node Js in the back-end has encouraged the growth of JS clients. Take the Mongoose npm package for example.

In modern applications, its hard to argue that any of these requirements can be overlooked. But if you don't have a bursty traffic model and do not need a globally accessible service then you are not yet excluded from using a Sql database...

Sql databases still have a very good use. Leaving aside the fact that they remain largely a more stable platform due to there maturity compared with the relative youthful NoSql technologies, Sql databases make a great tool for data-mining. That is, to make use of the Sql JOIN operator across multiple tables to 'slice' the data. If you are working with a normalized schema, Sql queries provide the ability to draw conclusions from a body of data with more freedom than a document database. Document databases demand that you specify the kind of report you will want up front at design-time. With a Sql database, or at least a tool that provides a Sql type language interface, you can build up reports based on the relationships built into the tables.

By reporting in this context, I primarily mean the types of back end office reports that might be needed on a daily, weekly or monthly basis. These type of reports are largely back office type reports, that tend to grow as the business rules change. The type of reports I mean reveal insight and are built up further on top of existing tables and views. This type of gradual evolution is brought about by in-house report writers, who know the schema well. These reports take time to write, as well as time to run. But that time is acceptable because of the advantages that they lend the business.

In contrast, when lots of users need to see top level summaries of data quickly, a document database would be a much better tool for the job. Dashboards headline figures can be useful debugging as well as marketing tools, but they are not the type of reports that are suited to a Sql database. When website response time is important, do not put a report generated from a Sql database on the home page (hopefully we all learned this by now, right?!)

It is easy to get caught up in the momentum and success of the NoSql movement. But it can be difficult to tell what benefits it will bring to your application, and at what cost. Hopefully this article offers you a balanced view. Its fun to be an early adopter, but I would exercise caution before binning the old Sql reporting database in favour of the latest and greatest.