NoSQL Zone is brought to you in partnership with:

I am the founder and CEO of Data Geekery GmbH, located in Zurich, Switzerland. With our company, we have been selling database products and services around Java and SQL since 2013. Ever since my Master's studies at EPFL in 2006, I have been fascinated by the interaction of Java and SQL. Most of this experience I have obtained in the Swiss E-Banking field through various variants (JDBC, Hibernate, mostly with Oracle). I am happy to share this knowledge at various conferences, JUGs, in-house presentations and on our blog. Lukas is a DZone MVB and is not an employee of DZone and has posted 234 posts at DZone. You can read more from them at their website. View Full User Profile

Easy Mocking of Your Database

02.21.2013
| 15010 views |
  • submit to reddit
Test-driven development is something wonderful! Once you’ve established it in your organisation, you will start to:
  • Greatly improve your quality (things break less often)
  • Greatly improve your processes (things can be changed more easily)
  • Greatly improve your developer atmosphere (things are more fun to do)

The importance of doing the right test-driven development is to find a good ratio of what kind of code is to be covered…

  • by automated unit tests
  • by automated integration tests
  • by manual “smoke tests”
  • by manual “acceptance tests”
  • not at all

Finding that ratio can be grounds for heated, religious discussions. I will soon blog about my own opinion on that subject. In this post, however, we will focus on the first kind of test: unit tests.

Unit testing your data access

When databases are involved, people will probably quickly jump to writing integration tests, because all they have to do is create a little Derby, H2 or HSQLDB (or other) test database, and run a couple of data-setup queries prior to the actual test. Their code module will then hopefully not notice the difference to a productive environment, and the whole system can be tested as a blackbox. The advantage of this is that your tests can be written in a way to verify your business requirements, your user stories, or whatever you call them. So far, the theory.

When these database integration tests pile up, it starts to become increasingly difficult to shield them off one another. Avoiding inter-dependencies and at the same time, avoiding costly database setups is hard. You won’t be able to run the whole test-suite immediately after building / committing. You need nightly builds, weekly builds. But unit testing the data access layer isn’t that much easier! Because JDBC is an awful API to mock. There are so many different ways of configuring and executing queries through this highly stateful API, your unit tests quickly become unmanageable.

There are a few libraries that help you with database testing. Just to name a few:

  • MockRunner: This one has some JDBC-specific extensions that allow for simulating JDBC ResultSets, as well as for checking whether actual queries are executed
  • jMock: An “ordinary” Java mocking library
  • mockito: An “ordinary” Java mocking library
  • DBUnit: This one doesn’t mock your database, it’s good for testing your database. Another use-case, but still worth mentioning here

Some of the above libraries will not get you around the fact that JDBC is an awkward API to mock, specifically if you need to support several (incompatible!) versions of JDBC at the same time. Some examples can be seen here:

Mocking the database with jOOQ

When you’re using jOOQ in your application, mocking your database just became really easy in jOOQ 3.0. jOOQ now also ships with a Mock JDBC Connection. Unlike with other frameworks, however, you only have to implement a single functional interface with jOOQ, and provide that implementation to your MockConnection: The MockDataProvider. Here’s a simple implementation example:

MockDataProvider provider = new MockDataProvider() {

    // Your contract is to return execution results, given a context
    // object, which contains SQL statement(s), bind values, and some
    // other context values
    @Override
    public MockResult[] execute(MockExecuteContext context) 
    throws SQLException {

        // Use ordinary jOOQ API to create an org.jooq.Result object.
        // You can also use ordinary jOOQ API to load CSV files or
        // other formats, here!
        DSLContext create = DSL.using(...)
        Result<MyTableRecord> result = create.newResult(MY_TABLE);
        result.add(create.newRecord(MY_TABLE));
// Now, return 1-many results, depending on whether this is // a batch/multi-result context return new MockResult[] { new MockResult(1, result) }; } }; // Put your provider into a MockConnection and use that connection // in your application. In this case, with a jOOQ Executor: Connection connection = new MockConnection(provider); DSLContext create = DSL.using(connection, dialect); // Done! just use regular jOOQ API. It will return the values // that you've specified in your MockDataProvider assertEquals(1, create.selectOne().fetch().size());

The above implementation acts as a callback for JDBC’s various executeXXX() methods. Through a very simple MockExecuteContext API, you can thus:

  • Get access to the executed SQL and bind values (Use general jOOQ API to inline bind values into the SQL statement)
  • Distinguish between regular SQL statements and both single-statement/multi-bind-value and multi-statement/no-bind-value batch executions
  • Return one or several results using jOOQ’s org.jooq.Result objects (which you can easily import from CSV, XML, JSON, TEXT formats)
  • Return “generated keys” results through the same API
  • Let jOOQ’s MockStatement take care of the serialisation of your mock data through the JDBC API

There is also an experimental implementation of a MockFileDatabase, a text-based mock database that uses the following format:

# This is a sample test database for MockFileDatabase
# Its syntax is inspired from H2's test script files

# When this query is executed...
select 'A' from dual;
# ... then, return the following result
> A
> -
> A
@ rows: 1

# Just list all possible query / result combinations
select 'A', 'B' from dual;
> A B
> - -
> A B
@ rows: 1

select "TABLE1"."ID1", "TABLE1"."NAME1" from "TABLE1";
> ID1 NAME1
> --- -----
> 1   X
> 2   Y
@ rows: 2

MockFileDatabase implements MockDataProvider, so it’s dead-simple to provide your unit tests with sample data. Future versions of jOOQ will allow for:

  • Regex pattern-matching SQL statements to provide mock results
  • Load these results from other formats, such as jOOQ’s supported export formats
  • Specify the behaviour of batch statements, multi-result statements, etc.

Using jOOQ’s MockConnection in other contexts

Things don’t stop here. As jOOQ’s MockConnection is the entry point for this mocking sub-API of jOOQ, you can also use it in other environments, such as when running JPA queries, Hibernate queries, iBatis or just your plain old legacy JDBC queries.

jOOQ has just become your preferred JDBC mock framework! ;-)




 

Published at DZone with permission of Lukas Eder, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Paul Campbell replied on Mon, 2013/02/25 - 12:27pm

Mocking an SQL database is full of pitfalls IMO.

Firstly the establishing the syntactic correctness of the SQL query requires a full SQL parsing engine.

Secondly the semantic correctness cannot be established without an actual schema and data to operate on.

Thirdly, the mapping of queries to results is not 1:1 i.e. there are ,generally, a great many possible queries that yield the same result for example with a join and unless you identify all correct variants then your test is prone to false negatives.



Lukas Eder replied on Tue, 2013/02/26 - 4:03am in response to: Paul Campbell

It is. In most cases that you've listed, integration tests against an actual database are better suited. But sometimes, you really want a unit test. Then, you really should mock a database, to increase execution speed and decrease test dependencies.


Of course, nothing keeps you from delegating query execution from jOOQ's MockDataProvider to the actual database, intercepting only some query configurations...

Paul Campbell replied on Tue, 2013/02/26 - 8:39am

Well usually one has a data/DAO layer that broadly speaking takes an abstracted query and returns some domain objects and this decouples the code dealing with the database from everything else.

view sourceprint?1.List <Addresses> recentAddresses = myDataLayer.findAddressesForUserSince(userid, date);

Inside this data layer I need to test against a real DB for all the reasons previously discussed.

Outside it I mock it the data layers interface and return programmatically created domain objects.

I'm just not seeing where this idea of mocking at the SQL level is useful.



Aaron Digulla replied on Wed, 2013/02/27 - 7:28am

Paul,

Your approach works well for simple/dumb DAOs (the "select * from user where name like"-kind).

As soon as you have more complex code in the DAOs, this quickly becomes an obstacle. Some examples:

  •  I have tables which have non-null FK relations to 15 other tables. For my test, I don't need those 15 other tables but I have to either patch my DDLs for this test (= set up an individual database for this test) or I have to load test data into tables that I don't care about but which I do have to maintain.
  • When you have complex computations in your DAO, then you will have to have many different test cases in your database (at least one for each path in my code, sometimes more to test all corner cases).

So my usual approach works like this:

  1. I have SQL that I want to test, so I run that against the real DB. This checks the syntax. For all other tests, I can assume that the SQL is OK -> I don't need to check the syntax again
  2. Some complex queries need to be run so I can be sure they return the correct set of results. Again, I need to run them against a real DB but I don't have to execute it again for any test that works on the results.
  3. There are many cases where I need to check complex code in my DAOs. Loading all the necessary test data would be huge waste of time. It would also mean that each test depends on many external files or huge Java strings with SQL.
To understand this better, most of my data structures are revisioned trees. SQL really isn't very suited for this.

So I need to make sure that the SQL generators produce correct queries but I just need to check the syntax once for this. Afterwards, I can just toString() the query and compare the strings.

See also: Testing With Databases 

Oleksandr Alesinskyy replied on Wed, 2013/02/27 - 8:14am in response to: Lukas Eder

One of the worst ideas I ever seen - your mocks represent your understanding of the database, which may be pretty much incorrect. So such test would test nothing (especially if queries/DML statements are simple CRUD statement operating on one object, identified by its key, per time).


Paul Campbell replied on Wed, 2013/02/27 - 8:21am in response to: Aaron Digulla

 My example was purposefully trivial but the general approach works perfectly well for complex systems. In such systems I would always avoid complex logic inside the data layer, if I found myself needing to have complex code to build queries I would look at whether I needed to represent the data differently, or use an ORM (or even a use compeletly different method of storage). This has never yet been an issue for me though despite having used this approach on dozens of significant commercial projects.

"I have tables which have non-null FK relations to 15 other tables. For my test, I don't need those 15 other tables"

"To understand this better, most of my data structures are revisioned trees. SQL really isn't very suited for this."

OK given this I can see why you found your approach useful and thats cool but your article is pitched as a general strategy for SQL database backed applications and that is what I am taking issue with.

"So I need to make sure that the SQL generators produce correct queries but I just need to check the syntax once for this. Afterwards, I can just toString() the query and compare the strings."

I can see why your doing this given the above circumstances but be aware that you are writing tests that tell you that something has changed, which isnt the same as telling you that something is broken. Over time this can lead to a fragile test suite that hampers certain types of refactoring.

In closing, I thing your article should have given some context of the specifics of the problem you were trying to address.





Lukas Eder replied on Wed, 2013/02/27 - 8:43am in response to: Oleksandr Alesinskyy

The actual mock test API lets you introspect bind values and act accordingly. You don't have to hard-code IDs or any other values in your tests...

Oleksandr Alesinskyy replied on Wed, 2013/02/27 - 9:09am in response to: Lukas Eder

And what? 

Lukas Eder replied on Wed, 2013/02/27 - 9:41am in response to: Oleksandr Alesinskyy

Hmm, I'm not sure where you're trying to get this discussion to. How I understood you, you're afraid that mock setups would be closely coupled to single records in CRUD operations, whereas this API will allow you to intercept any type of SQL statement (including CRUD operations) by implementing a MockDataProvider and introspecting bind values that may indicate what IDs / keys / etc were being manipulated by the CRUD operation.

Note, that a MockDataProvider is just a tool to inject results for well-defined "to-be-expected" queries. The test still has to be written, no matter whether you want to test the SQL-layer, data-access layer, service-layer, etc...

It's not an everyday use-case, but it can be useful in some cases where integration tests aren't the right tool...

Oleksandr Alesinskyy replied on Wed, 2013/02/27 - 10:55am in response to: Lukas Eder

No, you completely misinterpret my point - it was that there is no reasonable way to mock anything save simplest DP operation (select by id, update by id, insert a single record and so on).

And even these "simplest" operation cannot be mocked reasonably - let assume a case when child and its parent are inserted in the wrong order (so FK would be violated).

Yes, MockDataProvider may be useful in some albeit very rare cases. In most cases it would be easy enough to test against an real DB (and such tests may be fast enough, so it is not a concern in most cases).

One more thing - it is not a good (as for me) to put a lot of logic to the data access layer (it should be kept either at business layer or in a SQL itself). If such logic is necessary it would be better to isolate it into the method and/or classes that do not need DB to be tested.

Florin Jurcovici replied on Wed, 2013/02/27 - 11:26am in response to: Lukas Eder

I wouldn't say so. I'd either mock the DAO directly, and not mess with the SQL which gets sent to the underlying database in the classses I unit-test, or use a real, for-test database, with a setup and destroy script to properly initialize the database and clean up afterwards (for example when unit-testing DAOs).

A relational database is a complex beast, if used for what it's good at. Your tests are either irrelevant, if this complexity is avoided/short-circuited in tests, or your app's use of a relational database makes no sense, and you should use a lighter storage mechanism (CSV files, serialized Java objects, some NoSQL database, XML files, whatever fits your usage pattern). I can't think of any other case in which mocking the database would actually be easy enough to justify not using a real database. At least that's what I've learned from experience.


Lukas Eder replied on Wed, 2013/02/27 - 11:48am in response to: Florin Jurcovici

Here's a simple use-case:Intercept 1-2 queries that provide data you'd like to patch, delegate all other queries to the actual database.Agreed, this is no longer unit testing, but a special case in integration testing. But it can still be good to be able to patch some data on a SQL level, instead of having to manipulate the actual database...

Florin Jurcovici replied on Wed, 2013/02/27 - 3:05pm in response to: Lukas Eder

That opens up a whole different can of worms. I generally think it's bad to use the same language for writing both the production code and the integration tests. A language like Java or C# is IMO/IME a bad fit for writing integration tests, or at least a sub-optimal fit. Mocks provided by a Java lib would be a bad fit for integration tests written in JavaScript running on nodejs (for testing web services), Perl (for testing a CLI), some proprietary testing tool (for testing a native GUI app) or Selenium (for testing a browser app).


Oleksandr Alesinskyy replied on Wed, 2013/02/27 - 4:43pm in response to: Lukas Eder

this use case is not simple but rather very exotic. Honestly, I am hardly able to imagine a situation in which this use case would be of any use.

Aaron Digulla replied on Fri, 2013/03/01 - 8:10am

A lot of the discussion here can be summarized as "I don't understand what you try to achieve because I don't work that way."

That's OK. There are people who can write code in Lisp and there are those who can't. Some people work better with vi and others with emacs. That's the way the world works.

BUT

Something isn't a bad idea just because you don't see an application. In our code base, we spend a huge amount of time creating a dozen test databases - no, we can't do all the tests in one huge database. I've seen projects which copy the production database into the test system every week as if they wanted to make sure nobody knows what gets tested and what won't.

In many of my tests, I don't want to test JDBC, the database driver, the test database setup code, that Java can read SQL from 50 files and convert that into SQL statements, that the database can be started and that SQL works.

Most tests need to run a query once against the database (something that an IT build system can during the night). In the tests that I run as a developer during the day, I only need to test all the things above when I make a change in this code.

In all the other cases, testing the database is something that the database vendor should have been done and not me, twenty times each day.

Or to put it another way: A database is a fancy global variable. And global variables should be avoided (and I say "should", not "must be at all cost").

Paul Campbell replied on Fri, 2013/03/01 - 9:45am in response to: Aaron Digulla

"Something isn't a bad idea just because you don't see an application."

  • Yes but the thing is you kicked off the article outlining a development process with the implication what you were about to present was the "proper" way to do automated testing in SQL DB based systems in general in a agile setting. So yes I'm afraid that, pitched as a such a generally desirable approach (rather than to address your specific problems) it is IMO a bad idea.

"That's OK. There are people who can write code in Lisp and there are those who can't. Some people work better with vi and others with emacs. That's the way the world works."

  • This isn't just some "you like tea, I like coffee" matter of taste: your approach has some serious shortcommings as a general approach which myself and other responders have already elaborated.

"In all the other cases, testing the database is something that the database vendor should have been done and not me, twenty times each day."

  • Neither is it your job to test the complier, the various libraries you use, the operating system, the machine hardware etc etc but all of these things, like the database, define the effective set of environmental constraints within which your code should be continually proven to work.

"Most tests need to run a query once against the database (something that an IT build system can during the night). In the tests that I run as a developer during the day, I only need to test all the things above when I make a change in this code."

  • So if someone decides to change the database structure or change the way it behaves in some way (say changing some contraints) a whole day will elapse before they find out they broke all your code ... I just ran the junit DAO test suite on our project which has nearly 500 tests and it took all of 9 seconds running against an in-memory SQL engine i.e. no dependencies on anything outside my IDE and we're using hibernate so it would probably by quicker if we were using raw SQL.

"Or to put it another way: A database is a fancy global variable. And global variables should be avoided"

  • But If you have such global state then it is all the more important to test anything that manipulates and/or depends on that state especially when the statefulness is very complex, as it is with a database.


Oleksandr Alesinskyy replied on Fri, 2013/03/01 - 10:51am in response to: Aaron Digulla

A lot of the discussion here can be summarized as "I don't understand what you try to achieve because I don't work that way."

Rather because "it does not work that way" - fell the difference.


Something isn't a bad idea just because you don't see an application.
There are no such thing as "a bad idea" or "a good idea" unless its context is provided. You failed to provide an realistic context in which your idea is good (at least you fallen short to persuade me and some other readers in it.

In our code base, we spend a huge amount of time creating a dozen test databases - no, we can't do all the tests in one huge database.
I do not know your project so cant comment on it but in most cases it is not so big effort to create a test database and bring it to know state at the test start. And this database does not need to be huge. As a test execution may be done fast enough (at least in most cases). In our current project we have several hundred tests executing against an real (no in-memory engines) test database - but a build cycle is around 1 minute, which looks perfectly acceptable.

In many of my tests, I don't want to test JDBC, the database driver, the test database setup code, that Java can read SQL from 50 files and convert that into SQL statements, that the database can be started and that SQL works.

In your build you anyway are  testing that JVM may be started, your build tool (Maven, Ant, whatever) may  be started and is able to read your build files, compile your source files, ... Why you believe that a database is different?

Most tests need to run a query once against the database (something that an IT build system can during the night). In the tests that I run as a developer during the day, I only need to test all the things above when I make a change in this code.

Or e.g. somebody made a change in the DB structure.

In all the other cases, testing the database is something that the database vendor should have been done and not me, twenty times each day.
We are speaking not about t4esting of an database engine - we are speaking about testing your code expressed as SQL (believe it or not SQL statements are part of your source code and your application logic). And your code that generates those statements (if they are generated dynamically).

Or to put it another way: A database is a fancy global variable. And global variables should be avoided (and I say "should", not "must be at all cost").
If you consider a database just as  a fancy global variable (that means just as a dumb storage) then you suffer from a severe form of DYI  (Do It Yourself) and or FUD (Fear, Uncertainty, Doubts - in relation to databases) syndromes.


Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.