Not Rocket Science

Thoughts on OR/M

There is an updated post, written in August 2016: https://stum.de/2016/08/24/thoughts-on-orms-2016-edition/

It's hard to imagine a world without OR/Ms anymore. The old days of making a query, getting a reader, looping through it, getting every column into an object...

I'm a Linq2Sql user. I like it's relatively easy setup and good flexibility, and it has a really good query generator. However, I'm running into some issues with Unit Testing it, because you can't really emulate all the features that it has (EntityRefs with a composite Key or fields that are DbGenerated are a nightmare). Since it is a one-off technology for SQL Server, it doesn't work with any other database (not even SQL Compact 4, which sucks as it would be the perfect Unit Test drop in).

Some people are saying that OR/Ms are an anti-pattern, to which Ayende had an interesting reply. I don't fully agree with any of them. I think that OR/M is solving two different problems, one that it shouldn't solve.

The first problem is getting data from and to the database - the second problem is the mapping of the result to objects. The second problem is a good one, the first one is not a problem and thus doesn't require a solution.

Somehow, people seem to frown SQL, as if it were some evil or outdated concept. You know what? SQL is the language of your database, so you better learn it. Oh sure, it's nice to chain a gazillion Where/Select/ToDictionary/GroupBy calls together and don't have to worry about coming up with the SQL yourself. It's not so nice that the chance to produce a SELECT N+1 is pretty high. When was the last time you looked at the SQL that your OR/M generates and optimized it?

Speaking of optimizations: When was the last time your DBA improved your OR/M Code? Most DBAs I know speak one language very well, and that's the SQL Dialect of their database. They know these hidden tricks and tweaks, they can tweak the query plan and come up with ways to speed up your queries, thus lowering the load and therefore the cost of the system. I don't know many DBAs who would look at a piece of C# code and suggest moving the select outside of the foreach loop in order to turn a SELECT N+1 into a single select. They know the difference between the different JOIN Types (I use LEFT JOIN 99.99% of the time - who knows if there are better ways?) and are able to suggest putting in a READ UNCOMMITTED into a query that is okay to retreive slightly stale data. When was the last time you tweaked the TABLE LOCK setting on your OR/M query?

Granted, most developers don't seem to have access to DBA people and thus have to wear a DBA hat quite often. But you know what? If you have to know about DB stuff anyway, you can as well learn SQL. Even though you may not get everything highly optimized, in the moment a DBA looks at it and optimizes stuff you just have to change your query and call it a day. Maybe you two decide to create a Stored Procedure instead, in which case your SELECT becomes an EXEC - big deal, not.

Yes, I am aware that some queries can get pretty harsh. I have a query with 14 subselects (mainly because it has to do paging on the database) and I am glad that I can write it in Linq2Sql, creating an IQueryable and passing it through a dozen methods that apply more Where clauses based on some ridiculously huge filter class a user passes in. Not having to generate that through StringBuilders or other stuff is a great thing.

But you know what? That query is the one anomaly in a sea of standard CRUD queries. I can write those CRUD queries in my sleep, blindfolded. And I can look at them and immediately understand what this thing is doing. I can read up the T-SQL manual, run it in SQL Management Studio while the SQL profiler is running, look at the Execution plan, tweak the query, put it on Stack Overflow to get some input - I am talking to the DB in a language it understands and have no limitations in how I tweak it.

Really, all I want is the Mapping. I want to write a (parameterized) SQL query, but I don't want to do the ExecuteReader/foreach loop crap, because that's stupid work. I want to be able to insert a Parent Row and it's child rows in one go, without having to use client-side generated GUIDs for the primary key so that I can add it to the child rows. I love the EntitySet/EntityRef stuff in Linq2Sql, and that's what I want. Create an Object and it's children, put it in the database and not worry about capturing the @@IDENTITY and populating it on the children. I don't want an OR/M, I want an Object Mapper.

Luckily, there seems to be a small (but truly existing) movement into it, Dapper and Massive being two of the better known ones. Massive uses the new dynamic functionality in .net 4, a technology that could revolutionize OR/Ms once they start embracing it (rather than trying to stick to their Java roots or stay compatible with .net 2.0). I don't fully like how I have to create "empty" classes and it seems that it doesn't do "real" SQL but rather a DSL that looks like SQL. That would kinda suck.

Dapper is currently used as the OR/M for Stack Overflow, It's very bare bones (doesn't manage your connection, no UPDATE helpers), has some not-so-nice syntax quirks (Having to pass IsAnsi = true when querying a non-unicode - but that's an edge case I'd argue) and although I have not yet had a chance to use it, if Jeff Atwood is willing to use it on the thing that puts food on his table and rock band songs onto his Xbox, I'd assume it's stable.

Granted, one can argue that you're not paying your developers to write code that can be written faster in an OR/M, a claim I'd like to dispute when looking at the total lifetime of an application including maintenance, but if you want a catchy, quotable sentence: Traditional OR/M is like WebForms. It's rapid at first, does stuff just well, may even have a fancy GUI to drag/drop your tables. But just like WebForms limits you tweaking the genrated markup or core functionality ("I need my Data Grid to render the headers in a <thead> so that the jQuery TableSorter plugin works") and sometimes requires some hacks ("I know, I just put some javascript that wraps the generated table header rows into a <thead>"), an OR/M will put up some walls that you can't climb as well.

I think it's time for an OR/M that is to other OR/Ms what ASP.net MVC is to ASP.net WebForms: A Framework that allows and requires us to get deep into the guts, but does not limit the tweaking we can do, while still handling a lot of the stupid work, and I think it's great to see projects move into that direction. Just as MVC and WebForms both sit on top of an ASP.net core stack, ADO.net serves as a rock solid foundation for multiple philosophies of OR/M systems that can happily co-exist and serve both the people able and willing to write SQL and the people who'd rather spend money on an OR/M, profiling tools, and maintenance costs.