rc3.org

Strong opinions, weakly held

Is ORM an anti-pattern?

This weekend, Tim Bray posted a link to an article at Seldo.com that argues that ORM is an anti-pattern. If you don’t know what ORM is, you probably won’t find this post very interesting, but I’ll provide a brief definition. ORM (object-relational mapping) is a layer of abstraction that stands between your database and code and allows you to treat entities in the database as native objects in whatever language you’re using. The advantage and disadvantage is that it substitutes native calls for SQL.
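To make that definition concrete, here is a toy sketch in Python of what the abstraction amounts to. It is not Hibernate or any real ORM library: the `Account` class, the `save` and `find` helpers, and the table layout are all invented for illustration, and the mapping rule (class name becomes table name, field names become column names) is an assumption I picked to keep it short.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple


@dataclass
class Account:
    # Hypothetical entity; field names double as column names.
    id: int
    name: str
    email: str


def save(conn, obj):
    """Derive an INSERT from the object's fields instead of hand-writing SQL."""
    cols = [f.name for f in fields(obj)]
    table = type(obj).__name__.lower()
    placeholders = ", ".join("?" for _ in cols)
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    conn.execute(sql, astuple(obj))


def find(conn, cls, id_):
    """Load one row by primary key and hand it back as a native object."""
    cols = [f.name for f in fields(cls)]
    table = cls.__name__.lower()
    sql = f"SELECT {', '.join(cols)} FROM {table} WHERE id = ?"
    row = conn.execute(sql, (id_,)).fetchone()
    return cls(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# The calling code only ever touches objects; the SQL is generated.
save(conn, Account(1, "alice", "alice@example.com"))
loaded = find(conn, Account, 1)
```

The calling code never spells out a column list, which is the convenience. The cost is that the generated SQL is now one step removed from the person reading the code.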

The linked article does a good job of going over the downsides of using ORM, but I think it ignores the upsides. The author’s most compelling argument is that you should use SQL appropriately in your application rather than trying to avoid it with ORM:

But in my experience, the best way to represent relational data in object-oriented code is still through a model layer: encapsulation of your data representation into a single area of your code is fundamentally a good idea. However, remember that the job of your model layer is not to represent objects but to answer questions. Provide an API that answers the questions your application has, as simply and efficiently as possible. Sometimes these answers will be painfully specific, in a way that seems “wrong” to even a seasoned OO developer, but with experience you will get better at finding points of commonality that allow you to refactor multiple query methods into one.
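As I read it, a model layer that "answers questions" might look something like the following Python sketch. The class name `TransactionModel`, the method `settings_for_user`, and the two tables are my own hypothetical examples, not anything from the linked article.

```python
import sqlite3


class TransactionModel:
    """Hypothetical model layer: each method answers one question the app asks."""

    def __init__(self, conn):
        self.conn = conn

    def settings_for_user(self, username):
        # One painfully specific query instead of loading user and setting objects.
        rows = self.conn.execute(
            "SELECT s.name, s.value "
            "FROM setting s JOIN app_user u ON u.id = s.user_id "
            "WHERE u.username = ?",
            (username,),
        ).fetchall()
        return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
    CREATE TABLE setting (user_id INTEGER, name TEXT, value TEXT);
    INSERT INTO app_user VALUES (1, 'alice');
    INSERT INTO setting VALUES (1, 'currency', 'USD'), (1, 'locale', 'en_US');
    """
)

model = TransactionModel(conn)
settings = model.settings_for_user("alice")
```

The SQL stays in one place, and the rest of the application sees a plain dictionary rather than rows or mapped entities.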

I think that he undersells the utility of ORM when it comes to simple queries and inserting and updating data. I work on a Web service that processes transactions. Each transaction involves a user logging in and the retrieval of a number of settings from different tables. When I’m done processing a transaction, I have to insert dozens of rows of data into at least half a dozen tables. All of the data that is retrieved and stored is already represented in the object model of my application. Hibernate, the ORM library we use, is remarkably good at pulling up the data when users log in using simple queries, and more importantly, saving all of that data using inserts and in some cases, updates.

Using ORM in the transaction processing context has saved me a massive amount of time over the years, and has helped me avoid the countless stupid bugs that come from manually updating the SQL statements in the data access layer every time I update the object model. Coding that SQL by hand would not improve performance, nor would it make the application any more understandable.

The main downside is that on the few occasions when I do have to write Hibernate code to ask questions of the database, I pretty much have to look up the proper approach every time. It would be much easier in SQL since I already know SQL very well.

The secret is, I think, to use the right tool for the job. The main way we avoid the downsides of ORM is by not using it at all for the reporting part of our application. For reports, we use a data access layer that just uses SQL to access the database in a purely relational fashion. Writing SQL for all of the basic create/read/update/delete operations in our application would be painful, and so would trying to write well-optimized reporting queries through the ORM layer.
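For a sense of what the reporting side looks like, here is a minimal Python sketch of a report written as plain SQL. The `txn` table, its columns, and the `totals_by_account` function are made up for this example and are not our actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txn (id INTEGER PRIMARY KEY, account TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO txn (account, amount_cents) VALUES (?, ?)",
    [("alice", 1200), ("bob", 500), ("alice", 300)],
)


def totals_by_account(conn):
    """Let the database aggregate; no objects are materialized per row."""
    sql = """
        SELECT account, COUNT(*) AS txn_count, SUM(amount_cents) AS total_cents
        FROM txn
        GROUP BY account
        ORDER BY account
    """
    return conn.execute(sql).fetchall()


report = totals_by_account(conn)
```

The grouping and summing happen inside the database, which is the kind of work a relational engine is built for. Doing the same thing by loading every transaction as an object and adding them up in application code would return the same numbers at a much higher cost.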

Categorically rejecting a class of tools that are used productively on thousands of projects is just as silly as picking the wrong tool for the job. Either way, you give yourself more work than is necessary.

4 Comments

  1. I’ve been struggling a lot lately to support a highly modular open-source application which relies heavily on Hibernate along with some other custom ORM-like solutions, and does so in a terribly inefficient manner, so I’ve been thinking about this a lot. I don’t consider myself a developer and I don’t follow arguments and trends about software engineering theory, but I can read the code apparently better than the original developers themselves, and see all the places where they end up pulling in entire tables to have the software pick out the one or two records they’re looking for, or when they call the same uncached query trigger every time through a loop instead of pulling the values ahead of time.

    As for picking the right tool, relational databases may not be the optimal data store for every problem, but they are widely supported by IT departments and hosting services and can accommodate nearly any other kind of data model you want to construct within their notionally relational framework. So it’s no surprise that most projects end up using a relational database, even in those cases when some NoSQL solution would be theoretically superior. Choose the right tool, yes, but often you only have a limited selection.

    So, I think the danger of ORM comes when developers use it as a way to avoid planning their data model appropriately. This risk is amplified when tools exist to generate all the code your model requires without you ever needing to look at it. In the application I have to deal with every day, it’s clear that the developers more often than not used Hibernate to avoid planning a data model appropriate to the relational database they were required to utilize. Instead, the code is written for an idealized perfect object store in which all data retrieval has zero cost and calling functions repeatedly is preferred to storing results in variables. Then over the years since the ideal design was written, hack after hack has been inserted to work around the particularly inefficient areas of the code.

    So yes, ORM can be very useful, but developers who don’t understand (or don’t want to understand) the reality underlying the abstraction are what often make it an anti-pattern in practice.

  2. I have seen ORM horribly misused as well in the manner in which you describe. ORM can save work when it comes to writing SQL, but it is no substitute for understanding SQL or relational databases.

  3. Actually, the guy says “the best way to represent relational data in object-oriented code is still through a model layer”, which to me translates to “write your own ORM”.

  4. ORM is the right kind of solution to the wrong problem. We should be looking at it the other way. SQL is the description language of the relational algebra and the relational algebra is persistence adaptation: neither less nor more. Objects first, adaptation afterwards. The notion that “The Database” is somehow definitive is absurd. It is persistent storage. Safe and efficient persistent storage turns out to be disproportionately difficult, hence the relational algebra, whose intriguing complexity leads us to overestimate it: it is really only a splendid kludge and the transformation that it enables is a lossy transformation. Do we talk about OSM (Object-Streaming Mapping) or OMM (Object-Memory Mapping) the same way we talk about ORM? No, and the only reason is the overestimation of the definitiveness of “The Database”.
