Comparing ORMs

by jmorris 26. December 2009 09:33

I am currently using Linq, EF, ADO.NET and NHibernate (trying to, at least) in various projects. Here are my current feelings about each:

  • Linq2Sql - easy, works, obsolete, lacks many features...
  • EF - easy, fast, low barrier to entry...the Linq tooling is better than in v3.5, where it was pretty much useless...v4 seems better
  • ADO.NET - simple, easy, no hidden "gotchas", repetitious, flexible...in most all cases requires framework built around it to scale developer-wise
  • NHibernate - flexible, complex, high barrier to entry, lots of hidden "gotchas" regarding performance, mappings, etc., poor (IMO) query API (syntax), outdated documentation, dedicated community albeit scattered

When one says NHibernate is missing VS integration, I don't think it's the GUI modeler that they are missing. It's the fact that they can't point at a database and hit "GO"...I think model-first is important, but not everyone works that way.

Low-friction prototypes are often the start of more complex finished products. We don't need complexity in prototypes; we tailor our finished work to our problem domain - this is where we tweak the API (away from the GUI) to fit the specific requirements of our problem. The ability to quickly create prototypes from the IDE is incredibly important. Without IDE integration, creating these prototypes is difficult...friction.

The thing that an MS solution offers is IDE integration, which you do not find in the NHibernate suite. If I could go to the NHForge.com site and download a VS plug-in, point it at a db, and then start playing...I would be much more inclined to use NHibernate in a project.

This post was motivated by this post...


Overly Complex Solutions to Simple Problems: Take 1 - Local Time to GMT Conversions

by jmorris 11. December 2009 22:41

Generating Data Transfer Objects with Separate Files from a Schema Using T4

by jmorris 3. December 2009 04:21

T4, or Text Template Transformation Toolkit, is a template-based solution for generating code that is built into VS2008 (it's also available as an add-in for VS2005). Alas, it has minimal support in VS2008 in that there is no Visual Studio template for adding a T4 file - you cannot just right click Add > New Item and add a T4 file to your project. However, you can create a new file and change the extension to ".tt" and VS will know that you are adding a T4 file to the project (after a security prompt asking if you really want to add a T4 file) and create the appropriate T4 template and .cs code-behind file that accompanies each .tt file. For a detailed explanation of how to do this, please see Hanselman's post here.

T4 templates are pretty cool once you get the hang of some of the nuances of the editor, which is somewhat lacking in features (reminds me of using Notepad to write Assembly code in college). There are in fact at least two VS add-ons that add some degree of intellisense and code/syntax highlighting: Clarius Visual T4 and Tangible T4 Editor for VS. They both offer free developer editions with limited functionality if you just want to get a feel for what they can do without forking out the cash.

Out of the box, T4 templates have one glaring weakness: only one template (.tt) file can be associated with one output file (.cs, et al). This is not really ideal in that we typically associate one source code artifact (class, sproc, etc.) with its own file. This makes it easier to grok, manage, and read a project's source, and also makes versioning easier using version control software such as Git or Subversion. It's much easier to track changes to a single class in a file than multiple classes in a file. With a little work and a little help it is possible to generate multiple source files from template files, however.

So, T4 aside, what are Data Transfer Objects (DTOs) and why do we need them? DTOs are exactly what they propose to be: objects containing data, typically corresponding to a single record in a database table. In my opinion, they are anemic in that they contain NO behavior whatsoever. They are the data...contrast this with domain objects, which are data and behavior. A very typical scenario involving DTOs is one where data must be moved from one part of the system to another, where it is consumed and used, likely by domain objects. For instance, we may use an ORM such as NHibernate or Entity Framework to access the database, bring back a set of data and then map it to a DTO.

In many cases your DTOs are mapped directly to your database schema in that there is a one-to-one mapping between table columns and object fields or properties. In this situation, manually creating an object per entity becomes tedious and scales poorly in terms of developer productivity. This is where a code generation solution, such as one created with T4, really shines.

For example, given the following schema, generate a DTO for each entity:

The first step is getting enough information about the tables from the database metatables so that you can generate objects for each table. Assuming you are mapping to pure DTOs, you can ignore any of the relationships in the form of 'Has A' in your objects. It's not that these relationships do not exist; they do, just not explicitly. Instead you maintain the relationships through properties that represent the foreign keys between the objects. An obvious benefit to this sort of convention is that you immediately resolve the potential n+1 problems inherent in ORMs and lazy loading, at the expense of some other ORM features, such as object tracking. IMO ignoring these relationships is a personal preference as well as an architectural concern; for this example I am ignoring these relations.
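As an illustration of the target shape (the table and column names below are hypothetical, not the schema from the diagram), such a DTO keeps the foreign key as a plain scalar property rather than a reference to another object:

    using System;

    // Hypothetical example of the kind of DTO the templates will generate.
    // Table and column names (Article, AuthorId, etc.) are illustrative only.
    public class ArticleDto
    {
        public int ArticleId { get; set; }          // primary key
        public string Title { get; set; }
        public string Body { get; set; }
        public DateTime DatePublished { get; set; }
        public int AuthorId { get; set; }           // foreign key kept as a scalar, not an AuthorDto reference
    }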

The following sproc is an example of how to get this data from a database; in this case I am using MS SQL Server 2008:



This stored procedure returns a record describing the table and each column of the table, or at least the relevant parts: name, data type, and whether or not the column is a primary key. This sproc is called from a T4 template to load a description of each table and its columns into memory. Here is the relevant code:
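A rough sketch of the idea (not the post's original sproc or template code): query the catalog for table, column, data type, and primary-key information and load it into simple descriptor classes. This version reads the INFORMATION_SCHEMA views directly instead of calling a stored procedure, and the class and member names (TableInfo, ColumnInfo, SchemaReader.GetTypes) are assumptions.

    using System.Collections.Generic;
    using System.Data.SqlClient;

    // Simple in-memory description of a table and its columns (names are illustrative).
    public class ColumnInfo
    {
        public string Name { get; set; }
        public string SqlType { get; set; }
        public bool IsPrimaryKey { get; set; }
    }

    public class TableInfo
    {
        public TableInfo() { Columns = new List<ColumnInfo>(); }
        public string Name { get; set; }
        public List<ColumnInfo> Columns { get; private set; }
    }

    public static class SchemaReader
    {
        // Loads table/column metadata from INFORMATION_SCHEMA (the post uses a stored procedure instead).
        public static IList<TableInfo> GetTypes(string connectionString)
        {
            const string sql = @"
                SELECT c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE,
                       CASE WHEN k.COLUMN_NAME IS NULL THEN 0 ELSE 1 END AS IS_PK
                FROM INFORMATION_SCHEMA.COLUMNS c
                LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE k
                       ON  k.TABLE_NAME = c.TABLE_NAME
                       AND k.COLUMN_NAME = c.COLUMN_NAME
                       AND OBJECTPROPERTY(OBJECT_ID(k.CONSTRAINT_NAME), 'IsPrimaryKey') = 1
                ORDER BY c.TABLE_NAME, c.ORDINAL_POSITION";

            var tables = new Dictionary<string, TableInfo>();
            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        string tableName = reader.GetString(0);
                        TableInfo table;
                        if (!tables.TryGetValue(tableName, out table))
                        {
                            table = new TableInfo { Name = tableName };
                            tables.Add(tableName, table);
                        }

                        table.Columns.Add(new ColumnInfo
                        {
                            Name = reader.GetString(1),
                            SqlType = reader.GetString(2),
                            IsPrimaryKey = reader.GetInt32(3) == 1
                        });
                    }
                }
            }
            return new List<TableInfo>(tables.Values);
        }
    }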



In order to generate separate files for each artifact generated, we will be using three separate T4 templates: GenerateEntities.tt, Entity.tt, and MultiOutput.tt. GenerateEntities.tt contains the code above as well as another code block in which it loops through the results returned from GetTypes() and uses the Entity.tt template to generate the artifact. MultiOutput.tt takes the code generated by GenerateEntities.tt and Entity.tt, writes the output file to disk, and adds the file to Visual Studio. Note that MultiOutput.tt comes from one of many excellent posts by Oleg Sych and that an updated version of the file is purported to be available in the T4 Toolbox up on CodePlex.com.




The code above loops through the table definitions and uses remoting to set a property on the Entity.tt template with each value. Finally, the MultiOutput.tt ProcessTemplate(...) method is called, which writes the output of Entity.tt to disk and adds the file to Visual Studio. Entity.tt is pretty straightforward:
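The original GenerateEntities.tt and Entity.tt code isn't reproduced here. As a rough sketch of the same idea, expressed as plain C# rather than T4 markup and reusing the TableInfo/ColumnInfo/SchemaReader sketch above: loop over the table metadata and emit one DTO class per table, each into its own file. The per-entity rendering that Entity.tt performs is reduced to inline string building, and MultiOutput.tt is replaced by a direct File.WriteAllText, so this is a simplification rather than the post's actual approach.

    using System.IO;
    using System.Text;

    public static class EntityGenerator
    {
        // Sketch of the GenerateEntities.tt loop: one generated .cs file per table.
        public static void Generate(string connectionString, string outputDirectory)
        {
            foreach (TableInfo table in SchemaReader.GetTypes(connectionString))
            {
                var code = new StringBuilder();
                code.AppendLine("public class " + table.Name + "Dto");
                code.AppendLine("{");
                foreach (ColumnInfo column in table.Columns)
                {
                    code.AppendLine("    public " + ToClrType(column.SqlType) + " " + column.Name + " { get; set; }");
                }
                code.AppendLine("}");

                File.WriteAllText(Path.Combine(outputDirectory, table.Name + "Dto.cs"), code.ToString());
            }
        }

        // Minimal SQL-to-CLR type mapping; a placeholder, not the post's mapping.
        private static string ToClrType(string sqlType)
        {
            switch (sqlType)
            {
                case "int": return "int";
                case "bit": return "bool";
                case "datetime": return "System.DateTime";
                case "uniqueidentifier": return "System.Guid";
                default: return "string";
            }
        }
    }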



When the solution is saved, the T4 templating engine will process the templates, and a file representing a DTO for each table in the database will be added to Visual Studio under the GenerateEntities.tt file.



References:

  1. http://www.hanselman.com/blog/T4TextTemplateTransformationToolkitCodeGenerationBestKeptVisualStudioSecret.aspx
  2. http://www.codeplex.com/t4toolbox
  3. http://www.olegsych.com/2008/03/how-to-generate-multiple-outputs-from-single-t4-template/


Fluent NHibernate Mapping Identity Columns to Properties

by jmorris 1. December 2009 06:37

I decided to jump into the NHibernate lovefest and use it in an upcoming project that I am planning right now. I have been following the NHibernate project for some years, but never actually committed to using it in a project because, frankly, it was a bit intimidating in size and complexity. Now, of course, this was my biased assumption and boy was I wrong! The new Linq 2 NHibernate and Fluent NHibernate APIs are awesome and relatively simple to get up and running.

Although I still have some reservations about the completeness and performance of Linq 2 NHibernate, Fluent NHibernate seems to be pretty mature. Additionally, the Fluent NHibernate community is robust, friendly, and was very quick to lend a hand when I ran into some trouble with AutoMapping.

AutoMapping is a convention-based feature of Fluent NHibernate in which, with very little configuration, you can map your entire schema to your domain model. This feature is a tremendous time saver, and gives the illusion of "it just works"! As awesome as AutoMapping is, there are certain situations where it will choke. In these cases you must add a little "help" to make the mapping work correctly. Take the following table:

 


Just your basic Role table, but notice how the primary key column is named [entity name]+Id: RoleId? This is a convention that I use for naming the primary keys of all tables I create. It is simple, easy to understand, and works! Now here is the domain model object that it maps to:
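Roughly, the shape being described is this (a sketch, not the post's exact code - the Entity base class details and the Name property are assumptions):

    // Sketch of the domain model described below: the Id lives on a common base class,
    // while the table's primary key column is named RoleId.
    public abstract class Entity
    {
        public virtual int Id { get; set; }   // should map to the [entity name]+Id column
    }

    public class Role : Entity
    {
        // Note: no RoleId property here; the inherited Id must map to the RoleId column.
        public virtual string Name { get; set; }   // illustrative column
    }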

 


Notice that the domain object does not have a field called RoleId? Instead we have another field in our base class called Id. Now, seeing how AutoMapping relies on convention (namely naming conventions) to map entities, how does Fluent NHibernate map this with AutoMapping? Well, unfortunately it can't:

 



However, with a little help from the Conventions API, we can easily resolve this mapping with a minimal amount of code. What are Conventions, you might ask? According to the Fluent NHibernate documentation:

"Conventions are small self-contained chunks of behavior that are applied to the mappings Fluent NHibernate generates. These conventions are of varying degrees of granularity, and can be as simple or complex as you require. You should use conventions to avoid repetition in your mappings and to enforce a domain-wide standard consistency."

Conventions are a set of base classes and interfaces that, when implemented, allow you to override the default AutoMapping behavior. Pretty sweet.

Ok, so how did I resolve the mapping exception above? First, the Convention implementation:
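A minimal sketch of such a convention, assuming Fluent NHibernate's IIdConvention interface (namespaces and member names can vary slightly between Fluent NHibernate versions):

    using FluentNHibernate.Conventions;
    using FluentNHibernate.Conventions.Instances;

    // Maps every entity's Id property to a column named <EntityName>Id, e.g. Role.Id -> RoleId.
    public class PrimaryKeyNamingConvention : IIdConvention
    {
        public void Apply(IIdentityInstance instance)
        {
            instance.Column(instance.EntityType.Name + "Id");
        }
    }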


And finally, the configuration with the Fluent NHibernate AutoMapping API:
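Again as a sketch rather than the post's exact code - the connection string, the assembly used for automapping, and the exact fluent calls are assumptions based on the Fluent NHibernate 1.0-era API:

    using FluentNHibernate.Automapping;
    using FluentNHibernate.Cfg;
    using FluentNHibernate.Cfg.Db;
    using NHibernate;

    public static class SessionFactoryBuilder
    {
        // Wires the naming convention into the automapping configuration.
        public static ISessionFactory Build(string connectionString)
        {
            return Fluently.Configure()
                .Database(MsSqlConfiguration.MsSql2008.ConnectionString(connectionString))
                .Mappings(m => m.AutoMappings.Add(
                    AutoMap.AssemblyOf<Role>()                        // assembly containing the domain model
                           .Conventions.Add<PrimaryKeyNamingConvention>()))
                .BuildSessionFactory();
        }
    }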





Decisions, Decisions, Decisions...NHibernate, Linq 2 Sql, or Entity Data Model 4.0?

by jmorris 13. November 2009 08:29

I have a couple of projects coming up and I am trying to decide on a persistence framework to use. I have pretty much narrowed the scope down to NHibernate and Fluent NHibernate, Linq2Sql, or Entity Framework 4.0 (they're really on 4.0? really?). So far here are my 'superficial' thoughts and feelings about each (note that until I actually commit to one I won't truly understand its benefits and/or limitations):

  1. NHibernate/Fluent Nhibernate
    1.  Yeas
      1. Cleanly separates persistence layer from domain layer - you can pretty much ignore the database
      2. Strong adoption amongst the 'in-crowd' - many, many Alt.Net and .NET bloggers/developers profess to its prowess (Ayende, etc.)
      3. Supposed strong community
      4. Very configurable - lots of OSS extensions, etc.
    2. Nays
      1. Lack of centralized ownership (IMO) - no single contributor/owner, lots of work outside the main trunk
      2. Documentation is all over the place; some of it is not written clearly, or skips steps and presupposes some knowledge of NHibernate or one of its dependencies
      3. No support from any of the larger software dev companies
      4. From the main SourceForge site I don't see much activity; I see more spam that has _not_ been removed than new members.
      5. First implementor syndrome?
  2. Linq2Sql
    1. Yeas
      1. Super simple, limited functionality ORM
      2. Used it, know it, no surprises
      3. Tons of documentation, blog entries, SEO fodder content...everyone has had a turn 
      4. Write code while others configure...
    2. Nays
      1. Obsolete - MS is moving to Entity Framework
      2. Minimalist ORM - you end up adding additional features (such as caching) - this could be a strong point as well
      3. Poor performance with many concurrent users until you optimize (which is relatively easy: compiled queries, etc)
  3. Entity Framework 4.0 
    1. Yeas
      1. Strong MS commitment (or so it seems, unless they have another MS Research project that cannibalizes this one)
      2. Lots of current documentation
      3. Centralized community (well MS...)
      4. Seems like they have addressed the shortcomings of the previous Entity Framework - I won't really know until I commit to using it; the truth always comes out in the implementation
    2. Nays
      1. Previous versions of Entity Framework were horrible
      2. YAMSOSFAS - Yet another Microsoft one size fits all solution
      3. Trust - MS seems to jump from one thing to the next - outside the grokking of us mere mortals; what they support Monday might be obsolete on Tuesday.
      4. Personal - I always either build it myself or fall back to MS; I need some new experiences ;)
      5. I have worries about how easy it is to break away from the MS model of horrible code generation experiences. I have seen enough MS brochureware to make me sick for life. I need real-world flexibility with MY model, domain, etc., not what MS marketing deems significant.

So that is my initial, superficial perception of each persistence framework. Next up is looking into more specific and less subjective reasons for liking or disliking each ORM solution.



Cleaning up XmlWriter and IXmlSerializable with Extension Methods

by jmorris 4. November 2009 19:46

If you do any work with xml you have probably come across scenarios where you are using an XmlWriter to produce an output stream of xml. Eventually this output stream is either persisted to disk via an XDocument, sent over the wire using a distributed technology such as WCF, Remoting, etc., or possibly transformed with XSL/XSLT. A strong example is custom serialization classes that implement IXmlSerializable. For example:
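A hypothetical DTO of this kind (the class and property names are illustrative, not the post's original) might look like:

    using System;
    using System.Xml;
    using System.Xml.Schema;
    using System.Xml.Serialization;

    // Hypothetical DTO that controls its own XML format by implementing IXmlSerializable.
    public class Article : IXmlSerializable
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public string Author { get; set; }
        public DateTime DatePublished { get; set; }

        public XmlSchema GetSchema()
        {
            return null;   // the usual return value for IXmlSerializable implementations
        }

        public void ReadXml(XmlReader reader)
        {
            // custom deserialization - implementation shown below
        }

        public void WriteXml(XmlWriter writer)
        {
            // custom serialization - implementation shown below
        }
    }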

The class above is a simple data transfer class (DTO) that implements IXmlSerializable so that it can be serialized and/or deserialized from an object to an xml stream and vice versa. Note: in most cases you would simply mark the class as [Serializable] and/or provide attributes from the System.Xml.Serialization namespace to get the same behavior; however, in many cases the default implementation will not fit your particular scenario, hence you would implement IXmlSerializable and provide your own custom serialization.

Here is the 'custom' serialization implementation:
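For the hypothetical Article class above, the hand-rolled implementation tends to look something like this (a sketch, not the post's code; these are the bodies of the two IXmlSerializable members):

    public void WriteXml(XmlWriter writer)
    {
        // One WriteElementString call per field - verbose, and it grows with every new property.
        writer.WriteElementString("Id", XmlConvert.ToString(Id));
        writer.WriteElementString("Title", Title);
        writer.WriteElementString("Author", Author);
        writer.WriteElementString("DatePublished",
            XmlConvert.ToString(DatePublished, XmlDateTimeSerializationMode.Utc));
    }

    public void ReadXml(XmlReader reader)
    {
        reader.ReadStartElement();   // consume the wrapping element
        Id = XmlConvert.ToInt32(reader.ReadElementString("Id"));
        Title = reader.ReadElementString("Title");
        Author = reader.ReadElementString("Author");
        DatePublished = XmlConvert.ToDateTime(reader.ReadElementString("DatePublished"),
            XmlDateTimeSerializationMode.Utc);
        reader.ReadEndElement();
    }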


While the XmlWriter/XmlReader APIs are pretty simple to use, they are also a bit verbose. If you happen to have a fairly large class with many fields, things start to get ugly pretty fast. Typically when I see large classes I begin to think about refactoring into smaller classes where applicable, but that's not always the case. Since most of the time when you want serialization/deserialization you simply want to quickly (i.e. with fewer keystrokes) turn the contents and structure of the class into its xml equivalent, what you are really looking for is a way to reduce the amount of work needed. This is where extension methods really come in handy:
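A sketch of the kind of extension methods meant here (the method names and overloads are my own, not necessarily the post's):

    using System;
    using System.Xml;

    // Small helpers that cut down on the WriteElementString noise; returning the
    // writer allows the calls to be chained.
    public static class XmlWriterExtensions
    {
        public static XmlWriter WriteElement(this XmlWriter writer, string name, string value)
        {
            writer.WriteElementString(name, value);
            return writer;
        }

        public static XmlWriter WriteElement(this XmlWriter writer, string name, int value)
        {
            writer.WriteElementString(name, XmlConvert.ToString(value));
            return writer;
        }

        public static XmlWriter WriteElement(this XmlWriter writer, string name, DateTime value)
        {
            writer.WriteElementString(name, XmlConvert.ToString(value, XmlDateTimeSerializationMode.Utc));
            return writer;
        }
    }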



Compared to the above, the result is a much cleaner, easier-to-read class:
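With the (hypothetical) extension methods above, the WriteXml member collapses to something like:

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteElement("Id", Id)
              .WriteElement("Title", Title)
              .WriteElement("Author", Author)
              .WriteElement("DatePublished", DatePublished);
    }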


While extension methods are not new, they do offer a unique way of handling situations where you would like to simplify a set of operations without reaching for the traditional static xxxUtil class or creating a customized implementation or wrapper class. In this case, XmlWriter is a class that is open for extension via basic inheritance, unlike a sealed class such as System.String, which is closer to the intended purpose of extension methods: extending classes that are closed to inheritance (sealed).


Recipe for Disaster - Naming a Release after a Feature

by jmorris 8. October 2009 22:22

Note: This post was motivated by a 3-hour discussion with a release manager regarding why the name of the release was the same as a feature that we had pushed to a future release – "If feature x has moved to release y, why are we still releasing a build called feature x?"

Why? Simple: it holds you and/or the team accountable for releasing the feature in that release on the planned date. This doesn't sound so bad, one would think. After all, the feature was (hopefully) conceived from some end-user or business need, feasibility was assessed, and the sprint was planned with dates engraved in stone. The reality is that while LOE (level of effort) can be estimated, there are additional factors at hand that wreak havoc on what will actually be completed by the cut-off date.

Typically, common cause variance can be mitigated and the estimate included in the original LOE; however, the same cannot be said for special cause variance. For example, you can pretty much assume that a certain amount of effort will be attributed to research and discovery before the developer even starts working on the task or feature. While in the exceptional case a developer may have sufficient domain experience to simply start coding and whip out the feature, the rule is that the developer will have to spend time familiarizing themselves with the environment before jumping in. This is expected to cause a certain amount of containable variance with respect to a project or sprint. However, how do you reasonably plan for the case where another developer on the team contracts some grave illness, such as swine flu, and is suddenly unavailable, reducing the resources at hand?

In cases such as this, we simply do not plan for these special causes, because it is virtually impossible. If you did try to plan for them, you would never finish planning, because there are an infinite number of things that can happen. It costs too much to plan for these types of things!

Back to my original point: when you name a release after a feature, you commit yourselves to completing that feature and releasing it on the planned date, yet you ignore the possibility of special cause variance affecting the outcome. When it does, you still want to release something, so maybe you release another, less resource-intensive feature; but the release is still named after the feature that was omitted…this is awkward (especially for the build engineer).

You will communicate your intentions better (i.e. the goal is a release, not necessarily the release of a specific feature) if you give the release some arbitrary name not tied directly to a feature.


Fragile Unit Tests and External Dependencies

by jmorris 6. October 2009 23:42

An example of why unit tests that depend upon external resources, such as databases, are fragile:
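The original test isn't shown; as a generic illustration of the pattern (the table name, article id, and assertion are all made up, and NUnit is assumed), it looks something like a test that asserts against live data:

    using System.Data.SqlClient;
    using NUnit.Framework;

    [TestFixture]
    public class ArticleVersionTests
    {
        // Illustrative only: the test reaches into a real database and asserts against
        // whatever data happens to be there, so any edit made through the UI can break it.
        [Test]
        public void Article_Has_Expected_Version()
        {
            using (var connection = new SqlConnection("Server=...;Database=...;Integrated Security=true"))
            using (var command = new SqlCommand(
                "SELECT MAX(Version) FROM Article WHERE ArticleId = 42", connection))
            {
                connection.Open();
                int version = (int)command.ExecuteScalar();

                Assert.AreEqual(3, version);   // breaks as soon as someone versions the article again
            }
        }
    }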

And the code that is breaking:

Reason: article was versioned repeatedly in the UI after this unit test was created...noise.


Updating Records With Randomly Selected Values - SQL Server

by jmorris 4. October 2009 05:40
It turns out that selecting random values from a table column and updating or inserting them into another table is trickier than I imagined. For example, the following SQL snippet seems to work, but it will always assign the first random value to


Simple Pipe and Filters Implementation in C# with Fluent Interface Behavior

by jmorris 16. September 2009 21:51

Background

I am working on a project that requires a series of actions to be executed against an object and I immediately thought: pipe and filters! Pipe and Filters is an architectural pattern in which an event triggers a series of processing steps on a component, transforming it uniquely at each step. Each step is called a filter component and the entire sequence is called the pipeline. The filter components take a message as input, do some sort of transformation on it and then send it to the next filter component for further processing. The most common implementations of a pipe and filter architecture are Unix programs, where the output of one program can be linked to the input of another program; processing XML via a series of XSLT transformations; compilers (lexical analysis, parsing, semantic analysis, optimization and source code generation); and many, many other uses (think ASP.NET HTTP stack).

 

 
 Pipe and Filters Conceptual

The diagram above depicts the actors and the message flow. The pump is the event originator, which pushes messages into the pipeline to be processed by the individual filter components. The messages flow through the pipeline and become inputs for each filter component. Each filter performs its processing/transformation/whatever on the message and then pushes the message back into the pipeline for further processing by the next filter component. The sink is the destination of the message after it has been processed by each filter component.

Pipe and Filters Implemented

For me, the best way to implement a design pattern is to see how others have implemented it. A quick Google search and I came up with two good examples (actually there are many, many, many examples!):

Both examples take advantage of the newer features of C# such as the yield keyword (the purpose behind their posts?), which did not apply exactly to my needs. A little meddling, however, and I came up with the following:

 
Simple Pipe and Filters Implementation

Here is the final code for the FilterBase<T>:
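Not the post's original code, but a minimal sketch of what a FilterBase<T> along these lines can look like, together with an illustrative Message type and filter used in the sketches below:

    // Base type for a single filter (processing step); each concrete filter applies
    // its own transformation to the message as it moves through the pipeline.
    public abstract class FilterBase<T>
    {
        public abstract void Execute(T input);
    }

    // Illustrative message and filter types (assumptions, not from the post).
    public class Message
    {
        public string Text { get; set; }
    }

    public class TrimFilter : FilterBase<Message>
    {
        public override void Execute(Message input)
        {
            input.Text = input.Text.Trim();
        }
    }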

And the code for the Pipeline<T> class:
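Again a sketch under the same assumptions; Register returns the pipeline itself, which is what gives the fluent, chainable registration mentioned below:

    using System.Collections.Generic;

    // Holds the ordered set of filters and pushes each message through them in turn.
    public class Pipeline<T>
    {
        private readonly List<FilterBase<T>> _filters = new List<FilterBase<T>>();

        public Pipeline<T> Register(FilterBase<T> filter)
        {
            _filters.Add(filter);
            return this;   // enables chained registrations
        }

        public void Execute(T input)
        {
            foreach (FilterBase<T> filter in _filters)
            {
                filter.Execute(input);
            }
        }
    }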


Here is a rather weak unit test illustrating the usage.
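A hypothetical test of that shape, building on the sketches above (NUnit is assumed, and UpperCaseFilter exists only for this example):

    using NUnit.Framework;

    [TestFixture]
    public class PipelineTests
    {
        [Test]
        public void Pipeline_Applies_Filters_In_Order()
        {
            var message = new Message { Text = "  hello world  " };

            // Fluent registration: chain the filters, then execute the pipeline once.
            new Pipeline<Message>()
                .Register(new TrimFilter())
                .Register(new UpperCaseFilter())
                .Execute(message);

            Assert.AreEqual("HELLO WORLD", message.Text);
        }

        // Second illustrative filter so the chaining is visible.
        private class UpperCaseFilter : FilterBase<Message>
        {
            public override void Execute(Message input)
            {
                input.Text = input.Text.ToUpper();
            }
        }
    }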

Note that I added fluent-interface-style behavior to the Pipeline<T> class so that you can chain together the registration of the filters and finally execute the chain.


