30. August 2013 19:16
On September 13th 2013, leading NOSQL database provider Couchbase will be invading San Francisco for its annual community conference, Couchbase [SF] 2013! I’ll be there as well, participating in a Couchbase Cluster – one of the smaller, interactive, discussion-driven sessions – and representing the .NET Client SDK.
The event will host three different tracks: developer, operations and administration, with speakers from Couchbase and from customers of Couchbase who have a wealth of experience and various use-cases to share. There will also be smaller interactive sessions that go over advanced topics, new offerings such as Mobile Couchbase, and the various Couchbase Labs projects that are available for free on GitHub or via NuGet (if you are a .NET developer).
So if you’re in San Francisco on September 13th or are willing to travel a bit, come by and join in on the fun!
3. November 2012 21:08
Annotations or Attributes (Java vs. .NET/C#) are a means of decorating classes, methods and properties with additional metadata or declarative information. The annotations/attributes can then be queried at runtime via reflection and methods associated with them can be invoked.
Incredibly powerful and useful, they are quite common in various frameworks for tasks such as validating the data held by a class or property, or mapping the properties of an entity to column names in an RDBMS table. There are many, many other uses as well.
Here is an example in C#: https://github.com/jeffrymorris/attributes-example
Note that in both Java and C#, annotations/attributes are a first-class language construct. This is useful for many reasons: they improve readability and comprehension, they are type-safe, you can attach a debugger and step into them, etc.
Today I learned that PHP also has a form of annotations…well, sort of! It seems that a couple of PHP frameworks (Symfony 2 and Doctrine 2) have “implemented” them not as a language construct but as a hack via comments:
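To make the "decorate, then query via reflection" idea concrete, here is a minimal sketch in C#. It is not the code from the repository linked above; ColumnAttribute, Customer and ColumnMapper are hypothetical names made up for this illustration of the column-mapping use case.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute (illustration only): maps a property to a column name.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public ColumnAttribute(string name) { Name = name; }
    public string Name { get; private set; }
}

// Hypothetical entity decorated with the attribute.
public class Customer
{
    [Column("customer_id")]
    public int Id { get; set; }

    [Column("full_name")]
    public string Name { get; set; }

    // Not decorated, so the mapper below ignores it.
    public string Notes { get; set; }
}

public static class ColumnMapper
{
    // Uses reflection to return "Property=column" pairs for decorated properties.
    public static string[] Describe(Type type)
    {
        return type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Select(p => new
            {
                Property = p,
                Column = (ColumnAttribute)Attribute.GetCustomAttribute(p, typeof(ColumnAttribute))
            })
            .Where(x => x.Column != null)
            .Select(x => x.Property.Name + "=" + x.Column.Name)
            .OrderBy(s => s, StringComparer.Ordinal)
            .ToArray();
    }
}
```

Calling ColumnMapper.Describe(typeof(Customer)) yields one entry per decorated property. Because the attribute is an ordinary class, the compiler checks its arguments and you can set a breakpoint in its constructor.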
Folks, those aren’t comments…that is code that will get executed! Yuck, this is wrong in so many ways…especially since there is an RFC for adding annotations to PHP in the works: https://wiki.php.net/rfc/annotations.
Just because you can, doesn’t mean you should!
14. June 2012 10:50
Not much more to say about Eclipse…
25. March 2012 12:13
A couple of weeks back, while at SxSW, I attended an excellent presentation about NOSQL databases by Gary Dusbabek of Rackspace: NoSQL Databases: Breaking the Relational Headlock. The following post summarizes some of the key points and provides a comparison of the various technologies. He didn’t go over CouchDb, Couchbase or Membase, so I’ll add my own notes about those offerings as well, since I have personally used each.
The Problems with RDBMS
The major problems with traditional Relational Database Management Systems are the inability to scale linearly, a Single Point of Failure (SPoF), the lack of sharding features, and the need to de-normalize data to make it easier to use. Typically, to deal with scale, you would add processors, memory, disk space etc. to build a bigger box capable of handling increased volume or throughput. This is normally referred to as “vertical scaling”. Unfortunately, vertical scaling is not cost efficient; the cost of CPU and memory increases disproportionately to performance. It’s cheaper and more efficient to cluster cheaper hardware, which is “horizontal scaling”. Additionally, RDBMS performance tends to suffer when transactions are introduced to ensure data correctness, consistency and isolation.
Considerations for Choosing
When choosing a NOSQL solution the following considerations must be evaluated:
- Fault tolerance – what is an acceptable level?
- Recoverability – volatility (in-memory/fast) or persisted (slower, but less volatile)
- Replication – fully distributed or master/slave?
- Access – polyglot drivers? Do they all offer consistent functionality?
- Hooks – before/after command execution (sprocs and triggers)?
- Distribution mode – sharding strategy?
- Data model
  - Key/Value pairs?
  - Data structures?
- Transactional semantics? BASE vs ACID?
- Read vs Write throughput – where are your scaling issues? What are the usage patterns of your data?
- Deployment, Management, Administration – how to add or remove nodes without affecting clients?
What NOSQL Offers
All that being said, NOSQL solutions are not necessarily a replacement for an RDBMS, but a complement to handle issues of scalability and complexity. An example usage would be as the Q in CQS…store a master copy in a fully normalized form in the RDBMS and then push a de-normalized form into a NOSQL solution for scaling reads. Additionally, by virtue of being schema-less, development is typically easier and faster.
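As a rough sketch of that "Q in CQS" idea: the normalized rows stay in the RDBMS, and a projector flattens them into one document per key for the read side. Everything here is an assumption for illustration: the row types, the ReadModelProjector name, the "customer::id" key format, and the Dictionary standing in for a real key/value store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Normalized "master copy" rows, as they might sit in two relational tables.
public class CustomerRow { public int Id; public string Name; }
public class OrderRow { public int Id; public int CustomerId; public int TotalCents; }

public static class ReadModelProjector
{
    // Builds one de-normalized document per customer. The Dictionary is a
    // stand-in for a NOSQL key/value store; a real client would replace it.
    public static Dictionary<string, string> Project(
        IEnumerable<CustomerRow> customers, IEnumerable<OrderRow> orders)
    {
        var store = new Dictionary<string, string>();
        foreach (var c in customers)
        {
            // Join the customer's orders into the document so that a read
            // needs a single key lookup instead of a SQL join.
            var totals = orders
                .Where(o => o.CustomerId == c.Id)
                .OrderBy(o => o.Id)
                .Select(o => o.TotalCents.ToString());

            // Hand-built JSON keeps the sketch dependency-free; it does not
            // escape special characters, so use a serializer in real code.
            var doc = "{\"name\":\"" + c.Name + "\",\"orderTotalsCents\":["
                      + string.Join(",", totals) + "]}";

            store["customer::" + c.Id] = doc;
        }
        return store;
    }
}
```

Writes still go to the normalized tables; the projection is re-run (or updated incrementally) whenever the master copy changes, and reads scale out with the NOSQL cluster.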
Some NOSQL Databases
The following is a non-exhaustive overview of NOSQL databases:
MongoDB
- Master/slave replication – master is a SPoF
- Gives failover and reliability, but not consistency
- Only master receives writes
- Document-oriented, thus naturally denormalized – stored natively as BSON
- Flexible schema
- Programmer friendly
- Many language drivers – C#, Java, PHP, Ruby, Python et al
- Atomic on a single document for writes
- Allows for complex queries – by ranges and multiple criteria for instance
- Not good for DW/data analytics
- Blocking offline compaction
- SPoF – the master dies, everything dies
Redis
- Master/Slave replication
- Good for real-time stat tracking
- Very fast – in memory database
- Volatility – in memory database – potential for data loss
- Like Memcached, but with data structures: lists, sets, hashtables
- RAM limitations – whole set fits in memory, but also allows for offline storage
- Good when the entire dataset can fit in memory
Riak
- Fully distributed – shared nothing – no SPoF
- Relationships via links
- Map/Reduce framework
- Completely schema-less – keys and buckets
- Scales linearly
- Tunable consistency - can adjust for read vs write optimization etc
- Pre and Post commit hooks
- Pluggable backend storage
  - Bitcask – everything in memory
  - InnoDB – for when everything won’t fit in memory
  - Memcached-like in memory
- REST API
- Dynamic clustering via “vnodes” similar to Membase/Couchbase vbuckets – when a node is added or removed the data is automatically re-indexed
- Data is stored unsorted
- Written in Erlang
Cassandra
- Has a query language called CQL – SQL-like syntax
- Dynamo based distribution system – BigTable like
- Allows for range queries, but prone to “hotspots” – uneven distribution of key/value pairs
- Data center “rack aware”
- Hadoop integration provided by datastax.com
- Configurable caching – like a super-fast Memcached
- Some schema semantics – hybrid columnar and row based storage system
- Keeps sort order of data, but can be changed on the fly
- When growing the cluster “hotspots” may occur – uneven distribution of keys and values
HBase
- Part of the Hadoop suite of tools: HBase, HDFS, Sqoop, Hive, etc
- Versioned cells – you can query data as it existed at a particular point of time
- Easy Hadoop integration by default
- Hadoop NameNode is a SPoF – Secondary NameNode provides some redundancy
- Schema maintenance requires downtime
- Complicated balancing – HBase region servers then HDFS
Couchbase – not covered in session
- Fast, in-memory database due to Memcached interface integration
- Provides Map/Reduce framework for creating different views of the data you wish to display
- Stores data as JSON documents via Key/Value pairs
- Combines the best attributes of Memcached (caching), Membase (administration and scaling) and CouchDb (mapreduce)
- No SPoF – fully replicated data
- When a node is added data is automatically rebalanced and replicated across the cluster!
- Depending upon bucket type, data can be persisted to disk or stored in-memory
- Can easily support multi-tenancy via buckets – just create a bucket for each client
- Written in Erlang – newer 2.0 version has more C/C++ for performance reasons
- Product keeps changing…first it was Membase, then they added Memcached, and now CouchDb functionality – moving target for long-term NOSQL deployment
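The vbucket idea mentioned above (and the similar "vnodes" in the dynamic clustering bullet) can be sketched roughly as follows. A key always hashes to the same virtual bucket, and only the bucket-to-node map changes when nodes come and go. This is a simplified illustration, not Couchbase's implementation: VBucketMap is a hypothetical class, the FNV-1a hash is a stand-in I picked, and the round-robin assignment moves far more buckets than a real rebalance would.

```csharp
using System;
using System.Text;

// Hypothetical, simplified vbucket map (illustration only).
public class VBucketMap
{
    // Index = vbucket id, value = index of the node that owns it.
    private readonly int[] _owner;

    public VBucketMap(int vbucketCount, int nodeCount)
    {
        _owner = new int[vbucketCount];
        Rebalance(nodeCount);
    }

    // FNV-1a 32-bit hash; a stand-in, not the hash a real cluster uses.
    public static uint Fnv1a(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // key -> vbucket: depends only on the key and the fixed bucket count.
    public int VBucketFor(string key)
    {
        return (int)(Fnv1a(key) % (uint)_owner.Length);
    }

    // key -> node: one extra lookup through the current map.
    public int NodeFor(string key)
    {
        return _owner[VBucketFor(key)];
    }

    // Naive round-robin reassignment when the node count changes.
    public void Rebalance(int nodeCount)
    {
        for (int i = 0; i < _owner.Length; i++)
        {
            _owner[i] = i % nodeCount;
        }
    }
}
```

Clients only need a fresh copy of the map after a rebalance; the key-to-vbucket step never changes, which is what lets nodes be added or removed without re-hashing every key.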
Next Up: Details…
This is just a cursory overview of several NOSQL databases. I’ll be evaluating each one in detail in the coming weeks to get a better feel for where each solution fits given a particular scenario. From what I can see, some are more specific in the scope of problem sets that they satisfy, while others are more general purpose tools that satisfy a range of scenarios.
10. January 2012 22:37
I came across the following press release (a bit old) and liked what I read. Specifically, that Couchbase was working on UnQL support with MS:
“Couchbase unveiled and released to the public domain the UnQL query
language, (UNstructured Query Language). Jointly developed with
Microsoft and SQLite, UnQL is designed to provide a common query
language for NoSQL developers and help drive widespread adoption of
NoSQL technology. Each company has committed to delivering product
support for UnQL in 2012.”
By going to UnQL and partnering with MS, this puts Couchbase in an awesome position to develop a Linq (IQueryable) implementation of UnQL. If this happens, then querying a NOSQL or a RDBMS (or anything else) will be unified from the CLR perspective.
For instance, the following Linq query in the CLR (C# syntax):
var query = from f in Context.Foo
            select f;
could emit UnQL if Context is NOSQL, or SQL if it is an RDBMS…genius. If only Java had something like IQueryable <sigh>.
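The reason a single query could target either backend is that an IQueryable provider receives the query as an expression tree and decides what text to emit. Here is a deliberately tiny sketch of that translation step. Foo.Age and TinyTranslator are hypothetical names, and the output is a generic predicate fragment, not real UnQL or a complete provider.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity for the illustration.
public class Foo
{
    public int Age { get; set; }
}

public static class TinyTranslator
{
    // Translates predicates of the form "x => x.Member <op> constant" into
    // text. A real provider would walk the whole tree with a visitor and
    // emit whatever dialect its backend speaks.
    public static string ToPredicate<T>(Expression<Func<T, bool>> predicate)
    {
        var binary = predicate.Body as BinaryExpression;
        if (binary == null)
            throw new NotSupportedException("Only simple comparisons are handled.");

        var member = binary.Left as MemberExpression;
        var constant = binary.Right as ConstantExpression;
        if (member == null || constant == null)
            throw new NotSupportedException("Expected <member> <op> <constant>.");

        string op;
        switch (binary.NodeType)
        {
            case ExpressionType.Equal: op = "="; break;
            case ExpressionType.GreaterThan: op = ">"; break;
            case ExpressionType.LessThan: op = "<"; break;
            default: throw new NotSupportedException(binary.NodeType.ToString());
        }

        return member.Member.Name + " " + op + " " + constant.Value;
    }
}
```

For example, TinyTranslator.ToPredicate<Foo>(f => f.Age > 21) inspects the lambda without ever executing it. Swapping the string-building part is all it would take to target a different query language.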
It also looks like Couchbase is dumping the CouchDb HTTP REST API for the binary Memcached protocol, which should be a big win from a performance perspective (sorry CouchDb users). Membase already uses the protocol, so it’s just a matter of switching the HTTP REST API for UnQL.
Another development in Couchbase is that CouchDb has been forked. The good news is it’s still going to be open-source:
“As J. Chris Anderson notes in the comments, Couchbase is completely open source and Apache licensed:
Everything Couchbase does is open source, we have 2 github pages that are very active:
Probably the most fun place to jump into development is the code review: http://review.couchbase.org/
Let me clarify, if you like Apache CouchDB, stick with it. I'm working on something I think you'll like a lot better. If not, well, there's still Apache CouchDB.”
While possibly a bit traumatic for CouchDb aficionados, this should be a huge win for Couchbase fans and for companies investing in Couch as a stable NOSQL solution.
15. December 2011 13:32
This is what inspires me to blog again after months of inactivity:
Is it just me, or do they kind of look alike?
That observation aside, the Obama website is kind of creepy. It has several forms for soliciting contact information from the sheeple:
And this one, which collects more data, allows you to make a donation and enter the contact info of a Republican you know, and the campaign will send them a message:
“This holiday season, we're giving you a chance to have a little fun at the expense of a Republican in your life by letting them know they inspired you to make a donation to the Obama campaign.
Simply enter their name and email address below. Then, we'll send them a message letting them know they inspired you to donate. (Don't worry—we won't hold on to any of their information.)”
An immature and trite, if not weird, action from the POTUS…meh, politics.
Another interesting “feature” of the website is this splash page that comes up the first time you hit the site (try clearing cookies):
Note that it immediately attempts to get your email and zipcode. More point-of-contact and demographic information for the big “B’s” big data machine. Notice how small the “continue to the website…” part is? It’s even in a muted color (in comparison to the “SHOP NOW” button).
Inspiration, wherever it may find me…
12. September 2011 05:50
I find it hard to believe that twitter is still experiencing problems of scale given its popularity and available resources. Mind boggling, really.
30. August 2011 20:05
Looks like the cloud evaporated…
20. July 2011 22:11
There was a lot of noise and confusion on the Monodroid mailing list when it was reported that Novell had been sold to Attachmate, which subsequently laid off the entire Mono team. The two major products that the team had been working on, Monodroid and Monotouch (cross-platform .NET frameworks for Android and iOS development), were apparently dead.
Well, it looks like Miguel and his team have worked a deal with Attachmate:
Through an agreement with SUSE*, a business unit of The Attachmate Group (the company that acquired Novell in April 2011), Xamarin has a broad, perpetual license to all intellectual property covering Mono*, MonoTouch, Mono for Android and Mono Tools for Visual Studio. Xamarin will also provide technical support to SUSE customers using Mono-based products and assume stewardship of the Mono open source project
This is good news indeed for open source .NET development and all involved.
9. July 2011 01:23
While unit testing a VirtualPathProvider today, I came upon an interesting exception:
System.Runtime.Serialization.SerializationException : Type 'Foo.Web.Core.UnitTests.Plugins.Modules.PluginRegistrarTests+<>c__DisplayClass1' in assembly 'Foo.Web.Core.UnitTests, Version=184.108.40.206, Culture=neutral, PublicKeyToken=null' is not marked as serializable.
I was stumped by this…what, where and who is '+<>c__DisplayClass1'? Granted, my scenario was somewhat complex in that I am testing by using a fake AppDomain that mimics the ASP.NET HostingEnvironment, à la this post.
When I saw that error, I immediately thought the problem was that I was missing the Serializable attribute, since the exception explicitly states: “[type] is not marked as serializable.” I added that attribute to the class that I was loading into the faux AppDomain, with the same result:
WTF? I was stumped! I googled around a bit, got sidetracked by some discussions of MarshalByRefObject, and finally stumbled upon something on OdeToCode. I wasn’t quite sure what the problem was until I read some of the comments, most notably this one. A quick check of the IL with ILSpy confirmed my suspicions:
Fix was easy: simply move the declaration of the FakeHttpApplication class to within the scope of the delegate itself. Here is how I had it defined:
And after I moved into the scope of the delegate:
So, what was the problem? Basically it comes down to the scoping of anonymous methods and how the compiler generates code to support them. When an anonymous method captures a local variable from the enclosing method, the compiler hoists that variable into a generated class (here c__DisplayClass1) and turns the anonymous method into an instance method on it. The generated class is not marked as serializable, so the call fails when the delegate and its captured state have to be passed into the other AppDomain. Declaring the variable inside the delegate means nothing is captured, so there is no generated class to serialize.
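You can see the generated class for yourself with a few lines of reflection. This is a small sketch rather than the test code from this post; ClosureInspector is a hypothetical helper, and the exact name of the generated class (something like "<>c__DisplayClass…") is a compiler implementation detail that varies between compiler versions.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical helper (illustration only) for inspecting closure classes.
public static class ClosureInspector
{
    // The lambda captures the local 'captured', so the compiler hoists the
    // local into a generated class and the delegate's Target is an instance
    // of that class.
    public static Func<string> MakeCapturing()
    {
        string captured = "hello";
        return () => captured.ToUpperInvariant();
    }

    // The runtime type behind a delegate's target (null for static targets).
    public static Type ClosureTypeOf(Delegate d)
    {
        return d.Target == null ? null : d.Target.GetType();
    }

    // True when the type carries [CompilerGenerated].
    public static bool IsCompilerGenerated(Type t)
    {
        return Attribute.IsDefined(t, typeof(CompilerGeneratedAttribute));
    }

    // True when the type was declared with [Serializable].
    public static bool IsMarkedSerializable(Type t)
    {
        return (t.Attributes & TypeAttributes.Serializable) != 0;
    }
}
```

Passing the result of MakeCapturing() to ClosureTypeOf returns the hidden class; it reports as compiler generated and not serializable, which matches the exception message above.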