Tuesday, May 19, 2009

Best Of Mix ‘09 (MSDN event)

I went to this free MSDN event yesterday, a 3-hour recap of the most interesting technologies unveiled at Mix ‘09. The event focused on Silverlight 3, Windows Azure and ASP.NET MVC (we skipped that introductory lecture since we’ve been using the framework for quite a while). Here’s a quick recap.

Silverlight 3 / .NET RIA Services

The MS demo guy really just focused on the RIA Services extensions (a separate framework you can add to Silverlight 3 apps). The technology is built on top of ADO.NET Data Services (Astoria), which provides quick REST scaffolding around your entities / db schema.

No WCF here, just an .axd resource handler with REST-like URIs that lets you access your CRUD services with no need to write code.

What’s cool about the Silverlight RIA data service framework is that it provides you with client-side proxy model and service classes that you don’t have to write. Not only that, but it deals with entity states on the client, diffgrams, and lets you save changes automatically or manually. It makes for a cool demo – build your entities, add a datagrid & form views to your Silverlight app, a save button and you’re done. It uses optimistic concurrency by default, so you can even detect conflicts (multiple users changing the same data concurrently).
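For anyone who hasn't run into optimistic concurrency before, here is a minimal Python sketch of the general idea. It is not the RIA Services API: `VersionedStore`, `ConflictError` and the per-row version counter are all hypothetical names made up for illustration, and an in-memory dict stands in for the database.

```python
class ConflictError(Exception):
    """Raised when a row changed after the caller last read it."""


class VersionedStore:
    """Hypothetical in-memory store where every row carries a version number."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._rows.get(key, (0, None))

    def write(self, key, value, expected_version):
        current_version, _ = self._rows.get(key, (0, None))
        # No locks are taken: the write only succeeds if nobody else
        # bumped the version since this caller read the row.
        if current_version != expected_version:
            raise ConflictError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._rows[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()
store.write("customer:1", "Alice", expected_version=0)

# Two users load the same row, so both hold the same version.
version_a, _ = store.read("customer:1")
version_b, _ = store.read("customer:1")

# User A saves first and wins.
store.write("customer:1", "Alice Smith", expected_version=version_a)

# User B's save is based on stale data and gets rejected.
try:
    store.write("customer:1", "Alicia", expected_version=version_b)
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

The point is that the conflict is detected at save time rather than prevented up front, which is why the client has to decide what to do with the rejected change.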

You can also decorate your model classes with CLR attributes from System.ComponentModel.DataAnnotations to define validation rules (on the server), and bam – all validation is taken care of on the client as well.

I absolutely get the benefit of all this for quick and dirty intranet apps. You already have the data model, now you can build an entire RIA on top of it in a matter of hours. A few clicks in Visual Studio, some XAML, and you’re done. No need for an advanced C# programmer.

The demo I saw used ADO.NET Data Services via an .axd resource handler. I’m not super familiar with that technology, but I read a blog post from Brad Abrams mentioning that various data stores could be used, including REST services, NHibernate, WCF services etc…

Personally I think MS can go a bit overboard with declarative markup, declarative data sources etc… I think a little bit of C# never killed anybody, which is why I like the ASP.NET MVC model so much. But I digress.

Windows Azure

Azure is Microsoft’s cloud infrastructure response to Google App Engine, Mosso, Amazon Web Services etc… For those of you who’ve heard about cloudiness but don’t really know what it means, these vendors are trying to make it easy for you to deploy scalable clustered apps using a “pay for what you use” pricing model, without having to worry about load balancing, failover, redundancy etc... A typical cloud hosting offering usually consists of a combination of the following:

- Data store (SQL Server Data Services, Azure table storage, Amazon SimpleDB, MySQL cluster etc…)
- Cloud file storage (Amazon S3, virtual NAS devices, Azure simple data storage)
- Queuing service (Amazon SQS)
- Computing nodes for background processing
- Load-balanced web cluster to host your web app and/or web services

Their offerings range from pure infrastructure where you get a bunch of computing instances at the OS level (GoGrid, EC2) to a full framework (Google App Engine), with Azure and Mosso as in-betweens – you don’t have access to the actual VMs, but your code is not completely dependent on your host.

What I like about Azure:

- Microsoft provides you with a full development stack / SDK you can install on your development machine. It’ll run your Azure app just like it would in the cloud, and deployment is a breeze. No other vendor out there offers anything like it. We use AWS, and we had to develop an abstraction layer for each service, with implementations that allowed us to develop while not being connected to AWS. I still think it was the right thing to do since our app would now be easily portable to a different host, but it makes me nervous whenever we deploy. Unit tests only take you so far: there are performance and scalability issues that can be impossible to detect when you’re simulating the app in a completely different environment (such as local disk access vs web service based storage).

- I registered for the beta a while ago and played with it a bit. You get staging and production environments out of the box, and deploying from staging to production is just one click.

- At first I was very excited about SQL Server in the cloud. I thought MS had found a way to scale a relational DB indefinitely. Yes, I also believe in magic. However, I was mistaken. It’s just like using any SQL Server instance – sharding will be a pain, and the platform won’t work if you’re building a super-large app. For this you’ll need to use Azure table storage (closer to SimpleDB) - I’ll get to that in a minute. I’ll still put that feature as a pro since you don’t have to worry about DB maintenance and licensing – it’ll be priced as you go. I have to say though, Mosso found a way to have a similar offering. Create as many DBs as you want, priced by amount of storage needed.

- Windows Server 2008 / IIS 7 support. I know this should be obvious, and Mosso & GoGrid do offer this as well, but not EC2. That’s the one thing preventing us from hosting our web app on the Amazon cloud.
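To illustrate the kind of per-service abstraction layer mentioned in the SDK bullet above, here is a minimal Python sketch. It is an illustration only, not our actual code: `BlobStore`, `InMemoryBlobStore` and `save_avatar` are hypothetical names, and a real cloud-backed implementation would sit behind the same interface.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Hypothetical interface the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> bytes:
        ...


class InMemoryBlobStore(BlobStore):
    """Local stand-in used while developing without a cloud connection."""

    def __init__(self):
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_avatar(store: BlobStore, user_id: int, image: bytes) -> str:
    """Application code only sees the interface, never the concrete host."""
    key = f"avatars/{user_id}.png"
    store.put(key, image)
    return key


local_store = InMemoryBlobStore()
avatar_key = save_avatar(local_store, 42, b"fake-png-bytes")
```

Swapping hosts then means writing one new `BlobStore` implementation. What this does not give you is realistic latency or throughput, which is exactly the gap a local emulator of the real platform is supposed to narrow.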

What I don’t like:

- SQL Server Data Services doesn’t come with full-text indexing or CLR integration. So if your cloud app features a search box (I know, crazy concept, right?), you’ll need to use a 3rd party indexing app (Google?) or write your own indexer (who’s going to do that??).

- Table storage is so limited I don’t understand what I would do with it. You can basically store your entities in name/value property bags by specifying a partition key & row key. I’m fine with the inability to join, but you can’t even query data across partitions. Since each partition will always be stored on the same node, you’ll want to break your extra-large tables into multiple partitions by using multiple partition keys. So let’s say you build the next YouTube and host – I don’t know – 500 million videos. You’ll segment your video data store into maybe 100 partitions. There’s no built-in way for you to query that huge data store at once. You’ll need to create your reports separately or use something like Hadoop (which Amazon offers) to query each partition and aggregate the results. In short: a lot of work.

- No distributed caching layer. This one baffles me. Any cloud app should be built with distributed caching in mind (like memcached). If you’re going to get 100 simultaneous requests to your load-balanced cluster, with each one potentially fetching 50 items from your data store, you could be in trouble. I’m sure SQL Server Data Services uses replicated DBs, so that’ll help, but if MS charges for DB traffic, this could be very, very painful. Not only that, but it’ll be slower than fetching a binary serialized object from memory. IIS caching won’t do: you’ll get into big-time cache invalidation nightmares if your data changes often. The demo guy showed blob storage as a potential solution, but since each operation writes the data 3 times, it’ll be slow – and potentially costly. It sounds like MS is aware of the problem, and they’re working on a Velocity offering. No ETA though. BTW – Mosso has the same exact problem. Right now if you need distributed caching you’ll have to use an infrastructure offering like EC2 or GoGrid.
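To make the table storage complaint above concrete, here is a rough Python sketch of the "query each partition and aggregate the results" work you end up doing yourself. None of this is the Azure API: the `partitions` dict, `query_partition` and `views_by_uploader` are hypothetical, and an in-memory dict stands in for the table.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Made-up data: partition key -> {row key -> entity (a property bag)}
partitions = {
    "p0": {
        "v1": {"uploader": "ann", "views": 10},
        "v2": {"uploader": "bob", "views": 5},
    },
    "p1": {
        "v3": {"uploader": "ann", "views": 7},
    },
    "p2": {
        "v4": {"uploader": "cat", "views": 1},
        "v5": {"uploader": "ann", "views": 2},
    },
}


def query_partition(partition_key):
    """Query a single partition: total views per uploader in that partition."""
    totals = Counter()
    for entity in partitions[partition_key].values():
        totals[entity["uploader"]] += entity["views"]
    return totals


def views_by_uploader():
    """Scatter the query to every partition, then merge the partial results."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Iterating the dict yields partition keys, one query per partition.
        for partial in pool.map(query_partition, partitions):
            merged.update(partial)  # Counter.update adds counts together
    return dict(merged)


report = views_by_uploader()
```

With three tiny partitions this is trivial. With 100 partitions of real data, every report needs this kind of fan-out plus retry and paging logic, which is the "a lot of work" part.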

Conclusion

Azure looks promising. I really want pricing info though. And it still looks about 12 to 18 months away from being exactly what I need.