About the author

J Sawyer is a developer based in Houston, TX and loves to write code, especially ASP.NET and other web-related stuff. He is currently working on implementing Team Foundation Server at a large energy company in Houston and is loving that too.

He also loves to ride his Yamaha FZ1. And sometimes his Ninja 650.

But he doesn't code and ride at the same time. That would be bad.

Launch in Second Life ...

April 23, 2008 11:13 AM

This Saturday at 9:00 AM SLT (that's Second Life Time ... or Pacific Time), we will be having the Visual Studio 2008/Windows Server 2008/Sql Server 2008 Launch event in Second Life. This is the first event of its kind ... yeah, there have been user group meetings, but nothing quite this big, quite this dramatic or ground-breaking. This is a full-on launch event held in a Virtual World with folks from around the globe participating. It's pretty exciting stuff. 

I won't repeat all the details ... Zain has done that already, so just hop on over to his blog and check it out. 




Events | Second Life

Thoughts on Linq vs ADO.NET - Simple Query

April 22, 2008 1:09 PM

I had a little discussion this morning with an old buddy of mine. I won't mention his name (didn't ask him for permission to) but those of you in Houston probably remember him ... he used to be a Microsoft guy and is probably one of the best developers in town. I have a world of respect for him and his opinion.

So ... it started with this ... he was surprised by the "do you think a user will notice 300 ms" question. Of course, that's a loaded question. They won't. But his point was this: 300 ms isn't a lot of time for a user, but under a heavy load, it can be a lot of time for the server. Yes, it can be ... if you have a heavy load. I won't give a blow-by-blow account of the conversation (I can't remember it line for line anyway), but it was certainly interesting.

One thing that we both agreed on that is important for web developers to understand is this: performance is not equal to scalability. They are related. But they are not the same. It is possible (and I've seen it) to create a web app that is really fast for a single user, but dies when you get a few users. Not only have I seen it, but (to be honest here), I've done it ... though, in my defense, it was my first ASP "Classic" application some 10 or 11 years ago; I was enamored with sessions at the time. This was also the days when ADO "Classic" was new and RDO was the more commonly used API. And ... if you are a developer and haven't done something like that ... well, you're either really lucky or you're just not being honest.

With that out of the way ... I'd like to give my viewpoint on this:

Data Readers are still the fastest way to get data for a single pass. If it's one-time-use data that is just thrown away, it's still the way to go. No question. (At least, IMHO). But there's a lot of data out there that isn't a single pass and then toss ... it may be something that you keep around for a while as the user is working on it (which you often see in a Smart Client application) or is shared among multiple users (such as a lookup field that is consistent ... or pretty much consistent ... across all users). In both of these cases, you will need to have an object that can be held in memory and accessed multiple times. If you are doing a Smart Client application, it also needs to be scrollable. Data Readers don't provide this. So ... if you are doing these types of things, the extra 300 ms is actually well worth it.

In a web application, you'll scale a lot better (memory is a lot faster than a database query and it keeps load off the database server for little stuff) by caching common lookup lists in the global ASP.NET Cache. One thing that I find interesting ... the LinqDataSource in ASP.NET doesn't have an EnableCaching property like the SqlDataSource. It does, however, have a property StoreOriginalValuesInViewState. Hmmm ... curious. Storing this in ViewState can have its benefits ... it's a per-page, per-user quasi-cache ... but at the cost of additional data going over the wire (which might be somewhat painful over a 28.8 modem ... yes, some folks still use those). That said, ViewState can be signed to prevent tampering, though it is Base64-encoded rather than compressed by default, so the wire hit is real. But ... the EnableCaching property puts the resulting DataSet (it won't work in DataReader mode) into the global ASP.NET cache ... which, again, is good for things like lookups that really don't change very often, if at all.

For the Smart Client application ... well, DataReaders have limited use there anyway due to the respective natures of DataReaders and Smart Client apps. Granted, you can use a DataReader and then manually add the results to the control that you want it to display in ... but that can be a lot of code (yeah, ComboBoxes are pretty simple, but a DataGrid ... or a grid of any sort?).

One thing that struck me is the coding involved with master/child displays in Smart Client applications. There are two ways that you can do this in ADO.NET: You can get all the parents and children in one shot and load 'em into a DataSet (or object structure) -or- you can retrieve the children "on demand" (as the user requests the child). Each method has its benefits, but I'd typically lean to the on-demand access, especially if we are looking at a lot of data. This involves writing code to deal with the switching of the focus in the parent record and then filling the child. Not something that's all that difficult, but it is still more stuff to write and maintain. With Linq to Sql, this can be configured with the DeferredLoadingEnabled property of the DataContext and it will do it for you - depending on the value of this property (settable at runtime - you won't see it in the property sheet in the DataContext designer).

There was also some discussion about using Linq vs. rich data objects. This ... hmmm ... well, I'll just give my perspective. This is certainly possible with Linq, though certainly not with anonymous types (see http://blog.microsoft-j.net/2008/04/15/LinqAndAnonymousTypes.aspx for a discussion of them). But ... the Linq to Sql classes are generated as partial classes, so you can add to them to your heart's delight. You can also add methods that hit stored procs that aren't directly tied to a data class. Additionally, you can certainly use Linq to Sql with existing (or new) rich data classes that you create independently of your data access and then fill from the results of your query. As for the performance of these ... well, at the current moment, I don't have any numbers but I'd venture to guess that the performance would be comparable to anonymous types.

Performance aside, you also need to consider the other benefits that Linq brings to the table when looking to use it in your projects. Things like the ease of sorting and filtering the objects returned by Linq to Sql (or Linq to XML for that matter) using Linq to Objects. There is also the (way cool, IMHO) feature that lets you merge data from two different data sources (e.g. Linq to Sql and Linq to XML) into a single collection of objects or a single object hierarchy. Additional capabilities and functionality of one methodology over another are often overlooked when writing ASP.NET applications ... it's simply easier to look at the raw, single user, single page performance without thinking about the data in the holistic context of the overall application. This is, however, somewhat myopic; you need to keep the overall application context in mind when making technology and architecture decisions. This in mind ... hmmm ... off to do a bit more testing. Not sure if I'll do updates first or Linq sorting and filtering vs. DataViews.
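To make the sorting, filtering and merging point a bit more concrete, here's a minimal sketch. Everything in it is made up for illustration: the Contact class stands in for a class generated by the Linq to Sql designer, the in-memory list stands in for query results, and the XML layout (phone elements with contactId and number attributes) is purely an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for a Linq to Sql generated data class.
public class Contact
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class LinqMergeDemo
{
    // Joins objects with XML, filters on last name, sorts, and projects to strings.
    public static List<string> ContactsWithPhones(IEnumerable<Contact> contacts, XElement phones)
    {
        var query = from c in contacts
                    join p in phones.Elements("phone")
                        on c.Id equals (int)p.Attribute("contactId")
                    where c.LastName.StartsWith("S")
                    orderby c.LastName, c.FirstName
                    select c.FirstName + " " + c.LastName + ": " + (string)p.Attribute("number");
        return query.ToList();
    }

    // Sample data (invented) so the sketch runs without a database.
    public static List<Contact> SampleContacts()
    {
        return new List<Contact>
        {
            new Contact { Id = 1, FirstName = "Maria", LastName = "Sanchez" },
            new Contact { Id = 2, FirstName = "Bob", LastName = "Jones" },
            new Contact { Id = 3, FirstName = "Al", LastName = "Smith" }
        };
    }

    public static XElement SamplePhones()
    {
        return XElement.Parse(
            "<phones>" +
            "<phone contactId='3' number='555-0103' />" +
            "<phone contactId='1' number='555-0101' />" +
            "<phone contactId='2' number='555-0102' />" +
            "</phones>");
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (string line in LinqMergeDemo.ContactsWithPhones(
                     LinqMergeDemo.SampleContacts(), LinqMergeDemo.SamplePhones()))
        {
            Console.WriteLine(line);
        }
    }
}
```

The same query shape should work against real Linq to Sql results once they are in memory; whether you'd want the join to run in memory or on the server is a separate question.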




.NET Stuff | Linq | Performance

Linq vs. ADO.NET - Simple Query

April 16, 2008 12:32 AM

In my last blog post, I took a look at how Linq handles anonymous types. I also promised to do some performance comparisons between Linq and traditional ADO.NET code. Believe it or not, creating a "fair" test is not as easy as one would think, especially when data access is involved. Due to the nature of connection pooling, whichever method is first to be tested gets hit with the cost of creating the connection ... which skews the test. Yeah, I'm sure this is out there in the blogosphere, but I do like to do these things myself. Call it the Not-Invented-Here syndrome.

This particular test set is for a very simple query. I created a set of 4 methods to test for performance within a standard Windows Console Application, which should give an overall comparison of data access. All tests used the AdventureWorks sample database, with the statement (or its Linq equivalent) Select FirstName, LastName From Person.Contact. This is about as simple a query as you can get. From there, each method concatenated the two field results into a single string value.

The Linq test (TestLinq) used an anonymous type going against a data class created with the Data Class designer.

Data Reader Test 1 (TestDataReaderIndex) used the strongly-typed DataReader.GetString(index) ... and I did cheat a little with this one by hardcoding the index rather than looking it up before entering the loop (though this is how I'd do it in the "real world"). In previous tests that I've done, I've found that this gives about 10-20% better performance than DataReader[columnName].ToString() ... though that does include the "lookup" that I mentioned previously.

Data Reader Test 2 (TestDataReaderNoIndex) represents the more common pattern that I've seen out there ... using DataReader[columnName].ToString(). Now, I'm not sure which of these methods Data Binding uses and, honestly, that's not in the test ... though, now that I think of it, it may be a good thing to test as well.

Finally, I included a test for DataSets (TestDataSet) ... using an untyped DataSet. I've found (again, from previous tests) that this performs far better than a typed DataSet ... the typed DataSet gets hit (hard) by the creation/initialization costs.

Before running any tests, I included a method called InitializeConnectionPool, which creates and opens a connection, creates a command with the Sql statement (to cache the access plan), calls ExecuteNonQuery and then exits. This is not included in the results, but is a key part of making sure that the test is as fair as possible. Additionally, all of the tests access the connection string in the same way ... using the application properties. In looking at the code generated by the LinqToSql class, this is how they get the connection string. This ensures that the connection string for all methods is the same, which means that the connection pools will be the same.
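For anyone who hasn't seen the two Data Reader patterns side by side, here's a rough sketch of the difference. This is not the test code: to keep it runnable without AdventureWorks, it reads from an in-memory DataTable (with made-up rows) through DataTable.CreateDataReader() instead of a SqlDataReader, and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderAccessDemo
{
    // In-memory stand-in for Person.Contact; the rows are invented.
    public static DataTable BuildContacts()
    {
        DataTable table = new DataTable("Contact");
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("LastName", typeof(string));
        table.Rows.Add("Ada", "Lovelace");
        table.Rows.Add("Alan", "Turing");
        return table;
    }

    // Pattern 1: typed getter with hardcoded ordinals (no name lookup per row).
    public static List<string> NamesByIndex(IDataReader reader)
    {
        List<string> names = new List<string>();
        while (reader.Read())
        {
            names.Add(reader.GetString(0) + reader.GetString(1));
        }
        return names;
    }

    // Pattern 2: string indexer plus ToString(), resolved by column name on every row.
    public static List<string> NamesByColumnName(IDataReader reader)
    {
        List<string> names = new List<string>();
        while (reader.Read())
        {
            names.Add(reader["FirstName"].ToString() + reader["LastName"].ToString());
        }
        return names;
    }
}

public static class Program
{
    public static void Main()
    {
        DataTable contacts = ReaderAccessDemo.BuildContacts();

        using (IDataReader byIndex = contacts.CreateDataReader())
        {
            foreach (string name in ReaderAccessDemo.NamesByIndex(byIndex))
                Console.WriteLine(name);
        }

        using (IDataReader byName = contacts.CreateDataReader())
        {
            foreach (string name in ReaderAccessDemo.NamesByColumnName(byName))
                Console.WriteLine(name);
        }
    }
}
```

If you don't want to hardcode the ordinals, calling GetOrdinal(columnName) once before the loop gets you the index without paying for the name lookup on every row.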

To actually do the test, I called each method a total of 30 times from the application's Main, each function in the same loop. This would help to eliminate any variances. After running each test, I also called GC.Collect() to eliminate, as much as possible, the cost of garbage collection from the results. I also closed all unnecessary processes and refrained from doing anything else to ensure that all possible CPU and memory resources were allocated to the test. One thing that I've noticed from time to time is that it seems to matter the order in which functions are called, so I made a total of 4 runs, each with a different function first. For each run, I tossed out the min and max values and then averaged the rest -- (total - min - max)/(numCalls - 2). This gave me a "normalized" value that, I hoped, would provide a fair, apples-to-apples comparison. Each method had a set of 4 values, each with 30 calls, 28 of which were actually included in the normalized value. I then took the average of the 4 values. I know that sounds like an overly complex methodology ... and I agree ... but I've seen some weird things go on and some pretty inconsistent results. That said, in looking at the results, there was not a lot of difference between each of the 4 runs, which makes me feel pretty good about the whole thing.
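The normalizing step is simple enough to sketch. This is a minimal version of the (total - min - max)/(numCalls - 2) formula, not the code from the actual test harness; the TimingStats name and the sample timings are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimingStats
{
    // Drops one minimum and one maximum, then averages what is left:
    // (total - min - max) / (numCalls - 2).
    public static double NormalizedAverage(IEnumerable<double> timingsMs)
    {
        List<double> values = timingsMs.ToList();
        if (values.Count < 3)
        {
            throw new ArgumentException("Need at least three timings.", "timingsMs");
        }
        return (values.Sum() - values.Min() - values.Max()) / (values.Count - 2);
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical timings in milliseconds, not measured results.
        double[] run = { 55.0, 57.0, 56.0, 120.0, 54.0 };
        Console.WriteLine(TimingStats.NormalizedAverage(run));
    }
}
```

With 30 calls per run, the same method averages the 28 values left over; the final figure for each test method is then the plain average of the 4 per-run values.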

So ... without further ado ... the results (values are in milliseconds):

Method                  Normalized Average (ms)
TestDataReaderIndex     56.64767857
TestLinq                75.57098214
TestDataSet             117.2503571
TestDataReaderNoIndex   358.751875


Now, I have to say, I was somewhat surprised by the TestDataReaderNoIndex results ... previous tests that I had done didn't show such a big difference between this and TestDataReaderIndex ... though I wonder if that has something to do with the way I did this test - hardcoding the indexes into TestDataReaderIndex. I'm not surprised that TestDataReaderIndex turned out to be the fastest. DataReaders have been, and still are, the absolute fastest way to get data from the database ... that is, if you do it using integer indexes. However, TestLinq didn't come that far behind and was certainly more performant than the untyped DataSet. So ... let's think about this for a second. The Linq collection that is returned is more like a DataSet than it is a DataReader. DataReaders are forward-only, read-only server-side cursors. Use them once and kiss them goodbye. Both the Linq collection and the DataSet allow random access and are re-startable ... and they are both updatable as well. I've had a lot of folks ask about the performance of Linq and now I can, without question and with all confidence, tell them that the performance is quite good.

Still, let's be honest ... the difference between the fastest and the slowest is a mere 300ms. Do you really think users will notice this?

UPDATE: You can download the code and the tests that I used for this at https://code.msdn.microsoft.com/Release/ProjectReleases.aspx?ProjectName=jdotnet&ReleaseId=948. If you get different results, I'd be interested to hear about it. Even more, I'd be interested in the methodology that you used to create the report.




.NET Stuff | Performance | Linq

Linq and Anonymous Types

April 14, 2008 9:20 PM

I've been playing with Linq quite a bit recently. I have to say ... it's some cool stuff and revolutionizes data access on the .Net platform. One of the things in Linq that I'm really fascinated with is anonymous types. These classes are created based on a Linq statement and only have the properties that you specified. They're nicely type-safe and work with IntelliSense. Beauty and goodness.

Now, for a time, I just played with them and used them without much thought about what's going on behind the scenes. But ... my curiosity got the better of me and I decided to dig a bit and see what's going on. And the best way to do this? Lutz Roeder's Reflector of course!

So first ... the code. Not much, pretty simple.

            using (DataClasses1DataContext dc = new DataClasses1DataContext())
            {
                var contactNames = from c in dc.Contacts
                                   select new { c.FirstName, c.LastName };
                foreach(var contactName in contactNames)
                {
                    Console.WriteLine(contactName.FirstName + contactName.LastName);
                }
            }

I could have made it even simpler ... remove the foreach loop. But that lets me know that all's well.

So ... what happens with the anonymous type? It's actually compiled in the assembly. Yup, that's right ... it's a compiled class, just like a class that you create. But there is some black voodoo majik going on and, I'm certain, some significant compiler changes to make this happen.

Here's the raw IL generated for the class (with attributes):

.class private auto ansi sealed beforefieldinit <>f__AnonymousType0`2<<FirstName>j__TPar, <LastName>j__TPar>
    extends [mscorlib]System.Object
{
    .custom instance void [mscorlib]System.Diagnostics.DebuggerDisplayAttribute::.ctor(string) =
         { string('\\{ FirstName = {FirstName}, LastName = {LastName} }') Type=string('<Anonymous Type>') }
    .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor()

And here's the C# version of the IL:

[DebuggerDisplay(@"\{ FirstName = {FirstName}, LastName = {LastName} }", Type="<Anonymous Type>"), CompilerGenerated]
internal sealed class <>f__AnonymousType0<<FirstName>j__TPar, <LastName>j__TPar>

 

If you had any doubt at all, the "CompilerGenerated" attribute pretty much says it all. All of the references to the anonymous type in the code are replaced by this class in the compiled IL. And the return value from the query? It's a generic class:
[mscorlib]System.Collections.Generic.IEnumerable`1<class <>f__AnonymousType0`2<string, string>>.

Pretty cool, eh?
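If you don't have Reflector handy, you can get a rough look at the same thing with plain reflection. This is just a sketch: the AnonymousTypeProbe name is made up, and the exact generated type name depends on the compiler you're using, so the code prints it rather than assuming it.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class AnonymousTypeProbe
{
    // Returns the runtime type the compiler generated for this anonymous object.
    public static Type GetAnonymousType()
    {
        var contactName = new { FirstName = "J", LastName = "Sawyer" };
        return contactName.GetType();
    }

    public static void Main()
    {
        Type type = GetAnonymousType();

        // The generated name is compiler-specific.
        Console.WriteLine("Name: " + type.Name);
        Console.WriteLine("Sealed: " + type.IsSealed);
        Console.WriteLine("CompilerGenerated: " +
            Attribute.IsDefined(type, typeof(CompilerGeneratedAttribute)));

        // One generic argument per property, closed over the property types.
        foreach (Type argument in type.GetGenericArguments())
        {
            Console.WriteLine("Generic argument: " + argument.FullName);
        }
    }
}
```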

Now I'm off to dig into the performance of these beasties when compared to a DataReader and a DataSet. Early results look promising, but I've got some work to do to make sure it's a valid and fair comparison.




.NET Stuff | Linq | Performance

<RANT>About Drivers and Fuel Economy</RANT>

April 7, 2008 3:28 PM

Driving in Houston, for those that have had the joy, is certainly an interesting experience. It's not uncommon to see a dually zipping down the Sam Houston Tollway at 90 MPH, tailgating (and they brake at the last minute when they do this) and weaving in and out of traffic, sometimes with only a foot to spare. Now, while I could just rant a bit on just how, ummm, un-smart this is, I won't. And this is, for me, quite terrifying when I see this as I'm out and about on my motorcycle. Needless to say, any fight between my motorcycle and said dually would result in a somewhat less-than-pretty outcome for me. That said, this type of driver will be something of an example in what follows.

So, here's the deal ... that crazy dually driver is gulping gas at a tremendous rate. Probably getting 5 MPG or so. The vast majority of the vehicles on US highways today are tuned for best highway fuel economy at around 60-70 MPH. Above that, your fuel economy drops off sharply. The exception may be some of the imported German vehicles ... they are designed with Germany's famous Autobahn in mind. And then, every time he slams on his brakes (to tailgate someone and give himself a bad case of road rage), he's wasting the fuel that got him to that speed.

I'm not going to go into that whole green-environmental-global warming argument. No worries.

Let's do a little math here. Let's say he's going 20 miles on the tollway (that's 1/4 the loop). And he averages - not instant speed, but average, including all that braking - 80 MPH. At 80, it will take him 15 minutes. At 65 (the posted speed), it'll take 18 1/2 minutes. At 70 (the more reasonable speed), it's just over 17 minutes. Not that much time saved. But he's not going to average 90 -- because of constant acceleration and deceleration. He probably won't even average 80. And his fuel economy will be much worse. Let's pretend he drops from 15 MPG at 65 to 10 MPG (and that lower end is generous ... it's likely lower, with all that weight to keep going). At 15 MPG, he'll use about 1.3 gallons. At 10, it's 2.0 gallons. At today's average gas price in Houston ($3.23/gallon), that's an extra $2.26. It doesn't seem like much, but that's just one trip. Driving the same way back, it's $4.50. If this is a regular trip to and from work, that's $22.50/week. At 50 weeks/year, we're at $1125. Now, for me, that's enough to pay for my insurance. His may be higher, depending on how many wrecks he's had and how many tickets. Oh, and getting busted at that speed will not only get you a ticket (for $205), it may well get you locked up for reckless driving (20 MPH over the speed limit). Finally, it's also an additional 1.4 gallons of gas per day. That's an extra 350 gallons/year. And this is a conservative estimate ... I'd bet that his average fuel economy for this trip is under 10 MPG (all that braking and heavy acceleration). Keep in mind that this is just this guy's daily commute.
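If you want to plug in your own commute, the math above boils down to a few lines. This is a quick sketch using the same assumptions (20 miles each way, 15 MPG vs. 10 MPG, $3.23/gallon, 5 days a week, 50 weeks a year); the CommuteMath name is made up. It doesn't round the intermediate values, so its totals come out a little lower than the ones above, which round 1.33 gallons down to 1.3.

```csharp
using System;

public static class CommuteMath
{
    // Travel time in minutes for a distance at a constant average speed.
    public static double Minutes(double miles, double mph)
    {
        return miles / mph * 60.0;
    }

    // Extra fuel burned over the same distance at the worse fuel economy.
    public static double ExtraGallons(double miles, double mpgEasy, double mpgHard)
    {
        return miles / mpgHard - miles / mpgEasy;
    }

    public static void Main()
    {
        const double miles = 20.0;           // one-way trip
        const double pricePerGallon = 3.23;  // price quoted in the post
        const int tripsPerYear = 2 * 5 * 50; // there and back, 5 days, 50 weeks

        foreach (double mph in new[] { 80.0, 70.0, 65.0 })
        {
            Console.WriteLine("{0} MPH: {1:F1} minutes", mph, Minutes(miles, mph));
        }

        double extraPerTrip = ExtraGallons(miles, 15.0, 10.0);
        Console.WriteLine("Extra gas per trip: {0:F2} gallons", extraPerTrip);
        Console.WriteLine("Extra cost per trip: {0:F2} dollars", extraPerTrip * pricePerGallon);
        Console.WriteLine("Extra gas per year: {0:F0} gallons", extraPerTrip * tripsPerYear);
        Console.WriteLine("Extra cost per year: {0:F0} dollars",
            extraPerTrip * pricePerGallon * tripsPerYear);
    }
}
```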

How much do you want to bet that this'll be the same person that complains the loudest about gas prices?

Even with a typical driver, if you increase your fuel economy by 20%, you're looking at some savings. Assuming you drive 12000 miles per year (about average) and currently get 20 MPG, you would save 100 gallons by increasing your fuel economy by 20%. And this is still less than the US Standard for fuel economy. And, judging from what I've seen drivers doing, I'd bet that we could realistically get better savings.

Looking at the gallons saved, it looks like a drop in the bucket, right? But once you start adding it up for all of the drivers on the highway, you are getting into some serious gallons. Now, before you start going off that I'm getting into an enviro-rant, I'm not. It's a question of national security and the country's overall economy. Between the cost of oil, the issues related to securing oil resources in an ever-more-competitive oil market, and the US's need to import massive quantities of oil to quench our thirst for driving like maniacs, we have been put in something of a bad situation. We are completely dependent on other countries for our very economic engine. One of these is Venezuela, which is interesting considering how much Chavez rails against us. And, of course, the Middle East. So much of our foreign policy today relies on ensuring our stream of incoming oil that it often hampers what we can realistically accomplish ... diplomats and policy makers have to keep this in mind. I was young, but I do remember the OPEC oil embargo of the 70's and have read about what it did to our overall economy (it was a Very Bad Thing™). Getting everyone in the country on board is impossible, of course. But a few individuals here and there can add up pretty quickly. There's an estimated 143 million cars in the US ... if just 2% of them improved their fuel economy by 20% using the MPG assumption above, we're at 286 million gallons/year. And there are additional efficiencies and savings that can also be brought into play to multiply this effect.
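Here's that back-of-the-envelope calculation as a small sketch, in case you want to try different numbers. It uses the same assumptions as above (12000 miles/year, 20 MPG, a 20% improvement, 143 million cars, 2% of drivers); the FleetSavings name is invented for the example.

```csharp
using System;

public static class FleetSavings
{
    // Gallons one driver saves per year by improving fuel economy by the given fraction.
    public static double GallonsSavedPerCar(double milesPerYear, double mpg, double improvement)
    {
        double before = milesPerYear / mpg;
        double after = milesPerYear / (mpg * (1.0 + improvement));
        return before - after;
    }

    // Total gallons saved when a share of all cars makes the same improvement.
    public static double GallonsSavedByFleet(double cars, double share, double savedPerCar)
    {
        return cars * share * savedPerCar;
    }

    public static void Main()
    {
        double perCar = GallonsSavedPerCar(12000.0, 20.0, 0.20);
        double fleet = GallonsSavedByFleet(143000000.0, 0.02, perCar);

        Console.WriteLine("Per car: {0:F0} gallons/year", perCar);
        Console.WriteLine("2% of the fleet: {0:F0} million gallons/year", fleet / 1000000.0);
    }
}
```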

How to improve mileage? Drive a little slower. You'll be able to get a good idea of where your best efficiency is ... your tach will tell you. Lower RPM's is better. If you need to stay higher in the RPM's to keep your speed constant, you're wasting gas. Accelerate smoothly and don't floor it. Don't wait until the last minute to brake ... just lay off the gas and coast a bit to bleed speed off. This is actually where you get the best mileage (especially at higher speeds) because your engine will be just above an idle ... and that helps your overall average quite a bit. For example, when using a ramp between highways (say the Tollway to I-10), I see many folks keep their speed up while going up the ramp and braking at the last minute. Rather than doing this, let your speed bleed off gradually ... use gravity and the rolling friction of your tires rather than your brakes. This actually holds for any off-ramp. To do this, make sure you are looking ahead and planning what you're going to do. This is actually a good safety tip and one they drill into you in the motorcycle training course ... and it's made me a better driver in my truck as well. If you're sitting and waiting for someone, just cut off the engine. Sitting still is 0 MPG, so it's a waste. (Don't do this when you are stuck on I-10 during rush hour, though). And try using the cruise control ... it uses just enough gas to keep a constant speed. That task is a little tough for us humans. Finally, track your fuel economy. Set your trip when you fill up and just do a little math at every fill up. Set a target and think about how you drive to get there.

Some things you can't help that will hurt ... rush hour traffic, for example. That's pretty obvious. The terrain you are driving over also has impact ... hilly vs. flat.  Again, this is obvious. Wind plays a huge factor as well, both crosswinds and head winds. I've driven to San Antonio on 30-40 MPH crosswinds and it used much more gas just to stay at 65 MPH. Tailwinds, of course, have a positive effect.

Before I get going, and before you get the urge to try to accuse me of not practicing what I rant, let me get that out of the way right now. I do drive this way. And I also drive a hybrid vehicle. So, I do practice good fuel economy when I'm in my "cage" (that's a biker term for anything with four wheels and is, or can be, enclosed ... convertibles do count). No, it's not a Prius or anything like that ... it's an '07 Ford Escape. It's an SUV. With that vehicle, I typically average 30-32 MPG -- it has a readout on fuel economy average, as well as instant economy, so I am pretty aware of how my gas consumption is currently going. It also has a continuously variable transmission, which helps keep my engine RPMs at an ideal rate. With a traditional automatic transmission, you'll have to find this. The engine also cuts out when it's going slow (like rush hour traffic) and that helps a lot. In the worst rush hour traffic, I'll be on the electric motor 80% of the time ... the engine just jumps in to recharge the battery. This technology is continuing to improve and becoming more and more common.

My other vehicle is an '07 Kawasaki Ninja 650R. With this one, I typically average just over 40 MPG. I could get it up to around 50 (and I have) depending on how I choose my gear, especially on highway riding, but there's a reason for it. When on the highway, I try to keep my RPM's in the lower edge of the power band. So ... at 70-80 MPH, I'm in 4th or 5th gear, with the RPM's around 6-7000. If I pop it into 6th, I'd be at around 4-5000 RPMs, which is better for fuel economy. But then, on the highway, I want to have the power and acceleration to escape aforementioned dually. And at the lower RPM's, the acceleration is much lower. In 4th or 5th gear, I can get from 75 to 100 MPH in just about 2 seconds. In 6th gear, it takes about 5 seconds. I like having those couple of extra seconds to escape the crazy drivers ... it's a safety thing. Staying at that RPM also allows me to slow down quicker ... letting off the throttle at that RPM range increases the effect of engine braking. That safety thing, however, is not nearly as much of a concern in a cage. And most of the cars out there can't accelerate fast enough to make much of a difference anyway.

Now ... if you don't drive in a way that helps increase your fuel efficiency, I don't want to hear you complaining about the cost of gas.




Idle Babbling