While I Compile

… I compile my thoughts about programming

The UI programmers (not so) secret weapon

An Example:

Suppose you had software which matches buyers and sellers, and new users are created via a ‘new user’ wizard[1].  Let’s say the wizard has 4 pages for Basic User Info, Review, Processing, and Completion Status Report.  And there are 5 buttons; Cancel, Previous, Next, Run, and Finish.

Cancel is displayed from the Basic User Info, Review, and Processing pages.
Previous is displayed from the Review page.
Next is displayed from the Basic User Info page.
Run is displayed from the Review page.
Finish is displayed from the Completion Status Report page.

This isn’t difficult to manage the display from the events[2] right?

Well … in reality, it doesn’t take much of a change for your simple display functionality to become …

The Problem

Manipulating application display during events quickly turns into a complex, bug riddled, difficult to maintain, mess.

It may not seem like that big of a deal when you have only a few controls and a very simple (or no) workflow.  Actually, you may argue, managing visual display in the events may even seem like the most efficient strategy.  After all, you will never display, disable, or otherwise needlessly manipulate a control, but as your application grows more complex, this approach can leave you tearing your hair out[3].  This is especially true when you have multiple execution paths leading to the same event handlers with different state.

What if you received a change request for the above requirements, where additional info is required for buyers or sellers, resulting in a new wizard page for each, and in addition to that, business sellers require a page to gather Tax Information?

Your complexity has started to grow, resulting in buyers have a workflow of
Basic User Info -> Buyer Info -> Review -> Processing -> Completion Status Report

Individual sellers have a workflow of
Basic User Info -> Individual Seller Info -> Review -> Processing -> Completion Status Report

And commercial sellers have a workflow of
Basic User Info -> Commercial Seller Info -> Tax Info -> Review -> Processing -> Completion Status Report

Notice how these changes lead to different execution paths all arriving at the Review page.  The question then, is; where does the back button on the Review page send the user?  The back button requires logic to know which wizard page to display.

It’s not difficult to imagine the need for logic to be added in more than one place[4], which can result in your display being processed differently from every event.  Not only does this lead to repeated code, but can also create bugs where buttons do/don’t display where they should.

Eventually, as this logic grows increasingly complex, simple, 5 minute, change requests will eventually take you hours to determine there won’t be any side effects.

At this point, developers usually find …

Inadequate Solutions

So how do people manage this?  I mean, beyond having the logic at the control level.

Well, you can put controls for different states on their own panel for their states.  So in the example above, you could add the buttons directly onto the wizard page control[5].  This would ensure the correct buttons, for the panel, are displayed, but it doesn’t solve the ‘Back’ button logic of which page to display.

Or you could add some kind of control grouping mechanism, like a collection of the controls, to display or hide all controls at once.  The problem with this, is what happens when you have functionality which will enable/disable a specific control, or do something else?  Do you add another grouping or another function to manipulate your control group, or do you have that one function do enabling/disabling as well?  What happens if the enabling/disabling logic is different from the display logic?  You also have the problem of one control being in multiple groups; will your groupings need to be called in the order of hide, then show, just to get the control that’s in both groups to display?  And if you do, will you let the control flicker? Or will you add extra functionality to prevent specific controls from hiding in the first place, so it won’t flicker?

Or maybe you can make a centralized function to manage the controls where the display is based on the passed in parameters.  Well this is very close, but no cigar.  Not only are the parameters needless, as I’ll show later, but this won’t completely remove logic from the events and can needlessly increase logic in that function.

The solution most have preferred is to make screens as minimal as possible, with a new screen for each piece of functionality.  This is often effective, at the cost of usability, but sometimes has its own problems like managing state while gathering data from multiple screens to persist it all in one shot.  Not even to mention that this strategy would be ineffective in our example.  Besides, in my experience, you can never really get away from at least some display manipulation.

This is a small example; imagine having a screen with 50 controls, where many of them change state.  Imagine how difficult it would be to manage the display this way.

Hopefully you can see how this can turn into a major pain in the butt.

Well, the good news is this problem has been solved …. Solved a long time ago actually[6].

The solution is ….

The UpdateUI() Method

It’s pretty simple really; just create a single function for the entire page, view, form, or screen (whatever it’s called in your technology).  All this function does, is manage the control display.  Then call this function at the end of every event, after all data manipulation.

It’s pretty simple; there are just …

5 Rules for the UpdateUI() method

1.     Fast – Don’t do anything which isn’t fast.  The state data driving the display decisions should already be calculated.

2.     No parameters – You should be able to make all decisions about display using the current screens state.

3.     No side effects – Don’t manipulate data within this method.  Not only can it slow it down, but it can also lead to inconsistent display evaluations.

4.     Manage each control individually – Don’t hide all controls, then show the controls you want displayed.  This causes flicker and adds unnecessary complexity.

5.     Call it last – call this function at the end of every event.

Code Sample

Here’s a little sample of what the wizard UpdateUI() code might look like.

public void UpdateUI()
{
 basicUserPg.Visible = (CurrentPage == NewUserWizardPage.BasicUserInfo);
 buyerPg.Visible = (CurrentPage == NewUserWizardPage.BuyerInfo);
 individualSellerPg.Visible = (CurrentPage == NewUserWizardPage.IndividualSellerInfo);
 commercialSellerPg.Visible = (CurrentPage == NewUserWizardPage.CommercialSellerInfo);
 commercialSellerTaxPg.Visible = (CurrentPage == NewUserWizardPage.CommercialSellerTaxInfo);
 reviewPg.Visible = (CurrentPage == NewUserWizardPage.Review);
 processingPg.Visible = (CurrentPage == NewUserWizardPage.Processing);
 compeletedPg.Visible = (CurrentPage == NewUserWizardPage.CompletionStatusReport);

 cancelButton.Visible = (CurrentPage == NewUserWizardPage.BasicUserInfo
                     || CurrentPage == NewUserWizardPage.BuyerInfo
                     || CurrentPage == NewUserWizardPage.IndividualSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerTaxInfo
                     || CurrentPage == NewUserWizardPage.Review);
 previousButton.Visible = (CurrentPage == NewUserWizardPage.BuyerInfo
                     || CurrentPage == NewUserWizardPage.IndividualSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerTaxInfo
                     || CurrentPage == NewUserWizardPage.Review);
 nextButton.Visible = (CurrentPage == NewUserWizardPage.BasicUserInfo
                     || CurrentPage == NewUserWizardPage.BuyerInfo
                     || CurrentPage == NewUserWizardPage.IndividualSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerTaxInfo);
 runButton.Visible = (CurrentPage == NewUserWizardPage.Review);
 finishButton.Visible = (CurrentPage == NewUserWizardPage.CompletionStatusReport);
}

This is quite a bit simpler than having the code sprinkled throughout the screen’s logic.

You may also want to move many expressions into their own protected properties. For example, you may want to change …

 cancelButton.Visible = (CurrentPage == NewUserWizardPage.BasicUserInfo
                     || CurrentPage == NewUserWizardPage.BuyerInfo
                     || CurrentPage == NewUserWizardPage.IndividualSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerInfo
                     || CurrentPage == NewUserWizardPage.CommercialSellerTaxInfo
                     || CurrentPage == NewUserWizardPage.Review);

… to …

cancelButton.Visible = DisplayCancelButton;

Unfortunately, if your screen has a lot of controls, you may end up with a lot of these properties.  Because the state used to make the display decisions is usually in that screens class, it’s difficult to move these expressions into a separate class.  Or at least it is difficult to make a generalized statement about how to do it.

Where to use

This technique can be used in every GUI language or technology I’ve ever worked with.  I’ve used this strategy in Classic VB, C++ (MFC[7], Win32), Classic ASP, PHP, JavaScript, ASP.NET webforms[8] , etc…

This technique can significantly reduce Ajax initiated display adjustments.

Costs

The cost to managing your display this way is a slight increase in computing power required to calculate every display decision on every event.  However, in reality, if your UpdateUI() function causes a performance issue that’s noticeable, you’ve probably missed one of the rules stated above.

In my opinion, the increased simplicity, and cleanliness of you code, more than makes up for the slight increase in processing.

Conclusion

The UpdateUI() method is an efficient and effective way to keep your display manipulation code clean & bug free.  Even if your screen is relatively simple, you still benefit from having all your code in one place, and if it does grow in complexity, it will be manageable.  This technique has saved me countless hours of frustration.

Copyright © John MacIntyre 2010, All rights reserved


[1] These examples are really difficult to come up with, so let’s just go with this.  In reality many screens will be much more complex as well.

[2] I don’t necessarily mean ‘code behind your buttons’, but any logic executed specifically for this button, as a result of clicking it.

[3] For proof, see my avatar.  😉

[4] As if one place isn’t enough

[5] Wizard page controls are on their own panel of course, I mean we’ve all definitely learned that haven’t we?

[6] MFC had UpdateUI() as a virtual method on its main window class in the mid-1990s.  Fortunately, this was one of the only new paradigms Microsoft has ever added which did not undo the solution I came up with in the last paradigm

[7] The name actually comes from the CView::UpdateUI() method in MFC which was a core method for all UI related activity, before MFC I was calling it EnableCtrls().

[8] In ASP.NET WebForms, this function is called Page.PreRender.  However, this solution isn’t the greatest, since any control with an event needs to be populated in Page.OnInit just to wire up events properly.

Thanks to Cory Fowler for reviewing my post.

Advertisements

October 3, 2010 Posted by | C#, Code, Programming, Uncategorized | , | 2 Comments

What is too simple and small to refactor? (Clean Code Experience No. 2)

Introduction

Shortly after reading Robert C Martin‘s Clean Code, I refactored the data access layer from a project I was working on, and was amazed by how much the code improved. It really was night and day. My first clean code refactoring experience was an obvious improvement.

I was still on that clean code high, when a little function entered my life that I was compelled to refactor. This one left me questioning the limits of what I should refactor and if my refactor even qualified as clean.

I’d like to share that second experience with you in this post.

Original Code

Here is the original code. The code was written by and owned by Jim Holmes. Jim was kind enough to give me permission to use it in this post, as I was unable to find an equivalent example. Thanks Jim.

public static float WageCalculator( float hours,
                                    float rate,
                                    bool isHourlyWorker)
{
    if (hours < 0 || hours > 80)
    {
        throw new ArgumentException();
    }
    float wages = 0;

    if (hours > 40)
    {
        var overTimeHours = hours - 40;
        if (isHourlyWorker)
        {
            wages += (overTimeHours * 1.5f) * rate;
        }
        else
        {
            wages += overTimeHours * rate;
        }
        hours -= overTimeHours;
    }
    wages += hours * rate;

    return wages;
}

So as you can see, this simple method calculates a workers weekly pay based on hours worked, their hourly rate, and if they receive overtime or straight pay. There really isn’t much to it, is there?

So why was I compelled to refactor such a simple piece of code?

As soon as I opened this function, I felt it was doing too much. Mostly it was the isHourlyWorker parameter.

Since reading Clean Code, I’ve come to realize boolean and enum parameter types are a huge tell that they should be refactored into separate classes.*

My refactored code

So what did my refactored code look like after spending 30 minutes or so playing with it?

Well, here’s the new class diagram first, so you get some idea what you’re looking at.
Class diagram of refactor results

And here’s the code

public abstract class WageCalculatorBase
{
    public float HoursWorked { get; protected set; }
    public float HourlyRate { get; protected set; }

    public WageCalculatorBase(float hours, float hourlyRate)
    {
        if (hours < 0 || hours > 80)
            throw new ArgumentOutOfRangeException("Hours must be between 0 and 80.");

        HoursWorked = hours;
        HourlyRate = hourlyRate;
    }

    public abstract float Calculate();
}
public class WageCalculatorForEmployee : WageCalculatorBase
{
    public WageCalculatorForEmployee(float hours, float hourlyRate)
        : base(hours, hourlyRate)
    {
    }

    public override float Calculate()
    {
        if (IsOvertimeRequired)
            return CalculateWithOvertime();
        return CalculateWithoutOvertime();
    }

    protected bool IsOvertimeRequired
    {
        get
        {
            return HoursWorked > 40;
        }
    }

    protected float CalculateWithoutOvertime()
    {
        return HoursWorked * HourlyRate;
    }

    protected float CalculateWithOvertime()
    {
        float overTimeHours = HoursWorked - 40;
        return (overTimeHours * 1.5f + 40) * HourlyRate;
    }

    public static float Calculate(float hours, float hourlyRate)
    {
        WageCalculatorForEmployee payCalc = new WageCalculatorForEmployee(hours, hourlyRate);
        return payCalc.Calculate();
    }
}
public class WageCalculatorForContractor : WageCalculatorBase
{
    public WageCalculatorForContractor(float hours, float hourlyRate)
        : base(hours, hourlyRate)
    {
    }

    public override float Calculate()
    {
        return HoursWorked * HourlyRate;
    }

    public static float Calculate(float hours, float hourlyRate)
    {
        WageCalculatorForContractor payCalc = new WageCalculatorForContractor(hours, hourlyRate);
        return payCalc.Calculate();
    }
}

And the code to execute this would be as simple as

weeklyPay = WageCalculatorForEmployee.Calculate(hoursWorked, hourlyRate);

or

weeklyPay = WageCalculatorForContractor.Calculate(hoursWorked, hourlyRate);

Or as flexible as

WageCalculatorBase wageCalculator = WageCalculatorFactory.Get(employeeInfo, hoursWorked, hourlyRate);
weeklyPay = wageCalculator.Calculate();

So as I said earlier, it was the isHourlyWorker parameter which compelled me to refactor. Notice how this parameter no longer exists, and has been replaced by a class for each of the potential values. isHourlyWorker has become the WageCalculatorForEmployee class, and !isHourlyWorker has become the WageCalculatorForContractor class.

Now one of the first questions you may have about the refactor is why didn’t I implement the Calculate method in the WageCalculatorBase class, instead of declaring it as abstract? Especially since both derived classes have identical methods, namely the CalculateWithoutOvertime method in the WageCalculatorForEmployee class, and the Calculate method in the WageCalculatorForContractor class.

What the hell happened to DRY?

I considered doing that, but decided that the Calculate method is such an important and critical piece of functionality, that implementation should not be left to chance. I felt an explicit implementation should be required.

And while we’re on the topic of the CalculateWithoutOvertime() method, you may be asking yourself, why this method even exists? I mean couldn’t CalculateWithoutOvertime(), CalculateWithOvertime(), and the IsOvertimeRequired property have been easily implemented in the Calculate() method?

Yes, as a matter of fact, I could have done it in a single expression, but felt it might be complex enough to warrant a comment, so I added the commenting into the structure, and kept the complexity way down.

Observations

Something else you may notice is the lack of control statements and flow branching. Notice, there are barely any ‘if’s. You may also notice the methods are very small, with the longest method being only 4 LOC, and most are 2.

You may also notice the increase in structural complexity, as noticed in Clean Code Experience #1.

But the real kicker to this refactor is that the original 26 lines of code became 74 when the refactoring was completed (including whitespace and everything). That’s a near 300% LOC increase!

This r-e-a-l-l-y bothered me, and left me perplexed as to if my refactor was foolish or wise.

Good idea?

So was it a good idea to refactor it? Is this clean code? Or was I just over stimulating myself with the refactoring? Was I partaking in refactoring masturbation?

I mean really … this wasn’t a complex function. There was nothing wrong with the function and the amount of code actually increased.

It took me a long time to wrap my head around this, but I finally decided this was good code. This was clean code. This was appropriate if creating code from scratch. This might be appropriate if working on client code.

.. Wait! .. what?

What do I mean, might be appropriate?

Would I do it in real life?

There is an ROI on business software. Unfortunately, software is not merely our art and our craft. Software is an investment. Software is a financial investment. More specifically, it’s not even our financial investment, it’s our employers.

So should you spend 30 minutes refactoring the above code? Is there an ROI worth it?
As much as I want to do that refactor, and believe later maintenance will benefit from it, I seriously question if the ROI would warrant it. I don’t think I would refactor this on its own.

…. Unless I was working on that code as part of my current task. If I’m already working on that code, then yes, by all means, it’s appropriate. It’s appropriate as part of my current task and as an act of craftsmanship to leave the code cleaner than I found it.

If it was my project, would I do it?

Yes, without a doubt I would. But that’s because I see software that I write for myself as more of a vehicle of artistic expression than a financial investment.**

What do you think?

I’d be interested in hearing your thoughts.

Is this clean code?

Is this good design?

Should you refactor code that is this small?

When should you refactor something like this?

What are the guidelines as to when it’s worth it to refactor or not?

* I don’t remember if he said this in the book or if this was my own insight. I don’t’ even know if Uncle Bob would agree with that statement
** My inability to view my own software as a financial investment is also the reason I’ve never released anything awesome and made a ton of money.

Copyright © John MacIntyre 2010, All rights reserved

September 14, 2010 Posted by | C#, Code, Programming | , | 14 Comments

Procedure Like Object Oriented Programming

In a previous post What’s wrong with the Nouns/Adjective/Verb object oriented design strategy, I talked about how verbs should be implemented in their own separate class instead of as a method strapped onto an entity class. In my opinion, it’s an appropriate way to work with processes and pass those processes around, while keeping code flexible, testable, and highly maintainable.

But it has led to comments on Twitter and a link to one of Steve Yegge’s post Execution in the Kingdom of Nouns. Basically, Steve said that turning verbs into nouns was a bad idea (at least that’s what I think he was getting at, there were a lot of metaphors in there :-).

It’s easy to see Yegge’s point of view, if you just leave it at that. After all turning your single line of code accessing those actions

commentData.Insert(cn);

into multiple lines of calling code, when you move the logic into its own class,

using (CommentInsertCommand insCmd = new CommentInsertCommand(cn))
{
    insCmd.Execute(commentData);
}

definitely sucks.

So why not add a static method to the process class so you can access it with a single, procedural like, call?

public class CommentInsertCommand : IDisposable
{
    ....

    public static void Execute(CommentData commentData, int userId, SqlConnection cn)
    {
        ValidateCommentParameter(commentData);
        using (CommentInsertCommand insCmd = new CommentInsertCommand(cn))
        {
            commentData.CommentId = insCmd.Execute(commentData, userId);
        }
    }

    protected static void ValidateCommentParameter(CommentData commentData) {...}
}

This way your call is reduced to

CommentInsertCommand.Execute(data, cn);

I think this has merit and is a clean way to manage your classes.

It brings your object oriented code back to a more procedural level.

One problem I haven’t quite figured out yet is the naming. To be honest, I’m a little uneasy about it. I should probably name it ‘Insert’, but that’s redundant with the class name and I’m not crazy about naming it ‘Execute’ or ‘Run’ either. I chose ‘Execute’ in this example so all XXXXCommand classes would be consistent across the application, and the name is consistent with the SqlCommand naming which is important since this class kind of emulates SqlCommand. However, I’d still love to find a better name.

So, the bottom line is this; why not give all your process classes a procedure like entry point?

Why not give more of our object oriented code a procedural language feel?

Copyright © John MacIntyre 2010, All rights reserved

September 10, 2010 Posted by | C#, Code, Programming | 3 Comments

My Clean Code Experience No. 1 (with before and after code examples)

Public Code Review
Robert C. Martin was kind enough to review the code in this post at on his new blog Clean Coder. Be sure to read his review when you finish reading this post.

Introduction


Clean Code

After expressing an interest in reading Robert C Martin‘s books, one of my Twitter followers was kind enough to give me a copy of Uncle Bob’s book Clean Code as a gift*. This post is about my first refactoring experience after reading it and the code resulting from my first Clean Code refactor.

Sample code

The code used in this post is based on the data access layer (DAL) used in a side project I’m currently working on. Specifically, my sample project is based on a refactor on the DAL classes for comment data. The CommentData class and surrounding code was simplified for the example, in order to focus on the DAL’s refactoring, rather than the comment functionality. Of course; the comment class could be anything.

Download the my clean code refactor sample project (VS2008)

Please notice:
1. The database can be generated from the script in the SQL folder
2. This code will probably make the most sense if you step through it
3. This blog post is about 1,700 words, so if you aren’t into reading, you will still get the jist of what I’m saying just from examining the source code.

What Clean Code isn’t about

Before starting, I want to point out that Clean Code is not about formatting style. While we all have our curly brace positioning preferences, it really is irrelevant. Clean Code strikes at a much deeper level, and although your ‘style’ will be affected tremendously, you won’t find much about formatting style.

My original code

Original "dirty" comment DAL class

Original comment DAL class

My original comment DAL class is in the folder called Dirty.Dal, and contains one file called CommentDal.cs containing the CommentDal class. This class is very typical of how I wrote code before reading this book**.

The original CommentDal class is 295 lines of code all together and has a handful of well named methods. Now, 295 lines of code is hardly awful, it doesn’t seem very complex relatively speaking, and really, we’ve all seen (and coded) worse. Although the class interface does seem pretty simple, the simplicity of its class diagram hides its code complexity.

public static void Create(IEnumerable<CommentData> Comments, SqlConnection cn)
{
    // validate params
    if (null == cn) throw new ArgumentNullException("cn");
    if (cn.State != ConnectionState.Open) throw new ArgumentException("Invalid parameter: connection is not open.", "cn");
    if (null == Comments) throw new ArgumentNullException("Comments");
    foreach (CommentData data in Comments)
    {
        if (data.CommentId.HasValue)
            throw new ArgumentNullException("Create is only for saving new data.  Call save for existing data.", "data");
    }

    // prepare command
    using (SqlCommand cmd = cn.CreateCommand())
    {
        cmd.CommandText = "ins_comment";
        cmd.CommandType = CommandType.StoredProcedure;
        // add parameters
        SqlParameter param = cmd.Parameters.Add("@comment_id", SqlDbType.Int);
        param.Direction = ParameterDirection.Output;
        cmd.Parameters.Add("@comment", SqlDbType.NVarChar, 50);
        cmd.Parameters.Add("@commentor_id", SqlDbType.Int);

        // prepare and execute
        cmd.Prepare();

        // update each item
        foreach (CommentData data in Comments)
        {
            try
            {
                // set parameter
                cmd.Parameters["@comment"].SetFromNullOrEmptyString(data.Comment);
                cmd.Parameters["@commentor_id"].SetFromNullableValue(data.CommentorId);

                // save it
                cmd.ExecuteNonQuery();

                // update the new comment id
                data.CommentId = Convert.ToInt32( cmd.Parameters["@comment_id"].Value);
            }
            catch (Exception ex)
            {
                string msg = string.Format("Error creating Comment '{0}'", data);
                throw new Exception(msg, ex);
            }
        }
    }
}

This method can be simplified dramatically into a more readable style with fewer control statements.

But first, notice how the methods are segmented into line groupings which are similar, with each grouping isolated with a single line of white space both before & after it, plus a comment to prefix most groupings. Each of these groupings is a smell, indicating each should be its own method.

Before reading Clean Code, this was clean to me … this was beautiful code to me.

My new ‘clean’ code

The comment DAL classes after refactoring

The comment DAL classes after refactoring

I’ve got a feeling I missed a lot in this book and will probably end up rereading it several times, but the biggest takeaways from reading it in my first pass were:

Smaller well named classes & methods are easier to maintain and read. You may notice in the Clean.Dal directory, the classes are smaller, with file sizes hovering around the 50 LOC mark. 50 LOC for an entire class, when in the past, only my smallest methods would be less than 50 LOC. I’ve now realized; no code grouping is too small to separate into its own property, method, or even class***. Sometimes it’s wise to refactor a single expression into a property just to label it****.

Here is the equivalent of my new Create method:

public static void Execute(IEnumerable<CommentData> comments, int userId, SqlConnection cn)
{
    ThrowExceptionIfExecuteMethodCommentsParameterIsInvalid(comments);
    using (CommentInsertCommand insCmd = new CommentInsertCommand(cn))
    {
        foreach (CommentData data in comments)
            data.CommentId = insCmd.Execute(data, userId);
    }
}

From the above code, you may notice not only how much smaller and simpler the ‘Create’ method has become, but also that its functionality has been moved from a method to its own smaller class. The smaller class is focused on its single task of creating a comment in the database and is therefore not only easier to maintain, but will only require maintaining when a very specific change in functionality is requested, which reduces the risk of introducing bugs.

The small class / property / method idea extends to moving multi-line code blocks following control statements into their own methods.

For example:

while(SomethingIsTrue())
{
    blah1();
    blah2();
    blah3();
}

Is better written as

while (SomethingIsTrue())
    BlahBlahBlah():

With the ‘while’s block moved into its own BlahBlahBlah() method. It almost makes you wonder if having braces follow a control statement is a code smell, doesn’t it? *****

Also, as part of the small well named methods idea, detailed function names make comments redundant & obsolete. I’ve come to recognize most comments are a code smell. Check this out; my colleague Simon Taylor reviewed my code while I was writing this, pointed out that although my dynamic SQL was safe, colleagues following me may not see the distinction of what was safe, and may add user entered input into the dynamic SQL. He suggested a comment for clarification.

He was absolutely right, but instead of adding a comment, I separated it into its own method, which I believe makes things very clear. See below:

protected override string SqlStatement
{
    get
    {
        return GenerateSqlStatementFromHardCodedValuesAndSafeDataTypes();
    }
}

protected string GenerateSqlStatementFromHardCodedValuesAndSafeDataTypes()
{
    StringBuilder sb = new StringBuilder(1024);
    sb.AppendFormat(@"select	comment_id, 
                                    comment, 
                                    commentor_id
                        from		{0} ",
                    TableName);
    sb.AppendFormat("where      Comment_id={0} ", Filter.Value);
    return sb.ToString();
}

Not only is this less likely to go stale, it will also clearly identify exactly what is going on both at the method declaration and everywhere it is called.

Moving control flow to the polymorphic structure is another technique to achieve clean code. Notice the ‘if’s in the ‘Clean.Dal’ version are pretty much reserved for parameter validation. I’ve come to recognize ‘if’s, especially when they deal with a passed in Boolean or Enum typed method parameters as a very distinct code smell, which suggests a derived class may be more appropriate.

Reusable base classes are also a valuable by-product of writing clean code. A reusable class library is not the goal in itself, in an anti-YAGNI kind of way, but is instead a natural side effect of organizing your code properly.

Clean code is also very DRY. There is very little if any duplicated code.

The structure is 100% based on the working, existing, code, and not on some perceived structure based on real world domain is-a relationships which don’t always match up perfectly.

One more thing, unfortunately not everything is going to fit nicely into the object oriented structure, and sometimes a helper class or extension method will be the best alternative. See the CommentDataReaderHelper class as an example.

A closer look

Here’s a quick overview of class hierarchy, we’ll use CommentUpdateCommand as an example:

New CommentUpdateCommand class

New CommentUpdateCommand class and its inheritance hierarchy.

One of the first things you may notice about this class, is there only one entry point; the static method Execute(). This provides a very simple and obvious way to use the class. You can place a breakpoint in this method and step through the class and its hierarchy to fully understand it.

The class hierarchy is based completely on existing working code and was designed to share functionality effectively without duplicating code. Each class is abstracted to the point it needs to be, and no more, yet, the duties of each class in the hierarchy is crystal clear as if it was designed from the beginning to look like this.

Here are the classes,
The AppCommandBase class manages an SqlCommand object.
The ProcCommandBase class executes a given stored procedure.
The CommentCommandBase class shares functionality specific to the Comment table
The CommentUpdateCommand class implements functionality specific to the comment update stored procedure and hosts the static entry point method ‘Execute’.

The cost of writing Clean Code

When I started writing this, I didn’t know there were any downsides to writing clean code, but once I began this post, and started gathering evidence to prove how awesome the clean code strategy was, some evidence ruled against my initial euphoria. Here is a list of observations which could be looked upon unfavorably.

Increased LOC – This was really pronounced in a later experience (Clean Code Experience No. 2 coming soon), but I actually thought the LOC was decreased in this sample. At least once you take into account the reusable base classes … but it wasn’t. While I haven’t done an in depth analysis of the LOC, it appears that the code specific to the comments, before taking into account any abstract base classes, has a similar LOC count as the Dirty.Dal class. Now much of this is from structure code, like method declarations, but it was still disappointing.

Increased structural complexity – I suppose this should have been obvious, but it didn’t occur to me until writing this blog post, the complexity didn’t really disappear; it’s just been moved from the code to the polymorphic structure. However, having said that, I suspect, maintaining Clean Code with a vigorous polymorphic structure would be lower cost than a traditional code base.

Method call overhead
– With all the classes, properties, and methods, it has occurred to me, that method call overhead may have a performance trade off. I asked Uncle Bob about this on Twitter, and his reply was “Method call overhead is significant in deep inner loops. Elsewhere it is meaningless. Always measure before you optimize.”. Pretty good advice I think.

But even after realizing the trade offs, Clean Code is still, clearly, the way to go.

In Summary

The Clean Code book has changed my outlook on how I write, and I think everyone should be exposed to this material. I’ve already re-gifted my Clean Code book, and the drive to write this blog post comes from a burning desire to share this information with my new colleagues.

I would love to hear your input on this blog post, my sample code, and the whole Clean Code movement. Please comment below.

Reminder
Don’t forget to read Uncle Bob’s review at on his new blog Clean Coder.

* Why did somebody I’ve never met send me a book as a give? Because he wanted me in the C.L.U.B. (Code Like Uncle Bob). And to answer your next question, yes I have since re-gifted the book.
** Very typical of how I wrote code before reading Clean Code … with the exception of some pretty nasty exception handling code. I removed it not due to its ugliness, but because its ugliness may have taken away from the code logic refactoring that I am trying to emphasize.
*** And when I say no code is too small, stay tuned for Clean Code Experience No 2, which was so unconventional, it left me questioning if I’d taken this concept way too far.
**** Although to be fair, I my first exposure to this idea was from Scott Hanselman after reading his first MVC book (this link is to his second MVC book).
***** Wrap your head around that, those of you who feel a single line following a control statement should have braces. LOL

Copyright © John MacIntyre 2010, All rights reserved

PS-Big thanks to Ben Alabaster to pushing me to write this post.

August 24, 2010 Posted by | Code, Programming | , , , , , | 8 Comments

Visual Studio Bug – ‘if’ followed by a try / catch causes debugger stepping error

Yesterday I was debugging and stepped into a method. I wanted to get past my parameter validation checks and into the meat of the method, so I quickly, F10’d my way down the method, but I noticed a line of code was stepped on which should not have been touched.

The code was a simple parameter validation like:

if (enumerableObj == null)
    throw new ArgumentNullException("enumerableObj");

with several similar parameter validation lines above it and a try/catch block containing the meat of the method below it.

The odd thing was, I thought I saw the debugger step on the throw statement even though the enumerableObj should have had a value.

I assumed I had somehow passed in a null value to the enumerableObj parameter and had nearly missed the problem in my haste. I had been moving quickly, so quickly in fact that I had stepped about 3 more lines into the method before I even stopped. To be honest at this point, I wasn’t even sure if I saw it step into the ‘if’ block, so I repositioned my debug cursor back to the ‘if’ condition, and stepped again. Sure enough, it stepped into the ‘if’ block.

I assumed I passed in a null parameter, but when I evaluated enumerableObj, it was set, what’s more, evaluating the entire enumerableObj == null expression resulted in false, as expected. But why the heck was I being stepped into the ‘if’ block when the ‘if’ condition was false?

I retried it again, just in case the enumerableObj had somehow been set as a result of a side effect somewhere, but even then, it still stepped into the ‘if’ block.

So, I did the standard stuff; cleaned my solution, deleted my bin and obj directories, reopening the solution, restarted Visual Studio, & rebooted, all the while rebuilding and retesting the project with each change. Nothing seemed to work. I even cut & pasted my code into notepad, then cut & pasted from notepad back into Visual Studio to ensure there was no hidden characters in my files.*

None of this worked, so I started commenting out code in the method, and eventually was able to isolate it to the above code failing if, and only if, it was followed by a try / catch block. Seriously! If the try / catch block was there, it would step onto the throw statement even though it should not have, but when you removed the try / catch block, everything worked just fine.

In order to isolate the problem for debugging purposes and to have something I could ask for help with, I isolated the problem to a simple command line app.

using System;
using System.Collections.Generic;

namespace IEnumerableBug
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                List lst = new List();
                lst.Add(999);
                IsIEnumerableNull(lst);
                Console.WriteLine("Post IsIEnumerableNull() call");
            }
            catch (Exception ex)
            {
                Console.WriteLine("-- Error --");
                Console.Write(ex.ToString());
            }
            finally
            {
                Console.WriteLine("\n\nPress any key to exit...");
                Console.ReadKey();
            }
        }

        public static void IsIEnumerableNull(IEnumerable enumerableObj)
        {
            if (enumerableObj == null)
            {
                // this should never be stepped over, but is (at least for me)
                // however execution moves to the next line once this is thrown instead of interupting execution
                // Also, a breakpoint on this line is never hit, as you would expect.
                throw new ArgumentNullException("enumerableObj");
            }

            // commenting out this line of code will cause the debugger to step through
            // the above control statement properly
            try {} catch { }
        }
    }
}

Once I got it isolated, I also realized that even though it was stepping onto the throw statement, it didn’t actually jump to the appropriate catch block, it just continued on it’s way to the next line of code. This is why I was able to F10 3 lines past the throw the first time I stepped on it.

Anyway, I threw the code up on Pastebin.com and asked my Twitter followers to see if the problem was repeatable for them. After the standard Twitter, 140 character limit, miscommunication confusion, everybody came back with a negative result. “It works fine on my machine.” … typical 😉

Stepping through the debugger with the throw statement, line of code, highlighted

Close up, click for a screen shot of the entire IDE

At this point I was wondering if it’s some wonky bug only on my machine or if I hadn’t gotten enough sleep and I was doing something so incredibly moronic that I was about to publicly brand myself as an idiot.

Fortunately 2 things happened at that point; 1) I remembered one of my old laptops had VS installed on it, and I was able to repeat the problem on that computer, and posted a screenshot, so people wouldn’t think I completely lost it.

And 2) @james_a_hart came back telling me that he could repeat it and that it was a problem with the way Visual Studio is stepping through the debugger symbols. He also reduced the problem a lot further than I had. I had been so caught up in the problem, I hadn’t even realized how much more the test command line app could be simplified.

using System;

namespace IEnumerableBug2
{
    class Program
    {
        static void Main(string[] args)
        {
            if (new object() == null) 
                throw new Exception(); 
            try { }  catch { }
        }
    }
}

I don’t know who James works for, but I think he has a pretty sophisticated VS virtual machine setup, because he was able to confirm the problem only happened on 64bit Visual Studio 2008 and 2010.

Oh and for the record, in my testing, this only seemed to happen only with a throw statement. Switching the throw statement with a Console.WriteLine(), worked as expected.

using System;

namespace IEnumerableBug2
{
    class Program
    {
        static void Main(string[] args)
        {
            if (new object() == null) 
                Console.WriteLine("Equals NULL");
            try { }  catch { }
        }
    }
}

It was a heck of a day, and a huge waste of time, but I’m just glad I now know what the problem is.

I want to thank everybody who helped me by brainstorming with me on this, or running my code to check if they could repeat the problem; @james_a_hart, @JMBucknall, @dullroar, @SteveSyfuhs, @InfinitiesLoop, @klmr, @SyntaxC4, @CarolFil, and @brianrcline.

* As I understand it, cs files are plain text, so this was unlikely to be the problem, but I felt it was an assumption that I should test.

July 2, 2010 Posted by | C#, Code, Programming | , , , | 1 Comment

How To Get The Most Frequently Used Column Values

Whenever I import external data, integrate to another database, or am new to a project, I need to get familiar with the database. The table schemas, relational integrity, and constraints are the first thing I look at and take me a long way, but soon I need to know what the data looks like.

In an ideal world, relational integrity and database constraints would define control this, and all I’d really need to do is look at those. But the reality is, in 15 years of working in this industry, most of the databases I’ve worked on, that I didn’t design, have barely used constraints and some haven’t even used relation integrity fully!

The need to get a good feel of the data is even more prevalent when working with dirty data, or when refactoring poorly written applications to ensure any refactoring doesn’t introduce other issues. I will usually wind up writing the following query repeatedly:

Select	column_name, count(*)
From	table_name
Group by column_name
Order by count(*) desc, column_name

This little query often reveals; inconsistencies between data and the application, where an application sets column X to possible values of ‘A’, ’B’, ‘C’, ‘D’, or ‘E’, but in reality, there may be zero ‘C’ and ‘E’ values in that column, but there is 6 ‘X’s, 1 ‘Q’, and an ‘?’. Or I may find that there are only 6 rows with data in that column, out of almost 3 million rows, indicating the column / application feature is unused.

Anyway, yesterday I finally wrote a little stored procedure which will print out the most frequent N values in each column for a specified table.

/*
Purpose : Retrieves the top N most frequent values in 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;each column of the specified data table.
Copyright © Jaten Systems Inc. 2009, All rights reserved
http://www.jatensystems.com/
https://whileicompile.wordpress.com/
http://www.johnmacintyre.ca/
*/
create proc jaten_diag_GetTableContents
(@tblName nvarchar(200),
@rowCount int)
as
begin
	declare @colId int;
	declare @colName nvarchar(100);
	declare @sql nvarchar(2048);
	declare @tblRowCount int;

	-- cursor to get columns
	DECLARE curColumns CURSOR FAST_FORWARD 
			FOR	SELECT	colid, [name] 
					FROM	syscolumns
					where	id=object_id(@tblName)
					order by colid
	open curColumns;

	-- get table row count
	set	@sql = 'select @retCount=count(*) from ' + @tblName;
	exec sp_executeSQL @sql, N'@retCount int OUTPUT', @retCount = @tblRowCount OUTPUT;

	-- print table header
	print '';
	print '---------------------------------';
	print '--- ' + @tblName;
	print '--- Row count : ' + cast(@tblRowCount as nvarchar)
	print '---------------------------------';

	-- get info for each column
	fetch next from curColumns into @colId, @colName;
	while 0 = @@fetch_status
	begin
			-- print column header
			print '';
			print '---------------------------------';
			print '--- Column [' + cast(@colId as nvarchar) + '] - ' + @colName + ' ---';

			-- compile &amp; execute grouping sql
			select	@sql = 'select	top ' + cast(@rowCount as nvarchar) 
							+ '		count(*) as [count], '
							+ '		cast(((count(*) * 100)/' + cast( @tblRowCount as nvarchar) + ') as nvarchar) + ''%'' as [Percentage], ' 
							+ '		[' + @colName + '] as [col_value] ' 
							+ 'from	' + @tblName + ' ' 
							+ 'group by [' + @colName + '] ' 
							+ 'order by count(*) desc, [' + @colName + ']';
			exec sp_executeSQL @sql;
			--print @sql;

			-- next
			fetch next from curColumns into @colId, @colName;
	end

	-- clean up
	close curColumns;
	deallocate curColumns;
end

Please note 2 things :
1. You need to run it with ‘Results to Text’ or ‘Results to File’ setting.
2. The table parameter will need square brackets if the table name uses unconventional characters.

If you create it and run it in AdventureWorks on the ‘Production.Product’ table

exec jaten_diag_GetTableContents 'Production.Product', 5

… you will get these results

———————————
— Production.Product
— Row count : 504
———————————

———————————
— Column [1] – ProductID —
count Percentage col_value
———– ——————————- ———–
1 0% 1
1 0% 2
1 0% 3
1 0% 4
1 0% 316

(5 row(s) affected)

….

———————————
— Column [6] – Color —
count Percentage col_value
———– ——————————- —————
248 49% NULL
93 18% Black
43 8% Silver
38 7% Red
36 7% Yellow

(5 row(s) affected)

….

Notice how the Color column reveals that almost half of the products do not have a color setting? This could imply relevancy or this data possibly has a problem being maintained. But also, notice how unique columns will obviously provide meaningless data.

The AdventureWorks database is a very clean database, so this example is a bit contrived, but in the real world, there are plenty of databases where this little procedure will allow you to get some insight into the data.

How do you get familiar with new data?

Copyright © John MacIntyre 2009, All rights reserved

WARNING – All source code is written to demonstrate the current concept. It may be unsafe and not exactly optimal.

February 8, 2009 Posted by | Code, SQL Server | , , | Leave a comment

How To Guarantee Dependent JavaScript Files are Included

I’m very much a statically typed kind of programmer and knowing that missing code will not be found, until executed, and then only when it hits that missing code … well, lets just say it makes me … uneasy … actually, I’m nervous as heck every time I run it! I feel like my app is held together with duct tape! …. a house of cards waiting to collapse with the next gentle breeze.

When I started programming in JavaScript, this really bothered me. In addition to my ‘uneasy’ feeling, there was the constant aggravation of missing dependencies. This was more than uneasiness; this was an irritating thorn in my side. It was again only compounded by the fact, that if there were 5 missing dependencies, I would only find them one at a time, and only if I happened to be so luck as to covered it’s reference in my GUI testing!

Wouldn’t it be nice if my code would tell me on the first run ever dependent file that was missing?

I thought so; so I came up with this little trick to test if a JavaScript file is included already. Basically, you create a variable at the top of the JavaScript files which others depend on. I try to keep the name obvious, and weird enough to avoid collisions.

var include_ajax_utility = true;

Then in all the files which are dependent on this script (the ajax utility in this case), I add the following code to check.

//----------------------------------
// Check for dependencies
//----------------------------------
try
{
	var fileName = "mypage.events.js";
	if( "undefined" == typeof(include_ajax_utility))
		alert( "JavaScript file '" + fileName 
			+ "' is missing dependent file"
			+ " : ajax.utility.js");
}
catch(e)
{
	alert( "Programmer Alert : At least one dependent”
		+ “ file has not been included.");
}

Notice the code is checked to confirm the variable is set (in other words it exists), and if it doesn’t, an alert is displayed to the user indicating exactly which dependent file is missing. [As a side note, the try catch block doesn’t appear to be necessary, but still … I like to cover all my bases]

There are a couple drawbacks to doing this however; a) you may get multiple messages saying practically the same thing, and b) the JavaScript script tags need to be in order! Yes, the ordering of tags can be frustrating, if you insert a new dependency into existing code, only to have this break, and this is enough to make most people quickly abandon this idea. But for me, it’s a small price to pay for a giant step forward in having a sense of stability.

What do you think? A little too anal? 😉

Copyright © John MacIntyre 2009, All rights reserved

WARNING – All source code is written to demonstrate the current concept. It may be unsafe and not exactly optimal.

February 6, 2009 Posted by | Code, JavaScript | , , , | 2 Comments

How To Write Dynamic SQL AND Prevent SQL Injection Attacks

One of my pet peeves is when general rules are taken as gospel, and declared as the only acceptable practice regardless of the circumstance.

One of the big ones is Dynamic SQL. There’s a heck of a good reason for this, and it’s called an SQL Injection Attack, and if you are not familiar with it, I would strongly urge you to leave this post right now, and read up on it.

Anyway, Dynamic SQL is not inherently evil, it’s the appending of user entered text that is evil. Appending user entered text is just lazy and can be easily avoided with parameterization.

The trick is to create dynamic SQL with parameters.

Interestingly, I’ve never seen anybody else do this. I am constantly hearing people recommending stored procedures … even when are clearly not flexible enough to meet the required functionality. Don’t get me wrong, stored procedures have a lot of benefits, but flexibility isn’t one it’s popular for.

And now for some code …

I created a console app which queries the SQL Server sample database AdventureWorks. The following static method was added to the Program class.

public static int GetOrderCount(string productPrefix, SqlConnection cn)
{
	// initialize SQL
	string starterSql = "SELECT count(*) FROM Production.Product";
	StringBuilder sbSql = new StringBuilder(starterSql);

	// add parameters
	if( !String.IsNullOrEmpty(productPrefix))
		sbSql.Append( " where [name] like @namePrefix");

	// initialize the command
	SqlCommand cmd = new SqlCommand(sbSql.ToString(), cn);
	if (cmd.CommandText.Contains("@namePrefix"))
		cmd.Parameters.AddWithValue("@namePrefix", productPrefix + "%");

	// get count
	return Convert.ToInt32( cmd.ExecuteScalar());
}

Basically, the function queries the number of orders where the product name starts with a certain prefix.

The strength of doing this via dynamic SQL is we only need to filter on the product name when a valid prefix parameter is passed in. So, if the optional parameter (productPrefix) exists and is valid, the filter condition is added to the SQL and the parameter is added to the SqlCommand object.

In this overly simplified example, we could manage the same thing by just setting the productPrefix variable to the ‘%’ wild card, but then we’d be doing a filter for nothing. Not to mention things might be a little more difficult if the operator were ‘equals’ instead of ‘like’, or if there were multiple optional parameters. Creating SQL dynamically means we don’t need to write some funky kludge and our SQL is always nice, simple, and doing minimal work.

To execute my function, I added the following code to the Main(…) method.

// get total count
Console.WriteLine( "There are {0} products in total.", 
			Program.GetOrderCount( null, cn));

// get totals for various prefixes
string[] prefixes = new string[6] { "a", "b", "c", 
						"'; drop table Production.Product;--", 
						"d", "e" };
foreach(string prefix in prefixes)
{
	Console.WriteLine("There are {0} products"
			+ " prefixed with '{1}'.",
			Program.GetOrderCount(prefix, cn), prefix);
}

First we call GetOrderCount(…) without a name prefix to test it without the parameter, then we traverse the array of possible prefixes (this would be the user entered data in a real app). Notice the fourth item? Pretty menacing eh? Don’t worry, it’s safe.

Here are the results

There are 504 products in total.


There are 3 products prefixed with ‘a’.
There are 4 products prefixed with ‘b’.
There are 12 products prefixed with ‘c’.
There are 0 products prefixed with ”; drop table Production.Product;–‘.
There are 3 products prefixed with ‘d’.
There are 9 products prefixed with ‘e’.


Notice the ‘d’ and ‘e’ prefixes were searched, and items found, proving the ‘drop table’ statement was not injected into the command.

You’d be surprised how much I use this. Many of my objects have a static GetList(…) method, and this method usually has multiple overloads. Keeping with the DRY principle, I prefer to keep all my logic in one place, so this method will usually have one overload with every possible filter parameter, and all the other overloads will just call this one. Surprisingly, the overload with the code, is not overly complex, and is actually pretty simple.

What do you think? Will you use parameterized dynamic sql in the future?

Copyright © John MacIntyre 2009, All rights reserved

WARNING – All source code is written to demonstrate the current concept. It may be unsafe and not exactly optimal.

February 5, 2009 Posted by | C#, Code, Security, SQL | , , , , , | 1 Comment

   

%d bloggers like this: