UI design for a better user experience. I found a goodie…

I’m currently doing UI analysis on several apps. Hopefully some of them will be migrated.
Being a core developer I must confess I get a little bored if I don’t see code, try a new framework or implement a pattern; but I guess this is a good opportunity to put myself in the end user’s shoes and ease their lives, if I can…
Coming from a Windows environment, my first apps were always MDIs. Now most development efforts are moving away from MDIs, as they are considered complex and overwhelming, with a steep learning curve. See most disadvantages here.

I found a free goodie and wanted to share it: Designing Interfaces: Patterns for Effective Interaction Design

I hope you find it as interesting as I did.

There is also this very good website that documents most of the GUI design patterns, with emphasis on Web: Patterns in Interaction Design

Enjoy!

And wish me luck so I can go back to try new development frameworks soon and they bring the usability expert and designer on board :-p

Have fun while learning, are you certifiable?

I got this ad from a marketing person. I wasn’t going to post it as I’m not very fond of sales and marketing people.
I tried the game and found it cute, it’s similar to the contest that MSDN Canada launched a few years ago, the Last Developer Standing. I was eliminated quite fast on that one :-p

If you like games, you might enjoy this one.

Check out the game at www.areyoucertifiable.com. Enjoy!

Second shot certification exam and discount on e-Learning…

I just got this from a Microsoft marketing representative. I know I’ll make use of the second shot quite soon, hope it helps you too:

Register for Second Shot today and get an up to 90% price discount on one Microsoft Official E-Learning Collection
Second Shot provides a free retake if you fail your Microsoft Developer Certification exam. And from now, until June 30, 2009, if you register for Second Shot, you get any Developer E-Learning Collection for just USD $35 (usually priced up to USD $349). That’s up to 90% off.

But hurry! Once you have activated a discounted E-learning collection it will only be available for 90 days.

Sign up for the Second Shot offer today

Excuse me, Emir Maktoum, could I please debug this small hello world app on your Surface?

The race for multi-touch input devices has begun. Unfortunately, as of today, the prices are out of reach for regular consumers and for developers tied to project budgets.
We all remember the GUI from Minority Report and the attempts CSI Miami* is making nowadays to show these input devices as regular, day to day available hardware.
Most of us, developers, know it will take a while till the hardware becomes available/affordable in enterprise or end consumer applications.
But the time might be around the corner.
Hopefully it will be like the story of the IBM PC and the PC clones: third-party companies were able to clone the IBM PC’s hardware and lower PC prices. That, along with MS-DOS being distributed as a separate product, made PCs widely affordable. No more architectures based on Zilog Z80 or Motorola 6508 (my high school classmates played with Z80/8085 assembler back in the 80s); Intel’s 16-bit 8088 quickly became the processor of choice, then the 32-bit 80386 and 486 and all the AMD clones.
Anyway, my point is: let the hardware be cloned, lower the prices and sell the WPF, XNA and Surface.Core APIs as separate products. **


I just came from a Metro Toronto User Group presentation about Microsoft Surface and played Tetris with a few other group members. The presenter gave a great introduction to multi-touch devices and how to start programming for the MS Surface using WPF.
Unfortunately for me, I would have to email my hello world app to Dubai for debugging purposes.
It seems the price range of the Surface is about 15 to 20K. It is now used in a Casino in Vegas, some banks in Switzerland and widely spread in Malls in Dubai.

Another multi-touch/multi-user device is the incredible Perceptive Pixel Wall, with prices in the range of 100K. I wish I could get a hold of the development platform for this 100K device :-p What the heck are they using?

So far all of these technologies are under a pretty tight-lipped policy… It’s proprietary, can’t touch this. One difference between the MS Surface and Perceptive Pixel is pressure perception. The MS Surface, being based on infrared rays and an array of cameras, cannot perceive pressure.


iPhones from Apple, even though they are not multi-user, do recognize a few types of touches and are more affordable, if you don’t sign up with Rogers. I believe they use the Objective-C Cocoa framework as the development API, which would be pretty hard to get started with for a Java or .NET developer coming from a managed environment. I remember the allocs and mallocs from C and get goosebumps… what’s wrong with being spoiled with Garbage Collection?

Enough about random thoughts, I wish I could have played Tetris for longer…


* I’m not a regular of the show, but my mom once asked me if I developed apps like that at my workplace… No, mom, not quite yet.
** Let’s see if they listen :-p

Identifying performance bottlenecks on a .NET windows app. Part II Using Native Images with CAB, reviewing Fusion Logs

We left off in the previous post with a newer version of NHibernate and a different mapping that avoided the byte-per-byte comparison of our byte arrays. However, our application start-up was now slower, about 20 seconds, and showing some screens for the first time was taking 10 seconds. Not acceptable.

The performance decrease was gone but the start up was not good enough.

We got our hands on ANTS profiler again to see what was going on whenever we invoked a screen for the first time:

CPU usage:

Jitted Bytes per second:

and IO Bytes Read:

From these images we deduced there was quite some Just-In-Time compilation going on when the screen was loaded. How to solve that? By using native images for our assemblies in order to avoid JIT compilation; see this MSDN article for details.

All in all that was quite easy to narrow down, we used NGen, installed the native images and voila!, let’s profile again…

I wish it were that quick, we kept seeing JIT peaks :-O

Alright, let’s use some heavier artillery and see why it’s still JITting.

This is where we got our hands on Fusion logs. Fusion is the engine (DLL) in charge of loading and binding assemblies. The Fusion Log Viewer is the tool to see the logs for this DLL and troubleshoot loading problems. This tool is part of the SDK and can be downloaded from here. Be aware that it’s a heavy download. In order to use this tool once the SDK is installed:

1. Open Fuslogvw.exe in folder C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin
2. If any entries show up in the list box, click on Delete All.
3. Click on Settings, choose Log all binds to disk and check Enable custom log path
4. In the Custom log path edit box type C:\FusionLog
5. In the C: drive create a new folder and name it FusionLog
6. Now run the application and execute the scenarios where we are seeing JIT-ing
7. Now when you browse to C:\FusionLog you will see a couple of folders.

We were unable to install the SDK in our production clients, so we ended up doing a registry edit in order to collect the logs. If you don’t want to install the SDK, do the following:

1) Go to regedit
2) Browse to HKEY_LOCAL_MACHINE\Software\Microsoft\Fusion
3) Right click on the right pane and choose new -> string value
4) Name it LogPath, double click it and in the value write C:\MyLogs
5) Again right click the right pane
6) Go for new -> DWORD value and name it ForceLog
7) Double click it and give it the value 1
8) Then create a folder in the C drive with the name MyLogs
9) Run the app and the logs will be created
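
If the registry steps above have to be repeated on several client machines, the same two values can be set from code. This is only a minimal C# sketch: the FusionLogSetup class and its method names are made up for illustration, the folder is whatever you pass in, and the process needs administrator rights to write under HKEY_LOCAL_MACHINE.

```csharp
using System;
using System.IO;
using Microsoft.Win32;

// Hypothetical helper, not part of any SDK: mirrors the manual regedit steps.
static class FusionLogSetup
{
    const string FusionKey = @"SOFTWARE\Microsoft\Fusion";

    // Creates the log folder and sets LogPath (string) and ForceLog (DWORD = 1).
    public static void Enable(string logFolder)
    {
        Directory.CreateDirectory(logFolder);
        using (RegistryKey key = Registry.LocalMachine.CreateSubKey(FusionKey))
        {
            key.SetValue("LogPath", logFolder, RegistryValueKind.String);
            key.SetValue("ForceLog", 1, RegistryValueKind.DWord);
        }
    }

    // Removes both values again once the logs have been collected.
    public static void Disable()
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(FusionKey, true))
        {
            if (key != null)
            {
                key.DeleteValue("ForceLog", false); // false: do not throw if missing
                key.DeleteValue("LogPath", false);
            }
        }
    }

    static void Main()
    {
        // Example folder only; use whichever path you picked for the logs.
        Enable(@"C:\MyLogs");
        Console.WriteLine("Fusion logging enabled. Run the app, then call Disable().");
    }
}
```

Logging every bind adds overhead, so it is worth calling Disable() (or deleting the two values by hand) once you have the logs you need.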

The logs are created as HTM files in the folder you chose. Reviewing our logs, we found out one of our main modules wasn’t loading from its native image, although the native image was in the native image cache. Why?

Let’s give some more background information: we use CAB.

The Composite UI Application Block from Patterns and Practices had its main release in December 2005. There have been other releases for WPF and the most recent Prism project, but apart from the Smart Client Software Factory addition, the CAB framework has stayed pretty much the same for Windows Forms.

CAB is known for its Module Loader Service and was highly welcomed by Windows developers as a framework that allows loose coupling with its Event Publishing/Subscription mechanism, its Services module and its MVP implementation.

All that is very good for the developer and for maintainability but the performance is not the greatest if you have quite a few publications and subscriptions going on and if you have a few modules loaded at start up. There are quite a few posts regarding this on CodePlex’s CAB forum.

I could go on and on about the beauty of CAB and, despite its performance issues, I do believe it offers more advantages than disadvantages to the Windows developer. IMHO, being able to hand modules to different teams to develop and being able to plug them into the application without any major compilation, only a configuration change, is a big big plus. See these posts on the CAB Module Loader Service (CAB Modules on Demand and Dynamically Loading Modules in CAB).

The main reason for this module not loading from its native image is due to the Reflection mechanism currently used in CAB’s Module Loader Service:
(namespace Microsoft.Practices.CompositeUI.Services)
assembly = Assembly.LoadFrom(file.FullName);

More information on Cook’s archives

Codeplex community member Mariano Converti was prompt on offering a solution on his blog.
How To: Use the Ngen tool to improve the performance in CAB / SCSF applications
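
To make the idea more concrete, here is a minimal C# sketch. It is not Mariano’s patch and not CAB’s code; it only illustrates the general direction, assuming the module assembly can be found through normal probing (application base or GAC). The class and method names are hypothetical. The background is that, according to the .NET 2.0 documentation for Assembly.LoadFrom, a native image is not used for an assembly loaded that way, whereas binding by display name goes through the default load context.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical sketch: prefer a bind by display name, fall back to LoadFrom.
static class ModuleAssemblyLoaderSketch
{
    public static Assembly LoadModuleAssembly(FileInfo file)
    {
        // Read the assembly identity from the file without loading the assembly.
        string displayName = AssemblyName.GetAssemblyName(file.FullName).FullName;
        try
        {
            // Bind by display name so the default load context is used.
            return Assembly.Load(displayName);
        }
        catch (FileNotFoundException)
        {
            // Not reachable through probing: keep the original behaviour.
            return Assembly.LoadFrom(file.FullName);
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: pass the path of a module assembly");
            return;
        }
        Assembly assembly = LoadModuleAssembly(new FileInfo(args[0]));
        Console.WriteLine("{0} loaded from {1}", assembly.FullName, assembly.Location);
    }
}
```

One caveat with this approach: if a different file with the same identity sits in the probing path, the bind by name will pick that one instead of the file you pointed at, so check the Fusion logs again after a change like this.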

As of the date of this post, this code change hasn’t been incorporated into any CAB release; they should do it soon though.

Happy performance troubleshooting!

Identifying performance bottlenecks on a .NET windows app using Windows Debugging Tools and ANTS Profiler. Part I: NHibernate byte[] types

This is a curious case that led me to discover and use a very valuable tool ANTS Profiler and read a few good blogs about .NET debugging and CLR internals. Read on to bookmark with me.

Close to Christmas we received a complaint that one of the Windows applications was performing too slowly after a few hours of usage. Performance monitor counters indicated the performance problem lay in high CPU peaks sustained for long periods of time.

.Net memory counters were somewhat fine, no increase on allocated bytes or overall memory consumption, no high IO reads, no high network usage…apparently the application was just doing its stuff, but for a long time, and each time longer…

First thing that came to our mind was an infinite loop, however the curious part on this case is that the CPU peaks took longer the longer the end user worked on the application and began to be noticeable after a couple of hours, not quite the definition of an infinite loop.

Had we had a better CPU, the performance degradation would only have become noticeable after even more hours. This is something we had to be thankful for: bad CPU, less time to reproduce the problem. This was one of the typical production-only problems too :-p

Long sustained CPU peaks, how we dug down on the cause:

First we grabbed the free debugging tools (insert the obvious reasons here: budget, management approval, etc.): CLRProfiler, WinDBG, SOS and ADPlus.

Two great blog posts about how to start with these tools can be found here (Speaking Of Which) and here (Maoni’s blog).

MSDN Magazine also has two good articles (Bugslayer column and this CLR Inside Out column) on the subject of windows debugging tools and how to use them in VS 2005.

Back to our own experience on the matter, CLRProfiler hung the machine beyond response and despite being able to sketch the object graph in memory, it was hard to correlate the time of the high CPU peaks with the information obtained from CLRProfiler.

This was not due to a problem with the tool itself. The hanging was due to underpowered hardware and our over-consuming application, and the inability to detect the main cause of the CPU usage was due to the fact that CLRProfiler is only meant to identify and isolate problems related to garbage collection, such as excessive long-lived objects or huge collections.

At first we thought the high CPU could be related with garbage collection due to long lived objects, see this post on Tess’s blog If broken it is, fix it you should.

We collected memory dumps with ADPlus during the high CPU peaks as per this lab blog post and analyzed the memory dumps using WinDBG.

In the end we decided to have more control over when the dumps were taken and used WinDBG attached to the process. I should also mention that ADPlus ended up generating dumps with errors when the system was really stressed.

Instructions to take dump via WinDBG :
1. Run the application.
2. Open up WinDBG. Click on File -> Attach To Process -> Select the process -> Click on OK.
3. WinDBG attaches to the process and waits on the command line. Press ‘g’ and hit enter. ‘g’ is for letting the application run.
4. Now whenever you want to take a dump, Hit Ctrl+Break in WinDBG. Now, type : .dump /ma C:\Dump1.dmp
This will take a dump.
5. Press ‘g’ and hit enter for the process to resume.

WinDBG, with the SOS extension loaded, can give valuable information about the CLR stack at the time the dump was collected (!clrstack command), the objects registered for finalization (!finalizequeue) and how many of them belong to Gen 0, Gen 1 and Gen 2.

Seeing your managed stack at a single point in time or having exact information about the memory allocation does not give information on the amount/% of CPU time each method takes though.

We tried taking dumps with WinDBG at the beginning of the CPU peak, in the middle and at the end, but the results only offered a hint: too many collections were allocated and lived to Gen 2. Some of these collections were byte arrays. It wasn’t apparent from analyzing the three managed stacks (from the three memory dumps) which method was consuming the most time.

So far we had lots of collections surviving to Generation 2 and some of them were of type byte array. Garbage collection counters, however, were within the “normal” range.

If the application was just “busy” doing its stuff, where was this time spent? Data Binding? Event Brokerage? Database access latency and query performance had been already discarded with SQL Profiler btw.

The main cause of the sustained CPU peaks was discovered using ANTS Profiler. Memory leaks and long GC cycles had been discarded using the free tools mentioned above.

ANTS Profiler will let you set .NET performance counters and it will attach itself to the application being debugged. You cannot set breakpoints, afaik, but you can go back in the profiler results and drag your cursor over a region to get a full call stack walk. It also goes beyond that and will indicate the % of CPU time each method takes and the % of CPU time its children take.

Finally! A tool that will correlate performance counters with the call stack for you and will indicate the % of CPU time per method. This is information you cannot gather by taking memory snapshots or call stack snapshots. Unfortunately the free tools were only useful to discard memory leaks and GC-related problems in this particular case. They narrowed down the places to look into.

As you can see from the ANTS Profiler screen shot the application was indeed doing stuff, in this case comparing collections of bytes, byte per byte…Ouch!

We were able to identify the collection comparison problem (byte[] arrays were being compared when the NHibernate session was flushed, and persisted even when they hadn’t changed). We correlated this with a fixed NHibernate bug:
http://jira.nhibernate.org/browse/NH-1246
and changed our mapping attributes to indicate there was no need to update the BinaryBlob fields. Our application either inserts the binary data or deletes it.
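
To see why this pattern gets slower the longer a session lives, here is a toy C# sketch. It is not NHibernate’s code; it only assumes that every flush compares each loaded byte[] against its snapshot, element by element, and that unchanged blobs are the common case. The names, the blob size and the number of flushes are all made up.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a dirty check over byte arrays; not NHibernate's implementation.
static class FlushCostSketch
{
    // Element-by-element comparison that also counts how many bytes it touched.
    static bool SameBytes(byte[] a, byte[] b, ref long bytesCompared)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
        {
            bytesCompared++;
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    static void Main()
    {
        const int blobSize = 1024 * 1024; // assumed 1 MB per blob
        List<byte[]> snapshots = new List<byte[]>();
        List<byte[]> current = new List<byte[]>();

        for (int flush = 1; flush <= 5; flush++)
        {
            // The user loads one more (unchanged) blob before each flush.
            byte[] blob = new byte[blobSize];
            snapshots.Add(blob);
            current.Add((byte[])blob.Clone());

            // Every flush re-compares every blob held so far.
            long bytesCompared = 0;
            for (int i = 0; i < current.Count; i++)
            {
                SameBytes(snapshots[i], current[i], ref bytesCompared);
            }

            Console.WriteLine("flush {0}: {1} blobs, {2} bytes compared",
                flush, current.Count, bytesCompared);
        }
    }
}
```

Unchanged arrays are the worst case for this kind of check, because the loop has to walk each array to the very end before it can say nothing changed. Telling the mapping that the field never needs updating lets that comparison be skipped altogether.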

Note: you should be logged into http://jira.nhibernate.org/ before navigating to this bug report, registration is free.

Our NHibernate version and mapping strategy contained the buggy bits…

I hope this post hasn’t turned out too long. By upgrading NHibernate we solved the mystery of the performance degradation over time: the more the user worked with persistent binary data in the application, the longer this loop comparing byte per byte on each collection took.

Upgrading NHibernate introduced a performance challenge in another area: the application start-up was taking longer. This will go in Part II as I should get some sleep.

Sweet dreams!

PS. VSTS 2008 has very promising capabilities for debugging high CPU usages, almost as good as ANTS Profiler, see this post

Hold down the right CTRL key, press the SCROLL LOCK key twice and scare the hell out of your Sysadmin :-p

I just watched Mark Russinovich’s Technet webcasts and he demoed how to create a Blue Screen of Death on a windows PC by using the right ctrl key + scroll lock + scroll lock

This is very helpful when debugging OS issues, but I thought it might make a sysadmin or two faint if done without warning. Careful: do not attempt this on production servers or you may face an immediate let-go…

Here’s the KB article describing how to gather the memory dump and the types of memory dumps you can get with this technique. Cheers!

Triple hop issue with ASP.NET delegation Part II: Fixing our remote users.

Where did I leave the previous part?… ah yes, our Terminal Server users. Well, the XP users were okay and happy but the remote users were not… what could possibly be different between them?

First, the OS: our remote users connect to Windows servers (Windows 2000 Standard servers), so we checked the kerberos.dll version and nope, everything was up to date…

We also checked the kerberos tickets for the logged user and there they were…We again checked our SPN on the domain (setspn), both the SQL Server service and our web service URL were registered in AD with the proper ports.

With the help of a tech support call we landed on this KB article:
Unable to negotiate Kerberos authentication after upgrading to Internet Explorer 6

Windows 2000 Server ships with Internet Explorer 5. It turns out that when IE5 is upgraded to IE6, the advanced option Enable Integrated Windows Authentication (requires restart) is unchecked by default. This option is normally checked on Windows Server 2003, Windows XP, Vista and 2008.

When the option is unchecked, this setting tells IE to use NTLM as the authentication protocol.

This option is equivalent to the following registry key:

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings

and administrators can enable Integrated Windows Authentication by setting the EnableNegotiate DWORD value to 1
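
For a quick check on a single machine, the value can also be read and set from code. This is a minimal C# sketch under the assumptions above (same key, EnableNegotiate as a DWORD); the class name is made up, and since the key lives under HKEY_CURRENT_USER it only affects the user running it. For many users, a Group Policy is the better route.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper: reports EnableNegotiate and turns it on if needed.
static class IntegratedAuthCheck
{
    const string KeyPath =
        @"Software\Microsoft\Windows\CurrentVersion\Internet Settings";

    static void Main()
    {
        using (RegistryKey key = Registry.CurrentUser.OpenSubKey(KeyPath, true))
        {
            if (key == null)
            {
                Console.WriteLine("Internet Settings key not found.");
                return;
            }

            // DWORD values come back as boxed ints; null means the value is absent.
            object value = key.GetValue("EnableNegotiate");
            Console.WriteLine("EnableNegotiate = {0}", value ?? "(not set)");

            if (!(value is int) || (int)value != 1)
            {
                key.SetValue("EnableNegotiate", 1, RegistryValueKind.DWord);
                Console.WriteLine("EnableNegotiate set to 1; restart Internet Explorer.");
            }
        }
    }
}
```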

The name of the option is very misleading. You can find more information on this setting here, unfortunately I didn’t find much on the official Microsoft curriculum.

On our web server, we do not allow protocol transition which means the authenticated user should use Kerberos in order to enable the credentials to be delegated to a second hop:

The PROBLEM:

With the client set up to disable Kerberos authentication and with the IIS box set up to disable protocol transition, the credentials passed to the back end were not the end user credentials. Challenge/response protocols such as NTLM do not allow delegation of credentials to a second hop.
A great article regarding the difference between Kerberos and the NTLM, WWW-Authentication such as Basic, Digest etc can be found on Larry Osterman “the Ping of Death”‘s amazing blog post.

The SOLUTION:

Enabling Protocol Transition on our web box at the domain controller would have done the trick:

Or setting up a Group Policy for our end users to have the IE setting checked. See how to create a Group Policy here.

Cheers!

Triple hop issue with ASP.NET delegation Part I: Our Windows XP Pro desktops

Last Friday we had an issue in production: we have a very simple web application with one single page on our intranet that consumes an array of web services. These web services talk to a back end SQL Server.

All in all this is a very typical scenario and, like most companies with .NET technology, we have web applications using ASP.NET delegation in the intranet. The only particular point regarding this web page is that it is called inside an old legacy Windows application (not a .NET app). For remote users, this old legacy application is used via Terminal Services.

For our remote users also, the application didn’t work and our DBA was registering a bunch of anonymous requests coming from the web server box…

On the other hand, we set our web services tracing to debug and were able to see the end user credentials on each HTTP request, so the end user had managed to authenticate using Integrated Windows Security on our web box and the web service was trying to open a SQL connection to the back end.

We used impersonation and Integrated Windows Authentication on our web application and web services (this is an intranet after all). ASP.NET impersonation gave us the chance to restrict the access on the back end based on AD groups and at the same time gave us the ability to audit the user’s actions to a very fine grained degree (user name).

The PROBLEM with our Windows XP Pro desktop users

The application worked for our desktop users if and only if they had logged off and back on to their desktops in the past 48 hours. If the desktop users hadn’t logged on for a while (like me, since I lock my computer instead of logging off), the application didn’t work either and the SQL box reported an anonymous login attempt back to our web tier. The web services then passed a SOAP exception with the NT AUTHORITY\ANONYMOUS LOGON error message to our web app…

System.Web.Services.Protocols.SoapException: Server was unable to process request. ---> System.Data.SqlClient.SqlException: Login failed for user 'NT AUTHORITY\ANONYMOUS LOGON'.

At first we thought it was the same problem, but it turns out the TS users couldn’t use the application even when they logged off and back on, not even when the TS server was restarted, hrm….

Dividing and conquering, we ran the kerbtray.exe tool on our web server and on one of the desktops and enabled Kerberos logging on both boxes. We noticed that when the application worked, the user logging in to the web server box used Kerberos, but after a few days the logon defaulted to NTLM.
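
A complementary check can be done from code. The following is a minimal C# sketch, not what we ran in production: the class and method names are hypothetical, and it simply prints what the framework reports for the current Windows identity. Inside an ASP.NET application with impersonation enabled, the same Describe call could be written to the trace log to see which protocol a request authenticated with.

```csharp
using System;
using System.Security.Principal;

// Hypothetical diagnostic helper for double hop troubleshooting.
static class IdentityDiagnosticsSketch
{
    // Summarizes who the code is running as and how that identity authenticated.
    public static string Describe(WindowsIdentity identity)
    {
        return string.Format(
            "user={0}, authType={1}, impersonationLevel={2}, anonymous={3}",
            identity.Name,
            identity.AuthenticationType,  // e.g. Kerberos, NTLM or Negotiate
            identity.ImpersonationLevel,  // Delegation is needed for a further hop
            identity.IsAnonymous);
    }

    static void Main()
    {
        // In a console app this is the logged-on user; under ASP.NET
        // impersonation it is the impersonated caller.
        using (WindowsIdentity current = WindowsIdentity.GetCurrent())
        {
            Console.WriteLine(Describe(current));
        }
    }
}
```

If the authentication type shows NTLM, or the impersonation level is only Impersonation, the credentials will not travel on to SQL Server, which matches the anonymous logon error above.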

SOLUTION for the Windows XP Pro Desktops

It turns out this was a bug in the kerberos.dll running on Windows XP SP2, SP3 has this problem solved. More information can be found on this MSDN thread. Also the hotfix for Windows XP Professional SP2 can be found on this Microsoft Knowledge Base article. Although this article describes a different problem the hotfix provided here contains the fixed kerberos dll.

There are quite a few articles regarding ASP.NET delegation

And quite a few MSDN forum threads, like this one I initiated, which had a heated discussion with the moderator, my fault most of it.

The best resources I have found so far, and I hope this digested summary will help you if you have the same double/triple hop issue, are:

Ken Schaefer’s blog post regarding IIS and Kerberos Part 5 – Protocol Transition, Constrained Delegation, S4U2S and S4U2P.

Keith Brown’s article on MSDN: Credentials and Delegation
and
nunos’s Blog: Concerning the credentials double hop issue

and the best of all is a webcast by Yung Chou *all kudos to his explanation of Protocol Transition*

MSDN Webcast: Getting Delegation to Work with IIS and ASP.NET:
The Ins and Outs of Protocol Transition (Part 1 of 2) (Level 300)

This webcast specifically helped us troubleshoot and fix the second part of our problem, our failed connection when the end users connected remotely via terminal servers.

I’ll post more of the problem and the resolution on Part II…

…stay tuned.