Wednesday, January 06, 2010

For those following the OpenNETCF IoC Framework, I've checked in another set of fixes.  So yes, it's still an active project.

1/6/2010 1:42:12 PM (Eastern Standard Time, UTC-05:00)
Tuesday, January 05, 2010

Ages ago I did a science project where I was reading and writing registers from managed code.  It worked well and there is absolutely no reason that you shouldn't be able to do this kind of thing.  Windows CE is an embedded platform, and affecting hardware is what we do in the embedded world.

Well, as with any code posted on the web (and in fact this got rolled into the Smart Device Framework), it got used in some actual shipping products.  Great!  I'm not the only one who thinks we should be able to do this stuff.

Well, over the past 6 months or so I had a few different people contact me saying that the code either didn't work, or did work on their earlier hardware, but was now giving strange behavior.

Most telling was that if you hooked up a scope to the memory, you would see 4 write pulses when writing to a register.  Furthermore, all 4 bytes in the register ended up getting set to the last byte in the array you wanted to write.  For example, if you wrote 0x12345678 to a register, it would actually get set to 0x56565656.

Now I know that this code originally worked.  I wrote it using actual hardware (PXA255) so the behavior was new.  The fact that there were 4 strobes on the memory strongly suggested that the code was actually doing 4 individual writes for the 4 bytes, not a single, atomic 4-byte write.

I looked at the code and it couldn't be more simple.  A write boiled down to this:

public void WriteInt32(int data)
{
   Marshal.WriteInt32(m_addressPointer, data);
}

A bit more investigation found that the behavior was fine under CF 1.0, but started failing in CF 2.0.  What that means is that Microsoft changed the underlying implementation of the Marshal class, and in a bad way.  Why would they do such a stupid, stupid thing?  The new mechanism is going to not only cause a break for the writes we're looking at, but it's also a lot slower.

Since I don't know who did this, I can only guess as to what happened.  If you write a 4-byte value to an address that is not DWORD aligned (i.e. not evenly divisible by 4) then an ARM processor will throw a fault and puke on you (x86 throws, but handles it internally).  My bet is that some edge case got reported where the CF was throwing an unaligned exception, and some idiot developer decided that the heavy-handed approach of changing the write to happen byte-by-byte would be the solution.  Yes, it prevents the error, but it causes bugs and is bad, bad form.  Personally I'd like to slap the person who made the change and the person who reviewed the change and thought it was ok.

So how do you get around this?  Well, by doing what the CF itself should have done.  Instead of using Marshal, you use unsafe code and pointers, and check address alignment before writing.  Something like this:

public unsafe void WriteInt32(int data, int offset)
{
  int baseAddr = (m_addressPointer.ToInt32() + offset);
  if (baseAddr % 4 == 0)
  {
    // dword aligned: one 32-bit write
    uint* pDest = (uint*)(baseAddr);
    *pDest = (uint)data;
  }
  else if (baseAddr % 2 == 0)
  {
    // word aligned: two 16-bit writes, low word first (little-endian)
    ushort* pDest = (ushort*)(baseAddr);
    *pDest = (ushort)(data & 0xFFFF);
    pDest++; // pointer math is in ushort units, so this advances 2 bytes
    *pDest = (ushort)((uint)data >> 0x10);
  }
  else
  {
    // byte aligned: four single-byte writes
    byte* pDest = (byte*)(baseAddr);
    foreach (byte b in BitConverter.GetBytes(data))
    {
      *pDest = b;
      pDest++;
    }
  }
}

This is a classic case of someone not understanding the problem they are solving.  This logic should have been done well below us - that's the whole point of using managed code, right?  To simplify things.  Unfortunately it also allows many developers to write code without understanding what it's really doing, and in my mind that's crazy risky. 

Testing also shows that the Read and Copy methods are similarly broken.  All of these bugs still exist in CF 3.5, and I strongly suspect they will remain broken in future versions.  So beware: if you're using Marshal for moving data, you could probably get a 4x performance improvement by using a pointer instead.
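Since the reads have the same problem, a pointer-based read counterpart might look something like the sketch below.  To be clear about what is made up here: the RegisterReader class name and its constructor are hypothetical, used only so the example is self-contained, and the code assumes a little-endian target (which is what the byte ordering in the word and byte branches relies on).  It needs to be compiled with unsafe code enabled.

```csharp
using System;

// Hypothetical wrapper for illustration only; not part of any shipping library.
public unsafe class RegisterReader
{
    private readonly IntPtr m_addressPointer;

    public RegisterReader(IntPtr baseAddress)
    {
        m_addressPointer = baseAddress;
    }

    // Reads a 32-bit value using the widest access the address alignment allows.
    // Assumes a little-endian target.
    public int ReadInt32(int offset)
    {
        byte* p = (byte*)m_addressPointer.ToPointer() + offset;
        long addr = (long)p;

        if (addr % 4 == 0)
        {
            // dword aligned: one 32-bit read
            return *(int*)p;
        }

        if (addr % 2 == 0)
        {
            // word aligned: two 16-bit reads, low word at the lower address
            ushort* pw = (ushort*)p;
            uint low = pw[0];
            uint high = pw[1];
            return unchecked((int)(low | (high << 16)));
        }

        // byte aligned: four single-byte reads
        return p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24);
    }
}
```

Only the aligned case gives you the single, atomic access that a hardware register usually needs.  The other two branches just keep an unaligned address from faulting.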

1/5/2010 12:53:28 PM (Eastern Standard Time, UTC-05:00)
Wednesday, November 25, 2009

Long, long ago I wrote an article for MSDN on creating a multi-Form CF application that used a Form Stack.  Since we all tend to grow and learn as developers, we find different ways to do things (and generally we scoff at code we wrote years before as inferior crap). 

Well Peter Nowak is giving a presentation of the OpenNETCF.IoC framework next week and I decided that a sample of using the UI elements of the library (Workspaces, SmartParts, etc) might be handy and so I decided to rewrite the Form Stack following my latest thinking.  Basically the idea is to allow a user to navigate through Forms (actually Views - you are separating your Views from the Model, right?) like you would on a browser.  You can move forward and back as well as "adding" to the end or top of the stack.  We also don't want to be constantly creating new instances of the View classes because we like applications to perform well.

Here's a look at the new application (yes, I know it's "developer ugly" but this is about architecture, not aesthetics):

You can see the stack in the list, and our current position is noted by the asterisk.  "Fwd" will move down to View B, "Back" will move up to View A, or you can push a new A or B onto the stack at the current location, which will truncate everything currently above (after) the current position.  Again, think of how your browser works.  It's important to know, also, that there are only 3 total View instances created at this point (one of each specific type).
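To make the navigation rules concrete, here is a minimal sketch of that kind of browser-style stack.  This is not the code from FormStackCS and it does not use the OpenNETCF.IoC API: the ViewStack, ViewA, ViewB and ViewC names are all hypothetical, and a real implementation would deal with Workspaces and SmartParts rather than plain objects.  It only shows the three ideas: back/forward movement, truncation on push, and one cached instance per View type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical placeholder views, for illustration only.
public class ViewA { }
public class ViewB { }
public class ViewC { }

// Hypothetical sketch; not the OpenNETCF.IoC implementation.
public class ViewStack
{
    private readonly List<Type> m_entries = new List<Type>();
    private readonly Dictionary<Type, object> m_instances = new Dictionary<Type, object>();
    private int m_position = -1;

    public int Count { get { return m_entries.Count; } }
    public int Position { get { return m_position; } }
    public int InstanceCount { get { return m_instances.Count; } }

    public object Current
    {
        get { return m_position < 0 ? null : GetInstance(m_entries[m_position]); }
    }

    // Pushes a view type after the current position,
    // truncating everything that was after it (like a browser).
    public object Push(Type viewType)
    {
        int next = m_position + 1;
        if (next < m_entries.Count)
        {
            m_entries.RemoveRange(next, m_entries.Count - next);
        }
        m_entries.Add(viewType);
        m_position = next;
        return Current;
    }

    public bool Back()
    {
        if (m_position <= 0) return false;
        m_position--;
        return true;
    }

    public bool Forward()
    {
        if (m_position >= m_entries.Count - 1) return false;
        m_position++;
        return true;
    }

    // Creates each view type once and reuses that instance afterwards.
    private object GetInstance(Type viewType)
    {
        object view;
        if (!m_instances.TryGetValue(viewType, out view))
        {
            view = Activator.CreateInstance(viewType);
            m_instances.Add(viewType, view);
        }
        return view;
    }
}
```

The stack stores types rather than instances, so pushing ViewB five times still leaves you with a single ViewB object in memory.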

The code for this application is in source control over at the OpenNETCF.IoC Framework Codeplex site.  You'll notice it's called FormStackCS, hinting that there may be a FormStackVB coming.  If you'd like to volunteer to do that port, by all means let me know (meaning don't hold your breath waiting for me to do it).

11/25/2009 12:21:11 PM (Eastern Standard Time, UTC-05:00)
Friday, November 20, 2009

If you've ever done mstest unit testing with a Smart Device project, then you're painfully aware of how badly Microsoft dropped the ball on this one.  Debugging a unit test requires making device registry modifications, adding a call to Debugger.Break in your code, then telling Studio to Attach to Remote Process once the breakpoint has been hit.  Seriously, that's their officially published answer to how you debug a Smart Device unit test!

If you know anything about testing, you know that keeping the cycle time for a test to a minimum is critical.  The longer it takes a developer to go from "start testing" to a break point where they can step, the less productive they're going to be.  Even worse, if the process is painful, slow and convoluted (check, check and check for Microsoft's recommendation), they're likely to just skip writing tests altogether.

Internally we get around this by using our own test runner which uses Reflection to load up and run tests.  I've decided to once again give back to the community and publish this gem as part of Project Resistance (it will get checked in to the IoC Framework as well).
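For anyone who hasn't seen the approach before, the core of a Reflection-based runner is small.  The sketch below is not the CFTestRunner code.  MiniTestRunner and SampleTests are hypothetical names, and the TestMethodAttribute defined here is a stand-in so the example compiles without the mstest assembly.  Matching the attribute by name is just one possible way to avoid a hard reference.

```csharp
using System;
using System.Reflection;

// Stand-in attribute so the sketch compiles on its own (an assumption,
// not the real mstest type).
[AttributeUsage(AttributeTargets.Method)]
public class TestMethodAttribute : Attribute { }

// Hypothetical test class used to exercise the runner.
public class SampleTests
{
    [TestMethod]
    public void Passes() { }

    [TestMethod]
    public void Fails() { throw new InvalidOperationException("expected failure"); }

    public void NotATest() { }
}

// Hypothetical minimal runner, for illustration only.
public static class MiniTestRunner
{
    // Runs every public instance method carrying an attribute named
    // "TestMethodAttribute" and returns { passed, failed }.
    public static int[] Run(Type testClass)
    {
        int passed = 0;
        int failed = 0;
        object instance = Activator.CreateInstance(testClass);
        BindingFlags flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;

        foreach (MethodInfo method in testClass.GetMethods(flags))
        {
            if (!IsTest(method)) continue;
            try
            {
                method.Invoke(instance, null);
                passed++;
            }
            catch (TargetInvocationException ex)
            {
                // Invoke wraps whatever the test threw in a TargetInvocationException.
                failed++;
                Console.WriteLine(method.Name + " failed: " + ex.InnerException.Message);
            }
        }
        return new int[] { passed, failed };
    }

    private static bool IsTest(MethodInfo method)
    {
        foreach (object attribute in method.GetCustomAttributes(false))
        {
            if (attribute.GetType().Name == "TestMethodAttribute") return true;
        }
        return false;
    }
}
```

Because a runner like this is an ordinary application, you can set a breakpoint inside a test and hit F5 like you would with any other project.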

It does not support everything that mstest does, but it's got enough to get you going, and I think it's at least reasonably easy to modify if it doesn't meet your needs.  The currently supported attributes are:

It also might not be obvious how to set it up for your own app.  You need to add a reference to your test assemblies (so VS will deploy them - for some stupid reason you can't tell it to do so via the Configuration Manager) and make sure all projects are set to deploy to the same place.

As usual, if you have feedback or updates, please let me know.  Submitting a patch right on one of the project portals is probably the easiest way (hint, hint).

It's probably worth noting here that the code for this is the CFTestRunner project, and you have to pull it from the source tab on the project site (it's not in the release download yet).

11/20/2009 1:36:06 PM (Eastern Standard Time, UTC-05:00)
Monday, November 02, 2009

For those following Project Resistance, this is probably going to be a week with little progress.  I've got a major milestone on another project that we're trying to prepare for followed by an on-site installation so my activity in the code base will be very, very limited.  I believe Alex is also on-site and swamped this week as well.  We will be back at it next week though, hopefully with some new graphics so we can hammer out the final details of the UI.

11/2/2009 8:57:31 PM (Eastern Standard Time, UTC-05:00)