Friday, November 11, 2005
You Passed 8th Grade Science
Congratulations, you got 8/8 correct!
11/11/2005 10:53:31 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
Your IQ Is 125
Your Logical Intelligence is Below Average

Your Verbal Intelligence is Genius

Your Mathematical Intelligence is Exceptional

Your General Knowledge is Exceptional
11/11/2005 10:53:10 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Friday, November 04, 2005

Once again the OpenNETCF SDF has won Pocket PC Magazine's Best Software Award in the “.NET Developer Package” category, for both SmartPhone and Pocket PC.  Kudos to the team and thanks to the judges and community for making it a success.

SDF 2.0 will bring even more excitement - more on that later.... 

11/4/2005 6:58:39 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [1]  | 
 Saturday, October 29, 2005

If you've not heard, here are a few bits:

  1. Studio 2005 RTM is available for download for those with MSDN subscriptions.  If you've got a subscription, don't wait until the 7th, go get it now!
  2. CF 2.0 distributables are available here
  3. A Platform Builder 5.0 QFE for CF 2.0 is available here.
10/29/2005 10:11:41 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [0]  | 
 Monday, October 24, 2005

Yesterday I was working on a project using C++ for a device under Visual Studio '05 and for some reason it stopped linking with the following error:

error LNK2019: unresolved external symbol __security_check_cookie referenced in function "int __cdecl RegisterAndActivate(void)" (?RegisterAndActivate@@YAHXZ)

No idea even what the error meant, so I messed with some settings, rolled back code - all the usual things to try to find it.  Nothing.  I decided to bag it for the night - maybe fresh eyes today would help.

I loaded the project again today.  Same error.  I removed the project from the solution, created a new one, and re-added my code. Same error.  So I turned to my fellow coders to see if anyone else had seen this.

Jeff Abraham of Microsoft replied as follows:

My psychic powers tell me that you are building against PPC2003, and you aren't linking against secchk.lib. Try adding that to the linker inputs line, and seeing if that fixes the issue. I'm not sure how you would have gotten into this state without changing anything, as any project targeting PPC/SP 03 should have these libraries by default.

Sure enough, I added secchk.lib to the Additional Dependencies line under the Project's Linker | Input section and it's now building again.  Why the error occurred in the first place I'm not too concerned with - the release of Studio is only weeks away and I'll be uninstalling soon, but Googling on this error turned up nothing helpful at all.  Hopefully this blog entry will remedy that if someone else is looking.

10/24/2005 5:14:42 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [2]  | 
 Thursday, September 15, 2005

So I just posted a blog entry about debugging and lo and behold, not 30 minutes later I hit an exception in my platform.  I've done this tracing before, but this time I decided I'd document my discovery process and post it here as a real-world example.  It turned out to be an interesting example that I'd not hit before, so all the more fun to post.

So here's the background: I have a CE 4.2 PXA255-based system.  When I go to sleep and then wake back up with a USB device (keyboard, mouse or mass storage device) inserted I get the following exception:

RaiseException: Thread=81a8db58 Proc=816e7490 'device.exe'
AKY=00000005 PC=03fd3f80 RA=800bd398 BVA=00000001 FSR=00000001

This exception only happens at wake - if the device is in during boot there's no problem.

Nicely, when PB makes an image, it dumps the output into a PLG file in the same folder as your WCE and PBW files.  This output provides the map of where everything gets assembled. 

So let's take a look at what my exception is telling me.  I like to start with the return address (RA in the exception output) to see who the caller is causing the problem.  From Sue's entry, we can tell immediately it's in the kernel because the address is after 80000000. 

We need to find the offset (plus verify this assumption) so we'll look at the build output.  Here's an excerpt from my myproject.plg file:

MODULES Section
Module                 Section  Start     Length  psize   vsize   Filler
---------------------- -------- --------- ------- ------- ------- ------
nk.exe                 .text    800b9000  258048  255488  255440 o32_rva=00001000
nk.exe                 .pdata   800f8000    8192    8192    7872 o32_rva=0006a000
cplmain.cpl            .text    800fa000  110592  109056  108976 o32_rva=00001000
cplmain.cpl            .rsrc    80115000  126976  126464  126276 o32_rva=0001f000

We can see that our return address of 800bd398 is in nk.exe at offset 0x4398 (0x800bd398-0x800b9000).
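The same lookup can be sketched in a few lines of Python.  This is only an illustration: the table is hand-copied from the MODULES excerpt above, and the names SECTIONS and locate are made up for this example rather than being part of any Platform Builder tool.

```python
# Section table copied by hand from the MODULES excerpt above:
# (module, section, start address, length in bytes).
SECTIONS = [
    ("nk.exe", ".text", 0x800B9000, 258048),
    ("nk.exe", ".pdata", 0x800F8000, 8192),
    ("cplmain.cpl", ".text", 0x800FA000, 110592),
    ("cplmain.cpl", ".rsrc", 0x80115000, 126976),
]


def locate(address, sections=SECTIONS):
    """Return (module, section, offset) for the section containing address, or None."""
    for module, section, start, length in sections:
        if start <= address < start + length:
            return module, section, address - start
    return None


module, section, offset = locate(0x800BD398)
print(module, section, hex(offset))  # nk.exe .text 0x4398
```

Checking against the section length as well as the start address is what verifies the "it's in the kernel" assumption rather than just trusting it.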

Now we look in our _FLATRELEASEDIR and pull out the MAP file for NK.EXE.  This tells us the offsets for the beginnings of every function in the module (keep an eye on statics listed at the bottom of the map file).  Here's an excerpt from NK.MAP:

 0001:000041f8       UndefException             8c5851f8     nk:armtrap.obj
 0001:00004218       SWIHandler                 8c585218     nk:armtrap.obj
 0001:0000421c       FIQHandler                 8c58521c     nk:armtrap.obj
 0001:00004238       PrefetchAbortEH            8c585238     nk:armtrap.obj
 0001:00004240       PrefetchAbort              8c585240     nk:armtrap.obj
 0001:000044bc       MD_CBRtn                   8c5854bc     nk:armtrap.obj
 0001:00004578       CommonHandler              8c585578     nk:armtrap.obj
 0001:00004584       SaveAndReschedule          8c585584     nk:armtrap.obj

My calculated offset of 0x4398 is after the start of PrefetchAbort and before MD_CBRtn, so my exception is being thrown by PrefetchAbort.  That surprises me - why the hell would PrefetchAbort be throwing an exception?  Even better, why is my code in PrefetchAbort in the first place?

Let's see where the exception is coming from exactly by chasing the program counter (PC=03fd3f80) from the exception.  Again we turn to our PLG file.  Here's an excerpt:
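Finding "the last symbol that starts at or before my offset" is a sorted-list search.  Here is a minimal Python sketch, assuming the symbols are hand-copied from the NK.MAP excerpt above; NK_SYMBOLS and enclosing_symbol are hypothetical names for this example.

```python
from bisect import bisect_right

# (section offset, symbol) pairs copied from the NK.MAP excerpt above,
# kept in ascending order of offset.
NK_SYMBOLS = [
    (0x41F8, "UndefException"),
    (0x4218, "SWIHandler"),
    (0x421C, "FIQHandler"),
    (0x4238, "PrefetchAbortEH"),
    (0x4240, "PrefetchAbort"),
    (0x44BC, "MD_CBRtn"),
    (0x4578, "CommonHandler"),
    (0x4584, "SaveAndReschedule"),
]


def enclosing_symbol(offset, symbols=NK_SYMBOLS):
    """Return (name, distance) for the last symbol starting at or before offset."""
    starts = [start for start, _ in symbols]
    index = bisect_right(starts, offset) - 1
    if index < 0:
        return None  # offset is below the first symbol in the excerpt
    start, name = symbols[index]
    return name, offset - start


name, distance = enclosing_symbol(0x4398)
print(name, hex(distance))  # PrefetchAbort 0x158
```

The answer is only as good as the symbol list: any statics listed at the bottom of the map file would have to be merged into the sorted list first, or the search can name the wrong function.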

Module coredll.dll   at offset 01fff000 data, 03f71000 code
Module regenum.dll   at offset 01ffd000 data, 03f61000 code
Module pm.dll        at offset 01ffb000 data, 03f51000 code

Based on this we can see that our program counter address of 0x03fd3f80 is in coredll.dll at offset 0x62f80.  This is odd as I'd expect it to be in phci.dll or device.exe.  It appears that maybe we're passing some invalid info to a Win32 API, which then causes it to throw an exception?  Let's do more digging.
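For the "Module ... at offset ... data, ... code" lines, a rough Python sketch of the same reasoning follows.  It assumes the three-line excerpt above is the whole list and that the right module is the one with the highest code base at or below the PC; the excerpt gives no code lengths, so there is no upper-bound check.  The names PLG_EXCERPT, code_bases and module_for_pc are invented for this example.

```python
import re

# The three lines quoted above from the PLG file.
PLG_EXCERPT = """\
Module coredll.dll   at offset 01fff000 data, 03f71000 code
Module regenum.dll   at offset 01ffd000 data, 03f61000 code
Module pm.dll        at offset 01ffb000 data, 03f51000 code
"""

LINE = re.compile(
    r"Module\s+(\S+)\s+at offset\s+([0-9a-fA-F]+) data,\s+([0-9a-fA-F]+) code"
)


def code_bases(plg_text):
    """Parse the module lines into a sorted list of (code_base, name) pairs."""
    return sorted((int(m.group(3), 16), m.group(1)) for m in LINE.finditer(plg_text))


def module_for_pc(pc, plg_text=PLG_EXCERPT):
    """Pick the module with the highest code base at or below pc (no length check)."""
    candidates = [(base, name) for base, name in code_bases(plg_text) if base <= pc]
    if not candidates:
        return None
    base, name = candidates[-1]
    return name, pc - base


name, offset = module_for_pc(0x03FD3F80)
print(name, hex(offset))  # coredll.dll 0x62f80
```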

We go back to our _FLATRELEASEDIR and open up COREDLL.MAP to find what function in there is the culprit.  Here's an excerpt:

 0001:00062e40       __fp_mult_uncommon         10063e40     coredll_ALL:mul.obj
 0001:00062f5c       __rt_div0                  10063f5c f   coredll_ALL:__div0.obj
 0001:00062f90       __ld12mul                  10063f90 f   coredll_ALL:tenpow.obj

Ahhh...now things come together.  An offset of 0x62f80 puts us in a function called “__rt_div0”, which I'm going to assume is a runtime divide by zero exception.  That explains why we see an RA leading to PrefetchAbort.  Something in the platform code is causing a divide by zero error, which then causes PrefetchAbort to be called, which in turn enters __rt_div0.

So now what?  I still don't know what library or function caused the exception, and herein is a flaw in trying to track problems with addresses.  In this case (and this is the first I've seen this happen) I've got almost nothing.  I know it's in the USB host driver simply because of the physical effects (it happens only with USB devices, and when it happens the device stops working).  I know it's a divide by zero error.  That's all I get from this entire exercise - I wish it was a prettier case study that led me right to the line of code causing it, but hey, if it were simple everyone would do it.

9/15/2005 2:40:42 PM (Eastern Daylight Time, UTC-04:00)  #    Comments [0]  | 

Everyone that's done much development in CE knows about Doug Boling's definitive article on CE memory management.  If you've not read it, you should - if you have read it, it doesn't hurt to read it again.  Go ahead and click the link - I'll wait here for you....

Done reading?  Good.  Now let's take a deeper look through the magic of blog voyeurism (yes, I'm not giving any info here other than links - I'm lazy).  Back in February, John Eldridge talked about how to tell if a debug symbol is even correctly being reported, and it's good info.  Again, I'll wait while you read....

Now this got Sue Loh to thinking about what he said and a couple days later she put together a fantastic blog entry on making sense of virtual addresses.  This one is short, but the info in it is invaluable if you do any amount of debugging at all.  No clue how I missed it until today, but it's one that every CE developer should read.

She then followed that up with an entry that shows how to go from an address to a symbol, allowing you to make use of that cryptic crap that Data Aborts give.

So there you go, links to some of the most valuable debugging information I've probably ever read, all in one place.  If you have other good debugging links, post them in the comments.

9/15/2005 11:50:27 AM (Eastern Daylight Time, UTC-04:00)  #    Comments [0]  |