Monday, August 20, 2007

My love

The first computer I worked on was a VT100 terminal attached to a central server in the sacred server room, which we could only look at from outside. We had a slot of not more than an hour to work on it. This was at my campus 7 years back. Computers really have changed a lot, and so have the interfaces and the software that we interact with. But a few things will never change, at least for me. Top of the list, I guess, is Vi. I am a Vi fan. Through four years at the campus and three years of working in the industry, I have found Vi incredibly powerful and easy.

Some time back NB was talking about this cool dood friend of his who used to do everything imaginable in Emacs, right from writing and debugging source code to checking his mail. He said he had written a plugin which could rip the attachment off a multi-part mail and save it to a directory which he could go look into. Emacs, I believe, has a lot of popularity and a fan following, so I thought I would give it a shot. For a month I "tried" religiously to use Emacs for editing. It is a good editor, but it does not beat Vi. It might be my love for Vi which hides the goodness of Emacs ;) but the fact remains that to use Emacs you really have to press a few too many keys. One of my favourite "shortcuts" was "CTRL-x a i g" for inverse-add-global-abbrev [Ok Ok.... It is a pretty fancy feature... If you want to use it you have to sacrifice a little bit... On the plus side you will get more flexible fingers >:)] Anyways, after a month I gave up... I am back to Vi now...

Wednesday, August 1, 2007

Stone Age Debugging

How do you debug a reasonably complex system without tools like a debugger, core dumps, or even console access? I am talking about a system with no printfs, no gdb, and no persistent filesystem to dump a core to on a crash.

I work on an embedded system: TI C62x DSPs on a Motorola MPC board. A few weeks back I was debugging a problem that caused the DSPs to crash every 18-20 hours. We don't have JTAG to peek into the DSPs. All we have is a shared-memory debugging mechanism: the DSPs keep updating a memory location with a code trace, and when a DSP hits a problem it jumps to HALT. The Controller (the MPC) detects this through a heartbeat mechanism, reads the shared memory via the DSP's HPI, and dumps the data to a file.
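To make the code-trace idea concrete, here is a minimal sketch of a breadcrumb ring buffer. It is not our actual implementation: the struct layout, TRACE_DEPTH, and the names trace_mark and trace_dump are all made up for illustration, and a plain static struct stands in for the shared memory that the Controller would really read over HPI.

```c
#include <stdint.h>

#define TRACE_DEPTH 8u   /* hypothetical: number of breadcrumbs kept */

/* Hypothetical layout of the shared trace area. On a real board this
 * would sit at a fixed address that the Controller can reach. */
typedef struct {
    volatile uint32_t next;                 /* total breadcrumbs written so far */
    volatile uint32_t crumb[TRACE_DEPTH];   /* most recent checkpoint ids */
} trace_area_t;

static trace_area_t g_trace;   /* stand-in for the shared memory */

/* DSP side: record that execution reached checkpoint `id`.
 * The oldest entry is overwritten once the ring is full. */
static void trace_mark(uint32_t id)
{
    uint32_t n = g_trace.next;
    g_trace.crumb[n % TRACE_DEPTH] = id;
    g_trace.next = n + 1u;
}

/* Controller side: copy the surviving breadcrumbs into out[], oldest
 * first. Returns the number of entries copied (at most TRACE_DEPTH). */
static uint32_t trace_dump(const trace_area_t *t, uint32_t out[TRACE_DEPTH])
{
    uint32_t total = t->next;
    uint32_t count = total < TRACE_DEPTH ? total : TRACE_DEPTH;
    uint32_t first = total - count;

    for (uint32_t i = 0; i < count; i++) {
        out[i] = t->crumb[(first + i) % TRACE_DEPTH];
    }
    return count;
}
```

Sprinkling trace_mark(id) calls at interesting points means that whatever survives in the ring after a halt shows the last few places the code visited.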

However, the problem we have is this: if the DSP really crashes, i.e. if it does NOT halt but instead badly screws up and reboots, the Controller goes for a toss, eventually causing the whole system to reset. That is where the problem starts.

I spent the first 3 days trying to fix the MPC reset. A little digging showed that the MPC reset was due to the following sequence of events:
- DSP has a problem
- DSP resets
- MPC misses a heartbeat from the ailing DSP
- MPC tries to read the shared memory for the logs [thinking that the DSP has halted]
- The DSP handles are no longer valid because the DSP has reset
- MPC crashes, bringing down the whole system
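The heartbeat step in that sequence can be sketched roughly as below. This is only an illustration under assumptions: I am assuming the DSP bumps a counter that the MPC polls, and hb_monitor_t, hb_poll and MAX_MISSED_BEATS are hypothetical names, not our real code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_MISSED_BEATS 3u   /* hypothetical threshold */

typedef struct {
    uint32_t last_seen;   /* heartbeat counter value at the previous poll */
    uint32_t missed;      /* consecutive polls with no change */
} hb_monitor_t;

/* Call once per poll with the counter value read from the DSP.
 * Returns true once the counter has been stuck for MAX_MISSED_BEATS
 * consecutive polls, i.e. the DSP should be treated as dead. */
static bool hb_poll(hb_monitor_t *m, uint32_t counter)
{
    if (counter != m->last_seen) {
        m->last_seen = counter;   /* DSP is alive: remember the new value */
        m->missed = 0;
        return false;
    }
    m->missed++;
    return m->missed >= MAX_MISSED_BEATS;
}
```

Note that a stuck counter only says the DSP has stopped beating. It does not say whether the DSP halted cleanly or reset, which is exactly the ambiguity that trips up the MPC in the sequence above.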

Day1:
Aim: The DSP resets by jumping to c_int00, so override that entry point to loop infinitely [or HALT]

Approach:
  • Find address of c_int00 from map file
  • When the DSP is loaded, overwrite that address with JMP HLT code [RTFM for asm or write while (1), compile, check asm]
Result: Day wasted. The DSP still resets. Either the code I wrote was wrong [no way for me to figure out] or the screwup is worse than I thought.

Day2:
Aim: The MPC crashes because it is trying to read using an invalid handle, so get a new handle before reading

Approach: On DSP failure
  • Close the DSPs
  • Get new handle
  • DON'T download the code to the DSP
  • Open HPI
  • Read from the shared location
Result: I get something, but it looks corrupted. At least the MPC does not crash. But I am still nowhere.

Day3:

Aim: Look for the root cause. The DSP is probably resetting because of a stack overflow, so arrest that

Approach: In the main task that runs every 20 msecs, check if the stack usage is going beyond 80%; if so, HALT

Result: None, half the day wasted
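For the record, one common way to do such a stack check is a painted high-water mark, sketched below. This is not necessarily what I ran on the DSP: the names, the STACK_FILL pattern, and the array that stands in for the stack are all hypothetical, and the sketch assumes a stack that grows downward with base[0] as its deepest word.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu   /* hypothetical paint pattern */

/* Paint the whole stack region before the task starts running. */
static void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Assumes the stack grows downward and base[0] is its lowest address.
 * Words that still hold the paint pattern at the low end were never
 * touched, so everything above them counts as used. */
static size_t stack_used_words(const uint32_t *base, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && base[untouched] == STACK_FILL) {
        untouched++;
    }
    return words - untouched;
}

/* True when the high-water mark exceeds `percent` of the region. */
static bool stack_over_limit(const uint32_t *base, size_t words,
                             unsigned percent)
{
    return stack_used_words(base, words) * 100u > words * percent;
}
```

Calling stack_over_limit(base, words, 80) from the periodic task gives the 80% check. It is a heuristic: it reports the deepest point ever reached, and a value on the stack that happens to equal the paint pattern can make it under-report.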

Aim: There is a buffer overflow, arrest that

Approach: Put a "guard band" after every major buffer and, on every 20 msec cycle, check whether something has been written to it. For example, say there is a buffer char caImportantBuffer [100];
Modify that to: char caImportantBuffer [100 + GUARD_BAND];
memset (caImportantBuffer + 100, 0x34, GUARD_BAND); /* memset fills byte by byte, so a 16-bit pattern like 0x1234 would be truncated to 0x34 anyway */

Every 20 msecs, check if the last GUARD_BAND bytes have changed; if they have, HALT.
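Spelled out a little more fully, the guard band check could look like the sketch below. The names and sizes (important_buffer, GUARD_SIZE, GUARD_BYTE, guard_init, guard_intact) are hypothetical stand-ins, and the pattern is a single byte because memset only ever writes one byte value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE    100u
#define GUARD_SIZE  16u    /* hypothetical guard band length in bytes */
#define GUARD_BYTE  0xA5   /* single-byte pattern: memset fills byte by byte */

/* The buffer proper is the first BUF_SIZE bytes; the guard band follows. */
static unsigned char important_buffer[BUF_SIZE + GUARD_SIZE];

/* Fill the guard band once at startup. */
static void guard_init(void)
{
    memset(important_buffer + BUF_SIZE, GUARD_BYTE, GUARD_SIZE);
}

/* Call every cycle: returns false if anything wrote into the guard band. */
static bool guard_intact(void)
{
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (important_buffer[BUF_SIZE + i] != GUARD_BYTE) {
            return false;
        }
    }
    return true;
}
```

On the target, a false return from guard_intact() is where the HALT would go. The check can miss an overrun that happens to write the same byte value as the pattern, and it only catches writes that run off the end of the buffer, not ones before its start.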

Result: ONE bug found, but the problem still remains. A big achievement, but miles to go before I sleep.


To be continued......

On getting drunk

Ok... To set the expectations right.... I don't drink... A teetotaler.... NEVER have I touched daaru (That is what I call everything.... From Desi tharra to Breezers (Fondly referred to as Juice) to Tequila Shots) in my life and never do I intend to.... that said..... I do get drunk..... by induction.... Initially everyone (including me) thought that I was trying to act drunk in drunk company and later I realized that I totally enjoy the drunk parties... as much as anyone dead drunk would.. Comes in pretty handy.... You can get all the high that a drunk would get and you would not do anything silly that you may regret later...

So at our office party today we were all drunk and we talked about all sorts of things... from Marriages... to College Blues...

One of the more interesting topics that we talked about was arranged vs. love marriages and that every guy tries at least once to find love but not every time does one get love... or people take "Mature decisions" not to go with what the heart tells them... the question was... will you be a loser if you take such a decision.... maybe yes or maybe not.... Before we could find out we realized the high had come down... nerves were being touched.... so instead of finding an answer to that we discussed the more interesting topic of not looking for love.... just flings instead... And back to being happy.....

But I, being not drunk drunk... was forced to think of the more important question.... how many times do we let go of the more important things... the more painful things... how many times do we put aside a more meaningful question.... turning ourselves away from what we know matters to something that makes no difference... just avoiding... running away.... from life..... Living.....