Wednesday, July 6, 2011

Anatomy of a HEX File

Before we can write a bootloader that reprograms our PIC for us, we need to understand what a C18-compiled .hex file looks like. To do this I'm just using the MPLABX environment I set up on my Ubuntu dev machine. If we're going to look at a hex file then we need to get a hex file first. Let's keep it very simple. I created a new standalone project in MPLABX called HEXplorer. I added a single file named main.c with the following contents:
void main(void)
{
    return;
}

That's as simple as I know how to make it. It compiled fine and created a hex file that looks like this:

Twenty-three lines of instructions for a four-line program? This highlights a good point. C compilers are very nice utilities that let you write firmware in a high-level language, but you do pay a price for it. Another interesting thing to notice when you build this simple program is the RAM usage. After a clean build it showed 268 bytes, or 7% of total RAM, being used. Where is all that RAM going? Good question. If you look at section 3.3 (START-UP CODE) of the C18 User Guide you'll read that the default startup behavior that C18 creates involves initializing stack pointers. Those stack pointers point to a stack that is, by default, 256 bytes in size. You can control that size with a custom linker file. More on that in a later post. In section 3.4 of the User Guide you see that the compiler-managed resources account for a minimum of 12 bytes. 256 + 12 accounts for our 268 bytes.

HEX Demystified

"I don't even see the code. All I see is blonde, brunette, redhead. Hey uh, you want a drink?"

So if you look at the Intel HEX page on Wikipedia (I recommend you read it before going any further) you'll see that there are multiple formats for HEX files, so the first thing we need to do is figure out what format we're looking at. That's easy: we just go to the Project Properties in MPLABX, select the MPLINK item on the left, and see on the right that the HEX file format is INHX32.

Alright, so what do we know about that format? Well, the MPLINK documentation can tell us. This is what it says:

INTEL HEX 32 FORMAT
The extended 32-bit address hex format is similar to the hex 8 format, except that the extended linear address record is also output to establish the upper 16 bits of the data address. This is mainly used for 16-bit core devices since their addressable program memory exceeds 64 kbytes. Each data record begins with a 9-character prefix and ends with a 2-character checksum. Each record has the following format:
:BBAAAATTHHHH...HHCC
BB A two digit hexadecimal byte count representing the number of data bytes that will appear on the line.
AAAA A four digit hexadecimal address representing the starting address of the data record.
TT A two digit record type:
00 – Data record
01 – End of File record
02 – Segment Address record
04 – Linear Address record
HH A two digit hexadecimal data byte, presented in low byte/high byte combinations.
CC A two digit hexadecimal checksum that is the two's complement of the sum of all preceding bytes in the record.
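
The field layout above maps fairly directly onto a small parser. What follows is a minimal sketch in portable C (not C18-specific, and not taken from MPLINK); the `hex_record` struct and the `parse_hex_record`, `hex_byte`, and `hex_nibble` names are hypothetical and exist only for illustration. It assumes the whole record is available as a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEX_MAX_DATA 255  /* BB is one byte, so at most 255 data bytes */

/* Hypothetical container for one decoded record. */
typedef struct {
    uint8_t  byte_count;          /* BB   */
    uint16_t address;             /* AAAA */
    uint8_t  record_type;         /* TT   */
    uint8_t  data[HEX_MAX_DATA];  /* HH.. */
    uint8_t  checksum;            /* CC   */
} hex_record;

/* Convert one ASCII hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read two hex digits at p into *out. Returns 0 on success, -1 on error.
   A NUL terminator is not a hex digit, so short lines fail safely. */
static int hex_byte(const char *p, uint8_t *out)
{
    int hi = hex_nibble(p[0]);
    int lo;
    if (hi < 0) return -1;
    lo = hex_nibble(p[1]);
    if (lo < 0) return -1;
    *out = (uint8_t)((hi << 4) | lo);
    return 0;
}

/* Parse ":BBAAAATTHH...HHCC".
   Returns 0 if the record is well formed and its checksum matches, else -1. */
int parse_hex_record(const char *line, hex_record *rec)
{
    uint8_t b, addr_hi, addr_lo, sum;
    size_t i;

    assert(line != NULL && rec != NULL);
    if (line[0] != ':') return -1;
    line++;

    if (hex_byte(line, &rec->byte_count)) return -1;
    if (hex_byte(line + 2, &addr_hi)) return -1;
    if (hex_byte(line + 4, &addr_lo)) return -1;
    if (hex_byte(line + 6, &rec->record_type)) return -1;

    rec->address = (uint16_t)((addr_hi << 8) | addr_lo);
    sum = (uint8_t)(rec->byte_count + addr_hi + addr_lo + rec->record_type);

    for (i = 0; i < rec->byte_count; i++) {
        if (hex_byte(line + 8 + 2 * i, &b)) return -1;
        rec->data[i] = b;
        sum = (uint8_t)(sum + b);
    }
    if (hex_byte(line + 8 + 2 * (size_t)rec->byte_count, &rec->checksum))
        return -1;

    /* CC is the two's complement of the sum, so sum + CC == 0 (mod 256). */
    return ((uint8_t)(sum + rec->checksum) == 0) ? 0 : -1;
}
```

The function returns 0 for a well-formed record whose checksum matches and -1 otherwise, so the standard end-of-file record `:00000001FF` parses cleanly while a record with a corrupted final byte is rejected.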

So let's look at the first line in our compiled file:

:020000040000FA

Broken into its parts we have
: - Indicates the start of the line
02 - Tells us that there are 0x02 or 2 bytes in the data segment
0000 - Will always be 0000 for record type 04.
04 - The record type is the Linear Address record. The data bytes represent the upper 16 bits of a 32 bit address.
0000 - The two data bytes. This makes sense because our program is small and will be under the 64KB (0xFFFF) range so the upper 16 bits of the address should be all zeroes.
FA - The checksum for the line. We're going to ignore this for now.
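
For the curious, the checksum and the address extension in this record can both be checked with a few lines of C. This is only a sketch: `hex_checksum` and `hex_full_address` are hypothetical helper names, and the code assumes the record's bytes have already been converted from ASCII to binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two's complement of the 8-bit sum of every byte before the checksum
   (byte count, address, record type, and data). */
uint8_t hex_checksum(const uint8_t *bytes, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    assert(bytes != NULL || len == 0);
    for (i = 0; i < len; i++)
        sum = (uint8_t)(sum + bytes[i]);
    return (uint8_t)(0u - sum);
}

/* Combine the upper 16 bits carried by a type 04 record with the
   16-bit AAAA field of a later data record. */
uint32_t hex_full_address(uint16_t upper16, uint16_t record_address)
{
    return ((uint32_t)upper16 << 16) | record_address;
}
```

Feeding in the bytes 02 00 00 04 00 00 gives a sum of 0x06, and the two's complement of 0x06 is 0xFA, which matches the last field of the line. With an upper half of 0x0000, every data record's four-digit address is already the full address.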

Well, that's not horribly interesting since that is how all of our PIC18 programs are likely to start. So let's look at one more.

:0600000063EF00F01200A6

This one is more interesting.

: - Start of the line
06 - Data segment will contain 6 bytes of data.
0000 - The starting address of the data in the data segment
00 - The record type is a Data Record
63EF00F01200 - The actual data. We'll dissect that in a second.
A6 - The checksum.

Ok, so this time we're actually getting into the data that should be programmed onto our device at the reset vector (address 0x000). An important thing to remember here is how the data is organized. We read it from left to right, but the two bytes of each 16-bit instruction word are swapped. Back in the MPLINK documentation it describes the data as "A two digit hexadecimal data byte, presented in low byte/high byte combinations." That means that "63EF" is really "EF63" if we want to break it out into binary and figure out the opcode, which we obviously want to do, right? This is confusing at first and seems completely insane and backward, but it will help us later: when we program the PIC we write the LSB first and then the MSB, so we'll program the device in exactly the order the data appears in the hex file. It just needs to be swapped when we humans are looking at it.
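
The byte swap is easy to express in code. Here is a small sketch, assuming the data bytes have already been pulled out of the record into an array; `bytes_to_words` is a hypothetical name, not something from C18 or MPLINK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pair up data bytes (low byte first) into 16-bit instruction words.
   Returns the number of words written, or 0 if len is odd. */
size_t bytes_to_words(const uint8_t *data, size_t len, uint16_t *words)
{
    size_t i;

    assert(data != NULL && words != NULL);
    if (len % 2 != 0) return 0;

    for (i = 0; i < len / 2; i++) {
        /* data[2*i] is the low byte, data[2*i + 1] is the high byte. */
        words[i] = (uint16_t)(data[2 * i] | ((uint16_t)data[2 * i + 1] << 8));
    }
    return len / 2;
}
```

Given the six data bytes 63 EF 00 F0 12 00, this produces the three words 0xEF63, 0xF000, and 0x0012.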

Dissecting the Data

So let's look at the first two bytes of data: EF63. Broken into binary we have

1110 1111 0110 0011

We take that information to the PIC18F27J53 datasheet, section 29.1, Table 29-2 which is the standard instruction set table. We just go down the instruction column until we find a match. Remember, some of the instructions have variable bits that will represent data for the instruction to use. That happens to be the case for our instruction which is the GOTO instruction. It has the format:

1110 1111 kkkk kkkk

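The k's represent the address to go to. The rest of the bits in our instruction word, “0110 0011”, are 0x63 in hex, but that is not the whole address. GOTO is a two-word instruction: the second word, F000 in our data (the bytes "00F0" swapped), has the format 1111 kkkk kkkk kkkk and carries the upper 12 bits of k, which are all zero here. The full 20-bit k is a word address (it is loaded into PC<20:1>), so the byte address is 0x63 x 2 = 0xC6. If you want some practice, try to figure out what the rest of the instructions are from our data segment. So, in review, the first instruction at the reset vector is going to be GOTO 0xC6.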
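
The same decode can be sketched in C. This assumes the standard PIC18 two-word GOTO encoding, where the first word holds the low 8 bits of k, the second word is 1111 followed by the upper 12 bits of k, and k is loaded into PC<20:1>. The `decode_goto` name is made up for this example.

```c
#include <stdint.h>

/* If w1/w2 form a PIC18 GOTO, store the target byte address in *target
   and return 0. Otherwise return -1 and leave *target untouched. */
int decode_goto(uint16_t w1, uint16_t w2, uint32_t *target)
{
    uint32_t k;

    if ((w1 & 0xFF00u) != 0xEF00u) return -1;  /* 1110 1111 kkkk kkkk */
    if ((w2 & 0xF000u) != 0xF000u) return -1;  /* 1111 kkkk kkkk kkkk */

    /* Upper 12 bits come from the second word, low 8 bits from the first. */
    k = ((uint32_t)(w2 & 0x0FFFu) << 8) | (uint32_t)(w1 & 0x00FFu);

    /* k counts 16-bit words, so the byte address is twice k. */
    *target = k << 1;
    return 0;
}
```

For the words EF63 and F000, k comes out as 0x63 and the target as 0xC6. The third word, 0012, does not match the GOTO pattern and is rejected.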

I don't expect you to take my word for it. Let's have MPLABX confirm this. After building the simple HEXplorer program we can go to Window->PIC Memory Views->Memory View 2. This will display the PIC Program Memory in a tab in the bottom view area and you can see what the program memory should contain after programming the PIC with the HEXplorer hex file.

Well, you're welcome to go through the rest of the hex file and try to decode all of the instructions but I think we've got a solid enough understanding of the hex file format to continue working on our bootloader. The next piece of our problem is to "Understand how to write to the PIC's program memory space." That will be up next.

Bootloader Beginnings

Before I even begin defining the sensor firmware I want to create a solid, easy to use bootloader. I referenced it in an earlier post and still like the idea of using an SD card. It's platform independent and requires no host machine or software. The drawback will be power consumption. I'm designing the platform to be battery powered, and while SD cards aren't huge power hogs, they're not exactly power sippers either in terms of embedded systems. Before we even get to the SD part, though, we need to make it so our PIC can read hex data and program itself. Many of the PICs, including the one I'm using for the project, are self programmable, meaning you don't need an external programmer to reprogram them. I've done this before using the Microchip bootloader as a base and customized it a bit, but I never really loved how it was written and laid out. I'm going for simplicity. So how do we solve this problem if we've never done it before? We do what any good programmer would do with a complicated software problem: break it into small, manageable pieces. So here are the pieces I've come up with to create a bootloader.
  1. Understand the anatomy of a hex file. If we're going to read a hex file and reprogram a PIC then we'd better understand what a hex file looks like.
  2. Understand how to write to the PIC's program memory space.
  3. Implement flash memory routines for the bootloader in C.
  4. Implement programming routines to program the PIC from hex data.

The Spec

So what is the specification we are tackling here? I think it's good to define what "done" is so here's what I've got:

As a user I want to be able to update the firmware on the platform with an SD card containing the updated firmware.

It's not too specific but that's ok. In my next post we'll start breaking down the pieces to our problem and see where we get.