Linking a binary blob with GCC

Posted on 2012/02/19

There’s a simple way to include binary data inside an executable when using a GCC toolchain. The trick is to use objcopy to transform the binary blob into an object file that can be linked like any other.

In this example I am creating a binary blob of 16 bytes of random data in the file “blob.bin”:

$ dd if=/dev/urandom of=blob.bin bs=1 count=16
16+0 records in
16+0 records out
16 bytes (16 B) copied, 8.7424e-05 s, 183 kB/s
$ hexdump -C blob.bin
00000000  2a 3b cb 0f 43 66 56 77  fd cc 5a e9 b9 73 a7 b2  |*;..CfVw..Z..s..|
00000010

Then I need to use objcopy to transform it; the command is of the form:

$ objcopy -I binary -O <target_format> -B <target_architecture> <binary_file> <object_file>

If you are not sure about the target architecture and target format, they can be found with something like:

$ > arch.c
$ gcc -c arch.c -o arch.o
$ objdump -f arch.o

arch.o:     file format elf32-i386
architecture: i386, flags 0x00000010:
HAS_SYMS
start address 0x00000000

$ rm -f arch.c arch.o

So, in my case the command is:

$ objcopy -I binary -O elf32-i386 -B i386 blob.bin blob.o

This command created a “blob.o” file that contains the symbols to access the data within:

$ objdump -t blob.o

blob.o:     file format elf32-i386

SYMBOL TABLE:
00000000 l    d  .data    00000000 .data
00000000 g       .data    00000000 _binary_blob_bin_start
00000010 g       .data    00000000 _binary_blob_bin_end
00000010 g       *ABS*    00000000 _binary_blob_bin_size
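
The symbol names above follow from the input file name. As far as I can tell, objcopy takes the name exactly as given on the command line (including any directory part), replaces every non-alphanumeric character with an underscore, and wraps the result in “_binary_” and “_start”, “_end” or “_size”. Here is a minimal sketch of that rule; blob_symbol_name is a hypothetical helper of mine, not something provided by binutils:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the name objcopy is expected to generate for `path` and the given
 * suffix ("start", "end" or "size"). Returns 0 on success, -1 if `out` is
 * too small. Hypothetical helper, assuming the naming rule described above.
 */
int blob_symbol_name(const char *path, const char *suffix,
                     char *out, size_t out_len)
{
  assert(path != NULL && suffix != NULL && out != NULL);

  int n = snprintf(out, out_len, "_binary_%s_%s", path, suffix);
  if (n < 0 || (size_t)n >= out_len)
    return -1;

  /* Replace every non-alphanumeric character of the path part with '_'. */
  size_t prefix_len = strlen("_binary_");
  size_t path_len = strlen(path);
  for (size_t i = prefix_len; i < prefix_len + path_len; i++)
    if (!isalnum((unsigned char)out[i]))
      out[i] = '_';

  return 0;
}
```

Calling blob_symbol_name("blob.bin", "start", buf, sizeof buf) fills buf with “_binary_blob_bin_start”, which matches the symbol table above. It also shows why running objcopy on “data/blob.bin” instead of “blob.bin” changes the names the C code has to declare.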

These symbols can be accessed by C code (and assembly too). Here’s a simple program that uses them:

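#include <stdio.h>
#include <stdint.h>

/* Symbols generated by objcopy: only their addresses are meaningful. */
extern unsigned char _binary_blob_bin_start;
extern unsigned char _binary_blob_bin_end;
extern unsigned char _binary_blob_bin_size;

int main(void)
{
  unsigned char *pblob = &_binary_blob_bin_start;
  while(pblob < &_binary_blob_bin_end)
  {
    /* A pointer difference has type ptrdiff_t, hence %td. */
    printf("%td: %02X\n", pblob - &_binary_blob_bin_start, *pblob);
    pblob++;
  }
  /* The size is the address of the *ABS* symbol, not the value stored there. */
  printf("size: %lu\n", (unsigned long)(uintptr_t)&_binary_blob_bin_size);

  return 0;
}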

Now we can compile the program and run it to see that the binary data can be accessed correctly.

$ gcc -c -o test_blob.o test_blob.c
$ gcc test_blob.o blob.o -o test_blob
$ ./test_blob
0: 2A
1: 3B
2: CB
3: 0F
4: 43
5: 66
6: 56
7: 77
8: FD
9: CC
10: 5A
11: E9
12: B9
13: 73
14: A7
15: B2
size: 16

It is also possible to rename the symbols that objcopy creates using the “--redefine-sym” option, and to put the data in a section with a different name and different flags using “--rename-section”.

I’ve also seen a method that involves translating the blob into a C source file containing an array of data. The C file can then be compiled and linked into the program. I think both methods have their advantages: the objcopy method has fewer steps and needs less space on disk, whereas the “C array” method can be useful if you want to commit the array to a version control system that works better with text files than with binary files.
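
For comparison, here is a minimal sketch of what the “C array” translation step could look like. The helper name emit_c_array, the eight-bytes-per-line layout and the “_len” companion variable are my own assumptions rather than part of any existing tool; a ready-made program such as “xxd -i” produces a similar result.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write `len` bytes of `data` to `out` as a C array definition named `name`,
 * followed by a matching `<name>_len` variable. Returns 0 on success, -1 on
 * a write error. `len` must be greater than zero, because an empty
 * initializer list is not valid in older C standards.
 * Hypothetical helper, for illustration only.
 */
int emit_c_array(FILE *out, const char *name,
                 const unsigned char *data, size_t len)
{
  assert(out != NULL && name != NULL && data != NULL && len > 0);

  fprintf(out, "const unsigned char %s[] = {", name);
  for (size_t i = 0; i < len; i++)
  {
    /* Start a new indented line every 8 bytes. */
    fprintf(out, "%s0x%02x,", (i % 8 == 0) ? "\n  " : " ", data[i]);
  }
  fprintf(out, "\n};\nconst unsigned long %s_len = %luUL;\n",
          name, (unsigned long)len);

  return ferror(out) ? -1 : 0;
}
```

A small program would read “blob.bin” into a buffer, call emit_c_array(stdout, "blob", buffer, size) and redirect the output to a “blob.c” file that is compiled and linked as usual. The generated text is several times larger than the blob itself, which is the disk space trade-off mentioned above.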

Everything on this page also works with cross-compilers, by adding the toolchain prefix (for example “arm-linux-gnueabi-” or “avr-”) to the gcc, objcopy and objdump commands.

Posted in: Embedded