There’s a simple way to include binary data inside an executable when using a GCC toolchain. The trick is to use objcopy to transform the binary blob into an object file that can be linked.
In this example I create a binary blob of 16 random bytes in the file “blob.bin”:
$ dd if=/dev/urandom of=blob.bin bs=1 count=16
16+0 records in
16+0 records out
16 bytes (16 B) copied, 8.7424e-05 s, 183 kB/s
$ hexdump -C blob.bin
00000000  2a 3b cb 0f 43 66 56 77  fd cc 5a e9 b9 73 a7 b2  |*;..CfVw..Z..s..|
00000010
Then I need to use objcopy to transform it; the command is of the form:
$ objcopy -I binary -O <target_format> -B <target_architecture> <binary_file> <object_file>
If you are not sure about the target architecture and target format, they can be found with something like:
$ > arch.c
$ gcc -c arch.c -o arch.o
$ objdump -f arch.o

arch.o:     file format elf32-i386
architecture: i386, flags 0x00000010:
HAS_SYMS
start address 0x00000000

$ rm -f arch.c arch.o
So, in my case the command is:
$ objcopy -I binary -O elf32-i386 -B i386 blob.bin blob.o
This command created a “blob.o” file that contains the symbols to access the data within:
$ objdump -t blob.o

blob.o:     file format elf32-i386

SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
00000000 g       .data  00000000 _binary_blob_bin_start
00000010 g       .data  00000000 _binary_blob_bin_end
00000010 g       *ABS*  00000000 _binary_blob_bin_size
These symbols can be accessed by C code (and assembly too). Here’s a simple program that uses them:
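#include <stdio.h>

/* Symbols created by objcopy; only their addresses are meaningful. */
extern unsigned char _binary_blob_bin_start;
extern unsigned char _binary_blob_bin_end;
extern unsigned char _binary_blob_bin_size;

int main(void)
{
    unsigned char *pblob = &_binary_blob_bin_start;

    while (pblob < &_binary_blob_bin_end)
    {
        /* A pointer difference has type ptrdiff_t, printed with %td. */
        printf("%td: %02X\n", pblob - &_binary_blob_bin_start, *pblob);
        pblob++;
    }
    /* _binary_blob_bin_size is an absolute symbol: its address is the size. */
    printf("size: %lu\n", (unsigned long)&_binary_blob_bin_size);
    return 0;
}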
Now we can compile the program and run it to see that the binary data can be accessed correctly.
$ gcc -c -o test_blob.o test_blob.c
$ gcc test_blob.o blob.o -o test_blob
$ ./test_blob
0: 2A
1: 3B
2: CB
3: 0F
4: 43
5: 66
6: 56
7: 77
8: FD
9: CC
10: 5A
11: E9
12: B9
13: 73
14: A7
15: B2
size: 16
It is also possible to rename the symbols created by objcopy with the “--redefine-sym” option, and to put the data in a section with a different name and different flags with “--rename-section”.
I’ve also seen a method that involves translating the blob into a C source file containing an array of data. The C file can then be compiled and linked into the program. I think both methods have their advantages: the objcopy method has fewer steps and needs less space on the disk, whereas the “C array” method can be useful if you want to commit the array to a version control system that works better with text files than with binary files.
Everything on this page can also be done with cross-compilers, by adding the prefix of the toolchain (for example “arm-linux-gnueabi-” or “avr-”) to the gcc, objcopy and objdump commands.
Entries
himanshu arora
2012/02/19
Today I learnt:
1) /dev/urandom
2) the dd command and its available options
3) the objdump command
4) the hexdump command
One question though:
what is the difference between the .o and .bin formats?
Balau
2012/02/19
The “blob.bin” file does not have any format; it is simply composed of random bytes of data. The “.o” files have the ELF format, and contain the additional information needed to link them into an executable.
himanshu arora
2012/02/19
Some time back, I compiled u-boot for mini2440 and got u-boot.bin as the result. So, do boot loaders also not have any format?
Balau
2012/02/19
The “u-boot.bin” file has probably been created by an objcopy command from the ELF file. The resulting “u-boot.bin” is just machine code and the data used by the machine code.
demarchi
2012/05/23
There’s a third way that I find easier: “ld -s -r -o blob.o -b binary blob.bin”