There’s a simple way to include binary data inside an executable, when using a GCC toolchain. The trick relies on using objcopy to transform the binary blob of data into an object file that can be linked.
In this example I am creating a binary blob of 16 bytes of random data in the file “blob.bin”:
$ dd if=/dev/urandom of=blob.bin bs=1 count=16
16+0 records in
16+0 records out
16 bytes (16 B) copied, 8.7424e-05 s, 183 kB/s
$ hexdump -C blob.bin
00000000  2a 3b cb 0f 43 66 56 77  fd cc 5a e9 b9 73 a7 b2  |*;..CfVw..Z..s..|
00000010
Then I need to use objcopy to transform it; the command is of the form:
$ objcopy -I binary -O <target_format> -B <target_architecture> <binary_file> <object_file>
If you are not sure about the target architecture and target format, they can be found with something like:
$ > arch.c
$ gcc -c arch.c -o arch.o
$ objdump -f arch.o

arch.o:     file format elf32-i386
architecture: i386, flags 0x00000010:
HAS_SYMS
start address 0x00000000

$ rm -f arch.c arch.o
So, in my case the command is:
$ objcopy -I binary -O elf32-i386 -B i386 blob.bin blob.o
This command created a “blob.o” file that contains the symbols to access the data within:
$ objdump -t blob.o

blob.o:     file format elf32-i386

SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
00000000 g       .data  00000000 _binary_blob_bin_start
00000010 g       .data  00000000 _binary_blob_bin_end
00000010 g       *ABS*  00000000 _binary_blob_bin_size
These symbols can be accessed by C code (and assembly too). Here’s a simple program that uses them:
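#include <stdio.h>

/* Symbols created by objcopy; only their addresses are meaningful. */
extern unsigned char _binary_blob_bin_start;
extern unsigned char _binary_blob_bin_end;
extern unsigned char _binary_blob_bin_size;

int main(void)
{
    unsigned char *pblob = &_binary_blob_bin_start;

    while (pblob < &_binary_blob_bin_end)
    {
        /* The pointer difference is a ptrdiff_t, so cast it to match %d. */
        printf("%d: %02X\n", (int)(pblob - &_binary_blob_bin_start), *pblob);
        pblob++;
    }
    /* _binary_blob_bin_size is an absolute symbol: its address is the size. */
    printf("size: %d\n", (int)(size_t)&_binary_blob_bin_size);
    return 0;
}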
Now we can compile the program and run it to see that the binary data can be accessed correctly.
$ gcc -c -o test_blob.o test_blob.c
$ gcc test_blob.o blob.o -o test_blob
$ ./test_blob
0: 2A
1: 3B
2: CB
3: 0F
4: 43
5: 66
6: 56
7: 77
8: FD
9: CC
10: 5A
11: E9
12: B9
13: 73
14: A7
15: B2
size: 16
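The program above assumes that byte offsets and the blob size fit in an int. A slightly more defensive variant is sketched here: it computes the length as the difference between the end and start addresses and prints it with %zu. The helper names blob_size and blob_dump are made up for this sketch, and a real build would pass the objcopy symbols (declared as arrays, as in the comment) rather than arbitrary pointers. Deriving the size from the two addresses may also be more robust on toolchains that build position-independent executables by default, where the address of the absolute _size symbol might not come through unchanged; that is an assumption worth checking on your own platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * In a real build the arguments would be the objcopy symbols:
 *   extern const unsigned char _binary_blob_bin_start[];
 *   extern const unsigned char _binary_blob_bin_end[];
 * blob_size and blob_dump are hypothetical helper names.
 */

/* Size in bytes of a blob delimited by [start, end). */
size_t blob_size(const unsigned char *start, const unsigned char *end)
{
    assert(end >= start);
    return (size_t)(end - start);
}

/* Print every byte as "index: HEX", then the size; returns the byte count. */
size_t blob_dump(FILE *out, const unsigned char *start, const unsigned char *end)
{
    size_t n = blob_size(start, end);

    for (size_t i = 0; i < n; i++)
        fprintf(out, "%zu: %02X\n", i, (unsigned)start[i]);
    fprintf(out, "size: %zu\n", n);
    return n;
}
```

With blob.o linked in, the call would be blob_dump(stdout, _binary_blob_bin_start, _binary_blob_bin_end).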
It is also possible to rename the symbols that are created by objcopy using the “--redefine-sym” option, and to put the data in a section with a different name and different flags using “--rename-section”.
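Before renaming anything it helps to know which default names to expect. objcopy appears to build them from the input file name as given on the command line, replacing every character that is not a letter or a digit with an underscore, which is how blob.bin became _binary_blob_bin_start. The function below is a hypothetical helper that mimics this rule for illustration; it is not part of binutils. One consequence is that several blobs can be linked into the same program, since each input file gets its own set of three symbols.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes into dst the symbol name expected for the
 * input file `path` and the suffix "start", "end" or "size".
 * Returns 0 on success, -1 if dst is too small.
 */
int blob_symbol_name(char *dst, size_t dst_size, const char *path, const char *suffix)
{
    assert(dst != NULL && path != NULL && suffix != NULL);

    /* "_binary_" + path + "_" + suffix + terminating NUL */
    size_t need = strlen("_binary_") + strlen(path) + 1 + strlen(suffix) + 1;
    if (need > dst_size)
        return -1;

    snprintf(dst, dst_size, "_binary_%s_%s", path, suffix);

    /* Every character that is not a letter or digit becomes '_'. */
    for (char *p = dst; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p))
            *p = '_';
    }
    return 0;
}
```

For example, an input given as data/logo-1.png would be expected to produce _binary_data_logo_1_png_start, _binary_data_logo_1_png_end and _binary_data_logo_1_png_size. On targets that prefix C identifiers with an underscore, the name used from C code may need to drop the leading underscore.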
I’ve also seen a method that involves translating the blob into a C source file containing an array of data. The C file can then be compiled and linked into the program. I think both methods have their advantages: the objcopy method has fewer steps and needs less space on disk, whereas the “C array” method can be useful if you want to commit the array to a version control system that works better with text files than with binary files.
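The “C array” method needs something that turns the blob into source text. A minimal sketch of such a converter follows; the function name write_c_array and the layout of the generated text are choices made for this example, not the output format of any particular tool. It assumes a non-empty input, because an empty initializer list is not accepted by older C standards.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical converter: reads bytes from `in` and writes a C array
 * definition called `name`, plus a `<name>_len` constant, to `out`.
 * Returns the number of bytes converted.
 */
size_t write_c_array(FILE *in, FILE *out, const char *name)
{
    assert(in != NULL && out != NULL && name != NULL);

    size_t count = 0;
    int c;

    fprintf(out, "const unsigned char %s[] = {", name);
    while ((c = fgetc(in)) != EOF) {
        /* Start a new indented line every 12 bytes to keep it readable. */
        fprintf(out, "%s0x%02x,", (count % 12 == 0) ? "\n    " : " ", (unsigned)c);
        count++;
    }
    fprintf(out, "\n};\nconst unsigned long %s_len = %luUL;\n",
            name, (unsigned long)count);
    return count;
}
```

Called on a file opened in binary mode ("rb"), with the output written to a file such as blob_bin.c, this gives a source file that can be compiled and linked like any other. The text form is several times larger than the blob itself, which is the disk space cost mentioned above.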
Everything on this page can also be achieved with cross-compilers, by adding the prefix of the toolchain (for example “arm-linux-gnueabi-” or “avr-”) to the gcc, objcopy and objdump commands.
Posts
himanshu arora
2012/02/19
Today I learnt:
1) /dev/urandom
2) the dd command and its available options
3) the objdump command
4) the hexdump command
One question though:
what is the difference between the .o and .bin formats?
Balau
2012/02/19
The “blob.bin” file does not have any format; it’s simply composed of random bytes of data. The “.o” files have an ELF format, and contain more information that is needed to be linked in an executable.
himanshu arora
2012/02/19
Some time back, I compiled u-boot for mini2440 and I got u-boot.bin as the result. So, do boot loaders also not have any format?
Balau
2012/02/19
The “u-boot.bin” file has probably been created by an objcopy command from the ELF file. The resulting “u-boot.bin” is just machine code and the data used by the machine code.
demarchi
2012/05/23
There’s a 3rd way that I find is easier: “ld -s -r -o blob.o -b binary blob.bin”
Shashaa
2013/05/29
Balau, my requirement is closely matching with this blog! can you comment on my issue?
http://www.linuxquestions.org/questions/programming-9/creating-a-binary-blob-in-linux-4175463856/
Balau
2013/05/29
No it really isn’t. Just because your requirement contains the word “blob” doesn’t mean that it’s related to my post.
Yours seems to be a problem of serialization and deserialization (as someone already said in that thread). In C, for simple cases, it could be solved by pointing to the structure with a “char *” pointer, sending it, and then from the other side pointing to the received bytes with a “struct ... *” pointer.
You need to create and manage run-time binary data; my post is about linking binary data into a C program with GCC, so that the data is statically present in the final program like a constant.
Shashaa
2013/05/29
Thanks for the reply, Balau!
Marco
2013/06/28
Hi, I’ve read that objcopy is used by some people to port applications from one platform (say x86) to another platform (say ARM) when no source code is accessible. Is that true? If so, is there any documentation about this interesting process?
Thanks a lot! Great post!
Balau
2013/06/29
I’m sorry but that’s not true. I don’t know where you read it, but ask that source to give you a simple hello world example and they will not be able to do it.
The real capabilities of objcopy can be found in the documentation, and they involve copying the content, which is usually the binary code, from one format into another format, but the content remains pretty much the same.
Evan Zelkowitz
2014/09/09
Hopefully you’re still reading this blog. I’ve been trying to use this method to create a self-running application that usually takes a parameter of a binary blob, but to make it easier I’m packing that into the application. The issue I’m running into is that I have to support this on multiple cross compilers. On a more recent one this works fine, except it throws warnings about abi vs non-abi calls but still runs. On the older one it won’t even link, complaining about pic vs. non-pic linking.
My original application has relocatable sections, which it seems I can’t turn off. I have been looking at the generated .o files, which were created with ‘-r’ (I’m using the ld method but have also tried objcopy). So I was wondering if you had run into this, where it seems to generate non-relocatable code even though -r was used.
Balau
2014/09/09
Since you have that many problems, it’s probably best to transform it into an ANSI C array with a command such as
xxd -i blob.bin > blob_bin.c
Evan Zelkowitz
2014/09/09
Yeah, I had seen that method too; I was just hoping to avoid it since it greatly increases the size. My blob is already 64 MB in most cases, but I may give it a try and see.
Andreas
2014/10/07
Hi Balau,
nice post, thanks 🙂
I have tried it in eclipse and gcc but i get the error undefined reference to `_binary_mybin_bin_size’.
When i use objdump i can see the variables, as well as in the resulting exe, but I am not able to access it.
When I comment out the usage of the variable, everything links without error.
Do you have any idea what goes wrong?
I have used objcopy with the following target for a windows hello world exe
objcopy -I binary -O pe-i386 -B i386 mybin.bin mybin.o
Could the format be an issue? I have a 64-bit Windows.
Another question – what is that xxd you mentioned in your last post – a Linux tool?
Is there a Windows equivalent?
Thanks in advance,
Andreas
Balau
2014/10/07
My guess is that Eclipse is not linking mybin.o. What does Eclipse print on the console in terms of executed commands? Is the undefined reference only on _binary_mybin_bin_size or on any of the three variables that you use in your program?
I don’t know if the format could be an issue because I have never tried this on Windows. In my opinion it should not matter or anyway it should give a different error.
xxd is a versatile Linux tool, but if you just need to translate a binary file into a C array, any developer with basic C knowledge should be able to write a small (~20 lines of code?) program which is able to do the same.
Andreas
2014/10/07
Eclipse prints that mybin.o is linked with my object file from the c file, seems okay.
In addition the variable name appears in the objdump of the linked exe.
I even tried to compile and link with gcc on the console, with the same result and
the error – weird. Maybe I try it again in a virtual machine.
Yeah, was already about to write a few lines of code to do so, but it is interesting
to learn new ways. Found that xxd tool in the meantime with Google 🙂
Andreas
2014/10/08
Hi Balau,
found the problem: gcc linker inserts the variables without the leading
underscore, now it works 🙂
Thanks for the blog again and best regards
Andreas
Andreas
2014/10/08
Addendum: It’s not the GNU linker I am using, it’s the MinGW C linker of the MinGW toolchain.
none
2016/09/21
Thanks!
TheMortiestMorty
2023/08/12
Um… So, what is the size of ‘size’? You used %d to printf the value of size, which at least hints at 32 bits. (BTW, a signed format specifier to print object size? Really?) Is it always 32 bits? Or does it depend on the target platform format? How do we specify this (and can we)?
Yblehs Samoht
2023/11/11
Hi. Thanks for sharing.
I have a question. If I have many .bin files and want to access each of them in the code, how can I do that?