Software debugging is a complex task. There is always the need to collect all available information, in order to detect and understand the problem fast and to think of a proper solution. Sometimes it’s more convenient to debug step-by-step, sometimes it’s better to make the program run completely, and then trace the execution flow “offline”.
Another important step in software development is profiling. GNU offers “gprof
” as a tool to analyze the execution time of functions. The working principle of gprof
is that it polls the program state with a small sampling interval and notes the function that is executing. In this case small functions could also not appear in the profiling data because their execution time is smaller than an interval.
I recently tried to use a feature of GNU GCC that can be of some help both for tracing and for profiling. It’s the following option (from its GNU GCC Manual section):
-finstrument-functions
- Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the following profiling functions will be called with the address of the current function and its call site. […]
void __cyg_profile_func_enter (void *this_fn, void *call_site); void __cyg_profile_func_exit (void *this_fn, void *call_site);[…]
The execution flow can be traced implementing these monitoring points, for example writing on file some useful information.
Suppose you have to analyze the following code:
#include <stdio.h> void foo() { printf("foo\n"); } int main() { foo(); return 0; }
Create a file called “trace.c
” with the following content:
#include <stdio.h> #include <time.h> static FILE *fp_trace; void __attribute__ ((constructor)) trace_begin (void) { fp_trace = fopen("trace.out", "w"); } void __attribute__ ((destructor)) trace_end (void) { if(fp_trace != NULL) { fclose(fp_trace); } } void __cyg_profile_func_enter (void *func, void *caller) { if(fp_trace != NULL) { fprintf(fp_trace, "e %p %p %lu\n", func, caller, time(NULL) ); } } void __cyg_profile_func_exit (void *func, void *caller) { if(fp_trace != NULL) { fprintf(fp_trace, "x %p %p %lu\n", func, caller, time(NULL)); } }
The idea is to write into a log (in our case “trace.out
“) the function addresses, the address of the call and the execution time. To do so, a file needs to be open at the beginning of execution. The GCC-specific attribute “constructor
” helps in defining a function that is executed before “main”. In the same way the attribute “destructor
” specifies that a function must be executed when the program is going to exit.
To compile and execute the program, the command-line is:
$ gcc -finstrument-functions -g -c -o main.o main.c $ gcc -c -o trace.o trace.c $ gcc main.o trace.o -o main $ ./main foo $ cat trace.out e 0x400679 0x394281c40b 1286372153 e 0x400648 0x40069a 1286372153 x 0x400648 0x40069a 1286372153 x 0x400679 0x394281c40b 1286372153
To understand the addresses, the “addr2line
” tool can be used: it’s a tool included in “binutils
” package that, given an executable with debug information, maps an execution address to a source code file and line. I put together an executable shell script (“readtracelog.sh
“) that uses addr2line
to print the trace log into a readable format:
#!/bin/sh if test ! -f "$1" then echo "Error: executable $1 does not exist." exit 1 fi if test ! -f "$2" then echo "Error: trace log $2 does not exist." exit 1 fi EXECUTABLE="$1" TRACELOG="$2" while read LINETYPE FADDR CADDR CTIME; do FNAME="$(addr2line -f -e ${EXECUTABLE} ${FADDR}|head -1)" CDATE="$(date -Iseconds -d @${CTIME})" if test "${LINETYPE}" = "e" then CNAME="$(addr2line -f -e ${EXECUTABLE} ${CADDR}|head -1)" CLINE="$(addr2line -s -e ${EXECUTABLE} ${CADDR})" echo "Enter ${FNAME} at ${CDATE}, called from ${CNAME} (${CLINE})" fi if test "${LINETYPE}" = "x" then echo "Exit ${FNAME} at ${CDATE}" fi done < "${TRACELOG}"
Testing the script with the previous output, the result is:
$ ./readtracelog.sh main trace.out Enter main at 2010-10-06T15:35:53+0200, called from ?? (??:0) Enter foo at 2010-10-06T15:35:53+0200, called from main (main.c:9) Exit foo at 2010-10-06T15:35:53+0200 Exit main at 2010-10-06T15:35:53+0200
The “??
” symbol indicates that addr2line
has no debug information on that address: in fact it should belong to C runtime libraries that initialize the program and call the main function. In this case the execution time was very small (less than a second) but in more complex scenarios the execution time can be useful to detect where the application spends the most time. It is also a good idea to use the most precise timer on the system, such as gettimeofday in Linux that returns also fractions of a second.
Some thoughts for embedded platforms:
- It is possible to have fine-grain timing information if the platform contains an internal hardware timer, counting even single clock cycles. It will become then important to reduce the overhead of the entry and exit functions to measure the real function execution time.
- The trace information can be sent to the serial port (for example in binary form), and then interpreted by a program running on PC.
- The entry and exit functions can be used to monitor also the state of other hardware peripherals, such as a temperature sensor.
See also:
Sylvain
2010/10/19
This is a great article !
It shows a way to trace and also profile in a manner that fits perfectly the embedded world constraints. I imagine coupling this to OCD. I was initially looking after a way to use the profiling info (-pg gcc option) and redirect it to OCD or USART stream, do you know if it is feasible ?
Balau
2010/10/19
If you have an operating system running on your target, then -pg is OK. Otherwise -pg can’t work, for example because it needs a filesystem to write “gmon.out” during execution. The toolchain must also support it.
With the method I explained, it is quite trivial to send data to your USART instead of writing to file, it just needs to remove the fprintf calls and add functions to write data on the correct serial port registers. I don’t know about OCD but I suppose it is more complex because it involves having a JTAG adapter and a way to send your custom debug info with it.
Kaiwan
2011/03/22
Hi,
If I may, I have a kind of unrelated question to this post:
how exactly did you get the ‘syntax highlighter’ plugin (i assume that’s the one you’re using above) to work with this wordpress blog?
It’s really cool!
Am sure others would like to know as well!
TIA!
Balau
2011/03/22
I didn’t have to do anything, the syntax highlighter is already enabled in wordpress.com blogs: see Posting Source Code.
For wordpress.org blog installations, there’s a plugin available.
Kaiwan
2011/03/23
Hey, thanks again! it works. indeed
Silly of me…
Mahmoud Eltahawy
2012/04/26
Hi ,
Can I use the addr2line with shared library as input instead of executable??
Balau
2012/04/26
Yes you can, but you need to give addr2line the address as an offset from the load address of the shared library. The absolute address that you get run-time won’t work. You can find the load address of the shared library run-time with the dladdr function.
Sumi Dodger
2012/08/28
i added this trace.c in my project and compiled in Gxlinux and got this why this so some one help me………..
my project also consists of main.c
GxLinux- A complete embedded Linux development platform
[root@GxLinux src]# gcc -finstrument -functions -g -c -o main.o main.c
cc1: error: unrecognized command line option “-finstrument”
cc1: error: unrecognized command line option “-functions”
Balau
2012/08/28
You should write “
-finstrument-functions
” without the space, you currently have “-finstrument -functions
“Sumi Dodger
2012/09/05
I want that all the files of my project get traced using this function
I have added a file(trace.c) in my code containing these functions
but problem is that i have to run these commands for a particular file like main.c here
gcc -finstrument-functions -g -c -o main.o main.c
gcc -finstrument-functions -g -c -o main.o main.c
$ gcc -c -o trace.o trace.c
$ gcc main.o trace.o -o main
$ ./main
foo
$ cat trace.out
Is there any possible way to trace all the files (near about 70) at a time so that i come to know, what was the particular path or flow of functions was.
or better way will be that if a function can be created for running these commands.
Sumi Dodger
2012/09/06
someone plz give a solution………….
Balau
2012/09/06
I don’t know if I understand your problem.
It seems that what you need is a build system.
I suggest using GNU make.
I am giving you a quick example:
Create a file called “Makefile” where you have your C source files. Suppose your C source files are “main.c”, “foo.c” and “trace.c” The makefile contains:
main: main.o foo.o trace.o
CFLAGS = -g -finstrument-functions
trace.o: CFLAGS = -g
Then in the same directory run “make”. You should see:
$ make
cc -g -finstrument-functions -c -o main.o main.c
cc -g -finstrument-functions -c -o foo.o foo.c
cc -g -c -o trace.o trace.c
cc main.o foo.o trace.o -o main
You can then fill the list of object files needed by “main” and let “make” compile them.
For everything you need to know on makefiles, see the GNU Make manual.
Joseph Hillery
2012/09/14
Very useful! One thing omitted from code sample was to exclude the trace routines from the instrumentation. Otherwise they will recursively fire until stack is exhausted and you get a SIGSEQV.
void __cyg_profile_func_enter(void *this_fn, void *call_site)
__attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
__attribute__((no_instrument_function));
Balau
2012/09/14
Yeah, in the compilation commands that I indicated the “
trace.c
” source file is not compiled with “-finstrument-functions
” option, so the trace routines were not instrumented anyway, but it’s a good idea to enforce it also with an attribute.alfred
2013/01/26
To resolve the addresses at run time you could also use the dladdr() function:
ex:
#include <stdio.h>
#define __USE_GNU
#include <dlfcn.h>
void __cyg_profile_func_enter (void *this_fn,
void *call_site)
{
Dl_info info;
if (dladdr(this_fn, &info) != 0) {
printf("%s", info.dli_sname);
}
}
Michael R. Hines
2013/01/28
I’ve improved on this tutorial here: http://michael.hinespot.com/tutorials/gcc_trace_functions
Now you can:
1. trace your program with fuzzy, human-readable function names
2. easily trace external libraries linked to by your program
3. blacklist/whitelist functions that you don’t care about
4. output trace results in human-readable format
Balau
2013/01/28
Thanks for improving my work! Glad to have inspired you.
Pierre Rasmussen
2013/04/22
Nice post!
Do you have any trick for handling static functions?
Balau
2013/04/22
What about them? My example works
in the same wayalmost the same even if function foo is static or inline.Correction: I tried some more, and it seems to have some problems detecting the caller when the callee is inlined.
Kaiwan
2013/07/04
Hi Balau,
Superb write-up, thanks.
When I attempt this on a stand-alone C program on Linux, it works just fine.
However, I wanted to try this approach on the Linux kernel itself during boot; so, in the kernel src tree (am using vanilla 3.2.21 configured for ARM Vexpress), I setup a patch to provide a menu option: once enabled, the Makefile sets up the build so as to include the CFLAGS:
ifdef CONFIG_KERNEL_INSTRUMENT_FUNCTIONS
KBUILD_CFLAGS += -finstrument-functions -fno-inline
endif
The profiling hooks are similar to what you used, here’s the func entry hook:
void
__attribute__((no_instrument_function))
__cyg_profile_func_enter(void *func, void *caller)
{
printk(KERN_ALERT “E %p %p\n”, func, caller);
}
After some fiddling, I got it to build successfully. But when i boot this kernel (am using QEMU; even tried this on a Raspberry Pi board with same -ve result), it just appears to hang:
# qemu-system-arm -m 256 -M vexpress-a9 -kernel ~/3.2.21/arch/arm/boot/zImage -initrd images/rootfs.img.gz -append “console=ttyAMA0 rdinit=/sbin/init” -nographic
pulseaudio: set_sink_input_volume() failed
pulseaudio: Reason: Invalid argument
pulseaudio: set_sink_input_mute() failed
pulseaudio: Reason: Invalid argument
Uncompressing Linux… done, booting the kernel.
…
Any idea why it hangs?
FYI, when I rebuild the kernel _without_ my new “instrument-functions” option, it boots just fine.
Any help appreciated!
TIA, Kaiwan.
Balau
2013/07/04
First thing that comes to mind, printk itself should have __attribute__((no_instrument_function)), and all functions that are called inside printk. Otherwise you have an infinite recursion.
Other function that should not be instrumented are those that are executed before the resources used by printk are initialized (memory, peripherals, stack, …)
Other than that, it seems a very invasive modification. I would add that option on a small part of the kernel, and then expand on that little by little.
Kaiwan
2013/07/08
Thanks Balau.. will try out stuff..
roger
2013/07/09
Thanks Balau for the great exposé. Is there any possibility to do the reported issue using eclipse?
Balau
2013/07/09
It doesn’t depend on Eclipse. If Eclipse uses GCC as the toolchain, then what I wrote can be applied.
roger
2013/07/10
Hello Balau, can you please explain me how to build this code using eclipse CDT? When i try to build code in Eclipse using Option -finstrument-functions i do not haave success!
Balau
2013/07/10
I don’t want to install Eclipse CDT because I’m afraid it would screw up my Eclipse Android development environment.
You gave me very little details on what you did.
What does it mean that you did not have success? It doesn’t compile? It doesn’t link? It doesn’t create a file on execution?
What did you do to add the -finstrument-functions options? I’m not familiar with Eclipse CDT but I assume it has options pane where you can put options for compilation of C source, and on your main.c you have to add this option.
You have to be aware that if you add the -finstrument-functions also on trace.c it will not work because it creates a recursion problem.
Usually during build the commands are printed somewhere on a build log. Are the command the same that I have in my post?
Roger
2013/07/11
Thanks Balau,
the codes above also work with Eclipse CDT. Can you please tell me how to use the readtracelog.sh within Eclipse?
Thank you so much.
Roger
Balau
2013/07/11
I don’t know much about Eclipse options, but you could try adding a Launch Configuration that runs that script.
roger
2013/07/12
Thanks so much Balau, but unfortunately. I copied all the needed files under ~/Debug then set the post-build command “readtracelog.sh ThelloW trace.out < t1.txt".
ing the
i am having this error message:
Cannot run program "readtracelog.sh ThelloW trace.out < t1.txt": Launching failed.
I will continue tomorrow.
Roger
roger
2013/09/23
Hello Balau
how can i improve the calling trace information in order to have a matrix base on callers and callees methods? Thanks so much.
Balau
2013/09/23
I’m not sure if I understand correctly. If you need a matrix with “callers” on the rows and “callees” on the columns, and each cell is true/false whether the caller calls the callee, then you have all the information you need, you just need to post-process it.
Roger
2013/09/23
Hi Balau,
that is exactly what i am looking for. I tried it every night since many months. I can read the functions address of both callers and callees in c/cpp but i don’t no how to build the corresponding Matrix. Do you have any example?
Thanks so much Balau.
Balau
2013/09/24
It depends very much on what you really need.
Do you need to create a matrix for a particular execution (like a dynamic call graph) or a matrix for all possible executions (like a static call graph)?
If you need static information, then this blog post is not what you are searching for. Go search for a static analysis tool that outputs a call graph.
Do you need to create it while the program is running or you can run the program first, then post-process the output?
If you need to create it while the program is running, then you will most probably need a list of lists, for example a list of callees, where each callee points to a list of its callers. Then at the end of the program you can transform it into a matrix because you can count the total number of items.
If you can create it afterwards, you can use the same C code in my post, then you can create a simple script (like in python) and take the name of the callers and callees like I did in the shell script, count the unique callers and unique callees to find the matrix dimensions, create the matrix with all false in the cells, then re-scan the file to put the true values into the cells.
I think it’s more difficult to do it run-time and easier to do it in post-processing.
I hope this gives you some direction so that you can do it by yourself.
Roger
2013/09/25
Hello Balau, thanks for your answer. I am trying to create the matrice afterwards. I have already use your code to generate the list of all callers. I have now to generate the list of the callees. I will try to create the python script for the Goal. Do you have any papers about creating such script?
Thanks so much
Roger
Balau
2013/09/25
No, sorry.
Roger
2013/09/25
Hi Balau,
Any way thanks so much Balau.
Roger
Howdy Texas
2014/04/23
Awesome post!
I would like to indent each function enter with a tab, and unindent exit with one less tab. Could you please give a hint on how to implement that.
Balau
2014/04/23
I think it’s quite easy: just add in
trace.c
a globalint
variable (preferablystatic
), increment it in__cyg_profile_func_enter
, decrement it in__cyg_profile_func_exit
and use that value as you wish.Philippe Proulx
2014/09/11
For your information, if your target is running Linux, LTTng-UST (the user space tracing component of the LTTng suite) has a preloadable helper library for tracing entries to and exits from functions using
-finstrument-functions
. It adds LTTng-UST tracepoints to__cyg_profile_func_enter
and__cyg_profile_func_exit
. Since LTTng also has the ability to trace all kernel tracepoints, you can get a correlated trace of both user space function entries/exits and kernel events using an aggregating viewer like Babeltrace.Once your application (or parts of it) is (are) built with
-finstrument-functions
, here’s how you can trace it using LTTng:LTTng-UST has the advantage of producing a trace with very little run time overhead. Like in your case, addresses are kept as fields, so you would need another tool (like the aforementioned
addr2line
) to create a call graph or such.Farhan Shariff
2015/07/08
I instrumenting the code with -finstument-fucntion will significantly slow down execution the program.
Is it true that it slows down execution by 100x or more
Balau
2015/07/08
I am not surprised that the execution slows down very much. One way to mitigate this is to use the flag only when compiling C files that you want to profile, and not everything.
Uwe Geuder
2015/11/05
Nice post. I have done it myself several years ago, but now my favorite search engine found your post much faster than I would have been able to locate my old code.
I modified the readtracelog.sh script a bit to show the function nesting graphically like this
(and to show relative timestamps, but that might be a matter of taste / depend on the use case. The resolution could be improved, but it was not important for me now, so we leave it as an exercise for the next reader…)
You did not specify a license for your code. Once you do I might be able to publish my changed version.
Balau
2015/11/05
Thanks for improving the code! Everything in this blog that doesn’t specify a license is under CC-BY-SA (see column on the right). I just switched to version 4.0 of the CC-BY-SA that allows to share the content also under compatible licenses such as GPLv3.
Sandeep Mukherjee
2016/01/17
I’m writing a call tracing utility using -finstrument-functions. Here’s a sample output of sqlite3 in action:
18:34:22 7f7ff7a10700 sqlite3_step->sqlite3Step->sqlite3VdbeExec->sqlite3BtreeInsert->insertCell->sqlite3PagerWrite->pager_write->pager_open_journal->sqlite3OsOpen->unixOpen->findCreateFileMode->sqlite3Strlen30
18:34:22 7f7ff7a10700 sqlite3_step->sqlite3Step->sqlite3VdbeExec->sqlite3BtreeInsert->insertCell->sqlite3PagerWrite->pager_write->pager_open_journal->sqlite3OsOpen->unixOpen->robust_open->posixOpen
18:34:22 7f7ff7a10700 sqlite3_step->sqlite3Step->sqlite3VdbeExec->sqlite3BtreeInsert->insertCell->sqlite3PagerWrite->pager_write->pager_open_journal->sqlite3OsOpen->unixOpen->robustFchown
Check it out at https://github.com/sandeepmukherjee/calltrace
Desmond
2016/04/28
Hi Balau, awesome write. Would this solution also work with C++ shared libraries?
Balau
2016/04/28
I never tried, but I guess it will probably work. Obviously the library should not be already compiled, you have to compile it with that option.
Desmond
2016/04/30
Hi Balau… Just some feedback.. I tried your solution for a big C++ software project. The tricky thing there is at want point you link the
trace.cc
file. Because the implementation for __cyg_profile_func_** is given in libc.so.6, and once the linker prepares the first shared library it seems to include the implementation from libc for those functions. Do you have any solution as how to linktrace.cc
as early as possible?Balau
2016/05/01
It seems the problem has to do with C++ function name mangling. For example if I compile trace.cc the function name becomes
_Z24__cyg_profile_func_enterPvS_
. If I instead stick with trace.c instead of trace.cc, the function name will remain the same and the linker will use them instead of the libc ones.Ricardo Mota
2018/07/17
I tried to add this to a source file on my code I mean to trace (somewhat large project), but the there seems to be too many calls to ‘__cyg_profile_func_enter()’… Does the instrumentation propagates to functions called inside the functions I meaan to trace?
Balau
2018/07/24
usually when I have an error like this, it’s because __cyg_profile_func_enter calls a function that itself is compiled with
-finstrument-functions
. You could use the attributeno_instrument_function
or gcc options-finstrument-functions-exclude-file-list
and-finstrument-functions-exclude-function-list
, see here: https://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Code-Gen-Options.htmlDONGLIANG MU
2018/10/29
There is one problem in this option. When you would like to record the function call trace for one crash, it will crash as expected without any call trace printed.
alxhoff
2019/10/29
Hello,
I am having the problem mentioned above that the addr2line addresses are without the memory offset and as such all are appearing as ??. How does your example void this problem?
Cheers
Balau
2019/10/29
Usually this problem is because the debug info is not present in the final ELF program. Be aware to specify -g both during compilation and during linking.