February 2019

I want a compiled program to be able to tell at runtime in a single read operation which function is on the stack.

So that the program can find metadata about the function, like the function's caller or which expression in the source code the program is about to execute, and to find this metadata in O(1). [1]

Today's processor does something similar. When a function is called the processor pushes automatically the return address on the stack. This address can be used to find which function is currently running. But this isn't quite the same, because it takes more than a single read operation to find the function. The return address points to a range between the start and end of the function, and searching for this range takes longer than O(1).

I want something different than just pushing a memory address on the stack. I want the processor to also push an integer function id into an array. Ids are much smaller than memory addresses. Most programs have only a few hundred functions, so an id can be a 2-byte value. [2] An id can be used as an index in an array of all functions loaded in memory. And since the processor already does the extra work of pushing the return address on the stack during a function call (and incrementing the stack pointer), it can also push the function id in a call array at the same time (and increment a call index pointer). When the function returns (when a RET instruction is executed) the call index pointer can be decremented.

This is slow if done in software. It has overhead of ~6%-30%.

Where will the processor get the function id from in order to save it in the call index? The CALL instruction isn't wide enough to contain both the absolute memory address of the callee and also the function id. A variant of CALL that uses relative addressing may not be wide enough either. So where can CALL get the function id?

I know this may sound inappropriate, but how about saving the function id right after the CALL instruction? Then the processor will know exactly where to look. [3] Or if the processor uses a variable length instruction set, make a longer version of the CALL instruction and save the function id in it. The wisdom of keeping a clean separation between data and code served well for many problems but the nature of this problem suggests a different solution.









Notes:

[1]  A static local variable in C doesn't do this. It doesn't track the function's caller.

[2]  The Linux 2.6 kernel contains ~7,000 functions. So it seems reasonable that the average program that saves a function id array of 10,000 2-byte entries would consume at most 20Kb of memory to have this feature. A 32Kb array may be enough.

If you're adding this call index feature in a processor, please support 4-byte function ids too. Who knows what may happen in a 64-bit single address space OS.

[3]  This isn't the first time code has been written that inlines data in the text section. The Linux BUG macro works this way, to show the file and line number when the kernel hits a bug.