Sunday, August 17, 2008

How does libffi actually work?

I came across libffi when I thought of working on js-ctypes. Though my contribution remained very close to nil, I got to know about this fantastic new thing, libffi. But I did not understand how it actually worked when I first read about it. I just got a high-level overview and assumed that the library abstracts away several things that I need not worry about when making calls across different programming languages. I just followed the usage instructions and started looking at the js-ctypes code. That was sufficient to understand the js-ctypes code.

But today, when I was once again going through the README, I came across one line that made things a little clearer. The first paragraph in "What is libffi?" tells it all. It's the calling conventions that this library exploits. I had written a post about calling conventions some time back. As mentioned in the README, it's the calling convention which is the guiding light for the compiler. So when generating the binary code for a function, the compiler will assume that the arguments being passed to that function will be available in some known place, and it also knows the place where the return value will be kept, so that the code in the calling function knows where to look for it.

Now we know that the ultimate binary code that we get after compilation is machine code. So it is basically a set of machine-supported instructions. These instructions are certainly not as sophisticated as the constructs available in C, like functions. So what do these functions translate to? Nothing but a jump of the IP (instruction pointer) to the address where the binary code of that function is. Inside this code, the arguments passed are accessed by referencing some location, typically a register or a slot on the stack. During compilation the compiler will not know the precise memory addresses (obviously). But the compiler has to put some location there when generating the code. How does it decide on the location? This is where the calling conventions come into the picture. The compiler follows these standard rules. Since it is the compiler that generates both the code for calling a function, where arguments are sent, and the code of the called function, where those arguments are received, the compiler knows where it has put the arguments and hence generates code to use the data in those locations. From this point of view, the main (or is it the only?) condition for the binary code of a function to work properly is to have its arguments in the locations it thinks they are in and, at the end, to place the return value back in the right location. And from what I have understood till now, it is this point that libffi exploits.
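To make the idea concrete, here is a toy model in Python. It is not a real ABI and not how libffi is implemented: the register names `r0` and `r1`, the `Machine` class, and the convention itself (first two arguments in `r0` and `r1`, result back in `r0`) are all invented for illustration. The point is only that the "compiled" function never receives arguments directly; it reads the places the convention promised.

```python
# Toy model of a calling convention. Everything here (register names,
# the Machine class, the convention itself) is made up for illustration.

class Machine:
    def __init__(self):
        self.regs = {"r0": 0, "r1": 0}   # pretend CPU registers


def compiled_add(m):
    """Stands in for the binary code of `int add(int a, int b)`.

    Like real machine code, it only reads the locations the convention
    promised and writes the result where the caller will look for it.
    """
    m.regs["r0"] = m.regs["r0"] + m.regs["r1"]


def call_with_convention(m, func, a, b):
    """What the caller's side (or an FFI layer) has to do."""
    m.regs["r0"] = a      # first argument goes where the callee expects it
    m.regs["r1"] = b      # second argument likewise
    func(m)               # the "jump" to the function's code
    return m.regs["r0"]   # pick up the return value from the agreed place


m = Machine()
result = call_with_convention(m, compiled_add, 2, 3)
print(result)  # prints 5
```

If the caller put the arguments anywhere else, `compiled_add` would still run, but on garbage. That is the sense in which getting the locations right is the main condition for the call to work.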

When we are forwarding calls from an interpreted language, like JS, to some binary code generated from a compiled language, like C, there are two things to be done:

0. As discussed earlier, whatever arguments are passed from JS have to be placed in the right locations, and later the return value has to be picked up and given back to JS.
1. Type conversions -- The types used by JS are not understood by C. So someone should rise to the occasion and map these JS types to their C counterparts and vice versa.

The first of these is taken care of by libffi. But the second one is very context-specific and depends entirely on the interpreted language. It does not make sense for libffi to carry a catalog of type converters for every interpreted language on this earth. So these two steps were separated and spread out as two interacting layers between the binary code and the interpreted language code. libffi takes care of the calling convention part, making sure the binary code runs and gives back the results. Type conversion is the job of another layer, specific to the interpreted language that wants to call into the binary code. Hence we have different type-converting layers for different interpreted languages: ctypes for Python, js-ctypes for JS (and probably some more exist). Well, with this renewed and clearer understanding, I hope to actually contribute to js-ctypes.
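The two layers can be seen from the Python side, since CPython's ctypes is built on libffi. This is a minimal sketch, assuming a POSIX-like system where `ctypes.CDLL(None)` exposes the C library's symbols (with a fallback to `msvcrt` on Windows). The C function `abs` is used only because it is small and widely available.

```python
import ctypes
import sys

# Assumption: on POSIX, CDLL(None) gives access to symbols already loaded
# into the process, which normally includes the C library.
if sys.platform.startswith("win"):
    libc = ctypes.CDLL("msvcrt")
else:
    libc = ctypes.CDLL(None)

c_abs = libc.abs

# Step 1 (type conversion, the language-specific layer): describe the C
# signature `int abs(int)` so a Python int is converted to a C int and
# the C int result is converted back.
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

# Step 0 (the call itself): ctypes hands the converted value to libffi,
# which places it according to the platform's calling convention, runs
# the function, and fetches the return value.
value = c_abs(-42)
print(value)  # prints 42
```

Nothing in the script mentions registers or stack slots. That part is the same for every language sitting on top of libffi, while the `argtypes` and `restype` declarations are the Python-specific part.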

Happy calling (with conventions) ;-)

3 comments:

  1. Interesting. Currently, I'm creating an interlanguage solution for my personal project with shared libraries. I have a Pascal class that I use, and I simply create a Pascal shared library that exposes the methods of the class as functions and then use those functions from a C program (using the correct calling conventions, creating a header file, etc.). Would libffi be a better solution?

  2. If you intend to create a component/library for providing interoperability between Pascal and C, then libffi will be really useful. All you have to do is handle the type conversions.

    However, if it is just your project, where some Pascal code needs to talk to some C code, then I guess libffi will be overkill. It is the type conversion that will be the killer.

    OTOH, if the types that move between Pascal and C are just the most trivial ones, then libffi can still be useful, as type conversion for these will be relatively easy.

  3. It's a nice read. It gives me a much better understanding of libffi. Thank you!
