Multilink is a command line tool that can create multiple code segment applications and shared library files for PalmOS. As input, multilink takes object files generated using gcc for PalmOS; as output, it generates resource files corresponding to the application or shared library code, data, and relocation segments. As such, multilink replaces the normal gcc-as-linker and obj-res steps in the build process.
The latest version is 0.3. If you have an earlier version, please update to the latest. You can check the version with the -version option. Note: in version 0.3 the name of the command has changed, so you can keep your old multilink command and also install the newer m68k-palmos-multilink or m68k-palmos-coff-multilink.
0.3 is a major update to earlier versions of multilink. A lot of the main executable has been rewritten to work with the newer prc-tools 2.X version of gcc, a number of bugs have been fixed, a number of new features have been added, and (hopefully) multilink should be easier to use now.
Here is an HTML version of the multilink(1) man page, converted using rman.
Before explaining how the multilink tool works, it might be useful to start with what actually happens when a function from one segment calls a function in another segment. To explain the process, let's say that object foo.o is in segment #1 and contains a call to bar(). bar() is (naturally) in bar.o, and bar.o is in segment #2. How does foo.o get at bar()? It's all about surrogates and export tables.
Each segment has an export table, which is basically a table of all the functions in the segment that are called from another segment. The table is actually a simple vector of 16-bit offsets from the start of the code segment to the entry points of the exported functions.
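To make that layout concrete, here is a minimal sketch of an export table as a packed vector of 16-bit offsets. It is only a model: the offsets are treated as unsigned and big-endian (an assumption based on the m68k target), and the function names, offsets and load address are invented for the example rather than taken from multilink's generated files.

```python
import struct

def pack_export_table(offsets):
    """Pack entry-point offsets as unsigned 16-bit big-endian words."""
    for off in offsets:
        if not 0 <= off <= 0xFFFF:
            raise ValueError(f"offset {off:#x} does not fit in 16 bits")
    return struct.pack(f">{len(offsets)}H", *offsets)

def resolve_export(table, index, segment_base):
    """Return the absolute address of export number `index`."""
    (offset,) = struct.unpack_from(">H", table, index * 2)
    return segment_base + offset

# Hypothetical segment #2 exporting two functions; bar() is entry 0.
exports = pack_export_table([0x0040, 0x01A6])
bar_address = resolve_export(exports, 0, segment_base=0x10020000)
```

Each entry costs two bytes, which is why the offsets must stay within the range a 16-bit word can express.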
Ok, so we have an export table in segment #2 that somewhere in it has an entry for bar(). How does foo.o get at that? This is where surrogates come in. Multilink generates a function called bar() and adds this to code segment #1. This means that segment #1 is fully resolved and can be successfully linked. What the segment #1 bar() (the surrogate for the real bar()) does is to find the exports table in segment #2, and jump to the real bar() function referenced in that exports table. When the real bar() returns, processing will continue with the code in segment #1 that called the surrogate bar().
To make it easy for the surrogate bar() to "find the exports table in segment #2", a global table of all segment export tables is allocated and computed at startup time. In the usual case, the surrogate bar() executes only a handful of instructions to index into the global export tables table, index into the export table for segment #2, compute the absolute address of the real bar() and jump there. This fast path relies on storing a pointer to the global export tables table in the global data (a4 relative reference). If your executable is running without globals (say called from find), then this doesn't work. So, in addition to the reference stored in global data, a reference is stored using the PalmOS FtrSet() facility. When no globals are available, an inter-segment call can still be accomplished by pulling the global export tables table using FtrGet().
Note that the creator ID that is used as the first argument to FtrSet() is specified on the command line with -fid. You don't have to make this the same as the final .prc creator ID, but it should be reasonably unique. This is especially true if you are running multilink created apps with multilink created shared libraries. If the latter accidentally uses the global export tables table of the former (or vice versa), bad things will happen.
[Figure: a (poor) diagram of the whole process]
If this all seems familiar it should. It's very similar to the way GLib shared libraries call each other. Multilink inter-segment calls are a little simpler, as they don't try to switch global data between segments. This is a feature(!) and makes the inter-segment calling easier, but makes the linking part much harder.
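The surrogate lookup described above, including the FtrGet() fallback, can be sketched as a small simulation. This is not multilink's runtime (the real surrogates are generated assembler); the dictionary standing in for the PalmOS feature store, the feature number, the creator ID "MLnk" and all addresses are hypothetical.

```python
FEATURE_STORE = {}   # stand-in for the PalmOS feature manager (FtrSet/FtrGet)
FEATURE_NUMBER = 0   # hypothetical feature number

def startup(creator_id, segments, globals_area):
    """Publish the table of export tables in both places at startup."""
    globals_area["export_tables"] = segments                # fast path (a4 relative)
    FEATURE_STORE[(creator_id, FEATURE_NUMBER)] = segments  # fallback

def surrogate(creator_id, globals_area, segment, index):
    """Return the absolute address the surrogate would jump to."""
    if globals_area is not None:
        tables = globals_area["export_tables"]
    else:
        # No globals available: fetch the table from the feature store.
        tables = FEATURE_STORE[(creator_id, FEATURE_NUMBER)]
    base, offsets = tables[segment]
    return base + offsets[index]

# Segment #2 loaded at a made-up base address, exporting two functions.
segments = {2: (0x20000, [0x0040, 0x01A6])}
app_globals = {}
startup("MLnk", segments, app_globals)

with_globals = surrogate("MLnk", app_globals, 2, 0)
without_globals = surrogate("MLnk", None, 2, 0)
```

Both calls resolve to the same address. Keying the store on the creator ID is also what makes a collision between an app and a shared library with the same -fid dangerous: the second one to start would overwrite the first one's entry.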
There are two important components to multilink: the multilink command itself, which runs on the host (development) platform, and the runtime code, which is linked into the target code and runs on the PalmOS device.
When executed, first multilink analyses all the input object files, recording information about all the global symbols (global functions, data, etc), unknown symbols (object foo.o calls memcpy() which is unknown to foo.o), and objects. When this is complete, multilink knows in which object file every symbol is (or no object if the symbol is still unresolved), and has created a graph of all the object to symbol dependencies.
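As a rough picture of what this analysis pass produces, the sketch below builds a symbol-to-object map, a dependency graph and the set of unresolved symbols from a hand-written description of two objects. It assumes each object is already summarised as a pair of (defined symbols, referenced symbols), whereas multilink itself reads this information from the object files, and it collapses the object-to-symbol graph to object-to-object edges for brevity. All names are illustrative.

```python
def analyse(objects):
    """objects: {object name: (defined symbols, referenced symbols)}"""
    # First pass: record which object defines each global symbol.
    defined_in = {}
    for name, (defined, _) in objects.items():
        for sym in defined:
            defined_in[sym] = name

    # Second pass: resolve every reference to its defining object.
    depends_on = {}
    unresolved = set()
    for name, (_, referenced) in objects.items():
        deps = set()
        for sym in referenced:
            owner = defined_in.get(sym)
            if owner is None:
                unresolved.add(sym)      # e.g. memcpy(), supplied by a library
            elif owner != name:
                deps.add(owner)
        depends_on[name] = deps
    return defined_in, depends_on, unresolved

objects = {
    "foo.o": ({"foo"}, {"bar", "memcpy"}),
    "bar.o": ({"bar"}, set()),
}
defined_in, depends_on, unresolved = analyse(objects)
```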
Next multilink starts allocating objects to code segments. Multilink has two different strategies for doing this. The default is the 'simple' allocation scheme; optionally, the '-segmentmap' approach can be taken. In the simple scheme, objects from the command line are added to a segment one by one until the -segmentsize is reached. Then a new segment is created, objects are added to it, and so on. The simple allocation scheme works fine, and will usually 'just work', but as it takes no account of inter-object dependencies, it can sometimes produce an executable that makes lots of calls across segment boundaries (somewhat more expensive than a local call). Ideally, you want your executable to have code segments which have a high degree of locality. That is, most of the code in a code segment calls other code in the same code segment. So, if you use the simple scheme, the order of objects on the command line matters. For the finest control over this, use the '-segmentmap' approach to explicitly control things. Personally, I think both are less than ideal, and it's unlikely you'll be able to make much difference one way or another. I really wish that I had the time to write a better allocation algorithm that took into account inter-object dependencies (either static, or runtime determined) and generated the ideal segment map. Maybe someone else can work on this...
Ok, so now all the segments have been allocated, and the inter-segment wiring can commence! This is the interesting part if you are into bits and bytes. The wiring consists of the caller side surrogate functions and callee side export tables described above. Both surrogates and exports are generated as assembler files, one of each kind for every code segment. If you use the '-leavetmp' option, you can examine the surrogate functions and export table for segment NNNN; they are named 'sNNNN.s' and 'eNNNN.s' respectively, where NNNN is the segment number.
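A minimal sketch of the simple scheme shows why command line order matters. It assumes that an object which would push a segment past the size limit starts a new segment; how multilink handles that boundary exactly is not stated here, and the object names, sizes and call list are made up.

```python
def allocate_simple(objects, segment_size):
    """objects: list of (name, code size) in command line order."""
    segments = [[]]
    used = 0
    for name, size in objects:
        if size > segment_size:
            raise ValueError(f"{name} is larger than one segment")
        # Assumption: start a new segment when this object would not fit.
        if segments[-1] and used + size > segment_size:
            segments.append([])
            used = 0
        segments[-1].append(name)
        used += size
    return segments

def cross_segment_calls(segments, calls):
    """Count (caller, callee) pairs that land in different segments."""
    where = {name: i for i, seg in enumerate(segments) for name in seg}
    return sum(1 for caller, callee in calls if where[caller] != where[callee])

calls = [("a.o", "c.o"), ("b.o", "d.o")]
order_1 = [("a.o", 20), ("b.o", 20), ("c.o", 20), ("d.o", 20)]
order_2 = [("a.o", 20), ("c.o", 20), ("b.o", 20), ("d.o", 20)]

segs_1 = allocate_simple(order_1, segment_size=40)
segs_2 = allocate_simple(order_2, segment_size=40)
costly = cross_segment_calls(segs_1, calls)
cheap = cross_segment_calls(segs_2, calls)
```

With the first ordering every call crosses a segment boundary; with the second, none does, even though the objects and the segment size are identical.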
The other thing that multilink must do is to merge data references from all code segments into one common addressing range. You want to be able to reference global data Foo from code segment #1 and Foo from code segment #2 and have them be the same thing, right? This turns out to be pretty tricky, as multilink doesn't actually do the linking of code segments itself; it calls gcc to do this. When gcc is linking code segment #5 it doesn't know about the global data used in code segment #2, or vice versa, but for everything to work, all data must be allocated out of a common offset range for all code segments. Multilink finesses gcc (or ld I guess) to make everything line up. Multilink does this by first being aware of all data references in all segments, and by adding padding data to the front and end of each code segment link to ensure that each segment refers to the same data at the same global offset. If you use the '-leavetmp' option, you can examine the padding data for all segments as 'data.s' (actually this is not just the padding data, this is all data), and the pre- and post- padding data for segment NNNN as 'dbNNNN.s' and 'deNNNN.s'.
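The arithmetic behind the padding idea can be sketched as follows. This assumes, purely for illustration, that the common data area is laid out as each segment's data in segment order; multilink's real layout may differ, and the sizes are invented.

```python
def padding_plan(data_sizes):
    """data_sizes: bytes of global data contributed by each segment, in order.

    Returns one (pre_padding, post_padding) pair per segment so that every
    segment link sees a data area of the same total size, with its own data
    sitting at the same global offset as in every other link.
    """
    total = sum(data_sizes)
    plan = []
    before = 0
    for size in data_sizes:
        plan.append((before, total - before - size))
        before += size
    return plan

sizes = [100, 40, 60]        # hypothetical data sizes for three segments
plan = padding_plan(sizes)
```

Under this model segment #2's data starts at offset 100 in every link, because its own link is padded with 100 bytes in front and 60 bytes behind.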
Next multilink determines what kind of runtime support files each segment needs. For most segments this is just a little bit of code that can handle the FtrGet() way of finding the global export tables table. In addition, the first code segment (code0001 for apps, libr0000 for shared libraries) gets all the startup code. The operating system will load the first code segment and global data (at least for apps), and the first code segment startup code must load the other code segments, and relocate data references to them, as appropriate.
When the segments have been allocated, the inter-segment wiring has been generated, the pre- and post-segment padding data has been generated, and the appropriate startup files have been chosen, the segments get linked. Each segment is linked individually, and gcc is called to 1) assemble the .s files, and 2) link the segment together. If you use the -verbose option, you can see how gcc is called. The result of each segment link is a fully linked code segment. If you use the '-leavetmp' option, you can examine these code segment files; they are named 'oNNNN', where NNNN is the segment number.
After gcc has been run to link each segment, obj-res is called to extract the code and data relocation information from each gcc linked file. Multilink could do this work itself, but it seemed a waste of time to duplicate what obj-res does very well. Multilink does some renaming of the files that obj-res generates to match the naming scheme described in the usage section.
Finally, the data resource needs to be generated. Multilink does this work itself, as the resulting data resource is the common data for all segments. The data resource file is generated as 'data0000.basename.grc' and we're all done.
Multilink has been successfully built and used on FreeBSD, Linux and Solaris (though in the case of Solaris, not in a long time). It should easily port to any other Un*x variety that has the prc tools package and the bfd library. Multilink has also been ported to Windows using the Cygwin package. Porting it to Win32/Visual C++ shouldn't be that difficult, but I've not yet had the inclination to try.
A number of the items previously described in this section have been addressed in version 0.3: library support, debugger support, etc.
Basically global const data doesn't work. If global const data is referred to in more than one segment, you're out of luck, it won't link. If all the references to a particular const item, and the item itself, are in the same code segment, you're OK. There are some system limitations and limitations of the multilink approach that cause this. Gcc usually generates data defined as const as text (code), and all references to that data using PC relative addressing. This works fine when you only have one code segment, as all text is in that code segment, so a PC relative address will work. The scheme breaks down in a multi-segment world, as at compile time you have no idea of the relative (or absolute) address of an item in another code segment. For function calls, this is handled by the surrogate and export table scheme described here, but for data references, there is no hook that multilink can use to cross the inter-segment chasm. As a point of reference, the Metrowerks CodeWarrior compiler gets around this by using globals relative addressing for all inter-segment references (code or data). This has the advantage of always working, but has the (I think) distinct disadvantage that it generates huge global data segments, as every exported function or data item must have an entry in the data segment, with the associated relocation information. This adds up pretty quickly, and eats away at your most precious resource: application memory. The three workarounds that I use, and suggest to others, are:
This is not a multilink wish list item, but rather a wish for multilink's obsolescence. Multilink is a hack to solve a problem. The app that I work on had grown too big for one code segment (many times over), and a fix was needed. The right fix is to teach the gcc compiler and linker about multiple code segments, but that sounded like it would take a really long time to do. I have some familiarity with linkers and hacking them from previous endeavors (see gtscc), and thought the multilink non-invasive approach to using gcc could work. It did, but it's not as good as the real thing could be.
What about multi-segment support in prc-tools 2.X you ask? Unfortunately, the multi-segment support in the prc-tools 2.X tool chain requires you to make significant changes to your source code. I found this unacceptable and so multilink has been updated to work with the newer compiler. The combination provides the better compilation and convenience of the new compiler, plus the ease of use and low runtime memory use of multilink.
I still think the right way to go in solving this problem is a version of ld specifically made for PalmOS. Multilink may one day grow into this or serve as a store of knowledge about non-invasive PalmOS multi-segment linking.
How does multilink compare to prc-tools 2.0's multi-segment support?
Creating multi-segment applications using only the multi-segment support now available in prc-tools 2.0 requires changes to your source code. Multilink does not require changes to your source code, only to your Makefile. That's the main advantage of using multilink, but in addition: the multilink runtime model uses less memory, apps linked with multilink can always call across segment boundaries (even when there are no globals), and multilink now has a viable debugging technique, among other things.
Problems building multilink: libbfd
90% of problems related to building multilink are to do with the bfd library. Multilink uses the bfd library (as do the gnu binutils tools) to read, write and modify object files. It is critical that you are building multilink with the correct version of the bfd library. The correct version of the bfd library is NOT the version for your host machine, but the version for PalmOS object files. /usr/lib/libbfd.a is probably not what you want. /usr/local/pilot/lib/libbfd.a probably is. Your system may not have a version of the bfd library built for Palm object files, in that case you must build one. Sorry. See also: Libbfd.
When I link, I am getting unresolved reference error messages for strange things like: _udivsi3, _muldi3, etc. What's up?
These symbols are part of the gcc runtime library, which supports things like multiplication and division of 32-bit numbers. You have probably neglected to include -lgcc or -stdlib on the command line.
Are cross segment calls possible when there are no globals?
Yes. The multilink runtime library loads and tracks the segment tables whether there are globals or not. With the older prc-tools 0.5 tool chain, there is some performance penalty in this mode.
What is the multilink license?
MPL. Multilink was originally developed by me at AvantGo (makers of very cool software for your mobile device). AvantGo has allowed me to release the source code under the MPL license.
Multilink source is available here:
If you download and use multilink, please support me in my efforts to raise money to fight AIDS while running in the Honolulu Marathon. Click here.
Having trouble getting multilink to build correctly? It's probably the version of libbfd.a or bfd.h that you are using. Try this.
Many thanks to the authors of build-prc (Dionne & Associates, The Silver Hammer Group Ltd.), obj-res (Dionne & Associates) and the gcc startup and GLib library code (Kresten Krab Thorup) for sharing their discoveries. Many thanks to the management at AvantGo for allowing multilink to be released under the MPL. The -gdb-script option (and load-segments command) was inspired by a clever gdb hack Adam Dingle developed. Thanks to Ashok Nadkarni for the first Cygwin builds. Thanks to John Marshall at Palm for prc-tools 2.0. Thanks to David Sidrane for some good testing and suggestions. Thanks to the entire Palm hacking team at AvantGo (especially Eric House and Max Sprauer) for their bugs, good ideas, and patience. This team rocks! Thanks to Max Sprauer for the Cygwin builds.
© 1998-2002 David Williams <djw@djw.org>