Unused methods

rokicki · 2006-12-21 19:26

I heard from someone on this forum that unused methods in spin are not included; that is, the compiler chucks
them out.

In my tests just now, that doesn't appear to be happening. Is it supposed to? That is, if I have an object
foo with methods m1, m2, and m3, and I use it in an object bar but only call m1, should the compiler
leave out methods m2 and m3?

If I'm not messing things up (that is, if the compiler really does include all the methods whether or not
they are used), that's going to put a kink in my plans. Is this something that could perhaps be added to the
compiler? If not, then I need to figure out how to get that effect for a FAT16 library I'm about to release;
I would hate for people who don't need directory traversal, for instance, to have to waste memory on
directory traversal routines, and I hate manual editing of code. Similarly, people who use putc() but not
write() should not need to include write().

Thanks for any help!

Mike Green · 2006-12-21 20:23

The SPIN compiler does indeed leave out methods that are not referenced. It starts with the main method in the root object, adds the interpretive code for all methods referenced by the main method, then goes through each method added, and includes the code for the methods referenced by those, etc. adding the code only once for each method. Any method not "reachable" from the main method is not included.
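
The reachability walk described above can be sketched as follows. This is a minimal illustration in Python, not Spin, and it is not the Propeller Tool's actual code; the call graph, the method names, and the function `reachable_methods` are all hypothetical.

```python
from collections import deque

def reachable_methods(call_graph, root):
    """Return the set of methods reachable from root, adding each only once."""
    included = {root}
    pending = deque([root])
    while pending:
        method = pending.popleft()
        # Look at every method this one references.
        for callee in call_graph.get(method, ()):
            if callee not in included:
                included.add(callee)
                pending.append(callee)
    return included

# Hypothetical call graph: bar's main method uses only foo.m1.
call_graph = {
    "bar.main": ["foo.m1"],
    "foo.m1": [],
    "foo.m2": ["foo.m3"],
    "foo.m3": [],
}
kept = reachable_methods(call_graph, "bar.main")
```

Under this scheme foo.m2 and foo.m3 would be left out, because nothing reachable from bar.main refers to them.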

If you have an example of this that doesn't work, strip it down to its essentials and submit it as a bug report.

rokicki · 2006-12-21 20:31

I'm attaching such a minimal example. I am using a string that you can clearly see in the compiled
output of main. It does not require a string, however; a long sequence of simple assignment
statements also will be included.

You can trivially create an example yourself. Simply add any unused object to any
main you may have; the output file will grow by the full size of that object (and not just its
data portion as I would expect).

Or maybe I'm misinterpreting something.

Mike Green · 2006-12-21 20:58

Just guessing ... I suspect that the compiler is putting the strings into a global table for the object rather than one associated with the method as if they were going into a DAT section (which is global to the object). Since (as far as I know) the compiler doesn't optimize duplicate strings, it shouldn't matter where they're put ... globally or locally at the end of the method's byte codes. If they're local, they should be able to be optimized along with the byte codes if the method isn't referenced.

I don't think you're misinterpreting anything. The compiler is supposed to optimize out unreferenced objects. I'll PM Jeff Martin about this.

rokicki · 2006-12-21 21:07

Right, but as I said, it is not just strings. Replace the string with, say, 1000 assignment statements. The same
thing will happen.

I've tried all sorts of things, including with real objects, and I cannot get any unused function elided at all.

Simplest test: build a main that has one empty pub and nothing else. Compile it, record the size.

Now stick at the top

obj
tv : "TV_Text"

Compile it and record the size. Nothing in TV_Text is used. Yet, you'll get every byte of code from
TV_Text in your main.

I thought, maybe if you don't use it at all it defaults to include everything. So use a single method from
TV_Text. You'll get the same (indeed a slightly larger) size.

Essentially, I cannot locate any evidence whatsoever that unused methods are eliminated. Every single
time I add a method to a subobject, even if it is completely unreferenced anywhere, the overall size
grows.

cgracey · 2006-12-21 21:51

Actually, the compiler includes the whole object (including all methods). The good news is that no matter how many instances you have of an object, across all levels of hierarchy, the code for that object is included only once.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

rokicki · 2006-12-21 22:02

Great, thanks for the definitive reply, Chip!

Of course this leaves library builders in a quandary; we want to provide a large set of methods
for people to use, but we don't want people who only use a small subset to pay the memory
price of all the methods. And there's no automatic way for us to figure out what methods are
actually used.

Is there any chance a new version of the tool could do what (apparently) many people expected
it to do, and simply not include code that's unused and unreachable? If nothing else, this will
quickly and effortlessly make a lot of current programs smaller.

Jeff Martin · 2006-12-21 22:06

Hi,

Thank you, Mike, for alerting me to this thread. The answer is that both of you are partially right.

The current compiler (and every previous version of it so far) does optimize some things, but the granularity is at the Object level, not the Method level. This means that if your application includes two or more instances of a particular object, the compiler uses only one copy of the object's code, and multiple copies of the object's global variables, in the final application. So, for example, using 5 instances of an object will include 1 instance of the object's code and 5 instances of the object's global variables. I believe this is what Mike was thinking of when he mentioned that it optimized methods.
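
As a rough worked example of that arithmetic, here is a small Python sketch. The byte counts, the object name "serial", and the function `image_bytes` are made up for illustration; only the rule (one copy of code, one copy of global variables per instance) comes from the description above.

```python
def image_bytes(objects, instance_counts):
    """Estimate image size with object-level sharing.

    objects: name -> (code_bytes, var_bytes), hypothetical sizes
    instance_counts: name -> number of instances in the application
    """
    total = 0
    for name, count in instance_counts.items():
        if count == 0:
            continue
        code_bytes, var_bytes = objects[name]
        # Code is included once; global variables once per instance.
        total += code_bytes + count * var_bytes
    return total

objects = {"serial": (2000, 40)}               # assumed sizes, not measured
shared = image_bytes(objects, {"serial": 5})   # 2000 + 5 * 40
naive = 5 * (2000 + 40)                        # if code were copied per instance
```

With these assumed sizes, five instances cost 2200 bytes rather than 10200. Note that the 2000 bytes of code are paid in full even if only one method is ever called.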

Optimizing out unreferenced methods is something we'd really like to do, but it horribly complicates the compilation process and we have not yet reached a point where we wanted to take on that task. The compiler recursively compiles the objects into the application, and at every step of the way it does a binary compare of the object units currently included, throws out duplicate code, patches addresses, and moves on. That's the part that would have to change significantly if we had the added necessity to create an "intersection" image of an object based on every use of each instance of that same object.

Mike, I'm sure something we said, or didn't say, early on in the Propeller release led you to misunderstand what level of optimizing occurs in the compiler; I'm sorry about the confusion and apologize for that.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin

Sr. Software Engineer
Parallax, Inc.

Jeff Martin · 2006-12-21 22:09

rokicki,

Your reasons are exactly why the idea of making it optimize that way has been floating around since before the release. Hopefully we can provide a solution to this someday.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Jeff Martin

Sr. Software Engineer
Parallax, Inc.

rokicki · 2006-12-21 23:14

Right, implementing this would definitely require a bit of work.

One way to do it is to make your compiler take as an argument a hash table including the
methods that are needed (indexed by file name and method name). Every time you look
up a method name, make sure it's in this hash table. If it's not, mark the current
compilation as invalid, add it to the hash table, and continue. Every time you compile an
object, only include those methods marked in the hash table.

At the end, if the compilation is marked as invalid, just repeat it with the updated hash table.

This way you'll throw away precisely the same methods every time, so your binary comparison
and address patching will still work. And the changes to your existing compiler are probably
pretty minimal (just throwing a loop around it and adding a hashtable of used methods).

(I used to write assemblers this way: just run the assembly process multiple times; if you
see use of a not-yet-defined symbol, treat it as zero on that pass, mark that pass as
failing, and repeat until you stop failing. It may not be the fastest way to go, but it
eliminates all the separate code for multiple passes.)

I'm sure there are a lot of complexities that I am unaware of in this case, though.
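
The repeat-until-stable scheme proposed in this post can be sketched roughly as follows. This is Python standing in for the real compiler, which the sketch makes no claim to resemble; the method table, the FAT-style method names, and the functions `compile_once` and `compile_until_stable` are hypothetical.

```python
def compile_once(methods, needed):
    """One compilation pass.

    methods: (object, method) -> list of (object, method) it references
    needed:  set of methods to include; grows when new references are seen
    Returns (emitted, valid); valid is False if needed had to be extended.
    """
    emitted = []
    valid = True
    for key, callees in methods.items():
        if key not in needed:
            continue  # only include methods marked in the table
        emitted.append(key)
        for callee in callees:
            if callee not in needed:
                needed.add(callee)  # remember it for the next pass
                valid = False       # this pass must be repeated
    return emitted, valid

def compile_until_stable(methods, root):
    """Repeat the whole compilation until no new method is discovered."""
    needed = {root}
    passes = 0
    while True:
        passes += 1
        emitted, valid = compile_once(methods, needed)
        if valid:
            return emitted, passes

# Hypothetical library: the main object only ever calls putc.
methods = {
    ("main", "start"): [("fat", "putc")],
    ("fat", "putc"): [("fat", "write_byte")],
    ("fat", "write"): [("fat", "putc")],
    ("fat", "write_byte"): [],
    ("fat", "opendir"): [("fat", "readdir")],
    ("fat", "readdir"): [],
}
emitted, passes = compile_until_stable(methods, ("main", "start"))
```

The final pass always sees a complete table, so it includes the same methods for every use of an object. Here write, opendir, and readdir are never marked and so are never emitted.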

Phil Pilgrim (PhiPi) · 2006-12-21 23:39

I don't think this will throw a wrench into the optimization works, but something I'm rather hoping to see someday is method references that can be passed as parameters. This would help meet a need for universal I/O routines that can format data for serial output, video output, etc., without having to know where the data is going. I believe rokicki's approach will still work, since any method that gets used will have to be mentioned somewhere in the program. But it'd still be good to plan for such an eventuality if the optimization task is tackled first.

-Phil

Mike Green · 2006-12-22 02:43

One way to help somewhat is to have several objects, each with a well-defined subset of functions, and have them reference the underlying data structures via pointers. The Propeller OS does this. For example, the display functionality could be split, with commonly used methods in one object and less commonly used methods in a second one. Both objects would access the underlying display information via pointers, and you could leave out the one(s) you didn't use.
