
No bounds checking for object instance arrays: an exploitable compiler feature?

Phil Pilgrim (PhiPi) Posts: 23,514
edited 2008-07-11 17:31 in Propeller 1
While coming up with pathological cases to test CLEAN on, I made a discovery regarding subscripted object instances: neither the compiler nor the interpreter does any bounds checking on the subscripts. In fact, you can subscript an object instance that's not even part of an array. Here's some code to illustrate the point:

''Top object:

CON

  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

OBJ

  test0 : "test_obj0"
  test1 : "test_obj1"

PUB Start

  test0[1].start

''__________

''test_obj0:

PUB start

  dira[0]~~
  repeat
    outa[0]~~
    outa[0]~

''__________

''test_obj1:

PUB stop

  dira[1]~~
  repeat
    outa[1]~~
    outa[1]~




Any idea which method gets called by test0[1].start? If you guessed test1.stop, you'd be right. This is (apparently) because test1.stop is the first public routine in test_obj1, and test1 is the next object after test0 in the OBJ list.
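The lookup described above can be pictured with a small Python model. This is only a sketch of the observed behavior, not the actual Spin interpreter: `OBJ_TABLE`, `call_indexed`, and the slot constants are hypothetical names, and the assumption is that an indexed call is resolved purely by slot arithmetic, with method names never consulted at run time.

```python
# Hypothetical model of the observed behavior, not the real Spin interpreter.
# Each object is reduced to an ordered list of its public methods, and the
# OBJ section to an ordered list of objects.

def test_obj0_start():
    return "test_obj0.start (toggles pin 0)"

def test_obj1_stop():
    return "test_obj1.stop (toggles pin 1)"

# Same order as the OBJ list in the top object: test0 first, then test1.
OBJ_TABLE = [
    [test_obj0_start],  # test0 : "test_obj0", public method slot 0 is start
    [test_obj1_stop],   # test1 : "test_obj1", public method slot 0 is stop
]

TEST0 = 0  # slot of test0 in the OBJ list
START = 0  # "start" is the first PUB in test_obj0, so it compiles to slot 0

def call_indexed(base_slot, index, method_slot):
    """Resolve obj[index].method by slot numbers only; names play no part."""
    return OBJ_TABLE[base_slot + index][method_slot]()

print(call_indexed(TEST0, 0, START))  # the intended call
print(call_indexed(TEST0, 1, START))  # lands on the first PUB of the next object
```

One difference from the real thing: Python raises an IndexError once the index runs past the end of the list, whereas the point of the post is that Spin performs no such check.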

This is a compiler "feature" which could raise all sorts of havoc. But is it also exploitable? Initially, I thought it might be, as a way to provide a universal character output routine, something like this:

CON

  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

  SER           = 0
  TV            = 1
  SYNTH         = 2

OBJ

  serout : "myserial"
  tvout  : "mytvtext"
  synout : "myvoicesynth"

PUB CharOut(device, straddr)

  serout[device].str(straddr)




But, really, this is no better — and a lot riskier — than:

CON

  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

  SER           = 0
  TV            = 1
  SYNTH         = 2

OBJ

  serout : "myserial"
  tvout  : "mytvtext"
  synout : "myvoicesynth"

PUB CharOut(device, straddr)

  case device
    SER :   serout.str(straddr)
    TV :    tvout.str(straddr)
    SYNTH : synout.str(straddr)




So, no useful tricks here, but certainly a possible trap.

-Phil

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
'Still some PropSTICK Kit bare PCBs left!

Comments

  • Rayman Posts: 14,162
    edited 2008-07-10 00:17
    Nice way to index a bunch of objects though...

    Might be useful for switching back and forth between TV and VGA drivers...
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-07-10 00:34
    Rayman,

    As I illustrated above, the CASE construct would be much more robust for that.

    -Phil

  • jazzed Posts: 11,803
    edited 2008-07-10 02:42
    Interesting, but confusing. Can you clarify something?
    I would experiment, to find this answer, but you likely already know.

    Given 3 objs all with the same number of methods, would using an
    index as you mention select a "method number" or just some offset?

    tvtext.spin: defined as
    PUB start
    PUB stop
    PUB str
    PUB out
    PUB setcolors

    vgatext.spin: defined as
    PUB start
    PUB stop
    PUB str
    PUB out
    PUB setcolors

    OBJ
      tv  : "tvtext"
      vga : "vgatext"

    tv[0].start ' calls tvtext.start ?
    tv[1].start ' calls vgatext.start ?
    tv[0].out   ' calls tvtext.out ?
    tv[1].out   ' calls vgatext.out ?

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-07-10 03:01
    Jazzed,

    Yes to all the questions in your code comments. The reason tv[1].start calls vgatext.start isn't because the methods have the same name, but because vgatext.start has the same relative index of occurrence (among the public methods) in vgatext.spin as tvtext.start has in tvtext.spin. If you were to shuffle the order of these public methods in one of the files, you wouldn't get what you want. You would also get screwed up if the two routines required a different number of parameters. (I haven't verified the latter; but since the number of parameters is set at compile time, I'd be stunned if it weren't true.)
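Since everything hinges on public methods lining up slot by slot, one way to guard against a shuffled file is to compare the two interfaces before relying on the index trick. Below is a minimal Python sketch of such a check. The helper `index_compatible` is hypothetical, and the parameter counts are made up for illustration, since the thread lists only the method names.

```python
def index_compatible(iface_a, iface_b):
    """Return the slot mismatches between two ordered PUB method lists.

    Each interface is a list of (name, param_count) tuples in source order.
    An empty result means every slot lines up, so an indexed call would
    reach the same-named method with the same argument count in both.
    """
    problems = []
    for slot in range(max(len(iface_a), len(iface_b))):
        a = iface_a[slot] if slot < len(iface_a) else None
        b = iface_b[slot] if slot < len(iface_b) else None
        if a != b:
            problems.append((slot, a, b))
    return problems

# Parameter counts below are assumptions for illustration only.
tvtext = [("start", 1), ("stop", 0), ("str", 1), ("out", 1), ("setcolors", 1)]
vgatext = [("start", 1), ("stop", 0), ("str", 1), ("out", 1), ("setcolors", 1)]
shuffled = [("start", 1), ("str", 1), ("stop", 0), ("out", 1), ("setcolors", 1)]

print(index_compatible(tvtext, vgatext))   # no mismatches
print(index_compatible(tvtext, shuffled))  # slots 1 and 2 disagree
```

A check like this would have to run on the host as part of a build step; nothing in the thread suggests the compiler offers it.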

    -Phil

  • jazzed Posts: 11,803
    edited 2008-07-10 03:35
    Phil,

    Yup, makes sense. I kind of suspected the numbering -vs- naming as you've described, having mentioned "method number" before. I have had my nose deep in object reconstruction studies all day today. I wonder how much abuse we can give this thing without creating a problem for such "special coding" going forward.

    So, what this means is significant.
    1. It gives a way to select an object at run-time (though bulky and constrained) to suit a task.
    2. This also looks like a way one could invoke a "function pointer" call :)

    Concrete experiments should be done to thoroughly explore and describe this "feature". Unfortunately, I'll not be able to contribute too much on this for a few weeks because of family needs as of tomorrow morning. I'll still check in as a "casual observer" though.

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-07-10 03:56
    Jazzed,

    I'll probably just drop it. It would play immense havoc with my CLEAN program, which I feel is a more robust approach to things like BIOSes. Abusing object indices is an interesting diversion which may work fine if done carefully, but it's too brittle for serious application development. OTOH, it does provide some tantalizing possibilities for code obfuscation! :)

    -Phil

  • jazzed Posts: 11,803
    edited 2008-07-10 04:40
    Obfuscation granted :)

    I've grown to like Stevenmess2004's DOL model, and am researching alternatives there. It's more of an O/S approach, but the idea does not demand eating up precious prop memory with unused modules. As a run-time solution, it appears to be expandable in a linux-like driver way.

    Your work with CLEAN will go a long way to solving memory issues for many users and is akin to finding round pegs for round holes. DOL is kind of an elliptical peg it seems, but holds promise in the "everything to everyone" at run-time arena.

  • hippy Posts: 1,981
    edited 2008-07-10 14:48
    I think this is a re-discovery but I don't think there was much pursuit of the idea before, so don't worry about that.

    In terms of a BIOS, if every device driver object had a standard set of functions in the source, in the same order and with the same number of parameters, it would be very easy to switch between devices in the device routing object with ...

    
    VAR
      byte deviceSelected
    
    PUB Tx(n)
      deviceObj[deviceSelected].Tx(n)
    
    PUB Rx(n)
      return deviceObj[deviceSelected].Rx
    
    
    



    That's a lot more lightweight than the Case selection. There'd be a need to map whichever device the user routed to onto an index, and it would probably be necessary to add a Null device as well, to stop code crashing if a non-existent device were selected ( and re-directing output to Null is reasonable anyway ).

    By having a "PUB Identify" in every device driver, the router can dynamically determine which devices are present. As long as the devices are in sequence with Null as last, it can run through device[n].Identify to see what there is when it first starts, or dynamically ...

    OBJ
      deviceObj : "Null"
      tv        : "TV_Text Driver"
      serial    : "Serial Driver"
      usb       : "USB Driver"
      null      : "Null"
    
    PUB FindDeviceIndex( deviceCode )
      result := 1
      repeat until deviceObj[result].Identify == deviceCode or deviceObj[result].Identify == 0
        result++
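The router idea above, with a Null device as sentinel, can be modeled in Python roughly as follows. This is a sketch under assumptions: `FakeDriver`, the device codes, and the clamping in `tx` are all hypothetical, and the list here starts at index 0 with the first real driver, whereas the Spin version starts scanning at index 1 because slot 0 is the `deviceObj` placeholder.

```python
class NullDevice:
    """Sentinel driver: swallows output and identifies itself as 0."""
    def identify(self):
        return 0
    def tx(self, n):
        pass

class FakeDriver:
    """Hypothetical stand-in for a real driver; records what was sent."""
    def __init__(self, code):
        self.code = code
        self.sent = []
    def identify(self):
        return self.code
    def tx(self, n):
        self.sent.append(n)

TV, SERIAL, USB = 1, 2, 3  # made-up device codes

# Drivers in sequence with Null last, as the post requires.
devices = [FakeDriver(TV), FakeDriver(SERIAL), FakeDriver(USB), NullDevice()]
NULL_INDEX = len(devices) - 1

def find_device_index(device_code):
    """Scan until the code matches or the Null sentinel (identify() == 0) is hit."""
    index = 0
    while True:
        ident = devices[index].identify()  # one identify call per slot
        if ident == device_code or ident == 0:
            return index
        index += 1

def tx(selected, n):
    """Route a byte, sending any out-of-range index to the Null device."""
    if not 0 <= selected < len(devices):
        selected = NULL_INDEX
    devices[selected].tx(n)

print(find_device_index(SERIAL))  # index of the serial driver
print(find_device_index(99))      # unknown code falls through to Null
```

The explicit range test in `tx` is the extra overhead being traded against safety: without it, a bad index in the Spin version would select whatever follows the last object.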
    
    
    
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-07-10 15:06
    Hippy,

    Your approach makes the technique more robust than I thought it could be. For me, it provides a way to deal with an OBJ list that CLEAN may have pared somewhat. If I use it, I'll have to find a way to identify which ones to keep. This may be as easy as using the CASE construct in the open method (to trigger CLEAN) and returning an index into the object list as the filehandle. One advantage the CASE construct conferred to the I/O routines was inherent bounds checking on the filehandle. This could be added to the indexed methods, too, but at the expense of a little extra overhead.

    -Phil

  • jazzed Posts: 11,803
    edited 2008-07-10 16:15
    A 'standardized' interface set is obviously necessary for taking advantage of this. I wonder, though, if an extension could be described some way. Following the unix/linux example would be good since any student of computer science would be immediately familiar with it. At minimum, the following would be provided: init, open, close, read, write, flush, select, and ioctl. These would provide character and stream devices. Network devices could also be supported at the driver, but would require a different abstraction for bind, accept, etc .... The ioctl would be used for parameter manipulation or other service control ... blocking/async, whatever. This is a little more expansive than open, close, get, set as mentioned before. If a 'count' (and list?) method was provided, one could query for other services. I would like to see an init method for being able to install kernel modules in the case of O/S use.

  • hippy Posts: 1,981
    edited 2008-07-10 17:23
    A thought for CLEAN is to keep the higher level vectoring with its overhead and then just strip the sub-objects of code except for used PUB headers. It's not as perfect and doesn't deliver the best compression but could be easier to implement.

    Another idea for the object indexing method would be to start with empty driver objects to get through compilation, then post-compilation inject the required objects into the image and fix up the related pointer tables.

    Just thinking out loud, no thought given to the pros and cons.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-07-10 21:06
    Hippy,

    How would you handle multiple instances of the same object? I'm thinking about a serial I/O object that can be instantiated several times to accommodate multiple serial channels. Your identify method won't be instance-sensitive, but maybe it could return a null value if the channel is already being used.

    -Phil

  • hippy Posts: 1,981
    edited 2008-07-11 17:31
    I suppose it depends if one thinks of it as opening COM1, COM2 ... COM32 or 32 opens of COM with a different port number; 32 drivers or just the one.

    The deviceCode could be split into parts, msb's indicate the type of device, lsb's the instance of it / pin number used. The call into deviceDriver done on the msb's index only, lsb used by the driver itself.

    In my Basic Interpreter I went for a fileHandle array [0..N] which indicated which deviceCode had been opened, and that also had to start tying in which pins had been allocated to what, whether Input, Output or Both ( similar with file names for the SD Card driver ), and it all got very complicated very quickly, and I never did get that part finished.

    I wanted to have a high-level ( interpreted command ) like ...

    tv = Open "TV,PAL,INTERLACE"
    sin = Open "KEYB"
    sout = Open "COM30,9600,N,8,1"

    Then just call Print#tv,"Hello" and Print#sout,"Hello" and have everything directed as required. That should be translatable into raw Spin.
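The split deviceCode and the fileHandle array from the last comment might look something like this in Python. It is only a sketch: the 8-bit instance field, the type codes in `DEVICE_TYPES`, and the function names are assumptions, since the thread does not fix any of them.

```python
INSTANCE_BITS = 8  # assumed split: low 8 bits hold the instance number
INSTANCE_MASK = (1 << INSTANCE_BITS) - 1

DEVICE_TYPES = {"TV": 1, "KEYB": 2, "COM": 3}  # made-up type codes

def make_device_code(dev_type, instance):
    """Pack the device type into the high bits and the instance into the low bits."""
    return (dev_type << INSTANCE_BITS) | (instance & INSTANCE_MASK)

def split_device_code(code):
    """Return (dev_type, instance): type selects the driver, instance is for the driver."""
    return code >> INSTANCE_BITS, code & INSTANCE_MASK

def parse_spec(spec):
    """Parse a string such as 'COM30,9600,N,8,1' into (device_code, options)."""
    name, *options = spec.split(",")
    letters = name.rstrip("0123456789")   # 'COM30' -> 'COM'
    digits = name[len(letters):]          # 'COM30' -> '30', 'TV' -> ''
    dev_type = DEVICE_TYPES[letters.upper()]
    instance = int(digits) if digits else 0
    return make_device_code(dev_type, instance), options

handles = []  # a file handle is simply an index into this list

def open_device(spec):
    """Record the opened device and return its handle."""
    handles.append(parse_spec(spec))
    return len(handles) - 1

tv = open_device("TV,PAL,INTERLACE")
sin = open_device("KEYB")
sout = open_device("COM30,9600,N,8,1")

print(tv, sin, sout)
print(split_device_code(handles[sout][0]))
```

Pin allocation and direction tracking, the part described as getting complicated quickly, are deliberately left out.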