I've thought about Spin overlays a bit more. One way to do it would be to create a stub object for each object that is in the overlay. When you call an overlay method you actually call a stubbed version of the method that is resident in memory. The stubbed version checks if its overlay is in memory, loads it if needed, and then calls the actual method in the overlay.
This idea could be extended even further where the stubbed-method loads only its method into a cache instead of loading the whole object. However, this would require loading the method plus all the other methods that it might call, so this becomes very complicated.
However, this would require loading the method plus all the other methods that it might call, so this becomes very complicated.
I like your stub idea, but I agree the above is an issue.
Hmm. Alright, let's take the position that using overlays is not going to be just a matter of dropping in objects from the Obex. Code is going to need to be tweaked to suit the overlay system. An example would be any object using PASM. With the current Spin system you just include the PASM and load it into a cog, and up to 2K of hub space is then wasted and never used again. A smarter way would be to load the PASM into a cog, note the PAR list location, then load the Spin part separately. That Spin part might get loaded and unloaded as an overlay, and the only part that stays resident is the PAR section.
So... thinking about it this way, if things need to be optimised for overlays, then at the same time they can be optimised for stubs, including things like removing nested objects within that stub and putting everything that is needed in one package.
The stubbed version checks if its overlay is in memory
Ah, that leads to a simple cache driver. That might make it easier for the programmer, as it is not so necessary to actively think about which overlay is in memory. So you could allocate some hub memory for overlays, and when an overlay is needed, load it in and then leave it there. The cache fills up with lots of different sized overlays. Each time an overlay is called, it becomes overlay #1, and all the other overlays in memory get incremented by 1. When memory is full and a new overlay is needed, the highest numbered overlays are removed from memory, one at a time, until there is enough free space.
There are lots of other cache algorithms too.
That could work really well for something like a set of string algorithms. Each of these tends to fit into one PUB and tends to be self contained and not calling other PUBs very much, and if it is, ok, a bit of code is being duplicated here and there. But if you are processing a lot of text, then with a cache the string routines you use a lot would stay in memory, and the ones you don't use much are slower.
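The renumbering scheme described above is a least-recently-used cache. A minimal sketch of just the bookkeeping is below, assuming a fixed number of overlay slots. MAX_SLOTS, rank, Touch, Victim and Evict are hypothetical names, and the actual loading and freeing of hub memory is left out. Unlike the plain "increment everything" rule, it only bumps overlays that were more recent than the one being touched, so the numbers stay 1..n.

```spin
CON
  MAX_SLOTS = 8                       ' hypothetical: most overlays tracked at once

VAR
  byte rank[MAX_SLOTS]                ' 0 = slot empty, 1 = most recently used

PUB Touch(slot) | i, old
'' Mark slot as just used: it becomes #1 and every overlay that was
'' more recent than its old position moves down by one.
  old := rank[slot]
  repeat i from 0 to MAX_SLOTS - 1
    if rank[i] and (old == 0 or rank[i] < old)
      rank[i]++
  rank[slot] := 1

PUB Victim : slot | i, worst
'' Return the slot with the highest number (least recently used),
'' or -1 if nothing is loaded.
  slot := -1
  worst := 0
  repeat i from 0 to MAX_SLOTS - 1
    if rank[i] > worst
      worst := rank[i]
      slot := i

PUB Evict(slot)
'' Forget a slot once its memory has been released.
  rank[slot] := 0
```

A loader would call Victim and Evict in a loop until the new overlay fits, then Touch the slot it was loaded into.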
What would a stub overlay look like?
Could it be as simple as a PUB var1,var2,var3 etc? Do you need to define how many variables are passed? Would returning variables work the same way as normal?
Here's a program that reads Spin binary files from an SD card, and calls the first method in the top object. The example contains the simple programs test1.spin, test2.spin and test3.spin that just return the values 1, 2 and 3. The overlay program loads the binary files from SD and prints the returned values.
To test the program you must first write the binary files test1.binary, test2.binary and test3.binary to an SD card (renamed to test1.bin, test2.bin and test3.bin, the names the program opens and prints). Modify the parameters in the call to mount_explicit to match the pin numbers for your SD card. The program is currently set up for a C3 card. Build and run the overlay.spin program, and it should print the following lines.
test1.bin returned a value of 1
test2.bin returned a value of 2
test3.bin returned a value of 3
Wow!
I think this just might be what the doctor ordered.
This is brilliant. I'm not entirely sure what some of the code does, but it certainly is working.
PUB CallOverlay(fname) | addr, size, pbase, vbase, dbase, vsize, nummet1, numobj, entryptr
  ' Load the object to a location 200 longs after the current stack pointer
  addr := @result + 800
  size := LoadFile(fname, addr)
  ifnot size
    return 0
  ' Get the values of pbase, vbase and dbase
  pbase := word[addr][3] + addr
  vbase := word[addr][4] + addr
  dbase := word[addr][5] + addr
  ' Determine the VAR size and zero it
  vsize := dbase - vbase - 8
  longfill(vbase, 0, vsize/4)
  ' Locate the method table entry for the last object
  nummet1 := byte[@@0][2]
  numobj := byte[@@0][3]
  entryptr := @@0 + (nummet1+numobj-1)*4
  ' Adjust the pbase and vbase offsets in the method table
  word[entryptr] := pbase - @@0
  word[entryptr][1] := vbase - @firstvar
  ' Call the first method in the object
  return ovl.main
Where is the overlay being loaded - ie where is the stack pointer? Is that at the end of the program?
In my slightly modified version, the stack pointer was set to $1C4C. This is just a couple of lines into the purple "Free/Stack" area when looking at the F8 hex.
I set the overlay area to "@result + 800" in the CallOverlay method. @result gives the address of the current result variable, and I add an offset of 800 bytes, or 200 longs to this. That should give plenty of room on the stack. If the code needs a larger stack the offset could be increased.
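As a rough sketch of the same address arithmetic, the method below reports how much hub RAM is left above that load address. STACK_MARGIN, HUB_RAM_END and OverlayRoom are hypothetical names. The 800 byte margin is the value from the post, and $8000 is the end of the Propeller 1's 32 KB hub RAM.

```spin
CON
  STACK_MARGIN = 800                  ' bytes left for the stack to grow (value used in CallOverlay)
  HUB_RAM_END  = $8000                ' Propeller 1 hub RAM is 32 KB

PUB OverlayRoom
'' @result is the address of this method's result variable, which lives on
'' the stack, so @result + STACK_MARGIN is where an overlay would be loaded.
'' Returns the number of bytes between that address and the end of hub RAM.
  result := HUB_RAM_END - (@result + STACK_MARGIN)
```

Comparing this value with the size of the binary before loading would catch an overlay that is too big for the free area, which the 30000 byte read limit in LoadFile does not check.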
I got some time to test out the code. fsrw has never liked my SD cards for some reason so I ported it over to Kye's SD driver, and also dumped the display out to a VGA monitor. And just to stretch the overlay concept, I added a whole object to test1.spin so now it is 8k in size. I'm not looking to use this object (yet), just to check that adding objects to the beginning of an overlay does not crash it.
' test1.spin demo
OBJ
  fat: "SD-MMC_FATEngine_No_RTC.spin"
PUB main
  result := 1
Kye's SD driver code works differently but the core of the program is similar
OBJ
  ovl : "dummy"
VAR
  long firstvar
PUB main
  printstring(string("Start Overlay Demonstration"))
  crlf
  PrintOverlayReturnValue(string("test1.bin"))
  PrintOverlayReturnValue(string("test2.bin"))
  PrintOverlayReturnValue(string("test3.bin"))
PUB PrintOverlayReturnValue(fname)
  result := CallOverlay(fname)
  printstring(fname)
  printstring(string(" returned a value of "))
  printdecimal(result)
  crlf
PUB CallOverlay(fname) | addr, size, pbase, vbase, dbase, vsize, nummet1, numobj, entryptr
  ' Load the object to a location 200 longs after the current stack pointer
  addr := @result + 800
  size := LoadFile(fname, addr)
  ifnot size
    return 0
  ' Get the values of pbase, vbase and dbase
  pbase := word[addr][3] + addr
  vbase := word[addr][4] + addr
  dbase := word[addr][5] + addr
  ' Determine the VAR size and zero it
  vsize := dbase - vbase - 8
  longfill(vbase, 0, vsize/4)
  ' Locate the method table entry for the last object
  nummet1 := byte[@@0][2]
  numobj := byte[@@0][3]
  entryptr := @@0 + (nummet1+numobj-1)*4
  ' Adjust the pbase and vbase offsets in the method table
  word[entryptr] := pbase - @@0
  word[entryptr][1] := vbase - @firstvar
  ' Call the first method in the object
  return ovl.main
PUB Loadfile(fname, addr)
  fat.openfile(fname,"R")
  result := fat.readdata(addr,30000)
  fat.closefile
This is incredible work Dave. A simple bit of code, but amazing possibilities for huge programs!
I wrote something several years back that I think does what you want and is similar to what a lot of people have explained here. Haven't had time to look at it for a couple of years.
To make this really useful you need a little bit of help from the compiler. I started modifying the one in sphinx but never got the changes to the linker made and was also having problems with the sd card getting corrupted.
Addit. Seems the answer is pretty simple. Going back to Dave's original code:
1) in dummy.spin, replace PUB main with PUB main(n)
2) in test1.spin, add a parameter, e.g.
PUB main(n)
  result := 1 + n
and repeat for test2.spin and test3.spin
3) in the main program, pass a value, e.g.
  return ovl.main(4)
So I guess the only big decision is what the dummy format should be. One variable? 5 Variables?
What is a reasonable number to choose?
I'm kind of inclined to just stick with one, using the same logic that you only need to have one PAR passed to a cog, and if you want more, then pass a pointer to an array.
Woot! First Obex object running as an overlay. This takes the keyboard object and splits it into two parts - the resident spin part and the overlay pasm part.
The demo part of the code is this. The keyboard pasm startup is run in between other overlays and each overlay overwrites the previous one.
PUB main
  printstring(string("Start Overlay Demonstration"))
  crlf
  PrintOverlayReturnValue(string("test1.bin"))
  PrintOverlayReturnValue(string("test2.bin"))
  printstring(string("load keyboard cog"))
  crlf
  result := key.Store_Parameters(27, 26, %100, 40) 'Start Keyboard Driver pin,pin,num,repeatrate, returns array location
  CallOverlay(string("KeyOver.bin"),result) ' load cog with keyboard driver and pass location of the array
  PrintOverlayReturnValue(string("test3.bin"))
  printstring(string("type something"))
  repeat
    result := key.key
    if result <> 0
      printchar(result)
The Resident part of the keyboard code is just the spin methods
''***************************************
''* PS/2 Keyboard Driver v1.0.1 *
''* Author: Chip Gracey *
''* Copyright (c) 2004 Parallax, Inc. *
''* See end of file for terms of use. *
''***************************************
{-----------------REVISION HISTORY-----------------
v1.0.1 - Updated 6/15/2006 to work with Propeller Tool 0.96}
' Split into resident and overlay programs
' overlay contains the pasm part and is unloaded once the cog has been started
' resident part contains the spin drivers
' save the overlay program, F8 to create binary, save to SD card and rename as KeyOver.bin
' call from main program with this code
' result := key.Store_Parameters(27, 26, %100, 40) 'Start Keyboard Driver pin,pin,num,repeatrate, returns array location
' CallOverlay(string("KeyOver.bin"),result) ' load cog with keyboard driver and pass location of the array
VAR
  long keyboard_cog
  long par_tail 'key buffer tail read/write (19 contiguous longs)
  long par_head 'key buffer head read-only
  long par_present 'keyboard present read-only
  long par_states[8] 'key states (256 bits) read-only
  long par_keys[8] 'key buffer (16 words) read-only (also used to pass initial parameters)
Pub Store_Parameters(dpin, cpin, locks, auto) ' stores values in array prior to passing to overlay
'returns the location of the array, as will need this to pass to the overlay
  keyboard_cog := 0 ' set cog running flag to zero
  par_keys[0] := dpin
  par_keys[1] := cpin
  par_keys[2] := locks
  par_keys[3] := auto
  result := @keyboard_cog
PUB stop
'' Stop keyboard driver - frees a cog
  if keyboard_cog
    cogstop(keyboard_cog~ - 1)
  longfill(@par_tail, 0, 19)
PUB present : truefalse
'' Check if keyboard present - valid ~2s after start
'' returns t|f
  truefalse := -par_present
PUB key : keycode
'' Get key (never waits)
'' returns key (0 if buffer empty)
  if par_tail <> par_head
    keycode := par_keys.word[par_tail]
    par_tail := ++par_tail & $F
PUB getkey : keycode
'' Get next key (may wait for keypress)
'' returns key
  repeat until (keycode := key)
PUB newkey : keycode
'' Clear buffer and get new key (always waits for keypress)
'' returns key
  par_tail := par_head
  keycode := getkey
PUB gotkey : truefalse
'' Check if any key in buffer
'' returns t|f
  truefalse := par_tail <> par_head
PUB clearkeys
'' Clear key buffer
  par_tail := par_head
PUB keystate(k) : state
'' Get the state of a particular key
'' returns t|f
  state := -(par_states[k >> 5] >> k & 1)
And the Overlay part is the PASM code. The start method needed to be modified (and would be in other objects too). A start method usually points to a VAR list, but with overlays only one variable is passed, and that is a pointer to the start of the PAR list. So one has to go through and selectively replace @ symbols with "ptr + n*4" expressions. Sometimes a long[base][offset] is needed instead of a VAR variable. And sometimes the original code clears all the data you have passed in, so that needs to be modified as well. So it is not impossible, but one does need high caffeine levels!
''***************************************
''* PS/2 Keyboard Driver v1.0.1 *
''* Author: Chip Gracey *
''* Copyright (c) 2004 Parallax, Inc. *
''* See end of file for terms of use. *
''***************************************
{-----------------REVISION HISTORY-----------------
v1.0.1 - Updated 6/15/2006 to work with Propeller Tool 0.96}
PUB start(keyptr) : okay
'' keyptr is the location of keyboard_cog array
'' Like start, but allows you to specify lock settings and auto-repeat
''
'' locks = lock setup
'' bit 6 disallows shift-alphas (case set solely by CapsLock)
'' bits 5..3 disallow toggle of NumLock/CapsLock/ScrollLock state
'' bits 2..0 specify initial state of NumLock/CapsLock/ScrollLock
'' (eg. %0_001_100 = disallow ScrollLock, NumLock initially 'on')
''
'' auto = auto-repeat setup
'' bits 6..5 specify delay (0=.25s, 1=.5s, 2=.75s, 3=1s)
'' bits 4..0 specify repeat rate (0=30cps..31=2cps)
'' (eg %01_00000 = .5s delay, 30cps repeat)
  if long[keyptr][0]
    cogstop(long[keyptr][0]~ - 1)
  longfill(keyptr+4, 0, 11) ' only fill with 11 otherwise overwrites the data being passed in par_keys
  ' longmove(long[keyptr][12],@dpin,4) ' replaces longmove(@par_keys, @dpin, 4)
  '' not needed as Store_Parameters has already moved this data to the array
  okay := long[keyptr][0] := cognew(@entry, keyptr+4) + 1 ' replaces okay := cog := cognew(@entry, @par_tail) + 1
DAT
'******************************************
'* Assembly language PS/2 keyboard driver *
'******************************************
org
'
'
' Entry
'
entry movd :par,#_dpin 'load input parameters _dpin/_cpin/_locks/_auto
mov x,par
add x,#11*4
mov y,#4
:par rdlong 0,x
add :par,dlsb
add x,#4
djnz y,#:par
mov dmask,#1 'set pin masks
shl dmask,_dpin
mov cmask,#1
shl cmask,_cpin
test _dpin,#$20 wc 'modify port registers within code
muxc _d1,dlsb
muxc _d2,dlsb
muxc _d3,#1
muxc _d4,#1
test _cpin,#$20 wc
muxc _c1,dlsb
muxc _c2,dlsb
muxc _c3,#1
mov _head,#0 'reset output parameter _head
'
'
' Reset keyboard
'
reset mov dira,#0 'reset directions
mov dirb,#0
movd :par,#_present 'reset output parameters _present/_states[8]
mov x,#1+8
:par mov 0,#0
add :par,dlsb
djnz x,#:par
mov stat,#8 'set reset flag
'
'
' Update parameters
'
update movd :par,#_head 'update output parameters _head/_present/_states[8]
mov x,par
add x,#1*4
mov y,#1+1+8
:par wrlong 0,x
add :par,dlsb
add x,#4
djnz y,#:par
test stat,#8 wc 'if reset flag, transmit reset command
if_c mov data,#$FF
if_c call #transmit
'
'
' Get scancode
'
newcode mov stat,#0 'reset state
:same call #receive 'receive byte from keyboard
cmp data,#$83+1 wc 'scancode?
if_nc cmp data,#$AA wz 'powerup/reset?
if_nc_and_z jmp #configure
if_nc cmp data,#$E0 wz 'extended?
if_nc_and_z or stat,#1
if_nc_and_z jmp #:same
if_nc cmp data,#$F0 wz 'released?
if_nc_and_z or stat,#2
if_nc_and_z jmp #:same
if_nc jmp #newcode 'unknown, ignore
'
'
' Translate scancode and enter into buffer
'
test stat,#1 wc 'lookup code with extended flag
rcl data,#1
call #look
cmp data,#0 wz 'if unknown, ignore
if_z jmp #newcode
mov t,_states+6 'remember lock keys in _states
mov x,data 'set/clear key bit in _states
shr x,#5
add x,#_states
movd :reg,x
mov y,#1
shl y,data
test stat,#2 wc
:reg muxnc 0,y
if_nc cmpsub data,#$F0 wc 'if released or shift/ctrl/alt/win, done
if_c jmp #update
mov y,_states+7 'get shift/ctrl/alt/win bit pairs
shr y,#16
cmpsub data,#$E0 wc 'translate keypad, considering numlock
if_c test _locks,#%100 wz
if_c_and_z add data,#@keypad1-@table
if_c_and_nz add data,#@keypad2-@table
if_c call #look
if_c jmp #:flags
cmpsub data,#$DD wc 'handle scrlock/capslock/numlock
if_c mov x,#%001_000
if_c shl x,data
if_c andn x,_locks
if_c shr x,#3
if_c shr t,#29 'ignore auto-repeat
if_c andn x,t wz
if_c xor _locks,x
if_c add data,#$DD
if_c_and_nz or stat,#4 'if change, set configure flag to update leds
test y,#%11 wz 'get shift into nz
if_nz cmp data,#$60+1 wc 'check shift1
if_nz_and_c cmpsub data,#$5B wc
if_nz_and_c add data,#@shift1-@table
if_nz_and_c call #look
if_nz_and_c andn y,#%11
if_nz cmp data,#$3D+1 wc 'check shift2
if_nz_and_c cmpsub data,#$27 wc
if_nz_and_c add data,#@shift2-@table
if_nz_and_c call #look
if_nz_and_c andn y,#%11
test _locks,#%010 wc 'check shift-alpha, considering capslock
muxnc :shift,#$20
test _locks,#$40 wc
if_nz_and_nc xor :shift,#$20
cmp data,#"z"+1 wc
if_c cmpsub data,#"a" wc
:shift if_c add data,#"A"
if_c andn y,#%11
:flags ror data,#8 'add shift/ctrl/alt/win flags
mov x,#4 '+$100 if shift
:loop test y,#%11 wz '+$200 if ctrl
shr y,#2 '+$400 if alt
if_nz or data,#1 '+$800 if win
ror data,#1
djnz x,#:loop
rol data,#12
rdlong x,par 'if room in buffer and key valid, enter
sub x,#1
and x,#$F
cmp x,_head wz
if_nz test data,#$FF wz
if_nz mov x,par
if_nz add x,#11*4
if_nz add x,_head
if_nz add x,_head
if_nz wrword data,x
if_nz add _head,#1
if_nz and _head,#$F
test stat,#4 wc 'if not configure flag, done
if_nc jmp #update 'else configure to update leds
'
'
' Configure keyboard
'
configure mov data,#$F3 'set keyboard auto-repeat
call #transmit
mov data,_auto
and data,#%11_11111
call #transmit
mov data,#$ED 'set keyboard lock-leds
call #transmit
mov data,_locks
rev data,#-3 & $1F
test data,#%100 wc
rcl data,#1
and data,#%111
call #transmit
mov x,_locks 'insert locks into _states
and x,#%111
shl _states+7,#3
or _states+7,x
ror _states+7,#3
mov _present,#1 'set _present
jmp #update 'done
'
'
' Lookup byte in table
'
look ror data,#2 'perform lookup
movs :reg,data
add :reg,#table
shr data,#27
mov x,data
:reg mov data,0
shr data,x
jmp #rand 'isolate byte
'
'
' Transmit byte to keyboard
'
transmit
_c1 or dira,cmask 'pull clock low
movs napshr,#13 'hold clock for ~128us (must be >100us)
call #nap
_d1 or dira,dmask 'pull data low
movs napshr,#18 'hold data for ~4us
call #nap
_c2 xor dira,cmask 'release clock
test data,#$0FF wc 'append parity and stop bits to byte
muxnc data,#$100
or data,dlsb
mov x,#10 'ready 10 bits
transmit_bit call #wait_c0 'wait until clock low
shr data,#1 wc 'output data bit
_d2 muxnc dira,dmask
mov wcond,c1 'wait until clock high
call #wait
djnz x,#transmit_bit 'another bit?
mov wcond,c0d0 'wait until clock and data low
call #wait
mov wcond,c1d1 'wait until clock and data high
call #wait
call #receive_ack 'receive ack byte with timed wait
cmp data,#$FA wz 'if ack error, reset keyboard
if_nz jmp #reset
transmit_ret ret
'
'
' Receive byte from keyboard
'
receive test _cpin,#$20 wc 'wait indefinitely for initial clock low
waitpne cmask,cmask
receive_ack
mov x,#11 'ready 11 bits
receive_bit call #wait_c0 'wait until clock low
movs napshr,#16 'pause ~16us
call #nap
_d3 test dmask,ina wc 'input data bit
rcr data,#1
mov wcond,c1 'wait until clock high
call #wait
djnz x,#receive_bit 'another bit?
shr data,#22 'align byte
test data,#$1FF wc 'if parity error, reset keyboard
if_nc jmp #reset
rand and data,#$FF 'isolate byte
look_ret
receive_ack_ret
receive_ret ret
'
'
' Wait for clock/data to be in required state(s)
'
wait_c0 mov wcond,c0 '(wait until clock low)
wait mov y,tenms 'set timeout to 10ms
wloop movs napshr,#18 'nap ~4us
call #nap
_c3 test cmask,ina wc 'check required state(s)
_d4 test dmask,ina wz 'loop until got state(s) or timeout
wcond if_never djnz y,#wloop '(replaced with c0/c1/c0d0/c1d1)
tjz y,#reset 'if timeout, reset keyboard
wait_ret
wait_c0_ret ret
c0 if_c djnz y,#wloop '(if_never replacements)
c1 if_nc djnz y,#wloop
c0d0 if_c_or_nz djnz y,#wloop
c1d1 if_nc_or_z djnz y,#wloop
'
'
' Nap
'
nap rdlong t,#0 'get clkfreq
napshr shr t,#18/16/13 'shr scales time
min t,#3 'ensure waitcnt won't snag
add t,cnt 'add cnt to time
waitcnt t,#0 'wait until time elapses (nap)
nap_ret ret
'
'
' Initialized data
'
'
dlsb long 1 << 9
tenms long 10_000 / 4
'
'
' Lookup table
' ascii scan extkey regkey ()=keypad
'
table word $0000 '00
word $00D8 '01 F9
word $0000 '02
word $00D4 '03 F5
word $00D2 '04 F3
word $00D0 '05 F1
word $00D1 '06 F2
word $00DB '07 F12
word $0000 '08
word $00D9 '09 F10
word $00D7 '0A F8
word $00D5 '0B F6
word $00D3 '0C F4
word $0009 '0D Tab
word $0060 '0E `
word $0000 '0F
word $0000 '10
word $F5F4 '11 Alt-R Alt-L
word $00F0 '12 Shift-L
word $0000 '13
word $F3F2 '14 Ctrl-R Ctrl-L
word $0071 '15 q
word $0031 '16 1
word $0000 '17
word $0000 '18
word $0000 '19
word $007A '1A z
word $0073 '1B s
word $0061 '1C a
word $0077 '1D w
word $0032 '1E 2
word $F600 '1F Win-L
word $0000 '20
word $0063 '21 c
word $0078 '22 x
word $0064 '23 d
word $0065 '24 e
word $0034 '25 4
word $0033 '26 3
word $F700 '27 Win-R
word $0000 '28
word $0020 '29 Space
word $0076 '2A v
word $0066 '2B f
word $0074 '2C t
word $0072 '2D r
word $0035 '2E 5
word $CC00 '2F Apps
word $0000 '30
word $006E '31 n
word $0062 '32 b
word $0068 '33 h
word $0067 '34 g
word $0079 '35 y
word $0036 '36 6
word $CD00 '37 Power
word $0000 '38
word $0000 '39
word $006D '3A m
word $006A '3B j
word $0075 '3C u
word $0037 '3D 7
word $0038 '3E 8
word $CE00 '3F Sleep
word $0000 '40
word $002C '41 ,
word $006B '42 k
word $0069 '43 i
word $006F '44 o
word $0030 '45 0
word $0039 '46 9
word $0000 '47
word $0000 '48
word $002E '49 .
word $EF2F '4A (/) /
word $006C '4B l
word $003B '4C ;
word $0070 '4D p
word $002D '4E -
word $0000 '4F
word $0000 '50
word $0000 '51
word $0027 '52 '
word $0000 '53
word $005B '54 [
word $003D '55 =
word $0000 '56
word $0000 '57
word $00DE '58 CapsLock
word $00F1 '59 Shift-R
word $EB0D '5A (Enter) Enter
word $005D '5B ]
word $0000 '5C
word $005C '5D \
word $CF00 '5E WakeUp
word $0000 '5F
word $0000 '60
word $0000 '61
word $0000 '62
word $0000 '63
word $0000 '64
word $0000 '65
word $00C8 '66 BackSpace
word $0000 '67
word $0000 '68
word $C5E1 '69 End (1)
word $0000 '6A
word $C0E4 '6B Left (4)
word $C4E7 '6C Home (7)
word $0000 '6D
word $0000 '6E
word $0000 '6F
word $CAE0 '70 Insert (0)
word $C9EA '71 Delete (.)
word $C3E2 '72 Down (2)
word $00E5 '73 (5)
word $C1E6 '74 Right (6)
word $C2E8 '75 Up (8)
word $00CB '76 Esc
word $00DF '77 NumLock
word $00DA '78 F11
word $00EC '79 (+)
word $C7E3 '7A PageDn (3)
word $00ED '7B (-)
word $DCEE '7C PrScr (*)
word $C6E9 '7D PageUp (9)
word $00DD '7E ScrLock
word $0000 '7F
word $0000 '80
word $0000 '81
word $0000 '82
word $00D6 '83 F7
keypad1 byte $CA, $C5, $C3, $C7, $C0, 0, $C1, $C4, $C2, $C6, $C9, $0D, "+-*/"
keypad2 byte "0123456789.", $0D, "+-*/"
shift1 byte "{|}", 0, 0, "~"
shift2 byte $22, 0, 0, 0, 0, "<_>?)!@#$%^&*(", 0, ":", 0, "+"
'
'
' Uninitialized data
'
dmask res 1
cmask res 1
stat res 1
data res 1
x res 1
y res 1
t res 1
_head res 1 'write-only
_present res 1 'write-only
_states res 8 'write-only
_dpin res 1 'read-only at start
_cpin res 1 'read-only at start
_locks res 1 'read-only at start
_auto res 1 'read-only at start
''
''
'' _________
'' Key Codes
''
'' 00..DF = keypress and keystate
'' E0..FF = keystate only
''
''
'' 09 Tab
'' 0D Enter
'' 20 Space
'' 21 !
'' 22 "
'' 23 #
'' 24 $
'' 25 %
'' 26 &
'' 27 '
'' 28 (
'' 29 )
'' 2A *
'' 2B +
'' 2C ,
'' 2D -
'' 2E .
'' 2F /
'' 30 0..9
'' 3A :
'' 3B ;
'' 3C <
'' 3D =
'' 3E >
'' 3F ?
'' 40 @
'' 41..5A A..Z
'' 5B [
'' 5C \
'' 5D ]
'' 5E ^
'' 5F _
'' 60 `
'' 61..7A a..z
'' 7B {
'' 7C |
'' 7D }
'' 7E ~
''
'' 80-BF (future international character support)
''
'' C0 Left Arrow
'' C1 Right Arrow
'' C2 Up Arrow
'' C3 Down Arrow
'' C4 Home
'' C5 End
'' C6 Page Up
'' C7 Page Down
'' C8 Backspace
'' C9 Delete
'' CA Insert
'' CB Esc
'' CC Apps
'' CD Power
'' CE Sleep
'' CF Wakeup
''
'' D0..DB F1..F12
'' DC Print Screen
'' DD Scroll Lock
'' DE Caps Lock
'' DF Num Lock
''
'' E0..E9 Keypad 0..9
'' EA Keypad .
'' EB Keypad Enter
'' EC Keypad +
'' ED Keypad -
'' EE Keypad *
'' EF Keypad /
''
'' F0 Left Shift
'' F1 Right Shift
'' F2 Left Ctrl
'' F3 Right Ctrl
'' F4 Left Alt
'' F5 Right Alt
'' F6 Left Win
'' F7 Right Win
''
'' FD Scroll Lock State
'' FE Caps Lock State
'' FF Num Lock State
''
'' +100 if Shift
'' +200 if Ctrl
'' +400 if Alt
'' +800 if Win
''
'' eg. Ctrl-Alt-Delete = $6C9
''
''
'' Note: Driver will buffer up to 15 keystrokes, then ignore overflow.
{{
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ TERMS OF USE: MIT License │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation │
│files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, │
│modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software│
│is furnished to do so, subject to the following conditions: │
│ │
│The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.│
│ │
│THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE │
│WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR │
│COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, │
│ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
}}
Attached is a modified version of Kyedos - lots of redundant code but it is a useful skeleton with keyboard, display and serial drivers all in one program.
And yes, it now compiles as a smaller program. So this process can be repeated for all the other objects - display, SD driver, serial driver etc.
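The "ptr + n*4" and long[base][offset] replacements used in the keyboard overlay's start method are easier to follow with named offsets. The sketch below assumes the resident VAR order shown earlier (keyboard_cog, par_tail, par_head, par_present, par_states[8], par_keys[8]). The K_ constants and the two methods are hypothetical and not part of the posted code.

```spin
CON
  ' Long offsets from keyptr (the value returned by Store_Parameters)
  K_COG     = 0                       ' keyboard_cog
  K_TAIL    = 1                       ' par_tail
  K_HEAD    = 2                       ' par_head
  K_PRESENT = 3                       ' par_present
  K_STATES  = 4                       ' par_states[8]
  K_KEYS    = 12                      ' par_keys[8]

PUB KeyReady(keyptr)
'' Overlay-side equivalent of: par_tail <> par_head
  return long[keyptr][K_TAIL] <> long[keyptr][K_HEAD]

PUB ParAddr(keyptr)
'' Overlay-side equivalent of: @par_tail (the address handed to cognew)
  return keyptr + K_TAIL * 4
```

With these names, the longfill(keyptr+4, 0, 11) in the overlay clears offsets K_TAIL through K_STATES+7 and stops just short of K_KEYS, which is why the parameters stored in par_keys survive.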
So I guess the only big decision is what the dummy format should be. One variable? 5 Variables?
What is a reasonable number to choose?
I'm kind of inclined to just stick with one, using the same logic that you only need to have one PAR passed to a cog, and if you want more, then pass a pointer to an array.
I think I'd be inclined to use two. One pointing to the shared longs and the other to shared bytes. I don't use words often enough to worry about them.
One variable would still work since the address of the byte array could be passed by using an element in the long array to hold the address but the code gets pretty ugly.
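A minimal sketch of that single-parameter variant is below, assuming element 0 of the long array is reserved for the address of the byte array. The names shared, text, Pack and FirstChar are hypothetical. Pack would live in the resident program and FirstChar in the overlay; they are shown in one object only to keep the example short.

```spin
VAR
  long shared[4]                      ' shared longs; element 0 holds the address of the byte array
  byte text[64]                       ' shared bytes

PUB Pack
'' Resident side: store the byte array address in the long array and
'' return the one pointer that gets passed to the overlay.
  shared[0] := @text
  return @shared

PUB FirstChar(ptr)
'' Overlay side: fetch the byte array address from long 0, then index into it.
  return byte[long[ptr][0]][0]
```

The double indirection in FirstChar is the "pretty ugly" part; copying long[ptr][0] into a local variable once at the top of the overlay keeps the rest of the code readable.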
I haven't gotten back to my large projects to try this yet but I'm very excited about this development.
I've been watching this thread for a while and it's very promising. I have a large project now that is at 95% ram usage and I still need to squeeze in SD card drivers. Sounds like there could be a lot of merit to this idea. About half of my program is initialization methods that only get called once, PASM, or strings that could be offloaded to SD. Please keep us up to date on how things go!
*Edit*
After some careful consideration, I believe one parameter is the way to go. I'm setting to work to try this for myself. Wish I could share my results.
The VGA code is proving a bit challenging. It uses 2 cogs and it is behaving as if the pointer to the font for the second cog is not working.
The process of creating an overlay involves removing all local VAR variables in the overlay part as these will be destroyed when the overlay is unloaded, and instead the VAR values are stored in the resident part of the code. Next, if more than one variable is passed to a PUB, the code is changed so that only one variable is passed.
The VGA code splits rather nicely into Resident and Overlay, where Resident is existing drivers like Vince Briel/Jeff Ledger's VT100 code, and the Overlay is Chip's original code. So in the VT100 code, instead of passing 5 values to Chip's code, we add a little PUB to collect those values and return a pointer to the array. Something like
VAR
  long Parameters[5]
PUB StoreParameters(BasePin) ' store parameters in an array so only need to pass one value to the overlay
  Parameters[0] := BasePin
  Parameters[1] := @screen
  Parameters[2] := @colors
  Parameters[3] := @cursor
  Parameters[4] := @sync
  return @Parameters ' return location of this array
Which would be sort of pointless in a normal program, but now we can work with one parameter.
Here is the clever bit for debugging. All the experiments like the one above can be done *before* an overlay is created. So it is possible to work through code and quickly test things. Overlays take a bit longer to debug as every change involves removing the SD card and copying the new binary over.
So the aim is to reduce the Overlay part of the program to its bare minimum, but still debug with this code. When it appears to all be working, do an F8 and save the file as a .bin file on the SD card.
Within the main program, it is now possible to comment and uncomment just two lines to swap back and forth between the overlay version, and the previous 'one single program' version
'CallOverlay(string("VGAovl.bin"),result) ' load two cogs with VGA driver and pass location of the array
VT100.start(result)' start the vga driver
' use one or the other of the above two lines.
But the VGA object is a bit of a challenge. Below is a little screenshot, and I've worked out that the first cog is doing the bottom half of the letters and the second cog does the top half. So the text is sort of recognisable. By tinkering with values, almost every change stops the display completely, so there is lots that is working. The second cog is clearly running, and its colors are correct. The garbage at the top of the letters is consistent - look at the "a" on the screenshot. It is as if the font didn't load properly into the second cog.
The VGA object is a bit tricky, as it is using some self modifying code. According to Chip's comments, values are being "implanted" into the DAT section before the cogs are launched. Lines like this
font_base := @font
are loading one area of the cog DAT with a pointer to another area of the cog DAT. I think this may be where things are going wrong - the first one loads but not the second one.
Then there is this little bit of code which might be relevant.
CON
#1, scanbuff[128], scancode[128*2-1+3], maincode 'enumerate COG RAM usage
main_size = $1F0 - maincode 'size of main program
hv_inactive = (hn << 1 + vn) * $0101 'H,V inactive states
For an overlay, the 'size of main program' is only about 2K, and that constant won't be correct for the real program. So maybe that is truncating the font section, as the font part of the DAT is the last bit.
The pasm code is even modifying itself, and this line depends on main_size above, so this could be misreading the size of the program and chopping off the font?
'Move main program into maincode area
:move mov $1EF,main_begin+main_size-1
I did have to remove the Stop part of the overlay, but that should not matter as it is unlikely the cogs are going to be stopped this way (for an overlay program, the main program would take over tracking which cogs are running). That shouldn't affect anything, and indeed, the code all works fine when the same code is run in non-overlay mode.
The cog loader is below. There was a longmove that I changed to three separate moves, as the longs are being defined differently at the beginning of the PUB but that didn't make any difference. Probably could go back to the old code (though on the other hand, it doesn't matter if an overlay is a little too large as overlays are not part of the main program).
  'implant pin settings
  reg_vcfg := $200000FF + (BasePin & %111000) << 6
  i := $FF << (BasePin & %011000)
  j := BasePin & %100000 == 0
  reg_dira := i & j
  reg_dirb := i & !j

  'implant CNT value to sync COGs to
  sync_cnt := cnt + $10000

  'implant pointers
  'longmove(@screen_base, @ScreenPtr, 3)
  screen_base := ScreenPtr ' replace above line with these three lines, debugging overlays, doesn't change anything
  color_base := ColorPtr
  cursor_base := CursorPtr
  font_base := @font

  'implant unique settings and launch first COG
  vf_lines.byte := vf
  vb_lines.byte := vb
  font_third := 1
  'cog[1] := cognew(@d0, SyncPtr) + 1
  cognew(@d0, SyncPtr) ' replaces line above

  'allow time for first COG to launch
  waitcnt($2000 + cnt)

  'differentiate settings and launch second COG
  vf_lines.byte := vf+4
  vb_lines.byte := vb-4
  font_third := 0
  ' cog[0] := cognew(@d0, SyncPtr) + 1
  cognew(@d0, SyncPtr) ' replaces line above
Attached is the complete program. Board has SD, Keyboard and VGA so with a few changes to pin numbers should work on many propeller boards. The program that is not working is VGA80x40_Overlay.spin
Any help debugging this would be most appreciated!
addit: those in the cold Northern Hemisphere might be interested to know that it has just hit 120F in Dr_Acula's shed. Time to jump in the pool, methinks.
The HiRes driver uses an 8x12 font. Each cog fetches and emits 4 lines. IOW either cog will work on the top/middle/bottom part of each row over time. What you see is simply a corruption of the top third of the font data which is expected to be a fixed 1.5K array in hub RAM. Since you load the binary on the stack any overlay being loaded at a later time will potentially destroy the font data (e.g. KeyOver.bin).
So now the keyboard and the VGA display have been split into resident and overlay portions, saving memory each time. Attached zip file.
@kuroneko, did you have a display driver that can do different colors per character? I presume that would take a bit more memory as you have to store the color data as well as the character data, but this overlay technique is freeing up lots of memory.
Attached is the latest code demo.
Also - it was 120F in the shed today, very hot even in the shade and the local wildlife has needed water. We put out a container of water for the kangaroos. Bonus points for spotting the koala in the top right of the photo
@kuroneko, did you have a display driver that can do different colors per character? I presume that would take a bit more memory as you have to store the color data as well as the character data, but this overlay technique is freeing up lots of memory.
See below for what's available (from me). IIRC there is an 80x40 full colour driver out there as well but I don't have a link right now (could be PockeTerm).
Those look great, thanks kuroneko. Are those in the Obex, or if not, on a forum thread?
Looking initially for something that is pretty simple - 80x40, happy to use the internal rom font, 2 colors per cell.
I have split the standard FullDuplexSerial into two parts. They are getting easier now I have a template. Start with a VAR list in the resident part and fill in the values. In the overlay part, replace all variable names with long[pointer][n] and replace all @ references with "pointer +n*4". I copied the VAR section from the resident to the overlay (as comments) as it makes it easier to convert the code. Old code is left as comments so it is easier to debug. This one worked first time!
PUB start(serialptr) ' pass start of VAR list
  ' to access the value in the VAR list, use long[serialptr][n] where n is the number in the list below
  ' to access the location of the variable (ie @), use serialptr +n*4
  '0 long baud
  '1 long rx_head '9 contiguous longs
  '2 long rx_tail
  '3 long tx_head
  '4 long tx_tail
  '5 long rx_pin
  '6 long tx_pin
  '7 long rxtx_mode
  '8 long bit_ticks
  '9 long buffer_ptr
  '10 byte rx_buffer[16] 'transmit and receive buffers
  '26 byte tx_buffer[16] '26 counts bytes past long 10, so its address is serialptr + 4*10 + 16, not long[serialptr][26]
  longfill(serialptr+4*1,0,4) ' longfill(@rx_head, 0, 4)
  ' longmove(@rx_pin, @rxpin, 3) - already done in resident code
  long[serialptr][8] := clkfreq / long[serialptr][0] ' bit_ticks := clkfreq / baudrate
  long[serialptr][9] := serialptr + 4*10 ' buffer_ptr := @rx_buffer
  cognew(@entry,serialptr+4*1) ' cognew(@entry, @rx_head)
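For completeness, the resident side that goes with the overlay start above might look something like this. It is a rough sketch, not the code from the attached zip: the method name Prepare is made up for illustration, and the VAR order is simply copied from the numbered list in the comments.

```spin
VAR
  long  baud                    ' 0  (extra long added for the overlay version)
  long  rx_head                 ' 1
  long  rx_tail                 ' 2
  long  tx_head                 ' 3
  long  tx_tail                 ' 4
  long  rx_pin                  ' 5
  long  tx_pin                  ' 6
  long  rxtx_mode               ' 7
  long  bit_ticks               ' 8
  long  buffer_ptr              ' 9
  byte  rx_buffer[16]           ' starts at byte offset 4*10
  byte  tx_buffer[16]

PUB Prepare(rxpin, txpin, mode, baudrate) : ptr
  ' Hypothetical resident-side helper: store what the overlay expects to find
  baud := baudrate
  longmove(@rx_pin, @rxpin, 3)  ' copies rxpin, txpin and mode in one go
  ptr := @baud                  ' the single value handed to the overlay's start
```

The returned pointer is what gets passed as serialptr, so long[serialptr][0] is baud, long[serialptr][5] is rx_pin, and so on.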
So that is now three overlays. The SD card cog might be just a bit hard as one needs to get the SD working in order to load code. Maybe a "reboot and leave cog running" technique? Or maybe a flag on the SD card DAT section, and then recycle this portion of ram for something?
I also need to think of a way to talk to multiple PUBs in a spin overlay. Probably just pass one number, and then decode that number and jump to the appropriate PUB. Need good commenting in the calling routine as a number is a bit vague.
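A minimal sketch of that decode-and-jump idea is below. The command names, the DoStart/DoTx/DoStop methods and the layout of the parameter block (command number in the first long, argument in the second) are all assumptions for illustration; only the single-parameter Main follows the overlay convention used in this thread.

```spin
CON
  #0, CMD_START, CMD_TX, CMD_STOP   ' hypothetical command numbers

PUB Main(ptr)
  ' long[ptr][0] = command number, long[ptr][1] = argument (assumed layout)
  case long[ptr][0]
    CMD_START: result := DoStart(ptr)
    CMD_TX:    result := DoTx(long[ptr][1])
    CMD_STOP:  result := DoStop

PRI DoStart(ptr)
  return ptr                    ' placeholder body

PRI DoTx(ch)
  return ch                     ' placeholder body

PRI DoStop
  return 0                      ' placeholder body
```

Putting the same CON names in the calling program takes care of the "a number is a bit vague" problem, since the caller can write CMD_TX rather than 1.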
The ultimate holy grail is a huge spin overlay with some sort of caching so the programmer doesn't need to think about loading and unloading overlays - just call the routine.
What I think ought to be possible is to load and reload those drivers on the fly. That means you could boot into text mode, then load graphics, then go back to text at a different resolution. It would be great to have a package of all of those drivers.
Dr_Acula, it looks like you are making great progress on Spin overlays. One thing I should mention is that my implementation creates and zeroes the VAR section every time an overlay is loaded. It would be possible to create a static area where all of the overlay VARs are kept so that they don't get reinitialized.
Another approach would be to use the stub-objects I mentioned earlier. The VAR data could be associated with the stub-objects, and the stub-objects would ensure that their overlay is in memory, and jump to the corresponding methods in the overlay. I'll create an example of this when I have a chance.
Spin calling reloadable precompiled Spin objects ought to be similar.
Going back to a comment stevenmess made on 29/1 - I agree that this is the sort of thing that would be much easier to work with if it was part of an IDE. There is an open source IDE project thread at the moment that might be useful here.
Comments
This idea could be extended even further where the stubbed-method loads only its method into a cache instead of loading the whole object. However, this would require loading the method plus all the other methods that it might call, so this becomes very complicated.
I like your stub idea, but I agree the above is an issue.
Hmm. Alright, let's take the position that using overlays is not going to be just a matter of dropping in objects from the Obex. Code is going to need to be tweaked to suit the overlay system. An example would be any object using pasm. The current spin system is you just include the pasm, load it, and up to 2k of hub space gets wasted and never used after that. A smarter way would be to load the pasm into a cog, note the PAR list location, then reload the spin part separately. And that spin part might get loaded and unloaded as an overlay and the only part that stays resident is the PAR section.
So... thinking about it this way, if things need to be optimised for overlays, then at the same time they can be optimised for stubs, including things like removing nested objects within that stub and putting everything that is needed in one package.
Ah, that leads to a simple cache driver. That might make it easier for the programmer as it is not so necessary to actively think about which overlay is in memory. So you could allocate some hub memory for overlays, if it is needed, load it in and then leave it there. The cache fills up with lots of different sized overlays. Each time an overlay is called, that becomes overlay #1, and all the other overlays in memory get incremented by 1. When memory is full and a new overlay is needed, the highest number overlay is removed from memory until there is enough free space.
There are lots of other cache algorithms too.
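The numbering scheme described above could be tracked with something like the sketch below. It only covers the bookkeeping, not loading from SD or freeing hub memory, and MAX_SLOTS, Touch and Oldest are names invented for the example. One small variation from the description: only overlays that were more recent than the one being called get bumped, which keeps the numbers running 1, 2, 3... without gaps.

```spin
CON
  MAX_SLOTS = 8                 ' hypothetical limit on overlays held in the cache

VAR
  byte  rank[MAX_SLOTS]         ' 0 = slot empty, 1 = most recently used

PUB Touch(slot) | i, old
  ' Make this slot overlay #1 and push the more recent ones down a place
  old := rank[slot]
  repeat i from 0 to MAX_SLOTS - 1
    if rank[i] <> 0 and (old == 0 or rank[i] < old)
      rank[i]++
  rank[slot] := 1

PUB Oldest : slot | i, worst
  ' Return the slot with the highest number, or -1 if the cache is empty
  slot := -1
  worst := 0
  repeat i from 0 to MAX_SLOTS - 1
    if rank[i] > worst
      worst := rank[i]
      slot := i
```

When space runs out, the loader would call Oldest, release that slot (set its rank to 0) and repeat until the new overlay fits.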
That could work really well for something like a set of string algorithms. Each of these tends to fit into one PUB and tends to be self contained and not calling other PUBs very much, and if it is, ok, a bit of code is being duplicated here and there. But if you are processing a lot of text, then with a cache the string routines you use a lot would stay in memory, and the ones you don't use much are slower.
What would a stub overlay look like?
Could it be as simple as a PUB var1,var2,var3 etc? Do you need to define how many variables are passed? Would returning variables work the same way as normal?
To test the program you must first write the binary files test1.binary, test2.binary and test3.binary to an SD card. Modify the parameters in the call to mount_explicit to match the pin numbers for your SD card. The program is currently set up for a C3 card. Build and run the overlay.spin program, and it should print the following lines.
Wow!
I think this just might be what the doctor ordered.
I'm very excited about this. Thank you!
addit:
this is brilliant. I'm not entirely sure what some of the code does, but it certainly is working
Where is the overlay being loaded - ie where is the stack pointer? Is that at the end of the program?
In my slightly modified version, the stack pointer was set to $1C4C. This is just a couple of lines into the purple "Free/Stack" area when looking at the F8 hex.
Thinking about caching code I see there is a line
So it could be possible to create an array with the size and name and location of overlays, and then have some spin code manage a simple cache.
Kye's SD driver code works differently but the core of the program is similar
This is incredible work Dave. A simple bit of code, but amazing possibilities for huge programs!
To make this really useful you need a little bit of help from the compiler. I started modifying the one in sphinx but never got the changes to the linker made and was also having problems with the sd card getting corrupted.
The old thread is here (http://forums.parallax.com/archive/index.php/t-100406.html) but it looks like the attachments have disappeared. I can get them off my other computer tomorrow if you are interested.
The attachments are there if you "convert to a full view" (or something like that) in the upper left after the page is served.
1) in dummy.spin, replace Pub Main with Pub Main(n)
2) in test1.spin, add a variable eg and repeat for test2.spin and test3.spin
3) in the main program, pass a value eg
So I guess the only big decision is what the dummy format should be. One variable? 5 Variables?
What is a reasonable number to choose?
I'm kind of inclined to just stick with one, using the same logic that you only need to have one PAR passed to a cog, and if you want more, then pass a pointer to an array.
The demo part of the code is this. The keyboard pasm startup is run in between other overlays and each overlay overwrites the previous one.
The Resident part of the keyboard code is just the spin methods
and the Overlay part is the pasm code. The start method needed to be modified (and would be in other objects). A start method usually points to a VAR list, but with overlays only one variable is passed, so this is a pointer to the start of a PAR list, and one has to go through and selectively replace @ symbols with "ptr + n*4". Sometimes a long[base][offset] is needed instead of a VAR variable. And sometimes the original code clears all the data which you have passed, so that needs to be modified. So it is not impossible, but one does need high caffeine levels!
Attached is a modified version of Kyedos - lots of redundant code but it is a useful skeleton with keyboard, display and serial drivers all in one program.
And yes, it now compiles as a smaller program. So this process can be repeated for all the other objects - display, SD driver, serial driver etc.
I think I'd be inclined to use two. One pointing to the shared longs and the other to shared bytes. I don't use words often enough to worry about them.
One variable would still work since the address of the byte array could be passed by using an element in the long array to hold the address but the code gets pretty ugly.
I haven't gotten back to my large projects to try this yet but I'm very excited about this development.
*Edit*
After some careful consideration.. I believe the 1 parameter is the way to go. I'm setting to work to try this for myself. Wish I could share my results
The VGA code is proving a bit challenging. It uses 2 cogs and it is behaving as if the pointer to the font for the second cog is not working.
The process of creating an overlay involves removing all local VAR variables in the overlay part as these will be destroyed when the overlay is unloaded, and instead the VAR values are stored in the resident part of the code. Next, if more than one variable is passed to a PUB, the code is changed so that only one variable is passed.
The VGA code splits rather nicely into Resident and Overlay, where Resident is existing drivers like Vince Briel/Jeff Ledger's VT100 code, and the Overlay is Chip's original code. So in the VT100 code, instead of passing 5 values to Chip's code, we add a little PUB to collect those values and return a pointer to the array. Something like
Which would be sort of pointless in a normal program, but now we can work with one parameter.
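A minimal sketch of such a collector is below. This is not the code from the original post; the array name and method name are invented, and the five parameters are the ones Chip's start method takes.

```spin
VAR
  long  vgaParams[5]            ' resident copy of the five start parameters

PUB CollectParams(basePin, screenPtr, colorPtr, cursorPtr, syncPtr) : ptr
  ' Pack the five values into consecutive longs and hand back one pointer
  vgaParams[0] := basePin
  vgaParams[1] := screenPtr
  vgaParams[2] := colorPtr
  vgaParams[3] := cursorPtr
  vgaParams[4] := syncPtr
  ptr := @vgaParams
```

On the overlay side the values would then be read back as long[ptr][0] for BasePin, long[ptr][1] for ScreenPtr and so on.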
Here is the clever bit for debugging. All the experiments like the one above can be done *before* an overlay is created. So it is possible to work through code and quickly test things. Overlays take a bit longer to debug as every change involves removing the SD card and copying it over.
So the aim is to reduce the Overlay part of the program to its bare minimum, but still debug with this code. When it appears to all be working, do an F8 and save the file as a .bin file on the SD card.
Within the main program, it is now possible to comment and uncomment just two lines to swap back and forth between the overlay version, and the previous 'one single program' version
Yes - if I don't load any more overlays then the text is perfect.
I'll modify it so the font data is in the resident section.
Thanks++
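One way the font move might look, as a sketch only: the font table sits in the resident object's DAT and its address travels in the parameter block alongside the other values. The array name, the extra sixth entry and SetFontPtr are assumptions, and the zero-filled table is just a placeholder for the real font data.

```spin
VAR
  long  vgaParams[6]            ' entries 0..4 as before, entry 5 added for the font (assumed layout)

PUB SetFontPtr : ptr
  vgaParams[5] := @font         ' hub address inside the resident image
  ptr := @vgaParams

DAT
font          long      0[384]  ' placeholder for the 1.5K font table (384 longs)

' In the overlay, the implant line would then become something like:
'   font_base := long[ptr][5]   ' instead of font_base := @font
```

Because the font now lives in the resident image rather than in the area where overlays are loaded, a later overlay such as KeyOver.bin should no longer be able to overwrite it while the VGA cogs are still reading from it.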