Cogjects - load cog code from SD and save hub ram
Dr_Acula
Attached is an early version of a Cogject loader. Take a typical object from the Obex, split it into its Spin and PASM components, compile the PASM part, rename the binary file as a .cog file, copy it to an SD card and then load it through a common location in hub ram.
Why bother? Well, normally every cog that is loaded leaves its image in hub ram. In a worst case scenario, you might load 7 cogs each with 2k of code making a total of 14k. That is a significant part of the total 32k in hub.
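The worst case above is easy to check with a few lines of Python on a PC. This is only the arithmetic from the post; the 2k per cog figure is the post's round number, not a measured size.

```python
# Hub RAM budget using the round numbers quoted in the post.
HUB_RAM_BYTES = 32 * 1024     # total hub RAM on the Propeller
COG_IMAGE_BYTES = 2 * 1024    # approximate size of one cog image (post's figure)
COGS_LOADED = 7               # worst case described in the post

images_total = COGS_LOADED * COG_IMAGE_BYTES
fraction = images_total / HUB_RAM_BYTES

print(f"{images_total // 1024}k of {HUB_RAM_BYTES // 1024}k "
      f"({fraction:.1%}) is taken by cog images")
```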
With the advent of many boards with sd cards, as well as fast sd card code, it is now possible to load cogs from the sd card.
It ends up pretty easy to use once the code has been written. Just two lines of spin code:
ReadCogFile(string("Serial.Cog"))
cognew(@cogarray+24, serial.start(31,30,0,115200))
The first line reads the file into a cog array. And the second line starts the cog using parameters from the standard Start method.
Attached is some demo code. I have the keyboard and the serial driver working - so you just need to copy serial.cog and keyboard.cog to the sd card.
At the moment this is using the VGA driver from Kyedos as the display. The next challenge is to put the video driver into a cog as well. And a TV driver. And a graphics driver, so you can boot up with text and then change to graphics on the fly...
The underlying SD driver is Kye's SD driver from a few months back.
We can save a lot of space on this program. Compile it on BST with the "remove unused methods" checkbox ticked and it is a lot smaller as almost all the sd driver code is not used.
Also, it may be possible to recycle the SD cog code space as the generic cog array.
Building .cog files depends on the style of the original author. Chip Gracey's original objects tend to group variables and pass a pointer to the start of the first variable. Kye uses a method of poking new values into the hex code, and we can do this too after the .cog file has been loaded off the sd card. This is thanks to an idea from Ariba: you create a variable length dummy array in the middle of the cog code so that the program fills from the bottom up and the variables fill from the top down.
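One way to picture Ariba's idea is as a layout calculation. The sketch below is Python run on a PC, not Spin, and the function name and the example sizes are made up for illustration. It assumes the image is padded to the full 496 longs that a cog start copies in, so the variables sit at the same index however long the code is.

```python
COG_LONGS = 496  # longs copied into a cog when it starts

def layout(code_longs, var_longs, total=COG_LONGS):
    """Return (pad_longs, first_var_index) with code at the bottom
    of the image and the variables packed against the top."""
    pad = total - code_longs - var_longs
    if pad < 0:
        raise ValueError("code and variables do not fit in one cog")
    return pad, total - var_longs

# The dummy array shrinks as the code grows; the variables do not move.
print(layout(300, 4))
print(layout(310, 4))
```

The point is that whoever pokes values into the loaded image only needs to know how many variables there are, not how long the code is.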
Another advantage of cogjects is you can use the same ones for other languages, eg Basic and C. Indeed, these two cogjects were originally written and tested in Catalina before being used here in Spin.
I hope others might find this useful. I'd like to think we could add some more cogjects to the library fairly quickly.
My next one to do is the standard VGA driver and then the mouse.
Addit April 10th, including a demo program that loads and unloads the following:
1) Serial
2) Keyboard
3) Mouse
4) VGA 80x40 text
5) VGA 100x50 text
6) VGA 128x64 text
7) VGA 160x120 with 64 color graphics
8) VGA 320x240 with 4 color graphics
9) External ram driver
Addit - the April 10th zip still was running the cog from the .DAT section. Deleted this in the April 11 version and fixed a minor bug in the .cog code (my apologies to the 2 people who have downloaded the April 10th version)
Comments
Though nice code.
Yes, once all the cogs are loaded you can overwrite the Hub Mem their code used. But there are two problems with this:
1. The memory is not necessarily contiguous - you could be left with 2k blocks sprinkled throughout your memory map.
2. You cannot easily re-use the space freed up for more code. This means you are effectively limited to 16k of code space.
Ross.
The 2k blocks of code left over from loading cogs end up scattered all over hub memory. As far as I know, recycling that memory has only ever really been done with the CP/M emulation, and even then it only used some of the potential memory.
And as Ross says, you can't put any program code in that memory. Only data.
So in practice, you are coding away, compiling from time to time and getting more stressed as you can see the available memory running out. What do you do when it does run out? Move to "Big Spin"? (almost ready but not quite). Move to C or Basic running from external memory?
And I'll add a third scenario where you have a nifty program with an SD card driver, video text driver (3 cogs), mouse, keyboard, serial driver and all the cogs are now used up. And you want to boot up with that, have the user enter some command "run my cool game", and then run a graphics intensive program that uses a 5 cog video driver. And you are almost out of RAM.
BTW David, you are a legend for taking on the challenge of better TV graphics using external memory!
I second what RossH said.
I just had an idea for poking data and knowing where to poke without placing all the data at the beginning. If the first instruction is a nop with the start of the variables/constants to be poked, then it is easy to pick up.
I am not sure if this works correctly...
Also there is something Kye mentioned that is very interesting - if you take some pasm code and hit F8 or F9 and then place the cursor over a variable, down the bottom of the screen it tells you in "OBJ" where that variable ends up.
This assumes that a piece of pasm code is 'stable' and unlikely to change, but there are many objects around where that would be true.
I've been deep inside Chip's VGA driver code and he uses the term "implant" to describe changing a variable that is somewhere in the middle of code. I may end up noting where the 4 variables end up and adjusting those locations directly.
I suspect that a variety of techniques will be useful depending on the style of the original obex code.
It would be much simpler to modify an object if all that was required was a single nop instruction at the start.
However, one has to be careful that the codespace is not all used, and also that the position of the start of the code is not used. I say that because in my 1pin TV driver, the font is stored at cog $000. I have a temporary jump instruction to the code which then replaces the first font long with the correct value. I did it this way because it reduced the instructions in the font renderer to give more chars/line for each clock frequency.
Another advantage to separating the pasm code is the ability to then have the various interfaces (e.g. spin, basic, c, and of course pasm) held separately and compiled within the main program.
Yes, it is cunning things like that where maybe we might have to go back to the original author for advice.
In the process of looking at the VGA driver code there is this
and I realise that if you are going to be 'implanting' values (Chip's term, and used by Kye and I'm sure others as well), then the object is going to need access to the cog data.
As such, I have changed the start routines, so that instead of the cognew being in the Main routine it is now in the object routine. So startup is this
and an example of the cog load routine
which ends up fairly similar to the original code, just that every "start" method is now going to need one variable added = cogarray.
I think that the VGA startup routine now has enough information to modify the cog code prior to loading.
@cluso99, I took a look at BST but I could not see how to get the printout of the variable locations. Is that in a compile menu or something?
Addit: another little edit, now discarding the first 24 bytes so that the array is the same as the one used in C, ie the first cog element is the first one.
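As a rough model of that change, here is the discard step in Python on a PC. The function name is hypothetical and the real loader is Spin reading from the sd card. The only fact taken from the thread is that the first 24 ($18) bytes of the compiled file are dropped so the array starts at the first cog long.

```python
HEADER_BYTES = 0x18  # 24 bytes ahead of the cog code, per the post

def strip_header(binary: bytes, header_bytes: int = HEADER_BYTES) -> bytes:
    """Return the file contents with the leading header removed,
    so that index 0 is the first byte of the first cog long."""
    if len(binary) < header_bytes:
        raise ValueError("file is shorter than the header")
    return binary[header_bytes:]

# Dummy 32-byte "file": byte 24 becomes element 0 of the cog array.
cogarray = strip_header(bytes(range(32)))
print(cogarray[0], len(cogarray))
```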
This brings up the next question as to what those variables at the bottom of the screen mean.
Take this little bit of code
If I do an F9 on that and put the cursor on myvariable, I get
DAT myvariable = OBJ[$0008] COG[$000]
from that I would assume that COG is the location of the variable, not OBJ.
Further down the program though, OBJ increments a lot faster than COG. OBJ seems to be incrementing too fast (it is $0008 even on the first long) but COG seems to be incrementing too slowly. Which one is pointing to the location of that variable in the compiled code?
More experiments, the location of a variable seems to be OBJ + $10. This seems a bit odd as I would have expected either no offset, or an offset of $18 (24).
Any advice here would be most appreciated!
Out of curiosity, would it be easier to have 2 separate programs? The first one would have all the drivers to load the cogs and would then bootstrap the second program with the actual program code... (Maybe reserving space at the beginning or end of memory for cog communications.)
As for your example above, cog $000 would be the resulting address in cog.
I will leave others to comment about the object locations after compile.
Rather than requiring a spin stub to 'implant' the values, I wonder if
long variables | varlength << 9
may be better. This way you would know the length to be copied. In your example above, obviously the array would have to be preset. Your loader would then copy the required variables before performing the cognew function.
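To make the packing concrete, here is how a loader might take that long apart. This is a Python sketch on a PC with hypothetical function names. It assumes both fields are 9 bits wide (cog addresses are 9 bits), which the post implies with the << 9 but does not spell out.

```python
FIELD_BITS = 9
FIELD_MASK = (1 << FIELD_BITS) - 1   # $1FF

def pack_header(variables: int, varlength: int) -> int:
    """Build the long: variable start in bits 0..8, length in bits 9..17."""
    if not (0 <= variables <= FIELD_MASK and 0 <= varlength <= FIELD_MASK):
        raise ValueError("both fields must fit in 9 bits")
    return variables | (varlength << FIELD_BITS)

def unpack_header(header: int):
    """Recover (variables, varlength) from the packed long."""
    return header & FIELD_MASK, (header >> FIELD_BITS) & FIELD_MASK

header = pack_header(0x1E0, 4)       # example values only
print(hex(header), unpack_header(header))
```

With both values below 512, nothing is set above bit 17, which is what lets the long double as a do-nothing first instruction.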
Just trying to find a simple way where the pasm object is the only portion required. This way, the pasm object would be stored identically on the SD card no matter what high level code is used. Then we could just have a whole set of standard objects on the sd card. Perhaps these could then be in a subdirectory called PASMOBJ or something similar.
You are definitely on to a good thing here.
A simpler approach would be to just group all the cog startup objects together in memory. The top object would call all the start routines, which would re-use this chunk of memory. A simple memory allocation scheme could be used, which uses two pointers -- one that points to the beginning of the first startup object, and a second pointer that points to the end of the last startup object. This way all of the object's memory could be re-used including the Spin code and method table. The cogs would need to wait for a start signal before they write into their hub memory so they don't overwrite code that hadn't been executed yet.
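A minimal sketch of that two-pointer scheme, written in Python on a PC just to show the bookkeeping. The class name and the addresses are invented for the example, and rounding each request up to a whole long is an added assumption.

```python
class StartupRegion:
    """Hands out hub addresses between the start of the first startup
    object and the end of the last one."""

    def __init__(self, start: int, end: int):
        if end < start:
            raise ValueError("end must not be below start")
        self.next_free = start   # first pointer: next unused byte
        self.end = end           # second pointer: one past the region

    def alloc(self, nbytes: int) -> int:
        size = (nbytes + 3) & ~3           # round up to a whole long
        if self.next_free + size > self.end:
            raise MemoryError("startup region exhausted")
        addr = self.next_free
        self.next_free += size
        return addr

region = StartupRegion(0x1000, 0x1800)     # hypothetical 2k region
first = region.alloc(10)
second = region.alloc(4)
print(hex(first), hex(second), hex(region.next_free))
```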
Dave
I'd need to think about that some more. It depends on the style of the programmer who wrote the object. If they used the PAR method, and all the variables are contiguous, and there are not many variables, then you could allocate fixed areas in hub at the top - x bytes for a keyboard buffer, y bytes for serial etc. If you allocated them in the same order then the second program would know where these buffers are and would be able to use them.
What I have some reservations about is saying "hub memory area x is always to be allocated for keyboard" because everyone's program is going to be different. But what you could have is a starter program that loads keyboard first, then serial, and if you start at the top and work down then the second program knows where they are. Your program might load keyboard first then serial, and mine might load serial then keyboard, so this keeps it flexible.
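Here is roughly what "start at the top and work down" could look like, as a Python sketch on a PC. The buffer sizes are placeholders, not the real sizes any driver needs, and the function name is made up.

```python
HUB_TOP = 0x8000  # one past the last byte of the 32k hub RAM

def allocate_top_down(drivers, top=HUB_TOP):
    """drivers is a list of (name, buffer_bytes) in load order.
    Returns {name: base_address}, working down from the top of hub."""
    bases = {}
    addr = top
    for name, size in drivers:
        addr -= (size + 3) & ~3        # keep every buffer long-aligned
        if addr < 0:
            raise MemoryError("buffers do not fit in hub RAM")
        bases[name] = addr
    return bases

mine = allocate_top_down([("keyboard", 44), ("serial", 64)])
yours = allocate_top_down([("serial", 64), ("keyboard", 44)])
print({k: hex(v) for k, v in mine.items()})
print({k: hex(v) for k, v in yours.items()})
```

The second program only has to repeat the same list in the same order to find every buffer, so nothing needs a fixed address.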
Another idea is to strip down the sd card code to the absolute minimum. There is a lot of functionality in Kye's code, but also a lot of things can be done in several ways and maybe all that code is not needed.
One thing I would like to do is to be able to present a shell of a program that has all the basics - display, keyboard, mouse, serial, sd card, but still has plenty of room free for whatever the user wants to code.
Also I want to document some of the technical aspects of translating objects because I think these techniques are likely to be able to be reused.
In Spin, all one needs is an external memory driver, and these are fairly straightforward for the Dracblade, C3 and Jazzed's external ram.
If this system becomes more popular, one could also think about storing cogjects in eeprom.
I think I have cracked the code for working out the offsets for storing bytes. Let's talk in hex for a bit!
Compile some typical cog code and put the cursor over a variable and you might get a value of 0x8C. Look at the source code using F8 and it actually is 0x10 higher than this, at 0x9C. But the first 0x18 bytes are discarded, so this is the same as adding 0x10 and then subtracting 0x18, which is the same as subtracting 0x8.
We discard the first 0x18 bytes of all cogs when loading them. So the answer to locating variables is to do an F9, place the cursor over the variable and then subtract 0x8 from that value. This gives the real location in a piece of cog code that starts at ORG 0.
And to test this is correct, take a dummy piece of code:
ORG 0
myvariable long $AAAAAAAA
Hit F9 and you get a value of $0008. Subtract 8 and this gives zero, which fits with it being the first long.
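The same rule as a small Python helper for working out offsets on a PC. The function name is made up; the $10 and $18 constants are the ones from the posts above.

```python
OBJ_TO_FILE = 0x10    # F9 "OBJ" value sits this far below the position in the file
HEADER_BYTES = 0x18   # bytes discarded when the .cog file is loaded

def cogarray_offset(obj_value: int) -> int:
    """Byte offset into the loaded cog array for a variable whose
    F9 "OBJ" value is obj_value (net effect: subtract 8)."""
    offset = obj_value + OBJ_TO_FILE - HEADER_BYTES
    if offset < 0:
        raise ValueError("OBJ value lies inside the discarded header")
    return offset

for obj in (0x08, 0x8C):
    off = cogarray_offset(obj)
    print(hex(obj), "-> byte", hex(off), "long", off // 4)
```

Dividing the byte offset by 4 gives the long index, which should agree with the COG value shown for the same variable.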
Would some kind soul be able to please explain what this line does?
The only place vf_lines is referenced is in the cog code, but it is a line of code, not a variable viz
entire program
p.s. You cannot see the forest for the trees haha.
pps I will be out your way at Easter, so will see if we can catch up. Must be time for another Oz UPE.
yes, I am so deep inside this code and I know that only one mistake and the whole thing won't work. Is this going to be some sort of "bytemove" taking into account the little endian thing?
Comparing with the original code, do you think this is on the right track?
in the main
the calling routine from main
and in the object
I am wondering if you can do something like this...
longmove(@d0 - @screen_base - $8 + @cogarray,@ScreenPtr,3)
You don't have access to any variable names in the cog file any more. All you have is a hex file with a couple of thousand bytes, and that you know they start at "cogarray".
This vga one is pretty messy as there is data being poked to all sorts of locations within the file. Not as simple as the ones that use PAR.
I would think you could insert these two lines at the beginning of the cog code after the d0 line without problems. Then collect these as the offsets into the cogarray rather than using fixed values.
Yes that is what it is. No OBJ, only one PUB with a dummy load of the cog. This is just the bare minimum to get it to compile.
You can compile it with the Proptool or with BSTC.
Yes, that is a great idea. You can store cogjects anywhere you like. External ram. SD card. Maybe even put them in the unused part of an EEPROM.
It is done in the spinc.c module included in Catalina.
It seems to be copying random data (to the correct place). vb is a constant, not a variable - maybe this is the issue.
I'm tracing this through by doing a hex dump of the cog memory after the values have been implanted (by the second cog load), then comparing this with a screen dump from a working vga driver. There are quite a few differences that I'll need to work on.
Code is below, and near the bottom is a comment section with the cogject code above the correct code.
I wonder if I should do something like this instead
Any help would be most appreciated!
Addit - that does seem to improve things - now have garbage on the screen instead of a blank screen.
I now realise that these 3 bytes will be different as they point to different locations in each demo program
longmove(cogarray + $1EC,@ScreenPtr,3) ' replaces line above, screen_base, color_base, cursor_base
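For anyone checking hex dumps on a PC, this is a small Python model of what that longmove does to the image: three longs written little-endian, starting at byte offset $1EC. The function name and the three pointer values are placeholders, since the real ones depend on where the screen, colour and cursor data end up in each program.

```python
import struct

def implant_longs(cogarray: bytearray, byte_offset: int, values) -> None:
    """Overwrite len(values) longs in cogarray starting at byte_offset,
    storing each 32-bit value low byte first (little-endian)."""
    data = struct.pack(f"<{len(values)}I", *values)
    end = byte_offset + len(data)
    if end > len(cogarray):
        raise ValueError("implant runs past the end of the cog image")
    cogarray[byte_offset:end] = data

image = bytearray(496 * 4)                    # empty full-size cog image
implant_longs(image, 0x1EC, [0x4000, 0x5000, 0x5100])
print(image[0x1EC:0x1F8].hex(" "))
```

In the dump the low byte of each pointer comes first, so $4000 shows up as 00 40 00 00.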
Getting close here
Addit: fixed a few lines with the font implants.
Now the screen is the correct color, it is steady but the font data appears to be corrupt
Very close now!
How are you managing to isolate each spin object? I need to check out that code and see how it works.
Meanwhile, I have finally cracked the vga code.
This program now loads up Kye's SD driver, which happens to have cog code that I think takes 494 longs. I need to check if it can be padded to 496. It then recycles this cog space to load the keyboard driver, serial driver and the two cogs driving the vga. So already this is saving 8k of hub ram compared with the standard approach.
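The padding question is just a size check. Below is a Python sketch of it on a PC, assuming the image is zero-filled up to the 496 longs a cog load copies. The function name is made up, and whether the padding is done this way in the attached code is not something the post says.

```python
COG_LONGS = 496
LONG_BYTES = 4

def pad_to_full_cog(image: bytes) -> bytes:
    """Zero-fill a cog image so it occupies exactly 496 longs."""
    full = COG_LONGS * LONG_BYTES
    if len(image) > full:
        raise ValueError("image is larger than one cog")
    return image + bytes(full - len(image))

sd_image = bytes(494 * LONG_BYTES)        # stand-in for a 494-long driver
padded = pad_to_full_cog(sd_image)
print(len(sd_image), "->", len(padded), "bytes")
```

A 494-long driver leaves two longs (8 bytes) to fill before its space can hold any other full cog image.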
See attached.