This is all by way of an experimental hack but it does show what can be done. This works from HUB RAM and needs a bit of effort to be able to use it from EXT memory.
I get here after an hour of gluing things together
Looks like we need some kind of a "propeller.c::cognew" for external memory.
So, I put in a hack for a cogbuffer in debug_zog.spin.
Then added a syscall to get the cogbuffer start address.
Then I made cognewx and copied 520 longs from ZOG space to HUB space.
Then called cognew(buffptr, parmptr) ... no luck.
I print the data copied by longmove and it looks right.
Depends, where did the COG binary blob come from?
Did you byte reverse it and convert it to an object file linked with your C program?
How did you copy it from ext RAM to HUB RAM?
Did you have a look at the Makefile for test_libzog? Specifically the build steps for the FullDuplexSerial and VMCog PASM blobs.
Did you have any luck getting even run_zog to work?
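On the byte-reversal question: the ZPU is big-endian and the Propeller is little-endian, so a PASM blob linked into a ZPU image generally needs each 32-bit long byte-swapped somewhere along the way. A minimal sketch of that swap in C follows. The helper names `swap32` and `swap_longs` are made up for illustration and are not taken from the Zog sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit long. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Byte-reverse a blob of `count` longs in place (hypothetical helper). */
static void swap_longs(uint32_t *blob, size_t count)
{
    assert(blob != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        blob[i] = swap32(blob[i]);
}
```

If the blob was already reversed at build time, swapping again at copy time would undo it, so only one of the two steps should be in play.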
If you look at the cognew syscall handler in zog.spin you will find that it adds zpu_memory_addr to the address of the given PASM blob. This is because the coginit instruction of course wants the actual HUB address not the address of the blob in ZPU memory space.
This works fine when running from HUB but...
In debug_zog.spin the Zog parameter par_zpu_memory_addr is not set correctly for running from external RAM. At least in the version I am looking at now it is always set as:
par_zpu_memory_addr := @zpu_memory
BUT zpu_memory is just a DAT label (non-zero)
Previously zog running from ext RAM has never used its zpu_memory_addr.
Suggest either setting par_zpu_memory_addr to zero in debug_zog.spin or removing the mapping in zog.spin
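To make the mapping concrete, here is a rough C model of the translation described above. It only assumes what the post says (the handler adds zpu_memory_addr to the blob address); the function name `zpu_to_hub` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Translate the address given to cognew into the HUB address that
 * coginit needs, by adding the base of ZPU memory in HUB.
 * When running from external RAM the base would be 0 (the suggested fix),
 * so a pointer that is already a HUB address passes through unchanged. */
static uint32_t zpu_to_hub(uint32_t zpu_addr, uint32_t zpu_memory_addr)
{
    uint32_t hub = zpu_addr + zpu_memory_addr;
    assert(hub < 0x10000u);   /* HUB address space: 32KB RAM + 32KB ROM */
    return hub;
}
```

With a non-zero DAT label as the base, a buffer pointer that was already a real HUB address ends up shifted by that label's address, which would explain coginit loading the wrong data.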
Thanks for that pointer. I've wrapped it with an ifdef USE_HUB_MEMORY. I tried USE_HUB_MEMORY with my program for debug_zog.spin and that doesn't work either. Guess there is something else. run_zog.spin works for test_libzog.bin. Maybe I'll crack it tomorrow.
I presume your PASM blob does not use anything from PAR, as the par pointer parameter in cognewx points to ext memory and won't work.
I would be inclined to temporarily hack a call to break into Zog's SYS_cognew handler and then print the content of the COG buffer in HUB from Spin. Just to check it is all present and correct prior to coginit.
Yes, I did that. The parameter block was bad ... which brings up the next problem.
As far as I can tell, any driver that relies on data from the "business end" of a program needs the business end to pass data to HUB RAM. The only way for that to happen is via rendezvous or a device interface like read, write, ioctl, etc. I thought perhaps a linker section could put data into shared memory space using pointers, but we need to know the address. So, here we are full circle.
At this point, a choice needs to be made on how to proceed. I'll leave that up to you. The easiest but most inelegant route seems to be adding SPIN device drivers with debug_zog or some other file and use the syscall interface. The rendezvous interface could be used, but block devices must be supported otherwise it's useless. I'll re-start my Catalina port while I await your decision.
In my comment to heater:
>> At this point, a choice needs to be made on how to proceed. I'll leave that up to you.
@Heater, any thoughts on how to proceed? I have my hands full right now, but I expect it won't be long before I'm back in ZOG's domain - maybe early December.
Not really. There is a lot going on at work just now.
I've been following your efforts with Catalina and can see how much effort RossH has put into platform support. I've always hoped Zog could get away with something less "all encompassing" whilst at the same time making it easy for interested parties to bolt in whatever they want for themselves. I will never have the time or, frankly, the motivation to provide a Catalina level of platform support.
For example:
1) All PASM running with Zog is tweaked to be usable without Spin. Just having PAR and mailbox/rendezvous interfaces. Currently I only have VMCog and my tweaked FullDuplexSerialPlus as examples.
2) PASM blobs are only started in Cogs from Spin, as in debug_zog. Or from ZPU C code executing from HUB, as is done by run_zog.
3) The mess of mailboxes/rendezvous and any buffer areas are carefully and manually placed into the HUB memory map, somewhere high, where they can be found by any other Spin or C processes running on the Prop, and also by C code running from ext RAM.
4) Zog running C from HUB will need to be able to run further Zogs in HUB. Or run a single Zog in external RAM.
The programming model here is then rather like C on any other uC. The C code runs (HUB or EXT) and has a bunch of registers and memory areas (mailboxes/rendezvous) by which it can communicate with hardware devices (Mostly PASM in COGS.)
This model is especially so for any C code running from EXT RAM.
Note: There are really two programs here:
a) Something like run_zog/debug_zog that runs from EEPROM.
b) Whatever C program is loaded to EXT RAM from SD card or wherever.
In many cases there may only be a)
PROBLEM: As you have noted the C code, as in b), has no idea where those mailboxes/rendezvous areas are. They can be accessed by pointers but what addresses to set them to?
TENTATIVE SOLUTION:
Write the memory map out as a file full of #defines that gets used from C. Create a little utility program that converts that file into a spin file containing the equivalent constant definitions.
Or the other way around, start from a neutral format and generate C and Spin from it.
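A rough sketch of the "neutral format" idea: keep the map as one table and format each entry both as a C #define and as a Spin CON constant. The entry names and addresses below are invented purely for illustration, as are the helper names `emit_c` and `emit_spin`.

```c
#include <assert.h>
#include <stdio.h>

/* One entry of a (hypothetical) shared memory map. */
struct map_entry {
    const char *name;
    unsigned    addr;   /* HUB address */
};

/* Example entries only; real names and addresses would come from the project. */
static const struct map_entry memory_map[] = {
    { "SERIAL_MAILBOX", 0x7F00 },
    { "VMCOG_MAILBOX",  0x7F40 },
};

enum { MAP_COUNT = sizeof memory_map / sizeof memory_map[0] };

/* Format entry i as a C #define; returns the number of characters written. */
static int emit_c(char *out, size_t n, int i)
{
    assert(i >= 0 && i < MAP_COUNT);
    return snprintf(out, n, "#define %s 0x%04X\n",
                    memory_map[i].name, memory_map[i].addr);
}

/* Format entry i as a Spin CON constant; returns the number of characters written. */
static int emit_spin(char *out, size_t n, int i)
{
    assert(i >= 0 && i < MAP_COUNT);
    return snprintf(out, n, "  %s = $%04X\n",
                    memory_map[i].name, memory_map[i].addr);
}
```

A small host-side tool could loop over the table and write the two output files, so the C and Spin sides can never drift apart.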
OK. still thinking on this...
I look forward to trying out Catalina when you have it running in 32MB:)
I'm also thinking it's time to have two separate versions of the Zog interpreter.
Currently when running code from EXT RAM Zog implements a memory mapping which allows it to see into HUB and COG space. So for example:
*some_hub_addr = x
should write to the correct location in HUB.
However when running code from HUB there is no such mapping and VMCOG/SDCACHE is not even used.
*some_ext_addr = x
does not work.
I get the idea that doing this from a single Zog source with much #ifdef is going to be a mess.
We would then have the slight complication that the memory maps would be different for zog_hub and zog_ext versions. That is, for example, address zero is the first byte in ext RAM for zog_ext with HUB appearing up at 32MB. While for zog_hub address zero is the first byte in ZPU space in HUB with ext RAM somewhere higher.
We could make both memory maps common, say:
0000 and up, for 32MB, is HUB space.
32MB up, for another 32MB, is ext RAM
64MB and up is COG space and whatever other devices get attached.
But, that requires that programs built for ext RAM are linked and located at 32MB not 0000 as for HUB programs.
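For illustration, the common map proposed above could be decoded along these lines. This is only a sketch of the proposal as stated (32MB of HUB space, 32MB of ext RAM, then COG space and devices); the names `decode` and `enum region` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MB (1024u * 1024u)

enum region { REGION_HUB, REGION_EXT, REGION_COG_IO };

/* Classify a ZPU address under the proposed common map and return
 * the offset inside the selected region through *offset. */
static enum region decode(uint32_t addr, uint32_t *offset)
{
    assert(offset != 0);
    if (addr < 32u * MB) {            /* 0000 and up: HUB space */
        *offset = addr;
        return REGION_HUB;
    }
    if (addr < 64u * MB) {            /* 32MB and up: ext RAM */
        *offset = addr - 32u * MB;
        return REGION_EXT;
    }
    *offset = addr - 64u * MB;        /* 64MB and up: COG space and devices */
    return REGION_COG_IO;
}
```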
I'm back from my recent foray into Catalina land. This post meanders a little, but the meat of it offers a way for ZOG C device drivers to be used without some of the funky stuff previously discussed.
I fully expect that any C code committed to the OBEX should work with ZOG given a little effort.
>> I've been following your efforts with Catalina and can see how much effort RossH has put into platform support. I've always hoped Zog could get away with something less "all encompassing" whilst at the same time making it easy for interested parties to bolt in whatever they want for themselves. I will never have the time or, frankly, the motivation to provide a Catalina level of platform support.
Adding Catalina support for SDRAM is quite painful because of the way Catalina reloads itself. Hopefully Ross will be able to review what I have and help me over the hump. Catalina is almost working, but I hit a wall and I need to get something out that is functional with SDRAM, and I think I've figured out the best way to do it now.
ZOG's simplicity is its strength. ZOG can get away with less platform support as long as there are contributors to enrich it. A GNU "native" LMM toolchain would be similar. The simpler the better.
>> Write the memory map out as a file full of #defines that gets used from C. Create a little utility program that converts that file into a spin file containing the equivalent constant definitions.
>> Or the other way around, start from a neutral format and generate C and Spin from it.
Blech! :P There's a better way.
Here's a list of options considered so far.
Supply driver hooks in debug_zog.spin
Use open, close, ioctl, read, write syscalls
Predefine HUB spaces
Let's try this instead:
HUB Allocated Shared Memory - otherwise known as ShMem
The goal here is for XMM or HUB C code to ask ZOG for blocks of HUB for communicating with PASM drivers.
ZOG decodes addresses $1000_0000 to $1000_8000 as hub access. Any 32 bit type reference in C code today in that address range interacts with HUB memory.
I've ported the OBEX C TV_Text driver which uses 16 bit access for display memory. The latest ZOG does not support word and byte references yet, but it can and I've added that to my zog.spin.
What is necessary for Shared Memory to work?
C code uses Shared Memory for communicating with PASM drivers
ZOG needs a Shared Memory HUB allocator
ZOG needs to know where the top of heap is
ZOG needs 2 syscalls: SYS_huballoc and SYS_hubfree
ZOG needs BYTE and WORD Shared Memory hooks added
ZOG should have access to Propeller ROM space
I've implemented most of this in debug_zog.spin and added WORD access hook to zog.spin.
1) C code for starting the TV_Text driver has these key elements:
/*
 * TV_Text start function starts TV on a cog
 * See header file for more details.
 */
int tvText_start(int basepin)
{
    extern int _binary_TV_dat_start;
    int error = 0;
    void *pasm = 0;

    // control struct and a staging buffer for the PASM blob, both in HUB
    tvPtr = (TvText_t*) _syscall(&error,SYS_huballoc,sizeof(TvText_t));
    pasm = (uint32_t*) _syscall(&error,SYS_huballoc,520*4);
    longmove(pasm,&_binary_TV_dat_start,520);

    tvPtr->status = 0;
    tvPtr->enable = 1;
    tvPtr->pins = ((basepin & 0x38) << 1) | (((basepin & 4) == 4) ? 0x5 : 0);
    tvPtr->mode = 0x12;
    tvPtr->colors = (uint32_t*) _syscall(&error,SYS_huballoc,TV_TEXT_COLORTABLE_SIZE*4);
    tvPtr->screen = (uint16_t*) _syscall(&error,SYS_huballoc,TV_TEXT_SCREENSIZE*2);

    // ... other code

    // set main fg/bg color
    tvText_setColorPalette(&gpalette[TV_TEXT_PAL_YELLOW_BROWN]);

    // blank the screen
    wordfill(tvPtr->screen, blank, TV_TEXT_SCREENSIZE);

    // start new cog from external memory using pasm and tvPtr
    gTvTextCog = cognew(pasm, (void*)tvPtr) + 1;
    wait(1000); // wait for COG start
    _syscall(&error,SYS_hubfree,pasm); // free pasm space

    // ... (remainder of the function is not shown)
}
2) HUB allocator
I'm using a first fit chained list allocator for now. It is fairly inefficient. A "binning" or another more efficient allocator should be used.
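For readers who have not seen one, a first-fit chained-list allocator can be sketched in C roughly as below. This is not the `mem` object used in debug_zog.spin (that code is not shown in the post); the arena size and the names `shm_init`, `shm_alloc` and `shm_free` are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_BYTES 1024             /* illustrative size, not from the post */

/* Block header: payload size in bytes, in-use flag, next block in the chain. */
struct block {
    size_t        size;
    int           used;
    struct block *next;
};

static union { max_align_t align; unsigned char bytes[ARENA_BYTES]; } arena;
static struct block *head;

/* Start with one big free block covering the whole arena. */
static void shm_init(void)
{
    head = (struct block *)arena.bytes;
    head->size = ARENA_BYTES - sizeof(struct block);
    head->used = 0;
    head->next = NULL;
}

/* First fit: walk the chain and take the first free block that is big enough. */
static void *shm_alloc(size_t n)
{
    n = (n + 7u) & ~(size_t)7u;      /* round up so later headers stay aligned */
    for (struct block *b = head; b != NULL; b = b->next) {
        if (b->used || b->size < n)
            continue;
        if (b->size >= n + sizeof(struct block) + 8u) {
            /* Split: the tail of this block becomes a new free block. */
            struct block *rest =
                (struct block *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(struct block);
            rest->used = 0;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->used = 1;
        return b + 1;                /* payload starts right after the header */
    }
    return NULL;                     /* no block large enough */
}

/* Mark the block free and merge it with any free blocks that follow it. */
static void shm_free(void *p)
{
    if (p == NULL)
        return;
    struct block *b = (struct block *)p - 1;
    assert(b->used);
    b->used = 0;
    while (b->next != NULL && !b->next->used) {
        b->size += sizeof(struct block) + b->next->size;
        b->next = b->next->next;
    }
}
```

The inefficiency mentioned above shows up in the linear walk in `shm_alloc`; a binning allocator would keep separate free lists per size class instead.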
3) Top of Heap pointer
>> PROBLEM: As you have noted the C code, as in b), has no idea where those mailboxes/rendezvous areas are. They can be accessed by pointers but what addresses to set them to?
The simplest rendezvous is knowing where the "top of heap" starts. Presumably the last word in HUB memory can point to the top of heap. Other things can be put between top of heap and the last word of HUB.
Now, the example I'm showing does not use rendezvous at all. Presumably, one can store the driver's main control struct reference at a reserved COG address. The rendezvous structure can look something like this:
{{
ZOGdezvous.spin
}}
CON
''minimalist interface / rendezvous structure
_HeapTop_ptr = $8000 - 1 * 4 ' $xxfc stores the top of heap pointer
_StdOut_ptr = _HeapTop_ptr - 1 * 4 ' $xxf8 pointer to stdout data structure
_StdIn_ptr = _StdOut_ptr - 1 * 4 ' $xxf4 pointer to stdin data structure
_SD_ptr = _StdIn_ptr - 1 * 4 ' $xxf0 pointer to sd or other storage interface data structure
_COG7_ptr = _SD_ptr - 1 * 4 ' $xxec pointer to cog[7] interface/other data structure
_COG6_ptr = _COG7_ptr - 1 * 4 ' $xxe8 pointer to cog[6] interface/other data structure
_COG5_ptr = _COG6_ptr - 1 * 4 ' $xxe4 pointer to cog[5] interface/other data structure
_COG4_ptr = _COG5_ptr - 1 * 4 ' $xxe0 pointer to cog[4] interface/other data structure
_COG3_ptr = _COG4_ptr - 1 * 4 ' $xxdc pointer to cog[3] interface/other data structure
_COG2_ptr = _COG3_ptr - 1 * 4 ' $xxd8 pointer to cog[2] interface/other data structure
_COG1_ptr = _COG2_ptr - 1 * 4 ' $xxd4 pointer to cog[1] interface/other data structure
_COG0_ptr = _COG1_ptr - 1 * 4 ' $xxd0 pointer to cog[0] interface/other data structure
_ProgParm_ptr = _COG0_ptr - 1 * 4 ' $xxcc pointer to program parameters
_ProgStat_ptr = _ProgParm_ptr - 1 * 4 ' $xxc8 pointer to program exit status info
_DevTop_ptr = _ProgStat_ptr - 1 * 4 ' $xxc4 pointer to top of device list
_Heap_ptr = _DevTop_ptr - 1 * 4 ' $xxc0 pointer to top of heap
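On the C side, the same table could be expressed as slot indices counted down from the top of HUB. This is a sketch only: it assumes the $8000 HUB size and the $10000000 hub window mentioned in this thread, and the `RDV_*` names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HUB_SIZE        0x8000u
#define ZPU_HUB_WINDOW  0x10000000u   /* hub window base quoted in the thread */

/* Slot index counted down from the top of HUB; slot 0 is the last long. */
enum rdv_slot {
    RDV_HEAPTOP, RDV_STDOUT, RDV_STDIN, RDV_SD,
    RDV_COG7, RDV_COG6, RDV_COG5, RDV_COG4,
    RDV_COG3, RDV_COG2, RDV_COG1, RDV_COG0,
    RDV_PROGPARM, RDV_PROGSTAT, RDV_DEVTOP, RDV_HEAP
};

/* HUB address of a slot: $8000 - (slot + 1) * 4. */
static uint32_t rdv_hub_addr(enum rdv_slot slot)
{
    assert((unsigned)slot <= (unsigned)RDV_HEAP);
    return HUB_SIZE - ((uint32_t)slot + 1u) * 4u;
}

/* The same slot as seen by C code running under ZOG from external memory. */
static uint32_t rdv_zpu_addr(enum rdv_slot slot)
{
    return ZPU_HUB_WINDOW + rdv_hub_addr(slot);
}
```

The resulting addresses line up with the $xxfc down to $xxc0 comments in the Spin table.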
4) SYS_huballoc/SYS_hubfree - could be called ShMem_alloc/ShMem_free
I picked arbitrary numbers 30 and 31 for the syscall identifiers. Here's debug_zog.spin implementation. The mem object I'm using is a simple first fit chain malloc.
' add to CASE in on_syscall method
    SYS_huballoc:
      p := mem.malloc(vm_readLong(framep + 12)+4) ' allocate the user's size + 4 bytes
      p := (p+4) & !3 ' Return a long-aligned address so the user doesn't have to fool with it
      vm_writeLong(0,p + $10000000) ' Return value via _mreg ZPU address 0
    SYS_hubfree:
      mem.free(vm_readLong(framep + 12))
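The alignment trick in SYS_huballoc can be checked with a small C model. It mirrors only the arithmetic shown in the Spin snippet; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ZPU_HUB_WINDOW 0x10000000u

/* Mirror of the Spin expression (p + 4) & !3:
 * the next long boundary strictly above p. */
static uint32_t align_long_after(uint32_t p)
{
    return (p + 4u) & ~3u;
}

/* Given the raw HUB address of a block of (size + 4) bytes,
 * return the ZPU-space pointer that would be handed back to C. */
static uint32_t huballoc_result(uint32_t raw, uint32_t size)
{
    uint32_t user = align_long_after(raw);
    /* the user's block still fits inside the over-sized allocation */
    assert(user + size <= raw + size + 4u);
    return user + ZPU_HUB_WINDOW;
}
```

One thing to watch: the pointer C receives differs from what mem.malloc returned (it is aligned and has the window offset added), so SYS_hubfree presumably has to undo both before calling mem.free. The post does not show how that is handled.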
5) Adding BYTE and WORD Shared Memory hooks
I've not added byte access yet, but I added read_word and write_word as:
'Read a WORD from ZPU memory at "address" into "data"
read_word
        cmp     address, zpu_hub_start wc   'Check for normal memory access
  if_c  jmp     #read_xmm_word
        sub     address, zpu_hub_start
        rdword  data, address
        jmp     #read_word_ret
read_xmm_word
' ....
'Write a WORD from "data" to ZPU memory at "address"
write_word
        cmp     address, zpu_hub_start wc   'Check for normal memory access
  if_c  jmp     #write_xmm_word
        sub     address, zpu_hub_start
        wrword  data, address
        jmp     #write_word_ret
write_xmm_word
' ....
6) Allowing access to all of Propeller memory
Today we have this:
zpu_hub_start long $10000000 'Start of HUB access window in ZPU memory space
zpu_cog_start long $10008000 'Start of COG access window in ZPU memory space
zpu_io_start long $10008800 'Start of IO access window
Can we move cog_start to $10010000 like this?
zpu_hub_start long $10000000 'Start of HUB access window in ZPU memory space
zpu_cog_start long $10010000 'Start of COG access window in ZPU memory space
zpu_io_start long $10018000 'Start of IO access window
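To see what the move buys, here is a small C model that classifies a ZPU address against either set of window constants. It assumes the interpreter decodes by simple "below this start" comparisons, as the cmp/if_c pattern in read_word suggests; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum zpu_region { ZPU_EXT, ZPU_HUB, ZPU_COG, ZPU_IO };

struct zpu_map {
    uint32_t hub_start, cog_start, io_start;
};

/* Window constants as quoted in the post. */
static const struct zpu_map current_map  = { 0x10000000u, 0x10008000u, 0x10008800u };
static const struct zpu_map proposed_map = { 0x10000000u, 0x10010000u, 0x10018000u };

/* Classify an address: anything below hub_start is external memory,
 * then HUB, COG and IO windows follow in ascending order. */
static enum zpu_region classify(const struct zpu_map *m, uint32_t addr)
{
    assert(m->hub_start < m->cog_start && m->cog_start < m->io_start);
    if (addr < m->hub_start) return ZPU_EXT;
    if (addr < m->cog_start) return ZPU_HUB;
    if (addr < m->io_start)  return ZPU_COG;
    return ZPU_IO;
}
```

Propeller ROM sits at HUB addresses $8000 to $FFFF, so an access such as $1000F000 falls outside the HUB window today but inside it with the proposed constants.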
Sorry for the long post, but I think there is enough detail there to describe the approach.
I'll be posting a ZOG demo using this method within the next few weeks.
I had been making some progress with ZOG on the C3 until I ran into this strange problem. My cache driver for the C3 uses a buffer in hub memory as the cache, and when I tried moving that buffer to a different place in memory I started seeing strange behavior. To track this down I uncommented the check_bytecode function call in the start function of debug_zog.spin and got the following output.
ZOG v1.6 (CACHE)
Starting SD driver...0000FFFF
Mounting SD...00000000
Opening ZPU image 'fibo.bin'...00000000
Reading image...17056 bytes
Clearing bss: .......................
Waiting 2 seconds before program check...
Restarting SD driver...0000FFFF
Remounting SD...00000000
Checking image...
Program load error: 0000007E expected : A7 received : 7E (126)
Program load error: 0000007F expected : 83 received : 34 (127)
Program load error: 000000FE expected : 00 received : 7F (254)
Program load error: 000000FF expected : 00 received : 9D (255)
Program load error: 0000017E expected : 00 received : 12 (382)
Program load error: 0000017F expected : 00 received : 9D (383)
Program load error: 000001FE expected : 00 received : FF (510)
Program load error: 000001FF expected : 00 received : E0 (511)
Program load error: 0000027E expected : 00 received : 2F (638)
Program load error: 0000027F expected : 00 received : 95 (639)
Program load error: 000002FE expected : 00 received : EB (766)
Program load error: 000002FF expected : 04 received : C5 (767)
Program load error: 0000037E expected : 00 received : FF (894)
Program load error: 0000037F expected : 00 received : 6C (895)
Program load error: 000003FE expected : 00 received : 06 (1022)
Program load error: 000003FF expected : 00 received : CF (1023)
Program load error: 0000047E expected : 08 received : ED (1150)
Program load error: 0000047F expected : 88 received : DA (1151)
Program load error: 000004FE expected : 81 received : 6E (1278)
Program load error: 000004FF expected : 0B received : 9F (1279)
Program load error: 0000057E expected : EE received : 6F (1406)
Program load error: 0000057F expected : 70 received : 54 (1407)
Program load error: 000005FE expected : 08 received : 39 (1534)
Program load error: 000005FF expected : 80 received : F3 (1535)
Program load error: 0000067E expected : 05 received : BD (1662)
Program load error: 0000067F expected : FC received : BF (1663)
Program load error: 000006FE expected : 0C received : EB (1790)
Program load error: 000006FF expected : 05 received : 62 (1791)
Program load error: 0000077E expected : 08 received : EF (1918)
Program load error: 0000077F expected : 05 received : FD (1919)
Program load error: 000007FE expected : 08 received : BA (2046)
Program load error: 000007FF expected : 05 received : 62 (2047)
Program load error: 0000087E expected : A4 received : 0F (2174)
Program load error: 0000087F expected : F5 received : D2 (2175)
Program load error: 000008FE expected : 72 received : EF (2302)
Program load error: 000008FF expected : 0C received : D4 (2303)
Program load error: 0000097E expected : 3D received : DE (2430)
Program load error: 0000097F expected : DC received : F0 (2431)
Program load error: 000009FE expected : 59 received : B8 (2558)
Program load error: 000009FF expected : 19 received : 77 (2559)
Program load error: 00000A7E expected : 51 received : AF (2686)
Program load error: 00000A7F expected : 06 received : 18 (2687)
Program load error: 00000AFE expected : D8 received : 8B (2814)
Program load error: 00000AFF expected : 80 received : 60 (2815)
Program load error: 00000B7E expected : 8F received : FF (2942)
Program load error: 00000B7F expected : 2E received : FC (2943)
Program load error: 00000BFE expected : 38 received : BF (3070)
Program load error: 00000BFF expected : F8 received : F2 (3071)
Program load error: 00000C7E expected : 38 received : CD (3198)
Program load error: 00000CFE expected : D4 received : FC (3326)
Program load error: 00000CFF expected : 82 received : AF (3327)
Program load error: 00000D7E expected : 81 received : F7 (3454)
Program load error: 00000D7F expected : 08 received : 3F (3455)
Program load error: 00000DFE expected : 90 received : FE (3582)
Program load error: 00000DFF expected : 7E received : 7D (3583)
Program load error: 00000E7E expected : F5 received : 67 (3710)
Program load error: 00000E7F expected : 51 received : E3 (3711)
Program load error: 00000EFE expected : 3D received : E7 (3838)
Program load error: 00000EFF expected : D3 received : F7 (3839)
Program load error: 00000F7E expected : 80 received : 68 (3966)
Program load error: 00000F7F expected : 75 received : 06 (3967)
Program load error: 00000FFE expected : 70 received : 00 (4094)
Program load error: 00000FFF expected : 22 received : 00 (4095)
Program load error: 00001FFE expected : 84 received : 00 (8190)
Program load error: 00001FFF expected : FD received : 00 (8191)
Program load error: 00002FFE expected : 88 received : 00 (12286)
Program load error: 00002FFF expected : 0C received : 00 (12287)
17056 Bytes Checked.
Program Load Failed.
Init SD card -> 00000000
pc op sp tos reason
FFFF007C 00 0000FFD0 FFFFFFFF BREAKPOINT
First, this probably vindicates ZOG itself, since this checking of the bytecodes happens before ZOG is started. Also, it's strange that the last two bytes in each 128 byte cache line are trashed. The errors happen regularly every 128 bytes until address $1000, where the pattern switches to only the last two bytes of each 4K block of memory. As it turns out, 4K is a magic number for my cache code since it uses 32 cache lines of 128 bytes each to cache SRAM accesses. This means it is using 4K for the SRAM cache. It also uses another 4K for the flash cache but this program doesn't load anything into flash. Any idea what might be happening here?
Okay, I figured out my problem. There seems to be something in my cache code that requires that the cache buffer be aligned on a cache line boundary. I'm not sure why that should be since I do everything with offsets and at the last minute add the base address of the cache to the offset but there must be something I missed. If I properly align the buffer things seem to work fine no matter where the buffer is located in memory.
Is your cache declared as long? Declaring it as byte or word could cause trouble.
Accessing a long (n/4)*4 from a word boundary (n/2)*2 always uses the long address.
This is your VMCOG merge code that uses small buffers right?
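The point about long accesses can be modelled in C. On the Propeller, rdlong ignores the two low address bits, so a long read from an unaligned address silently comes from the enclosing long. The `rdlong` function below is a host-side model for illustration, not Propeller code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a Propeller long read from HUB: the low two address bits
 * are dropped, then four little-endian bytes are assembled. */
static uint32_t rdlong(const uint8_t *hub, uint32_t addr)
{
    assert(hub != NULL);
    addr &= ~3u;
    return (uint32_t)hub[addr]
         | ((uint32_t)hub[addr + 1] << 8)
         | ((uint32_t)hub[addr + 2] << 16)
         | ((uint32_t)hub[addr + 3] << 24);
}
```

So a buffer declared as byte or word can land on an address where long-sized transfers read or write a different four bytes than the code intended.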
My code is a combination of code from VMCOG, SdramCache, sdspiFemto, and some of my own code all hacked together so no one will recognize it. I take responsibility for all of the bugs! :-)
I'm an idiot though. Of course the cache has to be aligned on a cache line boundary. That's the way your SdramCache code works too. I just forgot about that. What I'm not sure of is why one buffer just happened to be aligned correctly and the other wasn't. There isn't a very large chance (1 in 128) that an arbitrary byte array would be aligned on a 128 byte boundary. In any case, I have it working now. I'm going to move the cache buffer to the top of hub memory just below the rendezvous variables and my SD card sector buffer. Then I'm going to try to move all of the filesystem code into C and get rid of any use of sdspiFemto after starting ZOG. I guess doing the same with FullDuplexSerialPlus won't be too hard either.
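One common way to get a line-aligned cache without fixing its address is to over-allocate by one line and round the base up. A minimal sketch, assuming 128 byte cache lines as described above; `align_up` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 128u

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1u)) == 0);
    return (addr + align - 1u) & ~(align - 1u);
}
```

Reserving the cache size plus CACHE_LINE - 1 extra bytes and using `align_up(base, CACHE_LINE)` as the working base guarantees alignment wherever the buffer ends up.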
>> I'm going to move the cache buffer to the top of hub memory just below the rendezvous variables ...
Why does cache have to be in a fixed place? I prefer to let the compiler allocate it for me.
I guess it doesn't matter since ZOG appears hopelessly fractured in the realm of merge anyway.
I'm not sure I understand what you mean by "let the compiler allocate it". Do you mean the Spin compiler? The reason I want it fixed at the top of hub RAM is that I want to kill the debug_zog.spin COG after starting ZOG and have everything run either in C or as PASM drivers loaded into COGs by the C code. This should leave all of hub RAM minus what I'm using as a cache available to C programs through something like your huballoc/hubfree API. Of course, I'll have to recode those in C...
By the way, one setback I've found to my plan is that FullDuplexSerialPlus.spin is mostly written in Spin not PASM. I had thought that it was all assembly language with a mailbox interface. I guess I'll have to look in the OBEX for a PASM serial driver.
So run_zog is not an option? I like debug_zog too because that's what I've been using exclusively. Of course I don't mind the SPIN stuff. I suppose your new debug_zog really is too different for any merge attempt.
You're in luck on FullDuplexSerial.... The run_zog library has a C++ driver, and the OBEX has code I wrote for ICC (which I was going to port anyway regardless of run_zog): http://obex.parallax.com/objects/361
I haven't really looked at run_zog.spin. I guess maybe I should! :-)
Is the new revision of your SDRAM board available yet? I should get one and a PP board so I can make it work with my ZOG changes. It would be nice if what I'm doing would run on more than just the C3.
Thanks for the suggestion. I'll take a look at your C version. Does it still use a COG to handle the FIFO I/O?
New SDRAM modules will be built next week. The OBEX version is written for ICC and uses a PASM blob.
If you like I can take some time in the morning and port it to ZOG for the shared memory hub interface.
It really should not be so hard, except for the tx/rx buffers need to be defined in a struct rather than 2 globals.
Thanks but I can probably figure out how to hack the serial code. Right now I need to get my SD card code working from C instead of going through the syscall interface. That will keep me busy for a while! :-)
Then I want to integrate the C FAT code I'm using with the C runtime library code so I can get stdio working. It may be a while before I get to the serial stuff.
Page 41 already. I have to go back a few pages and catch up with what you guys are doing.
As Jazzed says there is a C only driver for FullDuplexSerial in the lib directory, also a C only driver for VMCog. When building those the makefile also produces a test program that runs under run_zog. With that you can have a completely Spin free Propeller if you like that uses all of HUB for C. Or start Zog in HUB and continue to run some Spin code as well. Sadly I only got these drivers working from C in HUB.
Jazzed, I have to look at your HUB allocator solution some more, sounds interesting...
I looked at run_zog.spin and I'm not sure it is going to help me. Since I'm loading code into external SRAM or flash I need something like the SD card loader that is in debug_zog.spin.
I'm working on getting more of the stdio library working on the C3 under ZOG. I have open/close/read/write working for SD card files as well as the terminal. The SD card driver is managed in C using the same PASM COG that is used for the cache memory since both use SPI devices on the funny C3 SPI bus. I'd like to try to get fopen/fclose/fread/fwrite working now but I believe they require malloc/free to be available.
Heater: Have you gotten malloc/free working under ZOG? I think it may simply be a matter of getting sbrk() implemented but I'm not sure. Has anyone done work on this yet?
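For reference, if the C library follows the usual newlib convention, malloc obtains its memory by calling an sbrk-style function that moves a "break" pointer through a heap region. A minimal host-side sketch follows; the static array stands in for the real heap region, which on a target would normally begin at a linker-provided symbol, and the name `heap_sbrk` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_BYTES 4096              /* illustrative heap size */

static unsigned char heap[HEAP_BYTES];
static size_t heap_used;             /* current break, as an offset into heap */

/* Move the break by incr bytes and return the previous break,
 * or (void *)-1 if the request does not fit. */
static void *heap_sbrk(ptrdiff_t incr)
{
    assert(heap_used <= HEAP_BYTES);
    if (incr < 0) {
        if ((size_t)(-incr) > heap_used)
            return (void *)-1;       /* cannot shrink below the start */
    } else if ((size_t)incr > HEAP_BYTES - heap_used) {
        return (void *)-1;           /* out of heap */
    }
    void *prev = heap + heap_used;
    heap_used = (size_t)((ptrdiff_t)heap_used + incr);
    return prev;
}
```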
Cool! I'll have to try it. I hope I didn't break it by loading the code and data into separate memories (flash and SRAM). I suppose that malloc uses the data section though so it should be okay. I also have to take a look at your C serial code.
I'm working on adding word and byte access to hub memory and I notice that ZOG.spin has read_word and read_long but not read_byte. Is there any reason that byte access was done with inline code but the others were done with separate subroutines? Can you see any disadvantage to splitting out the byte read/write code? I suppose the inline code is a bit faster but why wouldn't the same be true for word and long accesses?
Comments
Looks like we need some kind of a "propeller.c::cognew" for external memory.
So, I put in a hack for a cogbuffer in debug_zog.spin.
Then added a syscall to get the cogbuffer start address.
Then I made cognewx and copied 520 longs from ZOG space to HUB space.
Then called cognew(buffptr, parmptr) ... no luck.
I print the data copied by longmove and it looks right.
Is there an endian-ness issue here? I'm stuck.
Depends, where did the COG binary blob come from?
Did you byte reverse it and convert it to an object file linked with your C program?
How did you copy it from ext RAM to HUB RAM?
Did you have a look at the Makefile for test_libzog? Specifically the build steps for the FullDuplexSerial and VMCog PASM blobs.
Did you have any luck getting even run_zog to work?
I have, but not recently. I'll try again. I decided to get debug_zog.spin working first.
There is a potential bug in Zog:
If you look at the cognew syscall handler in zog.spin you will find that it adds zpu_memory_addr to the address of the given PASM blob. This is because the coginit instruction of course wants the actual HUB address not the address of the blob in ZPU memory space.
This works fine when running from HUB but...
In debug_zog.spin the Zog parameter par_zpu_memory_addr is not set correctly for running from external RAM. At least in the version I am looking at now it is always set as:
BUT zpu_memory is just a DAT label (non-zero)
Previously zog running from ext RAM has never used its zpu_memory_addr.
Suggest either setting par_zpu_memory_addr to zero in debug_zog.spin or removing the mapping in zog.spin
I would be inclined to temporarily hack a call to break into Zog's SYS_cognew handler and then print the content of the COG buffer in HUB from Spin. Just to check it is all present and correct prior to coginit.
As far as I can tell, any driver that relies on data from the "business end" of a program needs the business end to pass data to HUB RAM. The only way for that to happen is via rendezvous or a device interface like read, write, ioctl, etc. I thought perhaps a linker section could put data into shared memory space using pointers, but we need to know the address. So, here we are full circle.
At this point, a choice needs to be made on how to proceed. I'll leave that up to you. The easiest but most inelegant route seems to be adding SPIN device drivers with debug_zog or some other file and use the syscall interface. The rendezvous interface could be used, but block devices must be supported otherwise it's useless. I'll re-start my Catalina port while I await your decision.
BTW, we need to sync up sources at some point.
>> At this point, a choice needs to be made on how to proceed. I'll leave that up to you.
@Heater, any thoughts on how to proceed? I have my hands full right now, but I expect it won't be long before I'm back in ZOG's domain - maybe early December.
Not really. There is a lot going on at work just now.
I've been following your efforts with Catalina an can see how much effort RossH has put into platform support. I've always hoped Zog could get away with something less "all encompassing" whilst at the same time making it easy for interested parties to bolt in whatever they want for themselves. I will never have the time or, frankly, the motivation to provide a Catalina level of platform support.
For example:
1) All PASM running with Zog is tweaked to be usable without Spin. Just having PAR and mailbox/rendezvous interfaces. Currently I only have VMCog and my tweaked FullDuplexSerialPlus as examples.
2) PASM blobs are only started in Cogs from Spin, as in debug_zog. Or from ZPU C code executing from HUB, as is done by run_zog.
3) The mess of mailboxes/rendezvous and any buffer areas are careful and manually placed into the HUB memory Map, somewhere high. Where they can be found by any other Spin or C processes running on the Prop. And also by C code running from ext RAM.
4) Zog running C from HUB will need to be able to run further Zogs in HUB. Or run a single Zog in external RAM.
The programming model here is then rather like C on any other uC. The C code runs (HUB or EXT) and has a bunch of registers and memory areas (mailboxes/rendezvous) by which it can communicate with hardware devices (Mostly PASM in COGS.)
This model is especially so for any C code running from EXT RAM.
Note: There are really two programs here:
a) Something like run_zog/debug zog that runs from EEPROM.
b) Whatever C program is loaded to EXT RAM from SD card or wherever.
In many cases there may only be a)
PROBLEM: As you have noted the C code, as in b), has no idea where those mailboxes/rendezvous areas are. They can be accessed by pointers but what addresses to set them to?
TENTATIVE SOLUTION:
Write the memory map out as a file full of #defines that gets used from C. Create a little utility program that converts that file into a spin file containing the equivalent constant definitions.
Or the other way around, start from a neutral format and generate C and Spin from it.
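To make the first variant concrete, here is a minimal sketch of such a converter in C. It is only an illustration under stated assumptions: the function names (define_to_spin, convert_stream) and the example constant names are hypothetical, and it only handles simple one-line "#define NAME VALUE" entries with decimal or 0x hex values.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: convert one C line of the form
 * "#define NAME VALUE" into a Spin CON entry "  NAME = VALUE".
 * C hex literals (0x...) become Spin hex literals ($...).
 * Returns 1 if the line was converted, 0 if it is not a simple #define. */
int define_to_spin(const char *line, char *out, size_t outsz)
{
    char name[64];
    char value[64];

    if (sscanf(line, " #define %63s %63s", name, value) != 2)
        return 0;

    if (value[0] == '0' && (value[1] == 'x' || value[1] == 'X'))
        snprintf(out, outsz, "  %s = $%s", name, value + 2);
    else
        snprintf(out, outsz, "  %s = %s", name, value);
    return 1;
}

/* Hypothetical driver: read a C header from 'in' and write the
 * equivalent Spin constant block to 'out'. Lines that are not
 * simple #defines (comments, include guards, blanks) are skipped. */
void convert_stream(FILE *in, FILE *out)
{
    char line[256];
    char spin[160];

    fputs("CON\n", out);
    while (fgets(line, sizeof line, in) != NULL) {
        if (define_to_spin(line, spin, sizeof spin))
            fprintf(out, "%s\n", spin);
    }
}
```

Calling convert_stream(stdin, stdout) from a small main() would turn a header into a Spin file as a build step, so the C and Spin sides are generated from one memory map and cannot drift apart.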
OK. still thinking on this...
I look forward to trying out Catalina when you have it running in 32MB:)
Currently when running code from EXT RAM Zog implements a memory mapping which allows it to see into HUB and COG space. So for example:
*some_hub_addr = x
should write to the correct location in HUB.
However when running code from HUB there is no such mapping and VMCOG/SDCACHE is not even used.
*some_ext_addr = x
does not work.
I get the idea that doing this from a single Zog source with much #ifdef is going to be a mess.
We would then have the slight complication that the memory maps would be different for zog_hub and zog_ext versions. That is, for example, address zero is the first byte in ext RAM for zog_ext with HUB appearing up at 32MB. While for zog_hub address zero is the first byte in ZPU space in HUB with ext RAM somewhere higher.
We could make both memory maps common, say:
0000 and up, for 32MB, is HUB space.
32MB up, for another 32MB, is ext RAM
64MB and up is COG space and whatever other devices get attached.
But, that requires that programs built for ext RAM are linked and located at 32MB not 0000 as for HUB programs.
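As a rough illustration of that common map, the address decode could be sketched in C as below. This is not Zog's actual decode (which lives in PASM); the type and function names are made up for the example, and only the three region boundaries come from the proposal above.

```c
#include <stdint.h>

#define MB        (1024u * 1024u)
#define HUB_BASE  0u            /* 0000 and up, for 32MB: HUB space   */
#define EXT_BASE  (32u * MB)    /* 32MB up, for another 32MB: ext RAM */
#define COG_BASE  (64u * MB)    /* 64MB and up: COG space and devices */

/* Hypothetical region tags for the proposed common memory map. */
typedef enum { REGION_HUB, REGION_EXT, REGION_COG } region_t;

/* Decide which region a ZPU address belongs to. */
region_t decode_region(uint32_t addr)
{
    if (addr < EXT_BASE)
        return REGION_HUB;
    if (addr < COG_BASE)
        return REGION_EXT;
    return REGION_COG;
}

/* Offset of the address within its own region. */
uint32_t region_offset(uint32_t addr)
{
    switch (decode_region(addr)) {
    case REGION_HUB: return addr - HUB_BASE;
    case REGION_EXT: return addr - EXT_BASE;
    default:         return addr - COG_BASE;
    }
}
```

With this layout a program built for ext RAM would be linked at address 32MB, so its first byte decodes to REGION_EXT with offset 0.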
I fully expect that any C code committed to the OBEX should work with ZOG given a little effort.
Adding Catalina support for SDRAM is quite painful because of the way Catalina reloads itself. Hopefully Ross will be able to review what I have and help me over the hump. Catalina is almost working, but I hit a wall and I need to get something out that is functional with SDRAM, and I think I've figured out the best way to do it now.
ZOG's simplicity is its strength. ZOG can get away with less platform support as long as there are contributors to enrich it. A GNU "native" LMM toolchain would be similar. The simpler the better.
Blech! :P :sick: There's a better way.
Here's a list of options considered so far.
Let's try this instead:
HUB Allocated Shared Memory - otherwise known as ShMem
The goal here is for XMM or HUB C code to ask ZOG for blocks of HUB for communicating with PASM drivers.
ZOG decodes addresses $1000_0000 to $1000_8000 as hub access. Any 32 bit type reference in C code today in that address range interacts with HUB memory.
I've ported the OBEX C TV_Text driver which uses 16 bit access for display memory. The latest ZOG does not support word and byte references yet, but it can and I've added that to my zog.spin.
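For readers following along, the window decode described above amounts to something like the C sketch below. The names are hypothetical, the real check is done in PASM inside zog.spin, and treating $1000_8000 as an exclusive upper bound is an assumption (32KB of HUB RAM ends at $7FFF).

```c
#include <stdint.h>

#define HUB_WINDOW_START 0x10000000u  /* $1000_0000, from the post        */
#define HUB_WINDOW_END   0x10008000u  /* $1000_8000, assumed exclusive    */

/* Returns 1 if a ZPU address falls inside the HUB window, else 0. */
int is_hub_address(uint32_t zpu_addr)
{
    return zpu_addr >= HUB_WINDOW_START && zpu_addr < HUB_WINDOW_END;
}

/* Translate a ZPU address into a HUB address.
 * Returns -1 if the address is outside the HUB window. */
int32_t zpu_to_hub(uint32_t zpu_addr)
{
    if (!is_hub_address(zpu_addr))
        return -1;
    return (int32_t)(zpu_addr - HUB_WINDOW_START);
}
```

On the C side this means a pointer such as (volatile uint32_t *)(HUB_WINDOW_START + hub_addr) lands on hub_addr in HUB, provided the access size is one the interpreter supports.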
What is necessary for Shared Memory to work?
- C code uses Shared Memory for communicating with PASM drivers
- ZOG needs a Shared Memory HUB allocator
- ZOG needs to know where the top of heap is
- ZOG needs 2 syscalls: SYS_huballoc and SYS_hubfree
- ZOG needs BYTE and WORD Shared Memory hooks added
- ZOG should have access to Propeller ROM space
I've implemented most of this in debug_zog.spin and added WORD access hook to zog.spin.
1) C code for starting the TV_Text driver has these key elements:
The simplest rendezvous is knowing where the "top of heap" starts. Presumably the last word in HUB memory can point to the top of heap. Other things can be put between top of heap and the last word of HUB.
Now, the example I'm showing does not use rendezvous at all. Presumably, one can store the driver's main control struct reference at a reserved COG address. The rendezvous structure can look something like this:
4) SYS_huballoc/SYS_hubfree - could be called ShMem_alloc/ShMem_free
I picked arbitrary numbers 30 and 31 for the syscall identifiers. Here's debug_zog.spin implementation. The mem object I'm using is a simple first fit chain malloc.
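For anyone unfamiliar with the technique, a first-fit chain allocator can be sketched in C roughly as follows. This is not the Spin mem object used in debug_zog.spin. It is a host-side illustration with hypothetical names (shmem_alloc, shmem_free), a static array standing in for the HUB region, and an arbitrary pool size. Each block starts with a one-long header holding the payload length in longs, with the top bit marking the block as used.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_LONGS 256            /* arbitrary pool size for the sketch */
#define USED_FLAG  0x80000000u    /* top bit of a header: block in use  */

static uint32_t pool[POOL_LONGS]; /* stand-in for a HUB memory region   */
static int pool_ready = 0;

/* Allocate 'bytes' from the pool: walk the chain of headers and take
 * the first free block that is big enough (first fit). */
void *shmem_alloc(uint32_t bytes)
{
    uint32_t need, i = 0;

    if (!pool_ready) {            /* one free block spanning the pool */
        pool[0] = POOL_LONGS - 1;
        pool_ready = 1;
    }
    if (bytes == 0 || bytes > (POOL_LONGS - 1) * 4u)
        return NULL;
    need = (bytes + 3u) / 4u;     /* round up to whole longs */

    while (i < POOL_LONGS) {
        uint32_t hdr = pool[i];
        uint32_t len = hdr & ~USED_FLAG;
        if (!(hdr & USED_FLAG) && len >= need) {
            if (len > need + 1) { /* split: leave a free remainder block */
                pool[i + 1 + need] = len - need - 1;
                len = need;
            }
            pool[i] = len | USED_FLAG;
            return &pool[i + 1];
        }
        i += len + 1;             /* step to the next header in the chain */
    }
    return NULL;                  /* nothing big enough */
}

/* Free a block, then merge any runs of adjacent free blocks. */
void shmem_free(void *p)
{
    uint32_t i;

    if (p == NULL)
        return;
    i = (uint32_t)((uint32_t *)p - pool) - 1;
    pool[i] &= ~USED_FLAG;

    i = 0;
    while (i < POOL_LONGS) {
        uint32_t hdr = pool[i];
        uint32_t len = hdr & ~USED_FLAG;
        uint32_t next = i + len + 1;
        if (!(hdr & USED_FLAG) && next < POOL_LONGS &&
            !(pool[next] & USED_FLAG)) {
            pool[i] = len + 1 + pool[next];   /* absorb the neighbour */
            continue;                         /* retry from same block */
        }
        i = next;
    }
}
```

A syscall handler for SYS_huballoc would then only need to call the allocator and hand the resulting address back to the C program, translated into whatever address window the interpreter decodes as HUB.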
5) Adding BYTE and WORD Shared Memory hooks
I've not added byte access yet, but I added read_word and write_word as:
6) Allowing access to all of Propeller memory
Today we have this:
Can we move cog_start to $10010000 like this?
Sorry for the long post, but I think there is enough detail there to describe the approach.
I'll be posting a ZOG demo using this method within the next few weeks.
Accessing a long (n/4)*4 from a word boundary (n/2)*2 always uses the long address.
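The alignment rule can be illustrated with a small C sketch of word access layered on long-addressed memory. It is only a model: the names are hypothetical, the array stands in for real memory, and the little-endian placement of words within a long is an assumption made for the example, not a statement about what zog.spin does.

```c
#include <stdint.h>

static uint32_t mem[4];   /* 16 bytes of stand-in, long-addressed memory */

/* Round a byte address down to its long / word boundary. */
uint32_t long_align(uint32_t n) { return (n / 4u) * 4u; }
uint32_t word_align(uint32_t n) { return (n / 2u) * 2u; }

/* Bit position of the addressed word inside its long
 * (assumed little-endian: low word at the lower address). */
static unsigned word_shift(uint32_t addr)
{
    return (word_align(addr) & 2u) ? 16u : 0u;
}

/* Read a 16 bit word: fetch the containing long, then extract. */
uint16_t sketch_read_word(uint32_t addr)
{
    uint32_t l = mem[long_align(addr) / 4u];
    return (uint16_t)(l >> word_shift(addr));
}

/* Write a 16 bit word: read-modify-write the containing long. */
void sketch_write_word(uint32_t addr, uint16_t v)
{
    uint32_t idx = long_align(addr) / 4u;
    unsigned shift = word_shift(addr);
    mem[idx] = (mem[idx] & ~(0xFFFFu << shift)) | ((uint32_t)v << shift);
}
```

Both word addresses 4 and 6 resolve to the long at address 4; only the shift differs.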
This is your VMCOG merge code that uses small buffers right?
I'm an idiot though. Of course the cache has to be aligned on a cache line boundary. That's the way your SdramCache code works too. I just forgot about that. What I'm not sure of is why one buffer just happened to be aligned correctly and the other wasn't. There isn't a very large chance (1 in 128) that an arbitrary byte array would be aligned on a 128 byte boundary. In any case, I have it working now. I'm going to move the cache buffer to the top of hub memory just below the rendezvous variables and my SD card sector buffer. Then I'm going to try to move all of the filesystem code into C and get rid of any use of sdspiFemto after starting ZOG. I guess doing the same with FullDuplexSerialPlus won't be too hard either.
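One common way to avoid depending on luck for the alignment is to over-allocate the buffer and round its start up to the next boundary. A minimal C sketch, assuming a 128 byte boundary as in the post and using hypothetical names and an arbitrary 1KB cache size:

```c
#include <stdint.h>

#define CACHE_ALIGN 128u          /* boundary from the post            */
#define CACHE_BYTES 1024u         /* arbitrary size for this sketch    */

/* Over-allocate by ALIGN-1 bytes so an aligned start always fits. */
static uint8_t raw_cache[CACHE_BYTES + CACHE_ALIGN - 1];

/* Round a pointer up to the next multiple of 'align' bytes. */
uint8_t *align_up(uint8_t *p, uintptr_t align)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t pad = (align - (addr % align)) % align;
    return p + pad;
}

/* Start of the usable, aligned cache area inside raw_cache. */
uint8_t *cache_base(void)
{
    return align_up(raw_cache, CACHE_ALIGN);
}
```

The cost is up to 127 wasted bytes. Placing the buffer at a fixed, known-aligned address near the top of HUB, as planned above, avoids that waste.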
I guess it doesn't matter since ZOG appears hopelessly fractured in the realm of merge anyway.
You're in luck on FullDuplexSerial.... The run_zog library has a C++ driver, and the OBEX has code I wrote for ICC (which I was going to port anyway regardless of run_zog): http://obex.parallax.com/objects/361
Is the new revision of your SDRAM board available yet? I should get one and a PP board so I can make it work with my ZOG changes. It would be nice if what I'm doing would run on more than just the C3. Thanks for the suggestion. I'll take a look at your C version. Does it still use a COG to handle the FIFO I/O?
If you like I can take some time in the morning and port it to ZOG for the shared memory hub interface.
It really should not be so hard, except for the tx/rx buffers need to be defined in a struct rather than 2 globals.
Then I want to integrate the C FAT code I'm using with the C runtime library code so I can get stdio working. It may be a while before I get to the serial stuff.
I'll look forward to your new SDRAM board!
As Jazzed says there is a C only driver for FullDuplexSerial in the lib directory, also a C only driver for VMCog. When building those the makefile also produces a test program that runs under run_zog. With that you can have a completely Spin free Propeller if you like that uses all of HUB for C. Or start Zog in HUB and continue to run some Spin code as well. Sadly I only got these drivers working from C in HUB.
Jazzed, I have to look at your HUB allocator solution some more, sounds interesting...
Heater: Have you gotten malloc/free working under ZOG? I think it may simply be a matter of getting sbrk() implemented but I'm not sure. Has anyone done work on this yet?
malloc/free works fine. But I guess I only tested it when running from HUB.
read/write_long need to be functions; they are used too much to in-line, and in-lining them would eat all the COG.
read_byte is in-lined for speed as it is on the "critical path" as opcode fetch. Perhaps only of noticeable benefit when running from HUB.
Therefore loadb/storeb got in-lined byte access as well.
read/write_word are functions because....well I don't know why, they are only used once each so it makes little difference.