Revisiting XMM Mode Using Modern Memory Chips

Wingineer19 · 2019-08-27 06:27

RossH wrote: »

I'll keep you posted, but if you have something else to get on with in the meantime, I'd suggest you do so.

Not a problem. I still have lots of code to write. I'm adding various Menus to my program now.

I still need to interface the GPS receiver to the Prop, along with the WiFi radio. Not to mention getting the stepper motors to work with it too. I'm a long, long way from where I want to go. It will likely take months. Probably well into next year. So I'm using, and will be using, the Download To XMM Ram And Interact tool within Catalina quite a bit.

Question: Just wanted to confirm that you weren't able to add Multithreading to the XMM Kernel?

I ask because I'm using the 4-Port serial driver, and my CduPrint() function is used to write output to the terminal screen.

Well, actually CduPrint() uses our friendly vsprintf() function to dump the desired output to a string. Then I have another function called CduTxDTask() that actually outputs the string content to the screen one byte at a time.

Right now the CduTxDTask() function is contained within an infinite loop under the main() function of the program. It works, but kinda on the slow side.

In CMM mode I had assigned this function to a separate cog so screen updates were pretty fast.

Can't launch a separate cog running C code in XMM so another approach would be to use Multithreading (if it was available) within the same cog that executes the XMM C code.

Another approach would be to implement the CduTxDTask() feature within a separate cog running a PASM or Spin program. To do that I would need to pass a pointer to the output string used by CduPrint() to the PASM/Spin program.

Plus, I would need to know where in HubRam your Serial Port functions actually have the Tx and Rx buffers so my PASM/Spin program could work with them. It sounds like a real mess, but doable, and most likely implemented by creating some type of plugin to make it happen.

However, I'm not yet ready to explore this PASM/Spin plugin possibility because I have much work left to do with my code on the aforementioned items.

But be forewarned that when I get to that point I will likely have lots of questions about accessing your Serial Port buffers in HubRam via the PASM/Spin plugin.

RossH · 2019-08-27 08:03

Wingineer19 wrote: »

Question: Just wanted to confirm that you weren't able to add Multithreading to the XMM Kernel?

Hmmm. I never anticipated anyone would want multithreading in conjunction with XMM. It is probably doable kernel space-wise, but the performance of multi-threaded XMM would be quite poor, depending on how many concurrent threads you wanted to run. If you had too many threads executing (and I'm talking maybe 3 or more) then nearly every context switch would result in a cache miss, requiring a page reload. This would also degrade the granularity of the round-robin scheduling, meaning that the multithreading would be "jittery". But a small number of threads might be possible. I've just rewritten all the multithreading code for the P2, so I am reasonably familiar with it again.

Not promising anything, but how many threads would you want to run, and how much of a performance hit would you be willing to accept?

Wingineer19 · 2019-08-27 18:03

RossH wrote: »

Not promising anything, but how many threads would you want to run, and how much of a performance hit would you be willing to accept?

Just some quick background information as to what this whole exercise is about: I'm building a device to perform some position surveying on my property. It will have a GPS receiver (initially using WAAS but then later RTK corrections) and a radio link to relay this information in real time back to my control computer.

Anyway, to answer your maximum threads question, I was thinking that 3 will likely be it.

One thread would be dedicated to the CduPrint() stuff. This port is for the benefit of the user to get position, health status, perform diagnostics, etc on the device, usually upon initial deployment and setup. Although it would be nice to have screen updates in a reasonably fast manner, it isn't time critical.

Another thread would be dedicated to the GPS receiver. It would be time critical, especially on the receive side, as it will need to grab various GPS messages and place them into their respective data structures. It will then have to set a flag so a function called by main() can do its magic with them. The transmit side would also be time critical to provide the RTK corrections from the radio link to the GPS receiver. However, the radio appears to have two UART ports so maybe I can tie one directly into the GPS receiver and send the corrections via a dedicated link. That is TBD.

The remaining thread would be dedicated to the radio link. It would be time critical on the receive side if the thread itself has to handle the uplinked RTK corrections. If the second UART on the radio can be used for this instead, then it's less critical as the thread would only need to receive commands from the control computer, and downlink the GPS position and status data provided to it by a different function running within main().

So the answer as to whether or not I can live with performance hits depends upon how well the GPS and radio threads perform, which in turn depends upon the buffering provided by the 4-Port serial driver and whether or not the threads can extract data from them in time to avoid overruns.

Oh, while on the subject of the 4-Port Serial Driver I noticed something.

Looking at the fdserial used within propGCC there's a function called fdserial_txEmpty(). It checks the Tx buffer and reports back whether or not the buffer is empty. It doesn't wait.

I couldn't find an equivalent function in the Catalina 4-Port driver.

This isn't a problem on the receive side as Catalina has the s4_rxcheck() which doesn't wait. It either returns a character or a -1 indicating none.

Catalina does have s4_txflush(), but it apparently requires the user program to wait while the Tx buffer is emptied.

The s4_tx() function is used to transmit characters, but it too apparently requires the user program to wait until the Tx buffer can accept the output character.

Requiring the user program to wait until the Tx buffer is empty, instead of just seeing that it isn't and moving on to something else, could have an adverse impact on performance.

Granted, the Ports are operating at 115.2Kbps so the performance hit would be minimal, but it would become very pronounced if operating the Ports at much lower baud rates. In fact, it would likely have dire consequences at a lower baud rate when operating in time critical situations where several different tasks must be performed in a timely manner and delays must be kept to a minimum.
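To put rough numbers on that, here is a small sketch of how long a blocked caller could wait per character. It assumes 8N1 framing (10 bits on the wire per character), which the thread does not state; the function names are made up for illustration.

```c
#include <stdio.h>

/* Microseconds needed to shift one character out of a UART.
   Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop = 10 bits. */
static double char_time_us(unsigned long baud)
{
    const double bits_per_char = 10.0;
    return bits_per_char * 1000000.0 / (double)baud;
}

/* Print the per-character time for the baud rates mentioned in this thread. */
static void print_char_times(void)
{
    static const unsigned long rates[] = { 9600UL, 19200UL, 115200UL };
    size_t i;

    for (i = 0; i < sizeof rates / sizeof rates[0]; i++) {
        printf("%6lu baud: %7.1f us per character\n",
               rates[i], char_time_us(rates[i]));
    }
}
```

Under that framing assumption a character takes roughly 87 us at 115.2Kbps, about 521 us at 19.2Kbps, and about 1042 us at 9.6Kbps, so a full Tx buffer costs proportionally more waiting time at the slower rates.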

A function like s4_txcheck() could be beneficial in time critical situations. It just reports whether or not the Tx buffer is empty and the user program takes it from there.

If empty, the user program could then use s4_tx() to output a character. Otherwise, the user program can move on to other tasks that need to be performed and then come back later and try the s4_txcheck() again.

Hopefully that makes sense.

Anyway, I didn't intend to overload your plate with having to first examine the EEPROM programming issue, then the XMM Multithreading issue, and now maybe the s4_txcheck() issue.

I really appreciate your customer support, and your price can't be beat.

RossH · 2019-08-29 04:38

Wingineer19 wrote: »

A function like s4_txcheck() could be beneficial in time critical situations. It just reports whether or not the Tx buffer is empty and the user program takes it from there.

If empty, the user program could then use s4_tx() to output a character. Otherwise, the user program can move on to other tasks that need to be performed and then come back later and try the s4_txcheck() again.

Hopefully that makes sense.

Anyway, I didn't intend to overload your plate with having to first examine the EEPROM programming issue, then the XMM Multithreading issue, and now maybe the s4_txcheck() issue.

I've just had a look at the code, and this one looks fairly easy. I will add a function called s4_txcheck() which will return the number of bytes available in the tx buffer. So if it returns zero, it means the buffer is full, and calling s4_tx() would block.

This means you can use code like:

if (s4_txcheck(port)) {
   s4_tx(port, byte);
}
else {
   do_something_else();
}

I'll add this to the next release.
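The same idea could be stretched to drain a whole string without ever blocking, which is roughly what a CduTxDTask() style routine needs. This is only a sketch: s4_txcheck() did not exist in a release at the time of writing, so both driver calls below are host-side stand-ins with assumed signatures, and tx_job and tx_poll() are hypothetical names.

```c
#include <stddef.h>

/* Host-side stand-ins so the sketch compiles anywhere. On the Propeller these
   would be the 4-port driver calls; the signatures here are assumptions based
   on the description above, not copied from a released header. */
static int  sim_free = 4;      /* pretend free space in the Tx buffer, in bytes */
static char sim_sent[64];      /* record of what was "transmitted" */
static int  sim_count = 0;

static int s4_txcheck(unsigned port)
{
    (void)port;
    return sim_free;
}

static int s4_tx(unsigned port, char ch)
{
    (void)port;
    sim_sent[sim_count++] = ch;
    sim_free--;
    return 0;
}

/* Pending output for one port (hypothetical helper type). */
typedef struct {
    const char *data;   /* next byte to send */
    size_t      left;   /* bytes still queued */
} tx_job;

/* Send as many bytes as fit right now, never blocking.
   Returns the number of bytes still waiting for a later call. */
static size_t tx_poll(unsigned port, tx_job *job)
{
    while (job->left > 0 && s4_txcheck(port) > 0) {
        s4_tx(port, *job->data++);
        job->left--;
    }
    return job->left;
}
```

The main loop would call tx_poll() once per pass for each port and carry on with other work whenever it returns a non-zero count.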

Wingineer19 · 2019-08-29 06:41

RossH wrote: »
Wingineer19 wrote: »

A function like s4_txcheck() could be beneficial in time critical situations. It just reports whether or not the Tx buffer is empty and the user program takes it from there.

If empty, the user program could then use s4_tx() to output a character. Otherwise, the user program can move on to other tasks that need to be performed and then come back later and try the s4_txcheck() again.

Hopefully that makes sense.

Anyway, I didn't intend to overload your plate with having to first examine the EEPROM programming issue, then the XMM Multithreading issue, and now maybe the s4_txcheck() issue.

I've just had a look at the code, and this one looks fairly easy. I will add a function called s4_txcheck() which will return the number of bytes available in the tx buffer. So if it returns zero, it means the buffer is full, and calling s4_tx() would block.

This means you can use code like:

if (s4_txcheck(port)) {
   s4_tx(port, byte);
}
else {
   do_something_else();
}

I'll add this to the next release.

Thanks, Ross, that's awesome!

My program will be performing time critical stuff, including managing I/O traffic to/from at least three serial ports, so avoiding a halt in execution while waiting for any particular Tx buffer to empty will be most beneficial.

One of these ports might be running at 19.2Kbps, and that's when I realized that if my program halts while waiting for its Tx buffer to clear, the whole time critical scheme will likely fail. But with the s4_txcheck() function, this is no longer a concern.

I guess you will also consider adding a tty_txcheck() and tty256_txcheck() function to the serial library as well?

Have you given any more thought to the XMM Multithread capability? It might be worth a try even if only 2 threads are usable. Of course, I'm not the one who would have to add this feature to the compiler so it's easy for me to ponder this capability.

Thanks again for the s4_txcheck() addition. I will sleep better tonight knowing that my time critical stuff won't come crashing down over the Tx buffer issue.

Wingineer19 · 2019-08-30 19:45

Hi @RossH,

I have one more question about the 4-Port Serial Driver.

Is it possible to change the baud rate in real time?

I know that the initial baud rate settings are configured in the Extras.Spin file.

I couldn't find anything like an s4_baud(port,rate) function that can be configured at runtime.

I'm asking because I have a GPS receiver that defaults at 9.6Kbps and I need a way to link to it at that speed, command it to switch over to 115.2Kbps, then resume communication at that higher rate.

RossH · 2019-08-31 01:42

Wingineer19 wrote: »

Hi @RossH,

I have one more question about the 4-Port Serial Driver.

Is it possible to change the baud rate in real time?

I know that the initial baud rate settings are configured in the Extras.Spin file.

I couldn't find anything like an s4_baud(port,rate) function that can be configured at runtime.

I'm asking because I have a GPS receiver that defaults at 9.6Kbps and I need a way to link to it at that speed, command it to switch over to 115.2Kbps, then resume communication at that higher rate.

Not easily, no. The baud rates of the ports in the 4 port driver are configured by modifying the Spin image before the cog is even loaded and started. There is no way to subsequently change the settings of the individual ports - not without a significant rewrite, and I'm pretty sure you would not have enough space to do this.

A better way to achieve what you want might be to use the single port serial driver for the GPS receiver, but modify it to be able to change its baud rate "on the fly" (e.g. in response to a specific string being sent to it). This would require some PASM coding, but it should be possible to run this modified driver in conjunction with the existing 4 port serial driver (to handle the other ports) provided you have enough cogs (note: I have not tried this!).
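For a sense of what such an "on the fly" change would have to recompute: Propeller software UARTs typically time each bit by counting system clock ticks, so the core of a baud change is one division. The sketch below assumes an 80 MHz system clock and a hypothetical helper name; it is not taken from the Catalina driver source.

```c
/* System-clock ticks per serial bit, rounded to the nearest tick.
   A modified driver would reload this value when asked to switch rates. */
static unsigned long bit_ticks(unsigned long clkfreq, unsigned long baud)
{
    return (clkfreq + baud / 2UL) / baud;
}
```

With an 80 MHz clock this gives 8333 ticks per bit at 9600 baud and 694 ticks per bit at 115200 baud.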

Wingineer19 · 2019-08-31 03:08

Question: Is it possible to change the baud rate of the 4-Port Serial Driver at runtime?

RossH wrote: »

Not easily, no. The baud rates of the ports in the 4 port driver are configured by modifying the Spin image before the cog is even loaded and started. There is no way to subsequently change the settings of the individual ports - not without a significant rewrite, and I'm pretty sure you would not have enough space to do this.

A better way to achieve what you want might be to use the single port serial driver for the GPS receiver, but modify it to be able to change its baud rate "on the fly" (e.g. in response to a specific string being sent to it). This would require some PASM coding, but it should be possible to run this modified driver in conjunction with the existing 4 port serial driver (to handle the other ports) provided you have enough cogs (note: I have not tried this!).

I didn't think so but had to ask to be sure.

A serial port driver would have to be written that permits runtime changes in baud rates, and if doing that might as well allow changes to character size, stop bits, sending a break condition, and supporting XON and XOFF functionality too. I haven't seen such a critter for the propeller. But, I digress.

Fortunately, there might be another way to address this baud rate problem without modifying any of the serial port drivers.

In addition to a single UART port on the GPS receiver, it also has an I2C port. I need to do some more digging on this, but it might be possible to send a command string to the GPS receiver via I2C forcing it to change the UART baud rate to 115.2Kbps.

It doesn't look like I can directly use your plugin that speaks I2C because it appears to be configured to work with the EEPROM on pins P29 and P28.

So, I will just need to write some I2C functions, like i2c_start(), i2c_write(), and i2c_stop() that work on pins P10 and P11 that I've already reserved.

I'm going to have to use I2C anyway at some point on this project to communicate with some temperature sensors and a battery charge monitor, so might as well jump into it now.

Since Catalina allows me to access dira and outa using external variables DIRA and OUTA, respectively, I should be able to implement these I2C functions in C.

The C functions will be slower than if I did it in PASM, but who cares because I2C is a relatively slow bus anyway.

If this I2C scheme works, I can send not only baud rate commands, but also message request commands to the GPS receiver, and then get the responses from it using the 4-Port Serial UART Driver at the desired 115.2Kbps rate. That would get me to where I need to go.
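A minimal sketch of what those three functions might look like when bit-banged in C. To keep it buildable on a PC, DIRA and OUTA are simulated here as ordinary variables; on the Propeller they would be the real registers as exposed by Catalina. The choice of P10 as SCL and P11 as SDA is an assumption, the delay is left empty, and the ACK bit is clocked but not checked.

```c
/* Simulated registers (assumption: stand-ins for the real DIRA and OUTA). */
static unsigned long DIRA = 0, OUTA = 0;

#define SCL_MASK (1UL << 10)   /* P10, assumed to be the clock line */
#define SDA_MASK (1UL << 11)   /* P11, assumed to be the data line  */

static unsigned scl_pulses = 0;   /* test aid: counts rising SCL edges */

/* Open-drain style: the output latch stays 0. A pin is "low" when driven
   (DIRA bit set) and "high" when released to the pull-up (DIRA bit clear). */
static void pin_low(unsigned long mask)  { OUTA &= ~mask; DIRA |= mask; }
static void pin_high(unsigned long mask) { DIRA &= ~mask; }
static void i2c_delay(void) { /* a short wait would go here on real hardware */ }

static void scl_high(void) { pin_high(SCL_MASK); scl_pulses++; i2c_delay(); }
static void scl_low(void)  { pin_low(SCL_MASK); i2c_delay(); }

/* START: SDA falls while SCL is high. */
static void i2c_start(void)
{
    pin_high(SDA_MASK);
    scl_high();
    pin_low(SDA_MASK);
    i2c_delay();
    scl_low();
}

/* Shift out one byte, MSB first, then clock the ACK slot. */
static void i2c_write(unsigned char byte)
{
    int bit;

    for (bit = 7; bit >= 0; bit--) {
        if (byte & (1u << bit)) pin_high(SDA_MASK); else pin_low(SDA_MASK);
        scl_high();
        scl_low();
    }
    pin_high(SDA_MASK);   /* release SDA so the slave can pull it low for ACK */
    scl_high();           /* real code would sample INA here to check the ACK */
    scl_low();
}

/* STOP: SDA rises while SCL is high. */
static void i2c_stop(void)
{
    pin_low(SDA_MASK);
    scl_high();
    pin_high(SDA_MASK);
    i2c_delay();
}
```

Because the lines are only ever pulled low or released, this relies on external pull-up resistors on both pins. It does not handle clock stretching, which would need SCL to be read back after each release.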

msrobots · 2019-08-31 03:45

@Wingineer19 ,

you can use the I2C pins where the eeprom is. It is a bus, no need for new pins. Just make sure your board has resistors on both I2C pins, some Parallax boards don't. You will need them in case of clock stretching. Some drivers allow that, not sure about Catalina.

Never waste pins. And after boot, the eeprom does not do much, so the bus is empty anyways.

Mike

RossH · 2019-08-31 03:50

Wingineer19 wrote: »

The C functions will be slower than if I did it in PASM, but who cares because I2C is a relatively slow bus anyway.

This should be possible. But before you do that, have a look in the old OBEX to see if there is an I2C object that already does what you need. If there is, I can probably make it work with Catalina.

Wingineer19 · 2019-08-31 05:11

msrobots wrote: »

@Wingineer19 ,

you can use the I2C pins where the eeprom is. It is a bus, no need for new pins. Just make sure your board has resistors on both I2C pins, some Parallax boards don't. You will need them in case of clock stretching. Some drivers allow that, not sure about Catalina.

Never waste pins. And after boot, the eeprom does not do much, so the bus is empty anyways.

Mike

Unfortunately I can't use this bus because the 64KB EEPROM on the board is being replaced with two Atmel AT24CM02 256KB EEPROMs, for a total of 512KB EEPROM memory.

The usual A2, A1, A0 fields within the Device Address Byte that I2C devices use have been replaced with actual Memory Address lines within these two chips. The result is that these two chips will use up all combinations of "device addresses" that I2C devices normally use, hence they are the only two devices allowed on the bus.
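To illustrate the addressing described above, here is a sketch that treats the two chips as one 512KB space and splits a linear address into the three bytes that open a transfer. It assumes the usual 1010 prefix in the top nibble of the device address byte, and that address bit 18 ends up selecting between the two chips (for example via a strapped address pin on each); eeprom_addr and eeprom_split() are hypothetical names.

```c
/* The three bytes that begin an I2C EEPROM transfer. */
typedef struct {
    unsigned char device;    /* device address byte, including the R/W bit */
    unsigned char word_hi;   /* memory address bits 15..8 */
    unsigned char word_lo;   /* memory address bits 7..0  */
} eeprom_addr;

/* Split a linear byte address (0 .. 0x7FFFF) for a read (read != 0) or write.
   Assumed layout: 1010, then address bits 18..16 where A2/A1/A0 would
   normally sit, then R/W. */
static eeprom_addr eeprom_split(unsigned long addr, int read)
{
    eeprom_addr a;
    unsigned long top3 = (addr >> 16) & 0x7UL;

    a.device  = (unsigned char)(0xA0UL | (top3 << 1) | (read ? 1UL : 0UL));
    a.word_hi = (unsigned char)((addr >> 8) & 0xFFUL);
    a.word_lo = (unsigned char)(addr & 0xFFUL);
    return a;
}
```

Every value of the three bits after the 1010 prefix is consumed by memory addressing, which is why no other device using that prefix can share the bus with the pair.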

I've already removed the Parallax provided 64KB EEPROM from my board and installed a single AT24CM02, with the second one to be installed shortly. The good news is that the propeller does work with this 256KB EEPROM upon bootup.

My program is running in XMM memory, and it's already more than 70KB long, so that's why I moved to this higher capacity EEPROM for program storage.

Upon bootup I want the propeller to transfer my program from EEPROM to SRAM and execute from there. The Catalina EEPROM loader has some hiccups doing this, so @RossH added this to his "to do" list when he gets the time. Right now I just dump the code directly into SRAM from my PC for testing and debugging so no biggie.

Bottom line, I will have to use a different set of pins to work with the GPS receiver, temp sensors, and battery monitor.

The good news is that I still have enough pins to work with after doing this so I'm not in a panic (yet)...

Wingineer19 · 2019-08-31 05:14

RossH wrote: »

Wingineer19 wrote: »

The C functions will be slower than if I did it in PASM, but who cares because I2C is a relatively slow bus anyway.

This should be possible. But before you do that, have a look in the old OBEX to see if there is an I2C object that already does what you need. If there is, I can probably make it work with Catalina.

That's a good idea, even though I already started writing the I2C functions in C to perform the start, write, and stop commands.

I'll swing on over to the OBEX and see what's available...

msrobots · 2019-08-31 06:17

Wingineer19 wrote: »

msrobots wrote: »

@Wingineer19 ,

you can use the I2C pins where the eeprom is. It is a bus, no need for new pins. Just make sure your board has resistors on both I2C pins, some Parallax boards don't. You will need them in case of clock stretching. Some drivers allow that, not sure about Catalina.

Never waste pins. And after boot, the eeprom does not do much, so the bus is empty anyways.

Mike

Unfortunately I can't use this bus because the 64KB EEPROM on the board is being replaced with two Atmel AT24CM02 256KB EEPROMs, for a total of 512KB EEPROM memory.

The usual A2, A1, A0 fields within the Device Address Byte that I2C devices use have been replaced with actual Memory Address lines within these two chips. The result is that these two chips will use up all combinations of "device addresses" that I2C devices normally use, hence they are the only two devices allowed on the bus.

I've already removed the Parallax provided 64KB EEPROM from my board and installed a single AT24CM02, with the second one to be installed shortly. The good news is that the propeller does work with this 256KB EEPROM upon bootup.

My program is running in XMM memory, and it's already more than 70KB long, so that's why I moved to this higher capacity EEPROM for program storage.

Upon bootup I want the propeller to transfer my program from EEPROM to SRAM and execute from there. The Catalina EEPROM loader has some hiccups doing this, so @RossH added this to his "to do" list when he gets the time. Right now I just dump the code directly into SRAM from my PC for testing and debugging so no biggie.

Bottom line, I will have to use a different set of pins to work with the GPS receiver, temp sensors, and battery monitor.

The good news is that I still have enough pins to work with after doing this so I'm not in a panic (yet)...

OH cool,

you got the big EEPROMs running. I missed that. Yeah then you might need another bus.

Sorry,

Mike

Wingineer19 · 2019-08-31 06:54

msrobots wrote: »

OH cool,

you got the big EEPROMs running. I missed that. Yeah then you might need another bus.

Sorry,

Mike

No need to apologize. Most people probably wouldn't mess with the EEPROM arrangement like I did. I just wasn't happy with the 64KB capacity so I tossed it. I was delighted to see that the propeller worked with the 256KB EEPROM. I'm assuming it will still work once I add the other 256KB onto the bus.

I'm also pleasantly surprised how well XMM is working, even though I can only run a single instance of C on it. The serial port drivers operate as plugins, even under XMM, so I essentially have a single core CPU with multiple serial ports. Even with the single instance C limitation this arrangement still beats many of the microcontrollers out there.

If I can find a nifty I2C object that does what I need, then @RossH might be able to work his magic, turn it into a plugin, and add it to the Catalina library as well.

I've added several Menus to my code as I'm building it out to work with the GPS receiver and the radio. These Menus are really slick, and they will get even better when I can populate them with real data.

RossH · 2019-08-31 07:50

RossH wrote: »

I used to have code that would execute XMM SMALL programs directly from the EEPROM, instead of first loading them to XMM RAM, but it seems to have gotten lost somewhere along the line. I was sure it was in one of the very last of the Catalina P1 releases, but I have tried to find it and cannot. It may have been in a release that never actually got issued. It was just an implementation of the XMM API that read code direct from EEPROM (above the first 32kb) instead of from SRAM or FLASH. It wasn't particularly useful since it was generally slower than executing from XMM RAM and it could only be used for SMALL programs, but for completeness I may re-write it one day.

Aha! I just found this code again. It looks like I may have issued it as a patch release (perhaps someone specifically asked for it), but for some reason it never made its way into the mainstream releases, and thereafter got lost. I have attached a copy for reference, and in case you are interested, but I do not suggest you use it yet - for one thing, it will not work with your EEPROM, and for another it was intended for an earlier release, and would probably overwrite some important files with earlier versions.

I will re-test the code, and (assuming it still works ok) I will include it in the next release. The documentation says it works for EEPROMS up to 128kb, but if we can get the EEPROM loader working for larger EEPROMs, then this should work as well.

I am glad to see my memory is not yet failing - I was beginning to worry.

msrobots · 2019-08-31 09:39

Yeah @RossH,

while waiting for the P2 the Grenades are hitting closer and closer.

Doctor:
I have two pieces of very bad news for you.
Patient:
OK.
Doctor:
You have a bad cancer and really, really bad Alzheimer's.
Patient:
Sounds bad, but at least I do not have cancer...

Mike

RossH · 2019-08-31 11:50

msrobots wrote: »

Doctor:
You have a bad cancer and really, really bad Alzheimer's.
Patient:
Sounds bad, but at least I do not have cancer...

Wingineer19 · 2019-09-01 03:19

RossH wrote: »

RossH wrote: »

I used to have code that would execute XMM SMALL programs directly from the EEPROM, instead of first loading them to XMM RAM, but it seems to have gotten lost somewhere along the line. I was sure it was in one of the very last of the Catalina P1 releases, but I have tried to find it and cannot. It may have been in a release that never actually got issued. It was just an implementation of the XMM API that read code direct from EEPROM (above the first 32kb) instead of from SRAM or FLASH. It wasn't particularly useful since it was generally slower than executing from XMM RAM and it could only be used for SMALL programs, but for completeness I may re-write it one day.

Aha! I just found this code again. It looks like I may have issued it as a patch release (perhaps someone specifically asked for it), but for some reason it never made its way into the mainstream releases, and thereafter got lost. I have attached a copy for reference, and in case you are interested, but I do not suggest you use it yet - for one thing, it will not work with your EEPROM, and for another it was intended for an earlier release, and would probably overwrite some important files with earlier versions.

I will re-test the code, and (assuming it still works ok) I will include it in the next release. The documentation says it works for EEPROMS up to 128kb, but if we can get the EEPROM loader working for larger EEPROMs, then this should work as well.

I am glad to see my memory is not yet failing - I was beginning to worry.

I definitely think you should include this feature in your next release, especially if you can get the EEPROM loader to work with the 256KB AT24CM02 chip.

I've already witnessed that this chip works fine as I'm able to load and execute a CMM program from it upon bootup.

Suppose Parallax can be convinced to replace their existing 64KB EEPROM on the FLiP module with this 256KB EEPROM. Then their downside risk would be minimal because the EEPROM will work in the manner already familiar to most users, namely CMM mode.

But with this 256KB EEPROM available, Catalina would be able to execute a SMALL XMM program directly from it without the need to add external SRAM and/or Flash chips like most XMM extensions require.

As you mentioned, XMM programs won't execute as fast from EEPROM as they do from SRAM due to the lower I2C bus speed, but perhaps this can be mitigated somewhat by finding a suitable cache size in HubRam. Nevertheless, there could be users who would be willing to trade speed for larger code space. It would all depend upon the application to be performed.

I find this possibility intriguing because there would be no need to add additional circuitry to the FLiP module. Just replace the 64KB EEPROM with the 256KB EEPROM and automatically an XMM capability opens up. Users who prefer to work with CMM and don't care for XMM can stay with it without any adverse impact. The change would be transparent to them.

I can only see an upside potential to this execute XMM from EEPROM capability.

Now, if only Parallax would think about a higher capacity EEPROM for the FLiP...

jmg · 2019-09-01 09:32

Wingineer19 wrote: »

Now, if only Parallax would think about a higher capacity EEPROM for the FLiP...

The problems there are that the AT24CM02 parts you mention are costly, and appear not to come in TSSOP8, which the FLiP uses.
There are 1M TSSOP8 parts like the CAT24M01YI-GT3, so perhaps Parallax could consider a variant with that?

One puzzle around memory is that FLASH is far cheaper than EEPROM, yet no one seems to make I2C FLASH. They make SPI/DSPI/QSPI instead.
At large sizes, it could be cheaper to add a small MCU and a serial FLASH part than to up-size an EEPROM.

Wingineer19 · 2019-09-01 15:55

jmg wrote: »

Wingineer19 wrote: »

Now, if only Parallax would think about a higher capacity EEPROM for the FLiP...

The problems there are that the AT24CM02 parts you mention are costly, and appear not to come in TSSOP8, which the FLiP uses.
There are 1M TSSOP8 parts like the CAT24M01YI-GT3, so perhaps Parallax could consider a variant with that?

One puzzle around memory is that FLASH is far cheaper than EEPROM, yet no one seems to make I2C FLASH. They make SPI/DSPI/QSPI instead.
At large sizes, it could be cheaper to add a small MCU and a serial FLASH part than to up-size an EEPROM.

Yeah, I don't know why Flash isn't available with an I2C bus. There has to be some reason why Industry went the SPI route.

But given the I2C device addressing/selection scheme used, the largest capacity I2C device that could be supported on the bus would be 512KB. Still, it seems a 512KB Flash I2C device would be dirt cheap compared to an EEPROM.

I would also like to see I2C bus speeds upwards of 10MHz, but that doesn't exist, either. The fastest I've seen advertised is 3.4MHz.

Indeed, Parallax would incur some design risk using the AT24CM02 EEPROM since the FLiP circuit board would have to be redesigned to accommodate its SOIC8 footprint as opposed to the TSSOP8 used by its existing EEPROM.

The CAT24M01YI (128KB) wouldn't require a FLiP circuit board redesign since it's a TSSOP8 package. Just a 1-for-1 swap with the existing 64KB EEPROM and you're good to go.

ST also sells the M24M02-DR (256KB) EEPROM but it too is an SOIC8 package, not TSSOP8. There has to be some reason for this.

Single unit cost at Mouser for the CAT24M01YI (128KB) is $1.70, the ST M24M02-DR (256KB) shows $2.78, and the AT24CM02 (256KB) I use lists for $2.51. As always, the additional cost would have to be passed on to the customer.

The beauty of just swapping out the 64KB EEPROM with either the 128KB one (better) or the 256KB one (desirable) is that a (slow) XMM mode could be supported without consuming any additional pins on the propeller.

My USB Project Board already has the AT24CM02 installed as the boot EEPROM.

@RossH recently rediscovered his XMM EEPROM API and is reviewing it, and as soon as the Catalina EEPROM loader works with the AT24CM02, I will be able to test the XMM EEPROM mode to see how it performs.

Since the P2 appears to be on ice for now, I guess it's time to go back and see how much more we can squeeze out of the P1. A higher capacity EEPROM could be a start.

jmg · 2019-09-01 20:36

Wingineer19 wrote: »

ST also sells the M24M02-DR (256KB) EEPROM but it too is an SOIC8 package, not TSSOP8. There has to be some reason for this.

I think that is purely die size, which also explains the higher prices...

Wingineer19 wrote: »

Single unit cost at Mouser for the CAT24M01YI (128KB) is $1.70, the ST M24M02-DR (256KB) shows $2.78, and the AT24CM02 (256KB) I use lists for $2.51. As always, the additional cost would have to be passed on to the customer.

Checking Flash in 2x3 DFN, I see Macronix and Gigadevice have that small package up to 16MB, for 50~60c/3k, so a low cost flash and a small MCU/CPLD? as a bridge device could 'fit' a Prop 1
Maybe you configure the bridge part with 2 I2C addresses: a default one that is 24C512 boot-alike, and a second one with 24-bit addressing, good for 128MBit?
boot-via-bridge gets tricky around the details, where the last address bit has not much time before first data bit is expected.
( I doubt P1 ROM supports clock stretching..) P2 boot clock speed is not super-high, maybe a much faster SPI side can manage this ?

Wingineer19 · 2019-09-02 02:57

jmg wrote: »

Checking Flash in 2x3 DFN, I see Macronix and Gigadevice have that small package up to 16MB, for 50~60c/3k, so a low cost flash and a small MCU/CPLD? as a bridge device could 'fit' a Prop 1
Maybe you configure the bridge part with 2 I2C addresses: a default one that is 24C512 boot-alike, and a second one with 24-bit addressing, good for 128MBit?
boot-via-bridge gets tricky around the details, where the last address bit has not much time before first data bit is expected.
( I doubt P1 ROM supports clock stretching..) P2 boot clock speed is not super-high, maybe a much faster SPI side can manage this ?

The use of an I2C to SPI bridge is a fascinating idea. It would certainly allow access to a plethora of SPI memory chips, and would have the added bonus of not requiring the use of propeller pins in order to do so.

There are I2C to SPI bridge chips on the market, like the NXP SC18IS602B, that wouldn't require the use of an additional microcontroller. However, the I2C bus speed with this chip is listed as 400KHz. If I was to go this route I would want the ability to push the I2C bus into the MHz range.

But looking at it from a risk management perspective, a small company like Parallax would aim for very low risk.

The no risk approach, of course, is to stay with the existing FLiP design and keep the 64KB EEPROM.

If we were to make a good case they might consider a low risk approach of just replacing the 64KB EEPROM with the TSSOP8 128KB one. This wouldn't require a design change, just a manufacturing one. And a minor one at that: Just install the 128KB EEPROM instead of the 64KB and be done with it.

If we were to make a really, really good case then Parallax might consider a moderate risk approach of using the AT24CM02 (or equivalent) 256KB EEPROM. This would require a circuit board design change, as well as a manufacturing change, resulting in a considerably higher risk than simply installing the 128KB EEPROM above. Unless, of course, a 256KB EEPROM in the TSSOP8 package hits the market, in which case we're back to a low risk approach.

If we were to make an excellent, highly persuasive argument then Parallax might consider a high risk approach of using a two chip solution, namely the I2C to SPI bridge and an SPI Flash memory. This would require a circuit board design change and a manufacturing change, resulting in the highest risk approach of those discussed here.

So although I think the I2C/SPI bridge idea is very cool, and I really want the 256KB EEPROM, realistically, if the stars align, we might see Parallax at some time in the future offer the FLiP module with the 128KB EEPROM (or a 256KB TSSOP8 if one existed).

They've got a lot riding on the P2 effort right now, so I doubt they want to incur additional risk by having to do a significant redesign of the FLiP module...

jmg · 2019-09-02 03:47

Wingineer19 wrote: »

There are I2C to SPI bridge chips on the market, like the NXP SC18IS602B, that wouldn't require the use of an additional microcontroller. However, the I2C bus speed with this chip is listed as 400KHz. If I was to go this route I would want the ability to push the I2C bus into the MHz range.

Interesting part, but it's not just speed that's an issue here. The bridge also needs to look like a BOOT device to P1, so power up can work normally.
That rather bumps you into MCU territory, but for example the EFM8UB3 has FIFOs in i2c and SPI peripherals, and specs 1MHz on i2c, so good platforms exist.

Wingineer19 wrote: »

If we were to make a good case they might consider a low risk approach of just replacing the 64KB EEPROM with the TSSOP8 128KB one. This wouldn't require a design change, just a manufacturing one. And a minor one at that: Just install the 128KB EEPROM instead of the 64KB and be done with it.

That's the most likely upgrade, and it depends on enough volume.
If enough users indicated they needed the extra memory, or if Parallax decided they needed a sales-boost on FLiP, they could increment the memory with little effort in the same package.

If you have working examples that usefully deploy more memory, I am sure that helps.

Wingineer19 · 2019-09-02 05:17

jmg wrote: »

Interesting part, but it's not just speed that's an issue here. The bridge also needs to look like a BOOT device to P1, so power up can work normally.
That rather bumps you into MCU territory, but for example the EFM8UB3 has FIFOs in i2c and SPI peripherals, and specs 1MHz on i2c, so good platforms exist.

Well, not necessarily. If you look at the SC18IS602B it has I2C address inputs A2,A1,A0. The existing EEPROM has these values tied to GND, so if you tie A2,A1,A0 on the SC18IS602B to a different value, then the EEPROM should still be accessed upon bootup. So the existence of the bridge on the I2C bus, but assigned to a different address, shouldn't interfere with the bootup process. Hence you wouldn't need an MCU for the bootup or bridging function.

Wingineer19 wrote: »

If we were to make a good case they might consider a low risk approach of just replacing the 64KB EEPROM with the TSSOP8 128KB one. This wouldn't require a design change, just a manufacturing one. And a minor one at that: Just install the 128KB EEPROM instead of the 64KB and be done with it.

jmg wrote: »

That's the most likely upgrade, and it depends on enough volume.
If enough users indicated they needed the extra memory, or if Parallax decided they needed a sales-boost on FLiP, they could increment the memory with little effort in the same package.

If you have working examples that usefully deploy more memory, I am sure that helps.

I agree, but I don't expect Parallax to make such a revision anytime soon.

For my current Project I have a choice: Use a FLiP module installed on a carrier board, or go with a USB Project Board.

The USB Project Board already has the AT24CM02 256KB EEPROM and two ISSI 65WVS5128GBLL 512KB X 8 SPI SRAMs installed. The propeller boots from the AT24CM02 and executes XMM from those two SRAMs.

If I go the FLiP route, I will still need to have the AT24CM02 EEPROM and the two 65WVS5128GBLL 512KB X 8 SPI SRAMs installed on the carrier board to run XMM from the SRAMs.

However, using the AT24CM02 on the carrier board for bootup and program storage will require me to remove/disable the 64KB EEPROM on the FLiP. That doesn't look easy at all, so I may be stuck with the USB Project Board anyway...

But I can't help wondering if a FLiP module had a large enough EEPROM installed, could it run XMM directly from EEPROM fast enough to avoid having to run XMM from external SRAMs like I currently do on the USB Project Board?

If so, using the FLiP would be much more elegant, circuit complexity would be reduced, the carrier board wouldn't need the AT24CM02 EEPROM or SRAMs installed, and those pins on the propeller currently allocated to the SRAMs could be repurposed.

Once we get the Catalina loader to correctly work with the AT24CM02 I will be able to run some tests on my USB Project Board to answer this very question...

jmg · 2019-09-02 06:03

Wingineer19 wrote: »

Well, not necessarily. If you look at the SC18IS602B it has I2C address inputs A2,A1,A0. The existing EEPROM has these values tied to GND, so if you tie A2,A1,A0 on the SC18IS602B to a different value, then the EEPROM should still be accessed upon bootup. So the existence of the bridge on the I2C bus, but assigned to a different address, shouldn't interfere with the bootup process. Hence you wouldn't need an MCU for the bootup or bridging function.

Well, yes, but I was assuming the EEPROM is completely replaced by Flash..

Wingineer19 wrote: »

I agree, but I don't expect Parallax to make such a revision anytime soon.

It's probably feasible for customers to swap the TSSOP part for modest volumes, which gets you to 1Mbit...

Addit: FLiP is compact, so SO8 would likely be too large.
Looking at TSSOP8, the body is 4.4 x 3.0mm, and ST do make a WLCSP 8-bump, 3.556 x 2.011 mm, wafer level chip scale package outline, in the 2M part.
That might route to fit under the TSSOP8, as an alternate fit proposition. Not sure if Parallax production can manage BGA packages ?

SaucySoliton · 2019-09-03 03:30

If anyone needed a FLiP with 256kB, it could be done by stacking 2 128kB EEPROMs. The appropriate address lines of the top EEPROM would go to 3.3v instead of the pins below. The soldering would need to be done under a microscope.

The FLiP is open source hardware, so if anyone needed more than a few it would be best to redesign. The buffer for the LEDs on pins 26 and 27 could be removed or relocated to make room for another TSSOP EEPROM. Or just add some pads on the top for SOICs.

It seems like the EEPROM address lines are being consumed on the larger EEPROMs. It might be possible that the FLiP's 64kB and a carrier board with 64kB and 128kB EEPROMs, configured appropriately, would be indistinguishable from a single 256kB EEPROM. In fact I wonder if the larger EEPROMs contain 2 dies inside their package.

Wingineer19 · 2019-09-03 05:34

jmg wrote: »

Looking at TSSOP8, the body is 4.4 x 3.0mm, and ST do make a WLCSP 8-bump, 3.556 x 2.011 mm, wafer level chip scale package outline, in the 2M part.
That might route to fit under the TSSOP8, as an alternate fit proposition. Not sure if Parallax production can manage BGA packages ?

The Atmel AT24CM02 is also available in a WLCSP package, so it would likely fit in the space used by the TSSOP8. But as you said, how would Parallax like to work with BGA packages?

I was considering buying some VS23S040D 4M SPI SRAMs because they offer an 8-bit databus. A single chip could replace the two QSPI SRAMs I'm using to run XMM. But I ran for cover when I saw it was only offered in a BGA package. SOIC would have been great, but I won't go near BGA.

SaucySoliton wrote: »

If anyone needed a FLiP with 256kB, it could be done by stacking 2 128kB EEPROMs. The appropriate address lines of the top EEPROM would go to 3.3v instead of the pins below. The soldering would need to be done under a microscope.

Technically, that should work fine. But when doing sequential reads one would need to be careful when approaching the boundary of each memory. The code would have to be smart enough to end the read cycle in one memory then restart it in the other. The appropriate safeguards would have to be included within the caching cog performing this task, but it shouldn't be hard to do.
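
To make the boundary handling concrete, here is a minimal sketch of how a read could be split so that no single I2C transaction runs past the end of one EEPROM. It assumes two 128KB parts stacked into one contiguous 256KB range. The names `split_read`, `struct chunk`, and `DEVICE_SIZE` are made up for illustration and are not part of Catalina or any existing caching cog.

```c
#include <stddef.h>
#include <stdint.h>

#define DEVICE_SIZE ((uint32_t)0x20000)   /* assumed: 128KB per stacked EEPROM */

/* One I2C read transaction that stays inside a single device. */
struct chunk {
    unsigned dev;      /* 0 = bottom EEPROM, 1 = top EEPROM */
    uint32_t offset;   /* starting byte offset within that device */
    uint32_t len;      /* number of bytes to read in this transaction */
};

/*
 * Split the linear range [addr, addr + len) into chunks that never cross
 * a device boundary. Writes at most max entries to out and returns the
 * number of entries written.
 */
size_t split_read(uint32_t addr, uint32_t len, struct chunk *out, size_t max)
{
    size_t n = 0;

    while (len > 0 && n < max) {
        uint32_t offset = addr % DEVICE_SIZE;
        uint32_t room   = DEVICE_SIZE - offset;   /* bytes left in this device */
        uint32_t take   = (len < room) ? len : room;

        out[n].dev    = (unsigned)(addr / DEVICE_SIZE);
        out[n].offset = offset;
        out[n].len    = take;
        n++;

        addr += take;
        len  -= take;
    }
    return n;
}
```

A 32-byte read starting at $0001_FFF0 would come back as two chunks: 16 bytes from the end of device 0, then 16 bytes from the start of device 1. The caller would end the first read cycle and issue a fresh start condition for the second.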

The whole 256KB EEPROM issue would be much easier to solve if the manufacturers offered it in the TSSOP8 package...

RossH · 2019-09-06 10:42

Hello @Wingineer19

Just so you know I am still working on it ...

I have my XEPROM and EEPROM loaders working again. I'm not actually sure they stopped working - I think I was just not remembering how to use them correctly.

In any case, I can now load XMM SMALL programs into EEPROMs up to 128KB (the largest EEPROM I have) and either execute them direct from the EEPROM (on platforms with no XMM RAM), or after loading the program into XMM RAM.

The next step is to start looking at what needs to change in the EEPROM loaders to support EEPROMs larger than 128KB.

Ross.

Wingineer19 · 2019-09-06 16:04

Hi @RossH,

I figured you've been very busy with not only adding txcheck() to the serial port library, but getting the XEPROM and EEPROM loaders working, figuring out how to support larger sized EEPROMs, updating Catalina to support P2, and dealing with whatever else life throws your way.

I'm currently using a modified USB Project Board for the XMM experiment I'm working on. I say modified because I added two SPI 512KB SRAMs and one 256KB EEPROM to this board. The two SRAMs are operating in a "Rampage2" mode (i.e. merging the 4-bit bus from each chip into a single 8-bit one). I completely removed the 64KB EEPROM and replaced it with that 256KB one.

I could have left the existing 64KB EEPROM on the board and then added the 256KB, but to avoid addressing conflicts I would have to move the 256KB into the next higher 256KB block.

This would have effectively given the following EEPROM physical memory map:
64KB EEPROM:  $0000_0000 to $0000_FFFF
256KB EEPROM: $0004_0000 to $0007_FFFF

As you can see there's quite a gap between the two. The EEPROM mapping isn't contiguous. I don't think your EEPROM loader would like that.
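
For anyone following along, here is a rough sketch of how a linear address in that split map could be turned into an I2C control byte and a 16-bit word address. It assumes the usual 24xx-series convention where address bits 16 and up ride in bits 1 to 3 of the control byte (so the 64KB part answers at $A0 and the 256KB part, with A2 tied high, answers at $A8 through $AE). The function and struct names are hypothetical, and this is not how the Catalina loader actually does it.

```c
#include <stdint.h>

/* Where a linear EEPROM address lands on the I2C bus. */
struct i2c_target {
    uint8_t  control;   /* 8-bit control byte for a write (R/W bit = 0) */
    uint16_t word;      /* 16-bit word address sent after the control byte */
};

/*
 * Decode a linear address in the split map:
 *   64KB EEPROM:  $0000_0000 to $0000_FFFF
 *   256KB EEPROM: $0004_0000 to $0007_FFFF
 * Returns 0 on success, or -1 if the address falls in the unpopulated gap
 * (or above the top of the map).
 */
int eeprom_decode(uint32_t addr, struct i2c_target *t)
{
    int in_64k  = (addr <= 0x0000FFFFUL);
    int in_256k = (addr >= 0x00040000UL && addr <= 0x0007FFFFUL);

    if (!in_64k && !in_256k)
        return -1;

    /* Assumed convention: address bits 18..16 go into control bits 3..1. */
    t->control = (uint8_t)(0xA0 | (((addr >> 16) & 0x7) << 1));
    t->word    = (uint16_t)(addr & 0xFFFF);
    return 0;
}
```

The gap shows up as the `-1` return for anything between $0001_0000 and $0003_FFFF, which is exactly the range a loader expecting contiguous memory would trip over.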

With the 64KB EEPROM completely removed, and the 256KB one in its place, the physical addressing is now $0000_0000 to $0003_FFFF, which the loader shouldn't have a problem with once it can handle the 256KB size. I know the propeller doesn't have any problem with this arrangement upon bootup.

I mention this because I'm wanting to replace the USB Project Board with a FLiP module. Looking at this module, however, it would be quite treacherous attempting to remove the 64KB EEPROM. Most likely the module would be damaged.

That means I need to keep the 64KB EEPROM and then add the 256KB one externally. This brings us back to this physical memory map:
64KB EEPROM:  $0000_0000 to $0000_FFFF
256KB EEPROM: $0004_0000 to $0007_FFFF

Which in turn circles back to the memory not being contiguous, and your EEPROM loader not knowing how to deal with it. Unless, of course, there's a way to tell the EEPROM loader how the EEPROM addressing is configured, just like you can with SRAMs and Flash memory within the Definition file.

If not, I suppose I could work around this problem by modifying the Flash API functions and essentially redirecting them to the 256KB EEPROM to mimic what your XEPROM effectively does. I could tell the Definition file where this "Flash" memory is mapped to, then use the FLASH loader when compiling the program.

I assume after bootup the XMM would be executing from EEPROM (using the Flash API functions) just like XEPROM would. That would be fine if that's what I want.

But what if I really wanted this "Flash" memory to be copied to SRAM and execute XMM from there? Will the FLASH loader let me do that if I tell build_utilities that both Flash and SRAM are present?

Just wanting to make sure I understand how the Loader works here:

If I want to execute XMM directly from EEPROM, I choose the XEPROM loader.

If I want to copy from EEPROM to SRAM, then execute XMM from SRAM, I choose the EEPROM loader.

If I want to execute XMM directly from Flash, then I choose the FLASH loader.

What if I wanted to copy from Flash to SRAM, then execute XMM from SRAM?

That last option is what I would most likely want if I write the Flash API functions to actually read from EEPROM instead of Flash...

Thoughts?

RossH · 2019-09-07 08:57

Wingineer19 wrote: »

Just wanting to make sure I understand how the Loader works here:

I'm afraid this gets a little complicated.

There are XMM APIs, Platforms, and 2nd and 3rd stage loaders (the 1st stage loader is always the loader built into the Propeller itself, which can load programs from the serial port or from EEPROM). And apart from the 1st stage loader, they all have to have a NAME to specify which one to use.

What complicates things is that sometimes the Platform name is the same as the XMM API name (because the XMM is built into the platform, such as on the C3) but sometimes it is not (e.g. the Propeller Memory Card or PMC, which can be used on either the QUICKSTART or DEMO platforms). Also, the 2nd and 3rd stage loaders tend to have the same name, but the 2nd stage loader is specified in the payload command line, whereas the 3rd stage loader has to be specified in the catalina command line.

If I want to execute XMM directly from EEPROM, I choose the XEPROM loader.

There is no XEPROM loader. There is only an XEPROM XMM API. You compile with -C XEPROM on the catalina command line, as you would with any other XMM API, then use the 2nd stage EEPROM loader on the payload command line.

If I want to copy from EEPROM to SRAM, then execute XMM from SRAM, I choose the EEPROM loader.

Yes. You compile with your XMM API (e.g. -C PMC) on the catalina command line, but also include the EEPROM 3rd stage loader in your program (by specifying -C EEPROM on the catalina command line). Then you specify the 2nd stage EEPROM loader on the payload command line.

If I want to execute XMM directly from Flash, then I choose the FLASH loader.

FLASH is both an extension to the XMM API as well as a 2nd stage Loader. So in addition to your XMM API name (e.g. -C PMC) you must also specify -C FLASH on the catalina command line to enable the Flash API, then also specify the FLASH 2nd stage loader on the payload command line. Just to complicate things further, there is also a separate FLASH_Boot utility, which is a stand-alone 2nd stage loader that you can program into EEPROM. This allows FLASH programs to be automatically executed on boot.

What if I wanted to copy from Flash to SRAM, then execute XMM from SRAM?

This is not supported, because it would not be particularly useful. However, if it was supported, it would require FLASH to also be the name of a 3rd stage loader.

The reason it is not particularly useful is that both FLASH and SRAM are serial, and are generally of similar speeds (at least for reading, if not writing), so you may as well just execute your code directly from the FLASH, and save your SRAM. Even if the FLASH were significantly slower than the SRAM, the difference would be reduced because you typically have to use the cache with both.
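
A back-of-envelope sketch of the cache argument: the average cost per access is the hit rate times the cache hit time, plus the miss rate times the time to fetch from the backing memory. The function name and every number mentioned below are made up purely for illustration. They are not measurements of Catalina, any FLASH part, or any SRAM part.

```c
/*
 * Average time per access through a cache.
 *   hit_rate : fraction of accesses served from the cache (0.0 to 1.0)
 *   t_hit    : time for an access that hits the cache
 *   t_miss   : time for an access that has to go to the backing memory
 * Units are arbitrary, as long as t_hit and t_miss use the same ones.
 */
double effective_time(double hit_rate, double t_hit, double t_miss)
{
    return hit_rate * t_hit + (1.0 - hit_rate) * t_miss;
}
```

With a hypothetical 95% hit rate, a hit cost of 1 unit, an SRAM miss cost of 20 units and a FLASH miss cost of 40 units, the averages work out to 1.95 and 2.95 units. So a backing memory that is twice as slow on a miss ends up only about 1.5 times slower overall, and the gap narrows further as the hit rate rises.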

That last option is what I would most likely want if I write the Flash API functions to actually read from EEPROM instead of Flash...

Thoughts?

I doubt that I will ever support non-contiguous EEPROMs, particularly if they require significantly different protocols to access - there would probably not be enough space for two sets of access code, which would make all the APIs and Loaders much more complex than they are now - and even now they do my head in whenever I have to revisit them!
