Well, not really. This actually cleared up the confusion I was having. When trying to understand the XMM load and execute process it's best not to confuse/conflate the loaders with the APIs. Thanks for the clarification.
I'm using the CUSTOM platform. So far I've only written the XMM SRAM API, and it works well. I haven't tackled the XMM FLASH API yet.
Let's see if I understand the discussion of Loaders and APIs correctly:
1. If I have FLASH installed, but no SRAM, then I could run a SMALL XMM program since the Code will be in Flash and the Data and Stack in HubRam. I couldn't run a LARGE XMM program because that would require the Data to be in XMM memory, and by definition Data implies read/write ability. If I include the Flash_Boot utility, the program will load and execute upon bootup.
2. If I have both FLASH and SRAM installed, then I could run either a SMALL or LARGE XMM program (my choice). If SMALL, the situation will be as in #1 above. If LARGE, the Code will execute from FLASH, the Data will be in SRAM, and the Stack still in HubRam. As above, if I include the Flash_Boot utility the program will load and execute upon bootup.
3. If I only have SRAM installed, I could run either a SMALL or LARGE XMM program (again my choice). In the SMALL case, Code will execute from SRAM, with Data and Stack in HubRam. In the LARGE case, both Code and Data will be in SRAM, with the Stack remaining in HubRam. Obviously the FLASH loader couldn't be used in this case, since there is no Flash memory. I can load and execute interactively using Code::Blocks, or the Catalina command line (indeed this is how I test and debug my program), but it can't load and execute upon bootup unless I use the EEPROM loader.
4. If I only have an EEPROM installed, I could run a SMALL XMM program if I include the XEPROM API. Code will execute from EEPROM, Data and Stack are in HubRam. Execution speed will be slower than using an SPI FLASH memory, but will be mitigated somewhat by using the Cache. The XMM program will load from EEPROM and execute upon bootup.
Those pretty much appear to be my XMM options.
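As a sanity check, the four cases above can be written down as a small lookup. This is only a Python sketch of the list as I understand it; `xmm_options` and its flag names are made up for illustration and are not part of Catalina.

```python
def xmm_options(flash=False, sram=False, eeprom_xeprom=False):
    """Return {mode: (code, data, stack)} for the installed memory.

    Mirrors cases 1-4 above. The flag names are hypothetical.
    """
    options = {}
    if flash:
        # Cases 1 and 2: code always executes from FLASH.
        options["SMALL"] = ("FLASH", "HubRam", "HubRam")
        if sram:
            # LARGE needs read/write XMM memory for data, hence SRAM.
            options["LARGE"] = ("FLASH", "SRAM", "HubRam")
    elif sram:
        # Case 3: SRAM only.
        options["SMALL"] = ("SRAM", "HubRam", "HubRam")
        options["LARGE"] = ("SRAM", "SRAM", "HubRam")
    elif eeprom_xeprom:
        # Case 4: EEPROM only, using the XEPROM API.
        options["SMALL"] = ("EEPROM", "HubRam", "HubRam")
    return options


for label, kwargs in [
    ("FLASH only", dict(flash=True)),
    ("FLASH + SRAM", dict(flash=True, sram=True)),
    ("SRAM only", dict(sram=True)),
    ("EEPROM only", dict(eeprom_xeprom=True)),
]:
    print(label, "->", sorted(xmm_options(**kwargs)))
```

In every case the Stack stays in HubRam; only the Code and Data locations change.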
Distilling all of this down, I can conclude:
1. If using the existing USB Project Board, with the 64KB EEPROM board removed, and the 256KB EEPROM and 512KB SRAM installed, I can run in either SMALL or LARGE memory mode. The program will be stored in EEPROM, then loaded and executed upon bootup. The Code will execute from SRAM, unless I choose the XEPROM API, in which case it will execute from EEPROM.
2. If using the FLiP module, and keeping its 64KB EEPROM, the options are more limited. I could add an external FLASH chip, in which case I could run XMM SMALL Code from the Flash. Or, I could add the external 256KB EEPROM, and re-write the FLASH API functions to access this EEPROM. That would mimic what your XEPROM API does, and would allow an XMM SMALL program to execute. Or, I could remove the 64KB EEPROM from the FLiP (Yikes!!) and install the 256KB EEPROM externally. That would allow me to run your XEPROM API natively (i.e. I wouldn't have to emulate this by re-writing the FLASH API). I could then add the external SRAMs. This would be identical to the features in the USB Project Board above, and now I could run either SMALL or LARGE Code, and have it stored in EEPROM.
Bottom line, if I don't want to remove the 64KB EEPROM from the FLiP, and want to run in XMM SMALL mode, then my choices are restricted to adding either an external FLASH chip, or the 256KB EEPROM and re-writing the FLASH API functions to work with it. In either case the FLASH loader will have to be used.
The EEPROM bus speed is considerably slower than the SPI FLASH, especially if the FLASH is running in Quad Mode (or if using two of them in the 8-bit "Rampage2" configuration). For speed the Flash chip would be the choice, but for reduced driver complexity the EEPROM would be the way to go.
If the FLiP module had a 128KB or 256KB EEPROM installed, then it could run XMM SMALL programs natively without requiring the installation of external memory chips. Just replace the existing EEPROM with a larger one and it's good to go. Maybe Parallax doesn't recognize that, or maybe they do but there's no big market demand for such a feature. It would be an additional selling point, though. But, I digress.
Looks like I need to do some experimenting on the FLiP with external FLASH and EEPROM to determine the best choice.
Or, just stick with the existing USB Project Board and continue writing my XMM program for it.
I don't suppose you've done any benchmark tests comparing XEPROM to FLASH (or SRAM) for speed and performance of the test code?
Let's see if I understand the discussion of Loaders and APIs correctly:
1. If I have FLASH installed, but no SRAM, then I could run a SMALL XMM program since the Code will be in Flash and the Data and Stack in HubRam. I couldn't run a LARGE XMM program because that would require the Data to be in XMM memory, and by definition Data implies read/write ability. If I include the Flash_Boot utility, the program will load and execute upon bootup.
Correct.
2. If I have both FLASH and SRAM installed, then I could run either a SMALL or LARGE XMM program (my choice). If SMALL, the situation will be as in #1 above. If LARGE, the Code will execute from FLASH, the Data will be in SRAM, and the Stack still in HubRam. As above, if I include the Flash_Boot utility the program will load and execute upon bootup.
Correct.
3. If I only have SRAM installed, I could run either a SMALL or LARGE XMM program (again my choice). In the SMALL case, Code will execute from SRAM, with Data and Stack in HubRam. In the LARGE case, both Code and Data will be in SRAM, with the Stack remaining in HubRam. Obviously the FLASH loader couldn't be used in this case, since there is no Flash memory. I can load and execute interactively using Code::Blocks, or the Catalina command line (indeed this is how I test and debug my program), but it can't load and execute upon bootup unless I use the EEPROM loader.
Correct.
4. If I only have an EEPROM installed, I could run a SMALL XMM program if I include the XEPROM API. Code will execute from EEPROM, Data and Stack are in HubRam. Execution speed will be slower than using an SPI FLASH memory, but will be mitigated somewhat by using the Cache. The XMM program will load from EEPROM and execute upon bootup.
Correct.
Those pretty much appear to be my XMM options.
Distilling all of this down, I can conclude:
1. If using the existing USB Project Board, with the 64KB EEPROM board removed, and the 256KB EEPROM and 512KB SRAM installed, I can run in either SMALL or LARGE memory mode. The program will be stored in EEPROM, then loaded and executed upon bootup. The Code will execute from SRAM, unless I choose the XEPROM API, in which case it will execute from EEPROM.
Correct, but this requires that I can get the EEPROM stage 2 and 3 loaders working with 256KB or 512KB EEPROMs. Currently, I believe they will only work with EEPROMs up to 128KB. The last option you mentioned also requires the XEPROM API, but I would leave that up to you - that should be easy enough given that you already have experience with writing XMM APIs.
2. If using the FLiP module, and keeping its 64KB EEPROM, the options are more limited. I could add an external FLASH chip, in which case I could run XMM SMALL Code from the Flash. Or, I could add the external 256KB EEPROM, and re-write the FLASH API functions to access this EEPROM. That would mimic what your XEPROM API does, and would allow an XMM SMALL program to execute. Or, I could remove the 64KB EEPROM from the FLiP (Yikes!!) and install the 256KB EEPROM externally. That would allow me to run your XEPROM API natively (i.e. I wouldn't have to emulate this by re-writing the FLASH API). I could then add the external SRAMs. This would be identical to the features in the USB Project Board above, and now I could run either SMALL or LARGE Code, and have it stored in EEPROM.
Correct. However, why would you add an external 256kB EEPROM just to store code when you can just add an external FLASH chip (with or without SRAM) and achieve pretty much the same results? Actually better, since you could probably find FLASH chips that were both larger and faster.
Bottom line, if I don't want to remove the 64KB EEPROM from the FLiP, and want to run in XMM SMALL mode, then my choices are restricted to adding either an external FLASH chip, or the 256KB EEPROM and re-writing the FLASH API functions to work with it. In either case the FLASH loader will have to be used.
Correct.
The EEPROM bus speed is considerably slower than the SPI FLASH, especially if the FLASH is running in Quad Mode (or if using two of them in the 8-bit "Rampage2" configuration). For speed the Flash chip would be the choice, but for reduced driver complexity the EEPROM would be the way to go.
No. The FLASH solution would be much easier for both of us, since Catalina's infrastructure already supports the FLASH solution for FLASH up to at least 4GB (the largest I have tested), but the EEPROM solution only for EEPROMs up to 128kb.
If the FLiP module had a 128KB or 256KB EEPROM installed, then it could run XMM SMALL programs natively without requiring the installation of external memory chips. Just replace the existing EEPROM with a larger one and it's good to go. Maybe Parallax doesn't recognize that, or maybe they do but there's no big market demand for such a feature. It would be an additional selling point, though. But, I digress.
I think Parallax, like me, would ask why you want to use EEPROM instead of FLASH just for code storage? The EEPROM on the P1 was only really intended to be used for boot and to store some data, not for code storage. The P2 platforms don't even have an EEPROM - they just have FLASH. (There may be plans to eventually add one - I am not sure. But if so, it would again be primarily for boot purposes, and perhaps to reduce cost - not to store code).
Looks like I need to do some experimenting on the FLiP with external FLASH and EEPROM to determine the best choice.
Or, just stick with the existing USB Project Board and continue writing my XMM program for it.
I don't suppose you've done any benchmark tests comparing XEPROM to FLASH (or SRAM) for speed and performance of the test code?
Not really. I think FLASH is faster, but that could be just the specific chips used in the various platforms I have. As I have said, any differences will tend to be mitigated by the use of the cache. As far as I can tell, the only downside to FLASH is that it is more expensive and wears out faster. But neither is likely to be a problem for small volumes where it is just used for code storage.
If the FLiP module had a 128KB or 256KB EEPROM installed, then it could run XMM SMALL programs natively without requiring the installation of external memory chips. Just replace the existing EEPROM with a larger one and it's good to go. Maybe Parallax doesn't recognize that, or maybe they do but there's no big market demand for such a feature. It would be an additional selling point, though. But, I digress.
…However, why would you add an external 256kB EEPROM just to store code when you can just add an external FLASH chip (with or without SRAM) and achieve pretty much the same results? Actually better, since you could probably find FLASH chips that were both larger and faster...
I think Parallax, like me, would ask why you want to use EEPROM instead of FLASH just for code storage? The EEPROM on the P1 was only really intended to be used for boot and to store some data, not for code storage. The P2 platforms don't even have an EEPROM - they just have FLASH. (There may be plans to eventually add one - I am not sure. But if so, it would again be primarily for boot purposes, and perhaps to reduce cost - not to store code).
Because going with the 128KB or 256KB EEPROM upgrade within the FLiP would allow much more code storage for something like XMM programs, and not require the use of external memory which would consume pins. Flash and/or SRAM on the P1 requires at least 4 pins in SPI mode, 6 pins in Quad Mode, or 10 pins if running in "Rampage2" mode (like my XMM code is doing now).
No doubt many users want to maximize the number of pins available. For my Project I'm not (yet) in a crisis mode due to the lack of pins, but I'm getting close to that threshold.
Additionally, the EEPROM upgrade would open up the possibility of executing XMM directly from the EEPROM using your XEPROM API, again without the need for external Flash or SRAMs.
As you mentioned, the speed degradation running XMM from EEPROM versus Flash/SRAM would be somewhat mitigated by using the Cache within HubRam. So one would have to consider the performance tradeoffs of running XMM from the EEPROM versus running it from external Flash/SRAM (and thus consuming pins) instead. Slower speed and more pins, or faster speed and fewer pins. The choice, as always, would depend upon the application.
Of course those not running XMM programs (and I think the vast, vast majority of P1 users aren't) would be indifferent to the additional memory because they wouldn't need it anyway. Fortunately the addition would be transparent to them because they would still be able to load and execute CMM/LMM code like they do now.
Indeed the EEPROM on the P1 was originally meant to just store bootup code. But I think the P1 capabilities and applications have been pushed far beyond what Parallax originally envisioned. Not until users got familiar with it were clever enhancements such as LMM mode (opening the door to C compilers), and then XMM after that, discovered.
From what I can tell the P2 uses SPI Flash for bootup. That right there opens the door to XMM on the P2 when you consider how SPI Flash and/or SRAM have already been added to the P1 to support XMM...
I am not sure the P2 will ever have much need for external XMM RAM. Yes, we might use the Flash for XMM code, but that is quite trivial ... and you have now added one more item to my TODO list, blast you!
However, insufficient pins is not really a limitation when it comes to XMM on the P1 - it just means you have to be a little inventive. There are XMM implementations that require NO extra pins - e.g. look at the C3, where all the SPI devices (SRAM, FLASH, SD etc) share the same SPI bus. If you already have a parallel bus for external SRAM, but are short of pins, then one additional pin is all that is required to support both FLASH and SRAM (the RamPage2 you mention uses two, but I think it only really needed one).
In other words, if you already have some kind of XMM SRAM, supporting XMM FLASH with a similar interface can be done at the cost of either zero or perhaps just one additional pin.
The more I think about it, the more I think FLASH is the correct solution. However, I still intend to have a look at what is required to support larger EEPROMs.
I am not sure the P2 will ever have much need for external XMM RAM. Yes, we might use the Flash for XMM code, but that is quite trivial ... and you have now added one more item to my TODO list, blast you!
Sorry, but I seem to have a way of inadvertently stirring up trouble. That's in addition to intentionally stirring it up.
However, insufficient pins is not really a limitation when it comes to XMM on the P1 - it just means you have to be a little inventive. There are XMM implementations that require NO extra pins - e.g. look at the C3, where all the SPI devices (SRAM, FLASH, SD etc) share the same SPI bus. If you already have a parallel bus for external SRAM, but are short of pins, then one additional pin is all that is required to support both FLASH and SRAM (the RamPage2 you mention uses two, but I think it only really needed one).
In other words, if you already have some kind of XMM SRAM, supporting XMM FLASH with a similar interface can be done at the cost of either zero or perhaps just one additional pin.
My "Rampage2" configuration currently uses 10 pins to talk to the dual SRAMs. 8 are for the databus, one is CLK, the other is CS. I could install dual Flash chips, tie them into the databus as well, and use one more pin for their CS, for a total of 11. This leaves me with a grand total of 2 remaining spare pins -- enough for an I2C bus for temperature sensors and battery monitor.
But, since I'm going to run XMM in SMALL memory mode anyway, I can drop the SRAMs entirely and just go with Flash, thus dropping it back down to 10 pins.
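Here is the pin arithmetic spelled out as a quick Python sketch. The pin counts come from the description above; the 13-pin budget is simply derived from "11 pins used, 2 spare" and is my own bookkeeping, not a hardware figure.

```python
# Pin counts taken from the description above.
DATA_BUS, CLK, CS = 8, 1, 1

configs = {
    "Rampage2 dual SRAM": DATA_BUS + CLK + CS,            # shared bus, one CS
    "dual SRAM + dual Flash": DATA_BUS + CLK + CS + CS,   # one extra CS for Flash
    "dual Flash only": DATA_BUS + CLK + CS,               # SRAMs dropped
}

# Derived assumption: 11 pins used with 2 spare means 13 pins are
# available for memory plus spares on this board.
BUDGET = configs["dual SRAM + dual Flash"] + 2

spare = {name: BUDGET - pins for name, pins in configs.items()}

for name, pins in configs.items():
    print(f"{name}: {pins} pins used, {spare[name]} spare")
```

Dropping the SRAMs and keeping only the dual Flash chips gets one spare pin back compared with running both.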
The more I think about it, the more I think FLASH is the correct solution.
For higher speed and availability of Flash drivers within Catalina, absolutely.
As an alternative, I'm taking a close look at the XMM EEPROM capability. Not only would it free up a lot of pins, but circuit complexity would drop, along with the real estate requirement on the carrier board.
I'm looking at freeing up a minimum of 6 pins, or 10 if I drop the "Rampage2" memory arrangement completely. The unknown is how much speed degradation my program will experience doing this.
I won't know until I try, so I think I'll take a crack at it this upcoming week and start writing the Flash API functions that work with the EEPROM. Once it's up and running, and my XMM program code is loaded and executed by the FLASH loader, I will finally have my answer.
If the speed is unacceptable, I can pivot and see how a single SPI Flash chip running in Quad Mode will perform. If that works, then I can free up 4 pins by omitting the second Flash chip.
However, I still intend to have a look at what is required to support larger EEPROMs.
Excellent. Stepping up to a single 256KB chip should be sufficient, as that is the largest size I could find on the market.
Although you can string different EEPROMs onto an I2C bus for a total of 512KB, including two of these 256KB ones, that would result in segmented memory blocks and the driver code would have to keep track of address boundaries for each one. Too much of a nuisance in my opinion. Personally, I wouldn't fool with this 512KB arrangement, unless you actually enjoy being punished.
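To show the kind of bookkeeping I mean, here is a rough Python sketch that splits one linear transfer into per-chip pieces for two 256KB EEPROMs. The function name and the (chip, offset, count) layout are hypothetical; a real driver would then issue one I2C transaction per piece.

```python
CHIP_SIZE = 256 * 1024  # bytes in one 256KB EEPROM


def split_by_chip(start, length, chip_size=CHIP_SIZE, chips=2):
    """Split the linear range [start, start+length) into (chip, offset, count)."""
    if start < 0 or length < 0 or start + length > chip_size * chips:
        raise ValueError("range falls outside the combined EEPROM space")
    pieces = []
    addr, remaining = start, length
    while remaining > 0:
        chip, offset = divmod(addr, chip_size)
        # Never let one piece run past the end of its chip.
        count = min(remaining, chip_size - offset)
        pieces.append((chip, offset, count))
        addr += count
        remaining -= count
    return pieces


# A 32-byte transfer that straddles the boundary between the two chips.
print(split_by_chip(0x3FFF0, 0x20))
```

Every read or write that crosses the 256KB boundary has to be broken up like this, which is the nuisance I would rather avoid.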
Right now I'm thinking my final code will fit between 128KB and 256KB, so it should be fine. If my code exceeds 256KB, then I should just move on to another microcontroller (like the P2 perhaps?).
Excellent. Stepping up to a single 256KB chip should be sufficient, as that is the largest size I could find on the market.
Odd. I've reviewed all the Catalina EEPROM code, and I can see no obvious reason why it would not work with your 256kb EEPROMs. I believe you are using AT24CM02 chips? These appear to be protocol compatible with all the 32kb, 64kb and 128kb EEPROMS that I have used.
All the code I have reviewed should handle EEPROMs up to 512kb without modification, although obviously I have only ever tested it on EEPROMs up to 128kb.
What I will do is write a simple EEPROM test program that uses the Catalina code but allows you to read/write to various addresses - either single bytes or pages - up to 512kb - and get you to run it on your hardware. Hopefully this will help pinpoint what is going wrong.
We are a bit busy here at the moment, so this may take me a week or so.
Odd. I've reviewed all the Catalina EEPROM code, and I can see no obvious reason why it would not work with your 256kb EEPROMs. I believe you are using AT24CM02 chips? These appear to be protocol compatible with all the 32kb, 64kb and 128kb EEPROMS that I have used.
Correct, the AT24CM02 is what I'm using. I've looked over the documentation for the 24LC512 and my AT24CM02 and I couldn't find anything obvious as to why it wouldn't work.
Add to this the fact that the propeller is happy with the AT24CM02 and has no problem storing, loading, then executing CMM code from it. So the protocols obviously are backwards compatible to the original 32KB EEPROM the propeller initially used, otherwise I would have hit a dead stop.
All the code I have reviewed should handle EEPROMs up to 512kb without modification, although obviously I have only ever tested it on EEPROMs up to 128kb.
What I will do is write a simple EEPROM test program that uses the Catalina code but allows you to read/write to various addresses - either single bytes or pages - up to 512kb - and get you to run it on your hardware. Hopefully this will help pinpoint what is going wrong.
Sounds like a great idea. I'll be more than happy to run the test and give you the results. If we're lucky it's something simple that we've overlooked and easy to fix.
We are a bit busy here at the moment, so this may take me a week or so.
Ross.
No problem. I expect to be neck deep this week writing Flash API functions to work with the EEPROM, as well as working on a new circuit board layout for this project.
Although I'm hoping the XMM EEPROM only approach works, I probably will include provisions on the carrier board for using the "Rampage2" configuration with dual Flash chips with my FLiP module. This clever dual chip arrangement does provide very reasonable XMM code execution speed.
While I was thinking about the EEPROM test program I am planning to write, I realized there is already a program that can read and write large EEPROMS which might give us a clue.
This program is the "Hydra Asset Manager", which can supposedly read and write EEPROMs up to 512KB. It was written for the Hydra, but I have modified it so it should work on your platform.
Download the attached Zip file and then follow these instructions:
1. Start the Hydra Asset Manager by unzipping the attached file somewhere
and then executing "HydraAssetManager_1_09_Final.exe"
2. In the bottom of the displayed window, select your EEPROM Size in the
"Memory Size (KB)" dropdown box to be 256.
3. Press the "Load HAM Driver" button at the top of the Window. Loading the
driver takes about 10 seconds. There is no confirmation when it is
complete.
4. Drag and Drop the file "256KB_EEPROM.eeprom" from a Windows Explorer
window to the "Memory Map" in the Hydra Asset Manager (the big black box).
5. Press the "Upload to Hydra" button. This can take a minute or so. A
dialog box will say "Programming complete" when the upload is finished.
6. Press the "Load HAM Driver" button at the top of the Window again.
Loading the driver takes about 10 seconds. There is no confirmation when
it is complete.
7. Press the "Download from Hydra" button. This can take a minute or so. A
dialog box telling you where the download has been saved will be displayed
when the download is complete.
8. Send me the downloaded file.
These instructions are also included in the file (in "Wingineer README.TXT").
If this works, I might be able to modify it to use my Catalina EEPROM code rather than the Hydra Asset Manager EEPROM code. If it doesn't, then there is something more fundamental going wrong.
Unfortunately the test didn't get very far. I made it to Step 5 and got this:
When I clicked the OK Button it said this:
I repeated the test multiple times, ensuring that I precisely followed the procedure, and always got the same results above.
I rebooted the computer just in case it was doing weird stuff, and got the exact same result.
I recycled the propeller board several times. Same result.
After the last failure I moved on to Step 6 anyway, and of course got this:
Maybe it doesn't like the Device Address Byte. But, there's nothing magical about it. The A2 device bit is the same as in the 24LC512. It's tied to ground on the AT24CM02, just like it is on the 24LC512.
The A1 and A0 device bits in the 24LC512 have been replaced with memory address bits A17 and A16 in the AT24CM02. If device bit A2, and memory address bits A17 and A16 are set to Zero within the Device Address Byte, then we should be able to access the first 64KB block of this EEPROM as determined by the First and Second Word Address Byte values.
Upon bootup, the propeller does indeed set these bits to Zero and proceeds with loading and executing my CMM program from the AT24CM02. Nor does it have any problem writing CMM programs to it. That much is proven.
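To make the bit layout concrete, here is a small Python sketch that builds the Device Address Byte and the two Word Address Bytes for an 18-bit AT24CM02 address, following the layout described above (1010, then A2, A17, A16, then R/W). The function name is my own invention.

```python
def at24cm02_address_bytes(addr, a2=0, read=False):
    """Return (device_byte, word_high, word_low) for an 18-bit memory address.

    Device byte layout: 1 0 1 0 A2 A17 A16 R/W
    """
    if not 0 <= addr <= 0x3FFFF:
        raise ValueError("AT24CM02 addresses are 18 bits (0 to $3FFFF)")
    device = (
        0xA0                            # fixed 1010 prefix
        | ((a2 & 1) << 3)               # hardware A2 pin (grounded here)
        | (((addr >> 16) & 0x3) << 1)   # A17 and A16 select the 64KB block
        | (1 if read else 0)            # R/W bit
    )
    return device, (addr >> 8) & 0xFF, addr & 0xFF


for a in (0x00000, 0x0FFFF, 0x10000, 0x3FFFF):
    dev, hi, lo = at24cm02_address_bytes(a)
    print(f"${a:05X} -> device ${dev:02X}, word ${hi:02X} ${lo:02X}")
```

The first 64KB block uses device byte $A0 for writes, which is the same byte a 24LC512 with all address pins grounded would respond to. That matches the fact that the bootloader works unchanged.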
If we construct a Fault Tree and work through it, maybe we can discover what is going wrong. Here are some possibilities:
1. Protocols. The AT24CM02 supports byte writes, page writes, current address read, random read, and sequential read, just like the 24LC512. So I don't think the problem resides within the protocols.
2. Timing. Here we have to look at setup time, rise time, hold time, and fall time. We can't control the rise or fall times. That depends upon how fast the prop can toggle the pins. Setup time and hold time do vary, depending upon if we're running in Standard Mode, Fast Mode, or Fast Mode Plus. These factors control the clock pulse width and the period, which in turn affects the data window. We might need to adjust some of the delay times in order to get it to work.
3. Page Writes. The AT24CM02 supports up to 256 bytes per page, compared to 128 bytes per page for the 24LC512. However, the AT24CM02 does support "partial page writes", so having it write 128 bytes per page shouldn't be an issue.
4. Page Write Time. The AT24CM02 specifies a maximum of 10ms per page write, while the 24LC512 specifies 5ms. If the code is only allowing 5ms maximum, there could be an issue if the AT24CM02 takes between 5ms and 10ms to complete.
5. Bus Ack. The specs say that the SDA must be released at the falling edge of the 8th clock pulse to permit an ACK/NACK to occur in time for the 9th clock pulse. What if there's some problem with this process?
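On the page-write item (3) above, here is a Python sketch of how a driver might split a buffer so that no single write crosses a page boundary. The function name is hypothetical and this is not the Catalina or HAM code, just an illustration of why 128-byte chunks should be harmless on a 256-byte-page device.

```python
def page_chunks(start, data, page_size=256):
    """Split data into (address, chunk) writes that never cross a page boundary."""
    chunks = []
    pos = 0
    while pos < len(data):
        addr = start + pos
        room = page_size - (addr % page_size)  # bytes left in this page
        chunk = data[pos:pos + room]
        chunks.append((addr, chunk))
        pos += len(chunk)
    return chunks


payload = bytes(range(256))

# Driver written for 128-byte pages (24LC512 style).
for addr, chunk in page_chunks(0x0000, payload, page_size=128):
    # Each 128-byte write also fits inside a single 256-byte page.
    fits = (addr % 256) + len(chunk) <= 256
    print(f"${addr:04X}: {len(chunk)} bytes, fits in 256-byte page: {fits}")
```

Because 128 divides 256 evenly, a write aligned for the smaller page can never wrap around inside the larger one.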
I'm assuming the prop has "push-pull" or "totem pole" logic on SCL (P28) and SDA (P29), so it appears to me that DIRA on SDA would have to switch to INPUT while SCL is still HI, not after dropping it to LO. Otherwise you risk a direct short on SDA if the EEPROM attempts to pull it low to ACK while it was still set to OUTPUT immediately after SCL drops low, but before you can make it an INPUT.
Does that make sense? I don't know how long after the 8th clock pulse drops low before the EEPROM would attempt an ACK/NACK response. If it's immediate then you risk the short circuit scenario if SDA was still set as an output. Conversely, it seems you could accidentally trigger a false Start or Stop condition if the SDA value changes while the SCL is still HI while switching SDA to an INPUT.
I guess whether or not such a condition is detected depends upon how fast and how often the EEPROM samples the SDA and SCL lines. I suppose that's the risk one takes when using "push-pull" logic instead of "open drain" as specified by the I2C protocol.
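For comparison, here is a toy Python model of how a true open-drain (wired-AND) line behaves. A common way to emulate this on a microcontroller pin is to leave the output latch at 0 and only toggle the pin direction, so the pin is either pulling low or released. Whether the drivers in question do that is something I haven't checked; the class below is purely illustrative.

```python
class OpenDrainLine:
    """Wired-AND bus line: reads high unless at least one device pulls it low."""

    def __init__(self):
        self._pulling_low = set()

    def pull_low(self, device):
        self._pulling_low.add(device)

    def release(self, device):
        self._pulling_low.discard(device)

    @property
    def level(self):
        return 0 if self._pulling_low else 1


sda = OpenDrainLine()

sda.pull_low("prop")      # master is sending a 0 as the last data bit
sda.pull_low("eeprom")    # EEPROM starts its ACK before the master lets go
overlap = sda.level       # both pulling low: still just a 0, no contention

sda.release("prop")       # master releases SDA for the 9th clock
ack = sda.level           # EEPROM still holds the line low: ACK

sda.release("eeprom")     # EEPROM lets go, pull-up returns the line high
idle = sda.level

print(overlap, ack, idle)
```

In this model nobody ever drives the line high, so the overlap case I was worried about cannot turn into a short. The risk only exists if the pin actively drives a 1.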
Anyway, here's what we do know: The EEPROM works fine with the prop during bootup. CMM code loads and executes fine. The fact that CMM code was loaded fine, also proves that it stored fine.
Whatever scheme the bootloader uses for setup time, hold time, clock period, page write, page write time, and reads (current, random, or sequential) works. So somewhere within all of this resides the answer to our question...
To start the troubleshooting effort, I would suggest changing the write cycle time to 10ms and, if possible, adjust the write page size to 256 bytes, and see if it works...
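For a feel of what those two settings cost, here is a back-of-the-envelope Python calculation of worst-case programming time. It only counts the write cycle waits from the figures above (5ms and 10ms) and ignores the I2C transfer time itself, so treat the numbers as rough lower bounds.

```python
def worst_case_program_seconds(total_bytes, page_size, write_cycle_ms):
    """Worst-case time spent waiting on page write cycles, in seconds."""
    pages = -(-total_bytes // page_size)  # ceiling division
    return pages * write_cycle_ms / 1000.0


EEPROM_BYTES = 256 * 1024

for page_size, cycle_ms in [(128, 5), (128, 10), (256, 10)]:
    t = worst_case_program_seconds(EEPROM_BYTES, page_size, cycle_ms)
    print(f"{page_size}-byte pages, {cycle_ms}ms per page: {t:.2f}s")
```

So allowing 10ms per page doubles the wait with 128-byte pages, and moving to 256-byte pages brings it back to where it started.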
If you load up tachyon it will be very easy to test your memory at different speeds interactively. Just saying.
btw, I don't have any 24M02s but I do have 24M01s that I can try out. Here is a quick interaction with EEPROM but built-in commands allow LOAD SAVE COPY FILL etc and it is very easy to construct special tests on the fly.
... $8000 $4000 $AA EFILL --> ok
... $8000 $80 EE DUMP -->
0000.8000: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8020: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8030: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8040: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8050: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8060: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8070: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
... $12345678 $8000 E! --> ok
... $8000 $20 EE DUMP -->
0000.8000: 78 56 34 12 AA AA AA AA AA AA AA AA AA AA AA AA xV4.............
0000.8010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
Unfortunately the test didn't get very far. I made it to Step 5 and got this:
Oops. I think that's my fault - what speed clock crystal do you have? The HAM driver needs to be compiled for the correct clock.
Ross.
EDIT: I have attached the correct HAM driver for a 5MHz clock, which is the most common - if that's your clock speed then replace the file of the same name with the one attached.
If you load up tachyon it will be very easy to test your memory at different speeds interactively. Just saying.
Hi Peter
Can you specify the exact commands we should use? I am not Forth literate, and when I try to use the commands you apparently used (e.g. "$8000 $80 EE DUMP"?) all I get is "???"
Perhaps you missed the previous command that wrote data to the EEPROM?
$8000 $4000 $AA EFILL
which I believe means:
Starting at address $8000 for $4000 bytes, using byte value $AA, fill a range in EEPROM
Then
$8000 $80 EE DUMP
Starting at address $8000 for $80 bytes from EEPROM display the contents
It's possible the EE DUMP sequence relies on some setup that the EFILL command can perform.
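If it helps to see the argument order without knowing Forth, here is my reading of those commands mimicked in Python on a plain byte array. This is not Tachyon code and the helper names are made up; the little-endian byte order for E! is taken from the dump Peter posted (78 56 34 12).

```python
# Pretend EEPROM: 64KB of erased bytes (the $FF fill is an assumption).
eeprom = bytearray(b"\xFF" * 0x10000)


def efill(start, count, value):
    """Like `start count value EFILL`: fill a range with one byte value."""
    eeprom[start:start + count] = bytes([value]) * count


def store_long(value, addr):
    """Like `value addr E!`: store a 32-bit value, least significant byte first."""
    eeprom[addr:addr + 4] = value.to_bytes(4, "little")


def dump_line(addr):
    """Format 16 bytes the way the EE DUMP output appears in the thread."""
    row = eeprom[addr:addr + 16]
    hex_part = " ".join(f"{b:02X}" for b in row)
    text = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in row)
    return f"{addr >> 16:04X}.{addr & 0xFFFF:04X}: {hex_part} {text}"


efill(0x8000, 0x4000, 0xAA)        # $8000 $4000 $AA EFILL
store_long(0x12345678, 0x8000)     # $12345678 $8000 E!
print(dump_line(0x8000))           # $8000 $20 EE DUMP (first row)
print(dump_line(0x8010))
```

The parameters come first and the command word comes last, which is probably the part that trips up those of us who don't speak Forth.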
The Tachyon 5.4 kernel is just that, a kernel, and you can get by with it too. But the kernel can be extended simply by pasting source code modules down the terminal and the first one that makes it a standard expanded Tachyon is EXTEND.FTH. If you paste this source in with a line delay of 12ms or so it will extend the kernel with a great many capabilities including the EEPROM words and also use that same capability to autosave itself back to EEPROM so it won't be lost.
However to make it easier I will append the latest binary to this post shortly. Older binaries are also available in the Dropbox folder.
Well, it tells me that neither Catalina nor the HAM driver is able to program a 256KB EEPROM.
I was hoping this would just work (the authors of the HAM program apparently thought it would!) so that I could have some working code to start from.
Back to the drawing board!
Peter's code may work. When he posts it we could try that.
Bummer!
I was looking over the EEPROM Spin driver code and I noticed this:
LAST_ADDRESS = $1FFFF
Could that be an issue or was it changed to $3FFFF before you recompiled?
Not only that - I looked through the code after it failed and found that the loader assumes EEPROMs are 128KB or less, even though HAM allows you to specify sizes up to 512KB.
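For reference, the relationship between EEPROM size and that constant is simple enough to check in a couple of lines of Python (the helper name is just for illustration):

```python
def last_address(size_kb):
    """Highest valid byte address for an EEPROM of the given size in KB."""
    return size_kb * 1024 - 1


for size_kb in (128, 256, 512):
    print(f"{size_kb}KB -> LAST_ADDRESS = ${last_address(size_kb):05X}")
```

So $1FFFF corresponds to a 128KB device, which is consistent with the loader's 128KB assumption, and a 256KB device would need $3FFFF.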
BTW, if you haven't had a look in the Dropbox binaries folder, here is a new binary with EXTEND preloaded.
I will also update my sig to point to the latest binary.
EDIT: I forgot to add FAT32 file handlers to this binary I posted but I will update the Dropbox version shortly.
NOTE: DUMP can be temporarily redirected to dump memory other than hub RAM. Modifier methods include EE SF SD FS etc. Variations of DUMP to format as words, longs, ASCII etc are DUMPB DUMPW DUMPL DUMPA DUMPAW DUMPC
That looks really interesting. I'm guessing you have 10k pullups that are too weak and slow for your breadboard or something. Try slowing down the EEPROM access to 100kHz with this:
REVECTOR EESPEED I2C100
You might have to hit enter twice but repeat your tests again and it probably will work.
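As a rough illustration of why weak pull-ups matter, here is a Python estimate of I2C rise time using the usual RC approximation (30% to 70% rise time is about 0.8473 x R x C) against the maximum rise times in the I2C specification. The 100pF bus capacitance is purely an assumed figure for a breadboard setup, not a measurement.

```python
# Maximum SDA/SCL rise times from the I2C-bus specification, in nanoseconds.
MAX_RISE_NS = {
    "Standard (100 kHz)": 1000,
    "Fast (400 kHz)": 300,
    "Fast-mode Plus (1 MHz)": 120,
}


def rise_time_ns(pullup_ohms, bus_pf):
    """Approximate 30%-70% rise time of an RC-loaded open-drain line."""
    return 0.8473 * pullup_ohms * bus_pf * 1e-12 * 1e9


def modes_ok(pullup_ohms, bus_pf):
    """List the I2C modes whose rise-time limit this pull-up can meet."""
    t = rise_time_ns(pullup_ohms, bus_pf)
    return [mode for mode, limit in MAX_RISE_NS.items() if t <= limit]


ASSUMED_BUS_PF = 100  # hypothetical breadboard capacitance

for r in (10_000, 4_700, 1_000):
    t = rise_time_ns(r, ASSUMED_BUS_PF)
    print(f"{r} ohm: ~{t:.0f} ns rise -> OK for {modes_ok(r, ASSUMED_BUS_PF)}")
```

With those assumed numbers a 10k pull-up only meets the 100kHz limit, which would fit the symptom of things working once the bus is slowed down.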
Bingo, it looks like you hit the jackpot!
REVECTOR EESPEED I2C100
... $8000 $80 $AA EFILL --> ok
... $8000 $80 EE DUMP -->
0000.8000: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8020: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8030: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8040: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8050: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8060: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0000.8070: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
... $12345678 $8000 E! --> ok
... $8000 $20 EE DUMP -->
0000.8000: 78 56 34 12 AA AA AA AA AA AA AA AA AA AA AA AA xV4.............
0000.8010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
...
Now, can we incrementally step up the speed until it stops working?
Great, what value pull-ups are you using? You can easily go down to 1k without any problems.
Btw, you can change the speed easily but I think it's a hardware issue.
$10000 $80 $AA EFILL --> ok
... $10000 $80 EE DUMP -->
0001.0000: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0020: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0030: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0040: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0050: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0060: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0001.0070: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
... $12345678 $10000 E! --> ok
... $10000 $20 EE DUMP -->
0001.0000: 78 56 34 12 AA AA AA AA AA AA AA AA AA AA AA AA xV4.............
0001.0010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
...
$20000 $80 $AA EFILL --> ok
... $20000 $80 EE DUMP -->
0002.0000: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0020: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0030: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0040: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0050: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0060: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................
0002.0070: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
... $12345678 $20000 E! --> ok
... $20000 $20 EE DUMP -->
0002.0000: 78 56 34 12 AA AA AA AA AA AA AA AA AA AA AA AA xV4.............
0002.0010: AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA AA ................ ok
...
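The dumps above show the long $12345678 written by E! landing in EEPROM as 78 56 34 12, i.e. least significant byte first. A minimal host-side Python sketch of that byte order follows; `dump_line` is a hypothetical helper that only imitates the general layout of the DUMP output, and the exact spacing is an assumption.

```python
import struct

def dump_line(addr, data):
    """Format up to 16 bytes roughly like the DUMP lines shown above (spacing assumed)."""
    hex_part = " ".join(f"{b:02X}" for b in data)
    # Printable ASCII is shown as-is; everything else becomes a dot.
    ascii_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)
    return f"{addr >> 16:04X}.{addr & 0xFFFF:04X}: {hex_part} {ascii_part}"

# Simulate: fill 16 bytes with $AA, then store a 32-bit long little-endian.
memory = bytearray([0xAA] * 16)
memory[0:4] = struct.pack("<I", 0x12345678)

print(dump_line(0x8000, memory))
```

The first four bytes come out as 78 56 34 12, and the ASCII column starts with "xV4." because $78, $56 and $34 are printable while $12 is not.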
Comments
Ouch, my head hurts
Well, not really. This actually cleared up the confusion I was having. When trying to understand the XMM load and execute process it's best not to confuse/conflate the loaders with the APIs. Thanks for the clarification.
I'm using the CUSTOM platform. So far I've only written the XMM SRAM API, and it works well. I haven't tackled the XMM FLASH API yet.
Let's see if I understand the discussion of Loaders and APIs correctly:
1. If I have FLASH installed, but no SRAM, then I could run a SMALL XMM program since the Code will be in Flash and the Data and Stack in HubRam. I couldn't run a LARGE XMM program because that would require the Data to be in XMM memory, and by definition Data implies read/write ability. If I include the Flash_Boot utility, the program will load and execute upon bootup.
2. If I have both FLASH and SRAM installed, then I could run either a SMALL or LARGE XMM program (my choice). If SMALL, the situation will be as in #1 above. If LARGE, the Code will execute from FLASH, the Data will be in SRAM, and the Stack still in HubRam. As above, if I include the Flash_Boot utility the program will load and execute upon bootup.
3. If I only have SRAM installed, I could run either a SMALL or LARGE XMM program (again my choice). In the SMALL case, Code will execute from SRAM, with Data and Stack in HubRam. In the LARGE case, both Code and Data will be in SRAM, with the Stack remaining in HubRam. Obviously the FLASH loader couldn't be used in this case, since there is no Flash memory. I can load and execute interactively using Code::Blocks, or the Catalina command line (indeed this is how I test and debug my program), but it can't load and execute upon bootup unless I use the EEPROM loader.
4. If I only have an EEPROM installed, I could run a SMALL XMM program if I include the XEPROM API. Code will execute from EEPROM, Data and Stack are in HubRam. Execution speed will be slower than using an SPI FLASH memory, but will be mitigated somewhat by using the Cache. The XMM program will load from EEPROM and execute upon bootup.
Those pretty much appear to be my XMM options.
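The four cases above can be restated as a small lookup. This is only a host-side Python sketch of my understanding as listed; `xmm_options` and its tuple layout are hypothetical names for illustration and are not part of Catalina.

```python
def xmm_options(flash, sram):
    """Return (memory model, where Code runs, where Data lives) for the parts installed.

    The Stack stays in HubRam in every case, so it is not listed.
    """
    options = []
    if flash:
        options.append(("SMALL", "FLASH", "HubRam"))
        if sram:
            # LARGE needs read/write XMM memory for Data, so it requires SRAM.
            options.append(("LARGE", "FLASH", "SRAM"))
    elif sram:
        options.append(("SMALL", "SRAM", "HubRam"))
        options.append(("LARGE", "SRAM", "SRAM"))
    else:
        # EEPROM only: Code executes from EEPROM via the XEPROM API.
        options.append(("SMALL", "EEPROM", "HubRam"))
    return options

for flash, sram in [(True, False), (True, True), (False, True), (False, False)]:
    print(f"flash={flash} sram={sram}: {xmm_options(flash, sram)}")
```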
Distilling all of this down, I can conclude:
1. If using the existing USB Project Board, with the 64KB EEPROM board removed, and the 256KB EEPROM and 512KB SRAM installed, I can run in either SMALL or LARGE memory mode. The program will be stored in EEPROM, then loaded and executed upon bootup. The Code will execute from SRAM, unless I choose the XEPROM API, in which case it will execute from EEPROM.
2. If using the FLiP module, and keeping its 64KB EEPROM, the options are more limited. I could add an external FLASH chip, in which case I could run XMM SMALL Code from the Flash. Or, I could add the external 256KB EEPROM, and re-write the FLASH API functions to access this EEPROM. That would mimic what your XEPROM API does, and would allow an XMM SMALL program to execute. Or, I could remove the 64KB EEPROM from the FLiP (Yikes!!) and install the 256KB EEPROM externally. That would allow me to run your XEPROM API natively (i.e. I wouldn't have to emulate this by re-writing the FLASH API). I could then add the external SRAMs. This would be identical to the features in the USB Project Board above, and now I could run either SMALL or LARGE Code, and have it stored in EEPROM.
Bottom line, if I don't want to remove the 64KB EEPROM from the FLiP, and want to run in XMM SMALL mode, then my choices are restricted to adding either an external FLASH chip, or the 256KB EEPROM and re-writing the FLASH API functions to work with it. In either case the FLASH loader will have to be used.
The EEPROM bus speed is considerably slower than the SPI FLASH, especially if the FLASH is running in Quad Mode (or if using two of them in the 8-bit "Rampage2" configuration). For speed the Flash chip would be the choice, but for reduced driver complexity the EEPROM would be the way to go.
If the FLiP module had a 128KB or 256KB EEPROM installed, then it could run XMM SMALL programs natively without requiring the installation of external memory chips. Just replace the existing EEPROM with a larger one and it's good to go. Maybe Parallax doesn't recognize that, or maybe they do but there's no big market demand for such a feature. It would be an additional selling point, though. But, I digress.
Looks like I need to do some experimenting on the FLiP with external FLASH and EEPROM to determine the best choice.
Or, just stick with the existing USB Project Board and continue writing my XMM program for it.
I don't suppose you've done any benchmark tests comparing XEPROM to FLASH (or SRAM) for speed and performance of the test code?
Not really. I think FLASH is faster, but that could be just the specific chips used in the various platforms I have. As I have said, any differences will tend to be mitigated by the use of the cache. As far as I can tell, the only downside to FLASH is that it is more expensive and wears out faster. But neither is likely to be a problem for small volumes where it is just used for code storage.
Because going with the 128KB or 256KB EEPROM upgrade within the FLiP would allow much more code storage for something like XMM programs, and not require the use of external memory which would consume pins. Flash and/or SRAM on the P1 requires at least 4 pins in SPI mode, 6 pins in Quad Mode, or 10 pins if running in "Rampage2" mode (like my XMM code is doing now).
No doubt many users want to maximize the number of pins available. For my Project I'm not (yet) in a crisis mode due to the lack of pins, but I'm getting close to that threshold.
Additionally, the EEPROM upgrade would open up the possibility of executing XMM directly from the EEPROM using your XEPROM API, again without the need for external Flash or SRAMs.
As you mentioned, the speed degradation running XMM from EEPROM versus Flash/SRAM would be somewhat mitigated by using the Cache within HubRam. So one would have to consider the performance tradeoffs of running XMM from the EEPROM versus running it from external Flash/SRAM (and thus consuming pins) instead. Slower speed with more pins left free, or faster speed with fewer pins left free. The choice, as always, would depend upon the application.
Of course those not running XMM programs (and I think the vast, vast majority of P1 users aren't) would be indifferent to the additional memory because they wouldn't need it anyway. Fortunately the addition would be transparent to them because they would still be able to load and execute CMM/LMM code like they do now.
Indeed the EEPROM on the P1 was originally meant to just store bootup code. But I think the P1 capabilities and applications have been pushed far beyond what Parallax originally envisioned. Not until users got familiar with it were clever enhancements such as LMM mode (opening the door to C compilers), and then XMM after that, discovered.
From what I can tell the P2 uses SPI Flash for bootup. That right there opens the door to XMM on the P2 when you consider how SPI Flash and/or SRAM have already been added to the P1 to support XMM...
But, since I'm going to run XMM in SMALL memory mode anyway, I can drop the SRAMs entirely and just go with Flash, thus dropping it back down to 10 pins. For higher speed and availability of Flash drivers within Catalina, absolutely.
As an alternative, I'm taking a close look at the XMM EEPROM capability. Not only would it free up a lot of pins, but circuit complexity would drop, along with the real estate requirement on the carrier board.
I'm looking at freeing up a minimum of 6 pins, or 10 if I drop the "Rampage2" memory arrangement completely. The unknown is how much speed degradation my program will experience doing this.
I won't know until I try, so I think I'll take a crack at it this upcoming week and start writing the Flash API functions that work with the EEPROM. Once it's up and running, and my XMM program code is loaded and executed by the FLASH loader, I will finally have my answer.
If the speed is unacceptable, I can pivot and see how a single SPI Flash chip running in Quad Mode will perform. If that works, then I can free up 4 pins by omitting the second Flash chip. Excellent. Stepping up to a single 256KB chip should be sufficient, as that is the largest size I could find on the market.
Although you can string different EEPROMs onto an I2C bus for a total of 512KB, including two of these 256KB ones, that would result in segmented memory blocks and the driver code would have to keep track of address boundaries for each one. Too much of a nuisance in my opinion. Personally, I wouldn't fool with this 512KB arrangement, unless you actually enjoy being punished.
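To make the bookkeeping concrete, here is a rough host-side Python sketch of what a driver would have to do to treat several EEPROMs as one linear address space. `locate` and the chip sizes are hypothetical; this is not taken from any existing driver.

```python
from bisect import bisect_right
from itertools import accumulate

def locate(addr, chip_sizes):
    """Map a linear address onto (chip index, offset within that chip)."""
    ends = list(accumulate(chip_sizes))  # exclusive end address of each chip
    if not 0 <= addr < ends[-1]:
        raise ValueError(f"address ${addr:X} is outside the installed EEPROMs")
    chip = bisect_right(ends, addr)      # first chip whose end lies beyond addr
    start = ends[chip] - chip_sizes[chip]
    return chip, addr - start

# Two 256KB parts, as in the 512KB arrangement described above.
sizes = [0x40000, 0x40000]
for addr in (0x00000, 0x3FFFF, 0x40000, 0x7FFFF):
    print(f"${addr:05X} -> chip {locate(addr, sizes)[0]}, offset ${locate(addr, sizes)[1]:05X}")
```

Any read or write that straddles a chip boundary would additionally have to be split into two transfers, which is the nuisance mentioned above.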
Right now I'm thinking my final code will fit between 128KB and 256KB, so it should be fine. If my code exceeds 256KB, then I should just move on to another microcontroller (like the P2 perhaps?).
Odd. I've reviewed all the Catalina EEPROM code, and I can see no obvious reason why it would not work with your 256kb EEPROMs. I believe you are using AT24CM02 chips? These appear to be protocol compatible with all the 32kb, 64kb and 128kb EEPROMS that I have used.
All the code I have reviewed should handle EEPROMs up to 512kb without modification, although obviously I have only ever tested it on EEPROMs up to 128kb.
What I will do is write a simple EEPROM test program that uses the Catalina code but allows you to read/write to various addresses - either single bytes or pages - up to 512kb - and get you to run it on your hardware. Hopefully this will help pinpoint what is going wrong.
We are a bit busy here at the moment, so this may take me a week or so.
Ross.
Add to this the fact that the propeller is happy with the AT24CM02 and has no problem storing, loading, then executing CMM code from it. So the protocols obviously are backwards compatible to the original 32KB EEPROM the propeller initially used, otherwise I would have hit a dead stop.
Sounds like a great idea. I'll be more than happy to run the test and give you the results. If we're lucky it's something simple that we've overlooked and easy to fix.
No problem. I expect to be neck deep this week writing Flash API functions to work with the EEPROM, as well as working on a new circuit board layout for this project.
Although I'm hoping the XMM EEPROM only approach works, I probably will include provisions on the carrier board for using the "Rampage2" configuration with dual Flash chips with my FLiP module. This clever dual chip arrangement does provide very reasonable XMM code execution speed.
While I was thinking about the EEPROM test program I am planning to write, I realized there is already a program that can read and write large EEPROMS which might give us a clue.
This program is the "Hydra Asset Manager", which can supposedly read and write EEPROMs up to 512KB. It was written for the Hydra, but I have modified it so it should work on your platform.
Download the attached Zip file and then follow these instructions:
These instructions are also included in the file (in "Wingineer README.TXT").
If this works, I might be able to modify it to use my Catalina EEPROM code rather than the Hydra Asset Manager EEPROM code. If it doesn't, then there is something more fundamental going wrong.
Ross.
Unfortunately the test didn't get very far. I made it to Step 5 and got this:
When I clicked the OK Button it said this:
I repeated the test multiple times, ensuring that I precisely followed the procedure, and always got the same results above.
I rebooted the computer just in case it was doing weird stuff, and got the exact same result.
I recycled the propeller board several times. Same result.
After the last failure I moved on to Step 6 anyway, and of course got this:
Maybe it doesn't like the Device Address Byte. But, there's nothing magical about it. The A2 device bit is the same as in the 24LC512. It's tied to ground on the AT24CM02, just like it is on the 24LC512.
The A1 and A0 device bits in the 24LC512 have been replaced with memory address bits A17 and A16 in the AT24CM02. If device bit A2, and memory address bits A17 and A16 are set to Zero within the Device Address Byte, then we should be able to access the first 64KB block of this EEPROM as determined by the First and Second Word Address Byte values.
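As a sanity check on that bit layout, here is a minimal host-side Python sketch that builds the three address bytes from an 18-bit memory address. It assumes the usual 1010 control code in the upper nibble of the Device Address Byte; the function name is hypothetical.

```python
def at24cm02_address_bytes(addr, a2=0, read=False):
    """Return (device address byte, first word address byte, second word address byte)."""
    if not 0 <= addr <= 0x3FFFF:
        raise ValueError("the AT24CM02 holds 256KB, so addresses run $00000..$3FFFF")
    device = (
        0xA0                           # 1010 control code (assumed)
        | ((a2 & 1) << 3)              # A2 device bit, tied to ground here
        | (((addr >> 16) & 0x3) << 1)  # A17 and A16 replace the A1 and A0 device bits
        | int(read)                    # R/W bit
    )
    return device, (addr >> 8) & 0xFF, addr & 0xFF

for addr in (0x8000, 0x10000, 0x20000, 0x3FFFF):
    dev, hi, lo = at24cm02_address_bytes(addr)
    print(f"${addr:05X}: device=${dev:02X} word=${hi:02X} ${lo:02X}")
```

A driver that always sends $A0 as the device byte can only ever reach the first 64KB block, whatever the word address bytes say.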
Upon bootup, the propeller does indeed set these bits to Zero and proceeds with loading and executing my CMM program from the AT24CM02. Nor does it have any problem writing CMM programs to it. That much is proven.
If we construct a Fault Tree and work through it, maybe we can discover what is going wrong. Here are some possibilities:
1. Protocols. The AT24CM02 supports byte writes, page writes, current address read, random read, and sequential read, just like the 24LC512. So I don't think the problem resides within the protocols.
2. Timing. Here we have to look at setup time, rise time, hold time, and fall time. We can't control the rise or fall times. That depends upon how fast the prop can toggle the pins. Setup time and hold time do vary, depending upon if we're running in Standard Mode, Fast Mode, or Fast Mode Plus. These factors control the clock pulse width and the period, which in turn affects the data window. We might need to adjust some of the delay times in order to get it to work.
3. Page Writes. The AT24CM02 supports up to 256 bytes per page, compared to 128 bytes per page for the 24LC512. However, the AT24CM02 does support "partial page writes", so having it write 128 bytes per page shouldn't be an issue.
4. Page Write Time. The AT24CM02 specifies a maximum of 10ms per page write, while the 24LC512 specifies 5ms. If the code is only allowing 5ms maximum, there could be an issue if the AT24CM02 takes between 5ms and 10ms to complete.
5. Bus Ack. The specs say that the SDA must be released at the falling edge of the 8th clock pulse to permit an ACK/NACK to occur in time for the 9th clock pulse. What if there's some problem with this process?
I'm assuming the prop has "push-pull" or "totem pole" logic on SCL (P28) and SDA (P29), so it appears to me that DIRA on SDA would have to switch to INPUT while SCL is still HI, not after dropping it to LO. Otherwise you risk a direct short on SDA if the EEPROM attempts to pull it low to ACK while it was still set to OUTPUT immediately after SCL drops low, but before you can make it an INPUT.
Does that make sense? I don't know how long after the 8th clock pulse drops low before the EEPROM would attempt an ACK/NACK response. If it's immediate then you risk the short circuit scenario if SDA was still set as an output. Conversely, it seems you could accidentally trigger a false Start or Stop condition if the SDA value changes while the SCL is still HI while switching SDA to an INPUT.
I guess whether or not such a condition is detected depends upon how fast and how often the EEPROM samples the SDA and SCL lines. I suppose that's the risk one takes when using "push-pull" logic instead of "open drain" as specified by the I2C protocol.
Anyway, here's what we do know: The EEPROM works fine with the prop during bootup. CMM code loads and executes fine. The fact that CMM code was loaded fine, also proves that it stored fine.
Whatever scheme the bootloader uses for setup time, hold time, clock period, page write, page write time, and reads (current, random, or sequential) works. So somewhere within all of this resides the answer to our question...
To start the troubleshooting effort, I would suggest changing the write cycle time to 10ms and, if possible, adjust the write page size to 256 bytes, and see if it works...
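For reference, here is a rough host-side Python sketch of how a write could be broken into page writes that never cross a page boundary, with the page size as a parameter so 128 and 256 bytes can be compared. The function name and the 10ms figure used for the worst-case estimate are illustrative assumptions, not the actual loader code.

```python
def split_into_page_writes(addr, data, page_size=256):
    """Split data into (start address, chunk) pairs that each stay inside one page."""
    chunks = []
    offset = 0
    while offset < len(data):
        start = addr + offset
        room = page_size - (start % page_size)  # bytes left in the current page
        n = min(room, len(data) - offset)
        chunks.append((start, bytes(data[offset:offset + n])))
        offset += n
    return chunks

payload = bytes([0xAA] * 0x200)  # 512 bytes starting part-way into a page
for size in (128, 256):
    writes = split_into_page_writes(0x80F0, payload, page_size=size)
    worst_case_ms = len(writes) * 10  # allowing 10ms per page write cycle
    print(f"page size {size}: {len(writes)} page writes, up to {worst_case_ms}ms")
```

Writing 128-byte chunks to a 256-byte-page part is still legal as a partial page write; it just costs more write cycles.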
btw, I don't have any 24M02s but I do have 24M01s that I can try out. Here is a quick interaction with EEPROM but built-in commands allow LOAD SAVE COPY FILL etc and it is very easy to construct special tests on the fly. ...
Oops. I think that's my fault - what speed clock crystal do you have? The HAM driver needs to be compiled for the correct clock.
Ross.
EDIT: I have attached the correct HAM driver for a 5MHz clock, which is the most common - if that's your clock speed then replace the file of the same name with the one attached.
Thanks Peter. Good suggestion.
Oh, I'm using the USB Project Board with the 5MHz crystal.
And what Peter is suggesting looks intriguing!
Hi Peter
Can you specify the exact commands we should use? I am not Forth literate, and when I try to use the commands you apparently used (e.g. "$8000 $80 EE DUMP"?) all I get is "???"
Hi @Wingineer - I just edited my previous post to include the correct HAM driver. Refresh this thread and you should see it.
Perhaps you missed the previous command that wrote data to the EEPROM?
which I believe means:
Starting at address $8000 for $4000 bytes, using byte value $AA, fill a range in EEPROM
Then Starting at address $8000 for $80 bytes from EEPROM display the contents
It's possible the EE DUMP sequence relies on some setup that the EFILL command can perform.
OK, I will grab it and try it out...
Yes! It worked!
The test result file is enclosed. Hopefully it provides some clues as to what is going on.
Well, it tells me that neither Catalina nor the HAM driver is able to program a 256KB EEPROM.
I was hoping this would just work (the authors of the HAM program apparently thought it would!) so that I could have some working code to start from.
Back to the drawing board!
Peter's code may work. When he posts it we could try that.
Bummer!
I was looking over the EEPROM Spin driver code and I noticed this:
LAST_ADDRESS = $1FFFF
Could that be an issue or was it changed to $3FFFF before you recompiled?
Not only that - I looked through the code after it failed and found that the loader assumes EEPROMs are 128KB or less, even though HAM allows you to specify sizes up to 512KB.
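For what it's worth, the LAST_ADDRESS constant quoted above follows directly from the EEPROM size. A small Python sketch of the arithmetic (the helper name is hypothetical):

```python
def last_address(size_kb):
    """Highest valid byte address for an EEPROM of the given size in KB."""
    return size_kb * 1024 - 1

for size_kb in (128, 256, 512):
    print(f"{size_kb}KB -> LAST_ADDRESS = ${last_address(size_kb):05X}")
```

So $1FFFF corresponds to a 128KB part, and a 256KB part needs $3FFFF.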
OK, I was able to upload the binary you just posted to my prop board and save it to EEPROM.
Here's the result of this simple test:
I don't know, but it doesn't appear to have written to the EEPROM correctly?
One of them is 10K, the other is 4.7K, but I don't remember the particular order. I'll have to ohm it out.
Here's the latest test, which I assume is at 400kHz:
I guess my next order of business is to replace the pullups with 1K resistors and then try to go Fast Mode Plus (1MHz)...
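A rough way to see why the pull-up value matters: the I2C specification limits the bus rise time to 1000ns in Standard Mode, 300ns in Fast Mode and 120ns in Fast Mode Plus, and the 30% to 70% rise time of an RC bus is about 0.8473 x R x C. The Python sketch below applies that to the resistor values mentioned in this thread. The 100pF bus capacitance is purely an assumed figure for a breadboard, not a measurement, and the function names are hypothetical.

```python
# Maximum rise times from the I2C specification, fastest mode first (seconds).
MODES = [
    ("Fast Mode Plus (1MHz)", 120e-9),
    ("Fast Mode (400kHz)", 300e-9),
    ("Standard Mode (100kHz)", 1000e-9),
]

def rise_time(pullup_ohms, bus_capacitance_farads):
    """Approximate 30%-70% rise time of an RC-limited bus line."""
    return 0.8473 * pullup_ohms * bus_capacitance_farads

def fastest_mode(pullup_ohms, bus_capacitance_farads):
    """Name of the fastest mode whose rise-time limit is met, or None."""
    t = rise_time(pullup_ohms, bus_capacitance_farads)
    for name, limit in MODES:
        if t <= limit:
            return name
    return None

ASSUMED_BUS_CAPACITANCE = 100e-12  # assumption: 100pF of wiring and pin load

for ohms in (10_000, 4_700, 1_000):
    t_ns = rise_time(ohms, ASSUMED_BUS_CAPACITANCE) * 1e9
    print(f"{ohms} ohm: ~{t_ns:.0f}ns rise time -> {fastest_mode(ohms, ASSUMED_BUS_CAPACITANCE)}")
```

Under that assumed capacitance, both the 10K and 4.7K resistors only meet the Standard Mode limit, while 1K leaves margin for Fast Mode Plus. The real numbers depend on the actual bus capacitance.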