What is your typical hub ram usage like?
mark
Posts: 252
I'm trying to come up with ideas for interesting P1 Verilog mods and where efficiency improvements could possibly be made. One thing I'd like to know is how your hub ram is typically used?
1) Big code for one cog
2) Big code for one cog and big data
3) Big code for multiple cogs (how many?)
4) Big code for multiple cogs and big data
5) Big data
6) ____________
Out of the 8 available cogs, how often can you fit executing code into a cog itself? How likely can your code fit in a cog if you used C/PASM but you chose to use SPIN out of convenience?
Comments
1) Tools limitations
2) IO Pins
3) Tools limitations
4) Hub memory
5) Tools limitations
6) Frequency
7) Tools limitations
If you doubled the number of IO pins, then I'd start to be pressed for cores, but I'd run out of memory first. And control-intensive programs sufficiently large to require that memory are well beyond the limits of the existing tools.
Typical application - number collected today:
Cores used: 5
Cores not running spin: 3
Source lines of code in application: 4723 in 11 modules
Source lines in library code: 3107 in 12 modules
Binary size: 6840 longs
Aggregate IO rate: 4 @ 150K baud half duplex
2 @ 38400 baud full duplex
1 @ 4800 baud half duplex
128 @ 35 kHz PWM
1 @ 20 kHz SPI
Those source line counts are just textual line counts. My software is very declarative and carefully documented, but because Spin syntax limits statements to a single source line, I'd say that the density is about the same as one might expect of C. The library modules are all my own and get lots of reuse - but they're bastardized all to hell because there's no link-time dead code elimination and code space is at such a premium.
Because there are no link maps, it's really hard to know the size of the assembly language, but the three assembly language drivers aren't close to hitting their 496 word limit. I'd guess two are at 25% and one is at 75%.
Even doubled, the per-core memory isn't really interesting for control-intensive algorithms. Just like in network hardware, it helps to think of Data Plane and Control Plane as very different beasts that require very different solutions. The Data Plane is hard real time and throughput intensive. The Control Plane is flow controlled and endless corner cases.
While it would be trivial to double the per-core memory size, beyond that, the architecture just has no real legs. The problem is balance among IO Pins, Hub memory, core count and frequency. The size of the per-core memory just doesn't make the list. But the tools - now there's a place that could use some energy.
You should try bst or homespun. They both have conditional compilation and IIRC both (at least bst) have dead code removal.
While bst is no longer supported, there are plenty of us using it.
I am unsure about the new OpenSpin and SimpleIDE - perhaps someone can comment.
This post has 87 views and only one actual response to the question?
All code is GCC C++. The memory footprint is 94% without debugging enabled.
The three CMM cogs are the interesting ones. They're effectively three loops that monitor buffers and shared variables, and each is too large to fit into a single cog (even 4x size, my guess). This particular application simply streams data from sensors through the Propeller and into the SD card. To accommodate that I use a circular buffer of 900 bytes. The rest of the data in the hub is just housekeeping variables.
Openspin does not remove dead code. It's the same in most respects as the PropellerTool's compiler.
SimpleIDE when using PropellerGCC will remove dead code if Enable Pruning is enabled in the project manager.
1. Has 512KB SRAM and runs Catalina XMM in one cog, 2 cogs IIRC run FAT16/32 SD access, 1 cog modified FDX serial.
2. 4 cogs used, 3 cogs have modified FDX serial (PASM), 1 cog arbiter/forwarder (spin)
3. 3 cogs used, 1 scans a keyboard, 1 drives an LCD, 1 has a modified FDX serial. All PASM.
Other projects use 512KB SRAM and a few cogs but have only 2 I/O free.
So many of my projects use 2 props.
Summary: Short of pins and/or hub ram.
So it sounds like you run some big code, but generally just out of one cog while the rest are PASM drivers, and perhaps big data for the LCD buffer/tile map?
Main cog is always spin so code in Hub. Sometimes a second spin Cog but usually all other in PASM.
NOT using BST I often resort to commenting out unused functions in Spin objects. Saves a lot of longs.
Data used for the PASM cog image mostly gets reused as a buffer at runtime.
Enjoy!
Mike
does your spin program tend to be big, or is your memory primarily used by data?
When you run C code from sram via XMM, does that pipe the code directly to the cog running it, or is it cached in hub memory?
Is it even possible to use a Propeller without external RAM expansion?
Can you please offer a bit more detail to your usage? Generally, what does your hub ram tend to be used for primarily?
Back in the day we spent a lot of time developing a Z80 emulation to run on the Propeller, see ZiCog. As did PullMoll, see qz80. Others have been doing 6502 emulations and so on.
You can pretty much fit an entire Intel 8085 emulation in a single COG. Which is pretty amazing. So it was always very frustrating that in order to make a full CP/M-running machine out of a Prop you had to add external RAM. Which uses all the pins up and slows everything down horribly. We were always craving that 128K Prop where 64K could be used for the emulated machine address space and there is room for a VGA text terminal and other supporting software.
Then there was Zog, a ZPU emulation that could run code compiled with GCC on the Prop. That was before we had propgcc. ZPU code is not the smallest so HUB space was very limiting for it.
Since then all my Prop programming has been pretty small scale, hardly projects, more just experiments, so it has not worried me much. When I want space there are the STM32 F4 ARM chips, or it's time to move up to a Raspberry Pi or whatever.