Large programs in C++

Dr_Acula · 2012-08-01 23:00

I've got stuck thinking about large programs and I would be grateful for some advice.

The fundamental problem is - where do you store a large program and how does a program access the same storage?

Let's say the program is stored in external ram. If your code wants to access that same ram, you have to disable the cache driver.

Ditto if the program is running from an SD card. You have to disable the cache driver to access the SD.

There are solutions and there is code to put the cache to sleep. But that is specialised C code and won't work with the Spin To C converter. Also it implies that the memory driver sits in hub (eg SD driver), and if you have devoted hub ram to a cache and also to a screen buffer there may not be much room left.

The more I think about this, the more I wonder if the simplest solution is two separate types of memory. One is for the code, and one is for data. It might be two SD cards. It might be two ram chips (though they can't share the same pin so more likely serial than parallel). It might be a hybrid - eg flash ram and an SD card.

I wonder which is the best? For data storage, sram will be better than flash. But sram is usually parallel and parallel uses up lots of pins. Serial sram could be an option. For running the code, is flash better or is an SD card better? SD cards are probably the cheapest per gigabyte, but then again, a program is not going to be a gigabyte as it will take ages to download. On the other hand, the removable nature of SD cards could be the answer.

Or would it be better to have an SD card for the program, an SD card for storage of bitmaps and text files, and some serial sram for data (so you can free up hub space for more cache)?

Or is flash better?

I'd really appreciate everyone's thoughts.

rod1963 · 2012-08-01 23:28

An alternative to flash is MRAM, which you can use either as SRAM or as non-volatile storage. It's probably a budget buster though: $7.62 for an SPI 128K x 8 part or $15.04 for a 512K x 8.

David Betz · 2012-08-02 05:09

Dr_Acula wrote: »

Let's say the program is stored in external ram. If your code wants to access that same ram, you have to disable the cache driver.

Ditto if the program is running from an SD card. You have to disable the cache driver to access the SD.

This isn't completely true. The Propeller GCC cache drivers all try to let go of their SPI pins when they are not running. That means that the only thing you need to do in order to access the same device for another purpose is to make sure that the lowest level code that accesses the device doesn't cause any cache misses. This is easily arranged by putting that particular function in hub memory. The function can be very small so it doesn't need to take up a lot of precious hub space. I'll try to dig out an example of doing this.

Also, Propeller GCC already has code in it to share the SD card between the filesystem and the SD cache driver.

Dr_Acula · 2012-08-02 06:39

That sounds interesting David.

I need to be thinking clearly about this, because what would be great would be to think of a way to take an existing Spin program and convert it to C and somehow preserve the SD part. So you might have a Spin program with fsrw or Kye's driver, and the function calls are things like "open" and "read n longs" and "close". So one could maybe think about a pre-processor that takes that spin code and translates it to C handles, and then do a spintoC conversion and maybe paste back in that C code.

I've tried a number of times to start writing a large GUI operating system for the touchscreens but there are a few problems I have come up against, and one of them is the integration of pasm with a higher level language. I found some solutions for catalina by writing my own custom IDE but I think a better answer is to write code in Spin as this makes the entire obex available.

I'll think about this some more as I'm still not quite sure I can explain exactly what the problem is! Suffice to say I think there is a solution that is tantalisingly close here.

jazzed · 2012-08-02 07:29

Dr_Acula wrote: »

Suffice to say I think there is a solution that is tantalisingly close here.

Solutions are:

Don't try to run XMMC from SD card
Let go of SPIN and use C/C++ entirely

The Spin2Cpp converter is great for converting spin to C++, but I don't think it was intended to be a Spin/C++ daily language swap tool. Spin is designed for using HUB RAM only - whatever you can load there will run. Spin2Cpp won't give you big spin unless you let go of trying to run from SD card.

ersmith · 2012-08-02 12:39

jazzed wrote: »

Solutions are:
Don't try to run XMMC from SD card

Let go of SPIN and use C/C++ entirely

Or, write the main program in C/C++ and use SPIN (converted by spin2cpp) just for the objects you need from ObEx.

Or, manually edit your C code (and/or the spin2cpp output) to add __attribute__((section(".hubtext"))) to the low level routines that need to access SDRAM; by putting them in the HUB memory you can make sure the cache COG is not conflicting with them.
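
A minimal sketch of the hub-placement idea, assuming the GCC section attribute and the ".hubtext" section name mentioned above. The IN_HUB macro and the shared_read routine are hypothetical names chosen for illustration, and the __PROPELLER__ guard is an assumption so that the same file still builds on a desktop compiler:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: Propeller GCC predefines __PROPELLER__. On any other
   compiler the macro expands to nothing so the sketch still builds. */
#ifdef __PROPELLER__
#define IN_HUB __attribute__((section(".hubtext")))
#else
#define IN_HUB
#endif

/* Hypothetical low-level routine: copy n bytes out of a shared device
   window. Placed in hub memory, its own instructions never have to be
   fetched through the cache driver while it is touching the device. */
IN_HUB void shared_read(volatile const uint8_t *device, uint8_t *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        dst[i] = device[i];
    }
}
```

Only the routine that actually touches the shared pins needs the attribute, so the hub cost is a few dozen bytes rather than a whole driver.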

Eric

Heater. · 2012-08-02 12:51

From now on my solution to running large C or C++ programs is as follows: Run them on a Raspberry Pi or similar ARM board. I'm sure there will be more like that very soon. Link that to a Propeller and use the Propeller for what it is good at, real world, real time interfaces.
For 30 dollars or so the ARM board gives you ethernet, USB, SD card, easy WIFI if you want it, etc etc.
As you know, I like to squeeze a quart into a pint pot but at some point you have to give up.

David Betz · 2012-08-02 12:57

Heater. wrote: »

From now on my solution to running large C or C++ programs is as follows: Run them on a Raspberry Pi or similar ARM board. I'm sure there will be more like that very soon. Link that to a Propeller and use the Propeller for what it is good at, real world, real time interfaces.
For 30 dollars or so the ARM board gives you ethernet, USB, SD card, easy WIFI if you want it, etc etc.
As you know, I like to squeeze a quart into a pint pot but at some point you have to give up.

Maybe Chip will give us an on-chip ARM core in Propeller 3!

Heater. · 2012-08-02 13:23

David,
Yes, that idea has crossed my mind a while back.
Who says all COGs have to be COGs? One of them could be an ARM with similar access into HUB space but otherwise running from a gigabyte of RAM.

jazzed · 2012-08-02 13:32

I mentioned adding an ARM core to Chip once. He stared at me blankly for a moment and moved on to another subject.

Heater. · 2012-08-02 13:48

Try again!
Thing is, as an educational tool introducing the uninitiated to programming and electronics and the general operation of a CPU, the Propeller is totally brilliant.
This kind of introduction to computing is exactly what the Raspberry Pi is all about.
I do wonder, though, how well that idea can work out with the basic mechanics of the ARM hardware buried under a ton of Linux and closed source graphics and other drivers. It all gets too complex.
On the other hand, in the industrial world the Prop soon runs out of steam. By the time you have to move to C/C++ to get the job done it's far too big to fit and mind-bendingly slow.
Conclusion: ARM with access to HUB on the same chip is perfect.
Saves me building the "Propeller Pi" board:)

David Betz · 2012-08-02 13:54

Heater. wrote: »

Try again!

That's probably a good idea. The last time this was brought up was probably before the value of having an industry-standard C/C++ compiler was recognized. Having a processor designed to run that kind of language to handle the boring main loop code and a bunch of COGs to handle all of the interfacing to non-standard hardware would be wonderful.

Heater. · 2012-08-02 14:01

Also, sad to say, but looking at Propellers and Pis as educational tools or just toys, the Prop now has an uphill struggle in simple terms of price, when something like a C3 board is four or five times the price of a Pi with a lot less functionality.
I don't want to come over all negative. I think the Prop is great at what it does and Parallax is one cool company.

David Betz · 2012-08-02 14:10

Heater. wrote: »

Also, sad to say, but looking at Propellers and Pis as educational tools or just toys, the Prop now has an uphill struggle in simple terms of price, when something like a C3 board is four or five times the price of a Pi with a lot less functionality.
I don't want to come over all negative. I think the Prop is great at what it does and Parallax is one cool company.

Yes, the Raspberry Pi is inexpensive. It's also almost impossible to buy at the moment as far as I can tell. Also, it's based on an SoC where many of the interesting interfaces (like video and graphics) are proprietary and require an NDA to get documentation. At least the Propeller is open. Besides, the ARM is just a boring traditional processor. Yes, it gets the work done but it isn't interesting. Let it do the boring high-level code and do the interesting stuff on a COG!

Heater. · 2012-08-02 14:19

Impossible to buy? Perhaps. I have one:) Demand has been huge, far bigger than expected: 500,000 so far, with a million expected by Christmas. I imagine Parallax could only dream of that for the Prop when it was launched.
As for the rest of your post, I totally agree.

Dr_Acula · 2012-08-02 18:19

heater said

From now on my solution to running large C or C++ programs is as follows: Run them on a Raspberry Pi or similar ARM board. I'm sure there will be more like that very soon. Link that to a Propeller and use the Propeller for what it is good at, real world, real time interfaces.

Can the pi run a touchscreen? I think there are some cool things we haven't done yet with the prop, and I still think it can be done cheaper than the Pi if you compare like with like (ie mass produced surface mount).

Re dual SD, I just realised that with a dual touchscreen you get two SD sockets for free, and with 3 wire links I think I could use the second SD card.

However, with all the great ideas above I can see some other solutions.

A question re C - can you put inline pasm in a C program? I know there is something called GAS, but is there a pasm solution? If not, I'll have a think about that because there is one for catalina and it should not be too hard to replicate for C++.

jazzed said

Don't try to run XMMC from SD card
Let go of SPIN and use C/C++ entirely

Great points. First one has got me thinking - with the 1 meg of sram on the boards I am using, it should be possible to run the program from external ram via the cache. The drivers are pretty simple - write n longs or read n longs. The way they work is based on Cluso's design where you have a cog waiting for a command and the command is to do things like read or write. The code hangs until the cog has finished. That means there is automatically no conflict issue. If the cache driver is fetching a block then it won't let the program continue. If the program is fetching a block then the cache driver will have already got its block. It is fully deterministic because the cog is being used in a non parallel way in this instance. If someone wants to write a driver I should be able to find a spare freebie board.
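
The command-and-wait handshake described above can be modelled in plain C roughly as follows. This is only a sketch under stated assumptions: the mailbox layout, the command codes and every function name are hypothetical, the external SRAM is replaced by an ordinary array, and the driver cog is reduced to a function that performs one pass of its loop so the model can run single-threaded on a desktop machine:

```c
#include <stdint.h>
#include <string.h>

enum { CMD_IDLE = 0, CMD_READ = 1, CMD_WRITE = 2 };

/* Hypothetical mailbox shared between the caller and the driver cog. */
typedef struct {
    volatile uint32_t cmd;  /* CMD_IDLE means the driver has finished */
    uint32_t ram_addr;      /* byte offset into external RAM */
    uint8_t *hub_addr;      /* buffer in hub memory */
    uint32_t len;           /* number of bytes to move */
} mailbox_t;

static uint8_t ext_ram[1024];  /* stand-in for the external SRAM */

/* Caller side: fill in the request, then write cmd last to publish it. */
void mailbox_post(mailbox_t *mb, uint32_t cmd, uint32_t ram_addr,
                  uint8_t *hub_addr, uint32_t len)
{
    mb->ram_addr = ram_addr;
    mb->hub_addr = hub_addr;
    mb->len = len;
    mb->cmd = cmd;
}

/* Caller side: nonzero while the driver still owns the request. */
int mailbox_busy(const mailbox_t *mb)
{
    return mb->cmd != CMD_IDLE;
}

/* Driver side: one pass of the loop the driver cog would run forever. */
void driver_poll(mailbox_t *mb)
{
    /* Ignore requests that would run past the end of the RAM model. */
    if (mb->len <= sizeof ext_ram && mb->ram_addr <= sizeof ext_ram - mb->len) {
        if (mb->cmd == CMD_READ) {
            memcpy(mb->hub_addr, &ext_ram[mb->ram_addr], mb->len);
        } else if (mb->cmd == CMD_WRITE) {
            memcpy(&ext_ram[mb->ram_addr], mb->hub_addr, mb->len);
        }
    }
    mb->cmd = CMD_IDLE;  /* signal completion to the waiting caller */
}
```

On real hardware the caller would spin on mailbox_busy() after posting, which is the "code hangs until the cog has finished" behaviour: only one request is ever in flight, so two users of the same mailbox cannot overlap on the bus.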

Point 2 - use C. I agree, and ultimately the aim would be to take the entire obex, run each one through the spintoc converter and then use these objects in C as C code. However, there could still be instances where one wants to tweak the pasm code and change it every compile/download which is where I think it could be useful to have code with inline pasm rather than the cog code being .h files with arrays in hex.

jazzed · 2012-08-02 19:54

Dr_Acula wrote: »

Great points. First one has got me thinking - with the 1 meg of sram on the boards I am using, it should be possible to run the program from external ram via the cache.

Just use part of it for cache and part of it for buffering. You can share a lock efficiently to arbitrate the bus because cache uses block transfers, and cache locks only when a swap is necessary.
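
As a rough illustration of arbitrating the bus with a shared lock, here is a portable sketch that uses a C11 atomic_flag as a stand-in for a Propeller hardware lock (the LOCKSET/LOCKCLR mechanism). The function names are hypothetical and this is not the Propeller GCC cache driver's actual code:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a hardware lock: a clear flag means the bus is free. */
static atomic_flag bus_lock = ATOMIC_FLAG_INIT;

/* Try to take the bus. Returns true if the caller now owns it. */
bool bus_try_acquire(void)
{
    return !atomic_flag_test_and_set(&bus_lock);
}

/* Give the bus back so the other user can proceed. */
void bus_release(void)
{
    atomic_flag_clear(&bus_lock);
}

/* Hypothetical block transfer: the lock is held for one whole block,
   the way a cache line swap would hold it, and released immediately
   afterwards. Returns false without copying if the bus was busy. */
bool bus_copy_block(unsigned char *dst, const unsigned char *src, unsigned n)
{
    unsigned i;
    if (!bus_try_acquire()) {
        return false;  /* bus busy: the caller retries later */
    }
    for (i = 0; i < n; i++) {
        dst[i] = src[i];
    }
    bus_release();
    return true;
}
```

Because the lock is only held for the length of one block, the program's own buffer accesses wait at most one block transfer behind a cache swap.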

I don't understand the need to take the entire obex and run it through the spin2cpp converter.

Trying to port a spin program that uses fsrw or kyedos to run on xmmc from sd card is abusing spin2cpp; it is an unreasonable thing to do.

Using converted programs in LMM is fine, but non-trivial programs will be too big until Eric finishes his compression scheme.

Dr_Acula · 2012-08-02 21:51

Thanks jazzed, good advice there.

Where would I find the ram driver code for the dracblade board? ( c:\propgcc\which folder?) I think someone wrote a driver for that board and I am hoping I can modify it for the touchburger ram.

I'm looking for the primitives that do things like move data in and out of the cache. It should be simpler than the SD cache as it won't need an SD driver.

David Betz · 2012-08-03 03:58

Dr_Acula wrote: »

Thanks jazzed, good advice there.

Where would I find the ram driver code for the dracblade board? ( c:\propgcc\which folder?) I think someone wrote a driver for that board and I am hoping I can modify it for the touchburger ram.

I'm looking for the primitives that do things like move data in and out of the cache. It should be simpler than the SD cache as it won't need an SD driver.

It's in the propgcc Google Code repository in the propgcc/loader/spin/dracblade_cache.spin file but I thought that other guy who was working with you already got a cache driver working on your new board?

Heater. · 2012-08-03 05:29

Dr_A,

Can the pi run a touchscreen?

Sure. Can a Prop play HD videos on an HDMI TV?:)

I think there are some cool things we haven't done yet with the prop...

No doubt about it.

...and I still think it can be done cheaper than the Pi if you compare like with like (ie mass produced surface mount).

Depends on what it is, horses for courses as always.
Want lots of I/O pins? The Prop is your device.
Want real time control of said pins? The Prop is your device.
Want low standby power consumption? The Prop is your device.
Want large C++ programs that don't run at a snail's pace? Let's get the Prop some assistance with that.

Dr_Acula · 2012-08-03 05:35

It's in the propgcc Google Code repository in the propgcc/loader/spin/dracblade_cache.spin file but I thought that other guy who was working with you already got a cache driver working on your new board?

Ah that might explain things, I don't seem to have a propgcc/loader directory. I'll see if it is on the web somewhere...

Addit: found it http://code.google.com/p/propgcc/source/browse/loader/spin/dracblade_cache.spin?name=v0_2_1

That should work. So replace this

BSTART
        mov     address,vmaddr          ' get the high address byte
        shr     address,#16             ' shift right by 16 places
        and     address,#$FF            ' ensure rest of bits zero
        mov     HighLatch,address       ' put value into HighLatch
        mov     dira,LatchDirection     ' setup active pins 138 and bus
        mov     outa,HighLatch          ' send out HighLatch
        or      outa,HighAddress        ' or with the high address
        andn    outa,GateHigh           ' set gate low
        or      outa,GateHigh           ' set the gate high again
        mov     ptr, hubaddr            ' hubaddr = hub page address
        mov     address, vmaddr
        mov     count, line_size
BSTART_RET
        ret

'----------------------------------------------------------------------------------------------------
'
' BREAD
'
' vmaddr is the virtual memory address to read
' hubaddr is the hub memory address to write
' count is the number of longs to read
'
' trashes count, ptr
'
'----------------------------------------------------------------------------------------------------

BREAD
        call    #BSTART
rdloop  call    #read_memory_byte       ' read byte from address into data_8
        wrbyte  data_8,ptr              ' write data_8 to hubaddr ie copy byte to hub
        add     ptr,#1                  ' add 1 to hub address
        add     address,#1              ' add 1 to ram address
        djnz    count,#rdloop           ' loop until done
BREAD_RET
        ret

with this (plus a few more lines not shown)

pasmramtohub            call    #get_values             ' get hubaddr,ramaddr,len and set control pins
                        and     dira,maskP16P31         '%11111111_11111111_00000000_00000000 inputs
                        andn    outa,maskP16            ' memory /rd low
ramtohub_loop           mov     data_16,ina             ' get the data
                        wrword  data_16,hubaddr         ' move data to hub
                        andn    outa,maskP20            ' clock 161 low
                        or      outa,maskP20            ' clock 161 high
                        add     hubaddr,#2              ' increment the hub address 
                        djnz    len,#ramtohub_loop
                        or      outa,maskP16            ' memory /rd high  
                        jmp     #init                   ' tristate pins and listen for commands
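
For readers less familiar with PASM, the two read loops above can be modelled in C roughly like this. It is only a sketch: external RAM is represented by plain arrays, the pin and latch toggling is left out, and the function names are hypothetical.

```c
#include <stdint.h>

/* High address byte that BSTART latches: bits 16..23 of the address. */
uint32_t high_latch(uint32_t vmaddr)
{
    return (vmaddr >> 16) & 0xFF;
}

/* Byte-wide copy, one external access per byte, like rdloop. */
void read_bytes(const uint8_t *ram, uint32_t addr, uint8_t *hub, uint32_t count)
{
    while (count-- > 0) {
        *hub++ = ram[addr++];
    }
}

/* Word-wide copy, one external access per 16-bit word, like
   ramtohub_loop, where the hub address advances by 2 per iteration. */
void read_words(const uint16_t *ram, uint32_t word_addr, uint16_t *hub, uint32_t len)
{
    while (len-- > 0) {
        *hub++ = ram[word_addr++];
    }
}
```

The word-wide version needs half as many loop iterations for the same amount of data, which is the main attraction of the 16-bit bus.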

Ok, so say we have some code written, what needs to be recompiled and how do you do this?

David Betz · 2012-08-03 06:23

Dr_Acula wrote: »

Ok, so say we have some code written, what needs to be recompiled and how do you do this?

Once you've added your own BREAD and BWRITE functions you just need to use bstc with the -c option to compile the driver and write a .dat file. Then move that to the directory with all of the other drivers (usually /opt/parallax/propeller-load or maybe C:\propgcc\propeller-load). You'll also have to create a .cfg file that names your new driver and defines any parameters needed by the driver such as pins if the driver is configurable.

jazzed · 2012-08-05 18:45

Here's a copy of the cache code I have for the touch screen board Joe sent. Joe wrote most of it.
I suppose some changes are necessary for the new board.

I'm sure there was a thread where Joe and I discussed this and posted versions, but I can't find that now.

Dr_Acula · 2012-08-05 19:48

Thanks jazzed, that is brilliant.

Joe and I have a few things to sort out as we have changed the schematic so the driver needs changing. But that code will be a great framework as most things are just going to be cut and paste from the new driver code.

In order to get this to work we have groups of pins, so a possible scenario is that the C program is talking to the finger input of the touchscreen/serial port etc and in order to do a cache fetch it may need to formally shut down things using those pins. Possibly even unload a cog then reload it again. Some of that code is a few lines of spin and I note the attached driver is all pasm, so will we need to translate it to pasm?

If yes, then it can be done, just need to think more about the code. At the moment, group 1 and 2 are for the display and fast ram access, group 4 is going to be VGA and TV and keyboard so you can run demoboard programs, and group 3 is probably going to be the default for a running program. Group 3 at the moment is a SPI A to D chip and the SPI driver reading the finger on the screen. It might turn out that if we use SPI things that simply deselecting the /CS line is enough to disable the device, in which case the cache driver will be very simple and we will be able to do it all in pasm.

Thanks++ for that code, I think I now have a clearer idea of what will be needed.

jazzed · 2012-08-05 20:06

Dr_Acula wrote: »

Some of that code is a few lines of spin and I note the attached driver is all pasm, so will we need to translate it to pasm?

A cache driver should be as fast as possible.

average joe · 2012-08-06 20:21

Hey guys, late to the party again! Here's the deal, I'll whip up a cache driver ASAP. I needed to personally vet the board and I'm quite happy! I've been grinning from ear to ear since last night. The cache driver should not be too complicated, just a couple of things to figure out. I'm still wondering about locking the bus, but will probably write a driver with locking and without. I'm fairly sure the files are on my pc somewhere...
