cog memory

4x5n · 2011-09-08 21:54

Sorry for asking another stupid question! I'm new to learning how to program parallax chips and am having a lot of fun. I promise that I'll be ordering the manuals for both the prop and basic stamp since I can see that the basic stamp has its places.

My question concerns the availability of cog ram to programs written in "spin". One of the books I got, "programming the propeller with spin", implies that spin methods (objects?) are loaded into cog memory when a cog is started. Documentation online from parallax states that only code written in assembly is loaded into cog memory or uses cog memory. If that's true it seems to me that spin code that uses multiple cogs will potentially have severe performance issues and that a lot of memory in the prop will be "wasted" if it's unavailable for use by spin code.

Are there any good references on how to write methods in pasm? I'm getting the feeling I'm going to be learning how to program with it if I'm going to be able to use cog memory.

Mike Green · 2011-09-08 22:29

A cog that's running Spin code is completely full with the Spin interpreter and its internal variables. The Spin interpreter is loaded automatically from the Propeller's ROM when needed using a COGINIT instruction.

By the way, you can download both the Propeller Manual and the Stamp Manual (Basic Stamp Syntax and Reference Manual) from Parallax's Download webpage.

Look at the two "sticky threads" in this forum (at the top of the thread list) for links to various tutorials on both Spin and Assembly programming.

Note: You don't write methods in PASM. You write entire programs. There's about 100 µs of overhead in starting up a cog, mostly in loading 496 longs (32-bit instructions) from hub memory to the cog's memory. There's no way to just load fragments or short routines using the standard (ROM) Spin interpreter. There has been some experimenting with an enhanced Spin interpreter that's loaded from hub RAM rather than ROM and allows short assembly routines to be loaded into the cog from hub memory, then executed, but it's experimental and not something for a beginner to work with.
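To make the difference concrete, here is a minimal sketch of the two ways a cog gets started from Spin. It assumes an 80 MHz clock from a 5 MHz crystal and LEDs on pins 16 and 17; the names `main`, `blink`, `entry`, `pinMask`, `delay` and `time` are made up for the example. The 100 µs figure is roughly what you would expect if each long needs one hub access window of 16 clocks: 496 x 16 = 7,936 clocks, or about 99 µs at 80 MHz.

```spin
CON
  _clkmode = xtal1 + pll16x      ' assumption: 5 MHz crystal, 80 MHz system clock
  _xinfreq = 5_000_000

VAR
  long stack[32]                 ' stack for the Spin cog (size is a guess)

PUB main
  cognew(@entry, 0)              ' new cog is loaded with the PASM program at entry
  cognew(blink(17), @stack)      ' new cog is loaded with the Spin interpreter, which runs blink

PUB blink(pin)
  dira[pin]~~                    ' make the pin an output
  repeat
    !outa[pin]                   ' toggle it
    waitcnt(clkfreq / 2 + cnt)   ' every half second

DAT
              org     0
entry         mov     dira, pinMask      ' pin 16 as output
              mov     time, cnt
              add     time, delay
:loop         waitcnt time, delay        ' wait, then advance the target time
              xor     outa, pinMask      ' toggle pin 16
              jmp     #:loop

pinMask       long    1 << 16
delay         long    40_000_000         ' half a second at 80 MHz
time          res     1
```

In both cases the whole cog is occupied: by your PASM program in the first case, by the interpreter in the second.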

Dr_Acula · 2011-09-08 22:40

If that's true it seems to me that spin code that uses multiple cogs will potentially have severe performance issues and that a lot of memory in the prop will be "wasted" if it's unavailable for use by spin code.

You have been reading the manuals thoroughly!

Yes, the prop has 32 kilobytes of hub ram and worst case scenario with all the cogs loaded you lose 14k of that, or just under half. I agree it is a waste. But never fear, there are solutions.

You can recycle hub ram by loading in cog code from an SD card, and load each cog through the same 2k block. That means only 2k is wasted, and you can even recycle that if you want to, or you can keep it free to reload cog code over and over.

Another option is to note where the cog code is and then recycle it after the cogs are loaded by using it to store other data.

One thing I have done is create code that uses most of the hub ram for a video buffer, so you can put all the cog code into the video buffer, load it into the cogs, and then use that hub ram for video.
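A minimal sketch of the "note where the cog code is and reuse it" idea, under two assumptions: the PASM program never reads its own hub image again after loading, and 1 ms is a comfortable margin over the roughly 100 µs load time. The names `entry` and `entryEnd` are placeholders for a real driver.

```spin
PUB main | size
  cognew(@entry, 0)                  ' cog copies the image from hub RAM into cog RAM
  waitcnt(clkfreq / 1000 + cnt)      ' wait 1 ms so the copy has certainly finished
  size := (@entryEnd - @entry) / 4   ' image size in longs
  longfill(@entry, 0, size)          ' the hub copy is now free to use as a data buffer
  repeat                             ' keep this cog alive; real code would use the buffer here
    waitcnt(clkfreq + cnt)

DAT
              org     0
entry         jmp     #entry         ' placeholder PASM program: loops forever
entryEnd      long    0              ' marks the end of the image
```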

There are recycling techniques for both Spin and C.

Generally you don't have to worry about this until you are writing big programs, and by then you will know a lot more tricks. So jump in and start writing some code and sing out if you run out of program memory because there are other ways to make programs smaller besides putting cog code into an sd card.

frank freedman · 2011-09-08 22:58

4x5n wrote: »

Sorry for asking another stupid question! I'm new to learning how to program parallax chips and am having a lot of fun. I promise that I'll be ordering the manuals for both the prop and basic stamp since I can see that the basic stamp has its places.

My question concerns the availability of cog ram to programs written in "spin". One of the books I got, "programming the propeller with spin", implies that spin methods (objects?) are loaded into cog memory when a cog is started. Documentation online from parallax states that only code written in assembly is loaded into cog memory or uses cog memory. If that's true it seems to me that spin code that uses multiple cogs will potentially have severe performance issues and that a lot of memory in the prop will be "wasted" if it's unavailable for use by spin code.

Are there any good references on how to write methods in pasm? I'm getting the feeling I'm going to be learning how to program with it if I'm going to be able to use cog memory.

Check for the tutorials by potatohead (basic starting up) and deSilva, both of which I have found very useful. You perceive correctly that you will take a significant hit using spin over assembly (PASM). For example, take an ADC such as the MCP3201 and try to sample in spin.

The best I could get was about a 500 microsecond conversion time, so on a good day you could take about 1.8K samples per second, provided you did nothing else with the capture cog like storing or manipulating the data. By Nyquist, you would be able to sample at best a 900 Hz waveform without aliasing. In PASM, I am (ab)using the same MCP3201 at a conversion rate of about 169 kHz, if my scope is to be believed. I have not yet D/A'd the values and compared, so overclocking the MCP3201 may not give accurate results, but I do get reasonable looking activity on the data lines with a sawtooth or sine on the input to the ADC.

Of course you could write a 4*20 LCD driver in PASM, but would you really gain all that much over doing it in spin? How fast do you really need to update an LCD text display? It's faster to code in spin......
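For reference, a bit-banged read in Spin might look like the sketch below. This is not the code used for the numbers above. It assumes the usual MCP3201 framing as I read the datasheet (two sampling clocks, one null bit, then 12 data bits, MSB first, valid on the rising clock edge), so check it against the datasheet before relying on it. The method names and pin arguments are made up.

```spin
PUB readMCP3201(csPin, clkPin, dataPin) : sample
  outa[csPin]~~                      ' CS idles high
  outa[clkPin]~                      ' clock idles low
  dira[csPin]~~
  dira[clkPin]~~
  dira[dataPin]~                     ' data line is an input
  outa[csPin]~                       ' select the ADC, which starts a conversion
  repeat 15                          ' 2 sample clocks + null bit + 12 data bits
    outa[clkPin]~~                   ' rising edge: data line is valid
    sample := (sample << 1) | ina[dataPin]
    outa[clkPin]~                    ' falling edge: ADC shifts out the next bit
  outa[csPin]~~                      ' deselect
  sample &= $FFF                     ' keep only the 12 data bits

PUB ticksPerRead(csPin, clkPin, dataPin) : ticks
  ticks := cnt
  readMCP3201(csPin, clkPin, dataPin)
  ticks := cnt - ticks               ' system clocks used by one conversion
```

Dividing `clkfreq` by the result of `ticksPerRead` gives the best-case sample rate, and half of that is the Nyquist limit.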

Just my thoughts, but the potatohead papers and the deSilva PDF are recommended by many of the far beyond my current level members here.

And yes, it is fun otherwise why do it at all?

Frank

Also, you need not order the prop manuals unless you like the paper nicely bound. They can be downloaded along with lots of app notes, ed kit materials that go with code also downloadable. Save the money and buy toys.. Uh I mean parts/kits/etc.

RossH · 2011-09-08 23:01

Dr_Acula wrote: »

There are recycling techniques for both Spin and C.

I'm not familiar with the Spin techniques, but Dr_A is correct about C - with Catalina you can use a two-phase loader in conjunction with either an SD card or a 64kb EEPROM. Effectively, all the cogs are loaded in the first phase, then the application program is loaded in the second phase - this means you really do have the full 32kb available for your application if you need it.

You can even use one or more of the cogs as extra data stores (Catalina provides an example of doing this). Looked at this way, the Propeller actually has up to 48kb of RAM available, not 32kb!

Ross.

StefanL38 · 2011-09-09 06:41

As you are just beginning, can you give a short description of the most you want to do later? How many lines of code and what maximum speed do you require?
Then the experts here can estimate if this can be done with the propeller-chip or not.

SPIN is an interpreted language. The SPIN-Interpreter is loaded into the cog-RAM. Each line of code in SPIN resides in the HUB-RAM as spin-tokens, which need less space than PASM-code.
So you have 32kB of RAM for SPIN, which can do more than 32kB of PASM-code. This concept means: more codespace (than you have coding in assembler) for the price of lower execution-speed.
The SPIN-interpreter is written in PASM-code and this interpreter occupies the full cog-RAM. A small loop coded in spin has a frequency of some kHz.

PASM is roughly 80 times faster but PASM-code can only be 2kB per cog at a time. As Dr_Acula mentioned there are several techniques to expand PASM-code.
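If you want to see the Spin speed for yourself, a rough sketch like this measures it with the system counter. The method name `ticksPerToggle` is made up, `n` must be at least 1, and you would call it from your own top-level method.

```spin
PUB ticksPerToggle(pin, n) : ticks
  dira[pin]~~                  ' make the pin an output
  ticks := cnt                 ' system counter before the loop
  repeat n
    !outa[pin]                 ' one interpreted Spin statement per pass
  ticks := (cnt - ticks) / n   ' average system clocks per pass
```

Dividing `clkfreq` by the returned value gives the number of loop passes per second.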

As Mike Green mentioned, starting a cog takes around 100 microseconds. So starting a new cog is a quick thing.

Methods are subroutines. Objects are *.SPIN-files that can contain 1 to n methods.

Objects can use the same cog as your main-code. Objects can use 1 to 7 extra cogs depending on how they are coded.
This means objects and cogs and methods are completely different things. They are not completely independent of each other
but they can be almost independent of each other.
There is code that uses 200 methods in just one cog.
There is code that uses 4 methods in 4 cogs.
There is code that uses 100 methods in 3 cogs. And almost any other combination that you like.
To understand more about methods, objects and cogs read the manual and work through the PE Kit Lab exercises.
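As an illustration of how the three fit together, here is a sketch of a small object (one *.SPIN-file) with three methods, one of which runs in its own cog. The start/stop pattern is the common idiom in library objects; the name `blink` and the stack size are just assumptions for the example.

```spin
VAR
  long cog                     ' id of the extra cog plus 1, or 0 if none is running
  long stack[32]               ' stack for the extra cog (size is a guess)

PUB start(pin) : okay
  stop                         ' make sure only one cog runs at a time
  okay := cog := cognew(blink(pin), @stack) + 1   ' 0 means no cog was free

PUB stop
  if cog
    cogstop(cog~ - 1)          ' stop the cog and clear the variable

PRI blink(pin)                 ' runs in the extra cog
  dira[pin]~~
  repeat
    !outa[pin]
    waitcnt(clkfreq / 2 + cnt)
```

The methods `start` and `stop` run in whichever cog calls them; only `blink` runs in the new cog.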

Whenever you have questions feel free to ask here. The so called "simple" questions are a good opportunity for near-newbies to give help too.
So please post easy questions so that everybody can take part in answering them.

keep the questions coming
best regards

Stefan

4x5n · 2011-09-09 08:55

I'm happily surprised with the reaction to my post and questions.

I should give a bit of history about myself. My degree is in electronics and I spent ~10yrs working in the field of industrial process control before changing careers and moving into IT. During those years I spent ~8 years writing (among other things) interrupt driven machine control software using the Motorola 6809 CPU and later the 68hc11 MPU as well as PLCs. In all cases memory was very tight and limited to a total of 64K of directly accessible addresses, and since the I/O was memory mapped I was generally lucky to have 48K of usable memory. This was on processors running at 1-2 MHz! As a result the need to write very efficient code was ingrained in me!! By my back of the envelope calculations, a prop running spin is an order of magnitude faster than what I had available to me back in the day!! It's just that I have a built in need to write tight efficient code. :-) I do miss the interrupts though. Having to poll inputs takes a lot of cycles that could be spent elsewhere but it is what it is so I'll have to adjust.

I have a few closely related processes that I have planned for the propeller. One of my hobbies is B&W film photography and as a result I spend a lot of time developing film. I'm looking to automate that process since the last of the companies that made processors for us hobbyists stopped years ago and the price of the processors is insane!! There will need to be more than one since the process of development differs depending on the size and type of film I develop and the developer I use. The basic requirements are a motor (I'm planning on using a stepper) with a good amount of torque, temp control, and of course a series of inputs to allow the user (me at first and others later) to start and stop the cycle and develop their own cycles. The cycle includes the temp at which the development will occur, how aggressive the agitation will be, the frequency of agitation and the length of the development cycle, and then stopping and "fixing" the film into a negative. All of these need to be adjustable. :-) My plan is to allow the user to generate various cycles and store them, then use an lcd, monitor, etc at run time to select the cycle they want to use. The plan is for the program that will execute the cycle to be table driven and the various cycles will be entries in the table.

That of course is my long term dream and I do see that in the long run I'll want to be able to store the cycle information on an sd card or on an external computer. For now a separate program for each cycle type will be downloaded into the propeller. As long as I keep it table driven I can live with it at first. I'm sure I'm in way over my head but like I said I'm having a lot of fun going through the labs and, as I progress, writing routines/methods to do things that will be related to the final project.
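One way a table-driven cycle could be laid out in Spin is a DAT block with a fixed number of longs per step. Everything in this sketch is hypothetical: the field layout, the names, and especially the numbers, which are placeholders and not a real development recipe.

```spin
CON
  STEP_LONGS  = 3              ' longs per table entry (hypothetical layout)
  F_TEMP      = 0              ' field 0: bath temperature in tenths of a degree C
  F_AGITATE   = 1              ' field 1: seconds between agitations
  F_DURATION  = 2              ' field 2: length of the step in seconds

PUB stepCount : n
  n := (@cycleEnd - @cycle) / (STEP_LONGS * 4)    ' entries in the table

PUB stepField(index, field) : value
  value := long[@cycle][index * STEP_LONGS + field]

DAT
cycle         long    200, 30, 480    ' placeholder "develop" step
              long    200, 30, 60     ' placeholder "stop" step
              long    200, 60, 300    ' placeholder "fix" step
cycleEnd      long    0               ' marks the end of the table
```

The code that runs a cycle would then loop from 0 to `stepCount - 1` and read each field with `stepField`, so adding a new cycle type means adding table data rather than code.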

HShanko · 2011-09-09 09:29

4x5n, Your 'film processor' sounds like possibly a one Prop project. There may even be a board which has most of what you need in existence. That is maybe an existing board could get you started to write the code for the project. Yep. Sounds like a fun project.

Mike Green · 2011-09-09 09:49

One of the fundamental ideas behind the Propeller is that, with 8 independent processors, polling stops becoming an evil idea and usually simplifies coding. If you have two tasks that should run at the same time, you simply have the two tasks running in two different cogs. One task might read 3 tank temperatures, one after another, and adjust the power to 3 heaters based on a desired temperature value. Another would step 3 motors each based on a direction, speed, and number of steps to go. A 3rd task would be the main one and would interact with a keyboard and display to set variables, display status, etc. There would be two other cogs that would handle the low level I/O for the keyboard and display. That's 5 out of 8. The other 3 would be idle (essentially turned off) or you'd use them for other tasks.
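A rough sketch of that division of labour, with hardware access stubbed out. All names here (`heaterTask`, `motorTask`, `readTemp`, `setpoint`, `heaterOn`, `stepsToGo`) and the stack sizes and timings are assumptions for illustration only.

```spin
VAR
  long heaterStack[40]         ' one stack per extra Spin cog (sizes are guesses)
  long motorStack[40]
  long setpoint[3]             ' desired temperatures, written by main
  long heaterOn[3]             ' heater states, written by heaterTask
  long stepsToGo[3]            ' remaining steps, set by main, counted down by motorTask

PUB main
  longfill(@setpoint, 200, 3)           ' placeholder setpoints
  cognew(heaterTask, @heaterStack)      ' task 1: temperature control
  cognew(motorTask, @motorStack)        ' task 2: stepper motors
  repeat                                ' task 3: the main cog does the user interface
    if stepsToGo[0] == 0
      stepsToGo[0] := 200               ' stand-in for a request from the keyboard
    waitcnt(clkfreq + cnt)

PRI heaterTask | i
  repeat
    repeat i from 0 to 2
      heaterOn[i] := readTemp(i) < setpoint[i]   ' a real task would also drive the heater pin
    waitcnt(clkfreq / 10 + cnt)

PRI motorTask | i
  repeat
    repeat i from 0 to 2
      if stepsToGo[i] > 0
        stepsToGo[i]--                  ' a real task would pulse the step pin here
    waitcnt(clkfreq / 1000 + cnt)

PRI readTemp(tank) : t
  t := 0                                ' stub: replace with a real sensor read
```

Each task is an ordinary polling loop, and none of them can delay the others.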

Heater. · 2011-09-09 09:59

4x5n,
As an old 8 bit micro wrangler you are in good company here. The 6809 was my favorite processor back in the day.
You might have to adjust your mind to the "Propeller way" a little bit though. For example, you should not be missing interrupts, rather you should be happy the Propeller does not have any.
Consider this: An interrupt handler is just some processing that you would like to happen "now" when some event occurs. When it's done your normal processing resumes. I have seen systems where pretty much all the work happens in interrupt handlers and the background is almost nothing but an idle loop.
But the Prop has 8 processors. Each one of them can be set up to handle an event "now". Like waiting for a particular time or a change in an input pin.
So, interrupt handlers can be replaced by parallel executing code on different COGs. With the added bonus that you don't have to worry about priority levels or one event handler starving another of time.
When you get that simple idea in mind you start to wonder why processors have not been built like the Prop for a long time.

4x5n · 2011-09-09 12:41

Although I reserve the right to continue to miss interrupts the more time I spend writing code for the propeller and get into using multiple cogs in a "knock up, knock down as needed" fashion I'm starting to see that there are advantages to that approach. I should point out that writing "real time" interrupt driven machine control isn't for the faint of heart as anyone who's done any amount of it can testify to. I'm just having trouble making sense of how I'm going to have one cog tell another cog to "stop what you're doing and do this instead" while that other cog is running. I've got some ideas that involve semaphores and possibly having the cog handling the inputs to shutdown cogs and start them back up again running the desired code.

For now I may just be putting the cart before the horse and need to work my way through the propeller education kit that I got! I've already written some key subroutine/subprograms and then as I got more familiar with spin and the "propeller way" of writing code rewrote the code to be much more efficient, smaller and much more flexible. I'm starting to get the hang of it. :-)

As to the use of assembly. After writing a short program that sent out a pulse to a pin in spin and reading the output on a freq counter I was impressed. I also slept on it a bit and will use "spin" where possible and assembly where needed. It's looking like I should be able to get all the code I need into 32KB.

Mike Green · 2011-09-09 13:07

For signaling from cog to cog, you usually use shared variables. Since 32-bit references to hub (shared) memory are atomic, very often several items are packed into a 32-bit word (long). Say that you use a COGNEW to "fork" a new cog to run a routine (method) called "doItNow" after initializing a global variable called "command". This variable is initialized to zero before the COGNEW. "doItNow" looks at the variable's most significant byte. If it's zero, it waits a second, then checks again. If the byte is not zero, the most significant byte is a command of some sort, the next byte is a count, and the next two bytes (16-bit word) is the address of a buffer. "doItNow" breaks the long apart and does the requested operation, say a request to read or write to an Analog to Digital Converter. "doItNow" does the requested operation, then stores either a zero into "command" or a value with the most significant byte a zero and the rest some kind of error code.

VAR
    long command            ' shared mailbox: command<<24 | count<<16 | buffer address
    long smallStack[ 20 ]   ' stack space for 2nd cog
    byte buffer[ 8 ]   ' some kind of buffer

PUB mainRoutine | temp
   command := 0
   cognew( doItNow, @smallStack)   ' start up doItNow using a small stack area
   repeat
      command := 1<<24 | 2<<16 | @buffer   ' read 2 bytes into buffer
      repeat while command & $ff000000 <> 0  ' wait for operation to finish (<> also works for codes of $80 and up)
      ' here we can decode any possible error information
      ' process the buffered information

PUB doItNow | temp
   repeat
      waitcnt(clkfreq + cnt)   ' wait for one second to elapse
      temp := command   ' check the shared variable
      if temp & $ff000000 == 0  ' if the MSByte is zero, go back and wait
         next
      case temp >> 24   ' decode the command
         1: ' code = 1, read data
         2: ' code = 2, write data
      command := 0   ' set the shared variable to zero to indicate we're done

Note that it's possible to arrange things so that doItNow is processing the next expected operation while mainRoutine is still processing the previous results.
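The packing and unpacking of the command long described above could be wrapped in small helpers, something like this sketch. The method names are made up; the field layout follows the description (command in the top byte, count in the next byte, buffer address in the low word).

```spin
PUB pack(code, count, addr) : packed
  packed := (code << 24) | ((count & $FF) << 16) | (addr & $FFFF)

PUB cmdCode(packed) : code
  code := packed >> 24               ' >> is a logical shift, so the result is 0..255

PUB cmdCount(packed) : count
  count := (packed >> 16) & $FF

PUB cmdAddr(packed) : addr
  addr := packed & $FFFF             ' hub addresses fit in 16 bits
```

Because the whole long is written in one hub access, the other cog never sees a half-updated command.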

You don't really want to continually shutdown and restart a cog to do small tasks. Think of the extra cog as a kind of interpreter, interpreting commands that tell it what to do. It does one command, leaves some kind of data or status for the other cogs to use, then goes on to the next thing. I/O drivers like FullDuplexSerial that implement a full duplex serial port use two ring buffers, one for transmit and one for receive. The I/O cog pulls bytes, one at a time, from the transmit buffer and transmits them a bit at a time at the same time as it's watching the I/O pins for incoming data which gets put into the receive buffer. The program calling FullDuplexSerial either pulls characters out of the receive buffer or stuffs them into the transmit buffer if there's room.
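The ring buffer idea can be sketched in a few lines of Spin. This is not FullDuplexSerial's actual code, just the general technique, with made-up names and a buffer size that is assumed to be a power of two. It relies on there being one cog that only calls `put` and one cog that only calls `get`.

```spin
CON
  BUF_SIZE = 16                ' must be a power of two for the mask trick below
  BUF_MASK = BUF_SIZE - 1

VAR
  long head                    ' next slot to write, changed only by the producer
  long tail                    ' next slot to read, changed only by the consumer
  byte ring[BUF_SIZE]

PUB put(b) : okay
  if ((head + 1) & BUF_MASK) == tail
    return false               ' buffer full, nothing stored
  ring[head] := b
  head := (head + 1) & BUF_MASK
  return true

PUB get : b
  if tail == head
    return -1                  ' buffer empty
  b := ring[tail]
  tail := (tail + 1) & BUF_MASK
```

Since each index has a single writer, no lock is needed for this single-producer, single-consumer case.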

Heater. · 2011-09-09 13:53

4x5n,
Imagine that on a normal MCU you might have an interrupt routine that is launched when a pin changes state. That interrupt routine does nothing whilst "waiting" for the pin to change state.
So it can be on the Prop, but instead of a piece of code sitting there idle you can have an entire COG halted on a "waitpeq" or similar instruction. It springs into life when the pin changes, does whatever work it has to do, and then loops back to wait again on the pin. Whilst doing all that it reads or writes data from memory where some other code can see it. Just like an interrupt routine does.
Whether you need locks or semaphores and such around access to that shared data is another matter. Most prop device drivers get by with no locks.
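In Spin the same idea looks roughly like this. It is a sketch only: the pin number, the names `watchPin` and `eventCount`, and the stack size are assumptions.

```spin
VAR
  long stack[32]               ' stack for the watcher cog (size is a guess)
  long eventCount              ' shared: written by the watcher, read by anyone

PUB main
  cognew(watchPin(0), @stack)  ' dedicate a cog to pin 0
  repeat
    waitcnt(clkfreq + cnt)     ' main code is free to do other work and read eventCount

PRI watchPin(pin) | mask
  mask := |< pin               ' bit mask for the pin
  repeat
    waitpeq(mask, mask, 0)     ' cog sleeps here until the pin goes high
    eventCount++               ' the "interrupt handler" work
    waitpne(mask, mask, 0)     ' then wait for the pin to go low again
```

Only the watcher cog writes `eventCount`, which is why no lock is needed here.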

4x5n · 2011-09-09 22:40

It's looking like I'm going to have to change my approach to programming with the Propeller. I haven't decided if it's a good thing or not yet! :-) I'm sure that it's just going to take some getting used to writing machine control software that isn't interrupt driven.

I'm liking the idea of cogs being able to share memory though. I can see how that will make things easier. There's no sense in limiting myself to single bits of information with semaphores when the arch reads and writes 4 bytes at a time!!

Being able to have a cog sit and wait on a series of input pins is really appealing as well. In my mind all the more reason to get a printed copy of the manual. I've already downloaded the pdf of the manual but being old find that I like to sit down with printed out tech manuals and references and read them!! Scrolling through a pdf just isn't the same.

The more I learn about this chip the more impressed I am by it! I just wish there was a way to access cog ram with a spin method!!

Mike Green · 2011-09-09 22:51

There are 16 special locations in cog memory (512 - 496 = 16) that can be accessed from Spin. Some of them are read-only like CNT, PAR, INA, and INB. The rest are read/write like OUTA, OUTB, DIRA, DIRB, CTRA, CTRB, FRQA, FRQB, PHSA, PHSB, VCFG, and VSCL. The ones that are not implemented in the current Propeller (OUTB and DIRB) can be used for data. If you're not using the counters, FRQA, FRQB, PHSA, and PHSB are also usable for data. In assembly, you have to be careful with these since there's actually memory ("shadow memory") underlying these locations. If you attempt to access read-only locations using the destination field of an instruction, you actually reference the shadow memory and if you jump to one of these locations, the instructions come from shadow memory. In Spin this issue doesn't apply.
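A tiny sketch of using those registers as scratch storage from Spin, assuming counter B is otherwise unused and taking the point above about OUTB being free as given. Keep in mind that these registers belong to whichever cog runs the method, so they are private to that cog rather than shared.

```spin
PUB scratchDemo : sum
  ctrb := 0                    ' counter B off, so PHSB is not changed by the hardware
  frqb := 1_000                ' three longs of cog memory used as plain variables
  phsb := 234
  outb := 5                    ' unimplemented port register on the current Propeller
  sum := frqb + phsb + outb
```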

4x5n · 2011-09-11 19:59

Hmmm, I don't know that I would have thought of using the 16 longs in cog memory used by the counters when the counters aren't used. I haven't done any serious programming on the propeller and don't know if having those 16 longs is a significant amount of memory. After nearly a decade of programming very resource limited systems I look at the memory in cogs and wish that it was available for use. A knee jerk thing.

After another weekend spent writing routines (meant for learning to program the prop but useful as sub-routines for future projects) I'm getting more comfortable with spin and am amazed at how few resources very capable and powerful routines actually take in terms of method size and variable usage. I'm still thinking that my plan of writing most of the code in spin and using pasm when spin can't perform as required is the right one.

After learning to do more real work with spin and becoming more familiar with its capabilities and limitations I'm thinking of taking a look at propbasic. I like the fact that propbasic compiles down to and outputs pasm code. I still remember that when I started writing machine control code for the 6809, and again later with the 68hc11, I wrote the initial code in C, compiled it to assembly and from there maintained the code in assembly. It was a useful learning tool that ended up being what I used to do "real" work with.

I do think we've beat this thread to death now. :-)
