I like this kind of discussion. It's educational, and frankly indicative of the progress of our community. When we are having these, good things happen. Go back and look. It's been done over and over.
The prop has enough unique attributes that it will require boot-strapping like this. Rather than grow frustrated over the state of the discussion, consider just tuning out for a bit.
Well, there you go... the one method guaranteed to always be available.
But how does this help with supporting multiple languages?
For instance, I believe Forth allocates a fixed block per cog (which is in fact quite analogous to - but not the same as - the C method of using a registry that contains a pointer to a block per long). The fact is that the basic method needs to be extended to be useful from a high level language - and two different languages currently do it in two different ways, which means they cannot currently use each other's cog programs.
There is benefit to be had from trying to standardize this if we can do so without a burdensome cost.
What hasn't been mentioned yet is a method to benchmark the resultant cog add-in code.
I would think that most of what's been discussed is getting pushed further than the basic need.
If we can settle on an interprop / intercog timing benchmark then all the other
flavors will gain respect by their brevity to that initial known functionality.
Filling buffers and sending strings has a different requirement than
setting a bit in cog2 on prop3 (and its reflected acknowledgement to the sender(s)).
Start with the shortest possible and best benchmark and work toward
the more complicated needs, referring to the increased functionality
VS the induced lag from the shortest benchmark we can achieve.
So the "standard" is, "Here's a pointer (or two, or three, or an array of pointers) into some hub memory I own. The docs for this module explain what to do with it." Or, in the case of cog-resident code, "Give me the address of a block of your hub memory. My docs explain how it needs to be structured and how to communicate." It doesn't have to be any more limiting than that.
Yes, that is it. Chip's original series of drivers conform to this standard. However, many other objects do not (and again, in the interests of diplomacy, I'm not going to name them). So, getting all objects up to Chip's standard would be a huge achievement and would make it much easier to port between languages.
The 'standard' as I see it is for all the communication longs to be contiguous and for the start location of the block to be passed in par. I'd also add that it is worth trying to use cog code that reads these variables into cog variables using a loop rather than 'unrolled' code, because it will use less space. Perhaps unstated but worth stating is that the order of variables will be the same in hub and cog. You can name those variables to make them easy to reference, and (I think) that therefore they will need different names - some people use myvariable for the hub ones and _myvariable for the cog ones. Or you could use spin_myvariable for the hub and cog_myvariable for the cog ones.
If you happen to have a buffer as part of the common block, and you happen to also need to send some temporary startup variables, you can save some space by passing these through the buffer.
And then document everything in the code.
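To make the convention above concrete, here is a minimal host-side sketch in Python rather than PASM. It is only a model: `hub` is a plain list standing in for hub RAM, `par` is a list index (a real PAR holds a long-aligned hub address), and the parameter names are hypothetical. The point is that one loop copies a contiguous block, and the order is the same on both sides.

```python
# Host-side model only: `hub` stands in for hub RAM as a list of 32-bit longs.
# The parameter names below are hypothetical, chosen for illustration.
PARAM_NAMES = ("rx_pin", "tx_pin", "baud")


def copy_params(hub, par, count):
    """Copy `count` contiguous longs starting at index `par` into cog-local storage.

    A single loop is used instead of one read per variable, mirroring the
    space-saving PASM loop suggested above. Order in the cog equals order in hub.
    """
    cog = []
    for offset in range(count):
        cog.append(hub[par + offset] & 0xFFFFFFFF)  # keep each value to 32 bits
    return cog


hub = [0] * 16
par = 4                                  # start of the communication block
hub[par:par + 3] = [31, 30, 115200]      # filled in by the caller before launch
cog_vars = dict(zip(PARAM_NAMES, copy_params(hub, par, len(PARAM_NAMES))))
```

Because only the start of the block and the agreed order matter, any calling language can fill the block without knowing how the cog names its own copies.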
Now - over and above this standard one can think about other extra standards like a registry. I'm bursting with all sorts of clever ideas there, but I am also tempered by the fact that I am just about to sit down and write a new pasm driver for a touchscreen on the dracblade, so every 'cool' theoretical idea has to work in practice, and it has to justify any extra code space.
If a registry is needed, I'm still pondering what is the best format for such a registry. I have some ideas but I need to ponder about this some more.
Shut the thread down? Why? Looking at the most recent posts here, it seems that people are just beginning to discuss the actual issue. Or do you mean shut this thread down so we can start another "clean" thread? I could probably agree with that suggestion.
One thing though - you seem to think that there is some "other" place where this stuff should be discussed and agreed by "people far more knowledgeable" - but (frightening as it may seem!) I'm afraid the participants in this forum are those people! - there is no other place, and there are no other people. If we don't do this, and do this here, then it won't happen. Isn't this what these forums are for? Parallax certainly won't do it - quite understandably, they are snowed under with other things, and anyway have little experience with this stuff - software is not their core business.
As you suggested, each of us compiler developers here could all go away and "write something out" - but look where that has got us over the past 5 years! And also, look at the vitriol that the only documented proposal attracts - much of which seems to be objection to the choice of one single word!
I think it is interesting that you raise the RFCs (that form the basis of the internet) as an analogy. Yes, it's true those started as academic discussion documents, and you didn't have to comply with them. Two points, though: (1) an RFC is usually the output of a bunch of discussion in forums like this, not the input (they are described as "after the fact" standards) and (2) where are those people who chose not to comply with the various RFCs now?
Apart from the inevitable "nay-sayers" who seem to frequent every thread in these forums that proposes something a little different, I think this thread is doing just fine.
Ross.
In my experience these types of threads tend to devolve into debating societies and in the end nothing gets done. This thread has devolved to that level. As you pointed out, look at the to-do the word registry has caused! The reality is that for the most part people agree that if done right this could be useful and beneficial. It's also a reality that the people most affected are those writing the compilers since they're the ones that will implement the standards. If done well most people won't have to know that the standard exists or is in use.
I think it would be very helpful to define some variation on the following:
1) A table indexed by cog # containing the starting address of the cog's communications area. This would be used by other cogs to communicate with the cog in question. A cog being started would find its communications area address in PAR. The cog's interface (not the cog itself) would update the table. A zero would be the null address.
2) A table indexed by device class containing the starting address of the cog's communications area. Typical device classes could be console input, console output, maybe mass storage. A device doesn't have to have a device class. This is intended to support device substitution, mostly for stream devices, where there are a variety of possible devices that could be used to provide the class's functionality. For mass storage, an SD card or a couple of flash chips or some large EEPROMs or a USB memory stick could all be used for holding files with some common basic functionality along with access maybe via another device class pointer to the low level functionality that's device specific.
This is slightly simpler than the proposal earlier in this thread and has the advantage that the cog's code doesn't normally have to know about these tables. Variations on this have been suggested by others before, but I thought I'd just restate it for discussion.
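A rough sketch of the two tables, again as a host-side Python model and not anything taken from an existing registry. The class numbers and the `Registry` name are assumptions for illustration; the only things taken from the description are the two indexes, the zero null address, and the fact that the interface code (not the cog) does the updating.

```python
NUM_COGS = 8                     # the Propeller has 8 cogs
NULL = 0                         # zero is the null address

# Hypothetical device class numbers, for illustration only.
CONSOLE_IN, CONSOLE_OUT, MASS_STORAGE = range(3)


class Registry:
    """Two tables of communications-area addresses: by cog number and by device class."""

    def __init__(self, num_classes=3):
        self.by_cog = [NULL] * NUM_COGS
        self.by_class = [NULL] * num_classes

    def register(self, cog_id, comms_addr, device_class=None):
        """Called by the cog's interface code, not by the cog itself."""
        self.by_cog[cog_id] = comms_addr
        if device_class is not None:       # a device need not have a class
            self.by_class[device_class] = comms_addr

    def unregister(self, cog_id):
        """Clear the cog's entry and any class entry pointing at the same area."""
        addr = self.by_cog[cog_id]
        self.by_cog[cog_id] = NULL
        self.by_class = [NULL if a == addr else a for a in self.by_class]
```

A caller that wants "whatever provides console output" reads `by_class[CONSOLE_OUT]`; a caller that wants a specific cog reads `by_cog[n]`. Neither lookup needs a search.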
The cog's interface (not the cog itself) would update the table.
I think in this case the "coglets" will need to have a minimal interface, and that interface will need to be standardized for each class of objects. The reason being the goal is to use any language; since the interface runs in the context of the caller, the interface will need to be written for each calling language to be supported. If the idea is to be able to swap out various coglets that support say console output, the interface on the caller side will need to work with any coglet that properly meets the requirements of its device type code for console output.
Hi Mike,
This is simple and neat. I do have one question, though (and please don't throw rocks at me, I'm battered and bruised enough already) - if you combine these two tables into one, that's essentially all Catalina's current registry is - what is the purpose in separating them? The main benefit I can see for having two tables is that you can always access the information you need by index rather than having to search - so performance improves - but the downside is that while the list of cogs is fixed and finite (i.e. 8) the list of device classes is not - and it could grow over time. So how do you know how big that table should be?
Hi 4x5n,
I understand your points, but the main thing is that the whole thing will only work if enough people buy into the concept with enough enthusiasm to make it work.
So I think we need to leave this thread running a little longer - people are still coming to grips with the idea itself, the potential benefits, and the potential costs. At this point, the actual method we eventually agree on matters less.
Also, I think this thread may have hit bottom and may actually be on the way up!
1) the title of this thread says "communicating *with* cogs". Not *between* cogs, so I presume this standard is not really about inter cog communications?
2) Let's say we have a bootloader and we load cogs in two passes. Does the second load overwrite the registry?
3) Let's say that you have a registry, and each entry in the registry points to a block of data associated with that cog. Will the second load overwrite that data? And to answer that partially, obviously it is pretty unlikely if you have a keyboard driver with 19 longs, but what if the driver is a display with 20k of display buffer and that buffer is wrapped up as part of the group of data that cog needs. Is there a way that second loader program could know that the first loader has loaded up a cog that requires 20k of hub space? Is there a 'fit' command you could use as you write that second program to give an indication you are going to run out of space soon?
4) What if you decide to load up 7 cogs with your preloader, and then load the main program and now you want to swap in and out some more plugins. How do you handle the memory allocated to the plugins? As an example, we now swap out our display driver using 20k of hub ram and replace it with a mouse driver that uses only 7 longs. Does that free up 20k of hub ram?
I have a feeling you may have already thought about some of this with Catalina...
C.W.
Yes, I agree. To combine this thread with Mike's idea - each "device class" should offer the same interface.
Ross, I have some more questions! 1) the title of this thread says "communicating *with* cogs". Not *between* cogs, so I presume this standard is not really about inter cog communications?
I could have said either. The most common model when a high-level language is involved is definitely one main "client" cog and multiple "server" cogs (e.g. a main thread and some drivers) but another applicable model would be "communicating processes" - but no matter what model your language uses, on the Propeller it definitely involves inter-cog communications.
2) Let's say we have a bootloader and we load cogs in two passes. Does the second load overwrite the registry?
In Catalina, no - the registry is in high Hub RAM, and not overwritten by the subsequent load. All comms blocks are also allocated downward from high RAM, so really the second loader should not load stuff any higher than the "low water mark" established by the first load.
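The "low water mark" idea can be modelled in a few lines. This is a sketch under stated assumptions, not Catalina's actual code: the `HighMemAllocator` name, the registry size, and the `fits` check are all hypothetical. The only hard fact used is that the Propeller 1 has 32 KB of hub RAM.

```python
HUB_RAM_TOP = 0x8000  # Propeller 1 has 32 KB of hub RAM


class HighMemAllocator:
    """Hand out blocks downward from the top of hub RAM (illustrative model)."""

    def __init__(self, top=HUB_RAM_TOP):
        self.low_water = top

    def alloc(self, nbytes):
        """Reserve nbytes (rounded up to whole longs) and return the block's address."""
        size = (nbytes + 3) & ~3
        self.low_water -= size
        return self.low_water

    def fits(self, load_size):
        """True if a later load of load_size bytes from address 0 stays below the mark."""
        return load_size <= self.low_water


first_load = HighMemAllocator()
registry = first_load.alloc(8 * 4)          # hypothetical: one long per cog
keyboard = first_load.alloc(19 * 4)         # the 19-long keyboard driver example
display = first_load.alloc(20 * 1024)       # the 20k display buffer example
```

If the first loader leaves `low_water` somewhere the second loader can find it, a check like `fits` is one possible answer to the "is there a 'fit' command" question: the second load knows how much room is left without knowing what the drivers are.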
3) Let's say that you have a registry, and each entry in the registry points to a block of data associated with that cog. Will the second load overwrite that data? And to answer that partially, obviously it is pretty unlikely if you have a keyboard driver with 19 longs, but what if the driver is a display with 20k of display buffer and that buffer is wrapped up as part of the group of data that cog needs. Is there a way that second loader program could know that the first loader has loaded up a cog that requires 20k of hub space? Is there a 'fit' command you could use as you write that second program to give an indication you are going to run out of space soon?
See above. We should probably also have a standard for dynamic memory allocation - but I think if I suggested one I'd get lynched!
4) What if you decide to load up 7 cogs with your preloader, and then load the main program and now you want to swap in and out some more plugins. How do you handle the memory allocated to the plugins? As an example, we now swap out our display driver using 20k of hub ram and replace it with a mouse driver that uses only 7 longs. Does that free up 20k of hub ram?
This is not addressed by Catalina's mechanism. As I said above, there is no standard for dynamic memory management on the Propeller (C has one, but it is at the application level, not the cog level). There is not likely to be for a while yet, so you have to handle all this yourself. Ross.
See above. We should probably also have a standard for dynamic memory allocation - but I think if I suggested one I'd get lynched!
I guess that would be code in whatever higher level language you are using. If we are going to break out of the model of "load up the cogs and never touch them again" to reloading cogs, I guess we have to think about the allocation of hub memory that those cogs use.
And maybe that "post office" code will also handle some of the intercog communications. How does your registry model handle, for instance, cog 1 and cog 2 trying to send a message at the same time to cog 3?
Ross I think what you propose has some merit. I'm working on a 'plug and play system' and a registry would make things easier, and better a registry adhering to some kind of standard than none. I'm coming from a different angle but a registry looks useful either way.
The biggest benefit I can see is if you can have cogs on remote props mapping their requests/responses into memory on the "local" prop, perhaps via a conduit cog. This would make it very easy to (never run out of cogs again).
If it hasn't already been considered, I think it would be worth looking closely at the inter cog communications outline for the Prop2, to make sure whatever is ratified will work in well once prop2 is released.
I'm also debating whether a "service oriented" registry rather than a "cog oriented one" makes more sense, because it's quite possible for 1 cog to provide several services at once, or conversely a VGA driver may span up to 7 cogs. For multi prop environments this helps because 1 local cog may actually be a "conduit comms cog" for sending/retrieving data for multiple services on a remote prop.
I guess that would be code in whatever higher level language you are using. If we are going to break out of the model of "load up the cogs and never touch them again" to reloading cogs, I guess we have to think about the allocation of hub memory that those cogs use.
C has dynamic memory management - as do many high-level languages - but here we need something available before the high-level language program has even been loaded - so in fact we need a much "lower level" mechanism.
And maybe that "post office" code will also handle some of the intercog communications. How does your registry model handle, for instance, cog 1 and cog 2 trying to send a message at the same time to cog 3?
Currently the application software has to "know" that such contention might arise, and use a lock (or some other mechanism) to prevent it. One change I would make to Catalina's registry would be to allow the inclusion of a lock allocated for each cog program (and note the same lock could be used for multiple cog programs). If the lock in the registry is non-zero, the software would have to acquire the specified lock before proceeding (lock zero could not be used for this - but that's not a big problem).
This would make the mechanism invisible from the application level - if the cog knew it needed a lock, it would allocate one during initialization and put it in the registry, and each application that tries to talk to the cog would know it has to use it.
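Here is a small model of that lock-in-the-registry rule. It is a sketch of the proposed change, not of Catalina as it stands: the registry entry is a hypothetical dict with a `lock` field and a `mailbox` list, and `LockBank` imitates the Propeller's 8 hardware lock bits, where setting a lock reports whether it was already held.

```python
class LockBank:
    """Model of the Propeller's 8 hardware locks (ids 0-7)."""

    def __init__(self):
        self.bits = [False] * 8

    def lockset(self, lock_id):
        """Set the lock and return its previous state (True means already held)."""
        previous = self.bits[lock_id]
        self.bits[lock_id] = True
        return previous

    def lockclr(self, lock_id):
        """Release the lock."""
        self.bits[lock_id] = False


def send_request(entry, locks, request):
    """Post a request to a cog, taking the cog's lock first if the registry names one.

    `entry` is a hypothetical registry entry; a lock field of 0 means "no lock needed",
    which is why lock 0 itself cannot be used for this purpose.
    """
    lock_id = entry["lock"]
    if lock_id != 0:
        while locks.lockset(lock_id):
            pass  # another client holds it; spin until it is released
    try:
        entry["mailbox"].append(request)
    finally:
        if lock_id != 0:
            locks.lockclr(lock_id)
```

The application never chooses the lock. It just calls `send_request`, and contention between, say, cog 1 and cog 2 both talking to cog 3 is handled by whatever lock cog 3 put in its entry.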
Hi tubular,
All good points. I think your "prop to prop" conduit is a great idea. I do something quite similar to this in Catalina via "proxy" drivers - these allow a cog program on one Prop to act as proxy for the real cog program that is actually executing on another Prop - the local proxy cog presents the same interface as the remote cog would if it was executing locally.
I currently use it only for my HMI (keyboard, mouse, screen) and SD Card plugins - but I could generalize it to be able to act as proxy for any cog program.
Also, I am kind of swayed by the idea of a service-oriented registry rather than a cog-oriented one - Eric suggested this earlier in the thread. I just would like to see how much complexity it adds in practice.
Ok, that makes sense Ross. Tubular's comment may help guide what you might do
I'm also debating whether a "service oriented" registry rather than a "cog oriented one" makes more sense, because it's quite possible for 1 cog to provide several services at once, or conversely a VGA driver may span up to 7 cogs. For multi prop environments this helps because 1 local cog may actually be a "conduit comms cog" for sending/retrieving data for multiple services on a remote prop.
I too have been thinking about services vs cogs. Tubular's example is a good one - some displays span multiple cogs, or at the other extreme, multiple services could fit on one cog.
Tubular's idea of using multiple propellers is interesting. Instead of saying "plot this pixel here" by altering a value in hub, for a service routine, you might say "plot this pixel here" by sending a message to service number n, with x bytes attached.
Then you need a simple mailbox program to handle the transfers. Such a mailbox program might exist in your higher level language, but equally, it could also exist in pasm in a cog.
You send some bytes to the postoffice with a destination address. The post office (cog code say) gobbles them up and determines they are destined for a cog on another propeller. So it sends them to the local serial port cog, which sends the message on to the remote cog.
A little bit convoluted. But then again, if you think about things in terms of services rather than cogs, you might be able to combine the "serial port service" and the "post office" service into the same cog.
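The post office idea might look something like this as a host-side model. Everything here is an assumption made for illustration: the `PostOffice` name, the numbering of props and services, and especially the three-byte frame header, which is not any agreed wire format.

```python
LOCAL_PROP = 0  # hypothetical id for "this Propeller"


class PostOffice:
    """Route messages by (prop, service); remote ones go to the serial link."""

    def __init__(self):
        self.local_boxes = {}   # service number -> list of delivered payloads
        self.serial_out = []    # frames handed to the (hypothetical) serial port service

    def add_service(self, service):
        self.local_boxes[service] = []

    def send(self, prop, service, payload):
        """Deliver locally, or wrap in a frame for the remote prop.

        Assumed frame layout: [prop, service, length] + payload, so payloads
        must be shorter than 256 bytes in this sketch.
        """
        if prop == LOCAL_PROP:
            self.local_boxes[service].append(bytes(payload))
        else:
            frame = bytes([prop, service, len(payload)]) + bytes(payload)
            self.serial_out.append(frame)
```

The sender uses the same `send` call whether service n lives on this prop or another one, which is the part that would let the serial port service and the post office share a cog.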
I need to think about how one would program a propeller with 16 cogs. Or more...
Ross,
The reason for two tables is indeed so that you can reference them by index. One table is normally sized based on the number of cogs. The other table is based on the number of common services. I don't expect that every service will be represented. That would make it administratively difficult to add new services. I can certainly see a keyboard of some sort, a display, maybe a separate debug display, maybe a separate asynchronous serial port although it would have to be carefully designed to allow for a 4-port driver. I would want a service for some mass storage device although at what level I'm not sure yet.
Again, I expect the actual cog code to not need to reference these tables since the driver knows what kind of driver it is. The tables are to provide information for other cogs to make use of the driver, whether that provides a higher level abstraction or provides an interpreter of some sort (like Spin or an LMM).
Since I'm not the thread cop or moderator here it's not up to me if the thread ends. :-) My understanding of the point of this thread was to float the idea of creating a standard. The thought is there and for the most part people agree that it's a good idea. The real buy in has to come from the people writing compilers since you're going to be the ones generating the actual code. When writing C or C++ programs I as a programmer have little input on how/where variables and objects are stored or how they're accessed and I can't think of too many times that I needed to know or care. With the propeller and what we're discussing here the only time it would matter to me is if I try to access a pasm program or spin method such as a driver from a program written in C or if I want to use an object that conforms to the standard from pasm or spin.
The reason I think it would be a good time to start a new thread would be to allow for the discussion of the nature of the standard and maybe its definition. This thread would then "die on the vine" as the actual standard was defined.
My understanding of the point of this thread was to float the idea of creating a standard. The thought is there and for the most part people agree that it's a good idea.
Hi 4x5n,
It's early days yet, we've only just got over the inevitable "it'll never fly, Orville!" stage, and got people thinking.
However, I'll reread the whole thread over the weekend, and start a new one if there is enough input to justify moving to the next stage. The main point of the new thread will probably be to identify exactly what any "standard" should cover - cog-to-cog communication is a given, but should it also cover resource management and cog initialization, memory management, prop-to-prop communication etc.
If anybody has any thoughts on this, feel free to throw them onto the bonfire!
The main point of the new thread will probably be to identify exactly what any "standard" should cover - cog-to-cog communication is a given, but should it also cover resource management and cog initialization, memory management, prop-to-prop communication etc.
A standard for Prop-to-Prop communication? Seriously? Please forgive me if I interpret your comments as some kind of power trip, but your naked ambition to bind everyone with standards seems quixotic at best. You can't define standards a priori, either by fiat or by committee. Successful standards, if they're even needed (and I contend that a need has yet to be demonstrated), are not created; they evolve from experience in a frothy, messy cauldron of creativity. And experience necessarily entails a lot of false starts and missteps. To claim that this process can or should be controlled, tamed, or even guided seems the height of hubris to me.
Ross, even though I'm not a C programmer, I respect the work that you've put into Catalina. I just hate to see you squander a hard-earned reputation on this "standards" nonsense. Evolution and a free marketplace will winnow out the chaff, leaving the fittest "standards" to survive on their own. But we're nowhere near the point where we can declare or even guess what they will be.
-Phil
DISCLAIMER: I don't really care how any of this affects the C community, since I don't consider myself a candidate for membership. (Frankly, you C guys just don't seem to have much fun.) I do care, though, if any of this spills over into the Spin and PASM arena or if the energy devoted to C detracts from Spin and PASM in any way. So please think of me as a very concerned outsider looking in. Thanks.
Hi Phil,
I generally respect your opinions, but in this case I just so happen to disagree with them.
I tuned out early on. I have just read from pages 3 to 6 (here).
The services section was what I was interested in for Sphinx or some form of OS. IIRC I wanted 5 basic services to be present... Console In, Console Out, Aux In, Aux Out and SD card. The rest could be added subsequently, but the primaries would be defined/reserved using a rendezvous method. This is all detailed in a rather old thread (and nothing happened). There is even a wiki page.
I advocated that the four console in & out and Aux in & out be purely character based and could be dynamically substituted. Console in could be a keyboard, touch pad, or serial port. Console out could be a screen (TV, VGA, LCD) or serial. Aux in & out would be a secondary channel of the console, so could be any of the same drivers.
What prompted this was in particular, the different methods in TV, VGA and serial. In one "OUT" was used, next "TX", etc, and all to do exactly the same. This was just a spin naming problem, but it was grass-roots!
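The naming problem above is easy to show in a small model. The driver classes here are stand-ins invented for illustration (they are not the real TV or serial objects); the only detail taken from the post is that one driver calls its character output "OUT" and another calls it "TX". A thin adapter gives both the same call, which is what makes substitution possible.

```python
class SerialDriver:
    """Stand-in for a serial object whose character output is called tx()."""

    def __init__(self):
        self.sent = []

    def tx(self, ch):
        self.sent.append(ch)


class TvDriver:
    """Stand-in for a TV object whose character output is called out()."""

    def __init__(self):
        self.shown = []

    def out(self, ch):
        self.shown.append(ch)


class ConsoleOut:
    """Hypothetical character-based Console Out service: one put() for any driver."""

    def __init__(self, put):
        self._put = put          # the driver's own single-character output routine

    def put(self, ch):
        self._put(ch)

    def write(self, text):
        for ch in text:
            self.put(ord(ch))    # purely character based: one code at a time
```

Code written against `ConsoleOut` does not change when the screen is swapped for a serial port: `ConsoleOut(tv.out)` and `ConsoleOut(serial.tx)` are used identically.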
Of course, the last was buffers for SD. I agree, that now we have other mass storage options, they could now include Flash etc.
The other problems do not affect Spin programs, but the parameter loading differences place a huge burden on other languages if done in spin. But, these are often done for a reason of space. (possible solution below)
And all of these problems were just with Spin/Pasm.
What I did not realise in all this discussion until near the end, is that Catalina is reclaiming the hub space (great work Ross). That is an even more important reason for some form of standard because the variables may no longer be present in hub. Reclaiming hub space is a big plus when running C programs.
Here is an idea. Of course we DO NEED SOME FORM OF STANDARD first.
To conserve cog space, the parameters could be loaded by LMM. How without using valuable cog space??? Well, I have already done this in my zero-footprint debugger which uses the shadow ram. It is not a true LMM machine, but close enough to load the parameters via a simple LMM style loop. Because the LMM is actually a pasm program, it most likely can be compiled by Catalina (or any other language). Perhaps it could be a part of the "other language services" routines.
I am not sure if this helps Ross, but it's my input for now.
Ross, I realise cog to cog communications opens up a whole bag of possibilities. I'd like to see something just beyond a registry, with a simple mechanism for sharing data from remote cogs. This first iteration (don't dare call it a standard) would be useful to find where the issues lie, such as how much overhead inside the cog is required (surely some or all can be reclaimed after initialisation)
I have an industrial application that involves multiple cores and can arrange hardware for you to test on if required. I was thinking of something along the lines of Modbus for data distribution, and this would work, but it's hardly efficient bandwidth wise.
It has been suggested, by Phil and others, that there already is a standard.
Basically dictated by the hardware and SPIN/PASM summed up as "...cogs
communicate through the hub. That's it" Of course that is the standard most
current objects are written to.
Further it is suggested that using a different language than Spin does not
change anything or require further standardization.
Well, here is why a few standards or conventions would be a great help in
all languages other than Spin.
1) Many objects set parameters into the PASM DAT section from Spin prior
to loading the PASM to COG with coginit/new.
Now it is easy to extract the compiled PASM from Spin objects and use it as a
binary blob of firmware to load into COG from C or other languages. This can be
done with both BSTC and HomeSpun I believe.
Problem: The C or other language has no way to know where in that binary blob
it should tweak those parameters that were set by Spin originally.
Solution: If the convention/standard was that this "poking" of data into COG
code prior to loading was not allowed then all PASM code would be easily reused
across all languages.
This has been discussed here before and is on Jazzed's checkpoint list for
driver writing.
2) Most existing objects start their work as soon as they are loaded, even if
that is only making the first read of some command long (passed via PAR).
Problem: If you want to start 7 cogs with drivers and then load your main
application code to HUB and start it, this is going to fail, as the COGs are now
operating before the main application is in place, resulting in chaos.
Solution: An extra level of indirection in the parameter loading through PAR.
On cognew/coginit PAR points to some fixed location that is zero.
When the application runs it sets that fixed location to point to the
parameters the app has set up.
The COG then proceeds to read its parameters from there.
This was pointed out by Dr_A and is handled in Catalina. It should be rule 6 on
Jazzed's checkpoint list.
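To make the indirection concrete, here is a minimal sketch in plain C that models the rendezvous long as an ordinary variable. It is not Catalina's actual implementation; the struct fields and the names `driver_poll` and `app_publish` are made up for illustration, and a real driver would do the polling in PASM with rdlong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block; the field names are illustrative only. */
typedef struct {
    uint32_t command;
    uint32_t rx_pin;
    uint32_t tx_pin;
} driver_params_t;

/* Models the fixed hub location that PAR points to. It stays NULL (zero)
   until the application has been loaded and has built its parameters. */
typedef driver_params_t *volatile rendezvous_t;

/* Driver side, one poll: returns NULL while the rendezvous long is still
   zero, otherwise the address of the application's parameter block. */
static driver_params_t *driver_poll(const rendezvous_t *par)
{
    assert(par != NULL);
    return *par;
}

/* Application side: publish the parameter block once it is ready. */
static void app_publish(rendezvous_t *par, driver_params_t *params)
{
    assert(par != NULL);
    *par = params;
}
```

A driver started early would simply keep calling the equivalent of `driver_poll` until it gets a non-zero answer, so it does nothing harmful while the main application is still being loaded.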
These two little ideas, if they had been known and adopted years ago, would have
resulted in being able to now use all/most of those PASM parts of drivers and
other objects in any language that comes along. As it is, a lot of reworking is
required.
It has also been suggested that these standards should be handled by compiler
writers and that the normal Prop user need not know about them. I think the
above shows why this is not the case. The normal Prop user uses a lot of PASM,
and no compiler writer can help you make your PASM code more generally useful.
RossH, I know you are wanting to move this debate up in abstraction but these
simple points seem to have been missed by some here.
There are a couple of examples which address the queries that you raised in this draft copy of Getting Started with Propforth Assembler that I have put together.
It's only 4 pages or so but you might gain some insight into how Propforth fits together.
It includes two Forth routines. The first is a trivial routine which blinks an LED in cog 4 whilst printing a character of the alphabet on each blink in the current cog.
The second by Caskas is a Delta Sigma routine ported from Spin.
I really can't see the Jupiter Hive Group, and other Propforth users, being interested in standards other
than those imposed by the instruction set. IMO
Forth is a standalone system.
I can see why standards may be beneficial to C users, but that is not the title of the thread.
Evolution and a free marketplace will winnow out the chaff, leaving the fittest "standards" to survive on their own. But we're nowhere near the point where we can declare or even guess what they will be.
It seems like in general in these cases there are competing standards and one may rise to the top and become the chosen one. In this case of the Prop we have a lot of "roll your own" methods being used to suit the particular case at hand; I think an attempt at working out some suggested standards has merit. I think the relatively small size of the Prop community precludes the spontaneous creation of standards out of thin air, so if a few people have the time and energy to try I think that is a good thing. I'm a little ambivalent on the need for the registry and discovery concepts, but having some standard device interfaces and standards for cog-cog comms is appealing.
... 1) Many objects set parameters into the PASM DAT section from Spin prior
to loading the PASM to COG with coginit/new.
Now it is easy to extract the compiled PASM from Spin objects and use it as a
binary blob of firmware to load into COG from C or other languages. This can be
done with both BSTC and HomeSpun I believe.
Problem: The C or other language has no way to know where in that binary blob
it should tweak those parameters that were set by Spin originally.
...
If you load the cog code as a binary blob, you can also just poke into this blob with hardcoded offsets to change it before you start the cog. You can get the offsets from a BST listing, or with a little Spin method in the original object:
term.dec(@theVarToChange - @entry)
So such a driver can also easily be ported to any new language.
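From the C side, the poke could look something like the sketch below. The function name `patch_cog_image` is hypothetical, and it assumes the offset has already been converted to a long index: Spin's @ yields byte addresses, so the difference printed by the Spin snippet would be divided by 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Patch one long in a cog image held in hub RAM before the cog is started.
   long_offset is the distance from the image's entry label in longs,
   i.e. the byte offset printed by the original object divided by 4.
   Returns 0 on success, -1 if the offset lies outside the image. */
static int patch_cog_image(uint32_t *image, size_t image_longs,
                           size_t long_offset, uint32_t value)
{
    assert(image != NULL);
    if (long_offset >= image_longs) {
        return -1;              /* refuse to write past the blob */
    }
    image[long_offset] = value;
    return 0;
}
```

The obvious cost of this approach is that the offsets are tied to one particular build of the driver, so they have to be regenerated whenever the PASM source changes.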
Comments
The prop has enough unique attributes that it will require boot-strapping like this. Rather than grow frustrated over the state of the discussion, consider just tuning out for a bit.
But how does this help with supporting multiple languages?
For instance, I believe Forth allocates a fixed block per cog (which is in fact quite analogous - but not the same as - the C method of using a registry that contains a pointer to a block per long). The fact is that the basic method needs to be extended to be useful from a high level language - and two different languages currently do it in two different ways, which means they cannot currently use each other's cog programs.
There is benefit to be had from trying to standardize this if we can do so without a burdensome cost.
Ross.
What hasn't been mentioned yet is a method to benchmark the resultant cog add-in code.
I would think that most of what's been discussed is getting pushed further than the basic need.
If we can settle on an interprop / intercog timing benchmark then all the other
flavors will gain respect by their brevity to that initial known functionality.
Filling buffers and sending strings has a different requirement than
setting a bit in cog2 on prop3 (and its reflected acknowledgement to the sender(s) )
Start with the shortest possible and best benchmark and work toward
the more complicated needs, referring to the increased functionality
VS the induced lag from the shortest benchmark we can achieve.
jr
Yes, that is it. Chip's original series of drivers conform to this standard. However, many other objects do not (and again, in the interests of diplomacy, I'm not going to name them). So, getting all objects up to Chip's standard would be a huge achievement and would make it much easier to port between languages.
The 'standard' as I see it is for all the communication longs to be contiguous and for the start location of the block to be passed in par. I'd also add that it is worth trying to use cog code that reads these variables into cog variables using a loop rather than 'unrolled' code, because it will use less space. Perhaps unstated but worth stating is that the order of variables will be the same in hub and cog. You can name those variables to make them easy to reference, and (I think) that therefore they will need different names - some people use myvariable for the hub ones and _myvariable for the cog ones. Or you could use spin_myvariable for the hub and cog_myvariable for the cog ones.
If you happen to have a buffer as part of the common block, and you happen to also need to send some temporary startup variables, you can save some space by passing these through the buffer.
And then document everything in the code.
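As a rough illustration of the "contiguous block, same order, copy in a loop" convention, here is a sketch in C. In a real driver the loop would be a few PASM instructions using rdlong with a self-modified destination; the name `load_params` and the example values are assumptions, not taken from any existing object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` parameter longs from the contiguous hub block (whose start
   address would arrive in PAR) into the driver's private copies.
   The order of the variables is the same on both sides, so one loop
   replaces a run of "unrolled" per-variable reads. */
static void load_params(const volatile uint32_t *hub_block,
                        uint32_t *cog_vars, size_t count)
{
    assert(hub_block != NULL && cog_vars != NULL);
    for (size_t i = 0; i < count; i++) {
        cog_vars[i] = hub_block[i];
    }
}
```

Because only the count changes when a parameter is added, the cost in cog space stays fixed no matter how many longs are in the block.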
Now - over and above this standard one can think about other extra standards like a registry. I'm bursting with all sorts of clever ideas there, but I am also tempered by the fact that I am just about to sit down and write a new pasm driver for a touchscreen on the dracblade, so every 'cool' theoretical idea has to work in practice, and it has to justify any extra code space.
If a registry is needed, I'm still pondering what is the best format for such a registry. I have some ideas but I need to ponder about this some more.
In my experience these types of threads tend to devolve into debating societies and in the end nothing gets done. This thread has devolved to that level. As you pointed out, look at the to-do the word registry has caused! The reality is that for the most part people agree that if done right this could be useful and beneficial. It's also a reality that the people most affected are those writing the compilers since they're the ones that will implement the standards. If done well most people won't have to know that the standard exists or is in use.
1) A table indexed by cog # containing the starting address of the cog's communications area. This would be used by other cogs to communicate with the cog in question. A cog being started would find its communications area address in PAR. The cog's interface (not the cog itself) would update the table. A zero would be the null address.
2) A table indexed by device class containing the starting address of the cog's communications area. Typical device classes could be console input, console output, maybe mass storage. A device doesn't have to have a device class. This is intended to support device substitution, mostly for stream devices, where there are a variety of possible devices that could be used to provide the class's functionality. For mass storage, an SD card or a couple of flash chips or some large EEPROMs or a USB memory stick could all be used for holding files with some common basic functionality along with access maybe via another device class pointer to the low level functionality that's device specific.
This is slightly simpler than the proposal earlier in this thread and has the advantage that the cog's code doesn't normally have to know about these tables. Variations on this have been suggested by others before, but I thought I'd just restate it for discussion.
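A sketch of the two tables in C might look like this. The device class list, the table sizes and all the names are assumptions for discussion only; the registration is done by the cog's interface code, not by the cog itself, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_COGS = 8 };

/* Hypothetical device classes; only typical examples are listed. */
typedef enum {
    CLASS_NONE = -1,        /* a device doesn't have to have a class */
    CLASS_CONSOLE_IN,
    CLASS_CONSOLE_OUT,
    CLASS_MASS_STORAGE,
    NUM_CLASSES
} device_class_t;

/* Both tables hold the start address of a communications area.
   A NULL (zero) entry is the null address. */
typedef struct {
    uint32_t *by_cog[NUM_COGS];
    uint32_t *by_class[NUM_CLASSES];
} comm_tables_t;

/* Called by the cog's interface when the driver is started. */
static void register_driver(comm_tables_t *t, int cog,
                            device_class_t cls, uint32_t *area)
{
    assert(t != NULL && cog >= 0 && cog < NUM_COGS);
    t->by_cog[cog] = area;
    if (cls != CLASS_NONE) {
        assert(cls >= 0 && cls < NUM_CLASSES);
        t->by_class[cls] = area;
    }
}

/* Lookups are plain indexing, no searching; out-of-range gives NULL. */
static uint32_t *find_by_cog(const comm_tables_t *t, int cog)
{
    return (cog >= 0 && cog < NUM_COGS) ? t->by_cog[cog] : NULL;
}

static uint32_t *find_by_class(const comm_tables_t *t, device_class_t cls)
{
    return (cls >= 0 && cls < NUM_CLASSES) ? t->by_class[cls] : NULL;
}
```

Swapping a serial console for a TV console would then just mean registering a different communications area under CLASS_CONSOLE_OUT.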
I think in this case the "coglets" will need to have a minimal interface, and that interface will need to be standardized for each class of objects. The reason is that the goal is to use any language; since the interface runs in the context of the caller, the interface will need to be written for each calling language to be supported. If the idea is to be able to swap out various coglets that support, say, console output, the interface on the caller side will need to work with any coglet that properly meets the requirements of its device type code for console output.
C.W.
Hi Mike,
This is simple and neat. I do have one question, though (and please don't throw rocks at me, I'm battered and bruised enough already ) - if you combine these two tables into one, that's essentially all Catalina's current registry is - what is the purpose in separating them? The main benefit I can see for having two tables is that you can always access the information you need by index rather than having to search - so performance improves - but the downside is that while the list of cogs is fixed and finite (i.e. 8) the list of device classes is not - and it could grow over time. So how do you know how big that table should be?
Ross.
Hi 4x5n,
I understand your points, but the main thing is that the whole thing will only work if enough people buy into the concept with enough enthusiasm to make it work.
So I think we need to leave this thread running a little longer - people are still coming to grips with the idea itself, the potential benefits, and the potential costs. At this point, the actual method we eventually agree on matters less.
Also, I think this thread may have hit bottom and may actually be on the way up!
Ross.
1) The title of this thread says "communicating *with* cogs". Not *between* cogs, so I presume this standard is not really about inter cog communications?
2) Let's say we have a bootloader and we load cogs in two passes. Does the second load overwrite the registry?
3) Let's say that you have a registry, and each entry in the registry points to a block of data associated with that cog. Will the second load overwrite that data? And to answer that partially, obviously it is pretty unlikely if you have a keyboard driver with 19 longs, but what if the driver is a display with 20k of display buffer and that buffer is wrapped up as part of the group of data that cog needs. Is there a way that second loader program could know that the first loader has loaded up a cog that requires 20k of hub space? Is there a 'fit' command you could use as you write that second program to give an indication you are going to run out of space soon?
4) What if you decide to load up 7 cogs with your preloader, and then load the main program and now you want to swap in and out some more plugins. How do you handle the memory allocated to the plugins? As an example, we now swap out our display driver using 20k of hub ram and replace it with a mouse driver that uses only 7 longs. Does that free up 20k of hub ram?
I have a feeling you may have already thought about some of this with Catalina...
Yes, I agree. To combine this thread with Mike's idea - each "device class" should offer the same interface.
Ross.
I guess that would be code in whatever higher level language you are using. If we are going to break out of the model of "load up the cogs and never touch them again" to reloading cogs, I guess we have to think about the allocation of hub memory that those cogs use.
And maybe that "post office" code will also handle some of the intercog communications. How does your registry model handle, for instance, cog 1 and cog 2 trying to send a message at the same time to cog 3?
Ross I think what you propose has some merit. I'm working on a 'plug and play system' and a registry would make things easier, and better a registry adhering to some kind of standard than none. I'm coming from a different angle but a registry looks useful either way.
The biggest benefit I can see is if you can have cogs on remote props mapping their requests/responses into memory on the "local" prop, perhaps via a conduit cog. This would make it very easy to (never run out of cogs again).
If it hasn't already been considered, I think it would be worth looking closely at the inter cog communications outline for the Prop2, to make sure whatever is ratified will work in well once prop2 is released.
I'm also debating whether a "service oriented" registry rather than a "cog oriented" one makes more sense, because it's quite possible for 1 cog to provide several services at once, or conversely a VGA driver may span up to 7 cogs. For multi prop environments this helps because 1 local cog may actually be a "conduit comms cog" for sending/retrieving data for multiple services on a remote prop.
Currently the application software has to "know" that such contention might arise, and use a lock (or some other mechanism) to prevent it. One change I would make to Catalina's registry would be to allow the inclusion of a lock allocated for each cog program (and note the same lock could be used for multiple cog programs). If the lock in the registry is non-zero, the software would have to acquire the specified lock before proceeding (lock zero could not be used for this - but that's not a big problem).
This would make the mechanism invisible from the application level - if the cog knew it needed a lock, it would allocate one during initialization and put it in the registry, and each application that tries to talk to the cog would know it has to use it.
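Here is a small sketch of how a caller might honour such a lock field. The hardware locks are simulated with a bit mask so the logic can be followed off-chip; on a real Prop the two `sim_` helpers would be lockset and lockclr. The entry layout and function names are hypothetical and are not Catalina's actual registry format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry entry with the proposed lock field. */
typedef struct {
    uint32_t *comm_area;
    uint8_t   lock_id;      /* 0 = no lock needed, 1..7 = lock to acquire */
} registry_entry_t;

static uint8_t sim_lock_bits;   /* stand-in for the 8 hardware locks */

/* Like lockset: sets the lock and reports whether it was already set. */
static int sim_lockset(unsigned id)
{
    int was_set = (sim_lock_bits >> id) & 1u;
    sim_lock_bits |= (uint8_t)(1u << id);
    return was_set;
}

/* Like lockclr: releases the lock. */
static void sim_lockclr(unsigned id)
{
    sim_lock_bits &= (uint8_t)~(1u << id);
}

/* Returns 1 if the caller may talk to the cog now, 0 if it must retry. */
static int try_begin_request(const registry_entry_t *e)
{
    assert(e != NULL && e->lock_id < 8);
    if (e->lock_id == 0) {
        return 1;                       /* this cog needs no lock */
    }
    return sim_lockset(e->lock_id) == 0;
}

/* Release the lock (if any) once the request has completed. */
static void end_request(const registry_entry_t *e)
{
    assert(e != NULL);
    if (e->lock_id != 0) {
        sim_lockclr(e->lock_id);
    }
}
```

A caller would loop on `try_begin_request` until it succeeds, do its transfer through the communications area, and then call `end_request`.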
Ross.
Hi tubular,
All good points. I think your "prop to prop" conduit is a great idea. I do something quite similar to this in Catalina via "proxy" drivers - these allow a cog program on one Prop to act as proxy for the real cog program that is actually executing on another Prop - the local proxy cog presents the same interface as the remote cog would if it was executing locally.
I currently use it only for my HMI (keyboard, mouse, screen) and SD Card plugins - but I could generalize it to be able to act as proxy for any cog program.
Also, I am kind of swayed by the idea of a service-oriented registry rather than a cog-oriented one - Eric suggested this earlier in the thread. I just would like to see how much complexity it adds in practice.
Ross.
I too have been thinking about services vs cogs. Tubular's example is a good one - some displays span multiple cogs, or at the other extreme, multiple services could fit on one cog.
Tubular's idea of using multiple propellers is interesting. Instead of saying "plot this pixel here" by altering a value in hub, for a service routine, you might say "plot this pixel here" by sending a message to service number n, with x bytes attached.
Then you need a simple mailbox program to handle the transfers. Such a mailbox program might exist in your higher level language, but equally, it could also exist in pasm in a cog.
You send some bytes to the postoffice with a destination address. The post office (cog code say) gobbles them up and determines they are destined for a cog on another propeller. So it sends them to the local serial port cog, which sends the message on to the remote cog.
A little bit convoluted. But then again, if you think about things in terms of services rather than cogs, you might be able to combine the "serial port service" and the "post office" service into the same cog.
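The routing decision the post office has to make is small. A sketch, assuming a made-up message header with a destination prop and service number (nothing here is an agreed format):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message header: which prop and which service the bytes
   are destined for, followed by the attached payload. */
typedef struct {
    uint8_t        dest_prop;
    uint8_t        dest_service;
    uint8_t        length;
    const uint8_t *payload;
} message_t;

typedef enum {
    ROUTE_LOCAL_SERVICE,    /* hand over to a service on this prop */
    ROUTE_SERIAL_LINK       /* pass to the serial port for a remote prop */
} route_t;

/* Decide where the post office should forward a message. */
static route_t post_office_route(uint8_t local_prop, const message_t *msg)
{
    assert(msg != NULL);
    return (msg->dest_prop == local_prop) ? ROUTE_LOCAL_SERVICE
                                          : ROUTE_SERIAL_LINK;
}
```

If the serial port service and the post office share a cog, the ROUTE_SERIAL_LINK case becomes a direct transmit rather than another hub mailbox hop.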
I need to think about how one would program a propeller with 16 cogs. Or more...
The reason for two tables is indeed so that you can reference them by index. One table is normally sized based on the number of cogs. The other table is based on the number of common services. I don't expect that every service will be represented. That would make it administratively difficult to add new services. I can certainly see a keyboard of some sort, a display, maybe a separate debug display, maybe a separate asynchronous serial port although it would have to be carefully designed to allow for a 4-port driver. I would want a service for some mass storage device although at what level I'm not sure yet.
Again, I expect the actual cog code to not need to reference these tables since the driver knows what kind of driver it is. The tables are to provide information for other cogs to make use of the driver, whether that provides a higher level abstraction or provides an interpreter of some sort (like Spin or an LMM).
Since I'm not the thread cop or moderator here it's not up to me if the thread ends. :-) My understanding of the point of this thread was to float the idea of creating a standard. The thought is there and for the most part people agree that it's a good idea. The real buy in has to come from the people writing compilers since you're going to be the ones generating the actual code. When writing C or C++ programs I as a programmer have little input on how/where variables and objects are stored or how they're accessed and I can't think of too many times that I needed to know or care. With the propeller and what we're discussing here the only time it would matter to me is if I try to access a pasm program or spin method such as a driver from a program written in C or if I want to use an object that conforms to the standard from pasm or spin.
The reason I think it would be a good time to start a new thread would be to allow for the discussion of the nature of the standard and maybe its definition. This thread would then "die on the vine" as the actual standard was defined.
I couldn't agree more!
Hi 4x5n,
It's early days yet, we've only just got over the inevitable "it'll never fly, Orville!" stage, and got people thinking.
However, I'll reread the whole thread over the weekend, and start a new one if there is enough input to justify moving to the next stage. The main point of the new thread will probably be to identify exactly what any "standard" should cover - cog-to-cog communication is a given, but should it also cover resource management and cog initialization, memory management, prop-to-prop communication etc?
If anybody has any thoughts on this, feel free to throw them onto the bonfire!
Ross.
A standard for Prop-to-Prop communication? Seriously? Please forgive me if I interpret your comments as some kind of power trip, but your naked ambition to bind everyone with standards seems quixotic at best. You can't define standards a priori, either by fiat or by committee. Successful standards, if they're even needed (and I contend that a need has yet to be demonstrated), are not created; they evolve from experience in a frothy, messy cauldron of creativity. And experience necessarily entails a lot of false starts and missteps. To claim that this process can or should be controlled, tamed, or even guided seems the height of hubris to me.
Ross, even though I'm not a C programmer, I respect the work that you've put into Catalina. I just hate to see you squander a hard-earned reputation on this "standards" nonsense. Evolution and a free marketplace will winnow out the chaff, leaving the fittest "standards" to survive on their own. But we're nowhere near the point where we can declare or even guess what they will be.
-Phil
DISCLAIMER: I don't really care how any of this affects the C community, since I don't consider myself a candidate for membership. (Frankly, you C guys just don't seem to have much fun.) I do care, though, if any of this spills over into the Spin and PASM arena or if the energy devoted to C detracts from Spin and PASM in any way. So please think of me as a very concerned outsider looking in. Thanks.
Hi Phil,
I generally respect your opinions, but in this case I just so happen to disagree with them.
Is that ok with you?
Ross.
The services section was what I was interested in for Sphinx or some form of OS. IIRC I wanted 5 basic services to be present... Console In, Console Out, Aux In, Aux Out and SD card. The rest could be added subsequently, but the primaries would be defined/reserved using a rendezvous method. This is all detailed in a rather old thread (and nothing happened). There is even a wiki page.
I advocated that the four console in & out and Aux in & out be purely character based and could be dynamically substituted. Console in could be a keyboard, touch pad, or serial port. Console out could be a screen (TV, VGA, LCD) or serial. Aux in & out would be a secondary channel of the console, so could be any of the same drivers.
What prompted this was in particular, the different methods in TV, VGA and serial. In one "OUT" was used, next "TX", etc, and all to do exactly the same. This was just a spin naming problem, but it was grass-roots!
Of course, the last was buffers for SD. I agree that, now we have other mass storage options, these could include Flash etc.
The other problems do not affect Spin programs, but the parameter loading differences place a huge burden on other languages if done in Spin. These are often done that way to save space, though. (possible solution below)
And all of these problems were just with Spin/Pasm.
What I did not realise in all this discussion until near the end, is that Catalina is reclaiming the hub space (great work Ross). That is an even more important reason for some form of standard because the variables may no longer be present in hub. Reclaiming hub space is a big plus when running C programs.
Here is an idea. Of course we DO NEED SOME FORM OF STANDARD first.
To conserve cog space, the parameters could be loaded by LMM. How, without using valuable cog space? Well, I have already done this in my zero-footprint debugger, which uses the shadow RAM. It is not a true LMM machine, but it is close enough to load the parameters via a simple LMM-style loop. Because the LMM is actually a PASM program, it can most likely be compiled by Catalina (or any other language). Perhaps it could be part of the "other language services" routines.
I am not sure if this helps Ross, but it's my input for now.
I have an industrial application that involves multiple cores and can arrange hardware for you to test on if required. I was thinking of something along the lines of Modbus for data distribution, and this would work, but it's hardly efficient bandwidth wise.
Basically dictated by the hardware and SPIN/PASM, summed up as "...cogs
communicate through the hub. That's it". Of course that is the standard most
current objects are written to.
Further it is suggested that using a different language than Spin does not
change anything or require further standardization.
Well, here is why a few standards or conventions would be a great help in
all languages other than Spin.
1) Many objects set parameters into the PASM DAT section from Spin prior
to loading the PASM to COG with coginit/new.
Now it is easy to extract the compiled PASM from Spin objects and use it as a
binary blob of firmware to load into COG from C or other languages. This can be
done with both BSTC and HomeSpun I believe.
Problem: The C or other language has no way to know where in that binary blob
it should tweak those parameters that were set by Spin originally.
Solution: If the convention/standard was that this "poking" of data into COG
code prior to loading was not allowed then all PASM code would be easily reused
across all languages.
This has been discussed here before and is on Jazzed's checkpoint list for
driver writing.
2) Most existing objects start their work as soon as they are loaded, even if
that is only making the first read of some command long (passed via PAR).
Problem: If you want to start 7 cogs with drivers and then load your main
application code to HUB and start it, this is going to fail, as the COGs are now
operating before the main application is in place, resulting in chaos.
Solution: An extra level of indirection in the parameter loading through PAR.
On cognew/coginit PAR points to some fixed location that is zero.
When the application runs it sets that fixed location to point to the
parameters the app has set up.
The COG then proceeds to read its parameters from there.
This was pointed out by Dr_A and is handled in Catalina. It should be rule 6 on
Jazzed's checkpoint list.
These two little ideas, if they had been known and adopted years ago, would have
resulted in being able to now use all/most of those PASM parts of drivers and
other objects in any language that comes along. As it is, a lot of reworking is
required.
It has also been suggested that these standards should be handled by compiler
writers and that the normal Prop user need not know about them. I think the
above shows why this is not the case. The normal Prop user uses a lot of PASM,
and no compiler writer can help you make your PASM code more generally useful.
RossH, I know you are wanting to move this debate up in abstraction but these
simple points seem to have been missed by some here.
There are a couple of examples which address the queries that you raised in this draft copy of Getting Started with Propforth Assembler that I have put together.
It's only 4 pages or so but you might gain some insight into how Propforth fits together.
It includes two Forth routines. The first is a trivial routine which blinks an LED in cog 4 whilst printing a character of the alphabet on each blink in the current cog.
The second by Caskas is a Delta Sigma routine ported from Spin.
I really can't see the Jupiter Hive Group, and other Propforth users, being interested in standards other
than those imposed by the instruction set. IMO
Forth is a standalone system.
I can see why standards may be beneficial to C users, but that is not the title of the thread.
Ron
It seems like in general in these cases there are competing standards and one may rise to the top and become the chosen one. In this case of the Prop we have a lot of "roll your own" methods being used to suit the particular case at hand; I think an attempt at working out some suggested standards has merit. I think the relatively small size of the Prop community precludes the spontaneous creation of standards out of thin air, so if a few people have the time and energy to try I think that is a good thing. I'm a little ambivalent on the need for the registry and discovery concepts, but having some standard device interfaces and standards for cog-cog comms is appealing.
C.W.
Standards are wonderful things. We should all have one:)
If you load the cog code as a binary blob, you can also just poke into this blob with hardcoded offsets to change it before you start the cog. You can get the offsets from a BST listing, or with a little Spin method in the original object that prints @theVarToChange - @entry. So such a driver can also easily be ported to any new language.
Andy