Prop and RISC — Parallax Forums

Prop and RISC

ServoM Posts: 10
edited 2010-03-16 16:40 in Propeller 1
I am not an expert on instruction set architectures, but I wonder if anyone has written up a comparison of the Propeller instruction set and register architecture with other architectures, mainly RISC.

Are there notable similarities and differences? Which RISC architecture is the closest match, if one exists?

Comments

  • Mike Green Posts: 23,101
    edited 2010-03-12 16:11
    The Propeller is a RISC architecture. All instructions, with the exception of those involving the hub, take 4 clock cycles to execute. The hub instructions can be considered to be I/O instructions which can "stall" the processor for periods of time for external events to take place.

    No, there's not been a formal written comparison.

    It would be difficult to say what other RISC architectures are similar. Various parts of the Propeller's architecture can be seen in different historical computers. Having conditionally executed instructions controlled by a condition code in all instructions is an idea that's very old. Similarly, having results conditionally changed (like the flags and result operand) is a very old idea in computer design that's appeared over and over. There's really nothing in the Propeller's design that's fundamentally new, but the combination of choices along with the implementation and the attention to detail is unusual.
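The conditional execution described above can be sketched in a few lines. This is a minimal model, not the Propeller's actual implementation: the condition field is treated as a 4-bit truth table indexed by the C and Z flags (the bit ordering `2*C + Z` is an assumption to check against the Propeller manual), and `execute_add` with its `write_result` switch is a hypothetical helper illustrating a result write that can be suppressed.

```python
# Condition field modelled as a 4-bit truth table over the C and Z flags.
# Assumption: bit index = 2*C + Z (verify against the Propeller manual).
IF_ALWAYS = 0b1111
IF_NEVER = 0b0000
IF_Z = 0b1010    # bits 1 and 3 set: true whenever Z == 1
IF_NZ = 0b0101   # bits 0 and 2 set: true whenever Z == 0
IF_C = 0b1100    # bits 2 and 3 set: true whenever C == 1
IF_NC = 0b0011   # bits 0 and 1 set: true whenever C == 0


def condition_met(cond, c, z):
    """Return True if an instruction with this condition field should execute."""
    return bool((cond >> ((c << 1) | z)) & 1)


def execute_add(cond, dest, src, c, z, write_result=True):
    """Hypothetical conditional 32-bit ADD; returns the new value of dest.

    If the condition fails, the instruction acts as a no-op.
    write_result=False models an instruction whose result write is suppressed.
    """
    if not condition_met(cond, c, z):
        return dest
    result = (dest + src) & 0xFFFFFFFF
    return result if write_result else dest
```

Because every instruction carries its own condition field, short conditional sequences need no branch at all; the skipped instructions simply do nothing.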
  • potatohead Posts: 10,261
    edited 2010-03-14 21:43
    I think the round robin shared memory access is new, and responsible for much of the ease of use in multi-processing seen on the chip.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Propeller Wiki: Share the coolness!
    8x8 color 80 Column NTSC Text Object
    Safety Tip: Life is as good as YOU think it is!
  • localroger Posts: 3,452
    edited 2010-03-14 22:32
    Actually, round-robin shared memory is pretty old. As one of the online tutorials says, it's not the most efficient way to share memory between multiple cores, but it is the most deterministic and reliable.
  • potatohead Posts: 10,261
    edited 2010-03-14 23:34
    What are some older systems that used it, like the propeller does?

    I probably should have expanded some. The combination of separate COG and HUB memory address spaces, where code only executes on the COG, is pretty unique. The non-associative copy to COG on cognew, in particular, is not something I've seen elsewhere. Anyway, if you've some references in mind, please link me up! I like reading that kind of stuff.

    Post Edited (potatohead) : 3/15/2010 1:37:21 AM GMT
  • localroger Posts: 3,452
    edited 2010-03-15 00:21
    potatohead, Hub RAM is basically the simplest version possible of what is sometimes called multiport RAM; it has to run at 8x its nominal access speed in order to serve 8 masters. There are more efficient sharing algorithms out there -- actually there's a pretty extensive literature -- but what Hub RAM does is the core from which they all spring. You won't find a lot of literature describing it for the same reason you won't find a lot of literature describing how to implement a flip-flop. The radical thing in the Prop's case is taking this very basic, inefficient, too-obvious-to-be-useful thing and using it anyway because it allows deterministic timing. All the other schemes meant to improve on round-robin access are faster most of the time, but sometimes they are much slower. That never happens on a Prop; you always get to the Hub in 22 cycles or less, no matter what else is happening. No other sharing system can guarantee that kind of worst-case performance.
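A small timing model makes the worst-case guarantee concrete. It is a sketch under stated assumptions: each cog's window comes around every 16 clocks (8 cogs, 2 clocks per slot, as discussed in this thread), and `BASE_COST = 7` is an assumed fixed cost for the access itself, chosen so that the worst case matches the 22 cycles quoted above. The names are illustrative only.

```python
N_COGS = 8
SLOT_CLOCKS = 2                    # hub moves to the next cog every 2 system clocks
ROTATION = N_COGS * SLOT_CLOCKS    # 16 clocks between windows for any one cog
BASE_COST = 7                      # assumed clocks for the access once the window arrives


def hub_access_clocks(cog, request_clock):
    """Total clocks for a hub access by `cog` requested at `request_clock`."""
    window_start = cog * SLOT_CLOCKS
    # Clocks until this cog's next window; always in the range 0..15.
    wait = (window_start - request_clock) % ROTATION
    return wait + BASE_COST


def latency_range(cog):
    """Best and worst case over every possible request phase."""
    costs = [hub_access_clocks(cog, t) for t in range(ROTATION)]
    return min(costs), max(costs)


if __name__ == "__main__":
    for cog in range(N_COGS):
        print(cog, latency_range(cog))
```

Note that nothing in `hub_access_clocks` depends on what the other cogs are doing. That independence is the whole point: the bound holds for every cog regardless of load.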

    Running the individual processors that share such multiport RAM on their own local non-shared RAM is a pretty Propeller-unique idea, particularly making the unshared space so small and the instruction set so rich. As for the cognew process for transferring a hub RAM image to a Cog, IME you're right; I've never seen anything quite like that anywhere else. I do wish it had occurred to Chip to provide cleaner ways to reclaim those 2K images for later use once the Cogs are launched; with only 32K of Hub RAM, a few 2K cog images are an important resource worth reclaiming.
  • Peter Jakacki Posts: 10,193
    edited 2010-03-15 00:46
    localroger said...
    <snip> I do wish it had occurred to Chip to provide cleaner ways to reclaim those 2K images for later use once the Cogs are launched; with only 32K of Hub RAM a few 2K cog images are an important resource worth reclaiming.

    Wouldn't it be possible for the compiler to be modified to bunch all the code stuff up in the top of memory or somewhere? At present we use the DAT directive, but dat :) doesn't mean that it's PASM, so we really need a CODE/PASM/COG directive to help the compiler place the image in an area that's easy to reclaim rather than interspersed throughout. Buffers could shadow the cog image, provided of course that the image doesn't get wiped before it's used! Anyway, if we at least had the option of specifying these things as well as VAR origins, I would for instance shadow the cog images with video memory and start up the video last. Having all that PASM in a reusable area means I can pack more Spin code in, and I need to pack more in.
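The reclaiming idea can be illustrated with a toy memory model. This is only a sketch: the `Cog` class and its `launch` method are hypothetical stand-ins for cognew, the 32K and 2K sizes are the round figures used in this thread, and placing the image at the top of hub RAM is the layout proposed above rather than anything the existing tools do.

```python
HUB_BYTES = 32 * 1024        # 32K of hub RAM, as stated in the thread
COG_IMAGE_BYTES = 2 * 1024   # "2K" cog image, as stated in the thread


class Cog:
    """Toy cog: owns private RAM and only ever runs code from it."""

    def __init__(self):
        self.ram = bytearray(COG_IMAGE_BYTES)

    def launch(self, hub, image_addr):
        """One-time copy of the image from hub RAM into the cog's own RAM."""
        self.ram[:] = hub[image_addr:image_addr + COG_IMAGE_BYTES]


hub = bytearray(HUB_BYTES)

# Hypothetical layout: cog images grouped at the top of hub RAM.
image_addr = HUB_BYTES - COG_IMAGE_BYTES
hub[image_addr:image_addr + COG_IMAGE_BYTES] = bytes([0xA5]) * COG_IMAGE_BYTES

cog = Cog()
cog.launch(hub, image_addr)

# Once the cog is running, the hub copy is dead weight, so the same
# region can be reused as a buffer (for example, video memory).
hub[image_addr:image_addr + COG_IMAGE_BYTES] = bytes(COG_IMAGE_BYTES)
```

The ordering constraint mentioned above shows up directly: the buffer may only overwrite the region after `launch` has run, which is why starting the video driver last would be the safe choice.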

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    *Peter*
  • localroger Posts: 3,452
    edited 2010-03-15 01:41
    PJ -- yes, there's nothing inherent even in the design of the Spin interpreter that requires it to be hard to reclaim DAT images. Just as there's nothing in the interpreter that forbids using GOTO; the interpreter itself implements GOTO (since it's necessary for low level implementation of things like REPEAT) and it's the PropTool and Spin compiler that refuse us its use.
  • potatohead Posts: 10,261
    edited 2010-03-15 01:49
    Agreed on reclaiming the image space. With PASM it's not that hard to do, but with SPIN, it's more difficult.

    One thing I'm not clear on is the HUB operating at 8X. Since it's round-robin, doesn't that really mean it's just 1X?

    The trade-off with round robin that I think is most significant is ease of use. Having bits of code just work, no matter what the state of the multi-processor is, really is powerful. Easy things are pretty easy, while hard things are not made impossible. Some peak speed potential is off the table, but given the mess I've seen associated with more complex shared memory systems, the odds of using them badly are high, and that isn't the case on the Propeller.

  • localroger Posts: 3,452
    edited 2010-03-16 00:14
    potatohead, what I meant by the Hub being 8x is that if you only had a single Cog and you wanted to provide ca. 16 clock cycle access (as we get from the Prop) the Hub could get away with running at 1/16 the system clock rate (and a great power saver that would be). But it actually runs at half the system clock rate, or 8x faster than it would need to for one cog, in order to give that level of performance to all 8 cogs.

    Most shared memory systems deliver considerably better average performance than that, at the cost of a lot worse worst-case performance when the stars don't line up.

    Now that I'm thinking about it I wonder why the Hub runs at half speed; maybe the Hub needs an extra clock cycle to change state before it can use a clock cycle to access Hub RAM? It would sure change the complexion of things if the Hub could sync every 8 cycles instead of every 16.
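The arithmetic behind the "8x" figure, and the what-if about syncing every 8 cycles, can be written out directly. This is a sketch using only the numbers from this thread (8 cogs, one hub slot every 2 system clocks); the function names are illustrative.

```python
from fractions import Fraction


def window_period(n_cogs, clocks_per_slot):
    """System clocks between successive hub windows for any one cog."""
    return n_cogs * clocks_per_slot


def hub_speedup_over_single_cog(n_cogs, clocks_per_slot):
    """How much faster the hub runs than one cog alone would require."""
    hub_rate = Fraction(1, clocks_per_slot)  # hub slots per system clock
    single_cog_rate = Fraction(1, window_period(n_cogs, clocks_per_slot))
    return hub_rate / single_cog_rate


if __name__ == "__main__":
    print(window_period(8, 2))                # 16: the sync interval discussed above
    print(hub_speedup_over_single_cog(8, 2))  # 8: the "8x" figure
    print(window_period(8, 1))                # 8: if the hub could run at full clock rate
```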
  • Cluso99 Posts: 18,069
    edited 2010-03-16 05:42
    potatohead: I owned and worked on a mini-computer with a similar architecture from the early 70s onwards. It was the Friden, then Singer, and finally ICL System Ten, released in 1969 using discrete logic. A 1981 redesign, the ICL System 25, was produced until 1993 and maintained until 1999. Marks & Spencer was a major user in the UK. Sears was a big user in the US in the early days.

    Hub memory was called Common and Cog memory was called Partition. It could have up to 20 Partitions. The System Ten ended up with 16 instructions, including memory-to-memory multiply and divide, and had no registers. It was a decimal machine: yes, memory was decimally addressed. A multiply could take up to 10 digits by up to 10 digits; the result, A+B digits long, was stored in B, all in decimal, so no overflow was possible.
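The "no overflow possible" property follows from simple arithmetic: an A-digit number times a B-digit number is always less than 10^(A+B), so it fits in A+B digits. The sketch below demonstrates this with decimal strings; `decimal_multiply` is a hypothetical helper for illustration and says nothing about how the System Ten hardware actually performed the operation.

```python
MAX_DIGITS = 10  # operand limit quoted above


def decimal_multiply(a_digits, b_digits):
    """Multiply two unsigned decimal strings.

    The result is zero-padded to len(a_digits) + len(b_digits) digits,
    which is always wide enough to hold the product.
    """
    for operand in (a_digits, b_digits):
        if not (operand.isascii() and operand.isdigit()):
            raise ValueError("operands must be unsigned decimal strings")
        if not 1 <= len(operand) <= MAX_DIGITS:
            raise ValueError("operands must be 1 to 10 digits long")
    width = len(a_digits) + len(b_digits)
    product = int(a_digits) * int(b_digits)
    return str(product).zfill(width)


if __name__ == "__main__":
    # Largest possible operands still fit in A+B = 20 digits.
    print(decimal_multiply("9999999999", "9999999999"))
```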

    The Partitions were actually time-sliced by hardware. It was programmed in assembler and ran online order entry, etc, which was very unusual at the time.

    http://www.computermuseum.org.uk/fixed_pages/icl_system_25.html
    http://members.iinet.net.au/~daveb/S10/Sys-10.html

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Links to other interesting threads:

    · Home of the MultiBladeProps: TriBlade, RamBlade, SixBlade, website
    · Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
    · Prop Tools under Development or Completed (Index)
    · Emulators: CPUs Z80 etc; Micros Altair etc; Terminals VT100 etc; (Index) ZiCog (Z80), MoCog (6809)
    · Prop OS: SphinxOS, PropDos, PropCmd
    · Search the Propeller forums (uses advanced Google search)
    My cruising website is: www.bluemagic.biz
    MultiBlade Props: www.cluso.bluemagic.biz
  • potatohead Posts: 10,261
    edited 2010-03-16 07:47
    Thanks for the link to a very interesting old machine.

    I think it's worth noting that the machine actually had one memory space. Each of its concurrent tasks executed in a "local" partition with consistent memory addressing: positive offsets were "task" or "partition" relative, but still in the physical RAM, while negative offsets addressed the shared, non-partitioned memory.

    The non-associative, separate memory space of the COG, as seen in the Propeller, is quite unique, IMHO. It is a separate address space both physically and logically, whereas the S10 machine appeared to have memory spaces that were logically separate but not physically separate.

    Still, that machine shares a lot of interesting design ideas. Good read. Thanks again for linking that.

  • Cluso99 Posts: 18,069
    edited 2010-03-16 12:32
    potatohead: The memory, while built from contiguous banks, was physically separated into the partitions by hardware. The memory size of each partition was set by jumper links, using large "U"-shaped gold pins on the front of the boards in the cabinet. The System Ten used core memory. The later System 25 used battery-backed DRAM, and the configuration was soft and set at boot time. No partition could access another partition's memory. The partition and common memory, from a user point of view, were totally separate, just like the Propeller. I gave my System 25 away only 2 years ago. I sold my System Tens (yes, multiples, but I only used one) when I stopped using them in 2000; they were sold for scrap and shipped to China to recover the gold. I used the System Ten and System 25 to develop software and hardware from 1974 to 2000.

    I wrote an emulation for the System 25 in the early 90s, which was formally validated but not sold. It ran 3x faster on a 33MHz 80486. I wrote it in assembler and targeted the 486 instructions for speed. Maybe some time I will write an emulation for it on RamBlade :)

  • potatohead Posts: 10,261
    edited 2010-03-16 15:25
    What an interesting old machine. And I think that says something about it, given a more modern, and way more capable CPU only got 3X.

    I should have been clearer in my post above. The document you linked did clearly say it was separate memory, and you gotta love hand-configuring those too. That's almost like a COG! No wonder you like the Prop :)

  • Cluso99 Posts: 18,069
    edited 2010-03-16 16:40
    Yes... I almost feel at home LOL
