idea for alternate fast inter-cog signalling
Bill Henning
in Propeller 2
Ok, this is different, and may not be at all feasible.
The idea is to leverage off the muxes already used for the hub, and interrupts.
Each cog gets:
(Simple version)
MSGSOURCE - 5 bit register, reads 1cccc (cccc = sending cog) when a message has arrived, can cause interrupt
MSG - 32 bit message
Either two new ops, or fixed cog register numbers
WRMSG cog,S/# .... when eggbeater comes around, uses eggbeater muxes and data paths to write to the message register in destination cog, optionally interrupts it. pre-empts hub use that cycle.
RDMSG D .... used on destination cog to read the last received message
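To make the simple version concrete, here is a rough Python model of one destination cog's mailbox. It is only a sketch of the register behaviour described above, not of the hardware: the class name, the idle value of 0 for MSGSOURCE, and clearing MSGSOURCE on read are all assumptions that the post does not specify.

```python
class SimpleMailbox:
    """Toy model of the simple version: one MSG register plus MSGSOURCE per cog."""

    def __init__(self):
        self.msgsource = 0  # 5 bits: 1cccc when a message is pending (0 when idle is an assumption)
        self.msg = 0        # 32-bit message register

    def wrmsg(self, src_cog, value):
        """Model a WRMSG from src_cog landing in this (destination) cog."""
        if not 0 <= src_cog <= 15:
            raise ValueError("cog id must be 0..15")
        self.msgsource = 0b10000 | src_cog   # flag bit plus 4-bit sender id
        self.msg = value & 0xFFFFFFFF        # keep only 32 bits

    def rdmsg(self):
        """Model RDMSG D: return (sender, message), or None if nothing is pending.

        Clearing MSGSOURCE on read is an assumption made for this sketch.
        """
        if not self.msgsource & 0b10000:
            return None
        sender = self.msgsource & 0b1111
        self.msgsource = 0
        return sender, self.msg
```

With this model, a second WRMSG that arrives before the destination reads simply overwrites the first, which matches "last received message" but also shows why the simple version can drop messages.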
(Complicated version)
MSGAVAIL - 16 bit register, one bit per potential sending cog
MSG0..15 ... last message from named cog
WRMSG cog, S/# .... when eggbeater comes around, uses eggbeater muxes and data paths to write to the message register in destination cog, optionally interrupts it. pre-empts hub use that cycle.
RDMSG srcog, D .... used on destination cog to read the last received message
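The complicated version can be modelled the same way, as a rough Python sketch rather than anything resembling the real logic. The class name, the (pending, message) return shape, and clearing the MSGAVAIL bit on read are assumptions; the post only names the registers.

```python
class PerSenderMailbox:
    """Toy model of the complicated version: MSGAVAIL bitmask plus MSG0..15."""

    def __init__(self, num_cogs=16):
        self.num_cogs = num_cogs
        self.msgavail = 0            # one bit per potential sending cog
        self.msg = [0] * num_cogs    # last message from each sending cog

    def _check(self, cog):
        if not 0 <= cog < self.num_cogs:
            raise ValueError("cog id out of range")

    def wrmsg(self, src_cog, value):
        """Model a WRMSG from src_cog landing in this (destination) cog."""
        self._check(src_cog)
        self.msg[src_cog] = value & 0xFFFFFFFF
        self.msgavail |= 1 << src_cog

    def rdmsg(self, src_cog):
        """Model RDMSG srcog, D: return (pending, last message from src_cog).

        Clearing the sender's MSGAVAIL bit on read is an assumption.
        """
        self._check(src_cog)
        pending = bool(self.msgavail & (1 << src_cog))
        self.msgavail &= ~(1 << src_cog)
        return pending, self.msg[src_cog]
```

Here messages from different senders no longer overwrite each other, at the cost of sixteen 32-bit registers per cog, which is presumably what makes this the "complicated" option.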
Possible simplification:
perhaps deposit the message in the LUT?
It would be (worst case) about 17-18 clocks before the message was received in the destination cog, and it could be as little as 2-3 clocks to send to the "next" cog.
18 clocks / 200MHz = 90ns (<100ns) to send a 4 byte message, so >40MB/sec (about 44MB/sec)
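The arithmetic above can be checked with a few lines of Python. This is just a calculator for the numbers in the post: the 200MHz clock and the clock counts are the post's estimates, the function names are made up for illustration, and MB is taken as 10^6 bytes.

```python
CLOCK_HZ = 200_000_000   # 200MHz system clock, as assumed in the post
MESSAGE_BYTES = 4        # one 32-bit message


def latency_ns(clocks, clock_hz=CLOCK_HZ):
    """Time in nanoseconds for a transfer that takes `clocks` system clocks."""
    return clocks * 1e9 / clock_hz


def throughput_mb_per_s(clocks, clock_hz=CLOCK_HZ, message_bytes=MESSAGE_BYTES):
    """Sustained rate in MB/s (10^6 bytes) if one message is sent every `clocks` clocks."""
    return message_bytes * clock_hz / clocks / 1e6


# Worst case and best case clock counts estimated in the post
for clocks in (18, 3, 2):
    print(clocks, "clocks:", latency_ns(clocks), "ns,",
          round(throughput_mb_per_s(clocks), 1), "MB/s")
```

For 18 clocks this gives 90ns and roughly 44MB/s, consistent with the "<100ns" and ">40MB/sec" figures. It only counts the write side; the time for the destination cog to read the message is not included.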
Comments
2 or 3 clocks to write 32 bits @200MHz = 400MB/s or 267MB/s
Note: this is a one way block transfer write. Reading by the other cog has not been taken into account.
??
Faster inter-cog signalling is already in there, as a recent addition.
Any COG can now signal any other COG(s) with SysCLK granularity.
The problem is the "when eggbeater comes around" part, which then makes this the same as HUB access?
The (also recently added) DAC_Data path can do bytes in less than this 17-18 clks.
Still waiting on exact numbers, & code examples, but there may be scope to overlay the SET and GET times in the Smart Pin cell pipelines, to shave some clocks.
Chip has this done, and it is in the next build.