...I will say this. A deadline needs to be set. Why? Because any project without a deadline becomes that perpetual 95% complete essay that will never get done. There will ALWAYS be something you can do differently to tweak it a little here or a little there.
You're right. This has really been settling into my brain over the last few weeks.
We've got a date for final Verilog submission to Treehouse for synthesis: November 1st.
I've just about got those block reads and writes done. They were a lot more difficult than I thought they'd be. I worked for 36 hours straight from Wednesday morning to Thursday night and made lots of progress. I've been drinking lots of lemon water and eating lightly, which is helping productivity a lot. I got the interrupt system all revamped to handle asynchronous breakpoints from other cogs now. I just have some finishing to do on the block reads/writes. Then, we can start playing with the Prop2 and worrying about smart pins.
I've just about got those block reads and writes done. They were a lot more difficult than I thought they'd be.
I'm guessing the complicated part was because you are trying to avoid stalling the cog in the same way that a RDxxx/WRxxx would. If that's the case, just let it stall like the other memory hubops. That keeps things simple and consistent.
That can be the (quick) core of a Blue-Stamp module.
"BGM111 Module with Pre-Installed BLE Stack and Easy-to-Use Scripting Language Provides Plug-and-Play Solution"
Then, we can start playing with the Prop2 and worrying about smart pins.
Muahahahaha!
What makes the block read/write tricky is that there are a few different state sequences, based on whether you are reading or writing multiple longs, as opposed to single bytes/words/longs. Also, after reading/writing multiple longs into/from cog RAM within the same instruction cycle, the previously-read D register value sitting on one of the cog RAM's output ports is no longer there, so a re-read of cog RAM must be done. It took some juggling to make this cost only one more clock, and only when RDLONG/WRLONG-repeat are operating.
I co-opted the RDLONG/WRLONG instructions into being the block r/w instructions. The way it works is, if you do a SETQ right before the RDLONG/WRLONG and if Q is greater than 0, it reads/writes that many more longs into ascending addresses, both in the hub and cog. This way, you can load a cog with executable code in two instructions, at any register offset and from any address in the hub with no alignment restrictions.
There were some nice developments to come out of this SETQ business. Interrupts are now disallowed during SETQ, just like ALTDS. This makes sure that SETQ+RDLONG/WRLONG are tightly bound and there are no worries about them being interrupted, as we can't read Q. This also makes Q's value in interrupts irrelevant, because whatever SETQ was being used for, already happened, so there's no need to worry about preserving it in order to use and later restore it. Everyone can use SETQ with abandon now.
Here are the instructions that use SETQ:
RDLONG (if reading multiple longs)
WRLONG (if writing multiple longs)
WAITINT/WAITPER/WAITEDG/WAITPAT/WAITRDL/WAITWRL/WAITXRO/WAITFBW (if WC is used, to add a timeout)
QDIV/QROT (to provide another 32-bit term, in addition to D and S)
I just finished the RDLONG-repeat and now I must finish the WRLONG-repeat. After that, I don't feel like there will be anything lacking from the design. Smart pins can pick up all the sundry I/O functions that we can fit into them.
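For anyone trying to picture the SETQ + RDLONG/WRLONG behaviour described above, here is a rough Python model. It is only a sketch of the semantics as stated in this post, not the Verilog and not PASM. The names `Cog`, `setq`, `rdlong` and `wrlong`, the 512-long cog RAM size, and the choice to clear Q after the transfer are all assumptions made for the illustration.

```python
class Cog:
    """Toy model of one cog: register RAM plus a hidden Q value (hypothetical)."""

    def __init__(self, hub: bytearray):
        self.hub = hub          # shared hub RAM, byte-addressed
        self.ram = [0] * 512    # assumption: 512 longs of cog register RAM
        self.q = 0              # set by SETQ, used by the next RDLONG/WRLONG

    def setq(self, value: int) -> None:
        self.q = value

    def _take_count(self) -> int:
        # One long is always transferred; Q > 0 adds that many more.
        count = 1 + self.q
        self.q = 0              # assumption: Q applies only to the next instruction
        return count

    def rdlong(self, d: int, hub_addr: int) -> None:
        """Read 1 + Q longs from hub_addr into registers d, d+1, ..."""
        for i in range(self._take_count()):
            start = hub_addr + 4 * i    # any byte address, no alignment needed
            self.ram[d + i] = int.from_bytes(self.hub[start:start + 4], "little")

    def wrlong(self, d: int, hub_addr: int) -> None:
        """Write 1 + Q longs from registers d, d+1, ... to hub_addr upward."""
        for i in range(self._take_count()):
            start = hub_addr + 4 * i
            self.hub[start:start + 4] = self.ram[d + i].to_bytes(4, "little")


# Load two longs from an odd hub address into registers $10 and $11.
hub = bytearray(64)
hub[3:11] = (0x11223344).to_bytes(4, "little") + (0xAABBCCDD).to_bytes(4, "little")
cog = Cog(hub)
cog.setq(1)             # one extra long beyond the first
cog.rdlong(0x10, 3)
print(hex(cog.ram[0x10]), hex(cog.ram[0x11]))
```

The point of the model is just that the pair SETQ + RDLONG behaves like one block operation: both the hub address and the cog register index ascend together.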
The latest document that Chip posted still shows the name LINK instead of CALLD. I thought it was decided to use the name CALLD for the instruction. So which one is it, LINK or CALLD?
So it sounds like there is no need for the RDLONGS and WRLONGS instructions since RDLONG and WRLONG can do the same thing, correct? Have you removed the RDLONGS and WRLONGS instructions? If so, please don't say that you reshuffled the instruction binary values again. I just spent the last couple of days updating spinsim with the instruction values from the last document.
Yes, I know that things can change at any time. I'm just trying to keep up with Chip's changes. It's exciting to hear about the November 1st date for the Verilog submission to Treehouse. It looks like P2 Day is almost here!
That can be the (quick) core of a Blue-Stamp module.
Ouch. I tried to read that before when you posted it, and just had to escape that wall of text.
Most off-putting summarization ever written. Which, from some reading, is a shame.
Agree, a simple Blue-Stamp module like that could be a nice add-on for Parallax, doesn't look like too much R&D is necessary. Might be nice on the P2-Proto board as well.
        June                   July                  August
Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa   Su Mo Tu We Th Fr Sa
   -- -- -- -- -- --             1  2  3  4                      1
-- -- -- -- -- -- --    5  6  7  8  9 10 11    2  3  4  5  6  7  8
-- -- -- -- -- -- --   12 13 14 15 16 17 18    9 10 11 12 13 14 15
21 22 23 24 25 26 27   19 20 21 22 23 24 25   16 17 18 19 20 21 22
28 29 30               26 27 28 29 30 31      23 -- -- -- -- -- --
                                              -- --
Interrupts are done.
The P2 instruction set is completed.
Block reads and writes almost done.
Deadline for final Verilog submission to Treehouse: November 1st
P2 Day is almost here!
I have a question regarding P2 instructions, specifically the four-bit condition field (CCCC).
Why is CCCC not the first part of the instruction (the most significant nibble)?
I can imagine some things that would be easier in that case.
Just can't get this out of my head --- I would love to understand the reasoning behind this.
Tried some searches and read a lot of posts on P2 design; could not find relevant info.
The position of the CCCC field is completely arbitrary AFAIK. For the most part it doesn't matter where the bit fields are located as long as you use an assembler to produce the instructions. It might make it easier to decode instructions by hand, but that becomes very tedious after a while. If you have to decode a lot of instructions, it's easier to do it by writing a disassembler.
Re: "The position of the CCCC field is completely arbitrary AFAIK. For the most part it doesn't matter where the bit fields are located as long as you use an assembler [...]"
I was thinking of enabling and disabling instructions with Set Nibble, and maybe checking an instruction's state with Get Nibble.
Is this form of self-modification frowned upon?
(I guess it is asking for trouble; I know this from early programming education, but in Propeller context it seems reasonable.)
Now that I mentioned SETNIB I found relevant comments:
In February 2014, Seairth started a discussion on the same topic.
Is it the latest thought on this?
Found SETCOND anyway.
LOL! I Did? Well, it sounds like something I'd bring up. And, yes, I still think it would make more sense to place it at the beginning.
cccc_iiiiiii_zci_ddddddddd_sssssssss
just looks better to me (and better matches PASM's instruction layout too).
Also, some side benefits:
- SETCOND just becomes an alias for SETNIB7 (which did not exist at the time I allegedly brought this up last time). The SETCOND opcode could then be used for GETI, to provide parity with the GETD/SETD and GETS/SETS instructions. Of course, you could then also have a GETCOND, which would just be an alias for GETNIB7.
- Any value from $0FFFFFFF down would never be accidentally executed as an instruction.
- When you were looking at a hex dump, the "always" condition (F) would provide natural alignment for the eye, with other conditionals sticking out very clearly.
- Instruction opcodes that extend into D/S wouldn't have an awkward COND field in the middle of them.
But, as Dave points out, the position is arbitrary. This all comes down to Chip's personal preference. Would he change it now? I doubt it. Unless he thought that the first bullet above was reason enough.
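To make the proposed layout concrete, here is a small Python sketch of packing fields as cccc_iiiiiii_zci_ddddddddd_sssssssss and then changing only the condition by rewriting nibble 7. The helper names `pack`, `setnib` and `getnib` are made up for this illustration; they only mimic what a SETNIB7/GETNIB7-style operation would do to a 32-bit value and are not the actual instruction definitions.

```python
def pack(cccc: int, opcode: int, zci: int, d: int, s: int) -> int:
    """Pack fields as cccc_iiiiiii_zci_ddddddddd_sssssssss (4+7+3+9+9 = 32 bits)."""
    return ((cccc & 0xF) << 28) | ((opcode & 0x7F) << 21) | ((zci & 0x7) << 18) \
        | ((d & 0x1FF) << 9) | (s & 0x1FF)


def setnib(value: int, nibble: int, index: int) -> int:
    """Return value with 4-bit nibble number `index` (0 = lowest) replaced."""
    shift = 4 * index
    cleared = value & ~(0xF << shift) & 0xFFFFFFFF
    return cleared | ((nibble & 0xF) << shift)


def getnib(value: int, index: int) -> int:
    """Return 4-bit nibble number `index` (0 = lowest) of value."""
    return (value >> (4 * index)) & 0xF


# An arbitrary example instruction with the "always" condition (%1111).
instr = pack(0b1111, 0x12, 0b001, 0x1FF, 0x055)

# With CCCC in the top nibble, the condition is simply nibble 7.
disabled = setnib(instr, 0b0000, 7)
print(hex(instr), hex(disabled), getnib(disabled, 7))
```

With the condition in the top nibble, enabling or disabling an instruction touches exactly one nibble and leaves the opcode, D and S fields alone.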
I imagine the only reason the condition bits are where they are is because one naturally tends to think of the opcode first.
In most processors I would have said it makes no difference where the different bit fields go in an instruction unless one has variable length instructions. But as you say, perhaps when thinking about self modifying code there is an advantage to moving things into convenient positions.
Self-modifying code is frowned upon in much of the computing world. It can make understanding how a program works rather difficult. It probably has security issues in a general-purpose OS. Besides, how are you going to do it when executing from ROM?
But this is the Propeller. With the P1, self-modifying code is essential due to the nature of its addressing modes and lack of indexing from registers. One has to be able to modify src/dst addresses within an instruction so as to index into arrays, for example. You will find a lot of self-modifying code in OBEX. I see no reason for frowning on modification of condition bits.
I believe the SETI instruction (MOVI on the Prop1) is the reason why the bulk of the opcode is at the MSB end. Maybe another similar instruction, to finish the coverage, is needed as well.
I have a question regarding P2 instructions, specifically the four-bit condition field (CCCC).
Why is CCCC not the first part of the instruction (the most significant nibble)?
I can imagine some things that would be easier in that case.
Just can't get this out of my head --- I would love to understand the reasoning behind this.
Tried some searches and read a lot of posts on P2 design; could not find relevant info.
Thanks!
The reason that the condition field is located in the middle of the instruction is just because I made it that way a long time ago on Prop1 and carried it over to Prop2.
I was looking at our encodings and I agree that it would be a lot cleaner to have that 4-bit field at the head of the instruction. It would then mimic the order of the assembly language and, as someone pointed out, enable SETNIB7 to affect an instruction's condition, freeing up SETCOND for some other use. After I get the WRLONG-repeat done, I will make this change. It's rather trivial, but will take several hours to go through all the affected Verilog and assembler code. One other benefit to doing this is that it makes the 20-bit immediate addresses and 23-bit immediates for AUGS/AUGD contiguous. It's a lot cleaner.
Thanks for bringing this up, Guys.
Dave Hein, please forgive me. And, yes, some stuff got shuffled a bit when I got rid of RDLONGS/WRLONGS. It was only several instructions, though.
... One other benefit to doing this is that it makes the 20-bit immediate addresses and 23-bit immediates for AUGS/AUGD contiguous. It's a lot cleaner.
Well, I just made all the changes to get the CCCC bits into the top nibble of the instructions and everything seems to be working fine. The assembler simplified and so did the Verilog code.
Since SETCOND was no longer needed and SETDS was rather superfluous, I removed them and added in the Prop2-Hot instructions SEUSSF/SEUSSR which do crazy bit rearranging and inverting in fixed forward and reverse patterns.
I was looking at our encodings and I agree that it would be a lot cleaner to have that 4-bit field at the head of the instruction.
Just a note to say that some P1 code uses the position of the instruction bits as a counter and sets the C flag when it overflows, as a trick to speed up reading/writing from hub memory. If I understand correctly, that won't work anymore if the positions are changed. I don't know whether these kinds of tricks are still needed on P2 or whether there are better/faster ways to do these things (admittedly I'm a bit lost about the P2 feature changes...), but I would think a bit more before changing the instruction layout; maybe it could still be used for something else.
... I don't know whether these kinds of tricks are still needed on P2 or whether there are better/faster ways to do these things (admittedly I'm a bit lost about the P2 feature changes...) ...
I've just been trying to work out what some of the new instructions do ... There's certainly a lot more there than I was expecting to find.
Well, I just made all the changes to get the CCCC bits into the top nibble of the instructions and everything seems to be working fine. The assembler simplified and so did the Verilog code.
Nice!
Now 28-bit constants can be buried within PASM code, where they also function as NOPs in fine-tuned timing loops.
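A quick way to see why this works, sketched in Python: with CCCC in the top nibble, any long whose top nibble is %0000 is skipped, so its low 28 bits are free to hold data. This assumes %0000 means "never execute", as on the Prop1 and as implied by the "$0FFFFFFF down" remark earlier in the thread. The function names are made up for the example.

```python
COND_NEVER = 0b0000  # assumption: %0000 = "never execute", as on the Prop1


def as_inline_constant(value: int) -> int:
    """Return the 32-bit long that stores a 28-bit constant as a never-executed slot."""
    if not 0 <= value < (1 << 28):
        raise ValueError("constant must fit in 28 bits")
    return (COND_NEVER << 28) | value


def executes(long_value: int) -> bool:
    """True if the long's condition nibble (top 4 bits) is anything but 'never'."""
    return ((long_value >> 28) & 0xF) != COND_NEVER


slot = as_inline_constant(0x0ABCDEF)
print(hex(slot), executes(slot))
```

Anything up to $0FFFFFFF passes through unchanged and is never executed, so the slot costs one instruction time in the loop and doubles as storage.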
Actually, I'm OK with the changes, it will make it easier to hand assemble and disassemble P2 code, and it does make the bits in large immediate values contiguous. I guess I'll wait for a new instruction spec document before I incorporate the changes in spinsim.
Nice!
Now 28-bit constants can be buried within PASM code, where they also function as NOPs in fine-tuned timing loops.
Well spotted - a smarter assembler can use a NOPc or similar, and then check the constants area for values that can be 're-purposed'.
That also means constants might need to float in placement, and be placed last.
Keeps the code readable, but allows more use of COG memory.
Comments
hehe, a Prop purist may raise their eyebrows, but it makes good business sense.
In a similar vein, I think Parallax should look very closely at products like this
http://news.silabs.com/press-release/product-news/silicon-labs-simplifies-bluetooth-smart-design-fully-integrated-blue-geck
That can be the (quick) core of a Blue-Stamp module.
"BGM111 Module with Pre-Installed BLE Stack and Easy-to-Use Scripting Language Provides Plug-and-Play Solution"
Yes, I will get that updated.
CALLD it will be.
Let us know how we can help.
You should know by now. If it's not cast into silicon, packaged in epoxy, soldered into a PCB and on your bench, it can change at any time.
Exciting times ahead.
63 days since the end of Spring.
30 days to the beginning of Fall.
70 days to November 1st.
Unfortunately, that's just how press-releases are. Pull out a part number, and look it up, and you get more useful information.
http://www.silabs.com/products/wireless/bluetooth/Pages/bluegecko-bluetooth-smart-module-intro.aspx
Knowing you, I know what this means :-D
I can hear Dave's 'Nooooooooo' from here....
Still, cleaner opcodes will save everyone time downstream..
Now back to WRLONG-repeat.