Are you saying that the debug interrupt doesn't work only on the DE0-Nano images? Hmmm... I never invoked the write-protect. The DE0-Nano does have the distinction of having only 32KB of hub RAM. Hey! That means it only has 16KB at the bottom of hub RAM ($00000..$03FFF). Is that enough for your code?
I'll add the comment that small MCUs that can protect portions of memory often give a choice of how much,
i.e. they may allow protecting any of 2k/4k/8k/16k, for example.
How many spare bits are in the memory-protect area? Can P2 do the same?
I'm making a new BeMicro-A9 file now with just two cogs and no CORDIC, so it will compile quickly. It has the old pin assignments, like before. We'll need to know if this works, or something else has broken.
Is HUBRAM no longer all contiguous for <1MB? If not, why not?
If you had a whole 1MB then it would be contiguous. For smaller memory, though, the last 32KB is always located up at the top, now, from $FC000..$FFFFF. It can be write-protected, and a gap will exist between where the bottom RAM ends and $FC000.
Let's say you have 512KB, as the first chip will. Bottom memory goes from $00000..$7BFFF and top memory goes from $FC000..$FFFFF.
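As a rough illustration of that layout, here is a minimal Python sketch (Python is only used for illustration; `hub_map` and the constant names are made up here, not anything from the P2 sources). One assumption to flag: the quoted range $FC000..$FFFFF spans $4000 bytes, i.e. 16KB, which is also what the 512KB and DE0-Nano figures imply, so the sketch takes the top-block size from the address range rather than from the "32KB" figure.

```python
MB = 1 << 20

# Top block as quoted in the thread: $FC000..$FFFFF.
TOP_BASE = 0xFC000
TOP_END = 0xFFFFF
TOP_SIZE = TOP_END - TOP_BASE + 1  # $4000 bytes, derived from the range itself


def hub_map(ram_bytes):
    """Return the mapped (start, end) hub address ranges for a given RAM size.

    Assumption: for less than 1MB, the last TOP_SIZE bytes of physical RAM
    appear at TOP_BASE and the rest starts at $00000, leaving a gap between.
    """
    if ram_bytes >= MB:
        return [(0x00000, MB - 1)]  # a full 1MB is contiguous
    bottom_end = ram_bytes - TOP_SIZE - 1
    return [(0x00000, bottom_end), (TOP_BASE, TOP_END)]


def is_mapped(addr, ram_bytes):
    """True if addr falls inside one of the mapped ranges."""
    return any(lo <= addr <= hi for lo, hi in hub_map(ram_bytes))


for size in (32 * 1024, 512 * 1024, MB):
    print(size // 1024, "KB:", [(f"${lo:05X}", f"${hi:05X}") for lo, hi in hub_map(size)])
```

With 512KB this gives $00000..$7BFFF plus $FC000..$FFFFF, and with the DE0-Nano's 32KB it gives $00000..$03FFF at the bottom, matching the numbers quoted above.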
I loaded V27x on CVA9 and that seems to work ok. I have SW1 on.
I'm having other very strange problems with V27 too. The basic TAQOZ kernel seems to half work; there are some strange things going on there. Although I am still tracking it down, there is one thing I found which may have to do with CORDIC functions. It's a bit awkward to check, but I know that I use qdiv for a variety of functions, and when I bypassed the higher-level call to this in one routine it started working, at least for that routine...
That v27x doesn't have CORDIC. It was just a test to try to figure out why the BeMicro-A9 has been unresponsive.
Peter, what board are you using?
Yes, I've confirmed that CORDIC DOES NOT WORK on V27x RTM!!!
I've got the BeMicro-A9, or CVA9 for short, as my main P2 board. I don't worry about the rest, although I do have CVA2, DE0, and DE2-115.
Okay. Good. We'll get this thing working properly. Do you see anything wrong in those code-snippets of AHDL?
As far as I can tell I can't see a problem and the mapping looks fine too. However I might be able to find out what's going on from a different angle as I can load up V27 onto my old DE2-115 and load TAQOZ to see if I can find anything amiss there.
Why not just make it wrap on 512KB versions, so the top half aliases the bottom half? It is simpler and seems more natural. Otherwise, it's awkward to use that top 32KB in non-protected mode because it's 512KB away from everything else.
Then, should we ever get a 1MB version, it would potentially break existing code.
It used to wrap, but that was causing huge problems on accidentally-oversized downloads. What we have now is actually very clean. RAM starts at $00000 and then we always have the last 32K of RAM at the top of memory ($FC000..$FFFFF). That gives us TWO stable areas in which to put code, and the latter can be write-protected and conveniently contains the debug-interrupt vectors.
I finally got around to digging up my DE2-115 and got V27 into it. I haven't found any problems with it so far though. I will try the new CVA9 in the morning though.
It's good to see the write protect working
Here WP protects and WE enables.
MODULES LOADED:
1EC0: EXTEND.fth TACHYON FORTH EXTENSIONS for the P2 - 171124-0000
Mon, 01 Jan 2001 00:00:00
Parallax Propeller II .:.:--TAQOZ--:.:. V-10171112.0000 V25 BOOT
----------------------------------------------------------------
TAQOZ# ok
TAQOZ# ok
TAQOZ# ROM QD
0F.C000: 16 B8 64 FD E0 07 00 FF 18 01 06 FB 17 5C 65 FD ..d..........\e.
0F.C010: E1 07 00 FF 8C 20 04 FB 00 08 80 FF 3F 00 0C FC ..... ......?... ok
TAQOZ# 0 ROM ! ROM QD
0F.C000: 00 00 00 00 E0 07 00 FF 18 01 06 FB 17 5C 65 FD .............\e.
0F.C010: E1 07 00 FF 8C 20 04 FB 00 08 80 FF 3F 00 0C FC ..... ......?... ok
TAQOZ# $FD64B816 ROM ! ok
TAQOZ# ROM QD
0F.C000: 16 B8 64 FD E0 07 00 FF 18 01 06 FB 17 5C 65 FD ..d..........\e.
0F.C010: E1 07 00 FF 8C 20 04 FB 00 08 80 FF 3F 00 0C FC ..... ......?... ok
TAQOZ# WP 0 ROM ! ROM QD
0F.C000: 16 B8 64 FD E0 07 00 FF 18 01 06 FB 17 5C 65 FD ..d..........\e.
0F.C010: E1 07 00 FF 8C 20 04 FB 00 08 80 FF 3F 00 0C FC ..... ......?... ok
TAQOZ#
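For anyone following the session without a board, the same sequence can be replayed against a toy model. This is only a sketch in Python: the `TopBlock` class and its method names are invented for illustration, and the assumption that a protected write is silently dropped is inferred from the dump above (the long at $FC000 is unchanged after `WP 0 ROM !`), not from any hardware documentation.

```python
class TopBlock:
    """Toy model of the write-protectable block that starts at $FC000."""

    BASE = 0xFC000

    def __init__(self, image: bytes):
        self.mem = bytearray(image)
        self.protected = False

    def wp(self) -> None:
        """Model of the WP word: protect the block."""
        self.protected = True

    def we(self) -> None:
        """Model of the WE word: enable writes again."""
        self.protected = False

    def store_long(self, addr: int, value: int) -> None:
        """Store a 32-bit little-endian long unless the block is protected."""
        if self.protected:
            return  # assumption: the write is simply ignored
        off = addr - self.BASE
        self.mem[off:off + 4] = value.to_bytes(4, "little")

    def dump(self, addr: int, count: int = 4) -> str:
        off = addr - self.BASE
        return " ".join(f"{b:02X}" for b in self.mem[off:off + count])


# First eight bytes of the dump at 0F.C000.
rom = TopBlock(bytes.fromhex("16B864FDE00700FF"))

rom.store_long(0xFC000, 0)            # 0 ROM !   (writes enabled)
after_clear = rom.dump(0xFC000)
rom.store_long(0xFC000, 0xFD64B816)   # $FD64B816 ROM !
after_restore = rom.dump(0xFC000)
rom.wp()                              # WP
rom.store_long(0xFC000, 0)            # 0 ROM !   (now ignored)
after_protected_write = rom.dump(0xFC000)

print(after_clear, "|", after_restore, "|", after_protected_write)
```

The little-endian store also shows why writing $FD64B816 restores the bytes 16 B8 64 FD in the dump.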
It used to wrap, but that was causing huge problems on accidentally-oversized downloads....
On that topic, can either the booter or the downloader verify the writes?
That is common in Flash parts, and it could be useful if at least the boundary blocks were verified on any download.
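To make the boundary-block idea concrete, here is a hedged host-side sketch in Python. Nothing here is an existing loader feature: `verify_boundary_blocks`, the `read_back` callable and the 512-byte block size are all hypothetical, and `WrappingTarget` is just a stand-in that mimics the old wrap-around behaviour mentioned earlier in the thread.

```python
def verify_boundary_blocks(image, read_back, block=512):
    """Compare the first and last blocks of image against the target.

    read_back(offset, length) is a hypothetical callable returning bytes
    read from the target. Returns the offsets of blocks that differ.
    """
    offsets = sorted({0, max(0, len(image) - block)})
    bad = []
    for off in offsets:
        expected = image[off:off + block]
        if read_back(off, len(expected)) != expected:
            bad.append(off)
    return bad


class WrappingTarget:
    """Stand-in target whose addresses wrap at its RAM size."""

    def __init__(self, size):
        self.mem = bytearray(size)

    def write(self, off, data):
        for i, b in enumerate(data):
            self.mem[(off + i) % len(self.mem)] = b

    def read_back(self, off, n):
        return bytes(self.mem[(off + i) % len(self.mem)] for i in range(n))


# A 1536-byte image whose content differs from block to block.
image = bytes((i * 7 + i // 256) % 256 for i in range(1536))

small = WrappingTarget(1024)   # too small: the tail wraps onto the start
small.write(0, image)
oversized_result = verify_boundary_blocks(image, small.read_back)

large = WrappingTarget(2048)   # big enough: nothing is overwritten
large.write(0, image)
fitting_result = verify_boundary_blocks(image, large.read_back)

print(oversized_result, fitting_result)
```

In this model the oversized download is caught because the first block no longer reads back correctly, while the last block, taken alone, still looks fine.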
What we have now is actually very clean. RAM starts at $00000 and then we always have the last 32K of RAM at the top of memory ($FC000..$FFFFF). That gives us TWO stable areas in which to put code, and the latter can be write-protected and conveniently contains the debug-interrupt vectors.
Did On-Semi confirm you can add 32k of RAM? What speed impact did adding the address bit have?
Comments
My code is 20k.
That's my problem.
Thanks a million for always trying this stuff out!
Runs my test code fine.
Well, that's weird, but good, since we've isolated the problem.
This works:
h.pin_in[63..0] = (p[63..56], !s[2], !s[1], !dip_sw[4..1], p[49..0]);
pin_tri[].in = h.pin_out[63..0];
pin_tri[].oe = h.pin_dir[63..0];
p[63..0] = pin_tri[63..0].out;
And this doesn't ?!?:
case (dip_sw[1]) is
    when b"0" => h.pin_in[63..0] = (p[63..62], p[39], p[41], p[36], p[40], p[57..56],
                                    !s[2], !s[1], !dip_sw[4..1], p[49..42],
                                    p[60], p[58], p[61], p[38..37], p[59], p[35..0]);
    when b"1" => h.pin_in[63..0] = (p[63..56], !s[2], !s[1], !dip_sw[4..1], p[49..0]);
end case;

case (dip_sw[1]) is
    when b"0" => pin_tri[].in = ( h.pin_out[63..62], h.pin_out[39], h.pin_out[41], h.pin_out[36],
                                  h.pin_out[40], h.pin_out[57..42], h.pin_out[60], h.pin_out[58],
                                  h.pin_out[61], h.pin_out[38..37], h.pin_out[59], h.pin_out[35..0] );
    when b"1" => pin_tri[].in = h.pin_out[63..0];
end case;

case (dip_sw[1]) is
    when b"0" => pin_tri[].oe = ( h.pin_dir[63..62], h.pin_dir[39], h.pin_dir[41], h.pin_dir[36],
                                  h.pin_dir[40], h.pin_dir[57..42], h.pin_dir[60], h.pin_dir[58],
                                  h.pin_dir[61], h.pin_dir[38..37], h.pin_dir[59], h.pin_dir[35..0] );
    when b"1" => pin_tri[].oe = h.pin_dir[63..0];
end case;

p[63..0] = pin_tri[63..0].out;
P.S. This is an AHDL file, which Quartus needs at the top level. It's not Verilog.
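As a cross-check of the remapped case in the longer snippet, here is a small Python model of the two concatenations. It is not AHDL and says nothing about what Quartus actually synthesizes; it only assumes that the comma-separated groups concatenate MSB first, so the first group lands on bits 63..62. The helper name `expand` and the `"o"` tag for `h.pin_out` are made up for the sketch. `pin_dir` uses the same ordering as `pin_out`, so it is not modelled separately.

```python
def expand(groups):
    """Expand MSB-first (name, hi, lo) groups into a list indexed by bit number."""
    bits = []
    for name, hi, lo in groups:
        bits.extend((name, i) for i in range(hi, lo - 1, -1))
    assert len(bits) == 64, "the concatenation must be exactly 64 bits wide"
    return bits[::-1]  # reverse so that index 0 is bit 0


# h.pin_in[63..0] in the b"0" case: PIN_IN[j] is the source of logical pin j.
PIN_IN = expand([
    ("p", 63, 62), ("p", 39, 39), ("p", 41, 41), ("p", 36, 36), ("p", 40, 40),
    ("p", 57, 56), ("s", 2, 2), ("s", 1, 1), ("dip_sw", 4, 1), ("p", 49, 42),
    ("p", 60, 60), ("p", 58, 58), ("p", 61, 61), ("p", 38, 37), ("p", 59, 59),
    ("p", 35, 0),
])

# pin_tri[].in in the b"0" case: PIN_OUT[i] is the logical output driving pad i.
PIN_OUT = expand([
    ("o", 63, 62), ("o", 39, 39), ("o", 41, 41), ("o", 36, 36), ("o", 40, 40),
    ("o", 57, 42), ("o", 60, 60), ("o", 58, 58), ("o", 61, 61), ("o", 38, 37),
    ("o", 59, 59), ("o", 35, 0),
])

# A logical pin that drives pad i should also read its input back from pad i.
mismatches = []
for pad in range(64):
    logical = PIN_OUT[pad][1]
    source, index = PIN_IN[logical]
    if source == "p" and index != pad:
        mismatches.append((pad, logical, index))

# Pads whose logical pin differs from the pad number, as unordered pairs.
swaps = sorted({tuple(sorted((pad, PIN_OUT[pad][1])))
                for pad in range(64) if PIN_OUT[pad][1] != pad})

print("mismatches:", mismatches)
print("swapped pairs:", swaps)
```

Under that assumption the input and output orderings agree with each other and the remap is just four pin swaps, so the mapping itself looks consistent on paper.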
In my post above, the short code is in the v27x, while the longer code is in the v27. I don't see where the problem is, but the longer code fails. When DIP_SW1 is OFF, it should be the same as the short code.
Ah, good idea.
Here's a variant of both that isn't quite as wide on the page. Also, added the clock edges to my diagram and a key for my lines.
        |                   |                   |                   |                   |                   |                   |
rdRAM Ib|------+            |           rdRAM Ic|------+            |           rdRAM Id|------+            |           rdRAM Ie|
        |      |            |                   |      |            |                   |      |            |                   |
latch Da|--+   +--> rdRAM Db|---------> latch Db|--+   +--> rdRAM Dc|---------> latch Dc|--+   +--> rdRAM Dd|---------> latch Dd|
latch Sa|--+   +--> rdRAM Sb|---------> latch Sb|--+   +--> rdRAM Sc|---------> latch Sc|--+   +--> rdRAM Sd|---------> latch Sd|
latch Ia|--+   +--> latch Ib|---------> latch Ib|--+   +--> latch Ic|---------> latch Ic|--+   +--> latch Id|---------> latch Id|
        |  |                |                   |  |                |                   |  |                |                   |
        |  +---------------ALU--------> wrRAM Ra|  +---------------ALU--------> wrRAM Rb|  +---------------ALU--------> wrRAM Rc|
        |                   |                   |                   |                   |                   |                   |
        |                   |stall/done = 'gox' |                   |stall/done = 'gox' |                   |stall/done = 'gox' |
        |       'get'       |    done = 'go'    |       'get'       |    done = 'go'    |       'get'       |    done = 'go'    |
clk      _________           _________           _________           _________           _________           _________
________|    1    |_________|    2    |_________|    3    |_________|    4    |_________|    5    |_________|    6    |_________|
PCflux  |...................|...........===c===.|...................|...........===d===.|...................|...........===e===.|
Ifetch  |====b====..........|...................|====c====..........|...................|====d====..........|...................|
Odecod  |.........====b====.|...................|.........====c====.|...................|.........====d====.|...................|
SDfetc  |...................|====b====..........|...................|====c====..........|...................|====d====..........|
Fdecod  |...................|.........====b====.|...................|.........====c====.|...................|.........====d====.|
ALUs1   |========a=========.|...................|========b=========.|...................|========c=========.|...................|
ALUs2   |...................|========a=========.|...................|=========b========.|...................|========c=========.|
Rwrite  |...................|...................|==a==..............|...................|==b==..............|...................|

PCflux - Program Counter updating.
Ifetch - Instruction fetching from CogRAM, LUT or FIFO.
Odecod - Opcode decode for S/D operands.
SDfetc - S and D parallel fetches, if required.
Fdecod - Fully decode the instruction. Probably mostly fan-out to feed selected logic block.
ALUs1 - Stage one execute, do the job. (I'm uncertain of a registered partition from stage two)
ALUs2 - Stage two execute. Mostly mux'ing the result from the selected logic block.
Rwrite - Result write back to D destination if required.
PCflux happens in the first cycle.
Yes, there is a registered partition between ALUs1 and ALUs2.
Here is a new file for the BeMicro-A9 where I'm assigning an intermediate variable to the DIP_SW1-determined signals, then assigning that variable, all at once, to the TRI element inputs. This shouldn't make any difference, but who knows? Maybe TRI inputs don't like being mux'd directly.
https://drive.google.com/file/d/1xXIGHczoyVasSMP8c2cvb3hV0sHbVIQq/view?usp=sharing
Is it maybe possible to put it at the end of the long address space?
It could be reached with 'negative' addresses and flow over to address 0, so it would sort of be continuous?
FFFFC000-FFFFFFFF
Mike