Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i - Page 99 — Parallax Forums

Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i


Comments

  • cgraceycgracey Posts: 14,134
    Sorry, I meant 16KB.
  • 16K of vectors!!! No, it's just 16k, pretty sure it's a typo.

  • msrobots wrote: »
    FC000...FFFFF is 16K

    is it maybe possible to put it at the end of the long address space?

    It could be reached with 'negative' addresses and flow over to address 0? So it would be sort of continuous?

    FFFFC000-FFFFFFFF

    Mike

    I don't know. Kind of like it where it is. Flowing around 0 is just asking for trouble, if you ask me.
    cgracey wrote: »
    Sorry, I meant 16KB.

    Didn't refresh! :D
  • Chip
    BeMicro_A9_Prop2_v27y.jic is still a flat liner. :(
  • cgraceycgracey Posts: 14,134
    ozpropdev wrote: »
    Chip
    BeMicro_A9_Prop2_v27y.jic is still a flat liner. :(

    Thanks for testing it. This just doesn't make any sense.
  • cgraceycgracey Posts: 14,134
    potatohead wrote: »
    16K of vectors!!! No, it's just 16k, pretty sure it's a typo.

    The debug interrupt vectors start at the last long ($FFFFC) for cog0 and go down. There's one for each COG.
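As a rough illustration of the layout described above, the sketch below computes where each cog's debug interrupt vector would sit. It assumes one long (4 bytes) per cog, with cog0 at $FFFFC and later cogs at descending addresses. The function name is hypothetical, and the number of cogs is left to the caller.

```python
TOP_LONG = 0xFFFFC  # last long of the 1MB hub address space, per the post above


def debug_vector_address(cog: int) -> int:
    """Hub address of the debug vector for `cog`.

    Assumption: one long per cog, starting at $FFFFC for cog0 and going down.
    """
    if cog < 0:
        raise ValueError("cog number must be non-negative")
    return TOP_LONG - 4 * cog


if __name__ == "__main__":
    for cog in range(4):
        print(f"cog{cog}: ${debug_vector_address(cog):05X}")
```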
  • Yes, and that makes great sense. Jmg's post implied 16k for vectors. !!! That's a freaking ton of vectors. Kind of funny to think about.
  • jmgjmg Posts: 15,162
    potatohead wrote: »
    Yes, and that makes great sense. Jmg's post implied 16k for vectors. !!! That's a freaking ton of vectors. Kind of funny to think about.

    Not really, that's 16k simply because that's the next increment in memory size. The vectors actually use only a tiny portion of that.
    The rest is valid variable or code space.

    Some have indicated that locking the vectors is less than ideal for debug, and moving them out of the locked ROM block
    is one way to solve that.

  • potatoheadpotatohead Posts: 10,261
    edited 2017-11-13 15:05
    I think that too, but I don't think it's worth extending to 32k, which is a pretty big chunk of RAM. 16k seems like an amount that won't cost us too much. Global variables, data structures, other things can be packed in there with no real impact, when the region isn't being used as a kernel, OS, or support software.

    I also think the possible solutions are pretty reasonable and workable.

    They can be redirected at the cost of one instruction, or a routine can update them at the cost of a few instructions.

    Or, don't write inhibit.

    Edit: It's also nice to have those vectors be write inhibited. The code in the region owns them, and that can make recovery from ugly bugs possible. Bonus for debug tools written to run there.
  • cgraceycgracey Posts: 14,134
    The reason it's 16KB is because that's the ROM size. We had 8 longs up there to cover the debug interrupt instructions, but I expanded the area to accommodate the whole ROM, then added a write-protect mechanism. I think it's just peachy now. We've got the ROM loading into it in just 1.5ms on boot.

    I used to have memory wrap, but that caused some real gotchas. Now, there's a big gap that reads $00's and ignores writes. That is safe and reasonable, I think.
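The behavior described above can be modeled with a small sketch: a 16KB block at the top of the 1MB space that can be write-protected, and a gap that reads as zero and ignores writes. The class and attribute names are hypothetical, and the exact extent of the low RAM ($00000-$7BFFF, that is 512KB minus 16KB) is an assumption inferred from the sizes mentioned in this thread, not something confirmed by the post.

```python
RAM_SIZE = 512 * 1024             # total hub RAM, per the thread
TOP_SIZE = 16 * 1024              # block mapped at the top of the 1MB space
LOW_END = RAM_SIZE - TOP_SIZE     # assumption: low RAM covers $00000-$7BFFF
TOP_BASE = 0x100000 - TOP_SIZE    # $FC000


class HubModel:
    """Toy model of the hub map: low RAM, a dead gap, and a protectable top block."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.top_protected = False

    def _index(self, addr):
        if not 0 <= addr < 0x100000:
            raise ValueError("address outside the 1MB hub space")
        if addr < LOW_END:
            return addr
        if addr >= TOP_BASE:
            return LOW_END + (addr - TOP_BASE)
        return None  # inside the gap

    def read(self, addr):
        i = self._index(addr)
        return 0 if i is None else self.ram[i]  # the gap reads as $00

    def write(self, addr, value):
        i = self._index(addr)
        if i is None:
            return  # the gap ignores writes
        if addr >= TOP_BASE and self.top_protected:
            return  # the protected top block ignores writes
        self.ram[i] = value & 0xFF
```

A write into the gap is simply dropped, so a stray pointer cannot alias onto address 0 the way it could when memory wrapped.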
  • Chip
    Is it possible to tweak Pnut to allow loads into the top 16k ($FC000-$FFFFF)?
    This would help Nano users in particular, who have lost half their hub memory.

    For example, the following code compiles OK from Pnut but never loads the top memory.
    Ctrl-M shows the object file is correct, although the "OBJ bytes" value is weird.
    OBJ bytes: :32,224
    
    _CLKMODE: 00
    _CLKFREQ: 00B71B00
    
    00000- 00 FE 65 FD 3E F8 0C FC 38 5B 81 FF 3E 0E 1C FC   ..e.>...8[..>...
    00010- 41 7C 64 FD 5A 62 82 FF 28 00 64 FD 00 C0 CF FE   A|d.Zb..(.d.....
    00020- 04 00 B0 FD EC FF 9F FD 61 ED CF FA 2D 00 64 AD   ........a...-.d.
    00030- 3E 00 9C FA F8 FF 9F CD 3E EC 27 FC E8 FF 9F FD   >.......>.'.....
    00040- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
    .
    .
    .
    FBFD0- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
    FBFE0- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
    FBFF0- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
    FC000- 48 65 6C 6C 6F 20 77 6F 72 6C 64 2C 20 00 00 00   Hello world, ...
    FC010- 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
    
    

    And here is the source program:
    
    CON
    	sys_clk = 80_000_000
    	baudrate = 115_200
    	tx_pin = 62
    	nco = (sys_clk / baudrate) * $1_0000 & $FFFFFC00 
    	nco_f = (sys_clk - ((sys_clk / baudrate) * baudrate)) * 64 / baudrate
    
    DAT		org
    
    		clkset	#$ff
    		wrpin	#%1_11110_0,#tx_pin
    		wxpin	##nco | nco_f << 10 |7,#tx_pin
    		dirh	#tx_pin
    
    loop		waitx	##sys_clk
    		loc	ptra,#@message
    		call	#print
    		jmp	#loop
    
    print		rdbyte	pa,ptra++ wz
    	if_z	ret
    send_char	rdpin	0,#tx_pin wc
    	if_c	jmp	#send_char
    		wypin	pa,#tx_pin
    		jmp	#print
    
    		orgh	$fc000
    message		byte	"Hello world, ",0
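To make the CON arithmetic in the listing above easier to follow, here is a sketch that reproduces the same constant expressions in Python. It assumes the assembler uses integer division and the same operator precedence as written (shift before OR, multiply before AND); the variable names `whole` and `wxpin_value` are introduced here for illustration only.

```python
SYS_CLK = 80_000_000
BAUDRATE = 115_200

# Whole number of system clocks per bit (integer division, as in the CON block).
whole = SYS_CLK // BAUDRATE

# nco = (sys_clk / baudrate) * $1_0000 & $FFFFFC00
nco = (whole * 0x1_0000) & 0xFFFFFC00

# nco_f = (sys_clk - ((sys_clk / baudrate) * baudrate)) * 64 / baudrate
nco_f = (SYS_CLK - whole * BAUDRATE) * 64 // BAUDRATE

# Operand of: wxpin ##nco | nco_f << 10 | 7, #tx_pin
wxpin_value = nco | (nco_f << 10) | 7

print(f"whole       = {whole}")
print(f"nco         = ${nco:08X}")
print(f"nco_f       = {nco_f}")
print(f"wxpin value = ${wxpin_value:08X}")
```

With these inputs the whole part is 694 clocks per bit and the remainder (51,200 clocks) is scaled to a 6-bit fraction of 28.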
    
  • jmgjmg Posts: 15,162
    ozpropdev wrote: »
    Chip
    Is it possible to tweak Pnut to allow loads into the top 16k ($FC000-$FFFFF)?
    Does P2load work ?

    Seems pnut should allow the full 1M download, as some FPGAs do have that ?
    It could easily report the memory slice sizes, on a simple bottom-up and top-down data inspection.


  • jmg wrote: »
    Seems pnut should allow the full 1M download, as some FPGAs do have that ?
    It could easily report the memory slice sizes, on a simple bottom-up and top-down data inspection.

    The big A9 FPGA's have been brought back to 512k to represent the final silicon.


  • jmgjmg Posts: 15,162
    ozpropdev wrote: »
    The big A9 FPGA's have been brought back to 512k to represent the final silicon.
    Sure, but that's only a software setting and easily changed.
    Once the chip is released, some may want to use 1M FPGAs for development platforms.

  • Cluso99Cluso99 Posts: 18,069
    What is wrong with putting this 16KB in the bottom 16KB of HUB RAM?

    HubRam.jpg

    The JMP vectors could be placed from $0_1000 upwards (they need to be >= $0_1000 to run hubexec).

    We would like to be able to unlock the lower 128 byte block due to rdxxxx/wrxxxx immediate access.

    Maybe the write protection could be done in 4x 4KB blocks?

    This way, hub is still contiguous. The ROM is copied into the bottom 16KB of hub. If necessary, the user program can move any/all of this to wherever. It permits a much better use of hub for buffers (particularly screen buffers). It also allows subsequent P2 extended versions to have >1MB of hub (with some caveats due to instruction bits).
  • cgraceycgracey Posts: 14,134
    edited 2017-11-13 15:37
    Whichever the case, the debug interrupt instructions (1 per cog) ought to be placed at one end of memory or the other, as they must be in fixed locations. It seems to me that there is a need for 16KB of write-protectable RAM, as well, and that might as well overlap the debug interrupt instructions, as they need write-protecting, too.

    Locating all that at $00000 would be cleaner and would not bifurcate hub memory, but then programs could not start at $00000, like they do now. I think it is nice that beginners get to orient their programs at the start of memory. They can be oblivious, for a time, about things like protected memory at the end of the map which contains the debug interrupt instructions.

    Anyway, I think much deference should be given to locating applications at $00000. The end of memory can almost be forgotten about, while frontloading the protected area makes it stick out like a sore thumb. It's kind of like giving your guest the best seat in the room.
  • potatoheadpotatohead Posts: 10,261
    edited 2017-11-13 15:39
    I really like where it is right now.

    Agreed with the sore thumb perception. Doing it this way keeps the number of things one must know to get started down lower.

    Write protecting the vectors makes a ton of sense. It's an opportunity for those to be managed by system code, should it be in play.



  • cgraceycgracey Posts: 14,134
    16KB is ~3% of 512KB. Not a big deal, anyway.
  • Yup.
  • My first assumption was that the ROM would be loaded from 0 upwards, containing booter and SHA-stuff.

    I am delighted about the current solution. But even with a gap in between, it needs to be possible to load something there while booting.

    The simplest way would be to allow Pnut and P2load to load the complete address space even if no RAM is present. Can do no harm?

    @ozpropdev's example (as usual) is clean to read, and doing an ORG $FC000 for higher RAM or an ORG $FFFFC to write a debug vector makes sense to any assembler programmer.

    But if one wants to use the ROM content and set a debug vector while loading a binary he will need to include the ROM content of the upper area, thus loading a 1MB image.

    The other solution would be a change in the binary format, instead of saving a copy of the RAM image, saving every ORG based block with address to load to and size.

    Then the P2 booter would need to walk down the list and load each block at each address.

    Mike
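The block-list idea above can be sketched as follows. The record layout used here (a 4-byte little-endian load address, a 4-byte little-endian size, then the data) is purely hypothetical and is not an existing P2 or PNut format; the function names are likewise made up for illustration.

```python
import struct

HEADER = "<II"  # hypothetical record header: load address, size (little-endian)


def pack_blocks(blocks):
    """Serialize (address, data) pairs into one image of address-tagged blocks."""
    out = bytearray()
    for addr, data in blocks:
        out += struct.pack(HEADER, addr, len(data))
        out += data
    return bytes(out)


def load_blocks(image, memory):
    """Walk the block list and copy each block to its load address in `memory`."""
    pos = 0
    while pos < len(image):
        addr, size = struct.unpack_from(HEADER, image, pos)
        pos += struct.calcsize(HEADER)
        if pos + size > len(image):
            raise ValueError("truncated block")
        if addr + size > len(memory):
            raise ValueError("block does not fit in memory")
        memory[addr:addr + size] = image[pos:pos + size]
        pos += size


if __name__ == "__main__":
    blocks = [(0x00000, bytes.fromhex("00FE65FD")),
              (0xFC000, b"Hello world, \x00")]
    image = pack_blocks(blocks)
    print(f"{len(image)} bytes sent instead of a full 1MB image")
```

Only the bytes that were actually ORG'd get transmitted, plus eight bytes of header per block.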
  • potatoheadpotatohead Posts: 10,261
    edited 2017-11-13 17:33
    Can't a second stage loader do that?

    Not that I mind an upgrade.

    Personally, I would prefer it load at addresses contained in the format, just like the P2 monitor would do on data cut n paste. That kind of thing rocks.

    We should do it.

    Programmers set ORG where needed, go. Simple, lean, fast, robust.

    I dislike having to push a whole megabyte when it's just not gonna get used.

    And, if we support ORG blocks, developers can still push a megabyte and zero / data fill the gaps in the image if they want or somehow need to.



  • jmgjmg Posts: 15,162
    cgracey wrote: »
    ....
    Locating all that at $00000 would be cleaner and would not bifurcate hub memory, but then programs could not start at $00000, like they do now. I think it is nice that beginners get to orient their programs at the start of memory. They can be oblivious, for a time, about things like protected memory at the end of the map which contains the debug interrupt instructions.
    ...
    Other MCUs have reset/interrupts at 0000, which means you always know where they are, no matter what future memory size you may have.
    If you cannot add 16k of memory above 512k, you are forcing a split on what was a clean binary block, & then I'd say placing the ROM at 00H becomes more important. ( That split has already bitten written code.. )

    With other MCUs the offsets are largely managed by the tools (so invisible to any beginner), and you can use segments in assembler, so that CSEG ORG 00 is still the first byte of code...
    ( in P2, first byte of HUB code would be something like HSEG 00 ?)

    You would probably want a ROM segment in the Assembler, no matter where the base of that is.


  • jmgjmg Posts: 15,162
    msrobots wrote: »
    The simplest way would be to allow Pnut and P2load to load the complete address space even if no RAM is present. Can do no harm?

    I'm guessing P2load already does that, and pnut certainly should be fixed.

    msrobots wrote: »
    But if one wants to use the ROM content and set a debug vector while loading a binary he will need to include the ROM content of the upper area, thus loading a 1MB image.

    The other solution would be a change in the binary format, instead of saving a copy of the RAM image, saving every ORG based block with address to load to and size.
    Then the P2 booter would need to walk down the list and load each block at each address.

    That's called intel hex :)

    Certainly, you do not want to be sending a large 1MB blob & even many files of 1MB are less than ideal...
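For readers who have not met Intel HEX, the sketch below emits records for address-tagged blocks, using type 04 (extended linear address) records to reach addresses above 64KB such as $FC000. The record layout and checksum follow the standard Intel HEX format; the function names are hypothetical, and nothing here implies that PNut or the P2 boot loader accepts this format.

```python
def record(rectype, addr16, data=b""):
    """Build one Intel HEX record: length, 16-bit address, type, data, checksum."""
    body = bytes([len(data), (addr16 >> 8) & 0xFF, addr16 & 0xFF, rectype]) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def to_intel_hex(blocks, row=16):
    """Convert (address, data) pairs into a list of Intel HEX lines."""
    lines = []
    upper = None  # currently selected upper 16 address bits
    for addr, data in blocks:
        off = 0
        while off < len(data):
            a = addr + off
            # Never let a data record cross a 64KB boundary.
            n = min(row, len(data) - off, 0x10000 - (a & 0xFFFF))
            if (a >> 16) != upper:
                upper = a >> 16
                lines.append(record(0x04, 0, upper.to_bytes(2, "big")))
            lines.append(record(0x00, a & 0xFFFF, data[off:off + n]))
            off += n
    lines.append(record(0x01, 0))  # end-of-file record
    return lines


if __name__ == "__main__":
    for line in to_intel_hex([(0xFC000, b"Hello world, \x00")]):
        print(line)
```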

  • Sure a second stage loader could do that also, but then you would NEED a second stage loader to access the upper ROM/RAM.

    Would it be possible for either the address counter to wrap at $FFFFF, or for the 16K to be placed at $FFFFC000 so that it wraps at the long boundary?

    Then a loader could load a continuous image, say ORG $FFFFFFFC to set a debug vector and then the program image follows at address 0?

    That would allow loading continuously starting at the debug vectors and leaving BIOS/ROM/RAM unchanged, or starting at 0 without changing the upper RAM, or starting at (FFF)FC000 to load a continuous image in one block?

    just asking,

    Mike

  • cgraceycgracey Posts: 14,134
    I'll make PNut.exe, for now, just load up to $FFFBF, if there's data ORGH'd up that high. That will protect the last 16 longs, which are the debug interrupt instructions.

    I still need to get this BeMicro-A9 problem solved, somehow. I could just make two different images, but that seems ridiculous.
  • Why protect the debug vectors?

    Debug is mostly used in development, so uploading an image with an activated debug vector might come in handy?

    Mike
  • Cluso99Cluso99 Posts: 18,069
    While you can place code currently starting at $0_0000, users cannot run code from there (hubexec) due to mapping of the cog and lut addresses for the program counter.
    So that has to be explained.

    Why is that any different to explaining that their hubexec code starts at $0_1000, with the first $xxx bytes reserved for the Interrupt vectors?
    And the ROM is initially copied to $0_0000-$0_3FFF (bottom 16KB of HUB RAM).

    The pnut2 (or whatever) compiler could default to compile at ORGH $0_4000.

    These days, memory maps on micros are often quite complex, with maps including ram, bootloaders, flash, and eeprom, registers, etc.

    The P2 would still be extremely simple, and wouldn't require the hub to be broken into two blocks, just one contiguous block. This is far superior, especially for some of the proposed later versions with fewer cogs that most likely will have smaller hub RAM.

    Contiguous memory is IMHO always better. Think VGA where you want a large frame buffer. In this P2, you have a max frame buffer size of 512KB-16KB= 496KB.
    A 256KB P2 would have a max buffer of 240KB, and a 128KB would give 112KB.

    Remember all the old discussions about having a place for mailboxes, etc. These could all fit naturally in the bottom 4KB of Hub below the JMP vectors.

    BTW I haven't checked lately. I have assumed the Interrupt Vectors to be physical JUMP instructions. If they are in fact just addresses, they could be placed much lower in Hub, just above the 128 bytes that can be directly accessed using immediate addressing in RDxxxx/WRxxxx instructions.
  • Wasn't there something about hubexec working below $1000, but just on ODD addresses?

    Or is that gone?

    Mike
  • I think that is gone.
  • jmgjmg Posts: 15,162
    cgracey wrote: »
    I still need to get this BeMicro-A9 problem solved, somehow. I could just make two different images, but that seems ridiculous.


    Did you check to confirm the DIP sw is actually wired as expected ? - can you activate some other pin, based on the DIP setting to confirm - even using a similar equation syntax, in case Altera gets confused there ?