I've decided on what I want to try first with the P1 Verilog (Evil Segment Registers) — Parallax Forums

Bill Henning Posts: 6,445
edited 2014-08-13 06:53 in Propeller 1
Over on the P2 forum m00tykins asked about running Linux on the P2.

I know, old question. Not possible, right?

Actually... I ended up posting what the minimum hardware change required would be, and after a few minutes, I decided it would make a nice byte-sized Verilog experiment for me.

This will NOT happen quickly as I am a bit busy - but it will be my first P1 Verilog project!

This will also greatly help PropGCC on the P1 Verilog running on a DE2-115 (and perhaps the BE MICRO CV), as multiple LMM programs could run at the same time, all happily thinking they had the Prop to themselves.

Multiple Spin interpreters, Forths, etc. would all benefit.

I intend to stick to the implementation below until I have tried it, as it is the simplest form I can imagine.

Quoting my post http://forums.parallax.com/showthread.php/156825-Linux-Minix-on-P2?p=1284685&viewfull=1#post1284685
I am planning on adding CSEG/DSEG to the Propeller 1 Verilog :)

I am even tempted to make the segment addresses be quad long aligned (just like the x86) as it would make potential x86 emulators faster.

I will use the 'WC' flag on RDxxxx/WRxxxx to choose between DSEG and SSEG

Two new instructions:

SETSEG n,reg
GETSEG n,reg

where n is:

0 for CSEG
1 for DSEG
2 for SSEG
3 for ESEG reserved, not implemented for now

(hey, if I will use x86 segment register names, I may as well do it right)

Segment registers will be initialized to $0 on startup, so unless modified, no relocation takes place.

to read using DSEG, use RDLONG dest, src

to read using SSEG, use RDLONG dest, src wc

to write using DSEG, use WRLONG dest, src

to write using SSEG, use WRLONG dest, src wc

hubexec will *ALWAYS* use CSEG, but as it is initialized to 0, it is transparent

When launching a cog, first few instructions should set up CSEG/DSEG/SSEG

In the future, limit registers can be added: SETSEG/GETSEG n+4 will then refer to the limit register associated with segment register n

No criticism of segmentation please - it is a nice simple cheap way of getting almost free data/code relocation, without the significant resources and changes an MMU would need.

Later, I may modify coginit to set up the segment registers. That way, multiple copies of relocation-unaware P1 style code would port easily.
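To make the relocation rule above concrete, here is a minimal Python sketch of one cog's segment registers. It is only a toy model, not the Verilog: the class name, the 64 KB `HUB_SIZE`, the wrap-around at the end of the hub, and treating segment values as plain byte addresses (rather than quad-long-aligned units) are all assumptions made for illustration.

```python
HUB_SIZE = 64 * 1024  # assumed hub size in bytes (hypothetical)

CSEG, DSEG, SSEG, ESEG = 0, 1, 2, 3  # segment numbers as listed in the post


class SegmentedCog:
    """Toy model of one cog's segment registers and hub long access."""

    def __init__(self, hub):
        self.hub = hub           # shared bytearray standing in for hub RAM
        self.seg = [0, 0, 0, 0]  # all segments start at $0: no relocation

    def setseg(self, n, value):
        """Model of SETSEG n,reg."""
        if n == ESEG:
            raise ValueError("ESEG is reserved, not implemented for now")
        self.seg[n] = value

    def getseg(self, n):
        """Model of GETSEG n,reg."""
        return self.seg[n]

    def _effective(self, addr, wc):
        # WC set selects SSEG, WC clear selects DSEG.
        base = self.seg[SSEG] if wc else self.seg[DSEG]
        # Assumption: wrap at the end of the hub, then long-align the
        # result the way P1 RDLONG/WRLONG ignore the low two address bits.
        return ((base + addr) % HUB_SIZE) & ~3

    def rdlong(self, addr, wc=False):
        ea = self._effective(addr, wc)
        return int.from_bytes(self.hub[ea:ea + 4], "little")

    def wrlong(self, value, addr, wc=False):
        ea = self._effective(addr, wc)
        self.hub[ea:ea + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")
```

Two cogs given different DSEG values can both write to "address $10" without touching each other's data, which is the effect that would let several LMM programs each think they own the hub.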

Comments

  • Bill Henning Posts: 6,445
    edited 2014-08-09 11:03
    I was thinking about how to modify the segment registers for the cog that is about to be launched so that it can transparently run existing P1 code.

    One more instruction:

    INITSEG dst,src

    where

    src = source register in cog executing the INITSEG instruction
    dst = %000cccsss

    ccc = cog ID for the NON-running cog, INITSEG does nothing to running cogs
    sss = segment register to modify (CSEG/DSEG/SSEG/ESEG/CLIM/DLIM/SLIM/ELIM in order, with only CSEG/DSEG/SSEG implemented initially)

    Note that designs with more than eight cogs can simply increase the number of 'c' bits
  • Bill Henning Posts: 6,445
    edited 2014-08-09 12:18
    Update for INITSEG

    INITSEG dst,src

    where

    src = source register in cog executing the INITSEG instruction
    dst = %p00cccsss

    p = protection bit
    ccc = cog ID for the NON-running cog, INITSEG does nothing to running cogs
    sss = segment register to modify (CSEG/DSEG/SSEG/ESEG/CLIM/DLIM/SLIM/ELIM in order, with only CSEG/DSEG/SSEG implemented initially)

    p
    0 cog can change its own segment registers
    1 cog cannot change its own segment registers

    sss
    000 CSEG
    001 DSEG
    010 SSEG
    011 ESEG - reserved, not implemented
    100 CLIM - reserved, not implemented
    101 DLIM - reserved, later writes only allowed if DSEG+addr < DLIM
    110 SLIM - reserved, later writes only allowed if SSEG+addr < SLIM
    111 ELIM - reserved, not implemented

    Road Map:

    - first implementation will only support CSEG/DSEG/SSEG
    - next support for 'p' will be added
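As a rough illustration of the %p00cccsss layout in the INITSEG update, the following Python sketch splits a 9-bit dst value into its fields. The function name and the returned dictionary are hypothetical, and rejecting non-zero values in the two "00" bits is an assumption (the post notes they could later become extra cog bits).

```python
SEG_NAMES = ["CSEG", "DSEG", "SSEG", "ESEG", "CLIM", "DLIM", "SLIM", "ELIM"]
IMPLEMENTED = {"CSEG", "DSEG", "SSEG"}  # first implementation per the road map


def decode_initseg_dst(dst):
    """Split a 9-bit INITSEG dst value laid out as %p00cccsss."""
    if not 0 <= dst <= 0x1FF:
        raise ValueError("dst is a 9-bit field")
    if dst & 0b011000000:
        # Assumption: bits 7..6 are treated as reserved-zero on an 8-cog design.
        raise ValueError("bits 7..6 must be zero")
    name = SEG_NAMES[dst & 0b111]          # sss: which register to modify
    return {
        "protect": bool((dst >> 8) & 1),   # p: 1 = cog cannot change its own segments
        "cog": (dst >> 3) & 0b111,         # ccc: ID of the non-running target cog
        "segment": name,
        "implemented": name in IMPLEMENTED,
    }
```

For example, `decode_initseg_dst(0b100101001)` describes a protected write to DSEG of cog 5.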
  • Bill Henning Posts: 6,445
    edited 2014-08-09 14:28
    I think the initial implementation of the limit registers will be a mask - so a 4KB limit would be $00FFF, 64KB $0FFFF etc.

    Advantage:

    - easy to implement
    - uses fewer transistors than a comparator would, and blocks out-of-range writes faster

    Disadvantage:

    - Write protect limits must be a power of two in size

    I am trying very hard to NOT think of adding a 'dirty' bit to DSEG/SSEG
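The mask idea can be sketched in a few lines of Python next to the comparator it would replace. The function names are made up for illustration, and applying the check to the segment-relative address (before the segment base is added) is an assumption, since the post does not say which address is checked.

```python
def limit_mask(size_bytes):
    """Return the limit mask for a power-of-two sized region, e.g. 4 KB -> $FFF."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("mask limits must be a power of two in size")
    return size_bytes - 1


def write_allowed_mask(addr, mask):
    """Mask check: the write is allowed only if no address bit lies outside the mask."""
    return (addr & ~mask) == 0


def write_allowed_compare(addr, limit):
    """Comparator check, shown only for comparison with the mask version."""
    return addr < limit
```

For power-of-two sizes the two checks agree on every address, but the mask version only needs an AND and a zero test, which is why it should be cheaper than a full magnitude comparison.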
  • Bill Henning Posts: 6,445
    edited 2014-08-10 07:39
    I think I figured out a good use for ESEG / ELIM.

    Once the simple segmentation works, I'll consider adding an admittedly strange mode for hub addressing:

    $xxxx0000-$xxxx7FFF: use DSEG
    $xxxx8000-$xxxxFFFF: use ESEG

    1) This would allow segmentation aware programs to move/copy blocks of memory in a large hub more easily

    2) This would allow mapping the same "ROM" image to several legacy (non-segmented) P1 apps running in cogs.

    3) This allows for a shared data segment, and a private data segment, for multiple cogs.

    By adding more segment registers it would be possible to have more simultaneously accessible blocks (almost a poor man's MMU). However, I am not convinced that would be worth it; at that point a real MMU would be better :)

    I do like the idea of sharing "ROM" images between cogs.
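A small Python sketch of the split addressing mode described above: bit 15 of the hub address picks DSEG or ESEG. The function names are hypothetical, and using only the low 15 bits as the offset within the chosen segment is an assumption; the post does not spell out how the offset is formed.

```python
def select_segment(addr):
    """$xxxx0000-$xxxx7FFF uses DSEG, $xxxx8000-$xxxxFFFF uses ESEG."""
    return "ESEG" if addr & 0x8000 else "DSEG"


def translate(addr, dseg, eseg):
    """Relocate a legacy 16-bit style hub address through DSEG or ESEG."""
    offset = addr & 0x7FFF  # assumption: offset within the selected 32 KB half
    base = eseg if addr & 0x8000 else dseg
    return base + offset


# Two legacy apps with private data segments sharing one "ROM" image:
ROM_IMAGE = 0x40000                        # hypothetical location of the shared image
app_a = dict(dseg=0x10000, eseg=ROM_IMAGE)
app_b = dict(dseg=0x18000, eseg=ROM_IMAGE)
```

With these values, `translate(0x0010, **app_a)` and `translate(0x0010, **app_b)` land in different private blocks, while `translate(0x8010, ...)` resolves to the same shared location for both.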
  • Bill Henning Posts: 6,445
    edited 2014-08-12 15:46
    BE MICRO CV's are here!

    Checklist:

    - Download Quartus 14 Web Edition (DONE)
    - Install Quartus 14 Web Edition (DONE)
    - unpack BE MICRO CV (DONE)
    - Download Verilog code (DONE)

    Coming up next:

    - Build CV version
    - Apply /RST patch
    - "Deep Dive" Study the Verilog
    - Make Changes :-)
  • Cluso99 Posts: 18,069
    edited 2014-08-12 15:59
    Hooray!
    Make sure you are running 64bit for Quartus 14.0...
    It is not spelled out on Altera's website and it's a big download.
    I got caught so have had to change laptops (stole the wife's)

    I have had a deep dive into the verilog code. Lots of it is quite easy to understand. I have done some small coding in VHDL before, but not verilog.
    While I can understand it, writing it is another problem as you need to fully understand syntax and concept. But I am muddling thru so you should have no problems.

    May the force be with you... ;)
  • Bill Henning Posts: 6,445
    edited 2014-08-12 16:04
    Thanks Ray!

    I've had enough pain due to 64 bit vs 32 bit libs in the past that I googled installing Quartus on 64 bit Linux before I even started, and added the needed libs. After that it was pretty painless (so far at least).

    This will be fun, and because the CVs are inexpensive, it won't take equipment away from trying the next P2 drop.
  • Bill Henning Posts: 6,445
    edited 2014-08-12 16:36
    Available Sacrificial Op-Codes:

    http://forums.parallax.com/showthread.php/156795-Where-to-grow-the-Prop1-architecture-instruction-wise
    cgracey wrote: »
    In the current Prop1 chip, there are four unused instruction opcodes:

    000100 ZCRICCCC DDDDDDDDD SSSSSSSSS <MUL> D,S/#
    000101 ZCRICCCC DDDDDDDDD SSSSSSSSS <MULS> D,S/#
    000110 ZCRICCCC DDDDDDDDD SSSSSSSSS <ENC> D,S/#
    000111 ZCRICCCC DDDDDDDDD SSSSSSSSS <ONES> D,S/#

    These instructions were planned for future implementation, but are unused in the current chip and tools.

    Making new instructions that work in these spaces will not impede the operation of the current tools systems. If you were to make the ADD instruction behave differently, on the other hand, you would cause the ROM code to malfunction and you wouldn't even be able to download a program. By doing new things in these four instruction spaces, you'll encounter no problems with the ROM code or current tools.

    I propose, for the purpose of varied development efforts, to rename these opcodes as follows:

    000100 ZCRICCCC DDDDDDDDD SSSSSSSSS USR0 D,S/#
    000101 ZCRICCCC DDDDDDDDD SSSSSSSSS USR1 D,S/#
    000110 ZCRICCCC DDDDDDDDD SSSSSSSSS USR2 D,S/#
    000111 ZCRICCCC DDDDDDDDD SSSSSSSSS USR3 D,S/#

    The assemblers can be modified to recognize and assemble these instructions so that a way exists to grow the instruction set without any fixed functional requirements.

    More opcode space - note these instructions would be unconditional:
    xxxxxx xxxx 0000 ddddddddd sssssssss

    For now, I intend to leave MUL/MULS alone, and will confine my experiments to:

    000110 ZCRICCCC DDDDDDDDD SSSSSSSSS <ENC> D,S/#
    000111 ZCRICCCC DDDDDDDDD SSSSSSSSS <ONES> D,S/#

    and possibly

    xxxxxx xxxx 0000 ddddddddd sssssssss

    with the exception of

    000000 xxxx 0000 ddddddddd sssssssss

    which I think should be reserved as official NOP space
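To keep the bit fields straight, here is a hedged Python sketch that packs and unpacks a P1 instruction word using the 6/4/4/9/9 layout shown above and labels the opcode spaces discussed in this post. The helper names and the label strings are my own for illustration; only the field layout and the USR0..USR3 opcode values come from the quoted text.

```python
USR_OPCODES = {0b000100: "USR0", 0b000101: "USR1",
               0b000110: "USR2", 0b000111: "USR3"}


def encode(opcode, zcri, cond, d, s):
    """Pack the fields as oooooo zcri cccc ddddddddd sssssssss."""
    return ((opcode & 0x3F) << 26) | ((zcri & 0xF) << 22) | \
           ((cond & 0xF) << 18) | ((d & 0x1FF) << 9) | (s & 0x1FF)


def decode(word):
    """Unpack a 32-bit instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "zcri": (word >> 22) & 0xF,
        "cond": (word >> 18) & 0xF,
        "d": (word >> 9) & 0x1FF,
        "s": word & 0x1FF,
    }


def classify(word):
    """Label which opcode space a word falls into (labels are illustrative)."""
    f = decode(word)
    if f["cond"] == 0:
        # cccc=0000 never executes on a stock P1, so it is spare space,
        # except opcode 000000, which the post suggests reserving as NOP.
        return "official NOP" if f["opcode"] == 0 else "cccc=0000 space"
    return USR_OPCODES.get(f["opcode"], "existing instruction")
```

A word built with `encode(0b000110, 0b0010, 0b1111, d, s)` classifies as USR2 (the former ENC slot), while the same opcode with cccc=0000 falls into the extra unconditional space.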
  • Cluso99 Posts: 18,069
    edited 2014-08-12 17:06
    The SYSOP instruction could be expanded like Chip did with the older P2.

    000011 zcri cccc xxxxxxxxx nnnnnnnnn
    where nnnnnnnnn are sub-instructions (some already defined as COGINIT, COGID, etc)

    If we can get someone to change one of the P1 compilers to:
    1. Add 4 instructions (as suggested by Chip)....
    USR0 D,S/# ' 000100 zcri cccc ddddddddd sssssssss
    USR1 D,S/# ' 000101 zcri cccc ddddddddd sssssssss
    USR2 D,S/# ' 000110 zcri cccc ddddddddd sssssssss
    USR3 D,S/# ' 000111 zcri cccc ddddddddd sssssssss
    2. Permit compilation of hub >32KB (64KB or more)

    This would aid us all in developing new P1 verilog and testing it.
  • Bill Henning Posts: 6,445
    edited 2014-08-12 17:45
    Good point!
  • rogloh Posts: 5,795
    edited 2014-08-12 23:58
    Two or four spare opcodes is not a lot, but if we are willing to have some instructions take 8 clocks instead of 4 to execute, we might be able to use one of the spare USR instructions to expand into an entirely new instruction set by allowing the next instruction to be processed completely differently when preceded by this special instruction. Of course you do sacrifice COG instruction space and execution time that way, but it opens up lots of possibilities for completely new instructions.

    UPDATE: I'm sure this is a very difficult way to go, as it gets into the entire CPU processing pipeline. :frown:
  • pik33 Posts: 2,366
    edited 2014-08-13 00:01
    Then we will have segments and prefix instructions, so we could use them to switch from real to protected mode. I don't know what such a mode would be on the Propeller, or what it could do in that mode.

    A new Propeller 386 is on its way :)
  • rogloh Posts: 5,795
    edited 2014-08-13 00:16
    :smile: Great, then we can run DOOM.
  • Bill Henning Posts: 6,445
    edited 2014-08-13 06:51
    I think I'll use USR3 (on my systems) for a full opcode, some of the SYSOP space, and later, if I need more, some of the NOP space (cccc=0000)

    I'd prefer to avoid prefix instructions, but an AUGS/AUGD/BIG would be helpful for 32 bit constants.
    rogloh wrote: »
    Two or four spare opcodes is not a lot, but if we are willing to have some instructions take 8 clocks instead of 4 to execute, we might be able to use one of the spare USR instructions to expand into an entirely new instruction set by allowing the next instruction to be processed completely differently when preceded by this special instruction. Of course you do sacrifice COG instruction space and execution time that way, but it opens up lots of possibilities for completely new instructions.

    UPDATE: I'm sure this is a very difficult way to go, as it gets into the entire CPU processing pipeline. :frown:
  • Bill Henning Posts: 6,445
    edited 2014-08-13 06:53
    Between an external cog being able to set up the segment registers before launching a new cog, and the limit registers, we would be 90% of the way to having a protected mode.

    FYI, the default of setting all segment registers to 0 is equivalent to the x86 tiny model, where code, data and stack all share one segment.
    pik33 wrote: »
    Then we will have segments and prefix instructions, so we could use them to switch from real to protected mode. I don't know what such a mode would be on the Propeller, or what it could do in that mode.

    A new Propeller 386 is on its way :)