I've decided on what I want to try first with the P1 Verilog (Evil Segment Registers)
Bill Henning
Posts: 6,445
Over on the P2 forum m00tykins asked about running Linux on the P2.
I know, old question. Not possible, right?
Actually... I ended up posting what the minimum hardware change required would be, and after a few minutes, I decided it would make a nice byte-sized Verilog experiment for me.
This will NOT happen quickly as I am a bit busy - but it will be my first P1 Verilog project!
This will also greatly help PropGCC on the P1 Verilog running on a DE2-115 (and perhaps the BE CV), as multiple LMM programs could run at the same time, each happily thinking it had the Prop to itself.
Multiple Spin interpreters, Forths, etc. would all benefit.
I intend to stick to the implementation below until I have tried it, as it is the simplest form I can imagine.
Quoting my post http://forums.parallax.com/showthread.php/156825-Linux-Minix-on-P2?p=1284685&viewfull=1#post1284685
Bill Henning wrote: I am planning on adding CSEG/DSEG to the Propeller 1 Verilog
I am even tempted to make the segment addresses be quad long aligned (just like the x86) as it would make potential x86 emulators faster.
I will use the 'WC' flag on RDxxxx/WRxxxx to choose between DSEG and SSEG
Two new instructions:
SETSEG n,reg
GETSEG n,reg
where n is:
0 for CSEG
1 for DSEG
2 for SSEG
3 for ESEG (reserved, not implemented for now)
(hey, if I will use x86 segment register names, I may as well do it right)
Segment registers will be initialized to $0 on startup, so unless modified, no relocation takes place.
to read using DSEG, use RDLONG dest, src
to read using SSEG, use RDLONG dest, src wc
to write using DSEG, use WRLONG dest, src
to write using SSEG, use WRLONG dest, src wc
hubexec will *ALWAYS* use CSEG, but as it is initialized to 0, it is transparent
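A rough behavioral model of the translation described above, written in Python rather than Verilog purely for illustration. The names SegRegs, hub_data_address and hub_code_address and the 64 KB HUB_MASK are assumptions of this sketch, not part of the proposal. It treats segment registers as plain byte offsets; the quad-long-aligned variant is not modelled.

```python
HUB_MASK = 0xFFFF  # assumption: 64 KB hub; widen for a larger hub


class SegRegs:
    """Per-cog segment registers, all zero at startup (no relocation)."""

    def __init__(self):
        self.cseg = 0
        self.dseg = 0
        self.sseg = 0


def hub_data_address(segs, addr, wc=False):
    """Address seen by the hub for RDxxxx/WRxxxx.

    wc=False selects DSEG, wc=True selects SSEG.
    """
    base = segs.sseg if wc else segs.dseg
    return (base + addr) & HUB_MASK


def hub_code_address(segs, pc):
    """Hub execution always relocates through CSEG."""
    return (segs.cseg + pc) & HUB_MASK
```

With all three registers left at 0 the functions return the address unchanged, which is the "transparent" case for existing code.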
When launching a cog, the first few instructions should set up CSEG/DSEG/SSEG
In the future, limit registers can be added: SETSEG/GETSEG n+4 will refer to the limit register associated with segment register n
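The n / n+4 numbering can be pictured as one small eight-entry register file. This is only a sketch: SegmentFile, limit_of and the 32-bit register width are hypothetical, and the reset value of the limit registers is not specified in the post (zero is assumed here).

```python
CSEG, DSEG, SSEG, ESEG = range(4)  # segment numbers from the post
LIMIT_OFFSET = 4                   # SETSEG/GETSEG n+4 addresses the limit of n


class SegmentFile:
    """Eight slots: 0..3 are segment bases, 4..7 are the matching limits."""

    def __init__(self):
        # Bases reset to 0 per the post; limits resetting to 0 is an assumption.
        self.regs = [0] * 8

    def setseg(self, n, value):
        self.regs[n & 7] = value & 0xFFFFFFFF  # assumed 32-bit registers

    def getseg(self, n):
        return self.regs[n & 7]


def limit_of(n):
    """Register number of the limit paired with segment register n."""
    return (n & 3) + LIMIT_OFFSET
```

For example, `setseg(limit_of(DSEG), value)` writes slot 5, the slot a future DSEG limit would occupy.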
No criticism of segmentation please - it is a nice simple cheap way of getting almost free data/code relocation, without the significant resources and changes an MMU would need.
Later, I may modify coginit to set up the segment registers; that way, multiple relocation-unaware P1-style programs would port easily.
Comments
One more instruction:
INITSEG dst,src
where
src = source register in cog executing the INITSEG instruction
dst = %000cccsss
ccc = cog ID for the NON-running cog, INITSEG does nothing to running cogs
sss = segment register to modify (CSEG/DSEG/SSEG/ESEG/CLIM/DLIM/SLIM/ELIM in order, with only CSEG/DSEG/SSEG implemented initially)
Note that designs with more than eight cogs can simply increase the number of 'c' bits
INITSEG dst,src
where
src = source register in cog executing the INITSEG instruction
dst = %p00cccsss
p = protection bit
ccc = cog ID for the NON-running cog, INITSEG does nothing to running cogs
sss = segment register to modify (CSEG/DSEG/SSEG/ESEG/CLIM/DLIM/SLIM/ELIM in order, with only CSEG/DSEG/SSEG implemented initially)
p
0 cog can change its own segment registers
1 cog cannot change its own segment registers
sss
000 CSEG
001 DSEG
010 SSEG
011 ESEG - reserved, not implemented
100 CLIM - reserved, not implemented
101 DLIM - reserved, later writes only allowed if DSEG+addr < DLIM
110 SLIM - reserved, later writes only allowed if SSEG+addr < SLIM
111 ELIM - reserved, not implemented
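As a sanity check on the %p00cccsss layout, here is a small decoding sketch. The Cog class, the per-cog `protected` flag and the return value of initseg are assumptions made for illustration; the post only fixes the bit positions and says INITSEG does nothing to running cogs.

```python
from collections import namedtuple

SEG_NAMES = ["CSEG", "DSEG", "SSEG", "ESEG", "CLIM", "DLIM", "SLIM", "ELIM"]
InitSegDst = namedtuple("InitSegDst", ["protect", "cog", "seg"])


def decode_initseg_dst(dst):
    """Split a 9-bit %p00cccsss value into its three fields."""
    return InitSegDst(protect=(dst >> 8) & 1,
                      cog=(dst >> 3) & 0b111,
                      seg=dst & 0b111)


class Cog:
    def __init__(self):
        self.running = False
        self.protected = False  # assumption: 'p' is latched per cog
        self.segs = [0] * 8


def initseg(cogs, dst, value):
    """Model of INITSEG: returns True if the write took effect."""
    f = decode_initseg_dst(dst)
    target = cogs[f.cog]
    if target.running:
        return False            # running cogs are left untouched
    target.segs[f.seg] = value
    target.protected = bool(f.protect)
    return True
```

For instance, `decode_initseg_dst(0b100011001)` yields protect=1, cog=3, seg=1 (DSEG).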
Road Map:
- first implementation will only support CSEG/DSEG/SSEG
- next support for 'p' will be added
Advantage:
- easy to implement
- uses fewer transistors than a comparator for blocking writes outside of the range, and is faster
Disadvantage:
- Write protect limits must be power of two sized
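One way to read the trade-off above: if the protected window is a power of two in size, the range check reduces to testing that the high address bits are zero, with no magnitude comparison. The sketch below contrasts the two checks; the function names are hypothetical and this mask scheme is my interpretation, not something spelled out in the post.

```python
def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def write_allowed_mask(offset, size):
    """Mask check: allowed when no bit at or above log2(size) is set."""
    if not is_power_of_two(size):
        raise ValueError("size must be a power of two")
    return (offset & ~(size - 1)) == 0


def write_allowed_compare(offset, limit):
    """Comparator check: needs a full-width magnitude compare in hardware."""
    return offset < limit
```

For non-negative offsets and a power-of-two size the two functions agree, which is why the restriction on limit sizes buys the cheaper hardware.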
I am trying very hard to NOT think of adding a 'dirty' bit to DSEG/SSEG
Once the simple segmentation works, I'll consider adding an admittedly strange mode for hub addressing:
$xxxx0000-$xxxx7FFF: use DSEG
$xxxx8000-$xxxxFFFF: use ESEG
1) This would allow segmentation aware programs to move/copy blocks of memory in a large hub more easily
2) This would allow mapping the same "ROM" image to several legacy (non-segmented) P1 apps running in cogs.
3) This allows for a shared data segment, and a private data segment, for multiple cogs.
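A sketch of that split-window mode, assuming address bit 15 picks the segment register and the low 15 bits become the offset within it. The function name, and the decision to ignore the upper "xxxx" bits entirely, are assumptions of this example.

```python
def split_window_address(addr, dseg, eseg):
    """Map a cog-visible hub address through DSEG or ESEG.

    $xxxx0000-$xxxx7FFF goes through DSEG,
    $xxxx8000-$xxxxFFFF goes through ESEG.
    """
    offset = addr & 0x7FFF            # assumed: low 15 bits are the offset
    base = eseg if (addr & 0x8000) else dseg
    return base + offset
```

Pointing ESEG of several cogs at the same base gives them a shared window (for example a common "ROM" image), while each cog's DSEG stays private.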
By adding more segment registers it would be possible to have more simultaneously accessible blocks (almost a poor man's MMU). However, I am not convinced that would be worth it; at that point a real MMU would be better.
I do like the idea of sharing "ROM" images between cogs.
Checklist:
- Download Quartus 14 Web Edition (DONE)
- Install Quartus 14 Web Edition (DONE)
- unpack BE MICRO CV (DONE)
- Download Verilog code (DONE)
Coming up next:
- Build CV version
- Apply /RST patch
- "Deep Dive" Study the Verilog
- Make Changes :-)
Make sure you are running a 64-bit OS for Quartus 14.0...
It is not spelled out on Altera's website and it's a big download.
I got caught, so I have had to change laptops (stole the wife's).
I have had a deep dive into the Verilog code. Lots of it is quite easy to understand. I have done some small coding in VHDL before, but not Verilog.
While I can understand it, writing it is another problem, as you need to fully understand the syntax and concepts. But I am muddling thru, so you should have no problems.
May the force be with you...
I've had enough pain due to 64-bit vs 32-bit libs in the past that I googled installing Quartus on 64-bit Linux before I even started, and added the needed libs. After that it was pretty painless (so far at least).
This will be fun, and due to the inexpensive CV's it won't take equipment from trying the next P2 drop.
http://forums.parallax.com/showthread.php/156795-Where-to-grow-the-Prop1-architecture-instruction-wise
More opcode space - note these instructions would be unconditional:
For now, I intend to leave MUL/MULS alone, and will confine my experiments to:
000110 ZCRICCCC DDDDDDDDD SSSSSSSSS <ENC> D,S/#
000111 ZCRICCCC DDDDDDDDD SSSSSSSSS <ONES> D,S/#
and possibly
xxxxxx xxxx 0000 ddddddddd sssssssss
with the exception of
000000 xxxx 0000 ddddddddd sssssssss
which I think should be reserved as official NOP space
000011 zcri cccc xxxxxxxxx nnnnnnnnn
where nnnnnnnnn are sub-instructions (some already defined as COGNEW, COGID, etc)
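For reference, the bit patterns above follow the standard P1 instruction layout: a 6-bit opcode, the four ZCRI bits, a 4-bit condition, then 9-bit D and S fields. The sketch below splits a 32-bit word accordingly and flags the "condition = 0000" encodings discussed here; the helper names are hypothetical.

```python
def decode(instr):
    """Split a 32-bit P1 instruction word into its fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,
        "zcri":   (instr >> 22) & 0xF,
        "cond":   (instr >> 18) & 0xF,
        "dst":    (instr >> 9) & 0x1FF,
        "src":    instr & 0x1FF,
    }


def in_nop_space(instr):
    """Opcode 000000 with condition 0000: the proposed official NOP space."""
    f = decode(instr)
    return f["opcode"] == 0 and f["cond"] == 0


def in_candidate_space(instr):
    """Condition 0000 with a non-zero opcode: never executes on a stock P1,
    so these encodings are candidates for new unconditional instructions."""
    f = decode(instr)
    return f["cond"] == 0 and f["opcode"] != 0
```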
If we can get someone to change one of the P1 compilers to:
1. Add 4 instructions (as suggested by Chip)....
USR0 D,S/# ' 000100 zcri cccc ddddddddd sssssssss
USR1 D,S/# ' 000100 zcri cccc ddddddddd sssssssss
USR2 D,S/# ' 000100 zcri cccc ddddddddd sssssssss
USR3 D,S/# ' 000100 zcri cccc ddddddddd sssssssss
2. Permit compilation of hub >32KB (64KB or more)
This would aid us all in developing new P1 verilog and testing it.
UPDATE: I'm sure this is a very difficult way to go as it gets into the entire CPU processing pipeline. :-(
A new Propeller 386 is on its way
I'd prefer to avoid prefix instructions, but an AUGS/AUGD/BIG would be helpful for 32-bit constants.
FYI, the default of setting all segment registers to 0 is equivalent to the x86 small model.