I've decided on what I want to try first with the P1 Verilog (Evil Segment Registers)

Bill Henning · 2014-08-09 08:54

Over on the P2 forum m00tykins asked about running Linux on the P2.

I know, old question. Not possible, right?

Actually... I ended up posting what the minimum hardware change required would be, and after a few minutes, I decided it would make a nice byte-sized Verilog experiment for me.

This will NOT happen quickly as I am a bit busy - but it will be my first P1 Verilog project!

This will also greatly help PropGCC on the P1 Verilog running on a DE2-115 (and perhaps the BE MICRO CV), as multiple LMM programs could run at the same time, all happily thinking they had the Prop to themselves.

Multiple Spin interpreters, Forths, etc. would all benefit.

I intend to stick to the implementation below until I have tried it, as it is the simplest form I can imagine.

Quoting my post http://forums.parallax.com/showthread.php/156825-Linux-Minix-on-P2?p=1284685&viewfull=1#post1284685

Bill Henning wrote:

I am planning on adding CSEG/DSEG to the Propeller 1 Verilog

I am even tempted to make the segment addresses be quad long aligned (just like the x86) as it would make potential x86 emulators faster.

I will use the 'WC' flag on RDxxxx/WRxxxx to choose between DSEG and SSEG

Two new instructions:

SETSEG n,reg
GETSEG n,reg

where n is:

0 for CSEG
1 for DSEG
2 for SSEG
3 for ESEG reserved, not implemented for now

(hey, if I will use x86 segment register names, I may as well do it right)

Segment registers will be initialized to $0 on startup, so unless modified, no relocation takes place.

to read using DSEG, use RDLONG dest, src

to read using SSEG, use RDLONG dest, src wc

to write using DSEG, use WRLONG dest, src

to write using SSEG, use WRLONG dest, src wc

hubexec will *ALWAYS* use CSEG, but as it is initialized to 0, it is transparent
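
The relocation rules above can be sketched as a small behavioral model. This is Python used purely for illustration, not the Verilog itself. The class and method names, the 64KB `HUB_MASK`, and the wrap-around on overflow are assumptions that the post does not specify.

```python
# Behavioral sketch only. Names, hub size and wrap-around are assumptions.
CSEG, DSEG, SSEG, ESEG = 0, 1, 2, 3
HUB_MASK = 0xFFFF  # assumed 64KB hub address space


class SegRegs:
    def __init__(self):
        # All segment registers start at $0, so no relocation takes place.
        self.base = [0, 0, 0, 0]

    def setseg(self, n, value):
        # Models SETSEG n,reg
        self.base[n] = value & HUB_MASK

    def getseg(self, n):
        # Models GETSEG n,reg
        return self.base[n]

    def data_address(self, addr, wc=False):
        # RDxxxx/WRxxxx: WC selects SSEG, otherwise DSEG is used.
        seg = SSEG if wc else DSEG
        return (self.base[seg] + addr) & HUB_MASK

    def code_address(self, addr):
        # Hub execution always goes through CSEG.
        return (self.base[CSEG] + addr) & HUB_MASK


regs = SegRegs()
print(hex(regs.data_address(0x0100)))           # 0x100 (no relocation yet)
regs.setseg(DSEG, 0x4000)
regs.setseg(SSEG, 0x7000)
print(hex(regs.data_address(0x0100)))           # 0x4100
print(hex(regs.data_address(0x0100, wc=True)))  # 0x7100
```

With every base left at $0 the translated address equals the original one, which is why unmodified code would run unchanged.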

When launching a cog, first few instructions should set up CSEG/DSEG/SSEG

In the future, limit registers can be added: SETSEG/GETSEG with n+4 will refer to the limit register associated with segment register n.

No criticism of segmentation please - it is a nice simple cheap way of getting almost free data/code relocation, without the significant resources and changes an MMU would need.

Later, I may modify coginit to set up the segment registers. That way, multiple relocation-unaware P1-style programs would port easily.

Bill Henning · 2014-08-09 11:03

I was thinking about how to modify the segment registers for the cog that is about to be launched so that it can transparently run existing P1 code.

One more instruction:

INITSEG dst,src

where

src = source register in cog executing the INITSEG instruction
dst = %000cccsss

ccc = cog ID for the NON-running cog, INITSEG does nothing to running cogs
sss = segment register to modify (CSEG/DSEG/SSEG/ESEG/CLIM/DLIM/SLIM/ELIM in order, with only CSEG/DSEG/SSEG implemented initially)

Note that designs with more than eight cogs can simply increase the number of 'c' bits

Bill Henning · 2014-08-09 12:18

Update for INITSEG

INITSEG dst,src

where

src = source register in cog executing the INITSEG instruction
dst = %p00cccsss

p = protection bit
ccc = cog ID for the NON-running cog, INITSEG does nothing to running cogs
sss = segment register to modify (CSEG/DSEG/SSEG/ESEG/CLIM/DLIM/SLIM/ELIM in order, with only CSEG/DSEG/SSEG implemented initially)

p
0 cog can change its own segment registers
1 cog cannot change its own segment registers

sss
000 CSEG
001 DSEG
010 SSEG
011 ESEG - reserved, not implemented
100 CLIM - reserved, not implemented
101 DLIM - reserved, later writes only allowed if DSEG+addr < DLIM
110 SLIM - reserved, later writes only allowed if SSEG+addr < SLIM
111 ELIM - reserved, not implemented
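
As a rough illustration of the %p00cccsss layout above, the field split could look like this in Python. The function name and the returned dictionary are hypothetical. Only the bit positions and register order come from the post.

```python
# Sketch of the %p00cccsss field layout. Helper names are hypothetical.
SEG_NAMES = ["CSEG", "DSEG", "SSEG", "ESEG", "CLIM", "DLIM", "SLIM", "ELIM"]


def decode_initseg_dst(dst):
    """Split a 9-bit INITSEG dst value into its p, ccc and sss fields."""
    dst &= 0x1FF
    return {
        "protect": (dst >> 8) & 1,     # p: 1 = cog cannot change its own segment registers
        "cog": (dst >> 3) & 0b111,     # ccc: target (non-running) cog ID
        "seg": SEG_NAMES[dst & 0b111], # sss: register to modify
    }


print(decode_initseg_dst(0b100011001))
# {'protect': 1, 'cog': 3, 'seg': 'DSEG'}
```

A design with more than eight cogs would widen the `cog` field, as noted in the earlier post.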

Road Map:

- first implementation will only support CSEG/DSEG/SSEG
- next support for 'p' will be added

Bill Henning · 2014-08-09 14:28

I think the initial implementation of the limit registers will be a mask - so a 4KB limit would be $00FFF, 64KB $0FFFF etc.

Advantage:

- easy to implement
- uses fewer transistors, and is faster, than a comparator for blocking writes outside of the range

Disadvantage:

- Write-protect limits must be a power of two in size
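
A minimal sketch of the mask idea in Python, assuming the check is applied to the segment-relative address before the base is added (the post does not say at which point the check happens). Both function names are hypothetical.

```python
# Sketch of a mask-style limit check. Where the check is applied is an assumption.
def write_allowed(addr, limit_mask):
    # The write is inside the limit only if no address bit is set outside the mask.
    return (addr & ~limit_mask) == 0


def is_valid_mask(limit_mask):
    # A usable mask has the form 2**k - 1 (contiguous low one bits),
    # which is why limits end up being power-of-two sized.
    return (limit_mask & (limit_mask + 1)) == 0


print(write_allowed(0x0FFF, 0x00FFF))  # True: last byte of a 4KB window
print(write_allowed(0x1000, 0x00FFF))  # False: first byte past 4KB
print(is_valid_mask(0x0FFFF))          # True: 64KB
print(is_valid_mask(0x0BFFF))          # False: not contiguous low ones
```

In hardware the same test is an AND of the upper address bits followed by a zero check, which needs no carry chain, unlike a magnitude comparator.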

I am trying very hard to NOT think of adding a 'dirty' bit to DSEG/SSEG

Bill Henning · 2014-08-10 07:39

I think I figured out a good use for ESEG / ELIM.

Once the simple segmentation works, I'll consider adding an admittedly strange mode for hub addressing:

$xxxx0000-$xxxx7FFF: use DSEG
$xxxx8000-$xxxxFFFF: use ESEG

1) This would allow segmentation aware programs to move/copy blocks of memory in a large hub more easily

2) This would allow mapping the same "ROM" image to several legacy (non-segmented) P1 apps running in cogs.

3) This allows for a shared data segment, and a private data segment, for multiple cogs.
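
One way to picture the split, sketched in Python: bit 15 of the address picks the segment and the upper bits are ignored, as the $xxxx notation suggests. Using only the low 15 bits as the offset within the chosen segment is an assumption, as is the function name. The post does not say whether bit 15 is stripped.

```python
# Sketch of the DSEG/ESEG split. Offset handling is an assumption.
def select_segment(addr, dseg_base, eseg_base):
    offset = addr & 0x7FFF            # assumed: low 15 bits form the offset
    if addr & 0x8000:                 # $xxxx8000-$xxxxFFFF: use ESEG
        return "ESEG", eseg_base + offset
    return "DSEG", dseg_base + offset  # $xxxx0000-$xxxx7FFF: use DSEG


print(select_segment(0x0010, 0x20000, 0x40000))  # ('DSEG', 131088)
print(select_segment(0x8010, 0x20000, 0x40000))  # ('ESEG', 262160)
```

Pointing ESEG at the same base for several cogs would give them a shared upper half (for example a "ROM" image) while each keeps a private DSEG.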

By adding more segment registers it would be possible to have more simultaneously accessible blocks (almost a poor man's MMU). However, I am not convinced that would be worth it; at that point a real MMU would be better.

I do like the idea of sharing "ROM" images between cogs.

Bill Henning · 2014-08-12 15:46

BE MICRO CV's are here!

Checklist:

- Download Quartus 14 Web Edition (DONE)
- Install Quartus 14 Web Edition (DONE)
- unpack BE MICRO CV (DONE)
- Download Verilog code (DONE)

Coming up next:

- Build CV version
- Apply /RST patch
- "Deep Dive" Study the Verilog
- Make Changes :-)

Cluso99 · 2014-08-12 15:59

Hooray!
Make sure you are running 64bit for Quartus 14.0...
It is not spelled out on Altera's website and it's a big download.
I got caught so have had to change laptops (stole the wife's)

I have had a deep dive into the Verilog code. Lots of it is quite easy to understand. I have done some small coding in VHDL before, but not Verilog.
While I can understand it, writing it is another problem as you need to fully understand the syntax and concepts. But I am muddling through, so you should have no problems.

May the force be with you...

Bill Henning · 2014-08-12 16:04

Thanks Ray!

I've had enough pain due to 64 bit vs 32 bit libs in the past that I googled installing quartus on 64 bit Linux before I even started, and added the needed libs. After that it was pretty painless (so far at least).

This will be fun, and due to the inexpensive CV's it won't take equipment from trying the next P2 drop.

Bill Henning · 2014-08-12 16:36

Available Sacrificial Op-Codes:

http://forums.parallax.com/showthread.php/156795-Where-to-grow-the-Prop1-architecture-instruction-wise

cgracey wrote: »

In the current Prop1 chip, there are four unused instruction opcodes:

000100 ZCRICCCC DDDDDDDDD SSSSSSSSS <MUL> D,S/#
000101 ZCRICCCC DDDDDDDDD SSSSSSSSS <MULS> D,S/#
000110 ZCRICCCC DDDDDDDDD SSSSSSSSS <ENC> D,S/#
000111 ZCRICCCC DDDDDDDDD SSSSSSSSS <ONES> D,S/#

These instructions were planned for future implementation, but are unused in the current chip and tools.

Making new instructions that work in these spaces will not impede the operation of the current tools systems. If you were to make the ADD instruction behave differently, on the other hand, you would cause the ROM code to malfunction and you wouldn't even be able to download a program. By doing new things in these four instruction spaces, you'll encounter no problems with the ROM code or current tools.

I propose, for the purpose of varied development efforts, to rename these opcodes as follows:

000100 ZCRICCCC DDDDDDDDD SSSSSSSSS USR0 D,S/#
000101 ZCRICCCC DDDDDDDDD SSSSSSSSS USR1 D,S/#
000110 ZCRICCCC DDDDDDDDD SSSSSSSSS USR2 D,S/#
000111 ZCRICCCC DDDDDDDDD SSSSSSSSS USR3 D,S/#

The assemblers can be modified to recognize and assemble these instructions so that a way exists to grow the instruction set without any fixed functional requirements.

More opcode space - note these instructions would be unconditional:

Bill Henning wrote: »

xxxxxx xxxx 0000 ddddddddd sssssssss

For now, I intend to leave MUL/MULS alone, and will confine my experiments to:

000110 ZCRICCCC DDDDDDDDD SSSSSSSSS <ENC> D,S/#
000111 ZCRICCCC DDDDDDDDD SSSSSSSSS <ONES> D,S/#

and possibly

xxxxxx xxxx 0000 ddddddddd sssssssss

with the exception of

000000 xxxx 0000 ddddddddd sssssssss

which I think should be reserved as official NOP space
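
For anyone experimenting before the assemblers know these mnemonics, the 32-bit instruction word can be assembled by hand from the field layout quoted above (6-bit opcode, ZCRI, CCCC, 9-bit D, 9-bit S). A small Python sketch follows. The `encode` and `is_nop_space` helpers are hypothetical names, not part of any existing tool.

```python
# Sketch: pack P1 instruction fields into a 32-bit word. Helper names are hypothetical.
USR = {"USR0": 0b000100, "USR1": 0b000101, "USR2": 0b000110, "USR3": 0b000111}


def encode(opcode, zcri, cond, d, s):
    # Layout: oooooo zcri cccc ddddddddd sssssssss
    return (((opcode & 0x3F) << 26) | ((zcri & 0xF) << 22) |
            ((cond & 0xF) << 18) | ((d & 0x1FF) << 9) | (s & 0x1FF))


def is_nop_space(instr):
    # cccc = 0000 means the instruction never executes.
    return ((instr >> 18) & 0xF) == 0


# USR3 with only the R (write result) bit set, condition "always",
# D = $010, S = $011
word = encode(USR["USR3"], 0b0010, 0b1111, 0x010, 0x011)
print(f"{word:08X}")        # 1CBC2011
print(is_nop_space(word))   # False
```

A word built this way can be dropped into a program as a plain `long` until the tools learn the new mnemonic.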

Cluso99 · 2014-08-12 17:06

The SYSOP instruction could be expanded like Chip did with the older P2.

000011 zcri cccc xxxxxxxxx nnnnnnnnn
where nnnnnnnnn are sub-instructions (some already defined as COGNEW, COGID, etc)

If we can get someone to change one of the P1 compilers to:
1. Add 4 instructions (as suggested by Chip)....
USR0 D,S/# ' 000100 zcri cccc ddddddddd sssssssss
USR1 D,S/# ' 000101 zcri cccc ddddddddd sssssssss
USR2 D,S/# ' 000110 zcri cccc ddddddddd sssssssss
USR3 D,S/# ' 000111 zcri cccc ddddddddd sssssssss
2. Permit compilation of hub >32KB (64KB or more)

This would aid us all in developing new P1 verilog and testing it.

Bill Henning · 2014-08-12 17:45

Good point!

rogloh · 2014-08-12 23:58

Two or four spare opcodes is not a lot, but if we are willing to have some instructions take 8 clocks instead of 4 to execute, we might be able to use one of the spare USR instructions to expand into an entirely new instruction set by allowing the next instruction to be processed completely differently when preceded by this special instruction. Of course you do sacrifice COG instruction space and execution time that way, but it opens up lots of possibilities for completely new instructions.

UPDATE: I'm sure this is a very difficult way to go, as it gets into the entire CPU processing pipeline. :-(

pik33 · 2014-08-13 00:01

Then we will have segments and prefix instructions, so let's use this to switch from real to protected mode. I don't know what that mode could be for the Propeller, or what it could do in that mode.

A new Propeller 386 is on its way

rogloh · 2014-08-13 00:16

Great, then we can run DOOM.

Bill Henning · 2014-08-13 06:51

I think I'll use USR3 (on my systems) for a full opcode, some of the SYSOP space, and later, if I need more, some of the NOP space (cccc=0000)

I'd prefer to avoid prefix instructions, but an AUGS/AUGD/BIG would be helpful for 32-bit constants.

Bill Henning · 2014-08-13 06:53

Between an external cog being able to set up the segment registers before launching a new cog, and the limit registers, we would be 90% of the way to having a protected mode.

FYI, the default of setting all segment registers to 0 is equivalent to the x86 tiny model, where code, data and stack all share one segment.
