Suggestion for an unconditional JMPRET instruction

cgracey · 2014-03-12 06:30

Sapieha wrote: »

Hi Chip.

I don't think it is a good idea. Let it stay as it is, or else it could cause problems we can't yet imagine.

That's what I'm worried about. We start going down the rabbit hole.

It was worth looking into, though. Who knows, some improvement may yet be possible. Let's see if Dave comes back with anything.

Bill Henning · 2014-03-12 06:53

While faster JMPs would be nice in some circumstances, I have found that in the vast majority of cases I could use the delayed instructions, which are effectively single-cycle.

Out of 50 delayed instructions, one or two will need a NOP, so roughly 2-4% of the time they end up being effectively two-cycle instructions.
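As a rough check on that estimate, here is a minimal Python sketch. It assumes each delayed instruction costs one cycle and each padding NOP adds exactly one more; the function name and those cycle costs are illustrative assumptions, not measurements.

```python
def average_cycles(total_delayed, needing_nop):
    """Average cycles per delayed instruction.

    Assumption: every delayed instruction takes 1 cycle, and each one
    that needs a padding NOP costs 1 extra cycle.
    """
    return (total_delayed + needing_nop) / total_delayed


# One or two NOPs per 50 delayed instructions, as estimated above.
for nops in (1, 2):
    print(f"{nops} NOP(s) per 50: {average_cycles(50, nops):.2f} cycles on average")
```

Under those assumptions the average stays within a few percent of one cycle per delayed instruction.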

cgracey wrote: »

Well, I don't see how this sequence could work if the RET executed in stage 2 of the pipeline:

NOP
PUSH #x
RET

The RET would execute at the same time as the NOP - before the PUSH. This could maybe be gotten around by not executing the RET in stage 2 if there is a PUSH or a POP in a higher stage of the pipeline. CALL would have the same issues as RET. I wonder if there'd be any other rules that would need to be followed. I don't have a lot of confidence at the moment about getting CALL/RET to execute early.
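The ordering problem can be sketched with a toy timing model in Python. This is only an illustration under stated assumptions: one pipeline stage per clock, no stalls, normal execution in stage 4, and a hypothetical early RET acting in stage 2. The names `schedule`, `EXECUTE_STAGE`, and `EARLY_STAGE` are made up for the example.

```python
EXECUTE_STAGE = 4  # assumed stage where instructions normally take effect
EARLY_STAGE = 2    # assumed stage where an early RET would take effect


def schedule(program, early=(), hazards=()):
    """Return (cycle, mnemonic) pairs ordered by when each takes effect.

    The instruction at 0-based position i occupies stage s at cycle i + s.
    Mnemonics in `early` act at EARLY_STAGE, unless a mnemonic listed in
    `hazards` is still ahead of them in a higher stage of the pipeline.
    """
    events = []
    for i, op in enumerate(program):
        stage = EXECUTE_STAGE
        if op in early:
            # Instructions in the higher stages while `op` sits in EARLY_STAGE.
            ahead = program[max(0, i - (EXECUTE_STAGE - EARLY_STAGE)):i]
            if not any(a in hazards for a in ahead):
                stage = EARLY_STAGE
        events.append((i + stage, op))
    return sorted(events, key=lambda e: e[0])


program = ["NOP", "PUSH", "RET"]
print(schedule(program, early={"RET"}))
print(schedule(program, early={"RET"}, hazards={"PUSH", "POP"}))
```

In the first call the RET takes effect in the same cycle as the NOP and one cycle before the PUSH. In the second call, the "hold back if a PUSH or POP is in a higher stage" rule pushes the RET back to its normal slot after the PUSH.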

A plain JMP #/@ could be executed early with some circuitry to patch up an errant PC if it was cancelled later in the pipeline by the instruction above it. That should work. This means that only a plain JMP could likely be made to execute in 2 clocks without going to the extremes that other branches would require.
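A minimal Python sketch of that patch-up idea follows. It assumes the cancellation comes from an older branch that turns out to be taken, in which case the older branch's target overrides the PC that the early JMP already wrote. The class and method names are hypothetical and do not describe the actual circuitry.

```python
class ProgramCounter:
    """Toy PC that lets a plain JMP redirect fetch speculatively."""

    def __init__(self, pc=0):
        self.pc = pc
        self.speculative = False  # True while an early JMP is unconfirmed

    def early_jmp(self, target):
        # The early JMP changes the PC right away, before it is known
        # whether the instruction above will cancel it.
        self.pc = target
        self.speculative = True

    def resolve(self, older_branch_target=None):
        """Settle the speculation once the instruction above is known.

        older_branch_target: target of an older taken branch (assumed to
        be what cancels the JMP), or None if the JMP was not cancelled.
        """
        if self.speculative and older_branch_target is not None:
            self.pc = older_branch_target  # patch up the errant PC
        self.speculative = False
        return self.pc


pc = ProgramCounter(0x10)
pc.early_jmp(0x80)
print(hex(pc.resolve()))      # JMP stands

pc.early_jmp(0x80)
print(hex(pc.resolve(0x40)))  # JMP cancelled, PC patched to the older target
```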

The reason REPS can work in stage 2 is that it doesn't subtract the repeat-block size from the target PC any earlier than stage 3, where it may be cancelled by a branch in the higher stage, which also cancels its PC-subtraction effect. So REPS is fairly harmless, because it doesn't modify the target PC until it's in the next stage. An early JMP, by contrast, would have to affect the target PC right away, before we know whether we're getting cancelled in the next stage. This leaves a potential mess to be cleaned up. That would not be too hard to accommodate, though. The question is whether it's worth doing for what is basically a hard-wired GOTO. It's mainly CALLs and RETs that need speeding up, but they are perhaps too complicated to handle early.

I guess, in summary, there are strong reasons for having instructions execute at some constant point in the pipeline. It keeps things properly ordered and keeps one sane.

Dave Hein · 2014-03-12 06:53

I have some ideas about handling conflicting instructions at stage 2 and stage 4, but I'm going to hold off posting them until after the new FPGA image is available. I would rather see a new FPGA image than propose any more changes at this point.

Ramon · 2014-03-12 08:31

Please don't hold back any ideas. When will the 'new' image be the last image?

cgracey · 2014-03-12 08:39

Ramon wrote: »

Please don't hold back any ideas. When will the 'new' image be the last image?

We're a couple of iterations away, I think, because we still need to incorporate some USB functions and a SERDES.
