The TESTIN and TESTNIN still function the same, right?
I don't see right away what it could be...
Garryj's 0.8c code only has one REP and it's in a decimal out function, so not the problem.
Tried changing all TESTIN to TESTNIN, but that didn't work either...
Was playing around in Photoshop and noticed a row-order flip option when saving as BMP.
This is perfect for the VGA demos because the image then appears the right way up both on the PC and on the P2 VGA output. I had to tweak the offsets to the data and palette for some reason, though...
The v14 update improved the USB smart pin efficiency, which upset the v0.8c code's end-of-packet detection for both low and full speed. I got things working again on v14a, but I've been diverted to other things over the last few weeks and haven't had a chance to verify that it still works with the v15 FPGA image.
TESTIN and TESTNIN recently switched encodings. PNut handles it properly, but other tools will need to be changed.
USB low/full speed keyboard/mouse v0.9 "demo" and "minimal footprint host" sources updated for the P2 v15 FPGA image.
- End-of-packet detection changed and a few timing tweaks to work with the USB improvements Chip introduced in FPGA image v14.
- Bus transaction error retry logic has been improved.
- Some host routines that are not time-sensitive have been moved to hub exec to free up cog space.
- PNut v15 did its job in regard to REP and TESTIN/TESTNIN.
Neat, this works on my KDE desktop and the viewers seem happy too ... examining, I see the reason is that a negative value (-480) is stored in the image height parameter. See attached. This appears to be a hack but also seems to be well supported for loading and displaying, even if not so much for saving from paint packages.
As for the bitmap offset variation, that looks to be something specific to Photoshop. I note the official method of handling this is to read the value stored at offset 0x0A to find the start of the bitmap.
PS: I still marvel at how BMP format went backwards like that. Someone's retribution for being subjected to little-endian maybe ...
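For anyone wanting to check both points in a file, here is a minimal Python sketch. It assumes the common 14-byte file header followed by a 40-byte BITMAPINFOHEADER; the function name and the sample header values are made up for illustration, and Photoshop's actual offsets are not reproduced here.

```python
import struct

def read_bmp_layout(header: bytes) -> dict:
    """Pull the pixel-array offset and row order out of a BMP header."""
    if header[:2] != b"BM":
        raise ValueError("not a BMP file")
    # 32-bit little-endian offset of the pixel array, stored at 0x0A.
    (pixel_offset,) = struct.unpack_from("<I", header, 0x0A)
    # Width and height are signed 32-bit values at 0x12 and 0x16.
    width, height = struct.unpack_from("<ii", header, 0x12)
    return {
        "pixel_offset": pixel_offset,
        "width": width,
        "rows": abs(height),
        "top_down": height < 0,  # negative height means rows are stored top to bottom
    }

# Hypothetical 640x480 8-bit header: 14 + 40 + 256*4 = 1078 bytes before the pixels.
sample_header = (
    struct.pack("<2sIHHI", b"BM", 0, 0, 0, 1078)
    + struct.pack("<IiiHH", 40, 640, -480, 1, 8)
)
print(read_bmp_layout(sample_header))
```

Reading the offset from the header instead of hard-coding it means the same loader copes with files whose data does not start where you expect.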
I was looking at GETRND the other day... Had to google PRNG, but that's clear now.
But, I'm not sure what is meant by each cog being unique. Also, not sure how to seed it...
There's one central 32-bit LFSR, and each cog sees it through a different bit order and a static XOR mask. It's seeded on reset. You just use it.
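A rough Python model of that idea, purely as a sketch: one shared LFSR state, with each cog reading it through its own scrambling. The tap mask, the rotation used as a stand-in for the per-cog bit order, and the XOR masks are all assumptions chosen for illustration, not the P2's actual values.

```python
MASK32 = 0xFFFFFFFF
TAPS = 0x80200003  # assumed tap mask, not taken from the P2 design

def lfsr_step(state: int) -> int:
    """Advance a 32-bit Galois LFSR by one step."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= TAPS
    return state

def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def cog_view(state: int, cog: int) -> int:
    """What one cog would read: the shared state, reordered and XORed.

    A rotation stands in for the per-cog bit order, and the mask is a
    made-up per-cog constant.
    """
    return rotl32(state, 4 * cog) ^ ((cog * 0x11111111) & MASK32)

state = lfsr_step(1)
for cog in range(4):
    print(cog, hex(cog_view(state, cog)))
```

Every cog reads the same underlying sequence, but no two cogs see the same value on the same clock, which is presumably what "unique" means here.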
PS: I don't know Verilog so I imagine that's probably a factor.
PPS: I know that 32'b0 means a 32bit constant seeded with binary value 0.
PPPS: I know that >> S[4:0] means shift right by a 5bit value taken from either field S or register S.
PPPPS: I take it that D[31:0] and D are the same register but a cycle, or two, apart?
You're just about there. In Verilog, curly braces mean concatenation, so:
{32'b0, D[31:0]}
basically means "a 64-bit value with 32 zeros, followed by the 32 bits in D." I'm guessing the "lower()" just means take the lowest 32 bits (after the shift).
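Here is a small Python model of that reading, as a sketch only. It assumes the description has the form lower({32'b0, D[31:0]} >> S[4:0]), which is pieced together from the fragments quoted above; the function name is made up.

```python
MASK32 = 0xFFFFFFFF

def shift_right(d: int, s: int) -> int:
    """Model of lower({32'b0, D[31:0]} >> S[4:0])."""
    amount = s & 0x1F                # S[4:0]: only the low 5 bits of S count
    wide = (0 << 32) | (d & MASK32)  # {32'b0, D[31:0]}: 32 zeros above D
    return (wide >> amount) & MASK32 # lower(): keep the low 32 bits

print(hex(shift_right(0x80000000, 31)))
print(hex(shift_right(0x1234, 32)))   # S[4:0] is 0 here, so D is unchanged
```

The zeros in the upper half are what get shifted into the top of D, so the concatenation is just a way of writing "zero fill".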
In the same sense that a shift and rotate are kind of interchangeable as naming choices anyway. Virtual and logical are often like that too. Can be confusing if one is trying to use a specific definition - ends up needing extra clarification.
Oh, SCL and SCLU are prefix instructions! Intriguing, what brought that idea about? I'm guessing there is some experience.
EDIT: They are S operand modifiers, like the ALTS instruction, right? Ah, no, it'll be they feed the ALU via the S port. Replacing whatever data the fully decoded S operand would have supplied.
I think I might understand why they are prefixes - It allows for a three operand operation when not wanting to modify either of the multiplicand sources.
I'm not sure of the effectiveness though ... since copying over top of the modified source is only a single instruction away ... so a prefix seems to defeat any possible speed or even size advantage.
Oh, prefixing an ADD with a SCLU is an effective MAC I think. I think the light has turned on.
In computing a FIR filter, you are multiplying and accumulating a FIFO sample buffer with an array of filter coefficients, neither of which can be overwritten. So, SCL is important because it doesn't clobber the inputs.
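To make the point concrete, here is a plain Python sketch of one FIR output sample. The multiply here is an ordinary integer multiply standing in for whatever SCL actually computes, and the function name is made up; the only thing it is meant to show is that the product goes to the accumulator while the sample buffer and coefficients stay untouched.

```python
def fir_output(samples, coeffs):
    """One FIR output: sum of sample*coefficient, inputs left unmodified."""
    acc = 0
    for x, h in zip(samples, coeffs):
        product = x * h   # stand-in for the scaled multiply fed to the next instruction
        acc += product    # the ADD that consumes it
    return acc

samples = (1, 2, 3)   # FIFO sample buffer, must survive for the next output
coeffs = (4, 5, 6)    # filter coefficients, must never be overwritten
print(fir_output(samples, coeffs))
```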
Just had to unroll a big loop.
PNut now gives me a "Relative Address Out of Range" error at my djnz at the end.
I don't really understand this since the place I'm jumping to hasn't changed...
Can't you jump to anywhere in a cog?
Reading the spreadsheet, it says it's a signed relative branch: "** If #S and cogex, PC += signed(S)." That means a max immediate range of -256 for a loop. So that'll be a no.
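A quick way to see why unrolling the loop triggers the error, sketched in Python. It assumes the immediate is a 9-bit signed field (which matches the -256 figure) and that the offset is measured from the instruction after the branch; both the helper names and that second assumption are mine, not from the spreadsheet.

```python
def fits_rel9(offset: int) -> bool:
    """True if the offset fits a 9-bit signed immediate (-256..+255)."""
    return -256 <= offset <= 255

def loop_back_offset(branch_addr: int, target_addr: int) -> int:
    """Relative offset for a branch, assuming PC already points past it."""
    return target_addr - (branch_addr + 1)

# Short loop: DJNZ at cog address 100 jumping back to 10.
print(fits_rel9(loop_back_offset(100, 10)))
# After unrolling, the DJNZ has moved out to address 300.
print(fits_rel9(loop_back_offset(300, 10)))
```

The target never moved, but the distance from the branch to the target grew, and it is the distance that has to fit.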
Comments
I'll try to get to that over the weekend.
Reading your functional descriptions I'm lost trying to interpret your shift instructions. Here's Shift Right as example: Explain that step by step please.
I changed it to this:
Rotate right. D = [31:0] of ({D[31:0], D[31:0]} >> S[4:0]). C = last bit shifted out if S[4:0] > 0, else D[0]. *
Seairth explained it correctly. I know it's kind of cryptic. Not much space and I'm leaning in recent syntax.
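The revised rotate-right description can be modelled the same way. This is a Python sketch of the text as written above, with a made-up function name; it returns the new D and the C flag.

```python
MASK32 = 0xFFFFFFFF

def rotate_right(d: int, s: int):
    """Model of D = [31:0] of ({D[31:0], D[31:0]} >> S[4:0]), plus C."""
    d &= MASK32
    amount = s & 0x1F                  # S[4:0]
    wide = (d << 32) | d               # {D[31:0], D[31:0]}: D twice, side by side
    result = (wide >> amount) & MASK32 # keep bits [31:0]
    if amount > 0:
        carry = (d >> (amount - 1)) & 1  # last bit shifted out of the low end
    else:
        carry = d & 1                    # D[0] when nothing is shifted
    return result, carry

print([hex(v) for v in rotate_right(0x12345678, 4)])
```

Because the upper copy of D supplies the bits that enter from the top, shifting the doubled value right is exactly a rotate.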
That's right.
It seems the instruction is very close to the arithmetic ones, i.e. the naming should follow "shift" rather than "rotate".
SCR/SCL would create a naming conflict with SCL (scale). The carry IS being rotated into D, in a sense.
Suck. I guess that's the end of that idea.
PNut now gives me a "Relative Address Out of Range" error at my djnz at the end.
I don't really understand this since the place I'm jumping to hasn't changed...
Can't you jump to anywhere in a cog?
The change will be a result of HubExec ... even LutExec impacts this design choice.