
Faster PASM code, is it possible?



Bobb Fwed
07-10-2009, 01:29 AM
I have this code, which simply shifts a value out and then shifts a return value in. This is the fewest instructions I could figure out how to use, but I am still new to PASM, so I expect experienced PASM programmers know some tricks. I recently saw a forum thread about 4-clock-per-bit outputs and 8-clock-per-bit inputs (but that was using the CTRs). I don't need it quite that fast, but I am still looking at 5 instructions per bit for both output and input. Maybe there is some way to prep the value so I can shift in or out faster. Any help would be nice.




''****************************************
''* MCP 3008 ADC Frequency Reader v1.0 *
''* Author: Brandon Nimon *
''* Created: 6 July, 2009 *
''***********************************************************************************
''* This PASM-driven object reads the values off of an MCP3008 analog to digital *
''* converter (ADC) over the duration of one second. It mimics a Schmitt trigger, *
''* only clocking cycles after the channel's value has gone from zero to above a *
''* pre-specified value. Jitter can be easily reduced by requiring the "high" input *
''* to last for a certain number of ADC read cycles. The "high" value can also be *
''* easily changed to fit the needs of the input and to widen the "Schmitt gap". *
''* *
''* The program operates at about 1880 samples per megahertz the Propeller is *
''* running at. Which theoretically equates to 940Hz input per megahertz (though *
''* 620Hz or less would be safer assumption). Allowing for up to a 49.6KHz read at *
''* 80MHz clock speed. The PASM code also returns, for use by the programmer, the *
''* maximum and minimum ADC value achieved during the one second read time. *
''* *
''* The program has only been tested at a maximum of 64MHz clock speed. It has also *
''* not been tested using the RX and TX pins from the ADC IC connected on the same *
''* IO pin, but this method should work fine. It is possible at 3.3V, the MCP3008 *
''* may not be able to handle reads at above 107.2ksps. Testing was done on an *
''* MCP3008 at 3.30V and achieved over 120ksps without a problem. *
''* *
''* If precise voltage measurements are needed, be sure to read the MCP3008's *
''* datasheet and note the skew of output values as the shift in/out speed is *
''* increased or decreased. *
''* *
''* The PASM code could easily be altered to read the same values off of a MCP3208. *
''* The 10-bit shift would need to be changed to 12 and the empty shifts would *
''* probably have to be increased as well. This can be done by altering the number *
''* that comes after this: "MOV bits_in, #" in the MAIN LOOP section, and *
''* "MOV emptyclk, #" in the SHIFT LSB IN AND IGNORE section. *
''* *
''* Some things that will affect how the program determines what a frequency is, *
''* can be altered in the DETERMINE FREQUENCY section. Widening the "Schmitt gap" *
''* or just requiring a higher voltage to consider "high" can be altered by *
''* changing the number that comes after this: "CMP val_out, #". Increasing *
''* the value will decrease the number of false frequency clocks, but too high will *
''* eliminate clocks altogether. Waiting for a certain number of "high" cycles can *
''* be changed by altering the number that comes after this: "CMP track, #". *
''* Resolution is reduced as the number of high cycles are increased, but it also *
''* reduces the likelihood of false frequency clocks. *
''* *
''* Example wiring would be as follows: *
''* R1 *
''* ADC─┳──┳──input device *
''* │ │ *
''* C1 R2 *
''* GND GND *
''* R1: 10K (high impedance input is helpful, 10K should be the minimum) *
''* R2: 100K (this effectively creates a voltage divider, but also drives input to *
''* zero volts when not in use) *
''* C1: 0.01µF (10000pF is about the maximum you would want to use. The *
''* capacitor reduces jitter or spikes but also reduces resolution). *
''***********************************************************************************

CON

{ ==[ CLOCK SET ]== }
_CLKMODE = XTAL1 + PLL8X
_XINFREQ = 8_000_000 ' 8MHz Crystal


Vclk_p = 24
Vn_p = 25
Vo_p = 26
Vcs_p = 27

OBJ

DEBUG : "FullDuplexSerial"

VAR

LONG tx_pin ' 12 longs, DO NOT alter order!
LONG rx_pin
LONG ck_pin
LONG cs_pin
LONG bitcnt
LONG channel
LONG done
LONG frequency
LONG maxout
LONG minout
LONG count
LONG duration


PUB Main

DEBUG.start(31, 30, 0, 57600)
waitcnt(clkfreq + cnt)
DEBUG.tx($0D)

repeat
  GET(Vo_p, Vn_p, Vclk_p, Vcs_p, 5, %11010)


PUB GET (DTPin, INPin, ClkPin, RSPin, bitcount, value)

longfill(@done, 0, 4)
longmove(@tx_pin, @DTPin, 6)
duration := clkfreq

cognew(@entry, @tx_pin)
waitcnt(clkfreq - 1000 + cnt)
REPEAT UNTIL (done)

DEBUG.str(string("Cycles: "))
DEBUG.dec(count)
DEBUG.tx($0D)

DEBUG.str(string("Frequency: "))
DEBUG.dec(frequency)
DEBUG.tx($0D)

DEBUG.str(string("Max Value: "))
DEBUG.dec(maxout)
DEBUG.tx($0D)

DEBUG.str(string("Min Value: "))
DEBUG.dec(minout)
DEBUG.tx($0D)

RETURN frequency


DAT
ORG 0
'=====[ START ]=========================================
entry
'-----[ SETUP VALUES, PINS, AND TIMER ]-----------------
MOV p1, PAR
RDLONG p2, p1 ' get data-out (TX) pin
MOV DPin, #1
SHL DPin, p2

ADD p1, #4
RDLONG p2, p1 ' get data-in (RX) pin
MOV NPin, #1
SHL NPin, p2

ADD p1, #4
RDLONG p2, p1 ' get clock (CLK) pin
MOV CPin, #1
SHL CPin, p2

ADD p1, #4
RDLONG p2, p1 ' get reset (CS) pin
MOV CSPin, #1
SHL CSPin, p2

ADD p1, #4
RDLONG Bits, p1 ' get number of output bits

ADD p1, #4
RDLONG val, p1 ' get output value

ADD p1, #4
MOV done_addr, p1 ' get "completed" mark address

ADD p1, #4
MOV freq_addr, p1 ' get frequency address

ADD p1, #4
MOV max_addr, p1 ' get max value address

ADD p1, #4
MOV min_addr, p1 ' get min value address

ADD p1, #4
MOV count_addr, p1 ' get loop count address

ADD p1, #4
RDLONG sectime, p1 ' one second's worth of clocks


MOV OUTA, #0 ' set all low
MOV DIRA, #0 ' set all input
MOV OUTA, CSPin ' set CS pin high (inactive)
OR DIRA, DPin ' set pins we use to output
OR DIRA, CPin ' set pins we use to output
OR DIRA, CSPin ' set pins we use to output

MOV Bits3, Bits ' backup
MOV Bits2, #32 ' set to reset value (32)
SUB Bits2, Bits ' determine number of shifts (difference of output bits and a long)
SHL val, Bits2 ' shift value so first bit to output is at bit 31

MOV val2, val ' backup

MOV val_max, #0 ' set maximum value to zero

MOV val_min, #$1FF ' set minimum value to $1FF (511)
SHL val_min, #1 ' set minimum value to $3FE (1022)
ADD val_min, #1 ' set minimum value to $3FF (1023)

MOV freq, #0 ' clear any possible values
MOV clkreadyh, #0 ' clear any possible values

MOV idx, #0 ' clear loop counter (number of ADC reads)
MOV strt, cnt ' start timer
NEG strt, strt ' get negative (used in place of SUB later on)


'-----[ MAIN LOOP ]-------------------------------------
bigloop
MOV bits_in, #10 ' set to reset value (10)

MOV Bits, Bits3 ' get backup
MOV val, val2 ' get backup

ANDN OUTA, CSPin ' set CS pin low (active)


'-----[ SHIFT COMMAND OUT ]-----------------------------
shift_out
SHL val, #1 WC ' shift output value and place bit 31 in C
MUXC OUTA, DPin ' set data pin to what value bit 31 was

OR OUTA, CPin ' start clock cycle
ANDN OUTA, CPin ' end clock cycle

DJNZ Bits, #shift_out ' cycle through the rest of the value


'-----[ NULL BITS ]-------------------------------------
MOV emptyclk, #2
:empty ' generate empty clocks to ditch unwanted bits
OR OUTA, CPin ' start clock cycle
ANDN OUTA, CPin ' end clock cycle
DJNZ emptyclk, #:empty


'-----[ SHIFT MSB VALUE IN ]----------------------------
MOV val_out, #0 ' set to reset value (0)
shift_in
TEST NPin, INA WC ' if data input pin is high
ADDX val_out, val_out ' add input pin value to output

OR OUTA, CPin ' start clock cycle
ANDN OUTA, CPin ' end clock cycle

DJNZ bits_in, #shift_in ' continue to end of input value


'-----[ SHIFT LSB IN AND IGNORE ]-----------------------
MOV emptyclk, #9
:empty ' generate empty clocks to ditch unwanted bits
OR OUTA, CPin ' start clock cycle
ANDN OUTA, CPin ' end clock cycle
DJNZ emptyclk, #:empty

OR OUTA, CSPin ' set CS pin high (inactive)


'-----[ DETERMINE MAX/MIN VALUE ]-----------------------
MIN val_max, val_out ' set val_max to whichever is highest
MAX val_min, val_out ' set val_min to whichever is lowest


'-----[ DETERMINE FREQUENCY ]---------------------------
CMP val_out, #4 WZ, WC ' if current value is below or equal to "threshold value"
IF_BE JMP #vbelow

vabove
CMP track, #2 WZ, WC ' if val_out has been above "threshold value" for this many times
IF_AE MOV clkreadyh, #1 ' set value to 1 so when val_out goes below "threshold value" we can clock a Hz
IF_AE JMP #move_on
ADD track, #1
JMP #move_on

vbelow
MOV track, #0 ' clear tracking value
TJNZ val_out, #move_on ' don't execute the following unless current value is 0 ("low" -- this gap between "threshold value" and 0 allows for a bit of a schmitt effect)
CMP clkreadyh, #0 WZ, WC
IF_A ADD freq, #1 ' add a clock to Hz value
MOV clkreadyh, #0 ' reset clock ready value

move_on
'-----[ SETUP VARIABLES FOR NEXT LOOP ]-----------------
ADD idx, #1
MOV waitlen, cnt ' get now
ADDS waitlen, strt ' difference of start and now
CMP waitlen, sectime WZ, WC ' if below one second...do another loop

IF_B JMP #bigloop ' do it again!


'-----[ END VALUE GATHERING ]---------------------------
WRLONG idx, count_addr ' number of loops to address
WRLONG freq, freq_addr ' frequency
WRLONG val_max, max_addr ' max voltage over the second-long sample
MOV Bits2, #1
WRLONG val_min, min_addr ' min voltage over the second-long sample
COGID val
WRLONG Bits2, done_addr ' write a non-zero value to done value
COGSTOP val


DPin RES ' tx
CPin RES ' clock
CSPin RES ' cs (reset)
NPin RES ' rx

sectime RES ' one second
waitlen RES ' tmp time
emptyclk RES ' number of empty clocks to execute

freq_addr RES ' output address
max_addr RES ' output address
min_addr RES ' output address
done_addr RES ' output address
count_addr RES ' output address

freq RES ' frequency
clkreadyh RES ' ready to add one to frequency
track RES ' store number of times above "threshold value"
strt RES ' watchdog start time
idx RES

Bits RES ' number of output bits
Bits2 RES ' max number of bits available for output (32)
Bits3 RES ' "backup" of Bits
bits_in RES ' number of value bits to shift in (10)

val RES ' output value (channel number plus %11000)
val2 RES ' "backup" of val
val_out RES ' shifted in value
val_min RES ' minimum val_out
val_max RES ' maximum val_out

p1 RES ' address pointer (for value/pin/address setup)
p2 RES ' temporary value read from address


FIT 496




EDIT: Added full program in CODE section for reference.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
April, 2008: when I discovered the answers to all my micro-computational-botherations!

Post Edited (Bobb Fwed) : 7/9/2009 11:02:25 PM GMT

rokicki
07-10-2009, 01:33 AM
Yes, there are tons of faster ways to do this.

For simple examples, look at the fsrw object in the Obex.

For more complicated examples, look at the existing fsrw repository on sourceforge, where we have some examples where we can shift a value out along with two clock transitions, all at one instruction per bit (20MHz at an 80MHz clock).

For input, we have both two-instruction-per-bit and one-instruction-per-bit routines (the latter we are not yet confident in with respect to actual timing but the two-instruction-per-bit routines are solid).

You look like you are doing SD I/O, so our code is probably something that should be interesting and relevant.

In any case, for output, just dump the value in the PHSA register and shift that (setting the counter mode such that an output pin is always the high bit of the PHSA register). For input, TST/MUX and set one bit at a time. In both cases, unroll your loops.
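
A minimal sketch of the counter setup described here, not code taken from fsrw. It assumes CTRA is free, that the data-out pin is P26 as in the posted CON block (Vo_p), and that DPin already holds that pin's mask. In NCO mode (CTRMODE %00100) the counter drives APIN with PHSA[31], and with FRQA at zero PHSA never changes on its own, so it acts as a shift register that software shifts.

```spin
              ' Sketch only: route PHSA[31] to the data-out pin.
              ' Assumes CTRA is unused and DPin = |< 26 (pin number from the posted CON block).
              mov     frqa, #0                ' nothing is added to PHSA each clock
              mov     phsa, #0                ' bit 31 = 0, so the pin idles low
              movs    ctra, #26               ' APIN = P26
              movi    ctra, #%0_00100_000     ' CTRMODE = %00100 (NCO, single-ended)
              andn    outa, DPin              ' counter output is ORed with OUTA, keep this bit clear
              or      dira, DPin              ' make the pin an output
```

After this, MOV PHSA, val puts the first bit on the pin and each SHL PHSA, #1 presents the next one, which replaces the SHL/MUXC pair in the posted loop.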

Bobb Fwed
07-10-2009, 01:47 AM
I am actually just writing a system to get a frequency from an ADC like the MCP3008/3208. So small output values (5 bits usually) and only 10- or 12-bit inputs. I am at about 1620 samples per second for each 1MHz the propeller is running at (theoretically 129.6K samples per second at 80MHz). I was hoping to make that faster, but at some point I will run into the limit of the ADC (of course slowing it down for the ADC at 3.3V is easy...but I want to be able to hit the limit of the ADC at 5V: 200K samples per second).


Bobb Fwed
07-10-2009, 05:38 AM
rokicki said...
For input, TST/MUX and set one bit at a time. In both cases, unroll your loops.

Well, using TEST/ADDX (not MUX) I was able to get the input down to 2 instructions per bit (plus the CLK cycle and DJNZ).
A couple of other changes got me up to 1879 samples per megahertz, so a maximum of about 150 ksps at 80MHz. That is above the approximate limit of this ADC at 3.3V, which is 107.6 ksps. Of course, I've been testing it at over 120 ksps at 3.3V without any problems.
But I would love to get it up to 227.2 ksps, which is the theoretical maximum of the ADC at 5.5V. I guess I will attempt to use the CTR for shifting out.
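
The figures in this post can be reproduced with a little integer arithmetic. The sketch below is an illustration, not part of the posted object: both method names are made up, and the second method assumes a straight-line interpolation between the MCP3008 rating points of 75 ksps at 2.7V and 200 ksps at 5.0V, which happens to give the 107.6 and 227.2 ksps numbers quoted in the post.

```spin
PUB SamplesPerSecond(samplesPerMHz) : sps
  '' Scale a measured "samples per MHz" figure to the current system clock.
  '' With an 80MHz clock, 1879 gives 150_320 (about 150 ksps).
  sps := samplesPerMHz * (clkfreq / 1_000_000)

PUB AdcLimitTenths(millivolts) : tenthsOfKsps
  '' Assumption: linear interpolation between 75 ksps at 2.7V and 200 ksps at 5.0V.
  '' Result is in tenths of a ksps, rounded to nearest:
  '' 3300 gives 1076 (107.6 ksps), 5500 gives 2272 (227.2 ksps).
  tenthsOfKsps := 750 + ((millivolts - 2700) * 1250 + 1150) / 2300
```

The 5.5V value is an extrapolation past the last rating point, so it is best read as a rough ceiling rather than a guaranteed rate.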


rokicki
07-10-2009, 05:46 AM
Yeah, CTR for shifting it out is easy.

After that you want to consider generating your clock with a counter. Problem with that is your timing is
pretty critical then; you need to make sure your clock edges are where you think they are with respect
to your data generation and sampling.

kwinn
07-10-2009, 06:47 AM
Oops. Took a look at ADC so scratch that idea.

Post Edited (kwinn) : 7/10/2009 12:28:54 AM GMT

Kye
07-10-2009, 06:51 AM
RCR and RCL also accomplish the same thing as ADDX in what you're doing.

So, for shifting out with a counter, you would do something like this... I'm not sure about this, I'm asking rokicki.
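
For reference, a sketch of the RCL form, unrolled as rokicki suggested. NPin, CPin and val_out are the names from the posted program. RCL shifts the destination left and moves C into bit 0, which is the same result as ADDX val_out, val_out. RCR moves C into bit 31 instead, so the bits would land in reverse order at the top of the register and need a fix-up afterwards.

```spin
              mov     val_out, #0             ' clear the result
              ' One group per input bit; the MCP3008 needs 10 groups.
              test    NPin, ina       wc      ' C = state of the data-in pin
              rcl     val_out, #1             ' val_out := (val_out << 1) | C
              or      outa, CPin              ' start clock cycle
              andn    outa, CPin              ' end clock cycle

              test    NPin, ina       wc
              rcl     val_out, #1
              or      outa, CPin
              andn    outa, CPin
              ' ... eight more identical groups ...
```

Unrolling costs a few dozen longs of cog RAM but drops the DJNZ, so each bit takes 4 instructions instead of 5.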

Assuming the variable value is eight bits.

shl value, #24

mov frqa, #0
mov phsa, value
mov frqa, value

...

And the counter would then just add that value to itself and shift it right out at, let's say, 80 MHz.

And then to slow it down you would divide that value into a bunch of parts so it takes longer for it to be shifted out, right?

Using powers of two would mean it would require very little time to do the divide also.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,

Post Edited (Kye) : 7/9/2009 11:58:13 PM GMT

Bobb Fwed
07-10-2009, 10:40 PM
Kye said...
RCR and RCL also accomplish the same thing as ADDX in what you're doing.

So, for shifting out with a counter, you would do something like this... I'm not sure about this, I'm asking rokicki.

Assuming the variable value is eight bits.

shl value, #24

mov frqa, #0
mov phsa, value
mov frqa, value

...

And the counter would then just add that value to itself and shift it right out at, let's say, 80 MHz.

And then to slow it down you would divide that value into a bunch of parts so it takes longer for it to be shifted out, right?
Using powers of two would mean it would require very little time to do the divide also.

I've never been good with the CTRs...

What CTR mode do you use for that code (above)? I think the FSRW object uses MOV CTRA, %00100 << 26 (or the equivalent).

Then, how do you drive the serial clock pin with CTRB?


Kye
07-11-2009, 06:55 AM
After thinking about it, the code I proposed won't work because the value changes as it's shifted out.

Um, you would use NCO mode for low speed and PLL for high speed.

The idea is that you set the FRQA register to zero, and in PLL mode or NCO mode bit 31 of PHSA is the output.

So you can just load the counter with the variable you want to shift out, and then you just (shl phsa, #1) in a loop or unrolled loop and the value is shifted out.
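
What that loop could look like for the 5-bit MCP3008 command, as a sketch. It assumes CTRA is already in NCO mode on the data pin with FRQA = 0, and it reuses val, Bits and CPin from the posted program, with val pre-shifted so the first bit sits in bit 31. The label name is made up.

```spin
              ' Looped form: 4 instructions per bit instead of 5.
              mov     phsa, val               ' first bit appears on the data pin
out_bit       or      outa, CPin              ' start clock cycle
              andn    outa, CPin              ' end clock cycle
              shl     phsa, #1                ' next bit moves into bit 31
              djnz    Bits, #out_bit          ' repeat for the remaining bits
```

Writing the OR/ANDN/SHL group out once per bit removes the DJNZ and brings the cost down to 3 instructions per bit. The final SHL is harmless here because the bits below the command are zero, so the pin ends low.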
