New P2 Silicon Observations

Yanomani · 2018-11-13 01:15

How many hours of electrocardiogram data could a mere 128 Mb SPI flash hold in that configuration?

Women won't get bored if their husbands survive many more decades!

cgracey · 2018-11-13 01:19

Yanomani wrote: »

How many hours of electrocardiogram data could a mere 128 Mb SPI flash hold in that configuration?

Women won't get bored if their husbands survive many more decades!

Actually, it wouldn't work like I thought, because you need to be writing timestamps periodically, so that when you do get a change, you can write it immediately. I'll keep thinking. It would be great to get 32 bits of data, somehow.

Yanomani · 2018-11-13 02:10

I was thinking the same, afterwards.

The first solution that comes to mind is to further reduce the maximum timestamp count to 28 bits (or any higher multiple of 14 bits) for internal registering at the streamer's varying-data detector.

Thus, bit 15 of the timestamp can be constantly 0 during real consecutive memory writes of varying data.

As soon as data becomes steady for two or more consecutive samples, internal bit 15 is set and the counter is expanded to 28 bits (or it can always be 28 bits, with only the least significant 14 bits shown when registered in memory).

When a change occurs, you write two consecutive samples, each with timestamp bit 15 = 1 + 14 bits (LSD then MSD, or the reverse, whichever fits better) of the timer interval totalizer + 16 bits of actual sampled data per write operation, so new data is never lost.

Even if the two consecutive samples hold two differing 16-bit data values, they'll be registered to memory accordingly.

The equality timer totalizer is reset at the first differing sample, preparing for a new round.

Even in cases where the two consecutive samples carry differing data, you know they are only one capture clock apart from each other. Anyway, registering ever-differing samples, each with a timestamp of 0001, is also another boring task.

0.02

cgracey · 2018-11-13 02:57

How about this:

You keep writing 32-bit samples until there is no change. On the first case of no change, you write how many samples you wrote (8 bits) and the elapsed time (24 bits), plus you copy the current offset to the Goertzel X accumulator for retrieval via GETXACC.

In this scheme, you just need a no-change sample at least once every 256 clocks:


change		write long	offset	time
--------------------------------------------
Yes		sample		0	0
Yes		sample		1	1
Yes		sample		2	2
Yes		sample		3	3
No		$04_000004 *	4	4
No				-	0
No				-	1
No				-	2
Yes		sample		5	3
Yes		sample		6	4
No		$02_000005 *	7	5
No (many)			-	6..$FFFEFF
No		$00_FFFF00 *	8	$FFFF00		TIMEOUT!!!
No				-	0
No				-	1
Yes		sample		9	2
No		$01_000003 *	10	3


* offset is written to Goertzel X for retrieval by GETXACC

So, you get to capture 32 pins this way. The 24-bit timer rolls over at $FFFF00 to leave enough time for another 255 samples, in case they show up that late.

To reconstruct what happened, you would do a GETXACC to retrieve the last offset of a samples+timestamp record and work backwards from there.

This would need a 32-bit mask to control what bits it's sensitive to.
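
To make the scheme above easier to follow, here is a minimal Python model of it. It is only a sketch of the proposal as described in this post, not actual streamer behavior. The function names, the `timeout` parameter (exposed so short tests are possible), and the treatment of the very first clock as a change are all assumptions. It also assumes the timer restarts at 0 on the clock after every record, as the first and last segments of the table show.

```python
def encode(pins, mask=0xFFFFFFFF, timeout=0xFFFF00):
    """Model the change-only capture: one 32-bit pin value per clock.

    Returns (longs, last_record), where last_record stands in for the
    offset that would be read back with GETXACC.
    """
    longs = []
    last_record = None
    count = 0      # samples written since the last record (8-bit field)
    time = 0       # clocks since the last record (24-bit field)
    prev = None
    for value in pins:
        # Assumption: the very first clock counts as a change.
        changed = prev is None or ((value ^ prev) & mask) != 0
        prev = value
        if changed:
            if count == 255:
                raise OverflowError("needs a no-change clock every 256 clocks")
            longs.append(value & 0xFFFFFFFF)
            count += 1
            time += 1
        elif count or time == timeout:
            # First no-change after a run of samples, or a timeout.
            longs.append((count << 24) | time)
            last_record = len(longs) - 1
            count = 0
            time = 0
        else:
            time += 1
    return longs, last_record


def decode(longs, last_record):
    """Walk backwards from the last record; return [(clock, sample), ...]."""
    segments = []
    r = -1 if last_record is None else last_record
    while r >= 0:
        count = longs[r] >> 24
        time = longs[r] & 0xFFFFFF
        segments.append((time, longs[r - count:r]))
        r -= count + 1          # the previous record sits just before the run
    segments.reverse()
    events = []
    start = 0                   # absolute clock at which the segment began
    for time, samples in segments:
        first = start + time - len(samples)
        events.extend((first + i, s) for i, s in enumerate(samples))
        start += time + 1
    return events


if __name__ == "__main__":
    longs, last = encode([1, 2, 3, 4, 4, 4, 4, 4, 5, 6, 6])
    print([hex(x) for x in longs], last)
    print(decode(longs, last))
```

The example input reproduces the first rows of the table: four samples, a $04_000004 record, three idle clocks, two samples, then a $02_000005 record. Note that a sample long can look exactly like a record long, which is why the reconstruction has to start from a known record offset and work backwards.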

Yanomani · 2018-11-13 04:30

When the associated Cog retrieves the offset, the Streamer is alerted, since the accumulator resets to zero due to GETXACC execution (there is no RQPIN-like mechanism there, or am I missing something?).

But isn't there a need to establish an interlocking mechanism, so the Streamer is aware that the Cog has finished the reconstruction task?

E.g., in cases where the samples come in equal-valued pairs, or a pair, followed by a triplet, followed by another pair of equal samples, and so on...

Couldn't the Cog lose track, because it couldn't retrieve such a rapid sequence of repeating patterns in time to process each one individually in its backwards reconstruction task?

cgracey · 2018-11-13 04:44

There needs to be a trigger mechanism, too.

It would take a little time to reconstruct, yes.

jmg · 2018-11-13 05:06

cgracey wrote: »

Tubular wrote: »

I'm not sure what I'm more excited about seeing; the bitstreams, or the tool that views them at 1 horizontal pixel per clock.

Great to have your P2 based logic analyzer working again

Yes, that is super useful.

I've been thinking about a logic-analyzer mode for the streamer that will only write on changes and store 16 pins with a 16-bit timestamp. If it times out at 2^16 clocks, it issues the same data with $0000 for time. If no changes were occurring, it would write one long every 2^16 clocks. At 250MHz, that's only 15KB per second.

Yes, that sort of compressed capture gives very wide dynamic range, and all measurements would be 4ns accurate for 250MHz sysCLK.
Would this have a qualifier for both/rise/fall edges to capture dT?
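
For reference, the idle data rate and resolution quoted above can be checked with a few lines of arithmetic. This is just a sketch of the numbers in the quoted post (250 MHz sysclk, one long per 2^16 clocks when nothing changes). The flash figure is an added assumption that ties back to the earlier 128 Mb SPI flash question, taking 128 Mb as 128 × 2^20 bits.

```python
SYSCLK_HZ = 250_000_000        # system clock from the post
TIMEOUT_CLOCKS = 2 ** 16       # 16-bit timestamp times out here
BYTES_PER_LONG = 4             # 16 pins + 16-bit timestamp per write

# With no pin changes, one long is written per timeout period.
idle_longs_per_second = SYSCLK_HZ / TIMEOUT_CLOCKS
idle_bytes_per_second = idle_longs_per_second * BYTES_PER_LONG

# One sysclk period is the timestamp resolution.
resolution_ns = 1e9 / SYSCLK_HZ

# Assumption: a 128 Mb flash holds 128 * 2**20 bits.
FLASH_BYTES = 128 * 2 ** 20 // 8
idle_seconds_to_fill = FLASH_BYTES / idle_bytes_per_second

if __name__ == "__main__":
    print(f"idle rate: {idle_bytes_per_second:.0f} bytes/s")
    print(f"resolution: {resolution_ns:.0f} ns")
    print(f"128 Mb flash fills after {idle_seconds_to_fill / 60:.1f} idle minutes")
```

That works out to roughly 15,259 bytes per second when idle, matching the "only 15KB per second" figure, and 4 ns per clock. Under these assumptions a 128 Mb flash would fill in about 18 minutes even with no pin activity, so any real activity shortens that.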

cgracey · 2018-11-13 05:12

It could have such triggers. It would probably take 3 longs to hold all of the trigger setup, as 3 bits are needed per pin.

Ignore
High
Low
Rise
Fall

Too many possibilities for 2 bits per pin.
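
The bit count is easy to verify: five conditions need 3 bits, and 32 pins at 3 bits each is 96 bits, or 3 longs. The sketch below checks that and shows one way such a setup could be packed and evaluated. The numeric codes for each condition, the packing order, and the function names are hypothetical, chosen only for illustration. The rule that every pin's condition must hold at once is also an assumption.

```python
from enum import IntEnum
from math import ceil, log2


class Cond(IntEnum):
    # Hypothetical codes; the post only lists the five conditions.
    IGNORE = 0
    HIGH = 1
    LOW = 2
    RISE = 3
    FALL = 4


PINS = 32
BITS_PER_PIN = ceil(log2(len(Cond)))          # 5 choices need 3 bits
LONGS_NEEDED = ceil(PINS * BITS_PER_PIN / 32)  # 96 bits fit in 3 longs


def pack(conds):
    """Pack 32 per-pin conditions into LONGS_NEEDED 32-bit longs."""
    value = 0
    for pin, cond in enumerate(conds):
        value |= int(cond) << (pin * BITS_PER_PIN)
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in range(LONGS_NEEDED)]


def triggered(prev, cur, conds):
    """True if every pin meets its condition between two successive samples."""
    for pin, cond in enumerate(conds):
        was = (prev >> pin) & 1
        now = (cur >> pin) & 1
        if cond == Cond.HIGH and now != 1:
            return False
        if cond == Cond.LOW and now != 0:
            return False
        if cond == Cond.RISE and not (was == 0 and now == 1):
            return False
        if cond == Cond.FALL and not (was == 1 and now == 0):
            return False
    return True


if __name__ == "__main__":
    conds = [Cond.IGNORE] * PINS
    conds[0] = Cond.RISE      # trigger on pin 0 rising...
    conds[1] = Cond.HIGH      # ...while pin 1 is high
    print(BITS_PER_PIN, LONGS_NEEDED, pack(conds))
    print(triggered(0b10, 0b11, conds))
```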

cgracey · 2018-11-13 05:15

Although, it's unlikely you could always trigger on multi-pin simultaneous transitions. So, some reduction in setup could be achieved by focusing on a single pin transition for triggering, qualified by other pin states.

cgracey · 2018-11-13 05:17

That would get it down to two bits per pin.

Then it might be nice to be able to qualify all recording. More bits needed.

cgracey · 2018-11-13 05:19

I think someone mentioned using SETPAT's configuration. That's a great idea. 64 config bits there.

Yanomani · 2018-11-13 05:26

After the accumulation has been read by a GETXACC, the Streamer can switch to individual sample registering mode until another GETXACC is executed, indicating that the Cog has finished its task (which can be as fast as registering the last accumulation into a separate buffer of indexes).

If needed, a paired Cog can do the unroll process, including sharing its workload with another pair of Cogs, and so on...

P.S. Sure, provided the first GETXACC has returned a non-zero value.

jmg · 2018-11-13 05:36

cgracey wrote: »

Although, it's unlikely you could always trigger on multi-pin simultaneous transitions. So, some reduction in setup could be achieved by focusing on a single pin transition for triggering, qualified by other pin states.

That's more an Arm/Start condition, and then a capture condition, which is quite a lot of hardware.

I was thinking more of the time-stamp capture conditions, once started, assuming some user software Arm/Start, but that would have a tiny delay from the wait to the streamer start?
How short could that delay be (from a WAIT to a Start Streamer)? Is that 3~4 sysclks?

cgracey · 2018-11-13 05:53

I would think you'd record continuously in a big loop, and then wait for the trigger condition. Then stop a little while later and observe the data. Maybe the 24-bit timestamp can just always be the lower 24 bits of the system counter. A trigger event would just store the counter value. Then, you go through the record to get the pre- and post-data.
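
As a rough illustration of that idea, here is a small Python sketch of recording into a circular buffer, noting the counter value when the trigger fires, and stopping a little while later. It models the concept only. The function name, the `post` parameter, and the use of a plain sample index as the counter are assumptions, not a description of the streamer.

```python
def capture(samples, size, is_trigger, post):
    """Record into a ring of `size` entries; stop `post` samples after trigger.

    Returns (trigger_clock, entries), with entries ordered oldest first as
    (clock, value) pairs. trigger_clock is None if the trigger never fired.
    """
    ring = [None] * size
    head = 0                  # next slot to overwrite
    trigger_clock = None
    remaining = None
    for clock, value in enumerate(samples):
        ring[head] = (clock, value)
        head = (head + 1) % size
        if trigger_clock is None:
            if is_trigger(value):
                trigger_clock = clock   # "store the counter value"
                remaining = post
        else:
            remaining -= 1
        if remaining == 0:
            break
    # Unroll the ring so the oldest surviving entry comes first.
    ordered = [e for e in ring[head:] + ring[:head] if e is not None]
    return trigger_clock, ordered


if __name__ == "__main__":
    trig, data = capture(range(20), size=8,
                         is_trigger=lambda v: v == 10, post=3)
    pre = [e for e in data if e[0] < trig]
    post_data = [e for e in data if e[0] > trig]
    print(trig, pre, post_data)
```

With an 8-entry ring and three post-trigger samples, the buffer ends up holding four pre-trigger entries, the trigger entry, and three post-trigger entries. Choosing `post` relative to the buffer size sets where the trigger lands in the captured window.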

rogloh · 2018-11-13 06:07

A 32 bit inbuilt (let's say 250MHz) logic analyzer capability inside the P2 with sampling compression in HW sounds pretty handy, especially if coupled to an external device such as a Raspi via SMI or PC with some high speed 16 bit parallel bus on the remaining pins for extending the amount of compressed captured data or make it unlimited if the sampling clock is slow enough. Or the P2 could just run it all standalone and provide a nice control interface to a display at the same time. This could get a pretty decent logic analyzer into a lot of people's hands and will help debug their P2 SW/HW interfaces without needing external devices present or anything else hooked up to it. It could become invaluable over time.

However how much further risk/delay does this change add to P2 Rev B I wonder? Just a little? As a bonus last minute feature I guess it's worth the effort but only Chip can know where this might all end, hopefully very well and not mess up Rev B. There are just too many good things that would fit in a P2...

cgracey · 2018-11-13 06:12

It's a small incremental effort. There's really not much to it.

rogloh · 2018-11-13 06:15

Awesome!

pedward · 2018-11-13 19:31

It would be cool to have a level and edge triggered mode, where you set a level for the ADC to match.

cgracey · 2018-11-13 19:44

pedward wrote: »

It would be cool to have a level and edge triggered mode, where you set a level for the ADC to match.

That part could be done in software, as it's relatively slow.

Tubular · 2018-11-13 19:47

Or preset the smartpin using PinA > D for an analog threshold

cgracey · 2018-11-13 19:54

Tubular wrote: »

Or preset the smartpin using PinA > D for an analog threshold

Wow! That's the way to do it, and it sublimates right into the existing circuitry that way.

pedward · 2018-11-13 21:21

That's exactly what I was talking about, perhaps I didn't say it in the correct terminology.

Yes, preset a comparator in the smart pin for level triggering.

jmg · 2018-11-13 22:16

cgracey wrote: »

Tubular wrote: »

Or preset the smartpin using PinA > D for an analog threshold

Wow! That's the way to do it, and it sublimates right into the existing circuitry that way.

What is the Speed and Common mode voltage range of the PinA.V > VDAC(D) ?

Peter Jakacki · 2018-11-13 22:20

With these kinds of features it would help sell the P2 as the go-to chip for low-cost logic analysers and scopes alone. Of course it becomes very easy to do all kinds of protocol analysis too, but even then, if we were analysing 16 serial ports, we could also switch to smartpin receive and have a much larger capture size.

Imagine the front page of many an electronics magazine! Wot! not an Arduino? That's amazing! I want it!

JRetSapDoog · 2018-11-16 17:07

I'm not sure if this ever got closure from pages 15 and 16 of this thread:

cgracey wrote: »

Peter, big question:

Do you have a flash chip that can be booted from with deterministic timing, so that we can verify that, indeed, the big PRNG comes up with different results each time after power up?

cgracey wrote: »

It would be important to make a program that simply does a GETRND and outputs it to the pins. Each time power cycles, it should show something different.

The big PRNG is seeded from ADC noise by the ROM booter.

cgracey wrote: »

__red__ wrote: »

...In other news, we should probably remember to document not to rely on the PRNG seeding for anything specifically security related. It occurred to me today that just shorting rx_pin to ground while powering the chip on may make the seed predictable.

We use an internal GIO calibration mode which doesn't involve the external pin. We actually measure GND, but use the noise from many measurements to seed the PRNG. Doing an ADC conversion on GND doesn't return 0, but about 1/8, averaged over many bits. We use those bits to seed the PRNG. Not only does their duty vary slightly at all times, but the positions of 1's and 0's are ever-changing.

cgracey · 2018-11-16 17:09

Thanks for bringing that back up, JRetSapDoog. I was thinking about this yesterday, but forgot to say anything.

JRetSapDoog · 2018-11-16 17:24

Sure! I figured you'd get back to it sooner or later, but thought I'd mention it just so it wouldn't fall between the cracks. It's been in the back of my mind since I read those posts, so I decided to track down this thread just now. And now that it's back on your radar, I can free up a few neurons. Anyway, randomness (whatever it is) is interesting.

Cluso99 · 2018-11-18 03:29

Can anyone pick what is wrong with this?

The orgh $400 works but just a plain orgh does not. The palette is <$400.
I can change the $400 -> $404, $600, etc and it works fine, but if I make it $300 it fails.
Code is well below this.

  lut_start = 0
--------------------   
                org     0
                .....
'               loc     ptra,#\palette                  '\  copy color palette to lut ram
                loc     ptra,#@palette                  '\  copy color palette to lut ram
                setq2   #16-1                           '|
                rdlong  lut_start,ptra                  '/
--------------------

                orgh    '$400           ' this seems to fail <<<<<<<<<<<<<<<<<<<<<<<<<<
                orgh    $400            ' this seems to work <<<<<<<<<<<<<<<<<<<<<<<<<<
'====================================================================
'24 bit color format = rr_gg_bb_00
'====================================================================
palette         long    0               'black 
                long    $0000aa00       'blue 
                long    $00aa0000       'green 
                long    $00aaaa00       'cyan 
                long    $aa000000       'red 
                long    $aa00aa00       'magenta 
                long    $aa550000       'brown 
                long    $aaaaaa00       'gray 
                long    $55555500       'dark gray 
                long    $5555ff00       'bright blue 
                long    $55ff5500       'bright green 
                long    $55ffff00       'bright cyan 
                long    $ff555500       'bright red 
                long    $ff55ff00       'bright magenta 
                long    $ffff5500       'yellow 
                long    $ffffff00       'white 
'====================================================================
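
For anyone checking the palette values, the rr_gg_bb_00 layout noted in the listing can be sketched as below. This only illustrates the byte layout stated in the comment; the helper names are made up for the example, and it says nothing about the orgh problem itself.

```python
def pack_rgb(r, g, b):
    """Pack 8-bit red, green, blue into a long laid out as rr_gg_bb_00."""
    return ((r & 0xFF) << 24) | ((g & 0xFF) << 16) | ((b & 0xFF) << 8)


def unpack_rgb(value):
    """Split an rr_gg_bb_00 long back into (r, g, b)."""
    return (value >> 24) & 0xFF, (value >> 16) & 0xFF, (value >> 8) & 0xFF


if __name__ == "__main__":
    print(hex(pack_rgb(0xAA, 0x55, 0x00)))   # the 'brown' entry
    print(unpack_rgb(0x5555FF00))            # the 'bright blue' entry
```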

thej · 2018-11-18 04:59

Chip, is there anything that needs attention with regard to managing code between 2, 4, 8 & 16 core variants of the P2?
Just thought I would put it out there.

J

cgracey · 2018-11-18 05:23

thej wrote: »

Chip, is there anything that needs attention with regard to managing code between 2, 4, 8 & 16 core variants of the P2?
Just thought I would put it out there.

J

You will have varying hub delays. Otherwise, it should be the same.
