@ersmith, I have some changes to enable higher baud rates like 2000000 on Macs using your latest loadp2 codebase I grabbed today. I fixed your Makefile to enable it to also build Mac-specific code and put in the higher baud rate changes to the osint_linux.c file. This Makefile.txt is just renamed from Makefile to let me post it here.
Thanks, Roger! Those changes look good and I've put them up in my repo.
Was able to download and run my prior p2gcc-based MicroPython binary and run Chip's earlier HDMI demo on this rev B EVAL board. The colourful 16bpp bird image presented looks very nice (albeit upside down) on my old Dell 2405FPW, which syncs fine to the 640x350 image it is being sent. I don't notice any shimmer or pixel sparkles etc with the magnifying glass. Sending over a 1.8m HDMI to DVI cable with the Digital Video Out breakout board. The rev B P2 chip is barely even warm at 250MHz.
This type of Dell LCD monitor is really great as it seems to sync to all sorts of things you throw at it. In the past I recall it even working down to displaying just a handful of active scanlines which it scales up to the full screen.
I do hope that the USB code works out fine operating at 250MHz, because having HDMI and USB together is going to be really compelling now, and I don't think overclocking it to 250MHz is going to be a drama for many people given the headroom being reported. There's no USB 12 MHz oscillator frequency multiple requirement, is there?
It's a few years since I played with USB on P1. Back then I decided that 96MHz worked much better than non-12MHz multiples, so I downgraded my overclocked boards from 104MHz to 96MHz. Maybe I should have gone to 108MHz?? And then I never completed the project.
... There's no USB 12 MHz oscillator frequency multiple requirement, is there?
FWIR the USB development was done at 80MHz, and I’m not sure if a min MHz was determined on P2 silicon ?
- Someone chasing min power might be interested in lowest MHz ?
MCUs mostly use 48MHz for their USB clocks, but P2 may have SW paths that dictate min MHz ?
It is unlikely 150~250MHz SysCLKs care at all about the MHz, but some care is likely needed to get the NCO at ~12.000MHz averaged.
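Just to put rough numbers on the 12MHz question (my arithmetic, not from the USB code itself): full-speed USB wants a 12MHz bit clock, and only some sysclks divide to that exactly. 240MHz / 12MHz = 20 and 252MHz / 12MHz = 21, both exact, while 250MHz / 12MHz = 20.833..., so at 250MHz the NCO has to dither between 20 and 21 sysclks per bit to average out at 12.000MHz.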
With this rev B EVAL board I was just able to get my Dell LCD to sync fine to DVI/HDMI operating at 252MHz, which is an exact multiple of 12MHz and may help with USB clocking frequencies if 250MHz is not ideal (am still hoping 250MHz is going to work fine with USB, @garryj may know more there). This rate goes slightly outside the normal timing spec for sending VGA over HDMI, and some other display devices may be less forgiving of this, but I'm still happy this frequency apparently works on my own setup.
The descriptions are different and I haven't familiarised myself with them properly. The VGA code uses 7f00_0000 and cf00_0000 and I quickly changed them to 7f08 etc. But it's still not quite right. Mind you, I didn't have much time to read it through, so a quick hint should help me on my way.
I found for VGA I had to change the CF00_0000 values to 7F01_0000.
It all started behaving itself after that.
Changing my $7F00_0000 to $7F08_0000 and my $CF00_0000 to $7F01_0000 got it working. Now I can watch video again and view BMPs and plot my Mandelbrots!
Here's my basic parameter table. I still need to update the bit field comments.
' 01 '
_hpixels long 640
_vlines long 480
' 03 FP,SY,BP,DIV
v_bs byte 16,96,48,1 '5,5,20,2
' 04 '
_fpix long fpixn ' pixel frequency'
_fset long round(fset) ' pixel nco value'
_palette long bmporg+$100 ' address of palette (normally $FC00)'
_screen long bmporg+$500 ' address of screen (normally $10000)'
' 08 $7F01 = output X3, X2, X1, X0 on all four DAC channels'
m_bs long $7F010000+55 'before sync <long> 32-bit immediate
m_sn long $7F010000+20 'sync
m_bv long $7F010000+110 'before visible
' 11 '
bpp long 8
'*1 note: The pixel parameter is calculated and added during init'
' 12 %0111_0001_eppp_xxxx <long> 32-bit immediate $xxxxxxxx
m_vi long $7F01_0000 ' *1
' 13 %0111_1000_eppp_bbbb - 8-bit RFLONG LUT $xxxxxxxx
m_rf long $7F08_0000 ' *1
' 14 '
_hbytes long 640
dacmode_s long %0000_0000_000_1011000000000_01_00000_0+vgacog<<8 'hsync is 123-ohm, 3.3V
dacmode_c long %0000_0000_000_1011100000000_01_00000_0+vgacog<<8 'R/G/B are 75-ohm, 2.0V
With this rev B EVAL board I was just able to get my Dell LCD to sync fine to DVI/HDMI operating at 252MHz, which is an exact multiple of 12MHz and may help with USB clocking frequencies if 250MHz is not ideal (am still hoping 250MHz is going to work fine with USB, @garryj may know more there). This rate goes slightly outside the normal timing spec for sending VGA over HDMI, and some other display devices may be less forgiving of this, but I'm still happy this frequency apparently works on my own setup.
Some quick/dirty testing shows that at 250MHz USB works OK at both full-speed and low-speed (revA EVAL silicon). In my testing with VGA related sysclocks, only 148.5MHz (so far) has triggered USB frame jitter.
With this rev B EVAL board I was just able to get my Dell LCD to sync fine to DVI/HDMI operating at 252MHz, which is an exact multiple of 12MHz and may help with USB clocking frequencies if 250MHz is not ideal (am still hoping 250MHz is going to work fine with USB, @garryj may know more there). This rate goes slightly outside the normal timing spec for sending VGA over HDMI, and some other display devices may be less forgiving of this, but I'm still happy this frequency apparently works on my own setup.
252 MHz is much closer than 250 MHz to the HDMI 640x480 spec of 251.75 MHz.
Some quick/dirty testing shows that at 250MHz USB works OK at both full-speed and low-speed (revA EVAL silicon). In my testing with VGA related sysclocks, only 148.5MHz (so far) has triggered USB frame jitter.
Glad to hear that both 252MHz and 250MHz should work ok with USB. It's a slight pity that 148.5MHz has issues as that is probably a nice frequency of choice for HDTVs. I wonder if the rev B silicon will help in any way with its PLL changes or if 148.5MHz is just inherently a bad system clock value for getting USB working nicely. Did this frame jitter you observed cause actual USB data transfer errors/retransmissions etc or is it still able to recover all the data in the presence of the jitter?
At 148.5MHz my USB analyzer starts showing frame timing jitter (full-speed). The USB spec for a full-speed frame interval is 1.000ms +/- 500ns, and at 148.5MHz there are frames transmitted that are outside of that range. The deviation I'm seeing is in the ~500us to 604us range. The devices I test with still work, so they likely are a bit loose with the spec. 148MHz and 149MHz stay within range, so maybe the 500KHz XDIV value is a bit of a stretch? My knowledge of NCO workings is almost nil.
At 148.5MHz my USB analyzer starts showing frame timing jitter (full-speed). The USB spec for a full-speed frame interval is 1.000ms +/- 500ns, and at 148.5MHz there are frames transmitted that are outside of that range. The deviation I'm seeing is in the ~500us to 604us range. The devices I test with still work, so they likely are a bit loose with the spec. 148MHz and 149MHz stay within range, so maybe the 500KHz XDIV value is a bit of a stretch? My knowledge of NCO workings is almost nil.
Are you saying the protocol requires a 1.000ms frame interval, but you're generating a 500us to 604us interval? Or, did you mean you are generating 500ns to 604ns of jitter?
You could toggle between two different adjacent baud rates to dither the baud, making timing more accurate.
252 MHz is much closer than 250 MHz to the HDMI 640x480 spec of 251.75 MHz.
With respect to regular VGA (not sent over HDMI), an even 25MHz dot clock makes me a bit nervous because I don't know if it's guaranteed to work. I came across this just now:
With analogue VGA monitors you can usually get away with using a 25 MHz pixel clock. However, based on the VESA tolerance of 0.5%, 25 MHz is not acceptable and displays may reject it. Note that 25.2 MHz is considered acceptable by VESA, which gives a 60 Hz refresh rate (rather than 59.940 Hz).
So 252 MHz divided down by 10 is closer to 25.175 MHz, and perhaps it has less jitter.
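Quick numbers on that (just arithmetic): 252MHz / 10 = 25.2MHz and 250MHz / 10 = 25.0MHz. Applying the VESA 0.5% tolerance quoted above, 25.175MHz +/- 0.5% is roughly 25.05 to 25.30MHz, so 25.2MHz sits inside the window while 25.0MHz falls just outside it.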
At 148.5MHz my USB analyzer starts showing frame timing jitter (full-speed). The USB spec for a full-speed frame interval is 1.000ms +/- 500ns, and at 148.5MHz there are frames transmitted that are outside of that range. The deviation I'm seeing is in the ~500us to 604us range. The devices I test with still work, so they likely are a bit loose with the spec. 148MHz and 149MHz stay within range, so maybe the 500KHz XDIV value is a bit of a stretch? My knowledge of NCO workings is almost nil.
I think you mean 500~604ns of jitter ?
148.5 is /12.375 to get 12MHz, or /(12+3/8), which suggests 3 fractional NCO bits are enough ?
500ns of deviation is quite a few sysclks at 148.5MHz
What is the USB NCO available step size ? Can 1 either side give a better average ?
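Putting numbers on those 3 fractional bits (my arithmetic only): 148.5MHz / 12MHz = 12.375 = 12 + 3/8 sysclks per USB clock, so over any run of 8 USB clocks the NCO only needs to stretch 3 of them to 13 sysclks and leave 5 at 12 sysclks (5x12 + 3x13 = 99, and 99/8 = 12.375) to average exactly 12.000MHz. Dithering like that should only contribute jitter on the order of one sysclk, i.e. under ~7ns, nowhere near the hundreds of ns being reported.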
Are you saying the protocol requires a 1.000ms frame interval, but you're generating a 500us to 604us interval? Or, did you mean you are generating 500ns to 604ns of jitter?
You could toggle between two different adjacent baud rates to dither the baud, making timing more accurate.
Doh! Yes, 500ns to 604ns of jitter. The 1.000ms frame interval is controlled by a timer that triggers a frame transmission interrupt routine. I've tested with a dozen or so different sysclocks, and 148.5MHz (so far) has been the only outlier that generates this issue. I'll do some tweaking on both the frame timer and the USB NCO step size to see what happens.
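For what it's worth, the frame-interval arithmetic itself is clean at that sysclk: 1.000ms at 148.5MHz is exactly 148,500 sysclks with no fractional remainder, and the +/- 500ns spec window is about +/- 74 sysclks. So whatever is drifting amounts to tens of clocks, not a rounding artefact in the timer value (just arithmetic on my part, I haven't looked at the driver).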
I've just started trying to observe the sysclock jitter on the revA chip and compare it with the revB. After trying a few different approaches and different ways of using the scope, I've settled on the scope's "Repetitive" sampling mode. I think this mode is probably intended just for observing clock jitter.
Just now I've targeted the 148.5 MHz zone and in doing so discovered that the revA chip has moods! It can jump from stable to unstable and back again without warning. Also, lower PLL frequencies are noisier in general.
The revB chip is solid above 80 MHz on the PLL (pre-XDIVP); it progressively gets jittery lower down, so we definitely need to exercise XDIVP from now on.
Nothing notable observed with 148.5 MHz on either chip.
With XDIVP = 1, anything below something like 25 MHz on the revB PLL can be extreme (high XDIVs are worse): big jumps in jitter, easily bad enough to corrupt serial comms. 20-21 MHz is a particularly bad spot, maybe related to the 20 MHz crystal frequency. 10 MHz and lower is a pile of potholes.
Here's a screenshot of a very stable clock on the revB Eval board running a 400 MHz sysclock producing a 1 MHz square wave on pin0. The trigger point is the first rising edge at the 5% position. The bottom half of the display is a 20x zoom of the top half, showing the final low-going transition in detail at 10 ns/div. What isn't obvious at first glance is that the scope's sampling hardware is 200 MS/s (megasamples per second), yet at the top it says 50 GS/s. What's going on is that it actually triggers on many cycles of the "repeating" square wave to progressively build small pieces of the total 100,000 samples. The timing is accurate enough to emulate all the intervening samples needed for an effective 50 GS/s ... provided the signal keeps repeating exactly the same all the time. So, if there were any jitter over what is a relatively large time period, it would splatter the display.
XDIV = 10
XMUL = 200
XDIVP = 1
Impressive scope, Evan.
What initial divider and multiplier are you using?
With Rev1 silicon jitter was more apparent the lower the base frequency ie the lower frequency after the first divider. This resulted in a higher PLL multiplier, which seemed to be the culprit.
When I divided the crystal down to 0.5MHz, multiplied by 297 and used a final divide of 1, there was a lot of jitter on the VGA running at 148.5MHz. But when using a 3MHz base, a multiplier of 99, and a final divide of 2, it was way more stable.
PS: I've added the HUBSET mode config to the above post.
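Checking those two setups with a quick sum (my arithmetic, going only on the figures above): 0.5MHz x 297 / 1 = 148.5MHz runs the VCO itself at only 148.5MHz with a 0.5MHz comparison frequency, whereas 3MHz x 99 / 2 = 148.5MHz runs the VCO at 297MHz with a 3MHz comparison frequency and then divides down. Same output, very different internal frequencies, which would fit the general observation that the PLL is noisier at lower frequencies.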
And here's what the same revB chip looks like with a rough, comport-corrupting result at a 20.5 MHz sysclock. The comport actually still worked!
XDIV = 40
XMUL = 41
XDIVP = 1
And this one, on the same revB chip, is the same 20.5 MHz sysclock but with the PLL up at 123 MHz, so all is good again.
XDIV = 40
XMUL = 246
XDIVP = 6
With Rev1 silicon jitter was more apparent the lower the base frequency ie the lower frequency after the first divider.
It looks like it's more mixed than that too. On the revA chip, the scope is telling me XDIV of 5, 7 and 17 are also a problem. It comes and goes from 20 and up.
@garryj we should also check USB works with 270MHz operation too as that allows HDTV resolutions over HDMI instead of VGA.
Yes, TonyB and JRetSapDoog, I agree that 252MHz would be better for sending VGA over DVI/HDMI, not 250MHz. 250MHz happened to be the frequency set up in Chip's original HDMI demo code and it worked on my setup, though officially VGA is meant to use a pixel rate of 25.175MHz. Being as close as possible to official VGA timing should help a larger range of display devices be supported by P2s, in particular HDTVs.
The revB chip is solid above 80 MHz on the PLL (pre-XDIVP); it progressively gets jittery lower down, so we definitely need to exercise XDIVP from now on.
That's largely expected - at the lowest MHz the VCO is very voltage-starved, so it will quickly deteriorate.
Most MCU vendors give some MHz band for VCO correct operation.
I think you are saying at no point is Rev B worse than Rev A, which is good to hear ?
@garryj we should also check USB works with 270MHz operation too as that allows HDTV resolutions over HDMI instead of VGA.
Didn't see any issues at 270MHz.
Not much progress regarding the 148.5MHz issue. At 148MHz and 149MHz there is no "frame jitter". At 148.5MHz the USB data transfers are clean, too -- squeaky clean. The "frame jitter" complaints all happen when the host/driver is polling the device for mouse/keyboard changes. When no data is available the device will issue an IN-NAK packet, and there are lots of them. The keyboard driver keeps a NAK count to trigger key auto-repeat, so they're almost used like a strobe. There may be something the analyzer sees during this exchange that I don't yet understand. But at this stage, whatever it is that the analyzer doesn't like is not having any effect on data transfers. Since the USB has its own NCO, maybe there are some subtle timing differences that show up at certain sysclock/USB pairings?
Today I played around with the frequencies of the revision B P2 and managed to drive my HDTV over an HDMI cable with it clocking at 270MHz. The resolution is 720x480@60Hz (480p60) which is an official TV resolution with a pixel rate of 27MHz. My Pioneer plasma recognized it and scaled accordingly.
I have just used the same test image from Chip's demo, which is only 640x350 16bpp, but I padded it with 40 pixels of red and green at the sides using immediate mode output to make up the 720-pixel width, until I find/make a nicer test image that is already 720 pixels wide. The garbage at the bottom is where the streamer runs out of valid image pixels in hub RAM. Obviously it needs more RAM for holding an entire frame buffer at this resolution and 16bpp colour depth.
I'd quite like to look at that HyperRAM board soon and see if I can stream some pixel data out of it. The problem I see is that this particular HyperRAM only runs at up to 100MHz (200MB/s DDR), but ideally for a frame buffer we might like to clock it at 135MHz (or 126MHz for VGA) as that would be half of the P2 system clock used at these video output resolutions. It now seems like it will have to be clocked at only a quarter of the P2 clock, or 67.5MHz / 63MHz instead, which probably gets you transfers in the vicinity of 120MB/s depending on burst length. It would be much better if the RAM was rated to clock up to ~133MHz at 3.3V but for hitting that RAM clock speed right now you need to operate it at 1.8V. Streaming from this HyperRAM at ~120MB/s after overheads should still support 27M pixels per second at 480p60 using a 32-bit aligned truecolor pixel layout in memory (which is somewhat wasteful), but then it has minimal memory bandwidth remaining for any other reads/writes. Using a 16bpp mode instead would leave you approximately half the bandwidth for any other reads/writes and is likely the better choice, or you could even go down to 8 bit colour.

Maybe we could gang up two HyperRAMs to effectively make a 16 bit bus instead of an 8 bit bus for faster transfers with the higher colour depths. I wonder if they could then share clocks and chip selects and run from one single COG? Or you could run them individually on a COG each, I guess.
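Rough bandwidth sums to back that up (my arithmetic, assuming the ~120MB/s usable figure holds): 480p60 needs 27M pixels/s, which is 108MB/s at 32bpp, 54MB/s at 16bpp and 27MB/s at 8bpp. Against ~120MB/s of sustained HyperRAM bandwidth that leaves almost nothing spare at 32bpp, roughly half the bandwidth free at 16bpp, and plenty at 8bpp, which is why 16bpp looks like the sweet spot. A ganged 16-bit-wide arrangement would roughly double all of those margins.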
Some of the simple tweaks I used with Chip's demo code for getting this video resolution:
hubset ##%1_000001_0000011010_1111_10_00 'configure PLL, 20MHz / 2 * 27 * 1 = 270MHz
...
callpa #25,#blank 'top blanks
mov x,##480 'set visible lines
line call #hsync 'do horizontal sync
xcont m_frame,color1
xcont m_rf,#1 'do visible line
xcont m_frame,color2
djnz x,#line 'another line?
callpa #18,#blank 'bottom blanks
...
color1 long %1011110000_0111110000_0111110000_10 ' red(ish)
color2 long %0111110000_1011110000_0111110000_10 ' green(ish)
m_frame long $7F910000+40 'frame borders 40 pixels
m_bs long $7F910000+32 'before sync
m_sn long $7F910000+64 'sync
m_bv long $7F910000+42 'before visible
m_vi long $7F910000+720 'visible
m_rf long $BF950000+640 'visible rfword rgb16 (5:6:5)
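A quick cross-check of those numbers (my arithmetic, assuming the streamer counts are in pixels at sysclk/10): 32 + 64 + 42 + 720 = 858 pixel clocks per line, which matches the standard 480p horizontal total of 858, and 270MHz / 10 gives the 27MHz pixel rate, so 858 pixels x 525 lines at 27MHz works out to the usual ~59.94Hz frame rate.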
That's largely expected - at the lowest MHz the VCO is very voltage-starved, so it will quickly deteriorate.
Most MCU vendors give some MHz band for VCO correct operation.
I think you are saying at no point is Rev B worse than Rev A, which is good to hear
Yep. Although, it's the first time I've done this so I'm kind of giving a running commentary of my discovery at the same time.
Today I played around with the frequencies of the revision B P2 and managed to drive my HDTV over an HDMI cable with it clocking at 270MHz. The resolution is 720x480@60Hz (480p60) ...
Great mode choice. Not pushing too hard on the overclocking, and it's a good reliable fit mode-wise. Pixel doubling of that for games would be the way to go.
... It would be much better if the RAM was rated to clock up to ~133MHz at 3.3V but for hitting that RAM clock speed right now you need to operate it at 1.8V.
Baby steps. Get that going then see what happens with the RAM clock doubled. It may not complain at all.
What's interesting with the HyperRAM spec is that at 1v8 they spec 100, 133 or 166 MHz operation.
By the time you're at 3v0 you've only got 100 MHz operation available. I wonder what the midpoint voltage is, where 133 MHz still works? Could we be lucky at 2v5?
Chip has said previously that I/O would be slow and weak at 1v8 VIO, but perhaps there is some compromise in the middle. Not trying to distract from the excellent testing... just an observation
What's interesting with the HyperRAM spec is that at 1v8 they spec 100, 133 or 166 MHz operation.
By the time you're at 3v0 you've only got 100 MHz operation available. I wonder what the midpoint voltage is, where 133 MHz still works? Could we be lucky at 2v5?
Chip has said previously that I/O would be slow and weak at 1v8 VIO, but perhaps there is some compromise in the middle. Not trying to distract from the excellent testing... just an observation
That's a good point. We thought about adding an LDO to run the memory just above the P2's IO logic threshold voltage, but for demo and development we decided to keep things as simple and standard as possible, especially as the possibility exists to use up to 3 of these hyper mem breakouts for experimenting with 2x or 3x faster access to data; and that nicely demos the abilities of cogs too.
So yes, tweaking vcc is one option, or multiple modules another.
Another thing about the hyper breakout PCB is that dual RAM could be used instead of 1 RAM + 1 flash. That would need an intrepid user with a hot air gun, but it's another option too! Maybe if there's enough demand, Parallax might be able to stock that configuration.
I wish I could take credit for loadp2, but it's @"Dave Hein"'s work -- I'm just doing some minor updates to it.
Here's a binary of the new loadp2 which should allow for fast loading on the new chips.
Eric
Also, with the cooler running revB chip there is now some clock rate headroom to go for a 16:9 aspect screen mode.