Micropython for P2

rogloh · 2020-01-09 22:13

Yeah I'll have to download and install the latest loadp2 and try your scripting stuff out ersmith. It would be good to help fix the multiple paste requirement issue I hit. In comparison I found my native MicroPython can absorb the full amount of text in this case without issues, but ozpropdev also mentioned my version has a problem in RAW mode when code is sent. I suspect that is because the CPU is not given the time it normally gets to process each text line while it echoes the received data in non-RAW serial modes. The main issue here is there is no serial flow control implemented and buffers can only delay the inevitable if data is continually being sent faster than it can be processed in MicroPython. Perhaps an XON/XOFF option may be needed in loadp2 one day if we don't have any proper HW flow control on our P2 boards. In theory XOFF could be sent out if the receive buffer fills beyond a certain threshold. If loadp2 interpreted this it could slow the transmission accordingly. Maybe in time we can add it into loadp2 and our Python implementations to fix this issue.
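To make the XON/XOFF idea concrete, here is a minimal sketch of the receiving side. It is not code from loadp2 or from either MicroPython port. The class name, the `send_byte` callback and the threshold values are all hypothetical; only the XON (0x11) and XOFF (0x13) control codes are standard ASCII.

```python
XON = 0x11   # ASCII DC1: sender may resume
XOFF = 0x13  # ASCII DC3: sender should pause


class FlowControlledRx:
    """Receive buffer that asks the sender to pause (illustrative only)."""

    def __init__(self, send_byte, high_water=48, low_water=16):
        self.buf = bytearray()
        self.send_byte = send_byte    # hypothetical callback: transmit one byte
        self.high_water = high_water  # send XOFF at or above this fill level
        self.low_water = low_water    # send XON at or below this fill level
        self.paused = False

    def on_receive(self, byte):
        """Store one incoming byte; request a pause if the buffer is filling up."""
        self.buf.append(byte)
        if not self.paused and len(self.buf) >= self.high_water:
            self.send_byte(XOFF)
            self.paused = True

    def read(self, n):
        """Hand up to n bytes to the consumer; resume the sender once drained."""
        data = bytes(self.buf[:n])
        self.buf = self.buf[n:]
        if self.paused and len(self.buf) <= self.low_water:
            self.send_byte(XON)
            self.paused = False
        return data


# Usage: collect the control bytes in a list instead of a real UART.
sent = []
rx = FlowControlledRx(sent.append, high_water=4, low_water=1)
for b in b"abcd":
    rx.on_receive(b)
chunk = rx.read(3)
```

The gap between the two thresholds is there so the receiver does not toggle XON/XOFF on every byte. The high-water mark also has to leave room for whatever the host sends before it reacts to XOFF.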

I have another sample python file I am working on that includes the video timing stuff too so people can try out other resolutions or DVI/TV modes etc. I'll try to tidy it up and post it as well, maybe I'll add the mouse to it.

One thing I found is that I was unable to malloc much more than 16kB from the heap to get a frame buffer. I think MicroPython is limiting the maximum block size it can allocate. I found this was about 16kB on my own native MicroPython and might be a bit higher on your RISC-V version due to having a bigger heap to begin with, but it still wasn't enough for a 1 bpp frame buffer of 38400 bytes at VGA resolution. I guess for doing graphics frame buffers in MicroPython we may want to have HyperRAM fitted and its driver enabled unless this block size limitation can be overcome in the MicroPython port. With MicroPython up and running on a P2 there won't be a vast amount of free HUB memory left over, so only smallish frame buffers and/or limited bit depths may be possible. But given there are different regions allowed on the screen it can still be handy to use smaller than full screen sized frame buffers so you can show both text and some graphics together etc.

ersmith · 2020-01-09 22:29

rogloh wrote: »

One thing I found is that I was unable to malloc much more than 16kB from the heap to get a frame buffer. I think MicroPython is limiting the maximum block size it can allocate. I found this was about 16kB on my own native MicroPython and might be a bit higher on your RISC-V version due to having a bigger heap to begin with, but it still wasn't enough for a 1 bpp frame buffer of 38400 bytes at VGA resolution.

How are you trying to allocate it? I was able to do:

  x=bytearray(128000)

without any issue, although probably larger buffers would be a problem. As you say, there isn't too much HUB memory free with MicroPython, but I think there should be enough for 320*240 8bpp or 640*480 1bpp.
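For reference, the buffer sizes mentioned here can be checked with a few lines of Python. This is only a sketch that assumes a tightly packed buffer with no row padding; the function name is made up for illustration.

```python
def framebuffer_bytes(width, height, bpp):
    """Bytes needed for a packed frame buffer with no per-row padding."""
    bits = width * height * bpp
    return (bits + 7) // 8  # round up to a whole byte


# The two modes mentioned above, compared with the 128000-byte test allocation.
for w, h, bpp in ((640, 480, 1), (320, 240, 8)):
    size = framebuffer_bytes(w, h, bpp)
    print(w, "x", h, "at", bpp, "bpp needs", size, "bytes; fits in 128000:", size <= 128000)
```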

I really like what you've done with the video driver, the possibilities seem very exciting!

Eric

rogloh · 2020-01-09 22:43

I used bytearray(size) as well but within the gfxtest() function, perhaps that was the difference. I'll try at global scope instead, maybe that is the key and it was being allocated on the stack instead of the heap.

Update: yeah that fixed it by allocating it at global scope.
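A sketch of the pattern that worked: allocate the buffer once at global scope and pass it into the function that draws. The thread only reports that moving the allocation fixed the problem, and the stack-versus-heap explanation is a guess rather than a confirmed cause. The `gfxtest` body here is a hypothetical stand-in, not the real demo code.

```python
# Allocate once at module (global) scope rather than inside the function.
FB_SIZE = 38400                # 640x480 at 1 bpp
framebuf = bytearray(FB_SIZE)  # zero-filled


def gfxtest(buf):
    """Hypothetical stand-in for the demo: draws into a buffer it is given."""
    for i in range(80):        # 80 bytes = one 640-pixel row at 1 bpp
        buf[i] = 0xFF          # light every pixel in the top row
    return buf


gfxtest(framebuf)
```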

ozpropdev · 2020-01-09 23:49

Here's an example of the technique I've used on both Roger's and Eric's Micropython versions to attach my stuff to their binaries.

Basically I use the input selector feature of the smartpins to "patch" the serial streams so I can intercept and inject data.

Smartpin             I/O Pin

#2   <-------------- #63

#1   <-------\
|            |
#0   -----\  |
          |  |
#63  <----/  |
             |
#62  --------o-----> #62

ersmith · 2020-01-10 00:43

That's a pretty cool use of the smartpins, Brian!

As I mentioned above I had added some hooks in an attempt to make this easier, although it looks like you've got a framework already. One thing that I thought users might want to do would be to swap their I/O devices in and out at run time -- although removing devices is probably harder than adding them.

rogloh · 2020-01-10 00:50

I like it. Having the flexibility in the P2 to sniff/filter the serial traffic of another existing COG in this manner could become useful for debug etc without changing the original COG code or needing external host devices to be coded up for specific monitoring/testing etc. You can just do it internally within the P2.

As an alternative to the ozpropdev approach, I think a fully integrated setup for an interactive/standalone MicroPython with local KBM + VGA and serial could start to look something like this...

- 1 MicroPython "main" COG
- 1 (or 2) USB keyboard+mouse COG(s) - longer term with true USB HUB functionality it would be good to reduce to only 1 USB COG per physical USB port.
- 1 (optional) UART COG. Right now my native P2 MicroPython does its serial I/O internally in the main MicroPython COG with Smartpins, interrupts and uses the LUT RAM as a receive buffer, but I think RISC-V uses a standalone serial COG.
- 1 VGA COG driver to generate video
- 1 "console" COG to both poll the USB devices and also maintain a video console by writing the serial data stream and any ANSI/VT100 cursor/colour commands etc from Python and feeding any received USB keyboard input data back to Python. Perhaps some of the functionality of this COG could be merged in with the Serial COG Eric uses to reduce the overall COG count. I think it would be good if the text screen was maintained by MicroPython but any Python code still had the ability to access the screen buffer directly using array addresses etc and other high level control/printing functions. Best of both worlds.

This takes anywhere from 4-6 COGs, leaving other COGs remaining for any HyperRAM, fast SD drivers, networking or other I/O specific drivers, debugger functions etc.
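The COG count can be tallied from the list above. This is just bookkeeping: the role names are made up for the sketch, and the only outside fact is that the P2 has eight cogs.

```python
TOTAL_COGS = 8  # the P2 has eight cogs

# (minimum, maximum) cogs per role, taken from the list above
roles = {
    "micropython_main": (1, 1),
    "usb_keyboard_mouse": (1, 2),
    "uart": (0, 1),  # optional
    "vga": (1, 1),
    "console": (1, 1),
}

lo = sum(r[0] for r in roles.values())
hi = sum(r[1] for r in roles.values())
print("cogs used:", lo, "to", hi)
print("cogs left for other drivers:", TOTAL_COGS - hi, "to", TOTAL_COGS - lo)
```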

This would certainly make a nice little dev system for P2 experimentation using Python.

Wuerfel_21 · 2020-01-10 01:30

Shouldn't it be possible to run the "console" on the main cog, too?

rogloh · 2020-01-10 01:39

Wuerfel_21 wrote: »

Shouldn't it be possible to run the "console" on the main cog, too?

Yes, and in fact I do that right now for serial in my own code, though one possible issue is that the real-time requirement for polling a USB keyboard and mouse data might affect it. Something needs to do that and manage USB device attach/de-attach etc. If this IO management function can be combined inside an overall "console" COG in charge of all the serial IO + console functions, that could become simpler for the MicroPython implementation which wouldn't have to worry about it. It could then behave differently in cases without video+local keyboard, reverting to standard serial only. Alternatively it could become built in but that may complicate things. Either way might be possible.

Wuerfel_21 · 2020-01-10 01:46

rogloh wrote: »

though one possible issue is that the real-time requirement for polling a USB keyboard and mouse data might affect it..

Isn't that what the USB Keyboard/Mouse cogs are for?

garryj · 2020-01-10 01:53

rogloh wrote: »

Great, I'll give it a go. Yesterday I realized that the internal USB and your own embedded video driver will be affected by the loadp2 frequency change when looking at the codebase. Maybe your embedded video driver wouldn't be needed if replaced by this new driver, but it would be nice to be able to somehow keep USB in there as well. I think the best way forward is to see if @garryj could possibly extend that USB code so it could operate at different frequencies (and use a different P2 pin base) dynamically at COG spawn time based on some input parameters. This may be possible by patching in various values in the code like I do in my video code, based on the frequency and pin parameters passed in.

Yeah, setting the USB pin assignments from parameters shouldn't be a problem. Sysclock patching is probably doable, but there are a lot of USB related timing values that include sysclock in the equation and constant directives are by far the most efficient way to package these calculations. Most of these values are used in bus transactions so they would likely have to be in cog space rather than hub and there is very little cog space left. I can certainly see the value in this, so I'll see what I can do.

Your micropython VGA driver looks very nice, though my aging Samsung SyncMaster 213T monitor has difficulty keeping it synched. The screen gets blanked and re-drawn at irregular intervals spaced from a couple of seconds to ten or more seconds apart. I'm using the A/V add-on board and the VGA 15-pin connection. Eric's micropython VGA driver is very solid.

rogloh · 2020-01-10 02:28

Wuerfel_21 wrote: »

rogloh wrote: »

though one possible issue is that the real-time requirement for polling a USB keyboard and mouse data might affect it..

Isn't that what the USB Keyboard/Mouse cogs are for?

Yes the USB cog(s) do most of the work, but something still needs to poll the result. At what rate I'm not sure. It might be possible to poll slowly directly from the MicroPython COG.

garryj wrote: »

Yeah, setting the USB pin assignments from parameters shouldn't be a problem. Sysclock patching is probably doable, but there are a lot of USB related timing values that include sysclock in the equation and constant directives are by far the most efficient way to package these calculations. Most of these values are used in bus transactions so they would likely have to be in cog space rather than hub and there is very little cog space left. I can certainly see the value in this, so I'll see what I can do.

That'd be great. The dynamic pin one is probably the more useful one, but the clocking one is also good once you mix with video and want things to be more dynamic. I'd agree it is quite a bit of effort to achieve the timing parameter change. Hopefully you might be able to get it to fit. I was amazed how much more I squeezed into the video COG after it was already "full" several times.

The customisation approach I took in my video cog driver was to run some COG configuration code from LUT RAM initially that would compute and patch the variables/constants in the code when the COG is first spawned. You can then re-use the LUTRAM for other purposes after this. That method should avoid consuming significant additional COG RAM. If you need to patch the LUT as well that complicates things, but perhaps running the initial config code in hub mode may help too.

Your micropython VGA driver looks very nice, though my aging Samsung SyncMaster 213T monitor has difficulty keeping it synched. The screen gets blanked and re-drawn at irregular intervals spaced from a couple of seconds to ten or more seconds apart. I'm using the A/V add-on board and the VGA 15-pin connection. Eric's micropython VGA driver is very solid.

Hmm, not sure why it wouldn't sync. Are you 100% sure you've configured the P2 to run at 252MHz with MicroPython? If it is still running at 160MHz it would have a problem. I use the standard VGA pixel timing values and every monitor I've fed it to syncs up fine. The only difference is I run the dot clock at 25.2MHz instead of the official 25.175MHz but I doubt that would cause your issue.

Update: Also check you have the right pin settings for VSYNC in the python file, if it was a floating pin maybe you'd see some issues on your monitor. I'd say it is likely that it's not running at 252MHz.
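To put the dot clock difference in perspective, here is a quick calculation. It assumes the 25.2MHz dot clock comes from dividing the 252MHz sysclock by 10, which is inferred from the two figures above rather than stated in the driver source.

```python
def clock_error_ppm(actual_hz, nominal_hz):
    """Deviation of actual from nominal, in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


SYSCLK_HZ = 252000000
NOMINAL_VGA_HZ = 25175000    # the official 640x480 dot clock quoted above

actual_hz = SYSCLK_HZ / 10   # assumed integer divider of 10
ppm = clock_error_ppm(actual_hz, NOMINAL_VGA_HZ)
print("dot clock is off by about", round(ppm), "ppm")
```

That works out to roughly a tenth of a percent.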

Tubular · 2020-01-10 02:49

I think we have the same old samsung 213T that we should be able to test here.

rogloh · 2020-01-10 02:56

@Tubular Give it a go both at 252MHz, and also leave out the frequency patch and try with 160MHz default, then you might get the same symptoms as garryj....?

cgracey · 2020-01-10 03:56

For these HDMI signalling problems, has anyone tried using float-high or float-low drive modes, letting the resistors pull the other direction?

rogloh · 2020-01-10 04:13

Here's an updated MicroPython video demo that involves custom timing. It uses fractional timing at a ratio of 6.3 yielding 40MHz dot clock from 252MHz P2 using 800x600 resolution, so there may be some noticeable jitter in the output because it is not an integer multiple at that rate.

I've also changed a few of the demos slightly, and included a new one for a mouse sprite (in text mode or 4bpp gfx mode).

File to use is called svgapython. You will still have to patch the VGA pins accordingly.

The P2 still needs to be run at 252MHz for this demo as coded, though with some experimenting you might be able to force a divider of 4 (use 0x400 in svgatiming.xfreq) to run it at the default of 160MHz. I've not tried that so YMMV. You could also set it to 5 (0x500) and run the P2 at 200MHz if it doesn't like running at 160MHz. That should work and be an integer multiple with no pixel jitter.
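The divider arithmetic can be sketched as below. The `xfreq_guess` encoding is an assumption: it treats the value as 8.8 fixed point, inferred only from 4 mapping to 0x400 and 5 mapping to 0x500 in this post. Check the actual driver before relying on it for fractional ratios such as 6.3.

```python
TARGET_DOT_CLOCK_HZ = 40000000  # 800x600 dot clock used by the demo


def dot_clock_hz(sysclk_hz, divider):
    """Dot clock produced by dividing the system clock."""
    return sysclk_hz / divider


def is_jitter_free(divider):
    """An integer divider gives every pixel the same number of clocks."""
    return divider == int(divider)


def xfreq_guess(divider):
    """ASSUMPTION: 8.8 fixed point, inferred from 4 -> 0x400 and 5 -> 0x500."""
    return round(divider * 256)


for sysclk, div in ((252000000, 6.3), (160000000, 4), (200000000, 5)):
    print(sysclk, "/", div, "=", dot_clock_hz(sysclk, div),
          "Hz, integer divider:", is_jitter_free(div))
```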

You'll need to paste this code in multiple text sections again (potentially with new break positions) or use the approach Eric recommends (I still need to try that myself).

jmg · 2020-01-10 04:14

cgracey wrote: »

For these HDMI signalling problems, has anyone tried using float-high or float-low drive modes, letting the resistors pull the other direction?

Sounds a good idea.
The P2 1mA mode could be a bit light for some receivers.
The 1k5 mode would give ~2.1mA from the NFET path. Is that a genuine series R of 1k5, or a weak FET ?
A float-low 'open drain' mode with P-FET shunting to 3v3 would support pull down resistors of ~750 ohms, for 4mA of drive, and this should have the best slew-rates internally.

rogloh · 2020-01-10 04:26

rogloh wrote: »

Wuerfel_21 wrote: »

rogloh wrote: »

though one possible issue is that the real-time requirement for polling a USB keyboard and mouse data might affect it..

Isn't that what the USB Keyboard/Mouse cogs are for?

Yes the USB cog(s) do most of the work, but something still needs to poll the result. At what rate I'm not sure. It might be possible to poll slowly directly from the MicroPython COG.

Another reason we might want a separate COG to manage kbd/mouse/serial/video console stuff, is if there was some Python GUI and we wanted to update the mouse on the screen as it moves. Right now in Python things are not real-time and there is a fair amount of jittery movement on the mouse sprite when I tried controlling it in a simple demo while running Python code. An independent COG managing the IO devices at a fast enough speed could allow for mouse co-ordinate updates to be handled outside of Python and be silky smooth responsive. I guess a mouse update could still potentially be done in the Python COG if interrupts were used for background housekeeping tasks but it is more complex that way.

cgracey · 2020-01-10 04:28

jmg wrote: »

cgracey wrote: »

For these HDMI signalling problems, has anyone tried using float-high or float-low drive modes, letting the resistors pull the other direction?

Sounds a good idea.
The P2 1mA mode could be a bit light for some receivers.
The 1k5 mode would give ~2.1mA from the NFET path. Is that a genuine series R of 1k5, or a weak FET ?
A float-low 'open drain' mode with P-FET shunting to 3v3 would support pull down resistors of ~750 ohms, for 4mA of drive, and this should have the best slew-rates internally.

The resistance drive modes use inverters driving real polysilicon resistors, not FETs.

rogloh · 2020-01-10 04:35

cgracey wrote: »

For these HDMI signalling problems, has anyone tried using float-high or float-low drive modes, letting the resistors pull the other direction?

Yes I have tried that and I think there was a discussion covering it buried somewhere in the main PASM gurus/DVI driver threads around the time when potatohead started his HDMI testing. In my driver code that controls the output I have two options. I prefer the uncommented one as I think that is going to be more consistent. The 1.5k internal pulldown by itself may not be strong enough.

'   wrpin   ##%111001<<8, a                      'uncomment for float high, 1.5k low
    wrpin   ##%10110_1111_0111_10_00000_0, a     '123 ohm BITDAC for pins

cgracey · 2020-01-10 04:47

rogloh wrote: »
cgracey wrote: »

For these HDMI signalling problems, has anyone tried using float-high or float-low drive modes, letting the resistors pull the other direction?

Yes I have tried that and I think there was a discussion covering it buried somewhere in the main PASM gurus/DVI driver threads around the time when potatohead started his HDMI testing. In my driver code that controls the output I have two options. I prefer the uncommented one as I think that is going to be more consistent. The 1.5k internal pulldown by itself may not be strong enough.

'   wrpin   ##%111001<<8, a                      'uncomment for float high, 1.5k low
    wrpin   ##%10110_1111_0111_10_00000_0, a     '123 ohm BITDAC for pins

Have all the bit DAC modes been tried? We can get any of the following:

%10100 = 990 ohm, 0..3.3V
%10101 = 600 ohm, 0..2.0V (DAC pull-down enabled)
%10110 = 123 ohm, 0..3.3V
%10111 = 75 ohm, 0..2.0V (DAC pull-down enabled)

rogloh · 2020-01-10 04:59

Well, I'm not sure I tried them all. Others might have played with them. I guessed the 123 ohm might be the best without extra components.

jmg · 2020-01-10 05:48

cgracey wrote: »

Have all the bit DAC modes been tried? We can get any of the following:

%10100 = 990 ohm, 0..3.3V
%10101 = 600 ohm, 0..2.0V (DAC pull-down enabled)
%10110 = 123 ohm, 0..3.3V
%10111 = 75 ohm, 0..2.0V (DAC pull-down enabled)

Do those DAC modes have enough mW of dissipation rating to drive into an almost-short to 3v3 ?

rogloh wrote: »

Well, I'm not sure I tried them all. Others might have played with them. I guessed the 123 ohm might be the best without extra components.

The 123 ohm one may have higher wasted power ?
With the 50 ohms dominating, the 990 ohms mode may be just as good ?

cgracey · 2020-01-10 05:55

The 75-ohm DAC mode outputting 0V, while tied to 3.3V would draw 3.3V/75ohms, or 44mA, dissipating 145mW. That's less power than a logic pin would dissipate in a short. The DAC was designed for a 3.3V drop on all resistors and drivers.
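The figures above follow from Ohm's law, and the same worst case can be worked out for the other bit DAC resistances listed earlier. This is a sketch only. It applies the full 3.3V drop to every mode, as in the 75-ohm example, and ignores any external termination resistance.

```python
def short_to_rail(v_drop, r_ohms):
    """Current (A) and power (W) when v_drop appears across r_ohms."""
    current_a = v_drop / r_ohms
    power_w = v_drop * current_a
    return current_a, power_w


V_DROP = 3.3  # DAC output at 0V, pin tied to 3.3V

for r in (990, 600, 123, 75):
    i, p = short_to_rail(V_DROP, r)
    print("{} ohm: {:.1f} mA, {:.1f} mW".format(r, i * 1000, p * 1000))
```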

Tubular · 2020-01-10 06:03

I'm no expert on physical hdmi levels but aren't they referenced to 3v3? So does this prevent using the 75 ohm dac, since it can only signal in the range 0 to 2v out (referenced to ground), which is effectively 3v3 to 1v3 (referenced to 3v3)?

garryj · 2020-01-10 20:23

rogloh wrote: »

Your micropython VGA driver looks very nice, though my aging Samsung SyncMaster 213T monitor has difficulty keeping it synched. The screen gets blanked and re-drawn at irregular intervals spaced from a couple of seconds to ten or more seconds apart. I'm using the A/V add-on board and the VGA 15-pin connection. Eric's micropython VGA driver is very solid.

Hmm, not sure why it wouldn't sync. Are you 100% sure you've configured the P2 to run at 252MHz with MicroPython? If it is still running at 160MHz it would have a problem. I use the standard VGA pixel timing values and every monitor I've fed it to syncs up fine. The only difference is I run the dot clock at 25.2MHz instead of the official 25.175MHz but I doubt that would cause your issue.

Update: Also check you have the right pin settings for VSYNC in the python file, if it was a floating pin maybe you'd see some issues on your monitor. I'd say it is likely that it's not running at 252MHz.

Doh! Was the VSYNC pin setting. VGA looks good on the SyncMaster and a Coby 12 inch monitor 👍.

ersmith · 2020-01-31 00:07

Would anyone object if I were to drop support for the RevA board in my next micropython release?

potatohead · 2020-01-31 03:43

I will not

ozpropdev · 2020-02-04 10:40

Hi Eric

I've noticed that P24 doesn't seem to work.

It appears that garryj's USB driver sets basepin + 8 in smartpin repository mode.
May I suggest moving this repository to the same pin as the activity LED (basepin + 4),
then changing the 'TT' bits to 01 to allow OUT control.
This combines the two functions and frees up P24.

ersmith · 2020-02-04 13:44

Here's a version of micropython for RevB boards with no VGA or USB code included, and with only COG 0 being used (uart I/O is all done in COG 0). Otherwise it's fairly full featured in terms of what Python features (regexp, math, etc.) are included. 260K is available for the user in the Python space, and also the upper 16K (debug area) is left untouched. This may be a more useful platform for those of you (like @ozpropdev and @rogloh ) who are doing their own USB and VGA support.

I have not tested this with RevA and it probably won't work there.

rogloh · 2020-02-05 00:26

Eric, isn't more of the 16kB upper area available? Should you only need to reserve about 2kB for debug state? I forget the exact number, @ozpropdev probably knows it as he uses it in his debugger. Or were you thinking that a debugger itself is stored up there and 16kB is a nice round number? I guess it could be loaded up there and run in hub mode etc.

EDIT: Ok I just read this from the P2 docs:

"The last 16KB of RAM can be hidden from its normal address range and made read-only at $FC000..$FFFFF. This is useful for making the last 16KB of RAM persistent, like ROM. It is also how debugging is realized, as the RAM mapped to $FC000..$FFFFF can still be written to from within debug interrupt service routines, permitting the otherwise-protected RAM to be used as debugger-application space and cog-register swap buffers for debug interrupts."

I guess that makes sense of why you have reserved this entire 16kB. It is the biggest amount of space a debug ISR can use exclusively, so you are providing the most room for a comprehensive debugger to fit and store its writeable data there. Some debuggers may not use all of it, but this allows them to. I can see this could come in handy if the debugger is doing its own video interface too, presenting some type of GUI, and needs its own screen buffer not affected by the COGs under test. In some cases that alone could require its own 8kB of data (4kB for an 80x25 colour screen, 4kB for font data, etc), depending on your video driver (could be smaller if the font is internal, and only mono). Serial debuggers could be much smaller.
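A rough budget for such a debugger, as a sketch. The address range comes from the P2 documentation quoted above. The two bytes per character cell (character plus colour attribute) and the 256-glyph 8x16 font are assumptions chosen to match the "4kB + 4kB" estimate.

```python
DEBUG_BASE = 0xFC000
DEBUG_END = 0xFFFFF
DEBUG_BYTES = DEBUG_END - DEBUG_BASE + 1  # size of the protected region


def text_screen_bytes(cols, rows, bytes_per_cell=2):
    """ASSUMPTION: one character byte plus one colour attribute byte per cell."""
    return cols * rows * bytes_per_cell


def font_bytes(glyphs=256, glyph_height=16):
    """ASSUMPTION: 8-pixel-wide glyphs stored as one byte per row."""
    return glyphs * glyph_height


screen = text_screen_bytes(80, 25)
font = font_bytes()
remaining = DEBUG_BYTES - screen - font
print("region:", DEBUG_BYTES, "screen:", screen, "font:", font, "left:", remaining)
```

Under these assumptions about half of the region is still free for debugger code and cog-register swap buffers.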
