mcufont working with FlexSpin C

Rayman · 2021-07-31 20:36

This code at Github is pretty awesome: https://github.com/mcufont/mcufont

Just got it working with P2, thanks to some fixes to FlexProp C by @ersmith

As you can see in this image, mcufont does antialiased fonts with kerning, justification and word wrapping.

Still figuring it out, but looks very powerful.
For now, I'm testing with a VGA driver in grayscale mode.
Have to think more about how to use color...

Attached is test code that I just got running. You can test by opening "McuFontTest.c" with FlexProp GUI version 5.5.2 (latest). The base pin setting for the VGA adapter is in Platform.h

Rayman · 2021-07-31 20:42

The fonts are compressed to save space. Not 100% sure this is needed, but seems to not hurt as it renders very fast.

Compressing the fonts uses a separate program. I had to use a Mac to do it...
Looks like it can use any ttf type of font...

Rayman · 2021-07-31 20:59

I see now that there are actually only 16 shades of gray used by mcufont.
This is what they meant by "16-level antialiased fonts"

Here's the code that turns alpha from a nibble into a value spanning 0..255:
alpha = ((code & RLE_VALMASK) & 0xF) * 0x11;

This will make it easier to adapt to 8-bit color.
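
As a minimal sketch of why the `* 0x11` trick works (the helper names here are made up for illustration and are not part of mcufont): multiplying a nibble by 0x11 copies it into both halves of the byte, so the 16 levels land evenly on 0..255, and the level can be recovered with a shift.

```c
#include <stdint.h>

/* Expand a 4-bit alpha level (0..15) to the full 8-bit range (0..255).
   Multiplying by 0x11 copies the nibble into both halves of the byte,
   so 0x0 -> 0x00, 0x8 -> 0x88, 0xF -> 0xFF. */
static uint8_t expand_alpha4(uint8_t level)
{
    return (uint8_t)((level & 0x0F) * 0x11);
}

/* Recover the 4-bit level from an expanded value; the upper nibble is enough. */
static uint8_t alpha_to_level(uint8_t alpha)
{
    return (uint8_t)(alpha >> 4);
}
```

For an 8-bit paletted display, the recovered level is all that is needed to pick one of 16 palette entries.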

evanh · 2021-07-31 21:02

That's certainly a use case for high-res external frame buffer RAM. Need to get Roger's display driver ported to C now.

Rayman · 2021-07-31 21:08

Maybe no need to port? The VGA driver I'm using here is in Spin2. Can mix and match languages with Flexprop...

Wuerfel_21 · 2021-07-31 21:09

How complex/slow is the font decompression algorithm? Implementing it/something similar in a scanline renderer would be pretty cool.

Rayman · 2021-07-31 21:10

I've attached the test code to top post, if anyone wants to try it...

Rayman · 2021-07-31 21:15

@Wuerfel_21 said:
How complex/slow is the font decompression algorithm? Implementing it/something similar in a scanline renderer would be pretty cool.

If you mean real-time scanline rendering, it might be doable at low resolution. Maybe needing multiple cogs...

To check speed, the test code does a 2 second pause before printing the big text.
It shows up in blink of an eye type speed. But, didn't actually measure it.
The fixed 7x14 font line actually looks slower for some reason...

Rayman · 2021-07-31 21:17

I think the characters are just 16 level bitmap images that are RLE compressed beforehand.
They are decompressed upon use.
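
To illustrate the general idea only, here is a sketch of decoding a run-length-encoded 16-level bitmap. The byte layout (run length minus one in the high nibble, gray level in the low nibble) and the name `rle4_decode` are hypothetical; this is not mcufont's actual encoding, which I haven't fully worked out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding, for illustration only (NOT mcufont's real format):
   each byte holds (run length - 1) in the high nibble and a 4-bit
   gray level in the low nibble. */
static size_t rle4_decode(const uint8_t *src, size_t src_len,
                          uint8_t *dst, size_t dst_cap)
{
    size_t out = 0;
    for (size_t i = 0; i < src_len; i++) {
        unsigned run   = (unsigned)(src[i] >> 4) + 1u;  /* 1..16 pixels */
        uint8_t  level = src[i] & 0x0F;                 /* 0..15 */
        for (unsigned k = 0; k < run && out < dst_cap; k++) {
            dst[out++] = (uint8_t)(level * 0x11);       /* expand to 0..255 */
        }
    }
    return out; /* number of pixels written */
}
```

Decoding is a couple of shifts and a fill loop per byte, which fits with the rendering looking fast.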

JRoark · 2021-07-31 21:17

Wicked cool!

evanh · 2021-07-31 21:26

@Rayman said:
I've attached the test code to top post, if anyone wants to try it...

I note the .c files as includes. This is done I presume due to compiler limitations?

Wuerfel_21 · 2021-07-31 21:48

@Rayman said:

@Wuerfel_21 said:
How complex/slow is the font decompression algorithm? Implementing it/something similar in a scanline renderer would be pretty cool.

If you mean real-time scanline rendering, it might be doable at low resolution. Maybe needing multiple cogs...

To check speed, the test code does a 2 second pause before printing the big text.
It shows up in blink of an eye type speed. But, didn't actually measure it.
The fixed 7x14 font line actually looks slower for some reason...

I did a thing on P1 where it decompresses full screen images in real-time, and that uses 3 cogs (though mostly fine with 2) to decompress a sortof hybrid RLE/2bpp format.

Simple RLE should run great on P2 (even accounting for character overhead). Opaque rendering most certainly; alpha transparency on a 32bpp buffer should be at around 16 cycles per pixel (2 cycles per pixel to fetch and write back, 7 for MIXPIX, 2 for ALTI = 11, then add some for overhead). So assuming we're filling an entire VGA screen with text, that's ~10K cycles per scanline, which is too much for one cog (budget is 250000000/31468 -> 7944), but 3 should be more than enough. Could also optimize levels 15 and 0 to not run the MIXPIX.
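
The budget arithmetic can be sketched as below. The 250 MHz clock is inferred from the 7944 figure and the 640 visible pixels per line is an assumption for standard VGA; the function names are illustrative only.

```c
#include <stdint.h>

/* Cycles available per scanline: system clock divided by the line rate. */
static uint32_t cycles_per_line(uint32_t sysclk_hz, uint32_t line_rate_hz)
{
    return sysclk_hz / line_rate_hz;
}

/* Smallest number of cogs whose combined budget covers one line's work. */
static uint32_t cogs_needed(uint32_t pixels_per_line, uint32_t cycles_per_pixel,
                            uint32_t budget_per_cog)
{
    uint32_t work = pixels_per_line * cycles_per_pixel;
    return (work + budget_per_cog - 1u) / budget_per_cog; /* ceiling division */
}
```

With those assumptions, `cycles_per_line(250000000, 31468)` gives 7944 and 640 pixels at 16 cycles each is 10240 cycles of work, so two cogs is the bare minimum and a third leaves headroom.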

Rayman · 2021-07-31 22:02

Was thinking of using MIXPIX for the first time here to get working in color.
Also thought it would have to be 16-bit color.

But, now that I see it's really only 16 levels and not 256, maybe that's overkill and can use paletted 8-bit color.
Just need to figure out how to do things like yellow text on blue background this way...

Rayman · 2021-07-31 22:03

BTW: @"Phil Pilgrim (PhiPi)" had a pretty cool RLE decoding scheme for real time VGA resolution images with P1.
Wait, actually it wasn't RLE, it was color cell compression, never mind.

evanh · 2021-07-31 23:36

Lol, what's with the embedded picture taking up 80% of the binary then not using it?! EDIT: The first thing I noticed was the download time.

rogloh · 2021-08-01 06:04

@evanh said:
That's certainly a use case for high-res external frame buffer RAM. Need to get Roger's display driver ported to C now.

Maybe a C header file is all that is required for porting any SPIN2 functions you want to call (assuming you use FlexProp). Once you have the bitmap framebuffer set up you probably don't need too many of my SPIN2 helper functions, at least for HUB RAM buffers. External memory frame buffers are accessed indirectly, so you'd need to call some write functions to transfer into external memory, or talk via the mailbox directly. I should get back to wrapping that up again once this P2 board I am doing is finally made up. Code was 99% done and working as I recall, just needs documentation/release.

evanh · 2021-08-01 09:22

Rayman,
There's a lot of callback layers in your top level test code. How the hell did you figure all that out to even get it working?

EDIT: Everything I change seems to break something. I'm currently trying to work out why the second text block, word-wrapped, is not being rendered. PS: I've resorted to hard-coding the palette in lutRAM for the moment. I did this after removing the .bmp file and runtime allocating a blank frame-buffer.

EDIT2: Hmm, how weird, adjusting the heap size dramatically affects character by character render speeds and also can affect what is rendered. eg: enum { HEAPSIZE=100 }; produces very slow rendering.

EDIT3: Very large sizes, like 200kB, crash. And even 64kB reduces display to just the first and last text blocks.

EDIT4: PS: The runtime frame-buffer allocation I'm doing is on the stack using __builtin_alloca(). Oddly, using the heap still takes up download size.

EDIT5: Here's the renamed two files I modified:

Rayman · 2021-08-01 14:23

@evanh I wrote very little of this code myself...

It's a mashup of the example in README.rst and the render_bmp.c example.

I don't think it actually uses the heap, but could be wrong. The render_bmp.c example does use the heap, but I removed that part...
200 kB would definitely be a problem because the display buffer is already ~300kB. So, that plus fonts and code would take you over the 512kB limit.
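
A rough sketch of that memory arithmetic, assuming the display buffer is 640x480 at 8 bits per pixel (which matches the ~300kB figure). The helper names are illustrative, and the 20000-byte allowance for fonts and code in the usage note is a made-up placeholder, not a measured value.

```c
#include <stdint.h>

#define HUB_RAM_BYTES (512u * 1024u)  /* P2 hub RAM */

/* Size of a packed frame buffer in bytes. */
static uint32_t framebuffer_bytes(uint32_t w, uint32_t h, uint32_t bits_per_pixel)
{
    return w * h * bits_per_pixel / 8u;
}

/* Returns 1 if frame buffer + heap + everything else fits in hub RAM. */
static int fits_in_hub(uint32_t fb_bytes, uint32_t heap_bytes, uint32_t other_bytes)
{
    return fb_bytes + heap_bytes + other_bytes <= HUB_RAM_BYTES;
}
```

`framebuffer_bytes(640, 480, 8)` is 307200. Adding a 200 KiB heap brings the total to 512000 of 524288 bytes, so only about 12 kB would be left for fonts and code, and `fits_in_hub(307200, 200 * 1024, 20000)` fails.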

evanh · 2021-08-02 04:59

Lol, okay, the 200 kB case is explained.

ersmith · 2021-11-26 18:18

It took a while, but I've fixed the compile errors in mcufont, and it should compile now with flexspin 5.9.6. I have no idea whether it works correctly or not, though.

EDIT: I meant compile errors in the original source code... AFAIK @Rayman provided work-arounds to get the code to compile with flexspin.

Rayman · 2021-11-26 23:31

Thanks Eric!

Rayman · 2024-11-01 21:20

Was just working with this again to explore working with colors.
Wanted to use the PASM2 pixel mixing instructions, but that doesn't really work here with paletted VGA driver.

For VGA without PSRAM, have to use palette.
mcufont expands to 16-levels of alpha.
The actual values are 8-bit though in the code: 0, 17, 34, 51, etc.
Interestingly, in binary the upper four bits are repeated in lower four bits. This is convenient actually.

So, for any two foreground and background pairs, you need 16 of the 256 colors in the palette.
Guess this allows for 16 combinations at any one time.
Creating the palette in advance makes the rendering pretty trivial...

Right now, in this test using all 256 palette colors. Have to think about fixing that.
Anyway, attached is output of code with yellow text on navy background.
It looks OK on monitor, but looks dark around the text. Think this is some kind of illusion though as don't see it in the captured bmp, saved as .png
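
One way the 16-entries-per-pair layout could be indexed, as a sketch only: the constants and `palette_index` are hypothetical names, and the idea of giving each foreground/background pair its own block of 16 consecutive entries is an assumption about how the palette would be arranged.

```c
#include <stdint.h>

#define LEVELS_PER_PAIR 16   /* one palette entry per alpha level */
#define MAX_PAIRS       16   /* 256 entries / 16 levels */

/* Palette index for a pixel: the pair selects a block of 16 entries,
   the 4-bit alpha level selects the entry within that block. */
static uint8_t palette_index(uint8_t pair, uint8_t alpha8)
{
    uint8_t level = (uint8_t)(alpha8 >> 4);  /* 0, 17, 34, ... -> 0, 1, 2, ... */
    return (uint8_t)((pair & 0x0F) * LEVELS_PER_PAIR + level);
}
```

Because the upper nibble of mcufont's alpha already equals the level, rendering reduces to a shift, a multiply and an add per pixel.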

Rayman · 2024-11-01 21:23

This is with linear interpolation with alpha, like this:

   for (int a = 0; a < 256; a++)
   {
       r = (int)((forer * a / 256.) + (backr * (256. - a) / 256.)) & 0xFF;
       g = (int)((foreg * a / 256.) + (backg * (256. - a) / 256.)) & 0xFF;
       b = (int)((foreb * a / 256.) + (backb * (256. - a) / 256.)) & 0xFF;
       ...
   }

Guess this is what the PASM2 instruction does? Don't see anywhere exactly what it does...

Maybe @Wuerfel_21 knows a better way to do this?
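
Reduced to the 16 levels mcufont actually produces, the same linear interpolation could be sketched in integer math like this. The function names and the 0xRRGGBB entry layout are assumptions for illustration; the VGA driver's real palette format may differ.

```c
#include <stdint.h>

/* Linear blend of one 8-bit channel; level 0 = background, 15 = foreground.
   The +7 rounds to nearest before the divide by 15. */
static uint8_t blend_linear(uint8_t fore, uint8_t back, unsigned level)
{
    return (uint8_t)((fore * level + back * (15u - level) + 7u) / 15u);
}

/* Fill 16 consecutive 0xRRGGBB entries for one foreground/background pair. */
static void build_ramp(uint32_t *pal, uint32_t fore_rgb, uint32_t back_rgb)
{
    for (unsigned level = 0; level < 16; level++) {
        uint32_t r = blend_linear((fore_rgb >> 16) & 0xFF, (back_rgb >> 16) & 0xFF, level);
        uint32_t g = blend_linear((fore_rgb >> 8) & 0xFF, (back_rgb >> 8) & 0xFF, level);
        uint32_t b = blend_linear(fore_rgb & 0xFF, back_rgb & 0xFF, level);
        pal[level] = (r << 16) | (g << 8) | b;
    }
}
```

Calling `build_ramp(pal, 0xFFFF00, 0x000080)` would give a yellow-on-navy ramp with the pure background at entry 0 and the pure foreground at entry 15.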

Rayman · 2024-11-01 21:26

Actually, maybe I could use the PASM2 pixel mixing instructions to create the palette. Have to try that...

Wuerfel_21 · 2024-11-01 21:41

You could use BLNPIX for that, yes.

But you could also do approximate gamma correction, which will look less odd. Essentially: r = sqrt(forer*forer*a + backr*backr*(256-a))>>4
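
A portable C sketch of that formula, for cases where no square root is at hand. `isqrt32` is a plain bit-by-bit integer square root and `blend_gamma` is a hypothetical helper name; neither comes from FlexProp or mcufont. The blend factor `a` runs from 0 to 256 as in the formula.

```c
#include <stdint.h>

/* Integer square root (floor), computed one result bit at a time. */
static uint32_t isqrt32(uint32_t x)
{
    uint32_t res = 0;
    uint32_t bit = 1u << 30;      /* largest power of four in 32 bits */
    while (bit > x) bit >>= 2;
    while (bit != 0) {
        if (x >= res + bit) {
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return res;
}

/* Approximate gamma-aware blend of one channel, a = 0 (back) .. 256 (fore).
   sqrt(c*c*256) = 16*c, hence the final shift right by 4. */
static uint8_t blend_gamma(uint8_t fore, uint8_t back, uint32_t a)
{
    uint32_t sum = (uint32_t)fore * fore * a + (uint32_t)back * back * (256u - a);
    return (uint8_t)(isqrt32(sum) >> 4);
}
```

At the midpoint, `blend_gamma(255, 0, 128)` comes out at 180, where a purely linear blend would give about 127.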

Rayman · 2024-11-01 21:50

@Wuerfel_21 Do you know what BLNPIX is doing? Just linear?

Rayman · 2024-11-01 21:56

Hmm... No SQRT in FlexProp C? Guess will use Spin2 for that...

Wuerfel_21 · 2024-11-01 22:15

Yes, BLNPIX is linear. Note that the blend factor needs to be set with SETPIV.
If you don't have the integer square root function, just make your own with inline ASM (QSQRT + GETQX)

Rayman · 2024-11-01 22:20

Had to overcome a bug in FlexSpin when calling a Spin2 function to do the square root...
Got it now though. Does this look any better?

Rayman · 2024-11-01 22:21

Think this way might not work as well because the "Th" in "This" in the word wrapped text appear to be connected...

Wuerfel_21 · 2024-11-01 22:42

Try the comparison with bright red on bright green
