After your midwifery (is that correct English?) the LameStation graphics are coming into motion. The following code iterates over a linear byte buffer, lcdmem, and writes it out to the display, currently without any convolution.
pub LCDBITBLT ( -- ) --- transfer lcdmem to LCD Display
\ transfer buffer to LCD
-1 ctr_pages_last !
-1 ctr_halves_last !
0 lcdmemlen ADO
\ iterate all bytes in lcdmem
I 64 / ctr_halves ! \ counter for 64 Packets
I 128 / ctr_pages ! \ counter for pages
I 0 = IF
\ ." first loop" CR
LCDBOTH
lcdon LCDCMD
THEN
ctr_pages @ ctr_pages_last @ <> IF
\ new page
\ ." new page " ctr_pages @ . ." _" I . CR
lcdsetpage ctr_pages @ + LCDCMD --- iterate 'page's
THEN
ctr_halves @ ctr_halves_last @ <> IF
\ ." new half " ctr_halves @ . ." _" I . CR
ctr_halves @ EVEN? IF
LCDBOTH
lcdsetadr LCDCMD --- go left
ELSE
LCDBOTH
lcdsetadr LCDCMD --- go left
THEN
THEN
lcdmem I + C@ LCDDAT
ctr_pages @ ctr_pages_last !
ctr_halves @ ctr_halves_last !
LOOP
;
This is quite slow compared with simply clearing the screen, but I don't care yet and will look for better performance later.
LAP LCDBITBLT LAP .LAP 137.673ms ok
LAP LCDCLS LAP .LAP 11.243ms ok
The next step will be stuffing the bits together according to the screen buffer organized like this:
Maybe I can do this tomorrow, on "convolution day". After that, drawing to the buffer has to be done. I will dive into VGA.FTH to see whether I can copy from there to draw to the buffer.
@proplem - sounds good that you are getting it working.
There are some strange things in your code though. You test for the first time through the loop with I 0= IF, when it makes more sense to place the init "LCDBOTH lcdon LCDCMD" before the loop without any need for a test, which btw is happening every iteration and helping to slow it down. You can also make your code faster by using shifts instead of divides if the divisor is a power of two, so 6 >> instead of 64 / and 7 >> instead of 128 /, much much faster. That is why there are 2/ (and 2*) words, since these are actually shift right and shift left and take only a fraction of a microsecond to execute.
The "ctr_halves @ EVEN?" test ends up doing the same thing for both conditions. Also, you could probably simplify the whole variables-as-flags thing too by simply leaving values on the stack since there are only two of them, so you end up using DUP or OVER or 3RD etc. to "fetch" the variable. But even all that could be simplified too. No matter though, the important thing is to do what you have done, to get it working!
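A quick host-side check of the shift-for-divide idea, written in Python rather than Tachyon so it can be run anywhere. It only relies on 64 = 2^6 and 128 = 2^7; the buffer length of 1024 bytes is an assumption (128 x 64 pixels at 8 pixels per byte), and the function names are made up for this sketch.

```python
# Host-side model (Python, not Tachyon code).
# Assumption: the buffer holds 128 x 64 pixels, 8 pixels per byte.
LCDMEMLEN = 128 * 64 // 8


def half_index(i: int) -> int:
    """Index of the 64-byte half that byte i falls into (64 = 2**6)."""
    return i >> 6


def page_index(i: int) -> int:
    """Index of the 128-byte page that byte i falls into (128 = 2**7)."""
    return i >> 7


if __name__ == "__main__":
    # For non-negative indices a right shift matches integer division.
    for i in range(LCDMEMLEN):
        assert half_index(i) == i // 64
        assert page_index(i) == i // 128
    print("shift and divide agree for all", LCDMEMLEN, "indices")
```

The shift count is the exponent of the divisor, so a wrong count silently selects a different block size instead of failing.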
EDIT: Here is a simplification of your code, still as your code, but mainly removing variables etc. This same code can be further reduced as you might see.
pub LCDBITBLT ( -- ) --- transfer lcdmem to LCD Display
\ transfer buffer to LCD
-1 -1 ( oldpage oldhalf ) ( replaces -1 ctr_pages_last ! -1 ctr_halves_last ! )
LCDBOTH lcdon LCDCMD
0 lcdmemlen ADO
\ iterate all bytes in lcdmem (calculate as needed rather than using variables)
OVER ( oldpage ) I 7 >> <> IF
\ new page
\ ." new page " ctr_pages @ . ." _" I . CR
lcdsetpage I 7 >> + LCDCMD --- iterate 'page's
THEN
DUP ( oldhalf ) I 6 >> <> IF
\ ." new half " ctr_halves @ . ." _" I . CR
I 6 >> EVEN? IF ( ???? no difference )
LCDBOTH lcdsetadr LCDCMD --- go left
ELSE
LCDBOTH lcdsetadr LCDCMD --- go left
THEN
THEN
lcdmem I + C@ LCDDAT
--- update "old" page and half stack variables
2DROP I 7 >> I 6 >> ( oldpage oldhalf )
LOOP
2DROP --- discard oldpage and oldhalf
;
After sorting out some bugs I changed the LCDBITBLT word according to Peter's recommendation of using stack operations instead of variables. This is a lot faster: it took about 100 ms before and takes about 50 ms now.
pub LCDBITBLT ( -- ) --- start from upper left corner and move down to bottom right and set ch
\ transfer buffer to LCD
LCDBOTH lcdon LCDCMD
-1 -1 ( oldpage oldhalf )
0 lcdmemlen ADO
\ iterate all bytes in lcdmem (calculate as needed rather than using variables)
OVER ( oldpage ) I 7 >> <> IF
\ new page
\ ." new page " ctr_pages @ . ." _" I . CR
LCDBOTH
lcdsetpage I 7 >> + LCDCMD --- iterate 'page's
THEN
DUP ( oldhalf ) I 6 >> <> IF
\ ." new half " ctr_halves @ . ." _" I . CR
I 6 >> EVEN? IF
LCDA
ELSE
LCDB
THEN
lcdsetadr LCDCMD --- go left
THEN
lcdmem I + C@ LCDDAT
\ 2DUP *BTNA HIGH? IF ." I=" I . ." h=" . ." p=" . CR 100 ms THEN
--- update "old" page and half stack variables
2DROP I 7 >> I 6 >> ( oldpage oldhalf )
LOOP
2DROP --- discard oldpage and oldhalf
;
The code is faster but, for me, less understandable. I think I will revert to my first version using variables. This is a bit slower but I can extend it more easily. If I need the speed, I will optimize and refactor later.
@proplem - I see the problem, it's thinking in German and writing in English! (then trying to translate it back to German when you read it)
When Luke Skywalker used the Force he had to let go of his "aids", flipping the targeting computer out of the way. While useful, it can get in the way. Same with variables.
Let's see, if we just simplify and factor, skip checking if we need to set or not, and just set. Maybe this will work?
pub LCDCS ( address -- ) $40 AND IF LCDB ELSE LCDA THEN ;
pub LCDPAGE ( address -- ) LCDBOTH 7 >> $B8 OR LCDCMD ;
pub LCDADR ( address -- ) DUP LCDPAGE DUP LCDCS $3F AND $40 OR LCDCMD ;
pub LCD! ( data address -- ) LCDADR LCDDAT ;
--- transfer buffer to LCD, start from upper left corner and move down to bottom right and set ch
pub LCDBITBLT ( -- ) LCDBOTH lcdon LCDCMD 0 lcdmemlen ADO lcdmem I + C@ I LCD! LOOP ;
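For readers who find the one-liners dense, here is a small Python model of how the words above split a linear buffer address. It is a sketch, not Tachyon code: the masks and the $B8 / $40 command bases are copied from the definitions above, while the function name and the returned dictionary are hypothetical.

```python
# Host-side model (Python) of LCDPAGE / LCDCS / LCDADR from the listing above.
SET_PAGE = 0xB8    # base of the "set page" command, as used in LCDPAGE
SET_COLUMN = 0x40  # base of the "set column" command, as used in LCDADR


def decode(address: int) -> dict:
    """Split a linear buffer address the way the Forth words do."""
    page = address >> 7                 # LCDPAGE: 7 >>
    right_half = bool(address & 0x40)   # LCDCS: $40 AND IF LCDB ELSE LCDA THEN
    column = address & 0x3F             # LCDADR: $3F AND
    return {
        "page_cmd": SET_PAGE | page,
        "chip": "B" if right_half else "A",
        "column_cmd": SET_COLUMN | column,
    }


if __name__ == "__main__":
    for addr in (0, 63, 64, 128, 1023):
        print(addr, decode(addr))
```

Bits 7 and up pick the page, bit 6 picks the controller half, and bits 0 to 5 pick the column inside that half, which is why no state has to be remembered between bytes.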
Beautiful, it's hard to learn to write in FORTH and not merely translate C, SPIN... into FORTH syntax.
Ha! After getting over the first shock I tried it out. It works. Afterwards I wondered whether I should leave the forum and become a manager - I don't want to have anything to do with the black magic that is obviously used in this place.
BUT: I compared performance:
Peter's minimalistic FORTH version:
LAP LCDBITBLT LAP .LAP 118.009ms ok
My maybe not so convoluted FORTHish child's book readable version:
LAP LCDBITBLT LAP .LAP 77.955ms ok
Now you're gasping :-D :-D :-D
Nevertheless chapeau Peter - of course it is a beautiful solution.
@proplem - no no no, that can't be right. You made me chuckle, well done, and then I just had to code up the rest of it and try it out. Hmmmm..... I tried my version with V4 code:
( 0034 $2DA4 ok ) LAP LCDBITBLT LAP .LAP
5900272 cycles at 96000000Hz or 61.461ms
That's with an lcdmemlen of 1,024 since we have a 128x64 graphic LCD = 128x8 bytes = 1,024.
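The two bits of arithmetic in this post can be checked on a PC. The sketch below is plain Python, not Tachyon; the cycle count and the 96 MHz clock come from the LAP output above, and the helper names are invented for illustration.

```python
# Host-side arithmetic check (Python, not Tachyon code).
CLOCK_HZ = 96_000_000  # clock frequency reported by .LAP above


def buffer_bytes(width_px: int, height_px: int) -> int:
    """Bytes needed for a 1-bit-per-pixel frame buffer."""
    return width_px * height_px // 8


def cycles_to_ms(cycles: int, clock_hz: int = CLOCK_HZ) -> float:
    """Convert a cycle count into milliseconds."""
    return cycles * 1000 / clock_hz


if __name__ == "__main__":
    print(buffer_bytes(128, 64))              # bytes for a 128x64 display
    print(round(cycles_to_ms(5_900_272), 3))  # the V4 measurement above
```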
Anyway here's my code I used.
TACHYON
: KS0108.fth PRINT" KS0108 128x64 Graphic LCD " ;
$3FC == *lcd
$3F == lcdon
#P0 == *rs
#P1 == *lcdce
#P10 == *CS1
#P11 == *CS2
*CS1 MASK *CS2 MASK + == *BOTH
1,024 == lcdmemlen
lcdmemlen bytes lcdmem
--- Output a byte to the LCD as a character
pri LCDDAT ( ch -- )
*rs HIGH
pri WriteLCD ( data -- )
*lcd OUTCLR --- Prep data bus as outputs (all low)
2* 2* OUTSET --- and write data (set after clear)
*lcdce HIGH *lcdce LOW --- 2us pulse chip enable
;
--- Output a byte to the LCD as an instruction
pri LCDCMD ( cmd -- )
*rs LOW WriteLCD
;
pub LCDCS ( address -- ) *BOTH OUTSET $40 AND IF *CS1 LOW ELSE *CS2 LOW THEN ;
pub LCDPAGE ( address -- ) *BOTH OUTCLR 7 >> $B8 OR LCDCMD ;
pub LCDADR ( address -- ) DUP LCDPAGE DUP LCDCS $3F AND $40 OR LCDCMD ;
pub LCD! ( data address -- ) LCDADR LCDDAT ;
--- transfer buffer to LCD
--- start from upper left corner and move down to bottom right and set ch
pub LCDBITBLT ( -- ) *BOTH OUTCLR lcdon LCDCMD 0 lcdmemlen ADO lcdmem I + C@ I LCD! LOOP ;
END
I'm sure I could make it faster and simpler again. I wonder if it works?
If so then what about the optimized version that only sets the page, chip select, and address every 64 bytes since the "Y" address pointer autoincrements.
( 0037 $2DAC ok ) LAP LCDBITBLT LAP .LAP
1345520 cycles at 96000000Hz or 14.015ms
@Peter, your version is nearly working - there must be a small mistake, but I didn't take the time to find it. On Tachyon V3 the runtime of your last posted version is:
LAP LCDBITBLT LAP .LAP 110.631ms ok
Maybe I should try my version on V4 but I don't want you to get a sleepless night :-D
@proplem - I did load up V3 and got poor results but maybe with the same optimizations it will do a lot better, maybe not better than 14ms but I think I got 45ms for that method.
Scratch that, I'm down to 12ms, no tricks.
Scratch that
( 0034 $2DB2 ok ) LAP LCDBITBLT LAP .LAP
847744 cycles at 96000000Hz or 8.830ms
Wow, that's a benchmark! Although I don't know whether the LCD display stays in sync at this speed. I saw remarks on the internet that the enable signal has to be held for 4 ms to stay in sync. We'll see later.
BTW: Should I change to V4? We're having so much fun in this V3 thread :-)
I think that 4ms is more of a sync to do with game screen updates. You could have this routine running in a cog and just continually updating the LCD 100 times/second although the STN LCD is slow and probably you wouldn't need an update rate of more than 20 fps I'd say. Or you could have it update when the game requests a screen update. The controller chip itself does not need any such long timing such as 4ms, indeed the E high time is 450ns minimum.
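To put those numbers side by side, here is a rough Python calculation (not Tachyon code). The 450 ns minimum, the 96 MHz clock and the 8.830 ms blit time are taken from this thread; the function names are hypothetical, and the "load" figure simply assumes the blit is the only work done per frame.

```python
# Host-side timing budget (Python, not Tachyon code).
CLOCK_HZ = 96_000_000  # clock reported by .LAP
E_HIGH_MIN_NS = 450    # minimum E high time quoted above
BLIT_MS = 8.830        # fastest LCDBITBLT measurement in this thread


def min_cycles(ns: int, clock_hz: int = CLOCK_HZ) -> int:
    """Smallest whole number of clock cycles covering ns nanoseconds."""
    return -(-ns * clock_hz // 1_000_000_000)  # integer ceiling division


def blit_load(blit_ms: float, fps: float) -> float:
    """Fraction of one cog's time spent blitting at a given frame rate."""
    return blit_ms * fps / 1000


if __name__ == "__main__":
    print(min_cycles(E_HIGH_MIN_NS))          # clock cycles E must stay high
    print(round(1000 / BLIT_MS, 1))           # upper bound on frames per second
    print(round(blit_load(BLIT_MS, 20), 4))   # share of cog time at 20 fps
```

On these figures a 20 fps refresh would keep a cog busy for well under a quarter of the time, so the 4 ms remark looks like a frame pacing choice rather than a controller limit.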
As for V4 I wouldn't want you to lose your momentum with Tachyon but as you can see, it is definitely faster in some areas and overall.
where is the code for those faster and faster versions ?
so we can learn step by step how you did it
to improve our coding style ...
thanks
Have a look here, before they try and put Tachyon in the "Forth" sub-forum, alongside the C, Basic, and Spin sub-forums of course
TACHYON V4
: KS0108.fth PRINT" KS0108 128x64 Graphic LCD " ;
$3FC == *lcd
$3F == lcdon
#P0 == *rs
#P1 == *lcdce
#P10 == *CS1
#P11 == *CS2
*CS1 MASK *CS2 MASK + == *BOTH
1,024 == lcdmemlen
--- lcd buffer
lcdmemlen bytes lcdmem
--- Output a byte to the LCD as a character
pri LCDDAT ( ch -- )
*rs HIGH
pri WriteLCD ( data -- )
--- write data (set after clear)
*lcd OUTCLR 2* 2* OUTSET
--- min 450ns pulse chip enable
H H L
;
--- Output a byte to the LCD as an instruction
pri LCDCMD ( cmd -- )
*rs LOW WriteLCD
;
pub LCDCS ( address -- ) *BOTH OUTSET $40 AND IF *CS1 LOW ELSE *CS2 LOW THEN ;
pub LCDPAGE ( address -- ) *BOTH OUTCLR 7 >> $B8 OR LCDCMD ;
pub LCDADR ( address -- ) DUP LCDPAGE DUP LCDCS $3F AND $40 OR LCDCMD ;
--- transfer buffer to LCD
--- start from upper left corner and move down to bottom right and set ch
pub LCDBITBLT ( -- ) *lcdce PIN L *BOTH OUTCLR lcdon LCDCMD 0 lcdmemlen ADO I LCDADR *rs HIGH I lcdmem + 64 ADO I C@ WriteLCD LOOP 64 +LOOP ;
{
Optimize BitBlt by setting address once every 64 columns as the KS0108 6-bit "Y" address autoincrements
Also during writing 64 bytes of data skip having to set *rs HIGH, just select it at the start of the loop
Reduce write pulse to KS0108
( 0034 $2DB2 ok ) LAP LCDBITBLT LAP .LAP
847744 cycles at 96000000Hz or 8.830ms
}
END
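A rough way to see where part of the speed-up comes from is to count bus writes. The Python model below is only a sketch of the two loop shapes, not Tachyon code: command values follow the listing above ($3F display on, $B8 set page, $40 set column), chip select handling is left out because it is pin toggling rather than a bus write, and the trace format is invented for this example.

```python
# Host-side model (Python) comparing per-byte and per-block addressing.
DISPLAY_ON = 0x3F   # lcdon in the listing above
SET_PAGE = 0xB8     # "set page" command base
SET_COLUMN = 0x40   # "set column" command base
BLOCK = 64          # columns per controller half; the column pointer autoincrements


def blit_per_byte(buf: bytes) -> list:
    """Set page and column before every single data byte."""
    trace = [("cmd", DISPLAY_ON)]
    for addr, value in enumerate(buf):
        trace.append(("cmd", SET_PAGE | (addr >> 7)))
        trace.append(("cmd", SET_COLUMN | (addr & 0x3F)))
        trace.append(("dat", value))
    return trace


def blit_per_block(buf: bytes) -> list:
    """Set page and column once per 64-byte block, then stream the data."""
    trace = [("cmd", DISPLAY_ON)]
    for base in range(0, len(buf), BLOCK):
        trace.append(("cmd", SET_PAGE | (base >> 7)))
        trace.append(("cmd", SET_COLUMN | (base & 0x3F)))
        trace.extend(("dat", value) for value in buf[base:base + BLOCK])
    return trace


if __name__ == "__main__":
    frame = bytes(1024)
    print(len(blit_per_byte(frame)), len(blit_per_block(frame)))  # prints 3073 1057
```

The count drops by roughly a factor of three, which is less than the measured improvement; the comment block in the listing also credits skipping the repeated *rs HIGH and the shorter write pulse.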
as you are the current LCDBITBLT race leader (I never give up - some races are won in the last lap :-) I would like to slow you a little bit down.
You mentioned already having implemented some graphics routines to draw into a buffer and pointed to VGA.FTH. This is color graphics but the LameStation Display is blue and white.
Would you recommend using these color graphics routines, or should I implement some blue and white ones from scratch? (I could copy from yours.)
thanks @Peter, I had the 'same' idea of factoring around fast 64 byte block moves - so it feels good to see the code. It would have taken me quite a while to write it ...
So now on V4 you can even use your IC@ to speed up more :-) ...
blue - white is ok for you ... you don't want the fancy half blue ??
then a black&white approach might be sufficient.
I haven't looked into the VGA code recently, but the structure should be easily usable.
Oh yes, you're right - half blue - I saw that it exists but didn't remember. So the decision is made. Full color support makes sense. I will dive into VGA.FTH ...
Peter,
Are you sure there is a Propeller underneath Tachyon? Seems too good to be true
I just keep getting the distinct feeling that Parallax and the forum in general have a disdain for Forth. I mean, they keep comparing it with other "languages", as if an operating-system/compiler/interpreter running on the basic Propeller itself could be compared with "just a compiler" that runs on a PC. But then again, many have "tried Forth" and have formed and fixed their opinions. Not all though, I must say.
I remember when Cliff Biffle wrote a Forth for the Propeller, the Parallax guys were all excited and wanted to go meet him. But that was just a Forth, buggy and slow, and not even a rudimentary file system or proper serial communications. What could you do with it? Not much at all. I tried.
Tachyon is a practical solution for the Propeller, leveraging its strengths, and compensating for its weaknesses, then throwing in a whole bunch of absolutely useful stuff that you'd never thought you could do on "just a Propeller" (think here of ignorant disdain from the ARM/AVR/PIC etc crowd). If there existed a "language" that could showcase the Propeller, to have it show off all its tricks, even all at the same time, wouldn't Parallax be interested, at least just to show what can be done? (long silence here).
Maybe you're right Ray, it seems too good to be true.
OK, it is a pity! What is Tachyon missing? Personally I'm very unhappy not to manage it to switch over to Tachyon. I miss a guy, who can pull people deep into Tachyon, so they find no way out and never want to do so. How to make a Master Plan?
I hope you're including me in the "not all though" part. I've always been intrigued by Forth and now Tachyon given how much you can do with it. I follow the Tachyon threads and try to parse the examples but still haven't found a big block of time I can spend getting up to speed to a point where I can do non-trivial things. However, I can't see how you could argue against it being a big asset to the Propeller. I don't know of any other language that will let you do as much.
Absolutely, you are the first guy I thought of and I didn't want to include you with these general comments, hence the "not all though".
Now we discussed calling it something different in the past, like Tachyon Basic, or maybe even BlockyForth, perhaps taChyon++
Maybe it's the German in me that says "if this is the best way to do something, then why aren't I doing it that way". I think plain simple Spin will always be one of the best ways of getting into the Prop and it's sad to see that this nice simple solution is being pushed aside for something more "popular". The saddest part though is that these popular tools don't make the Propeller work better, leaner, or meaner; they simply show up how inadequate the Propeller is (for those tools).
I'm taking my first steps into VGA.FTH. I modified vgapars according to my LCD and now I want to PLOT a pixel, and I wonder about the PLOT definition:
pub PLOT IF bigch ELSE BL THEN VCHAR ;
How should this set a pixel into a memory bitmap?
How do I set just one pixel?
Remember I haven't tried VGA.FTH yet (no VGA monitor, and I'm working on a notebook). LameStation has no VGA and I'm in flow. Should I change to another board with VGA and do my first steps there, or is it possible just to take the code to do bitmap graphics on my LCD display?
Maybe VGA.FTH isn't the best one to use come to think of it. I do have a really old 128x64 LCD demo but that drew directly to the LCD, which you could also do and forgo the buffer. The same goes for the QVGA TFT driver, but one of the first things I did with Tachyon was run VGA graphics. It's old original Tachyon code <here> but it should give you a good head start. There are also early YouTube videos I did at the time too.
Peter,
What you have achieved is simply amazing. I am in awe of what you can do so simply, quickly and efficiently.
I follow your thread but I cannot understand your code one liners. It's not something I can just see, nor wrap my head around. I've tried a few times. Perhaps my old brain cannot be retrained anymore.
Anyway, keep up the excellent work. I know you are using it in your big project so it's not going to waste.
OK, it is a pity! What is Tachyon missing? Personally I'm very unhappy not to manage it to switch over to Tachyon. I miss a guy, who can pull people deep into Tachyon, so they find no way out and never want to do so. How to make a Master Plan?
YOU are THE guy ;-)
the only guy to pull you in ...
The problem is that Forth is a steep learning curve for people that have already learned to program in other languages.
The great strengths of Forth are that it's (relatively) easy to get going on new hardware, and that it results in amazingly compact code. The fact that it's so compact meant that it was very popular back in the 80s when memory was scarce and expensive but as memory (on most systems) became bigger and cheaper Forth tended to lose more and more of its share of the 'market'.
On the propeller we don't have much RAM - there are ways of adding external RAM but they all suffer from one or more of the flaws of being slow, expensive, or using up a lot of the available I/O pins. Even if there was a very fast, very cheap way of adding lots of RAM that didn't use many pins it would still have the drawback of being non-standard for the propeller. So for the propeller, Forth is a great fit - especially Peter's Tachyon flavour of Forth. It's not just a language - it's a compiler and operating system too - all running on the usual prop hardware without RAM expansion. Applications written in Forth run faster and are much smaller than those written in Spin - they're often smaller even than the same application written in assembler - unless that hand-written assembler is optimized very carefully.
There's still the big learning curve to get over for newcomers - but dive in, stick with it, and eventually you develop the Forth way of thinking. It's a fun thing to do - and fun is the reason most of us are playing with the Propeller in the first place.
Thanks @Peter, I'm studying VGA DEMO.FTH.
To get it to compile and load I have changed some things. My objective is to draw something into memory and LCDBITBLT this to the LCD device.
0. Some few words about the structure of the bitmap (How are the bits organized?)
One linear bytestream RRGGBBxx | RRGGBBxx | RRGGBBxx | ... ?
1. I made some minor changes to be able to compile (WVARIABLE -> WORD, etc.) things where I know what I do
2. 'pixels' buffer : I introduced 'BUFLEN', allocated bytes for 'pixels', introduced 'colors' like this
3. Here I need help :
I replaced 'FONT' with a dummy because I don't know what it does
pub FONT ;
4. Here I need help :
I replaced '=PIXELS'
pub =PIXELS ;
5. VGAINIT is still commented out, as is the last line
PS: still not tried VGA.FTH in hardware - I know that it's good but I want to use it for the LameStation
PPS: please don't spend too much time - I would just appreciate some tips
Thanks in advance, proplem
@Proplem
LCDBITBLT uses a 1-bit bitmap to send to the LCD.
Why don't you start simple with a 1-bit memory bitmap?
If / when it works, make it more complicated.
pub DRAWPIX ( x y 0/1 .. ) ...
locate the byte \
merge in the bit \ OR / ANDN ...
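The DRAWPIX outline above can be tried out on a PC first. The following is a minimal Python sketch, not Tachyon code, under two assumptions that should be checked against the real display: the buffer is laid out as 8 pages of 128 bytes (matching the address split used by LCDBITBLT in this thread), and bit 0 of each byte is the top pixel of its 8-pixel column. The names `drawpix`, `WIDTH` and `HEIGHT` are made up for this example.

```python
# Host-side model (Python) of a 1-bit DRAWPIX into a linear page buffer.
WIDTH, HEIGHT = 128, 64
lcdmem = bytearray(WIDTH * HEIGHT // 8)


def drawpix(x: int, y: int, on: int, buf: bytearray = lcdmem) -> None:
    """Set (on != 0) or clear (on == 0) the pixel at column x, row y."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        return                          # ignore off-screen pixels
    index = (y >> 3) * WIDTH + x        # locate the byte: page * 128 + column
    mask = 1 << (y & 7)                 # bit inside that byte (assumed LSB at top)
    if on:
        buf[index] |= mask              # merge in the bit: OR
    else:
        buf[index] &= ~mask & 0xFF      # clear the bit: ANDN


if __name__ == "__main__":
    for x in range(WIDTH):              # draw a diagonal across the top half
        drawpix(x, x // 2, 1)
    print(sum(bin(b).count("1") for b in lcdmem), "pixels set")
```

In Forth the same two steps map onto a shift and multiply to find the byte, followed by a C@ ... OR ... C! (or ANDN) on that address.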
@Peter, your version is nearly working - there must be a small mistake , I didn't take time to find it. But on Tachyon V3 runtime of your last posted version is:
Maybe I should try my version on V4 but I don't want you to get a sleepless night :-D
Scratch that, I'm down to 12ms, no tricks.
Scratch that
Wow, that's a benchmark! Although I don't know wether the LCD display is in sync with this speed. I saw remarks in the internet that enable signal has to be hold 4 ms to stay in sync. We'll see later.
BTW: Should I change to V4? We're having so much fun in this V3 thread :-)
I think that 4ms is more of a sync to do with game screen updates. You could have this routine running in a cog and just continually updating the LCD 100 times/second although the STN LCD is slow and probably you wouldn't need an update rate of more than 20 fps I'd say. Or you could have it update when the game requests a screen update. The controller chip itself does not need any such long timing such as 4ms, indeed the E high time is 450ns minimum.
As for V4 I wouldn't want you to lose your momentum with Tachyon but as you can see, it is definitely faster in some areas and overall.
where is the code for those faster and faster versions ?
so we can learn step by step how you did it
to improve our coding style ...
thanks
Have a look here, before they try and put Tachyon in the "Forth" sub-forum, alongside the C, Basic, and Spin sub-forums of course
Are you sure there is a Propeller underneath Tachyon? Seems too good to be true
As you are the current LCDBITBLT race leader (I never give up - some races are won in the last lap :-) I would like to slow you down a little bit.
You mentioned that you have already implemented some graphics routines that draw into a buffer, and pointed to VGA.FTH. That is color graphics, but the LameStation display is blue and white.
Would you recommend using these color graphics routines, or should I implement blue and white ones from scratch? (I could copy from yours.)
Thanks @Peter, I had the 'same' idea of factoring around fast 64 byte block moves - so it feels good to see the code. It would have taken me quite a while to write it ...
So now on V4 you can even use your IC@ to speed up more :-) ...
then a black&white approach might be sufficient.
I haven't looked into the VGA code recently, but the structure should be easily usable.
Oh yes, you're right - half blue - I saw that it exists but didn't remember. So the decision is made. Full color support makes sense. I will dive into VGA.FTH ...
Thank you MJB
I just keep getting the distinct feeling that Parallax and the forum in general have a disdain for Forth. I mean, they keep comparing it with other "languages", as if an operating-system/compiler/interpreter running on the basic Propeller itself could be compared with "just a compiler" that runs on a PC. But then again, many have "tried Forth" and have formed and fixed their opinions. Not all, though, I must say.
I remember when Cliff Biffle wrote a Forth for the Propeller, the Parallax guys were all excited and wanted to go meet him. But that was just a Forth, buggy and slow, and not even a rudimentary file system or proper serial communications. What could you do with it? Not much at all. I tried.
Tachyon is a practical solution for the Propeller, leveraging its strengths, and compensating for its weaknesses, then throwing in a whole bunch of absolutely useful stuff that you'd never thought you could do on "just a Propeller" (think here of ignorant disdain from the ARM/AVR/PIC etc crowd). If there existed a "language" that could showcase the Propeller, to have it show off all its tricks, even all at the same time, wouldn't Parallax be interested, at least just to show what can be done? (long silence here).
Maybe you're right Ray, it seems too good to be true.
Absolutely, you are the first guy I thought of and I didn't want to include you with these general comments, hence the "not all though".
Now we discussed calling it something different in the past, like Tachyon Basic, or maybe even BlockyForth, perhaps taChyon++
Maybe it's the German in me that says "if this is the best way to do something, then why aren't I doing it that way". I think plain simple Spin will always be one of the best ways of getting into the Prop and it's sad to see that this nice simple solution is being pushed aside for something more "popular". The saddest part though is that these popular tools don't make the Propeller work better, leaner, or meaner; they simply show up how inadequate the Propeller is (for those tools).
How should this set a pixel into a memory bitmap?
How do I set just one pixel?
Remember I haven't tried VGA.FTH yet (no VGA monitor, and I'm working on a notebook). LameStation has no VGA and I'm in flow. Should I change to another board with VGA and do the first steps there, or is it possible to just take the code to do bitmap graphics on my LCD display?
Maybe VGA.FTH isn't the best one to use, come to think of it. I do have a really old 128x64 LCD demo but that drew directly to the LCD, which you could also do and forgo the buffer. The same goes for the QVGA TFT driver, but one of the first things I did with Tachyon was run VGA graphics. It's old original Tachyon code <here> but it should give you a good headstart. There are also early youtube videos I did at the time too.
What you have achieved is simply amazing. I am in awe of what you can do so simply, quickly and efficiently.
I follow your thread but I cannot understand your code one liners. It's not something I can just see, nor wrap my head around. I've tried a few times. Perhaps my old brain cannot be retrained anymore.
Anyway, keep up the excellent work. I know you are using it in your big project so it's not going to waste.
YOU are THE guy ;-)
the only guy to pull you in ...
The great strengths of Forth are that it's (relatively) easy to get going on new hardware, and that it results in amazingly compact code. The fact that it's so compact meant that it was very popular back in the 80s when memory was scarce and expensive but as memory (on most systems) became bigger and cheaper Forth tended to lose more and more of its share of the 'market'.
On the Propeller we don't have much RAM - there are ways of adding external RAM but they all suffer from one or more of the flaws of being slow, expensive, or using up a lot of the available I/O pins. Even if there were a very fast, very cheap way of adding lots of RAM that didn't use many pins, it would still have the drawback of being non-standard for the Propeller. So for the Propeller, Forth is a great fit - especially Peter's Tachyon flavour of Forth. It's not just a language - it's a compiler and operating system too - all running on the usual Prop hardware without RAM expansion. Applications written in Forth run faster and are much smaller than those written in Spin - they're often smaller even than the same application written in assembler - unless that hand-written assembler is optimized very carefully.
There's still the big learning curve to get over for newcomers - but dive in, stick with it, and eventually you develop the Forth way of thinking. It's a fun thing to do - and fun is the reason most of us are playing with the Propeller in the first place.
Thanks @Peter, I'm studying VGA DEMO.FTH.
To get it to compile and load I have changed some things. My objective is to draw something into memory and LCDBITBLT this to the LCD device.
0. A few words about the structure of the bitmap, please (how are the bits organized?)
One linear bytestream RRGGBBxx | RRGGBBxx | RRGGBBxx | ... ?
1. I made some minor changes to be able to compile (WVARIABLE -> WORD, etc.), things where I know what I'm doing
2. 'pixels' buffer: I introduced 'BUFLEN', allocated bytes for 'pixels', and introduced 'colors' like this
3. Here I need help:
I replaced 'FONT' with a dummy because I don't know what it does
4. Here I need help:
I replaced '=PIXELS'
5. VGAINIT is still commented out, as is the last line
PS: still not tried VGA.FTH in hardware - I know that it's good but I want to use it for the LameStation
PPS: please don't spend too much time - I would just appreciate some tips
Thanks in advance, proplem
@Proplem
LCDBITBLT uses a 1-bit bitmap to send to the LCD.
Why don't you start simple with a 1-bit memory bitmap?
If / when it works, make it more complicated.
pub DRAWPIX ( x y 0/1 -- ) ...
\ locate the byte
\ merge in the bit: OR to set / ANDN to clear ...
;
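One way to fill in that outline, as a sketch in standard Forth (runnable on a host Forth such as gforth). The buffer layout is an assumption, since the diagram did not survive in this thread: each byte holds 8 vertically stacked pixels with the least significant bit on top, and a page is 128 bytes. PIXADR is a hypothetical helper name. In Tachyon the definitions would presumably start with pub and could use >> and ANDN in place of RSHIFT and INVERT AND.

```forth
\ 1-bit pixel drawing into a linear buffer; layout is assumed, see above.
128  CONSTANT lcdwidth                \ assumed panel width in pixels
1024 CONSTANT lcdmemlen               \ assumed: 8 pages x 128 columns
CREATE lcdmem lcdmemlen ALLOT  lcdmem lcdmemlen ERASE

: PIXADR ( x y -- mask addr )         \ locate the byte and the bit inside it
  DUP 7 AND 1 SWAP LSHIFT             \ x y mask    mask = 1 << (y mod 8)
  ROT ROT                             \ mask x y
  3 RSHIFT lcdwidth * + lcdmem + ;    \ mask addr   addr = lcdmem + (y/8)*128 + x

: DRAWPIX ( x y 0/1 -- )
  >R PIXADR                           \ mask addr
  R> IF   DUP C@ ROT OR         SWAP C!    \ set: OR the mask in
     ELSE DUP C@ ROT INVERT AND SWAP C!    \ clear: AND with the inverted mask
     THEN ;

3 10 1 DRAWPIX  lcdmem 131 + C@ .     \ prints 4 (page 1, column 3, bit 2)
3 10 0 DRAWPIX  lcdmem 131 + C@ .     \ prints 0
```

Only one byte is read, modified, and written back per pixel, so the neighbouring seven pixels in that byte are left alone.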