Prop2 Texture Mapping — Parallax Forums

Prop2 Texture Mapping

cgracey Posts: 14,151
edited 2013-03-27 17:25 in Propeller 2
I got the texture mapping documented. It will undergo some refinement in the coming days, but all the information is there. There's also an example program:

Prop2_Docs.zip

I would probably never have known how to formulate a simple texture mapper, but Andre LaMothe spent a lot of time explaining it to me, and then Roy Eltham came around later and added pixel-based alpha blending and mirroring.

If any of you guys are very versed in 3D graphics, there should be all the info you need here to implement a 3D rendering engine.

Comments

  • Bill Henning Posts: 6,445
    edited 2013-03-13 19:04
    Thanks for the updated docs!

    The texture mapping sounds great.
  • Sapieha Posts: 2,964
    edited 2013-03-13 19:19
    Hi Chip.

    Thanks.

    Do you maybe have values to change NTSC to PAL?


  • Tubular Posts: 4,702
    edited 2013-03-13 19:25
    Wow!

    Thanks for the timely example. I've been wrestling with mode F and trying to get the data into the stack ram in between waitvids, and this example shows exactly how to do it. I had been going down a multi threaded path but this is much better. Very nice.
  • cgracey Posts: 14,151
    edited 2013-03-13 19:59
    Sapieha wrote: »
    Hi Chip.

    Thanks.

    Do you maybe have values to change NTSC to PAL?

    I'll make a VGA example, since that will be sharpest and work everywhere.
  • potatohead Posts: 10,261
    edited 2013-03-13 20:06
    While you wait, it's also possible to take the changes I made to get component NTSC and run this example unchanged otherwise. (I believe anyway... I'm about to do just that.)

    http://forums.parallax.com/attachment.php?attachmentid=99833&d=1362968699

    Let's just say I'm a fan of the lower sweep rates. :)
  • potatohead Posts: 10,261
    edited 2013-03-13 20:10
    WOW, BTW.

    Seems to me, something like WOLF3D can be done in hardware now, not a draw list like Baggers did. I just read that again and it sounds crappy. I don't mean that. The code Baggers posted is awesome for P1. Not supposed to be possible.
  • cgracey Posts: 14,151
    edited 2013-03-13 20:17
    potatohead wrote: »
    While you wait, it's also possible to take the changes I made to get component NTSC and run this example unchanged otherwise. (I believe anyway... I'm about to do just that.)

    http://forums.parallax.com/attachment.php?attachmentid=99833&d=1362968699


    You probably could get that texture demo working on VGA, unless, of course, it's impossible.
  • Bob Lawrence (VE1RLL) Posts: 1,720
    edited 2013-03-13 20:23
    Thanks Chip. More fun.
    :cool:

    Here are the results from my test:

    Displayed on a Hisense LED LCD TV, model H32K38E.

    (From left to right) Picture 1: NTSC 256 x 192, luma/color bars. Picture 2: NTSC 256 x 192, texture (the edges are actually square; the camera angle makes it look squished in on the bottom sides).
    Prop2C.jpg (1024 x 768, 63K)
    Prop2T.jpg (1024 x 768, 111K)
  • potatohead Posts: 10,261
    edited 2013-03-13 22:03
    Here it is for NTSC component: I need to get some serious time to play with this. :) I've no VGA display here at the moment.

    FWIW, I like NTSC component, not just for the nice, slow sweeps where there is the max time to do things, but also for the portability. I can set up an NTSC composite or S-video display and run that up to 640x400 or so. The composite / S-video will not render color at that detail, and pixels will be lost, etc... but overall the display works just fine, meaning I can capture it, work on the go, whatever. When I get to a better quality display, I switch to the component and work at full detail.

    This is darn cool. Other display mappings require different sweep timings which will eventually impact tighter code. At least with these two, it really is just a color space mapping and pin setup, little else. Just an FYI as to why I went this way right away. I can get full color resolution on the component, and a reasonable pixel density with no meaningful code changes. Nice. The colorburst and such are present in the signal, but ignored by the Y input, FYI.

    Edit: Just thought of something. It's highly likely that PAL-compatible sets with component inputs will display this just fine. Anyone have a device to test? If they won't, I'm thinking a simple change to 50 Hz will fix that, and it will render just fine. If so, many American sets will display 50 Hz signals, and they do so even when formatted NTSC. Some computers were capable of this; two I can think of were the C= Amiga and Tandy Color Computer 3, both able to output 50/60 Hz NTSC, or PAL, depending on where they were made and the software options selected.

    A prop could output 50/60 Hz component signals and that signal might just display anywhere there are component inputs. If so, that's a near universal 640x200 or 400-420 line interlaced display.
  • Baggers Posts: 3,019
    edited 2013-03-19 02:21
    Thanks for the reference potatohead :D and don't worry, I knew what you meant haha

    Anyway, thought I'd give a heads up, as I (very soon) will be a DE2 owner :D and will be able to start joining in with the fun.

    As for Wolfenstein, that could be a good starting point :)

    I also live in PAL world, and also have component TVs so will be able to test this for you, once my boards arrive and are set up!

    PS, it's great to be back!

    Cheers,
    Jim.
  • potatohead Posts: 10,261
    edited 2013-03-19 11:28
    Good times ahead! The new video system is fast and capable, even at a mere 60 MHz!

    Cheers, and likewise Baggers.
  • Baggers Posts: 3,019
    edited 2013-03-19 15:01
    Yeah, I'm looking forward to having a good play with it :D
  • Rayman Posts: 14,633
    edited 2013-03-19 17:32
    I'm curious to see how texture mapping can be applied for 3d graphics here...

    I've always thought that 3d graphics was mostly all about polygon rendering.
    Texture mapping onto the polygons is the next step up in quality from rendering with solid, possibly shaded colors.
  • Roy Eltham Posts: 3,000
    edited 2013-03-19 21:15
    Rayman,
    What the texture mapper is doing is one step of an inner loop to a span rasterizer. When you do 3D graphics it all boils down to spans of pixels that comprise a triangle (usually) or polygon.

    You will still need to do all the math and setup for a triangle/polygon, and then walk the edges. For each step along the edge(s) you would set up the texture mapper registers, and then loop across the screen pixels for the span.
    You could also walk the edges using these instructions, but I'm not sure if it's a win, because you would need to do a pass walking the edges and saving the values, then loop back over those values again and run the spans.

    Roy
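The per-span step Roy describes can be sketched in Python (not Prop2 assembly) as a rough model. This is only an illustration under assumptions: `span_params` and `render_span` are hypothetical names, the texture coordinates are treated as 8.8 fixed-point values like the 16-bit U/V parameters in the docs, Z is assumed to stay at 0 so the texture coordinate is simply the high byte, and `sample` stands in for whatever fetches a texture pixel.

```python
def span_params(left, right, width):
    """Return (start, delta) as 16-bit 8.8 fixed-point values for one span.

    left/right are integer texture coordinates (0..255) at the two edges.
    A negative delta wraps to its 16-bit two's-complement form.
    """
    start = (left << 8) & 0xFFFF
    delta = (((right - left) << 8) // width) & 0xFFFF
    return start, delta


def render_span(x_left, x_right, u_left, u_right, v_left, v_right, sample):
    """Step across one scanline span, sampling the texture at each pixel."""
    width = x_right - x_left
    if width <= 0:
        return []
    # "Set up the texture mapper registers" for this span.
    u, du = span_params(u_left, u_right, width)
    v, dv = span_params(v_left, v_right, width)
    out = []
    # "Loop across the screen pixels for the span."
    for x in range(x_left, x_right):
        out.append((x, sample(u >> 8, v >> 8)))
        u = (u + du) & 0xFFFF
        v = (v + dv) & 0xFFFF
    return out
```

The edge walking itself would supply `x_left`, `x_right` and the edge texture coordinates for each scanline; it is left out here to keep the sketch short.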
  • cgracey Posts: 14,151
    edited 2013-03-19 22:30
    Roy Eltham wrote: »
    Rayman,
    What the texture mapper is doing is one step of an inner loop to a span rasterizer. When you do 3D graphics it all boils down to spans of pixels that comprise a triangle (usually) or polygon.

    You will still need to do all the math and setup for a triangle/polygon, and then walk the edges. For each step along the edge(s) you would set up the texture mapper registers, and then loop across the screen pixels for the span.
    You could also walk the edges using these instructions, but I'm not sure if it's a win, because you would need to do a pass walking the edges and saving the values, then loop back over those values again and run the spans.

    Roy

    Roy, did the texture mapping docs make complete sense to you? (I ask Roy because he knows this stuff inside and out.)
  • Pharseid380 Posts: 26
    edited 2013-03-19 22:37
    I wrote a 3D routine back in the '90s. After it transformed, projected and clipped a polygon, it would go through the vertices of the polygon pairwise, using Bresenham's algorithm to fill in values in an array. The array had as many entries in one direction as the height of the screen and 2 entries in the other, which among other things would hold the starting and ending x values for that raster line (so the y values were implicit in the position in the array).

    It didn't do texture mapping, but each entry was a structure which also contained the r,g,b color of the surface and a value for lighting. Oh yes, and z values to do the z-buffer algorithm (although I also had a depth-sort version). If it had been upgraded for texture mapping, it would contain starting and ending texture coordinates for a line and starting and ending light values. The point being, to fill in this array you're just doing a lot of interpolating between vertex values: interpolating x and y values, texture coordinates and lighting values.

    Once the array was "filled" (in the general case the whole array wouldn't necessarily be filled, so I would note the highest and lowest y values), it would go through the appropriate entries of the array and successively draw each line. It didn't look that bad, although it was pretty slow on the computers of that time. On a Prop ][ you could probably crank polygons out at a pretty good rate: given that you could have different cogs doing different steps in the rendering pipeline, you could just add cogs until you got the performance you needed.
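A rough Python sketch of the per-scanline array described above might look like the following. It is an illustration only: `build_span_table` is a hypothetical name, the polygon is assumed to be convex with integer screen coordinates, only the x extents are stored (no color, lighting, z or texture values), and plain integer linear interpolation is used along each edge in place of Bresenham's algorithm.

```python
def build_span_table(vertices, screen_height):
    """Fill a per-row table of [x_start, x_end] for a convex polygon.

    vertices is a list of (x, y) integer points in order around the polygon.
    Returns (table, y_lo, y_hi); rows the polygon does not touch stay None.
    """
    table = [None] * screen_height
    y_lo, y_hi = screen_height, -1
    n = len(vertices)
    for i in range(n):
        # Go through the vertices pairwise, one edge at a time.
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        if y0 > y1:
            x0, y0, x1, y1 = x1, y1, x0, y0
        steps = y1 - y0
        for y in range(y0, y1 + 1):
            if not 0 <= y < screen_height:
                continue
            if steps == 0:
                xs = (x0, x1)          # horizontal edge: record both ends
            else:
                xs = (x0 + (x1 - x0) * (y - y0) // steps,)
            for x in xs:
                if table[y] is None:
                    table[y] = [x, x]
                else:
                    table[y][0] = min(table[y][0], x)
                    table[y][1] = max(table[y][1], x)
            # Note the lowest and highest rows actually filled.
            y_lo, y_hi = min(y_lo, y), max(y_hi, y)
    return table, y_lo, y_hi
```

Drawing then only needs to visit rows `y_lo` through `y_hi` and fill from `x_start` to `x_end` on each.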
  • Roy Eltham Posts: 3,000
    edited 2013-03-20 02:17
    Yes, Chip, They made sense to me.

    Also, I keep forgetting that the registers aren't all mapped, so you can't use these to walk the edges, since you can't read back the values to "save" them at each edge step. So doing all the math for the edge walking is going to eat available cog memory. You'll probably want to use another cog for that, and just feed the rendering cog with the values.
  • cgracey Posts: 14,151
    edited 2013-03-20 06:24
    Roy Eltham wrote: »
    Yes, Chip, They made sense to me.

    Also, I keep forgetting that the registers aren't all mapped, so you can't use these to walk the edges, since you can't read back the values to "save" them at each edge step. So doing all the math for the edge walking is going to eat available cog memory. You'll probably want to use another cog for that, and just feed the rendering cog with the values.

    Because you can't read them back, you'll have to compute terminal=initial+delta*steps on your own. That's three instructions per parameter.
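As a small illustration of that formula, here is a Python model (not Prop2 code). It assumes the 16-bit parameters simply wrap on overflow, which the docs do not state explicitly; `terminal_value` and `stepped_value` are hypothetical names, the second one being a slow reference that adds the delta once per pixel.

```python
MASK16 = 0xFFFF  # parameters and deltas are 16-bit values per the docs


def terminal_value(initial, delta, steps):
    """terminal = initial + delta * steps, kept to 16 bits (wrap is assumed)."""
    return (initial + delta * steps) & MASK16


def stepped_value(initial, delta, steps):
    """Reference: add the delta once per step, as each GETPIX would."""
    value = initial
    for _ in range(steps):
        value = (value + delta) & MASK16
    return value
```

A negative delta is represented here in 16-bit two's complement, so `0xFF00` acts as -256.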
  • Roy Eltham Posts: 3,000
    edited 2013-03-20 10:30
    For a single triangle we'll have 3 sets of X,Y,Z,U,V,R,G,B,A data. We'll need to calculate 2 or 3 sets of deltas for the edge(s) depending on the triangle orientation. Then we'll need 2 sets of X,Z,U,V,R,G,B,A data plus a shared Y for both as the current values as we walk down the edges. Those current values will be used to calculate the values to load into the setpix_ instructions. (9*6)+(8*2)+1 = 71 longs. The largest single texture we can use takes 256 longs. So we have like 179 longs available for code and any other data needed.

    Roy
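To make "the values to load into the setpix_ instructions" concrete, here is a hedged Python sketch that packs operands according to the texture mapper docs posted in this thread. The function names are hypothetical, and the bit positions are an assumption: they are derived by reading the `%UUU_VVV_PP_W_H_V_xxxx_AAAAAAAA_RRRRRRRRR` pattern with the leftmost field as the most significant bits.

```python
def pack_setpix(width_code, height_code, pixel_size_code, word_mode,
                h_mirror, v_mirror, stack_base, reg_base):
    """Build a 32-bit SETPIX configuration value (assumed bit layout).

    width_code/height_code: %000..%111 for 1..128 pixels
    pixel_size_code:        %00..%11 for 1/2/4/8-bit texture pixels
    word_mode, h_mirror, v_mirror: single-bit flags
    stack_base: 8-bit stack RAM address, reg_base: 9-bit register address
    """
    assert 0 <= width_code < 8 and 0 <= height_code < 8
    assert 0 <= pixel_size_code < 4
    assert word_mode in (0, 1) and h_mirror in (0, 1) and v_mirror in (0, 1)
    assert 0 <= stack_base < 256 and 0 <= reg_base < 512
    return (width_code << 29 | height_code << 26 | pixel_size_code << 24
            | word_mode << 23 | h_mirror << 22 | v_mirror << 21
            | stack_base << 9 | reg_base)


def pack_param(value, delta):
    """Operand for SETPIXU/V/Z/A/R/G/B: value in low word, delta in high word."""
    return ((delta & 0xFFFF) << 16) | (value & 0xFFFF)
```

For example, a 32x8 texture of 8-bit pixels stored from register $100 with long-sized palette entries at stack address 0 would be `pack_setpix(5, 3, 3, 0, 0, 0, 0, 0x100)`.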
  • Rayman Posts: 14,633
    edited 2013-03-20 11:36
    Ok, sounds like you guys have some plans on how to do all this. Glad to hear that.

    BTW: Another way to do 3D is raytracing... It'd be interesting to see if Prop2 could do real-time raytracing at some low resolution...
  • Roy Eltham Posts: 3,000
    edited 2013-03-20 12:35
    Ray,
    Re: realtime ray tracing:

    Seriously doubt it could be anything near realtime (even at very low res). Maybe if your scene was ultra simple (like no mesh data, just planes and spheres, and no textures), static, the resolution was extremely low (16x16?), and you only did like 1 maybe 2 ray bounces.

    Modern GPUs barely achieve it with reasonable scenes in medium resolution, and they are using massively parallel processing (on the order of 512 to 2048 cores, resulting in a 2-4 TFLOP (TeraFLOP) throughput).

    However, I bet we could get a Ray Tracer working that would produce a pleasing image in a few seconds.
  • Rayman Posts: 14,633
    edited 2013-03-20 12:50
    You're probably right... :( Still, maybe there's some cordic magic that helps. Plus, you can buy a lot of Prop chips for the price of 1 modern GPU...
  • Roy Eltham Posts: 3,000
    edited 2013-03-20 14:20
    It also depends on what you mean by realtime. Some people would accept 5-10 frames per second as real time. For me it needs to be at least 30, but preferably 60 frames per second, so that means you have 33.33 or 16.66 ms, respectively, to render the complete frame.
  • Cluso99 Posts: 18,069
    edited 2013-03-20 16:11
    Roy: I see a "stack" of P2's in your future ;)

    This is some serious stuff here. Sorry, I have never done anything like this, so I am just enjoying reading what you guys are up to :)
  • cgracey Posts: 14,151
    edited 2013-03-20 18:06
    Here are some slightly updated TEXTURE MAPPER docs:
    TEXTURE MAPPER
    --------------
    
    Each cog has a texture mapper (PIX) which can sequentially navigate a rectangular 2D texture
    map with Z-perspective correction to locate a texture pixel, translate that texture pixel into
    A:R:G:B (Alpha:Red:Green:Blue) pixel data, perform discrete scaling on those A:R:G:B components,
    and then alpha-blend the resulting pixel with another pixel for multi-layered 3D effects.
    
    A texture map is stored in register RAM as a sequence of 1/2/4/8-bit texture pixels which build
    from the bottom bits of an initial register, upward, then into subsequent registers. They are
    ordered, in contiguous sequence, from top-left to top-right down to bottom-left to bottom-right.
    These texture pixels get used as offsets into stack RAM to look up A:R:G:B pixel data. Texture
    map width and height are individually settable to 1/2/4/8/16/32/64/128 pixel(s).
    
    The SETPIX instruction is used to configure PIX:
    
        SETPIX  D/#n    - Set PIX configuration to %UUU_VVV_PP_W_H_V_xxxx_AAAAAAAA_RRRRRRRRR
    
              %UUU = texture map width, %VVV = texture map height
    
                     %000 =   1 pixel
                     %001 =   2 pixels
                     %010 =   4 pixels
                     %011 =   8 pixels
                     %100 =  16 pixels
                     %101 =  32 pixels
                     %110 =  64 pixels
                     %111 = 128 pixels
    
               %PP = texture pixel size
    
                     %00 = 1 bit
                     %01 = 2 bits
                     %10 = 4 bits
                     %11 = 8 bits
    
                %W = stack RAM pixel data offset/size
    
                     %0 = long offset, 8:8:8:8 bit A:R:G:B data
                     %1 = word offset, 1:5:5:5 bit A:R:G:B data (gets expanded to 8:8:8:8)
    
                %H = horizontal mirroring
    
                     %0 = OFF, image repeats when U'[15] set
                     %1 = ON,  image mirrors when U'[15] set
    
                %V = vertical mirroring
    
                     %0 = OFF, image repeats when V'[15] set
                     %1 = ON,  image mirrors when V'[15] set
    
         %AAAAAAAA = base address in stack RAM of A:R:G:B pixel data
    
        %RRRRRRRRR = base address in register RAM of texture pixels
    
    
    Aside from SETPIX, which configures PIX's base metrics, there are seven other instructions
    which establish initial values and deltas for the (U,V) texture coordinates, Z perspective,
    and A/R/G/B scalers. These instructions are likely to be used before every sequence of GETPIX
    instructions. They each set the value of their respective 16-bit parameter to the low word of
    their operand, while the high word sets the 16-bit delta which gets added to the parameter
    upon every GETPIX instruction:
    
        SETPIXU D/#n    - Set U to low word and DU to high word
        SETPIXV D/#n    - Set V to low word and DV to high word
        SETPIXZ D/#n    - Set Z to low word and DZ to high word
        SETPIXA D/#n    - Set A to low word and DA to high word
        SETPIXR D/#n    - Set R to low word and DR to high word
        SETPIXG D/#n    - Set G to low word and DG to high word
        SETPIXB D/#n    - Set B to low word and DB to high word
    
    
    Once PIX is configured and initial parameters are set, the GETPIX instruction may be used to
    look up the current texture pixel, scale its A/R/G/B components, blend it with a pixel in D,
    and update the U/V/Z/A/R/G/B parameters with their deltas. GETPIX takes 3 clocks and also
    needs 3 clocks in pipeline stages 2 and 3:
    
            NOP     #2              'ready pipeline, GETPIX needs 3 clocks in pipeline stage 2
            NOP     #2              'ready pipeline, GETPIX needs 3 clocks in pipeline stage 3
            GETPIX  pixel           'execute GETPIX, GETPIX takes 3 clocks in pipeline stage 4
    
    
    To make GETPIX more efficient, it can be repeated using REPD to perform a sequence of pixel
    operations:
    
            REPD    #64,#1          'render 64 texture pixels and blend them with 'pixels'
            SETINDA #pixels         'point INDA to pixels
            NOP     #2              'ready pipeline, 3 clocks in initial pipeline stage 2
            NOP     #2              'ready pipeline, 3 clocks in initial pipeline stage 3
            GETPIX  INDA++          'execute GETPIX, 3 clocks per repeating GETPIX
    
    
    As GETPIX executes, the following sequence occurs over three pipeline stages:
    
    
        In pipeline stage 2:
    
            Z-perspective correction
            ------------------------
            Z' = 256 - Z[15:8]
            U' = (U[15:0] / Z') MOD 256
            V' = (V[15:0] / Z') MOD 256
    
            A texture pixel is read from register RAM at texture map location (U',V'), with
            the U' and V' top-most bits being used as coordinates. For example, if the texture
            size is 32x8, then the top 5 bits of U' and the top 3 bits of V' would be used to
            locate the texture pixel.
    
            parameter updating
            ------------------
            Z = Z + DZ
            U = U + DU
            V = V + DV
    
    
        In pipeline stage 3:
    
            The texture pixel is used as an offset to look up A:R:G:B pixel data in stack RAM,
            which gets assigned to TA:TR:TG:TB.
    
    
        In pipeline stage 4:
    
            pixel scaling
            -------------
            A' = (TA * A[15:8]  +  255) / 256
            R' = (TR * R[15:8]  +  255) / 256
            G' = (TG * G[15:8]  +  255) / 256
            B' = (TB * B[15:8]  +  255) / 256
    
            pixel blending
            --------------
            D[31..24] = 0
            D[23..16] = (A' * R'  +  (255 - A') * D[23..16]  +  255) / 256
            D[15..8]  = (A' * G'  +  (255 - A') * D[15..8]   +  255) / 256
            D[7..0]   = (A' * B'  +  (255 - A') * D[7..0]    +  255) / 256
    
            C = A' <> 0     (for GETPIX D/#n WC, C = texture pixel opacity <> 0)
    
            parameter updating
            ------------------
            A = A + DA
            R = R + DR
            G = G + DG
            B = B + DB
    
    
    Note that if Z[15:8] = 0, no scaling occurs, or (U',V') = (U[15:8],V[15:8]). The bigger
    Z[15:8] gets, the more compressed the texture rendering becomes, until when Z[15:8] = 255,
    (U',V') = (U[7:0],V[7:0]).
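As a reading aid, the GETPIX sequence documented above can be modeled in Python. This is a sketch under assumptions, not a description of the hardware: `getpix` and `lookup` are hypothetical names, the register RAM and stack RAM fetches are collapsed into a single `lookup(u', v')` callback that returns (TA, TR, TG, TB), texture size and mirroring are ignored, all divisions are taken as integer divisions, and the 16-bit parameters are assumed to wrap on overflow.

```python
MASK16 = 0xFFFF


def getpix(p, lookup, d):
    """Model one GETPIX step.

    p      -- dict with 16-bit U, V, Z, A, R, G, B and deltas DU, DV, ... DB
    lookup -- hypothetical callback: (u', v') -> (TA, TR, TG, TB), each 0..255
    d      -- destination pixel in $00_RR_GG_BB form
    Returns (new_d, c) where c is True when the scaled alpha A' is nonzero.
    """
    # Pipeline stage 2: Z-perspective correction, then U/V/Z updating.
    zp = 256 - (p["Z"] >> 8)
    up = (p["U"] // zp) % 256
    vp = (p["V"] // zp) % 256
    for n in "ZUV":
        p[n] = (p[n] + p["D" + n]) & MASK16

    # Pipeline stage 3: texture pixel -> A:R:G:B data (abstracted away here).
    ta, tr, tg, tb = lookup(up, vp)

    # Pipeline stage 4: scaling, blending with D, then A/R/G/B updating.
    def scale(t, n):
        return (t * (p[n] >> 8) + 255) // 256

    a, r, g, b = scale(ta, "A"), scale(tr, "R"), scale(tg, "G"), scale(tb, "B")

    def blend(c, old):
        return (a * c + (255 - a) * old + 255) // 256

    new_d = (blend(r, (d >> 16) & 255) << 16
             | blend(g, (d >> 8) & 255) << 8
             | blend(b, d & 255))
    for n in "ARGB":
        p[n] = (p[n] + p["D" + n]) & MASK16
    return new_d, a != 0
```

With Z[15:8] = 0 the model gives Z' = 256, so (U',V') is just the high bytes of U and V; with Z[15:8] = 255 it gives Z' = 1 and (U',V') becomes the low bytes, matching the note at the end of the docs.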
    

    I realized today that the next thing I must do is make a driver for the SDRAM chip that is on the DE0 and DE2 boards, as well as the Prop2 module we are building at Parallax. We need to get high-resolution bit-mapped displays going to graphically demonstrate a lot of the Prop2's features.
  • John A. Zoidberg Posts: 514
    edited 2013-03-22 07:24
    Could this be comparable to a uh... normal ISA-level video card which processes a moderate amount of polygons? Or an S3 Trio kind of performance? (The S3 Trio was a classic, commonly used graphics card back in the late 90s.)
  • Bob Lawrence (VE1RLL) Posts: 1,720
    edited 2013-03-23 19:33
    Just having fun :cool:
    Prop2_Stripes.jpg
    1024 x 768 - 111K
  • Baggers Posts: 3,019
    edited 2013-03-26 18:04
    Question for Chip,

    Is the output from GetPix always 32bit?
    ie, can it output 16bit? so it can then be fed into a bitmap area? or would it have to be converted to 16bit, if so, is there an instruction to do this quick? or is it a case of shifts, ands and ors per word?
  • cgracey Posts: 14,151
    edited 2013-03-27 15:04
    Baggers wrote: »
    Question for Chip,

    Is the output from GetPix always 32bit?
    ie, can it output 16bit? so it can then be fed into a bitmap area? or would it have to be converted to 16bit, if so, is there an instruction to do this quick? or is it a case of shifts, ands and ors per word?

    GETPIX always outputs $00_RR_GG_BB data, or 8:8:8 RGB. It's 24-bit, anyway, with 8 leading 0 bits.

    I never thought to make it less than 8:8:8 because 5 bits per color produces obvious gradients. At 7 bits you can hardly see gradients and at 8 they disappear. So, I left it 8:8:8, only.
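For the "shifts, ands and ors" route Baggers mentions, a conversion of the $00_RR_GG_BB result down to 16 bits could look roughly like this Python sketch. The 5:6:5 target layout and the name `rgb888_to_rgb565` are assumptions for illustration; the thread does not say which 16-bit format the bitmap would use, and the Prop2 instruction sequence would differ.

```python
def rgb888_to_rgb565(pixel):
    """Reduce a $00_RR_GG_BB pixel to a 16-bit 5:6:5 value (assumed layout)."""
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    # Keep the top 5/6/5 bits of each component and pack them together.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
```

Dropping the low bits this way is the simplest choice; it keeps the gradient concern raised above, since only 5 or 6 bits per color survive.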
  • Tubular Posts: 4,702
    edited 2013-03-27 16:30
    Baggers wrote: »
    Question for Chip,

    Is the output from GetPix always 32bit?
    ie, can it output 16bit? so it can then be fed into a bitmap area? or would it have to be converted to 16bit, if so, is there an instruction to do this quick? or is it a case of shifts, ands and ors per word?

    You might not need it for this, but have a look at the MovF bitfield mover, it's nice and quick and has some auto increment features.