New XMM hardware - Page 3 — Parallax Forums

New XMM hardware


Comments

  • David Betz Posts: 14,516
    edited 2012-05-01 06:59
    Dr_Acula wrote: »
    I have my suspicions it is fsrw not liking my SD cards, but I don't want it to be that one because it leads down the path of porting Kye's SD driver into propGCC.
    This won't be difficult. I may try it tonight. The main problem is modifying Kye's driver to support various CS mechanisms. I guess there is already a version that supports the C3 so maybe this won't be a big deal. However, this means that Kye's code will be used for loading stuff from the SD card. It won't be used for filesystem access once the C program is running. That is handled by the PropGCC library and dosfs.c.
  • Dr_Acula Posts: 5,484
    edited 2012-05-01 07:19
    All working!!

    I think I had so many modified files that I ended up not even using the files I thought I was using. So - all my mistakes, no one else's.

    So - complete reinstall of GCC.

    Brand new config file.

    Go back to jazzed's original test post #48.

    And it works absolutely perfectly - my little BCX basic program that concatenates strings:
    Loading cache driver
    Initializing SD card
    Mounting SD filesystem
    Opening AUTORUN.PEX
    Loading kernel
    Loading cluster map
    Initializing cache
    Starting program
    Enter a string: test
    Enter another string: test2
    testtest2
    

    Hey thanks guys. I would not have got this working without all your help.

    Out of this comes a lesson for me - keep track of all the version changes I make along the way!

    And I have just one tiny bug to report. In the terminal console, once the output scrolls off the bottom of the page, the very last line is not displayed and you have to keep using the scrollbar to see it.

    Thanks again everyone!

    And now... onwards to building a touchscreen GUI in GCC!
  • David Betz Posts: 14,516
    edited 2012-05-01 07:29
    Dr_Acula wrote: »
    All working!!

    Congratulations! I'm glad you got it working! Have you decided to use the SD cache driver instead of creating a new cache driver for your board?
  • Dr_Acula Posts: 5,484
    edited 2012-05-01 07:36
    Oh yes, I'll be using the SD cache driver. I really can't see the need at the moment for putting a program in the external ram. Piles of code to write, no real benefit, and it uses up ram space better used by storing icons and fonts.

    And... in a general sense, I think that the SD cache driver is a solution that works for so many other boards. Indeed, I might go so far as saying that it may render pretty much all the external memory XMM program solutions obsolete.

    All you have to do is tell the program which pins you are using for your SD card.

    I'm not entirely sure the benefits of that have been appreciated yet by the wider prop community. I think one can start to say things like the extra memory in the prop 2 is not going to be needed for storing programs. Which means it is fully free to be used as a screen buffer, while at the same time allowing programs gigabytes in size.

    Your thoughts?
  • David Betz Posts: 14,516
    edited 2012-05-01 07:43
    Dr_Acula wrote: »
    Oh yes, I'll be using the SD cache driver. I really can't see the need at the moment for putting a program in the external ram. Piles of code to write, no real benefit, and it uses up ram space better used by storing icons and fonts.

    Yes, it is very handy to be able to run large programs with just an SD card interface. I'll be interested to see whether you find the performance good enough, though. As I think I mentioned in an earlier post, the 512-byte cache lines used in the SD cache driver are really too big for an 8K cache. Cache performance depends on the specific application, though, so it may be that this works well for you. Let us know how it works out.
  • jazzed Posts: 11,803
    edited 2012-05-01 07:51
    Dr_Acula wrote: »
    All working!!


    Great! Thanks for the extra push.
    Dr_Acula wrote: »
    And now I am confused. Because I did that with the PropBOE file I modified and renamed Touch161. And I did it with the dracblade file.

    Now in the c:\propgcc\propeller_load directory there is only one file that starts with "T" and that is TOUCH161.CFG However in the dropdown menu at the top of the SIDE there are now three entries - TOUCH161, TOUCH161 : PROPBOE, and TOUCH161-SDXMMC

    So somewhere along the line the IDE seems to be creating files. Which one to use, and what exactly are these files? Can you view them or edit them?

    The IDE looks at your config file and interprets it as follows:

    If your config file contains a [name] that does not match the config file, the IDE assumes an inheritance relationship which is a future feature. That's why you get a TOUCH161:PROPBOE board type. You should either remove the line with [name] or change it to match the file name.

    If your config file contains # IDE:SDXMMC, the IDE will create a TOUCH161-SDXMMC board type. Specifying this board type tells the IDE that you want to build a ".PEX" file and load the program to the SD card, where it will be run in XMMC mode. You can either burn the loader program to EEPROM (F11) or just send it to HUB RAM (F10).

    If your config file contains # IDE:SDLOAD, the IDE will create a TOUCH161-SDLOAD board type. Specifying this board type tells the IDE that you want to build a ".PEX" file and load the program to the SD card. At boot-up from HUB RAM (F10) or EEPROM (F11), the loader program will read the AUTORUN.PEX file from the SD card, load it into SRAM (according to the cache driver you've specified), and run the program there. For SRAM-only programs, use the XMM-SINGLE model; for FLASH+SRAM programs like on the C3, use XMM-SPLIT.

    Hope this helps.

    Thanks,
    --Steve
  • jazzed Posts: 11,803
    edited 2012-05-01 09:05
    Dr_Acula wrote: »
    Oh yes, I'll be using the SD cache driver. I really can't see the need at the moment for putting a program in the external ram. Piles of code to write, no real benefit, and it uses up ram space better used by storing icons and fonts.

    The main drawbacks to SDXMMC are: 1) any global program data is generally limited to HUB RAM, and 2) SDXMMC performance.

    SDXMMC does not obsolete other XMM designs. SDXMMC performance is somewhat impaired relative to other solutions because up to 512 bytes must be read at a time to fill a cache line, as David mentioned. Some optimization can be had with the SDXMMC driver so that it uses smaller and more numerous cache lines, but it may not make enough difference for the solution to be faster than a comparable single-bit SPI flash solution. Allowing smaller/more cache lines would mean a different driver version which would use a counter clock to "seek" the flash block at highest speed to the right position for collecting the smaller cache line. As it stands, SDXMMC performance is probably OK for most "business logic" programs with the help of some HUBTEXT functions, but there are better ways to do it if folks have the pins and patience. Yes, SDXMMC is very useful, but it is not an end-all solution for every need.

    Does that help?

    Thanks,
    --Steve
  • David Betz Posts: 14,516
    edited 2012-05-01 09:12
    jazzed wrote: »
    The main drawbacks to SDXMMC are: 1) any global program data is generally limited to HUB RAM
    This is correct but it also applies to any XMMC solution including a single bit SPI flash. The only way around this is to use an XMM driver that puts data in external memory like the one for the C3 but unfortunately performance is further degraded using external data. Also, even with an XMM driver, your stack must still fit in hub memory. We don't currently have a memory model that places the stack in external memory.
  • David Betz Posts: 14,516
    edited 2012-05-01 09:58
    Dr_Acula wrote: »
    And... in a general sense, I think that the SD cache driver is a solution that works for so many other boards. Indeed, I might go so far as saying that it may render pretty much all the external memory XMM program solutions obsolete.

    All you have to do is tell the program which pins you are using for your SD card.
    I have another comment about this. You mention that one of the appealing things about the SD cache driver is the fact that you can use it by just setting a few pin numbers. We also have a SPI flash cache driver that has a similar ability. You just set a couple of cache driver parameters to indicate the MISO, MOSI, and CLK pins and then set a few more parameters to indicate how CS is handled for your SPI flash chip. It should work with any SPI flash chip that is compatible with the Atmel AT26DF081A chip. I think that includes chips from Winbond and SST.
  • Dr_Acula Posts: 5,484
    edited 2012-05-01 16:10
    Thanks for the explanation re the config files. That all makes sense now.
    The main drawbacks to SDXMMC are: 1) any global program data is generally limited to HUB RAM, and 2) SDXMMC performance.

    Re XMM on the SD card, I'm starting to write code that loads and reloads cogs many times a second and I think that opens up the possibility of moving more speed critical code into assembly.

    The other thing that is available on the touchblade is one megabyte of parallel access ram. It is reading one word at a time and so has to be at least 16x faster than any single pin serial solution. It is currently being used for pictures and fonts but there is no reason it can't be used for global program memory data too. In spin for instance, I have reserved 64k of that ram for a text buffer and I think that can become the core of a text editor program.

    GCC opens up a whole lot of possibilities here.

    Re David's post #70 comment, I think that can tie in nicely with jazzed's hardware concept of having groups of propeller pins with a formal handover between groups where all pins go HiZ (with weak pullups) at the transition. With a 74HC137 chip this gives you hundreds of propeller pins. All you have to then do is make sure the software formally relinquishes pins when it is finished. So if you have a flash cache, fine, just grab 4 spare pins (there are always spare pins when you have hundreds available), take the group pin output from the 137, run it through a logic OR gate hc32 with the relevant propeller pin and use that to control the /CS pin on your spi flash cache. From the code perspective, you have to remember to "close" the SPI device when you have finished with it. But that is no different to remembering to close files, and is just one line of code to add eg spi_flash_close();

    So you could have many places you can store data. SD card. SPI flash. Serial and parallel ram.

    The drivers for these sorts of things don't need to be "internal" to GCC. They can be released as C code you can paste into your program, or turned into .h files once they are stable. The parallel ram driver for instance for the touch161 is not going to need any GCC modifications. In fact, all the touch161 config file is going to be is the propboe file but with different pins for the SD card.

    The other thing I just thought of is that with the availability of many pins, you can create many SPI ports. I'm thinking ADC, DAC, SPI to Parallel, all the real world interface things. If each one of those came with a little C driver (some of which already have been done) it makes coding in C much easier.

    Many many exciting possibilities here!
  • pedward Posts: 1,642
    edited 2012-05-01 16:37
    Is there a good benchmarking demo that would clearly demonstrate the speed of the different XMM options?
  • pedward Posts: 1,642
    edited 2012-05-01 16:44
    I'm thinking of designing a "flip" board for the PropKey. One side of the board will have the pattern for 1 or 2 qSPI flash chips, for ~16MB of flash, the other side will have SPI RAM chips to offer something like 256K of 8bit SPI memory. You decide which side to populate; which side is "up" determines the pinout and which headers are soldered.

    It is conceivable that you could populate both sides and "flip" the board to do different things -- or you can put headers on both sides and have Flash on one set of pins and RAM on another for XMM-SPLIT.
  • jazzed Posts: 11,803
    edited 2012-05-01 22:31
    pedward wrote: »
    Is there a good benchmarking demo that would clearly demonstrate the speed of the different XMM options?

    Here are some FIBO results run with SimpleIDE. With FCACHE turned off, you can get some estimate of relative performance for small loops. Of course this is somewhat deceiving. The Dhrystone test shows a different picture. Real world programs can behave very differently from both.




    Memory Model  Board Type    FCACHE    Fibo(24) Time ms  Fibo(0..24) Accumulated Clock Ticks
    ------------  ------------  --------  ----------------  -----------------------------------
    COG           C3F (HUB)     N/A       185               14863316
    LMM           C3F (HUB)     Enabled   354               28351824
    LMM           C3F (HUB)     Disabled  660               52817632
    XMMC          C3F           Disabled  2595              207669984
    XMMC          C3            Disabled  2595              207669808
    XMM-SINGLE    C3            Disabled  2925              234078432
    XMM-SPLIT     C3            Disabled  2925              23407843

    Memory Model  Board Type    FCACHE    Fibo(24) Time ms  Fibo(0..24) Accumulated Clock Ticks
    ------------  ------------  --------  ----------------  -----------------------------------
    COG           SDRAM (HUB)   N/A       185               14863316
    LMM           SDRAM (HUB)   Enabled   354               28351824
    LMM           SDRAM (HUB)   Disabled  660               52817632
    XMMC          SDRAM-SDXMMC  Disabled  2265              181260848
    XMMC          SDRAM         Disabled  2889              231179456
    XMM-SINGLE    SDRAM         Disabled  2888              231090320
    XMM-SPLIT     SDRAM         N/A       N/A               N/A






    Dhrystone is a good benchmark. Unfortunately, the SimpleIDE is not designed to produce 2 objects from a single C source file, so I won't post an example of that.

    We can run this dry.c benchmark using Propeller-GCC and make.

    All Dhrystone results gathered from 80MHz C3 and C3F board types.
    Compiled with: propeller-elf-gcc -O2 -mfcache -Dprintf=__simple_printf -DMSC_CLOCK -DINTEGER_ONLY -DFIXED_NUMBER_OF_PASSES=2000 ... pass 1 and pass 2 .o files linked to dry.elf.

    Here are some summary Dhrystones per second results for comparison:



    Memory Model  Board Type    Dhrystones/Second
    ------------  ------------  -----------------
    LMM           C3 (HUB)      6983
    XMMC          C3F           1422
    XMMC          C3            264
    XMMC          C3-SDXMMC     176
    XMM-SINGLE    C3            138
    XMM-SPLIT     C3            104

    Memory Model  Board Type    Dhrystones/Second
    ------------  ------------  -----------------
    LMM           SSF (HUB)     6983
    XMMC          SSF           1256

    Memory Model  Board Type    Dhrystones/Second
    ------------  ------------  -----------------
    LMM           SDRAM (HUB)   6983
    XMMC          SDRAM         1207
    XMM-SINGLE    SDRAM         640
    XMMC          SDRAM-SDXMMC  180
    XMM-SPLIT     SDRAM         N/A




    Here is another performance table that shows some very different results and makes broader comparisons. It compares performance of different solutions using a fairly small transpose program that doesn't reload cache much and a larger program that uses sprintf which results in more cache line reloads. The smaller the results the better.





                Spin    LMM  FLASH XMMC  EEPROM XMMC  SSF XMMC  SSF2 XMMC  SDRAM XMMC  SDCARD XMMC
    ----------  ------  ---  ----------  -----------  --------  ---------  ----------  -----------
    Transpose   64,400  887  902         904          902       902        902         903
    sprintf     1,351   531  27,386      10,046       3,515     3,415      3,886       34,262
    CacheLines  N/A     N/A  32          128          64        128        64          16
    LineSize    N/A     N/A  128         64           128       64         32          512
    CacheSize   N/A     N/A  4KB         8KB          8KB       8KB        2KB         8KB



    Note that, among the XMMC options, SDCARD XMMC has the worst result for sprintf (34,262), a test that causes many cache misses, and the SSF boards have the best (3,515 for SSF and 3,415 for SSF2).
    This table is consistent with what can be seen in applications like David Betz' EBASIC program.
  • jazzed Posts: 11,803
    edited 2012-05-01 22:36
    Dr_Acula wrote: »
    ...

    The other thing I just thought of is that with the availability of many pins, you can create many SPI ports. I'm thinking ADC, DAC, SPI to Parallel, all the real world interface things. If each one of those came with a little C driver (some of which already have been done) it makes coding in C much easier.

    Many many exciting possibilities here!

    Interesting reading. I'll help however I can. Seems like you've solved the pins problem ... hope that's working out A-OK.
  • David Betz Posts: 14,516
    edited 2012-05-02 03:28
    Dr_Acula wrote: »

    Re David's post #70 comment, I think that can tie in nicely with jazzed's hardware concept of having groups of propeller pins with a formal handover between groups where all pins go HiZ (with weak pullups) at the transition. With a 74HC137 chip this gives you hundreds of propeller pins. All you have to then do is make sure the software formally relinquishes pins when it is finished. So if you have a flash cache, fine, just grab 4 spare pins (there are always spare pins when you have hundreds available), take the group pin output from the 137, run it through a logic OR gate hc32 with the relevant propeller pin and use that to control the /CS pin on your spi flash cache. From the code perspective, you have to remember to "close" the SPI device when you have finished with it. But that is no different to remembering to close files, and is just one line of code to add eg spi_flash_close();

    How do you get hundreds of pins using a 3-8 demultiplexer?
  • Dr_Acula Posts: 5,484
    edited 2012-05-02 04:56
    Oh ok, not that many. 8 groups of 21 pins = 168, as some pins are always connected, eg 4 for SD, 2 for eeprom, 2 for download, 2 for audio, and 1 to control the 137.

    Of course, you can't control all the pins at the same time. But you can select a group and latch out to a latch, or talk to an SPI port, and then use those same pins in another group for different things. I'm talking to external ram, and then in a different group, using the same pins to talk to the touch part of the touch screen via an SPI port.

    The rule is that each group of 21 pins must be isolated from other groups. So if pins are outputs, use an OR gate with that group number. And if an input, isolate with a 244 or a 4016 etc. Sometimes you don't need any logic chips though - eg if an SPI device has the /CS pin connected to that group control (ie the output of the 137) then when the group is deselected (high) all the pins on that SPI device go HiZ.
  • David Betz Posts: 14,516
    edited 2012-05-02 06:40
    Dr_Acula wrote: »
    Oh ok, not that many. 8 groups of 21 pins = 168, as some pins are always connected, eg 4 for SD, 2 for eeprom, 2 for download, 2 for audio, and 1 to control the 137.

    Of course, you can't control all the pins at the same time. But you can select a group and latch out to a latch, or talk to an SPI port, and then use those same pins in another group for different things. I'm talking to external ram, and then in a different group, using the same pins to talk to the touch part of the touch screen via an SPI port.

    The rule is that each group of 21 pins must be isolated from other groups. So if pins are outputs, use an OR gate with that group number. And if an input, isolate with a 244 or a 4016 etc. Sometimes you don't need any logic chips though - eg if an SPI device has the /CS pin connected to that group control (ie the output of the 137) then when the group is deselected (high) all the pins on that SPI device go HiZ.
    I'm a software guy so bear with me. How are the 137s wired to achieve a 21-pin group?
  • Dr_Acula Posts: 5,484
    edited 2012-05-02 07:30
    The 137 is a 3-to-8 decoder with a latch. Exactly one of its 8 outputs (Y0-Y7) is low at any one time.

    If Y0 is low then that designates group0. Anything that talks to the 21 propeller pins in group0 is only allowed to do so if Y0 is low. So you need some extra logic in some circumstances. Outputs can only go out if Y0 is low and the relevant prop pin is low. Any inputs coming in are only allowed to get through if Y0 is low.

    Initially this seems like you are going to need a lot of extra chips. But many times, whole groups of pins can be done without any extra logic. An example is the ram chip. These are group1, ie Y1 output from the 137, and that goes to the /CS line on the ram chips. So the ram chips will ignore any inputs to any other pins when Y1 is high, and also all memory pins are in HiZ. So there is no contention with other groups. Ditto the ILI9325 display - the whole display can be disabled with just one /CS pin.
  • Dr_Acula Posts: 5,484
    edited 2012-05-11 00:38
    I think averagejoe and I have finalised on a board design where the SD card is not going to change pins any more. It is pins 24-27 in the order below and there are no group selects or anything complicated on those pins. Just 4 prop pins direct to the SD card.

    I'm using a file called TOUCH161.CFG which is this
    # [propboe]
    # IDE:SDXMMC
        clkfreq: 80000000
        clkmode: XTAL1+PLL16X
        baudrate: 115200
        rxpin: 31
        txpin: 30
        cache-driver: eeprom_cache.dat
        cache-size: 8K
        cache-param1: 0
        cache-param2: 0
        eeprom-first: TRUE
        sd-driver: sd_driver.dat
        sdspi-do: 24
        sdspi-clk: 25
        sdspi-di: 26
        sdspi-cs: 27
    

    And so begins porting over the touchscreen code.

    The first thing that is *very* nice compared to C89: binary literals with the 0b prefix (a GCC extension) work, so there is no need to translate everything into hex.
      int binarynumber;
      binarynumber = 0b11111110; // binary value
      printf("%X\n",binarynumber); // print hex value in uppercase
    

    which prints FE as expected.
  • average joe Posts: 795
    edited 2012-05-13 23:44
    Wow, you guys have been working on this for a while! Why am I always so late to the party???

    So now that I have a couple of boards somewhat working, I've been REALLY wanting to get C running. Looks like everything I need to get started is here; I just need to run through things. I am wondering though, as a NOOB to C, if DOC could post some stuff that is already working on the new board? Also, any hints about getting up to speed as quickly as possible? I've programmed in several different languages so C shouldn't be a huge problem. The board design is quite flexible and I can't wait to start playing around.

    A very noob question, how hard would it be to port C code from a different micro-controller *pic18f* and different hardware? I don't expect it to be a carbon copy, just looking for an easy way to get open-source code chunks to handle complex functions, *such as an Arpeggiator or Real-time Pattern Sequencer, etc?*

    *edited*
    ER, well I did some research and I'm STILL looking for the FULL source. But in theory it should be possible! It seems the toolchain of the project I want to port turns out to be GCC! There are hints about compiling for PC, so it should work!
  • jazzed Posts: 11,803
    edited 2012-05-14 09:36
    A very noob question, how hard would it be to port C code from a different micro-controller *pic18f* and different hardware? I don't expect it to be a carbon copy, just looking for an easy way to get open-source code chunks to handle complex functions, *such as an Arpeggiator or Real-time Pattern Sequencer, etc?*

    Propeller GCC supports strict ANSI C (C89) and the GNU variant of C99.

    Whatever you find for PIC should work assuming:
    1. type widths are ported - type "int" will have different width
    2. any hardware specific device interfaces are ported


    For XMM a cache_interface.spin compatible PASM driver will be needed.
  • average joe Posts: 795
    edited 2012-05-21 11:21
    Now that the SD card filesystem is working in XMMC and LMM, it appears we can continue work on porting the hardware over. The most obvious issue to me is the dataBus width change necessary. Since we are using 2 SRAM chips in parallel, we can get *and put* 2 bytes at a time to SRAM. Also, we will need to change the addressing methods to use the counters. Not sure about where the code to control the display should go since it's included in the PASM RamDriver. I believe I have condensed down the "core" object in SPIN and PASM to
    For ILI:
    Program:   3,897 Longs
    Variable:    290 Longs
    For SSD:                   'with glitch fix extension 
    Program:   3,909 Longs
    Variable:    290 Longs
    
    These are old portions, I need to look at the UPDATED SRAM filesystem.
    I read the cache files here : http://forums.parallax.com/showthread.php?140010-Files&p=1099520&viewfull=1#post1099520 I'm lost as to what to do next. I want to help out, but the smart guys need to point me in the right direction, so I can idiot savant my way through. LOL!
  • David Betz Posts: 14,516
    edited 2012-05-21 12:57
    Can you describe your hardware so we know what kind of interface you're using? To make your own cache driver, you just need to fill in the attached cache driver skeleton. Mostly, you need to write functions to read and write one cache line.
  • average joe Posts: 795
    edited 2012-05-21 13:06
    The schematic is posted here: http://forums.parallax.com/showthread.php?137266-Propeller-GUI-touchscreen-and-full-color-display&p=1095915&viewfull=1#post1095915

    It's basically 2 SRAM chips in parallel: 16-bit data bus, 19-bit address bus. Similar to the dracblade except that there are 2 chips and they are addressed by 161's. Doc posted quite a bit of the code previously, but here's what we've been using for the ramdriver:
    CON
    '' Modified code from Cluso's triblade
    '' commands to move blocks of data to the ILI9325 touchscreen display
    ' DoCmd(command_, hub_address, ram_address, block_length)
    ' I - initialise     
    ' S - Move data from hub to ram
    ' T - Move data from ram to hub
    ' U - Move data from ram to display
    ' V - Hub to display
    ' W - not working - writecom in pasm
    ' E - convert from .raw RGB to two byte ILI format RRRRRGGG_GGG_BBBBB
    ' F - convert from .bmp BGR format to two byte ILI format
    ' X - merge icon and background based on a mask
    ' Y - Change 137 output Returns P0-P20 and P22 in HiZ. Pass hubaddrs
    ' Z - Set 161 pins. Returns in group 1
    
    VAR
    
    ' communication params(5) between cog driver code - only "command" and "errx" are modified by the driver
       long  command, hubaddrs, ramaddrs, blocklen, errx, cog ' rendezvous between spin and assembly (can be used cog to cog)
    '        command  = A to Z etc =0 when operation completed by cog
    '        hubaddrs = hub address for data buffer
    '        ramaddrs = ram address for data
    '        blocklen = ram buffer length for data transfer
    '        errx     = returns =0 (false=good), else <>0 (true & error code)
    '        cog      = cog no of driver (set by spin start routine)
       
    
    DAT
    '' +-----------------------------------------------------------------------------------------------+
    '' | Touchblade 161 Ram Driver (with grateful acknowledgements to Cluso and Average Joe)           |
    '' +-----------------------------------------------------------------------------------------------+
                            org     0
    tbp2_start    ' setup the pointers to the hub command interface (saves execution time later)
                                          '  +-- These instructions are overwritten as variables after start
    comptr                  mov     comptr, par     ' -|  hub pointer to command                
    hubptr                  mov     hubptr, par     '  |  hub pointer to hub address            
    ramptr                  add     hubptr, #4      '  |  hub pointer to ram address            
    lenptr                  mov     ramptr, par     '  |  hub pointer to length                 
    errptr                  add     ramptr, #8      '  |  hub pointer to error status           
    cmd                     mov     lenptr, par     '  |  command  I/R/W/G/P/Q                  
    hubaddr                 add     lenptr, #12     '  |  hub address                           
    ramaddr                 mov     errptr, par     '  |  ram address                           
    len                     add     errptr, #16     '  |  length                                
    err                     nop                     ' -+  error status returned (=0=false=good) 
    
    
    ' Initialise hardware tristates everything and read/write set the pins
    init                    mov     err, #0                  ' reset err=false=good
                            'mov     dira,zero                ' tristate the pins with the cog dira
                            and     dira,maskP0P20P22       ' tristates all the common pins
    
    done                    wrlong  err, errptr             ' status  =0=false=good, else error x
                            wrlong  zero, comptr            ' command =0 (done)
    ' wait for a command (pause short time to reduce power)
    pause
    '                        mov     ctr, delay      wz      ' if =0 no pause
    '              if_nz     add     ctr, cnt
    '              if_nz     waitcnt ctr, #0                 ' wait for a short time (reduces power)
                            rdlong  cmd, comptr     wz      ' command ?
                  if_z      jmp     #pause                  ' not yet
    ' decode command
                            cmp     cmd, #"S"       wz      ' hub to ram
                  if_z      jmp     #pasmhubtoram           
                            cmp     cmd, #"T"       wz      ' ram to hub
                  if_z      jmp     #pasmramtohub
                            cmp     cmd, #"U"       wz      ' ram to display
                  if_z      jmp     #pasmramtodisplay
                            cmp     cmd, #"V"       wz      ' hub to display
                  if_z      jmp     #pasmhubtodisplay           
                            cmp     cmd, #"E"       wz      ' convert 3 byte .raw format to 2 byte .ili format - hub to hub
                  if_z      jmp     #rawtoiliformat
                            cmp     cmd, #"F"       wz      ' convert 3 byte .bmp format BGR to 2 byte ili format (same as E but order reversed)
                  if_z      jmp     #bmptoiliformat              
     '                       cmp     cmd, #"W"       wz      ' lcdwritecom in pasm, not working
     '             if_z      jmp     #pasmlcdwritecom
                            cmp     cmd, #"X"       wz      ' merge icon and background based on a mask
                  if_z      jmp     #mergeicons
                            cmp     cmd, #"Y"       wz      ' change the 137 output
                  if_z      jmp     #changegroup
                            cmp     cmd, #"Z"       wz      ' set the 161 counters
                  if_z      jmp     #set161          
                            cmp     cmd, #"I"       wz      ' init
                  if_z      jmp     #init     
                            mov     err, cmd                ' error = cmd (unknown command)
                            jmp     #done
                            
    ' ----------------- common routines -------------------------------------
    
    get_values              rdlong  hubaddr, hubptr         ' get hub address
                            rdlong  ramaddr, ramptr         ' get ram address
                            rdlong  len, lenptr             ' get length
                            mov     err, #5                 ' err=5
    get_values_ret          ret
    
               ' Pass pasm_n = 0- 7 come to this with P0-P20 and P22 tristated and returns them as this too
    set137                  or      dira,maskP22            ' pin 22 is an output
                            andn    outa,maskP22            ' set P22 low so Y0-Y7 are all high
                            or      dira,maskP0P20          ' pins P0-P20 are outputs
                            and     outa,maskP0P2low        ' set these 3 pins low
                            or      outa,pasm_n             ' set the 137 pins
                            or      outa,maskP22            ' pin 22 high
    set137_ret              ret                             ' return                        
    
    
    load161pasm                                             ' uses ramaddr
                            or      outa,maskP0P20          ' set P0-P20 high     
                            or      dira,maskP0P20          ' output pins 0-20
                            mov     pasm_n,#0               ' group 0
                            call    #set137                 ' set the 137 output
                            and     outa,maskP0P18low       ' pins 0-18 set low
                            or      outa,ramaddr            ' output address to 161 chips
                            or      outa,maskP19            ' clock high
                            or      outa,maskP20            ' load high
                            andn    outa,maskP19            ' clock low
                            andn    outa,maskP20            ' load low
                            or      outa,maskP19            ' clock high
                            or      outa,maskP20            ' load high 
    load161pasm_ret         ret
    
    stop                   jmp     #stop                  ' for debugging
    
    memorytransfer          or      dira,maskP16P20         ' so /wr and other pins definitely high
                            or      outa,maskP16P20
                            mov     pasm_n,#1               ' back to group 1 for memory transfer
                            call    #set137                 ' as next routine will always be group 1
                            or      dira,maskP16P20         ' output pins 16-20
                            or      outa,maskP16P20         ' set P16-P20 high (P0-P15 set as inputs or outputs in the calling routine)
    memorytransfer_ret      ret
    
    busoutput               or      dira,maskP0P15          ' set prop pins 0-15 as outputs
    busoutput_ret           ret
    
    businput                and     dira,maskP16P31         ' set P0-P15 as inputs
    businput_ret            ret  
    
    delaynop                nop
                            nop
                            nop
                            nop
    delaynop_ret            ret
    
    
    ' ------------------ single letter commands  -------------------------------------
     
    ' command S
    pasmhubtoram            call    #get_values             ' get hubaddr,ramaddr,len
                            call    #load161pasm            ' load the 161 counters with ramaddr
                            call    #memorytransfer         ' set to group 1, enable P16-P20 as outputs and set P16-P20 high
                            call    #busoutput              ' set prop pins P0-P15 as outputs
    hubtoram_loop           and     outa,maskP16P31         '%11111111_11111111_00000000_00000000       ' clear for output                   
                            rdword  data_16,hubaddr         ' get the word from hub
                            and     data_16,maskP0P15       ' mask to a word only
                            or      outa,data_16            ' send out the word to P0-P15
                            andn    outa,maskP17            ' set mem write low
                            add     hubaddr,#2              ' increment by 2 bytes = 1 word. Put this here for small delay while writes
                            or      outa,maskP17            ' mem write high
                            andn    outa,maskP19            ' clock 161 low
                            or      outa,maskP19            ' clock 161 high
                            djnz    len,#hubtoram_loop      ' loop this many times
                            jmp     #init                   ' tristate pins and listen for commands
    
    ' command T
    pasmramtohub            call    #get_values             ' get hubaddr,ramaddr,len
                            call    #load161pasm            ' load the 161 counters with ramaddr
                            call    #memorytransfer         ' set to group 1, enable P16-P20 as outputs and set P16-P20 high                             
                            call    #businput               ' set prop pins P0-P15 as inputs
                            andn    outa,maskP16            ' memory /rd low
    ramtohub_loop           mov     data_16,ina             ' get the data
                            wrword  data_16,hubaddr         ' move data to hub
                            andn    outa,maskP19            ' clock 161 low
                            or      outa,maskP19            ' clock 161 high
                            add     hubaddr,#2              ' increment the hub address 
                            djnz    len,#ramtohub_loop
                            or      outa,maskP16            ' memory /rd high  
                            or      dira,maskP0P15          ' %00000000_00000000_11111111_11111111 restore P0-P15 as outputs
                            jmp     #init                   ' tristate pins and listen for commands
    
    ' command U
    pasmramtodisplay        call    #get_values             ' get hubaddr,ramaddr,len
                            call    #load161pasm            ' load the 161 counters with ramaddr
                            call    #memorytransfer         ' set to group 1, enable P16-P20 as outputs and set P16-P20 high                             
                            call    #businput               ' set prop pins 0-15 as inputs so doesn't interfere with the transfer
                            or      outa,maskP18            ' ILI_RS high
                            andn    outa,maskP16            ' memory /rd low  
    ramtodisplay_loop       andn    outa,maskP20            ' ILI write low
                            or      outa,maskP20            ' ILI write high
                            andn    outa,maskP19            ' clock 161 low
                            or      outa,maskP19            ' clock 161 high
                            djnz    len,#ramtodisplay_loop
                            or      outa,maskP16            ' memory /rd high  
                            or      dira,maskP0P15          ' %00000000_00000000_11111111_11111111 restore P0-P15 as outputs
                            jmp     #init
    
    ' command V
    pasmhubtodisplay        call    #get_values             ' get hubaddr,ramaddr,len
                            call    #memorytransfer         ' set to group 1, enable P16-P20 as outputs and set P16-P20 high                             
                            call    #busoutput              ' set P0-P15 as outputs
    hubtodisplay_loop       and     outa,maskP16P31         '%11111111_11111111_00000000_00000000       ' clear for output                   
                            rdword  data_16,hubaddr         ' get the word from hub
                            and     data_16,maskP0P15       ' mask to a word only
                            or      outa,data_16            ' send out the word to P0-P15
                            andn    outa,maskP20            ' ILI write low
                            or      outa,maskP20            ' ILI write high
                            add     hubaddr,#2              ' one word
                            djnz    len,#hubtodisplay_loop
                            jmp     #init
    
    'command E
    RawtoILIformat          ' takes a .raw 3 byte RRRRRRRR GGGGGGGG BBBBBBBB and converts to 2 byte RRRRRGGG GGGBBBBB
                            ' pass hubaddr, ramaddr and len
                            ' hubaddr is source location, len is number of pixels
                            ' ramaddr is destination in hub (messy naming) and length is 2/3 of blocklength
                            call    #get_values ' gets hubaddress, ramaddress and len (ignores ramaddress)
    rawloop
                            rdbyte red,hubaddr
                            add hubaddr,#1
                            rdbyte green,hubaddr
                            add hubaddr,#1
                            rdbyte blue,hubaddr
                            add hubaddr,#1
                            call #rgbtoili
                            wrbyte ililow,ramaddr
                            add ramaddr,#1
                            wrbyte ilihigh,ramaddr
                            add ramaddr,#1
                            djnz    len,#rawloop            ' loop until done 
                            jmp     #init                   ' set pins to tristate
    
    RGBtoILI                ' pass red,green, blue, returns ililow and ilihigh
                            shr     red,#3                  ' 000RRRRR 
                            shl     red,#3                  ' RRRRR000 
                            shr     green,#2                ' 00GGGGGG
                            mov     ilihigh,green           ' ilihigh = 00GGGGGG
                            shr     ilihigh,#3              ' ilihigh = 00000GGG
                            or      ilihigh,red             ' ilihigh = RRRRRGGG
                            and     green,#%00000111        ' 00000GGG
                            shl     green,#5                ' GGG00000
                            mov     ililow,green            ' ililow = GGG00000
                            shr     blue,#3                 ' blue = 000BBBBB
                            or      ililow,blue             ' ililow = GGGBBBBB
    RGBtoILI_ret            ret
    
    BMPtoILIformat          ' takes a .bmp 3 byte BBBBBBBB GGGGGGGG RRRRRRRR and converts to 2 byte RRRRRGGG GGGBBBBB
                            ' same as E above but BGR instead of RGB
                            ' pass hubaddr, ramaddr and len
                            ' hubaddr is source location, len is number of pixels
                            ' ramaddr is destination in hub (messy naming) and length is 2/3 of blocklength
                            call    #get_values ' gets hubaddress, ramaddress and len (ignores ramaddress)
    bmploop
                            rdbyte blue,hubaddr
                            add hubaddr,#1
                            rdbyte green,hubaddr
                            add hubaddr,#1
                            rdbyte red,hubaddr
                            add hubaddr,#1
                            call #rgbtoili
                            wrbyte ililow,ramaddr
                            add ramaddr,#1
                            wrbyte ilihigh,ramaddr
                            add ramaddr,#1
                            djnz    len,#bmploop            ' loop until done 
                            jmp     #init                   ' set pins to tristate
    ' **** command X *********************
    
    MergeIcons              call    #get_values ' gets hubaddress, ramaddress,len which are used here as background,icon,mask
                            mov     pasm_n,#59               ' do a single row
    mergeiconsloop          rdbyte  ililow,len                 ' reuse ililow: reads the low byte of the mask pixel (len is the mask pointer)
                            and     ililow,#%11111             ' keep the low 5 bits (blue field); the mask is a grayscale bitmap
                            rdword  red,hubaddr              ' reuse red, so actually this is rdword background,backgroundcounter
                            cmp     ililow,#%10000   wc       ' C set if mask blue < 16, i.e. below mid-level gray on the 5-bit scale
                  if_c      jmp     #mergeskip
                            rdword  green,ramaddr            ' reuse green, so this is rdword iconpixel, iconpixelcounter 
                            wrword  green,hubaddr            ' if replace, then move icon pixel to the background     
    mergeskip               add     hubaddr,#2
                            add     ramaddr,#2
                            add     len,#2
                            djnz    pasm_n,#mergeiconsloop            ' loop until done 
                            jmp     #init                   'set pins to tristate 
    
    ' *** command Y **********
    changegroup             call    #get_values             ' gets hubaddr (the 137 group number), ramaddr and len (both unused here)
                            mov     pasm_n,hubaddr          ' pass hubaddr
                            call    #set137                 ' change the group
                            jmp     #init
    
    ' *** command Z **********
    set161                  call    #get_values             ' gets ramaddr = the 161 value to set
                            call    #load161pasm            ' ramaddr to the 161 chips
                            call    #memorytransfer         ' change to group 1
                            jmp     #init                        
    
                            
    
    'pasmlcdwritecom         call    #get_values             ' use hubaddr as the data
    '                        or      dira,maskP0P20          ' set these pins high (pass all pins tristated)
    '                        or      outa,maskP0P20          '  set pins high
    '                        mov     pasm_n,#2               '  mem transfer
    '                        call    #set138                 ' set the 138
    '                        andn    outa,maskP18            ' P18 ILIRS low
    '                        and     outa,maskP16P31         ' set P0-P15 low
    '                        or      outa,hubaddr            ' send out the data
    '                        andn    outa,maskP17            ' ILI write low
    '                        or      outa,maskP17            ' ILI write high
    '                        jmp     #init                   ' set pins to tristate  
    
    ' variables
    pasm_n                  long    0                                    ' general purpose value
    data_16                 long    0                                    ' general purpose value
    ililow                  long    0                                    ' low data byte 
    ilihigh                 long    0                                    ' high data byte 
    red                     long    0                                    ' red, green blue variables
    green                   long    0
    blue                    long    0           
    
    ' constants
    Zero                    long    %00000000_00000000_00000000_00000000 ' used in several places
    maskP0P2low             long    %11111111_11111111_11111111_11111000 ' P0-P2 low
    maskP0P20               long    %00000000_00011111_11111111_11111111 ' P0-P18 enabled for output plus P19,P20    
    maskP0P18low            long    %11111111_11111000_00000000_00000000 ' P0-P18 low
    maskP16                 long    %00000000_00000001_00000000_00000000 ' pin 16
    maskP17                 long    %00000000_00000010_00000000_00000000 ' pin 17
    maskP18                 long    %00000000_00000100_00000000_00000000 ' pin 18
    maskP19                 long    %00000000_00001000_00000000_00000000 ' pin 19
    maskP20                 long    %00000000_00010000_00000000_00000000 ' pin 20
    maskP22                 long    %00000000_01000000_00000000_00000000 ' pin 22
    maskP16P31              long    %11111111_11111111_00000000_00000000 ' pin 16 to pin 31
    maskP0P15               long    %00000000_00000000_11111111_11111111 ' for masking words
    maskP16P20              long    %00000000_00011111_00000000_00000000
    maskP0P20P22            long    %11111111_10100000_00000000_00000000 ' for returning all group pins HiZ
                            fit     496   
    
    I'll start looking at the skeleton file and see if I can decipher it! I'm sorry I'm asking you guys to "hold my hand" as I work through this!
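
    For anyone checking pixel data on a PC before sending it to the board, the packing done by RGBtoILI in the driver above can be modeled on the host. This is only a sketch in Python, not part of the driver: the function names are made up for illustration, and the low-byte-first output order simply follows the wrbyte ililow / wrbyte ilihigh sequence in rawloop and bmploop.

```python
def rgb_to_ili(red, green, blue):
    """Pack 8-bit R, G, B into (ililow, ilihigh) the way RGBtoILI does."""
    ilihigh = (red & 0xF8) | (green >> 5)                  # RRRRRGGG
    ililow = (((green >> 2) & 0x07) << 5) | (blue >> 3)    # GGGBBBBB
    return ililow, ilihigh


def raw_to_ili(raw):
    """Model of command E: 3-byte R,G,B pixels in, 2-byte pixels out."""
    out = bytearray()
    usable = len(raw) - len(raw) % 3        # ignore a trailing partial pixel
    for i in range(0, usable, 3):
        low, high = rgb_to_ili(raw[i], raw[i + 1], raw[i + 2])
        out += bytes((low, high))           # low byte first, as in rawloop
    return bytes(out)


def bmp_to_ili(bmp):
    """Model of command F: same packing, but the source order is B,G,R."""
    out = bytearray()
    usable = len(bmp) - len(bmp) % 3
    for i in range(0, usable, 3):
        low, high = rgb_to_ili(bmp[i + 2], bmp[i + 1], bmp[i])
        out += bytes((low, high))
    return bytes(out)
```

    Pure red (255, 0, 0) comes out as low byte $00 and high byte $F8, i.e. the 16-bit value $F800, and the output is always 2/3 the length of the input.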
  • David BetzDavid Betz Posts: 14,516
    edited 2012-05-21 13:11
    average joe wrote: »
    The schematic is posted here: http://forums.parallax.com/showthread.php?137266-Propeller-GUI-touchscreen-and-full-color-display&p=1095915&viewfull=1#post1095915

    It's basically two SRAM chips in parallel: 16-bit data bus, 19-bit address bus. It's similar to the DracBlade, except that it uses two chips and is addressed by '161 counters. Doc posted quite a bit of the code previously, but here's what we've been using for the RAM driver:

    I'll start looking at the skeleton file and see if I can decipher it! I'm sorry I'm asking you guys to "hold my hand" as I work through this!

    You should be able to use your hubtoram and ramtohub functions to implement the cache line read/write and that should be about enough to get you going. Is this hardware available yet?
  • jazzedjazzed Posts: 11,803
    edited 2012-05-21 13:17
    Now that the SD card filesystem is working in XMMC and LMM, it appears we can continue work on porting the hardware over. ...

    Basically you need to copy dracblade_cache.spin to dractouch_cache.spin and replace lines 162 through 254 with code that interfaces with the hardware.

    The fundamental idea is to have BSTART set an address, and have BREAD/BWRITE do read/write based on the cache line length. This looks like a very simple change.
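
    The set-address-then-burst idea lines up with how the '161 counters behave in the driver posted earlier: the address is loaded once, and every clock pulse steps to the next 16-bit word. Below is a rough host-side model in Python, offered only as a sketch. The method names just mirror the labels mentioned above; the real signatures in dracblade_cache.spin are not shown in this thread. Halving the byte address is an assumption based on load161pasm and hubtoram_loop, where the hub pointer advances by 2 for each counter clock.

```python
class CounterSram:
    """Toy model of word-wide SRAM addressed by an auto-incrementing counter."""

    ADDR_BITS = 19                      # 19-bit address bus from the post
    ADDR_MASK = (1 << ADDR_BITS) - 1

    def __init__(self):
        self.words = [0] * (1 << self.ADDR_BITS)   # 512K words = 1 MiB
        self.counter = 0

    def _clock(self):
        # One pulse on the counter clock selects the next word.
        self.counter = (self.counter + 1) & self.ADDR_MASK

    def bstart(self, byte_addr):
        # The counters hold a word address, so drop the low bit.
        self.counter = (byte_addr >> 1) & self.ADDR_MASK

    def bwrite(self, line):
        # line: bytes of even length, little-endian words as rdword sees them
        for i in range(0, len(line), 2):
            self.words[self.counter] = line[i] | (line[i + 1] << 8)
            self._clock()

    def bread(self, nbytes):
        out = bytearray()
        for _ in range(nbytes // 2):
            word = self.words[self.counter]
            out += bytes((word & 0xFF, word >> 8))
            self._clock()
        return bytes(out)
```

    The point of the model is the bookkeeping: a cache line of N bytes at external byte address A becomes a burst of N/2 words starting at word address A/2, which is what the ramaddr and len parameters of the S and T commands would need to be.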
  • David BetzDavid Betz Posts: 14,516
    edited 2012-05-21 13:18
    jazzed wrote: »
    Basically you need to copy dracblade_cache.spin to dractouch_cache.spin and replace lines 162 through 254 with code that interfaces with the hardware.

    The fundamental idea is to have BSTART set an address, and have BREAD/BWRITE do read/write based on the cache line length. This looks like a very simple change.
    Yeah, but that's more or less like filling in the skeleton, with the added disadvantage that you may make mistakes excising the old code.
  • average joeaverage joe Posts: 795
    edited 2012-05-21 13:28
    I think Doc has the design pretty much nailed down. There is a 2-screen variant I'm eagerly waiting to get my hands on too! It should function similarly at the low level.
    I'm not sure what you mean by "available." Dr. Acula and I have boards built and running. If you're asking about "distribution" spec boards, you'd have to check with Doc. I know we are "close" to release, and I'm sure we could get boards to those interested in the future. I'm not sure exactly how soon.
    These use the 40-pin touchscreen boards, and I believe we will be supporting the SSD1289 controller. These are the displays I'm using. Doc is using an ILI9325 right now, but it sounds like he will be converting to the SSD soon. I'm not sure if you guys have any of these displays or similar floating around.
    I've been looking at the skeleton cache file and I think I understand it, *sort of*. I'm sure I will have questions later.
    Thanks again for all the help!
  • David BetzDavid Betz Posts: 14,516
    edited 2012-05-21 13:30
    average joe wrote: »
    I think Doc has the design pretty much nailed down. There is a 2-screen variant I'm eagerly waiting to get my hands on too! It should function similarly at the low level.
    I'm not sure what you mean by "available." Dr. Acula and I have boards built and running. If you're asking about "distribution" spec boards, you'd have to check with Doc. I know we are "close" to release, and I'm sure we could get boards to those interested in the future. I'm not sure exactly how soon.
    These use the 40-pin touchscreen boards, and I believe we will be supporting the SSD1289 controller. These are the displays I'm using. Doc is using an ILI9325 right now, but it sounds like he will be converting to the SSD soon. I'm not sure if you guys have any of these displays or similar floating around.
    I've been looking at the skeleton cache file and I think I understand it, *sort of*. I'm sure I will have questions later.
    Thanks again for all the help!
    You don't need to understand the part at the start. It just takes a request from the XMM kernel and figures out whether the cache line is already in hub memory. If not, it will call your code to read it from external memory. If it needs to reuse a cache line that contains modified data it will call your code to write that cache line to external memory before reusing it. That's about all there is to it.
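
    The flow described here (hit check, fill on a miss, write back a modified line before reusing it) can be sketched as a small host-side model. This is Python and purely illustrative: the direct-mapped layout, the class name, and the 32-byte line size are assumptions for the example, not details of the actual skeleton driver. The two places marked "your code" are where the hardware-specific read and write routines would be called.

```python
class LineCache:
    """Toy direct-mapped write-back cache in front of a bytearray 'external RAM'."""

    def __init__(self, backing, line_size=32, num_lines=4):
        self.backing = backing
        self.line_size = line_size
        self.num_lines = num_lines
        self.tags = [None] * num_lines          # base address held by each slot
        self.dirty = [False] * num_lines
        self.lines = [bytearray(line_size) for _ in range(num_lines)]
        self.reads = 0                          # external line reads performed
        self.writes = 0                         # external line writes performed

    def _lookup(self, addr, for_write):
        base = addr - addr % self.line_size
        idx = (base // self.line_size) % self.num_lines
        if self.tags[idx] != base:              # miss
            if self.dirty[idx]:
                # "your code": write the modified line back before reuse
                old = self.tags[idx]
                self.backing[old:old + self.line_size] = self.lines[idx]
                self.writes += 1
            # "your code": read the requested line from external memory
            self.lines[idx][:] = self.backing[base:base + self.line_size]
            self.reads += 1
            self.tags[idx] = base
            self.dirty[idx] = False
        if for_write:
            self.dirty[idx] = True
        return self.lines[idx], addr - base

    def read_byte(self, addr):
        line, offset = self._lookup(addr, False)
        return line[offset]

    def write_byte(self, addr, value):
        line, offset = self._lookup(addr, True)
        line[offset] = value & 0xFF
```

    Note that a write only touches the cached copy; external memory is updated later, when that slot is needed for a different line.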
  • jazzedjazzed Posts: 11,803
    edited 2012-05-21 13:38
    David Betz wrote: »
    Yeah, but that's more or less like filling in the skeleton, with the added disadvantage that you may make mistakes excising the old code.

    I totally missed your earlier post about a skeleton driver. Ya, that's the right approach.