Zog - A ZPU processor core for the Prop + GNU C, C++ and FORTRAN.Now replaces S - Page 14 — Parallax Forums

Comments

  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 06:39
    Wow! Thanks for all of the detailed information. That should help a lot. I'll let you know when I get something running. I'm currently trying to see if ZOG will work for my bytecode Basic compiler that won't fit in HUB memory using Catalina C.
  • Heater.Heater. Posts: 21,230
    edited 2010-08-23 07:07
    I liked it so much I've massaged it into a FAQ file in Zog. After all there is a serious lack of documentation so far.

    If you or anyone else has some juicy questions I'll try to get into the habit of adding the info there.

    That BASIC compiler sounds intriguing. Not compiling to ZPU byte codes I guess:)
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 07:42
    Heater. wrote: »
    That BASIC compiler sounds intriguing. Not compiling to ZPU byte codes I guess:)

    I didn't know about ZPU when I wrote the original compiler but I guess it wouldn't be hard to modify it to generate ZPU instructions. I'll have to look over the instruction set. Thanks for the suggestion!
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 08:21
    Do you know if there is a PDF file describing the ZPU architecture? I found a web page describing it but it would be nice to have a reference to print out. Is there a PDF version of the architecture document that you know of?
  • Heater.Heater. Posts: 21,230
    edited 2010-08-23 09:19
    There is no nice PDF document of the ZPU instruction set.

    Some time ago I did volunteer to write one. Perhaps now is the time to do so. Be aware that there are some errors and omissions in the instruction definitions on the Zylin web site.

    As is often the case the best reference is some working code. I started with the source code of the ZPU simulator written in Java. Or there is my version in C, attached.

    Do you already have an interpreter for your byte codes on the Prop?

    Perhaps using ZPU byte codes and ZOG would be a short circuit in development of your BASIC, if the stack based ZPU machine is as efficient as you would like.
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 09:34
    Heater. wrote: »
    Do you already have an interpreter for your byte codes on the Prop?

    Perhaps using ZPU byte codes and ZOG would be a short circuit in development of your BASIC, if the stack based ZPU machine is as efficient as you would like.

    Yes, I already have an interpreter for the Propeller but ZOG still might be a better choice. I have an interpreter for my bytecodes written in C that I have run on the PIC and AVR processors and I ported it to the Propeller using Catalina C and it seems to work. I've also written one in Propeller ASM that I haven't had time to test yet. Still, ZOG would probably be a better choice. I could even support compiled Basic code linked with C/C++ code if I use ZOG!!
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 09:37
    What is required to interface the VMCOG virtual memory manager to an external SRAM chip? Is there support for SPI SRAMs?
  • Bill HenningBill Henning Posts: 6,445
    edited 2010-08-23 09:46
    Hi David,

    VMCOG comes with a built-in SPI SRAM driver for PropCade, and I am adding (hopefully today) a version for SPI RAMs on Morpheus. Either would be very easy to modify for SPI RAM on other pins.

    Later this week I am adding a driver for my FlexMem board (four-bit-wide bus using four SPI RAMs for >2MB/sec burst transfer, up to 6.6MB/sec with the two-cog special driver I am working on).

    VMCOG was designed to allow easily adding additional drivers for any sort of memory interface - I intended it to be a HAL for different memory interfaces. Please see the comments and #ifdef'ed sections in the code.

    The currently released VMCOG only supports 64KB of VM, however I am working on a version that will support at least 2MB of VM, much more if I go to a multi-level TLB.

    Regards,

    Bill

    (p.s. I used XLISP many years ago...)
    David Betz wrote: »
    What is required to interface the VMCOG virtual memory manager to an external SRAM chip? Is there support for SPI SRAMs?
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 09:59
    Later this week I am adding a driver for my FlexMem board (four-bit-wide bus using four SPI RAMs for >2MB/sec burst transfer, up to 6.6MB/sec with the two-cog special driver I am working on).

    That sounds quite cool. Is there any chance it could be made to work with a pair of SRAMs in a two bit wide bus?
    (p.s. I used XLISP many years ago...)

    Maybe I should try porting XLISP to the Propeller using ZOG/VMCOG! I'm sure there are lots of people who would want to program the Propeller in Lisp! :-)
  • Heater.Heater. Posts: 21,230
    edited 2010-08-23 10:07
    David Betz,
    I could even support compiled Basic code linked with C/C++ code if I use ZOG!!
    

    That would be brilliant. One could create applications in BASIC if that is one's preferred language and be able to make use of all kinds of C libraries. We would have common drivers for whatever devices in BASIC and C.
  • Bill HenningBill Henning Posts: 6,445
    edited 2010-08-23 10:14
    David Betz wrote: »
    That sounds quite cool. Is there any chance it could be made to work with a pair of SRAMs in a two bit wide bus?

    It would be pretty easy to do, however the speed gain over single bit access would be minimal due to the overhead.

    The main reason I have not made an 8-bit wide version is that it would require ten Prop pins, and the cost/benefit is not there - eight of the 23K256 chips is $12 for 256KB of ram, and would result in an unpalatable price for any board using eight such chips.
    David Betz wrote: »
    Maybe I should try porting XLISP to the Propeller using ZOG/VMCOG! I'm sure there are lots of people who would want to program the Propeller in Lisp! :-)

    Now that would be interesting... I wonder if I still have my old XLISP expert system code in a readable format?
  • jazzedjazzed Posts: 11,803
    edited 2010-08-23 11:29
    Heater. wrote: »
    David Betz,
    I could even support compiled Basic code linked with C/C++ code if I use ZOG!!
    

    That would be brilliant. One could create applications in BASIC if that is one's preferred language and be able to make use of all kinds of C libraries. We would have common drivers for whatever devices in BASIC and C.

    Wow! That would be great!

    Here's my latest SDRAM fibo result.
    I just couldn't help stumbling over an SDRAM Cache optimization :)

    fibo(00) = 000000 (00000ms)
    fibo(01) = 000001 (00000ms)
    fibo(02) = 000001 (00000ms)
    fibo(03) = 000002 (00000ms)
    fibo(04) = 000003 (00001ms)
    fibo(05) = 000005 (00001ms)
    fibo(06) = 000008 (00003ms)
    fibo(07) = 000013 (00005ms)
    fibo(08) = 000021 (00008ms)
    fibo(09) = 000034 (00013ms)
    fibo(10) = 000055 (00022ms)
    fibo(11) = 000089 (00036ms)
    fibo(12) = 000144 (00058ms)
    fibo(13) = 000233 (00094ms)
    fibo(14) = 000377 (00153ms)
    fibo(15) = 000610 (00248ms)
    fibo(16) = 000987 (00404ms)
    fibo(17) = 001597 (00658ms)
    fibo(18) = 002584 (01067ms)
    fibo(19) = 004181 (01725ms)
    fibo(20) = 006765 (02783ms)
    fibo(21) = 010946 (04486ms)
    fibo(22) = 017711 (07237ms)
    fibo(23) = 028657 (11687ms)
    fibo(24) = 046368 (18896ms)

    Cheers.
    --Steve
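
    For anyone who has not seen the benchmark, a table like this is what a naive recursive Fibonacci produces. The sketch below is an assumption about the shape of the test, not the actual Zog fibo source; the run time growing by roughly 1.6x per step in the log is consistent with it. `fibo_report` and the use of `clock()` are hypothetical host-side stand-ins.

```c
#include <stdio.h>
#include <time.h>

/* Naive doubly recursive Fibonacci: deliberately slow, so it exercises
   call/return and stack traffic rather than arithmetic. */
unsigned int fibo(unsigned int n)
{
    if (n < 2)
        return n;
    return fibo(n - 1) + fibo(n - 2);
}

/* Print one result line in the same layout as the log above.
   clock() is a host-side stand-in for whatever timer the real test uses. */
void fibo_report(unsigned int n)
{
    clock_t start = clock();
    unsigned int result = fibo(n);
    unsigned long ms =
        (unsigned long)((clock() - start) * 1000 / CLOCKS_PER_SEC);
    printf("fibo(%02u) = %06u (%05lums)\n", n, result, ms);
}
```

    Calling `fibo_report(n)` for n from 0 to 24 gives the same columns as the log, with host timings instead of Propeller ones.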
  • lonesocklonesock Posts: 917
    edited 2010-08-23 12:56
    Hi.

    I'm just looking over the Zog code (1.3) for the first time, and it seems to me that the 'nos' (next on stack, I'm assuming) variable is only ever used right after it's set. Is it possible that 'nos' was going to be an optimization, but then got dropped? I think you could save yourself some cycles and longs by removing it.

    Zog is looking awesome, btw! I'm really looking fwd to playing with it.

    Jonathan

    Edit: Also, it looks like the mult16x16 routine does a full 32 cycles, could you set the number of bits externally before the call?
  • Heater.Heater. Posts: 21,230
    edited 2010-08-23 13:42
    Thank you Lonesock, well spotted. Every cycle counts and LONGs are precious.

    In my C version of ZPU pretty much every instruction uses nos, which might as well have been called "temp" or something. A few of those survived the transition to PASM. I don't recall having an actual use for nos in mind, except to print a register dump compatible with that of the GHDL test harness for the VHDL ZPU, and even there it is not necessary.

    I'll have a look at the 16x16 mult.

    That reminds me, the ZPU in C should be got running under ZOG on the Prop just to complete the circle.
  • Bill HenningBill Henning Posts: 6,445
    edited 2010-08-23 13:51
    NOS is there to help remove extra stack ops

    TOS holds top of stack, but you need to pop next item off the stack to be able to do any add, sub etc op --> thus you need NOS, which really is just a temp register
    Heater. wrote: »
    Thank you Lonesock, well spotted. Every cycle counts and LONGs are precious.

    In my C version of ZPU pretty much every instruction uses nos, which might as well have been called "temp" or something. A few of those survived the transition to PASM. I don't recall having an actual use for nos in mind, except to print a register dump compatible with that of the GHDL test harness for the VHDL ZPU, and even there it is not necessary.

    I'll have a look at the 16x16 mult.

    That reminds me, the ZPU in C should be got running under ZOG on the Prop just to complete the circle.
  • Heater.Heater. Posts: 21,230
    edited 2010-08-23 14:01
    Bill, yes, but the pop routine pops into "data" which was being moved to "nos" which was then being used and then being discarded.

    So those redundant moves are gone and "nos" is now "data".
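
    The change can be pictured with a tiny stack machine in C. This is only a sketch: the `Machine` struct, `push`, `pop` and the two `op_add` variants are hypothetical names for illustration, not the Zog source. It assumes, as described above, that the top of stack is cached in `tos` and that `pop` leaves its result in `data`.

```c
#include <stdint.h>

/* Hypothetical miniature stack machine; names mirror the thread only. */
typedef struct {
    uint32_t tos;        /* top of stack, cached in a register */
    uint32_t data;       /* where pop() leaves the popped value */
    uint32_t mem[16];    /* the rest of the stack, in memory */
    int      depth;      /* number of entries currently in mem[] */
} Machine;

void push(Machine *m, uint32_t value)
{
    m->mem[m->depth++] = m->tos;   /* spill the old top to memory */
    m->tos = value;
}

void pop(Machine *m)
{
    m->data = m->mem[--m->depth];  /* next-on-stack lands in data */
}

/* Before: copy data into a separate nos, then use nos exactly once. */
void op_add_with_nos(Machine *m)
{
    uint32_t nos;
    pop(m);
    nos = m->data;                 /* the redundant move */
    m->tos += nos;
}

/* After: use data directly, saving the move and the nos register. */
void op_add(Machine *m)
{
    pop(m);
    m->tos += m->data;
}
```

    Both variants compute the same result; the second simply drops one move per operation, which is where the saved cycles and the saved LONG come from.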
  • Bill HenningBill Henning Posts: 6,445
    edited 2010-08-23 14:40
    nice :)

    Just finished making a test rig for the MCP23S17 port on the new boards; now I can finish my MCP23S17 object and easily test those ports in the future!
    Heater. wrote: »
    Bill, yes, but the pop routine pops into "data" which was being moved to "nos" which was then being used and then being discarded.

    So those redundant moves are gone and "nos" is now "data".
  • lonesocklonesock Posts: 917
    edited 2010-08-23 15:44
    Cool. Other stuff:

    * Checking for the zpu_im, you could do this instead:
                            cmpsub  data, #$80 wc           'Check for IM instruction. This saves table lookup
                  if_c      jmp     #zpu_im                 'for the most common op. 7% fibo speed gain!
    
    that way, you don't need the "and data, #$7F" line inside zpu_im, :next.

    * And you can try this for multiplication code
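                            ' make x the smaller of the 2 parameters (max/min are unsigned limits)
                            mov     t2, x
                            max     x, y                     'x := smaller of x and y
                            min     y, t2                    'y := larger of y and original x
    
    mmul                    shr     x, #1           wc,wz    'multiply: low bit of x into C, Z set once x is used up
            if_c            add     t1, y                    't1 accumulates the product (assumed cleared before entry)
                            shl     y, #1
            if_nz           jmp     #mmul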
    
    Note that the 1st 3 ops are optional, but will increase the average speed, assuming you aren't using the mult operation exclusively for squaring values [8^)

    Jonathan
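
    For readers more at home in C, the multiply loop above corresponds roughly to the sketch below. `mul_shift_add` is a hypothetical name, and `product` plays the role of `t1`, which the PASM snippet assumes is already zero. The loop ends as soon as the remaining bits of x are all zero, which is why putting the smaller operand in x helps.

```c
#include <stdint.h>

/* Shift-and-add multiply; the loop runs once per significant bit of x. */
uint32_t mul_shift_add(uint32_t x, uint32_t y)
{
    uint32_t product = 0;          /* role of t1, which must start at 0 */

    /* Make x the smaller operand so the loop exits sooner. */
    if (x > y) {
        uint32_t t = x;
        x = y;
        y = t;
    }

    while (x != 0) {
        if (x & 1)                 /* the bit PASM shifts out into carry */
            product += y;
        x >>= 1;
        y <<= 1;
    }
    return product;
}
```

    With a 16-bit operand of value 3, for example, the loop runs twice instead of sixteen times.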
  • jazzedjazzed Posts: 11,803
    edited 2010-08-23 16:13
    @Lonesock, Nice to see you and others interested in ZOG.

    @Heater,

    I was having malloc troubles ... now I don't.

    Before, I could get malloc to work like 3 times which happened to be after I ran a HUB based program. I've added some code to put garbage into memory, then clean it out to zero before I boot. If I don't clean memory before boot, malloc fails. If I clean it, malloc works reasonably. I'm still not sure why I can't successfully malloc over 16.6MB, but I don't think that's a memory hardware problem. Results posted below.

    --Steve

    Note below that "Free big buffer" after malloc means success, and "Oops ..." means malloc failed.

    Filling first 1MB memory with garbage.
    ZOG v1.2 (CACHE)
    Starting SD driver...0000FFFF
    Mounting SD...00000000
    Booting mall.bin
    00000000
    
    Filling memory with garbage .....
    Reading image... 17339 Bytes Loaded.
    Done
    Waiting 1 seconds before program check...
    
    Starting SD driver...0000FFFF
    Mounting SD...00000000
    Checking image... 17339 Bytes Checked.
    Program Load OK.
    
    Running Program!
    
    
    Malloc Testing.
    
    _hardware    = 0  _cpu_config  = 2
    _use_syscall = 0   ZPU_ID      = 18
    
    Malloc big buffer size: 1024
    Oops, malloc failed.
    Malloc big buffer size: 8192
    Oops, malloc failed.
    Malloc big buffer size: 8000000
    Oops, malloc failed.
    Malloc big buffer size: 16000000
    Oops, malloc failed.
    Malloc big buffer size: 30000000
    Oops, malloc failed.
    Malloc big buffer size: 16777216
    Oops, malloc failed.
    Malloc big buffer size: 16711680
    Oops, malloc failed.
    Malloc big buffer size: 16646144
    Oops, malloc failed.
    Malloc big buffer size: 16515072
    Oops, malloc failed.
    Malloc big buffer size: 16252928
    Oops, malloc failed.
    Malloc big buffer size: 16000000
    Oops, malloc failed.
    
    
    All done!
    

    Filling first 1MB memory with garbage, then cleaning up before boot.
    ZOG v1.2 (CACHE)
    Starting SD driver...0000FFFF
    Mounting SD...00000000
    Booting mall.bin
    00000000
    
    Filling memory with garbage .....
    Clearing memory .........
    Reading image... 17339 Bytes Loaded.
    Done
    Waiting 1 seconds before program check...
    
    Starting SD driver...0000FFFF
    Mounting SD...00000000
    Checking image... 17339 Bytes Checked.
    Program Load OK.
    
    Running Program!
    
    
    Malloc Testing.
    
    _hardware    = 0  _cpu_config  = 2
    _use_syscall = 0   ZPU_ID      = 18
    
    Malloc big buffer size: 1024
    Free big buffer.
    Malloc big buffer size: 8192
    Free big buffer.
    Malloc big buffer size: 8000000
    Free big buffer.
    Malloc big buffer size: 16000000
    Free big buffer.
    Malloc big buffer size: 30000000
    Oops, malloc failed.
    Malloc big buffer size: 16777216
    Oops, malloc failed.
    Malloc big buffer size: 16711680
    Oops, malloc failed.
    Malloc big buffer size: 16646144
    Free big buffer.
    Malloc big buffer size: 16515072
    Free big buffer.
    Malloc big buffer size: 16252928
    Free big buffer.
    Malloc big buffer size: 16000000
    Free big buffer.
    
    
    All done!
    
  • Ding-BattyDing-Batty Posts: 301
    edited 2010-08-23 16:24
    jazzed,

    FWIW, the value 16646144 == (16 * 1024 * 1024) - (128 * 1024), which is easy to see if you look at the number in hex (0xFE0000). Or in other words, your maximum allocation seems to be one half of (32 MB - 256 KB). Perhaps that might give some hints about how the C runtime heap subsystem is managing memory.

    Also, 16646144 bytes is 16.6 MB only if a megabyte is 1000 * 1000 bytes; it is not a "binary" 16.6 MB -- it is really 15.875 MB.
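
    The arithmetic is easy to check in C. A small sketch follows; the function names are made up for illustration, and the only input taken from the log is the largest size that succeeded.

```c
#include <stdint.h>

/* Largest allocation that succeeded in the log above. */
#define LARGEST_OK 16646144u

/* 16 MB minus 128 KB in binary units, written with shifts. */
uint32_t sixteen_mib_minus_128_kib(void)
{
    return (16u << 20) - (128u << 10);   /* 0x1000000 - 0x20000 = 0xFE0000 */
}

/* Size in binary megabytes (1024 * 1024 bytes each). */
double to_binary_mb(uint32_t bytes)
{
    return bytes / (1024.0 * 1024.0);
}

/* Size in decimal megabytes (1000 * 1000 bytes each). */
double to_decimal_mb(uint32_t bytes)
{
    return bytes / 1.0e6;
}
```

    `to_binary_mb(LARGEST_OK)` comes out at exactly 15.875, while `to_decimal_mb(LARGEST_OK)` is a little over 16.6.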
  • Heater.Heater. Posts: 21,230
    edited 2010-08-23 17:22
    Lonesock,

    Great stuff re: IM and CMPSUB.

    IM is on the critical path as it is the most commonly used instruction. Removing the dispatch table lookup for IM made the fibo test 7% faster when running from HUB.

    Looking at IM, I have now thrown out that decode_mask and used self-modifying code for the jmp in the execute loop to get to the right IM. Saves another LONG.

    So IM is now:
    zpu_im_next             shl     tos, #7
                            or      tos, data
                            jmp     #done_and_inc_pc
    
    zpu_im_first            call    #push_tos
                            mov     tos, data
                            shl     tos, #(32 - 7)          'Sign extend
                            sar     tos, #(32 - 7)
                            movs    which_im, #zpu_im_next  
    

    and in the execute loop we have:
                            cmpsub  data, #$80 wc           'Check for IM instruction. This saves table lookup
    which_im      if_c      jmp     #zpu_im_first           'for the most common op. 7% fibo speed gain!
                            movs    which_im, #zpu_im_first 'Self modifying code at which_im selects, first or subsequent IM.
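
    The same IM logic, sketched in C for reference. This is an illustration under assumptions, not the Zog code: the `Zpu` struct and `step` are hypothetical, and the `prev_was_im` flag stands in for the self-modified jump at which_im. The first IM in a run pushes and sign-extends its 7 bits; each following IM shifts the value left by 7 and ORs in 7 more.

```c
#include <stdint.h>

/* Hypothetical decoder state; field names are for illustration only. */
typedef struct {
    int32_t tos;          /* top of stack */
    int32_t stack[16];    /* remainder of the stack */
    int     depth;
    int     prev_was_im;  /* stands in for the self-modified jump target */
} Zpu;

/* Execute one opcode; returns 1 if it was an IM, 0 otherwise. */
int step(Zpu *z, uint8_t opcode)
{
    if (opcode & 0x80) {                        /* high bit set: IM */
        uint32_t imm = opcode & 0x7F;           /* 7-bit payload */
        if (z->prev_was_im) {
            /* Subsequent IM: append 7 more bits below the existing value. */
            z->tos = (int32_t)(((uint32_t)z->tos << 7) | imm);
        } else {
            /* First IM: push the old top, then sign-extend bit 6. */
            z->stack[z->depth++] = z->tos;
            z->tos = (int32_t)imm - (int32_t)((imm & 0x40) << 1);
        }
        z->prev_was_im = 1;
        return 1;
    }
    z->prev_was_im = 0;                         /* any other opcode resets it */
    /* ... dispatch for the other opcodes would go here ... */
    return 0;
}
```

    The PASM version gets the effect of `prev_was_im` for free: the jump target itself is rewritten, so no flag has to be tested on the critical path.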
    

    Jazzed,

    That's good. So something is not being initialized properly, that's not good but perhaps we can live with zeroing RAM on start up. How long does that take?
  • jazzedjazzed Posts: 11,803
    edited 2010-08-23 17:38
    GCC malloc is pretty complicated. I can't imagine it needing so much memory for managing memory, but it is possible. Meanwhile I'll see what I can do with what malloc gives me. Maybe if I ask for memory in chunks rather than one big blob, it will give me more.

    Those observations about hex numbers are dead-on. Here's part of my test code:
        malloctest((1<<24));
        for(n = 16; n < 20; n++)
            malloctest((1<<24)-(1<<n));
        malloctest(16000000);
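
    The body of malloctest is not shown in the thread. A minimal sketch that would produce the log lines above might look like the following; the signature, the return value and the `run_sweep` wrapper are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: try one allocation and report it in the log's wording.
   Returns 1 on success and 0 on failure so callers can tally results. */
int malloctest(size_t size)
{
    char *buffer;

    printf("Malloc big buffer size: %lu\n", (unsigned long)size);
    buffer = malloc(size);
    if (buffer == NULL) {
        printf("Oops, malloc failed.\n");
        return 0;
    }
    if (size > 0) {
        buffer[0] = 1;            /* touch both ends so the block is used */
        buffer[size - 1] = 1;
    }
    printf("Free big buffer.\n");
    free(buffer);
    return 1;
}

/* The sweep from the post: 1<<24 bytes, then that size less 64 KB,
   128 KB, 256 KB and 512 KB, then 16000000. */
int run_sweep(void)
{
    int n, passed = 0;

    passed += malloctest((size_t)1 << 24);
    for (n = 16; n < 20; n++)
        passed += malloctest(((size_t)1 << 24) - ((size_t)1 << n));
    passed += malloctest(16000000);
    return passed;
}
```

    On a desktop host all six requests normally succeed; on the ZPU target the first two in the sweep are the ones that failed in the log.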
    

    @Heater, I'll do a couple of more days testing and then look at some integration possibilities. You've seen the zog kernel changes I made ... there are 3 places where I in-lined code rather than calling zpu_cache to up performance: execute, push_tos, and pop. Everything is enclosed in USE_JCACHED_MEM. I'm still working in v1_2. Do you want me to migrate to v1_3? I have to get back into hardware mode real soon ....

    I really would like to get this running with Catalina which is really complicated to me, but I think I'm safe waiting until after PCBs go out to FAB before getting back to it.

    As far as zeroing at startup goes, it's all in spin and I'm just clearing 128KB for testing which takes a couple of seconds. I'm not doing anything smart with it such as looking for .bss section which if I remember correctly needs to be zero'd - I noticed heap_ptr lives there. The end of the binary appears to be the beginning of .bss, so you could probably get away with just clearing a chunk after the program load. I think an intel-hex or other hex file would have the .bss size.
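
    The "clear a chunk after the program load" idea could look something like this in C. It is a sketch only: the real loader is in Spin, `clear_after_image` is a hypothetical name, and the amount to clear is a guess unless the true .bss size is known.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zero a region of RAM that starts right after the loaded image.
   ram        : base of the memory the program was loaded into
   ram_size   : total bytes available at ram
   image_size : bytes occupied by the binary (17339 in the log above)
   clear_size : how much to zero; a guess unless the .bss size is known
   Returns the number of bytes actually cleared. */
size_t clear_after_image(uint8_t *ram, size_t ram_size,
                         size_t image_size, size_t clear_size)
{
    if (image_size >= ram_size)
        return 0;                               /* nothing left to clear */
    if (clear_size > ram_size - image_size)
        clear_size = ram_size - image_size;     /* clamp to end of RAM */
    memset(ram + image_size, 0, clear_size);
    return clear_size;
}
```

    This only helps if .bss really does start at the end of the binary, which is an observation from this one image rather than a guarantee.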

    Cheers,
    --Steve
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 19:36
    I'm trying to get debug_zog.spin to work on my Hydra and am blocked by what seems like a trivial change. To get the clock setup correctly for the Hydra I added the following code to debug_zog.spin expecting to be able to use -DHYDRA on the bstc command line to select the HYDRA clock settings (and maybe other Hydra-specific stuff later). Unfortunately, bstc doesn't like this code. What am I doing wrong?
    #ifdef HYDRA
    _clkmode        = xtal1 + pll8x
    _xinfreq        = 10_000_000
    #else
    _clkmode        = xtal1 + pll16x
    _xinfreq        = 6_553_600
    #endif
    

    The bstc manual says that #ifdef, #else, and #endif are supported. I'm running bstc 0.15.3.
  • jazzedjazzed Posts: 11,803
    edited 2010-08-23 19:43
    David Betz wrote: »
    The bstc manual says that #ifdef, #else, and #endif are supported. I'm running bstc 0.15.3.
    bstc -Ox debug_zog
    bstc without arguments gives you a list of options.

    Other popular bstc options
    bstc -Ocgrux -d COM4 -p0 debug_zog

    The bst IDE is also available and has similar look-feel as propeller tool and supports bstc options.

    --Steve
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 19:55
    jazzed wrote: »
    bstc -Ox debug_zog
    bstc without arguments gives you a list of options.

    Other popular bstc options
    bstc -Ocgrux -d COM4 -p0 debug_zog

    The bst IDE is also available and has similar look-feel as propeller tool and supports bstc options.

    --Steve

    Thanks for the tip on using -Ox. That worked great!

    I also figured out that it doesn't seem to like -d COM11. I had to use -d \\.\COM11 instead. I guess that's really a problem with Windows not with bstc though.

    I haven't tried the IDE because I want to run bstc from a Makefile. I'm an old geezer and tend to use command line tools if I can. :-)
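
    On the COM11 point: Windows needs the \\.\ device prefix for COM ports numbered 10 and above, and the prefix is also accepted for COM1 to COM9, so a script or Makefile helper can simply always add it. A small C sketch of building that name follows; `com_device_path` is a hypothetical helper, and nothing here is specific to bstc.

```c
#include <stdio.h>

/* Build a Win32 device path such as \\.\COM11 from a port number.
   Returns the number of characters written, or -1 if the buffer
   is too small. */
int com_device_path(char *out, size_t out_size, unsigned int port)
{
    int n = snprintf(out, out_size, "\\\\.\\COM%u", port);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

    The doubled backslashes are C string escaping; the resulting text is what gets passed after -d.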
  • David BetzDavid Betz Posts: 14,511
    edited 2010-08-23 20:16
    I got the bstc download working and now I'm having trouble getting ZOG to run on the Hydra. I changed the clock speed and I'm using the updated test_libzog.bin file that was built for 80 MHz. I also uncommented the setup of the serial port and the first few debug messages in debug_zog.spin, but I get no output on my serial terminal when I run the program. I'm using the test.bin file created by building the 'test' directory in the ZOG distribution archive. I'm using putty as a terminal program set to 115200 baud and connected to the COM port with the Propeller attached. Any idea what might be going wrong?

    Thanks!
    David
  • jazzedjazzed Posts: 11,803
    edited 2010-08-23 20:47
    @Heater,

    I've run into some trouble with malloc testing and USE_HUB_MEMORY. I'm attaching my source that has output for you to have a look. I'm scratching my head a bit over it.

    --Steve

    Hmm. I do the same test on SDRAM Cache and it works either way. I suppose some limit was breached using HUB - don't know for sure. Now I'm testing 8MB SDRAM and other sizes. Will post results later.
  • jazzedjazzed Posts: 11,803
    edited 2010-08-23 21:05
    @David,

    Putty with a serial port? I *still* learn something new everyday.

    If you're using debug_zog.spin, test_libzog.bin doesn't seem to matter.

    Still, it sounds like you've done everything right so far. One thing to try is using the fibo.bin or hello.bin instead of test.bin. Are there any LEDs blinking on your Hydra? Can you use the Parallax Serial Terminal for your serial port? It comes with Propeller Tool.

    I lost my Hydra 10MHz crystal, so I can't really test your clock settings exactly. The fibo.bin works with my 5MHz Hydra though.
  • Heater.Heater. Posts: 21,230
    edited 2010-08-24 00:28
    Jazzed:
    Do you want me to migrate to v1_3?

    Perhaps that is best. We now have some changes in the pipe for v1_4 but they can be easily merged with your mods.

    I'm surprised that anything needs zeroing prior to start up. After all, ZPU code normally runs on a core in an FPGA with no loader or other OS support. Do FPGAs zero their RAM blocks on reset?

    I'll have a look at your malloc test when I have a moment.
    Catalina which is really complicated to me

    Memo to Zog Inc. marketing department: "Zog is much simpler than other C solutions for the Propeller":)
  • Heater.Heater. Posts: 21,230
    edited 2010-08-24 00:39
    David Betz,

    A little check list:

    1) Forget about run_zog and test_libzog for now. They are somewhat experimental and do not work from external RAM at the moment.

    2) Start with debug_zog and fibo, be sure you can compile it and make a fibo.bin. Copy fibo.bin to the zog directory.

    3) Be sure you have "fibo.bin" in the file statement in debug_zog and also in the parameter to sd.popen in the load_bytecode method.

    4) Be sure you have "USE_VIRTUAL_MEMORY" commented out and "USE_HUB_MEMORY" is in effect.

    5) When that works simply reverse the defines to use external RAM.

    6) If it does not work, in either case, try uncommenting SINGLE_STEP. We should be able to see something happening with that. Just hit SPACE to execute single instructions.

    This all assumes your serial connection is working, easy to see with debug_zog and that your SD card is working for the ext RAM load. Again easy to see with debug_zog.