It really depends on how many symbols you have in the program. For small numbers of symbols then sure hash-bucket collisions will be rare, but for larger numbers of symbols you will get bucket collisions unless you use a vary large number of buckets.
P2 programs are able to be much much larger with a lot more symbols.
It really depends on how many symbols you have in the program. For small numbers of symbols then sure hash-bucket collisions will be rare, but for larger numbers of symbols you will get bucket collisions unless you use a vary large number of buckets.
P2 programs are able to be much much larger with a lot more symbols.
Yes, I agree. It just doesn't seem like comparing full hashes would be that much more efficient than comparing lengths, when maybe you'll never need to walk more than three entries, worst case. Chances are far greater that lengths will be the same than hashes, but the incidence of either, on top of a bucket collision, is near zero.
P.S. I think I'm just wanting to economize, as a byte for the length is less than a long for the hash. And, the length byte is going to be there in any case, since the symbol name is packed into the linked record.
Well yeah, ultimately it doesn't matter that much what you do once you get to the bucket, most of the time savings is in getting there.
My hashtable code is generic and doesn't know anything about the data stored in it. My entry is a hash key, a pointer to the entry data, and a next pointer (for buckets with multiple entries). The lookup is a FindFirst/FindNext mechanism.
It's something I had already before I made OpenSpin. In reality, my hashtable is a combined hashtable and linked list (so there is a separate list next pointer in the entries), so you can walk all the entries in the table without keys.
It's the FindSymbol() function in the symbol engine that knows what the data is and does the string compare of the find results.
I got all the fast symbol stuff working, thanks to Roy. I rewrote all my symbol table code to work with the hashing. I need to find a big program that is typically slow to compile and then check the difference.
Here is my new symbol code in the Spin/PASM compiler:
;
;
;************************************************************************
;* Symbol Engine *
;************************************************************************
;
;
; Hash table index ($1000 elements):
;
; long: pointer to symbol record (0=no record)
;
; Symbol records:
;
; byte: symbol length, including terminating zero
; bytes: symbol chrs + 0
; long: symbol value
; byte: symbol type
; long: pointer to next record (0=no record)
;
;
; Variables
;
ddx symbol_ptr ;these three must be set to the current symbol set
ddx symbol_ptr_limit
ddx symbol_hash_ptr
ddx auto_symbols_hash,1000h ;auto symbols
dbx auto_symbols,auto_symbols_limit
ddx auto_symbols_ptr
ddx main_symbols_hash,1000h ;main symbols
dbx main_symbols,main_symbols_limit
ddx main_symbols_ptr
ddx local_symbols_hash,1000h ;local symbols
dbx local_symbols,local_symbols_limit
ddx local_symbols_ptr
dbx symbol,symbol_limit+2 ;+2 for obj.method extra byte and 0
dbx symbol2,symbol_limit+2
;
;
; Enter auto symbols into hashed symbol table
;
enter_auto_symbols:
call reset_auto_symbols
call write_auto_symbols
lea esi,[automatic_symbols] ;reset pointers
@@next: call hash_symbol ;hash symbol name to get length
lea edi,[symbol2] ;copy symbol name to symbol2
rep movsb
lodsd ;get value
mov ebx,eax
lodsb ;get type
call enter_symbol2 ;enter symbol
cmp [byte esi],0 ;end of automatic symbols?
jne @@next
@@done: ret
;
;
; Reset auto/main/local symbols
;
reset_auto_symbols:
mov edi,offset auto_symbols_hash
jmp reset_hash_table
reset_main_symbols:
mov edi,offset main_symbols_hash
jmp reset_hash_table
reset_local_symbols:
mov edi,offset local_symbols_hash
reset_hash_table:
xor eax,eax
mov ecx,1000h
rep stosd
ret
;
;
; Write auto/main/local symbols
;
write_auto_symbols:
mov [symbol_ptr], offset auto_symbols
mov [symbol_ptr_limit], offset auto_symbols + auto_symbols_limit - (1+32+1+4+1+4)
mov [symbol_hash_ptr], offset auto_symbols_hash
ret
write_main_symbols:
mov [symbol_ptr], offset main_symbols
mov [symbol_ptr_limit], offset main_symbols + main_symbols_limit - (1+32+1+4+1+4)
mov [symbol_hash_ptr], offset main_symbols_hash
ret
write_local_symbols:
mov [symbol_ptr], offset local_symbols
mov [symbol_ptr_limit], offset local_symbols + local_symbols_limit - (1+32+1+4+1+4)
mov [symbol_hash_ptr], offset local_symbols_hash
ret
;
;
; Enter symbol2 into symbol table
; symbol2 must hold name, terminated with 0
; al must hold type
; ebx must hold value
;
enter_symbol2: push eax
push ecx
push edx
push esi
push edi
push eax ;save type
mov edi,[symbol_ptr] ;get symbol record pointer
cmp edi,[symbol_ptr_limit] ;make sure symbol table has enough room for another symbol
jae error_stif
lea esi,[symbol2] ;hash symbol2 name to get length and hash index
call hash_symbol
add edx,[symbol_hash_ptr] ;get hash table pointer
jmp @@check ;append new record
@@skip: mov edx,eax ;point to next record
movzx eax,[byte edx] ;skip over record
add eax,1+4+1 ;account for length byte, (symbol length in eax), value long, type byte
add edx,eax
@@check: mov eax,[edx] ;check link
or eax,eax
jnz @@skip ;if not zero, skip again
mov [edx],edi ;link and enter record
mov al,cl ;enter symbol length
stosb
rep movsb ;enter symbol name with terminating zero
mov eax,ebx ;enter symbol value
stosd
pop eax ;enter symbol type
stosb
xor eax,eax ;enter new link terminator
stosd
mov [symbol_ptr],edi
pop edi
pop esi
pop edx
pop ecx
pop eax
ret
;
;
; Find symbol in auto/main/local symbol table
; symbol must hold name, terminated with 0
; if found, eax=type and ebx=value
; if not found, eax=0 (type_undefined) and ebx=0
;
find_symbol: push ecx
push edx
push esi
push edi
lea esi,[symbol] ;hash symbol name to get length and hash index
call hash_symbol
mov ebx,offset auto_symbols_hash ;search auto symbols
call @@find
jnc @@found
mov ebx,offset main_symbols_hash ;search main symbols
call @@find
jnc @@found
mov ebx,offset local_symbols_hash ;search local symbols
call @@find
jnc @@found
xor eax,eax ;symbol not found
xor ebx,ebx
@@found: pop edi
pop esi
pop edx
pop ecx
ret
@@find: push edx ;preserve hash table index
add edx,ebx ;get hash table pointer
@@link: mov edx,[edx] ;check for symbol record
cmp edx,1 ;if record < 1 then c=1
jc @@nope ;if no record, symbol not found, c=1
movzx eax,[byte edx] ;get symbol length
inc edx ;point edi to symbol
mov edi,edx
add edx,eax ;point edx to next link
add edx,4+1
cmp eax,ecx ;if symbol size mismatch, check next link
jne @@link
push ecx ;symbol size match, compare symbol names
push esi
repe cmpsb ;c=0 if match
pop esi
pop ecx
jne @@link ;if names mismatch, check next link
mov ebx,[edx-1-4] ;found symbol, get value
movzx eax,[byte edx-1] ;get type, c=0
@@nope: pop edx
ret
;
;
; Hash symbol at esi
; on exit:
; ecx = symbol length, including terminating zero
; edx = hash index
;
hash_symbol: push eax
push esi ;save symbol pointer
xor ecx,ecx ;reset symbol length
xor edx,edx ;reset hash
@@hash: xor eax,eax ;get symbol chr into eax
lodsb
inc ecx ;inc length
cmp al,0 ;if chr = 0, finish hash
je @@finish
add edx,eax ;hash += chr
mov eax,edx ;hash += (hash << 10)
shl eax,10
add edx,eax
mov eax,edx ;hash ^= (hash >> 6)
shr eax,6
xor edx,eax
jmp @@hash ;next chr
@@finish: mov eax,edx ;hash += (hash << 3)
shl eax,3
add edx,eax
mov eax,edx ;hash ^= (hash >> 11)
shr eax,11
xor edx,eax
mov eax,edx ;hash += (hash << 15)
shl eax,15
add edx,eax
mov eax,edx ;make 12-bit hash index
ror eax,16
xor edx,eax
mov eax,edx
shr eax,32-4
xor edx,eax
and edx,0FFFh
shl edx,2
pop esi ;restore symbol pointer, ecx=symbol length, edx=hash index
pop eax
ret
;
;
; Backup symbol to symbol2
;
backup_symbol: push ecx
push esi
push edi
lea esi,[symbol]
lea edi,[symbol2]
mov ecx,symbol_limit+2
rep movsb
pop edi
pop esi
pop ecx
ret
Okay, here is the latest PNut_v34b.exe with Spin2. Spin2 is not documented, yet, but you can get a taste of it here. This compiler should be super fast now with the hashed symbol tables.
There is a Spin2 VGA demo and driver in this .zip file. Also, the current Spin2 interpreter and flash loader sources are there, and they are now built into PNut. Just an F10 or F11 runs or programs your code into the P2.
Yes, if the goal is to put something native on P2, chip working in assembly right now makes a lot of sense.
It's just habit for me, mainly. It only took several hours to get the whole hashing thing worked out and implemented into the symbol table system, thanks to Roy's timely help. I was thinking I wouldn't get to this for a few more months because of a bunch of other stuff I need to do, but I had a dream about it the other night and I had a strong confidence I could do it. Everything Roy ever said about his solution stuck in my mind.
Now, has anybody even tried it out? Spin2 is working. That's the bigger news. The faster symbol system just makes it a lot snappier than it's ever been.
Yes, if the goal is to put something native on P2, chip working in assembly right now makes a lot of sense.
This was x86 assembly in the compiler, not PASM.
I realize Chip is writing what he knows, but the compiler is only maintainable by him, which makes him a source of contention in Parallax's workflow.
I don't care is pulling in the STL causes some minor bloat, it eliminates technical debt, uses a standard interface, reduces security vulnerabilities, and it makes the code maintainable by other people.
The P* ecosystem is so fragmented right now it's crazy. There are what, 4 or 5 spin compilers and 3 C compilers at this point?
l> @pedward said:
> The P* ecosystem is so fragmented right now it's crazy. There are what, 4 or 5 spin compilers and 3 C compilers at this point?
Yep! But IMHO, that really should be listed as a feature, not a bug. To wit: If half of the compiler creators were simultaneously kidnapped by aliens, we’d *still* have a nice, maintainable tool set.
Maybe if Parallax was a billion dollar company, with 30 people committed 24/7 to developing these languages against a firm release date and a published hardware spec, this criticism might stick. Instead we have a virtual do-ocracy, where some unpaid enthusiasts saw a need, picked-up the ball, and ran with it. For free. Epically.
So I’m OK with the fragmentation. And I’m frankly impressed with how the individual creators have negotiated some standards and achieved a respectable degree of interoperability. And they did all of this for free, without any official support from Parallax, or even a comprehensive hardware specification, and they did it before the chip was even officially released.
> @"Bob Lawrence (VE1RLL)" said:
> @ cgracey
> Now, has anybody even tried it out? Spin2 is working.
>
>
>
>
> Excellent!!. Thanks Chip!!
>
> It's working great
Try F11. It will run the program on every reset and power-up. Be sure to turn on the DIP switch for the flash.
This limits things, forces thought back down to the nubs. The product is lean and mean.
It will all translate to PASM and be the same way.
Had it been written with something big, higher level, what got done just now would be a dream, likely never completed.
Early on, the point of this exercise was to design chip and tools as one cohesive thing. Now that goal has been actualized, Chip can continue on with some vision he has had for an long time.
Meanwhile, there are a ton of tools! Nobody is really inhibited at this point.
Beautiful to see.
Remember the "only Prop Tool" discussion on P1, until Bill thought up LMM?
pedward,
I code for a living in C++ making commercial products, we don't use STL hardly at all. It's slow (unless you do a bunch of extra work with overriding memory allocation), bloated, and not really cross platform (many things don't work on many platforms).
STL is cool and all if you don't have alternatives, but don't think of it as the end all be all, and certainly be careful how you use it, because it lets you paint yourself into really bad performance.
However, if Chip was using C or C++ he could have used mine, and I could have helped more along the way.
Yes, if the goal is to put something native on P2, chip working in assembly right now makes a lot of sense.
It's just habit for me, mainly. It only took several hours to get the whole hashing thing worked out and implemented into the symbol table system, thanks to Roy's timely help. I was thinking I wouldn't get to this for a few more months because of a bunch of other stuff I need to do, but I had a dream about it the other night and I had a strong confidence I could do it. Everything Roy ever said about his solution stuck in my mind.
Now, has anybody even tried it out? Spin2 is working. That's the bigger news. The faster symbol system just makes it a lot snappier than it's ever been.
Thanks heaps Chip
Unfortunately I won't get a chance to try it out until next w/e. I have some great things that require Spin2 and I really wanted to keep it in the Spin2 language. First cab off the rank is Kye's FAT32 driver - I've done the P2 AMS section months ago. Once that is working, my OS for P2 will almost fall out as done
l> @pedward said:
> The P* ecosystem is so fragmented right now it's crazy. There are what, 4 or 5 spin compilers and 3 C compilers at this point?
Yep! But IMHO, that really should be listed as a feature, not a bug. To wit: If half of the compiler creators were simultaneously kidnapped by aliens, we’d *still* have a nice, maintainable tool set.
Maybe if Parallax was a billion dollar company, with 30 people committed 24/7 to developing these languages against a firm release date and a published hardware spec, this criticism might stick. Instead we have a virtual do-ocracy, where some unpaid enthusiasts saw a need, picked-up the ball, and ran with it. For free. Epically.
So I’m OK with the fragmentation. And I’m frankly impressed with how the individual creators have negotiated some standards and achieved a respectable degree of interoperability. And they did all of this for free, without any official support from Parallax, or even a comprehensive hardware specification, and they did it before the chip was even officially released.
So I’m OK with the fragmentation. And I’m frankly impressed with how the individual creators have negotiated some standards and achieved a respectable degree of interoperability. And they did all of this for free, without any official support from Parallax, or even a comprehensive hardware specification, and they did it before the chip was even officially released.
I have to second this. Nice work people. It's super appreciated. High value.
l> @pedward said:
> The P* ecosystem is so fragmented right now it's crazy. There are what, 4 or 5 spin compilers and 3 C compilers at this point?
Yep! But IMHO, that really should be listed as a feature, not a bug. To wit: If half of the compiler creators were simultaneously kidnapped by aliens, we’d *still* have a nice, maintainable tool set.
Maybe if Parallax was a billion dollar company, with 30 people committed 24/7 to developing these languages against a firm release date and a published hardware spec, this criticism might stick. Instead we have a virtual do-ocracy, where some unpaid enthusiasts saw a need, picked-up the ball, and ran with it. For free. Epically.
So I’m OK with the fragmentation. And I’m frankly impressed with how the individual creators have negotiated some standards and achieved a respectable degree of interoperability. And they did all of this for free, without any official support from Parallax, or even a comprehensive hardware specification, and they did it before the chip was even officially released.
Oh, I forgot to mention that Jeff Martin is plugging the new compiler into the PropellerTool. That has scroll bars and context highlighting.
What happened to PropellerIDE? I thought that was going to be the cross platform replacement for the PropellerTool. Will PropellerIDE be updated to work with Spin2?
Comments
P2 programs are able to be much much larger with a lot more symbols.
Yes, I agree. It just doesn't seem like comparing full hashes would be that much more efficient than comparing lengths, when maybe you'll never need to walk more than three entries, worst case. Chances are far greater that lengths will be the same than hashes, but the incidence of either, on top of a bucket collision, is near zero.
P.S. I think I'm just wanting to economize, as a byte for the length is less than a long for the hash. And, the length byte is going to be there in any case, since the symbol name is packed into the linked record.
My hashtable code is generic and doesn't know anything about the data stored in it. My entry is a hash key, a pointer to the entry data, and a next pointer (for buckets with multiple entries). The lookup is a FindFirst/FindNext mechanism.
It's something I had already before I made OpenSpin. In reality, my hashtable is a combined hashtable and linked list (so there is a separate list next pointer in the entries), so you can walk all the entries in the table without keys.
It's the FindSymbol() function in the symbol engine that knows what the data is and does the string compare of the find results.
Here is my new symbol code in the Spin/PASM compiler:
https://drive.google.com/file/d/1HWnhswX2fd3W9H1gYSFqNeWToBMO7N7T/view?usp=sharing
There is a Spin2 VGA demo and driver in this .zip file. Also, the current Spin2 interpreter and flash loader sources are there, and they are now built into PNut. Just an F10 or F11 runs or programs your code into the P2.
Here is the latest Spin2 operators file.
It's just habit for me, mainly. It only took several hours to get the whole hashing thing worked out and implemented into the symbol table system, thanks to Roy's timely help. I was thinking I wouldn't get to this for a few more months because of a bunch of other stuff I need to do, but I had a dream about it the other night and I had a strong confidence I could do it. Everything Roy ever said about his solution stuck in my mind.
Now, has anybody even tried it out? Spin2 is working. That's the bigger news. The faster symbol system just makes it a lot snappier than it's ever been.
This was x86 assembly in the compiler, not PASM.
I realize Chip is writing what he knows, but the compiler is only maintainable by him, which makes him a source of contention in Parallax's workflow.
I don't care is pulling in the STL causes some minor bloat, it eliminates technical debt, uses a standard interface, reduces security vulnerabilities, and it makes the code maintainable by other people.
The P* ecosystem is so fragmented right now it's crazy. There are what, 4 or 5 spin compilers and 3 C compilers at this point?
Looking forward to putting Spin2 to work.
+1
> The P* ecosystem is so fragmented right now it's crazy. There are what, 4 or 5 spin compilers and 3 C compilers at this point?
Yep! But IMHO, that really should be listed as a feature, not a bug. To wit: If half of the compiler creators were simultaneously kidnapped by aliens, we’d *still* have a nice, maintainable tool set.
Maybe if Parallax was a billion dollar company, with 30 people committed 24/7 to developing these languages against a firm release date and a published hardware spec, this criticism might stick. Instead we have a virtual do-ocracy, where some unpaid enthusiasts saw a need, picked-up the ball, and ran with it. For free. Epically.
So I’m OK with the fragmentation. And I’m frankly impressed with how the individual creators have negotiated some standards and achieved a respectable degree of interoperability. And they did all of this for free, without any official support from Parallax, or even a comprehensive hardware specification, and they did it before the chip was even officially released.
IMHO, thats just really cool. Kudos to them!
The PNut text editor doesn't show any scrollbars. I'm using Windows 7 Pro x64.
Would be nice to have a vertical scroll bar at least.
Excellent!!. Thanks Chip!!
It's working great
Future improvements to make:
1) automatic constant resolving
2) dead code removal
3) debugging - a biggie, formative
I know we need a scroll bar.
> @ cgracey
> Now, has anybody even tried it out? Spin2 is working.
>
>
>
>
> Excellent!!. Thanks Chip!!
>
> It's working great
Try F11. It will run the program on every reset and power-up. Be sure to turn on the DIP switch for the flash.
Really glad you got here Chip!
@pedward
Yeah, lots of tools. Wonderful problem to have.
Re: Chip working x86 assembly
This limits things, forces thought back down to the nubs. The product is lean and mean.
It will all translate to PASM and be the same way.
Had it been written with something big, higher level, what got done just now would be a dream, likely never completed.
Early on, the point of this exercise was to design chip and tools as one cohesive thing. Now that goal has been actualized, Chip can continue on with some vision he has had for an long time.
Meanwhile, there are a ton of tools! Nobody is really inhibited at this point.
Beautiful to see.
Remember the "only Prop Tool" discussion on P1, until Bill thought up LMM?
Not even a thought on P2, we have HUBEXEC.
I code for a living in C++ making commercial products, we don't use STL hardly at all. It's slow (unless you do a bunch of extra work with overriding memory allocation), bloated, and not really cross platform (many things don't work on many platforms).
STL is cool and all if you don't have alternatives, but don't think of it as the end all be all, and certainly be careful how you use it, because it lets you paint yourself into really bad performance.
However, if Chip was using C or C++ he could have used mine, and I could have helped more along the way.
Anyway, really glad to see Spin2 in the wild.
Excellent!
Hope to hook my debugger to Spin2 soon and watch it work it's magic.
For anyone that wants to print it , it's only 49 pages
Thanks heaps Chip
Unfortunately I won't get a chance to try it out until next w/e. I have some great things that require Spin2 and I really wanted to keep it in the Spin2 language. First cab off the rank is Kye's FAT32 driver - I've done the P2 AMS section months ago. Once that is working, my OS for P2 will almost fall out as done
+1
!!! That's going to be very fun.
I have to second this. Nice work people. It's super appreciated. High value.
Thank you. Mean it.
+1