Is There A Better Way?
JonnyMac
Posts: 9,107
I'm working on a program where I need to look for a sub-string inside a larger string. I knocked this together. Is there a better way?
pub find(p_needle, p_haystack) : pos | nlen, hlen, hidx, nidx, check '' Returns index of sub-string p_needle in string p_haystack '' -- if not found, returns -1 nlen := strsize(p_needle) hlen := strsize(p_haystack) if (nlen > hlen) return -1 repeat hidx from 0 to (hlen-nlen+1) check := 0 repeat nidx from 0 to nlen-1 if (byte[p_needle+nidx] == byte[p_haystack+hidx+nidx]) ++check else quit if (check == nlen) return hidx return -1
This discussion has been closed.
Comments
There is an error in my code with single character strings, they will be always found at begin.
The inner loop needs to be changed like that:
But I also tried an inline PASM version of my code, and that is about 10 times faster:
I also compared it with fastspin, and here are the results: Edit: This was a 3 char string that was found at 35th position in a big string
Andy
I know the P1 would calculate it each loop.
If Spin2 does calculate the value each loop, it would likely speed up the method to calculate the value once and save it to a size variable.
I don't think it does; I moved that expression out of the repeat line (to a variable), and the loop ran a little slower. I think that particular expression is evaluated once with the value moved to an internal control variable.
Yes, with "REPEAT N" loops, N is only evaluated when entering the loop, even in Spin1.
I would have thought this is the way Spin behaves, but I recall a thread by Phil about his surprise when he discovered this value was calculated each loop.
I'll need to find the thread to prove to myself I'm not crazy.
"REPEAT i FROM X TO Y (STEP Z)" type loops do evaluate X, Y and Z once per iteration. Are you thinking of those?
Edit: Wuerfel_21 was correct. The "step" inclusion threw me off.
No, I'm just crazy.
I couldn't find the thread I mentioned previously so I wrote a little test program. The repeat command behaves just like you and others claim.
I'm trying to figure out why I thought this was something which could be changed.
Thanks to everyone setting me straight.
In the above code "x+ y" is evaluated once. However the "x +y" in the code below is evaluated each loop.
Edit: This was just what @Wuerfel_21 was trying to tell me.
I can only imagine -- without looking at the byte codes -- that the counter in the first case actually counts from x+y to zero, i.e. in a djnz type of looping.
My preference for the second case would have been to evaluate x + y only once, before the beginning of the loop.
-Phil
Yes, that's how it works - there's a hidden variable that counts to zero. I don't really know why the other kind of loop doesn't cache it's bounds/step and why the REPEAT N loop doesn't give you access to the counter variable.
Once Chip adds in the uncalled code removal to Spin2 it will open up the possibility of building up larger suites of useful library functions. Areas such as string processing, maths and graphics are good candidates and can be optimised by assembly loops where it makes sense for performance.
However right now without the dead code removal in official Spin2, we are somewhat limited and would need to either cut-paste snippets from library code where that is possible and try to construct any libraries using myriads of smaller fully independent files which isn't always convenient to do in cases with shared data, otherwise we risk bloating the final code with an entire library when it may only be a handful of functions needed.
Before you also ban me from your threads I will volunteer to not to post to your threads or comment on any of your posts until you publicly apologize to Rayman.
Dear C programmers: If you the only point of your post is to tell Spin programmers how much better C is, well just don't.
Ken Gracey