Good Luck erco !!! ..Don't let the hiring manager intimidate you, RELAX, and act like you have been there for years. After all what you are applying for is most likely a skill you have been already using. You own this !! :-) .... We need to go out for beers again and "talk shop" soon !! I might possibly be out your direction again towards the end of June. I will let you know.
I am not going to waste my time writing code for this ridiculous little puzzle (My answer: Call me back when you have a serious problem you want solved, but then they're probably not interested in hiring a 51 year old coder anyway). But given the limitations of Spin, the approach I would take would be to bubble-sort the letters of the words, since that can be coded quickly and simply without complicated memory allocation issues, then use strcomp to detect the anagrams. You'd need an unsorted mirror array to generate the final output.
Here's the same thing more or less in C (maybe I'll post Spin later, but I used lots of dynamic stuff here and would have to re-think a bit to minimize that):
#include <stdio.h>
#include <string.h>
#include <malloc.h>
char *words[] = {
"tsar", "rat", "tar", "star",
"tars", "cheese", "mow", "lox",
NULL
};
#define LETTERS 26
// the following assumes the alphabet is no more than 256 symbols
static char hashcount[LETTERS];
// the following assumes a word's symbols never repeats more than 256
static char hashbuffer[LETTERS*3+1];
char *hash(char *word) {
char *ptr = word;
int i;
for (i=0; i<LETTERS; i++) hashcount[i] = 0;
while (*ptr) {
char c = tolower(*ptr);
if (c >= 'a' && c <= 'z') hashcount[c-'a']++;
ptr++;
}
ptr = hashbuffer;
for (i=0; i<LETTERS; i++) {
if (hashcount[i]) {
*ptr++ = i + 'a';
if (hashcount[i] > 1) {
sprintf(ptr, "%x", hashcount[i]);
ptr += (hashcount[i]/16 + 1);
}
}
}
*ptr = '\0';
return strdup(hashbuffer);
}
int main(int argc, char *argv[]) {
char **id = NULL;
char *printed = NULL;
int i, j, len=0;
while (1) {
if (words[len] == NULL) break;
len++;
}
id = (char **)malloc(len*sizeof(char *));
printed = (char *)malloc(len*sizeof(char));
for (i=0; i<len; i++) {
id[i] = hash(words[i]);
printed[i] = 0;
printf("[%d] %s : %s\n", i, words[i], id[i]);
}
for (i=0; i<len; i++) {
if (printed[i] == 0) {
printed[i] = 1;
printf("===\n");
printf("%s\n", words[i]);
for (j=i+1; j<len; j++) {
if (printed[j] == 0 && (strcmp(id[i], id[j]) == 0)) {
printed[j] = 1;
printf("%s\n", words[j]);
}
}
}
}
for (i=0; i<len; i++) free(id[i]);
free(id);
free(printed);
return 0;
}
And its output:
[0] tsar : arst
[1] rat : art
[2] tar : art
[3] star : arst
[4] tars : arst
[5] cheese : ce3hs
[6] mow : mow
[7] lox : lox
===
tsar
star
tars
===
rat
tar
===
cheese
===
mow
===
lox
No, I appreciate the exposure to Lua. I need to grok it better, since the fastest and most comprensive CHDK scripts are written in it.
-Phil
On the x86 and ARM side, LuaJIT (http://luajit.org) is a fantastic setup. Nice C FFI (foreign function interface) so you can interface with existing libraries without too much drama.
CON
_clkmode = xtal1 + pll16x
_xinfreq = 5_000_000
MAXWORDS = 50
VAR
word wordaddrs[MAXWORDS]
byte sigs[MAXWORDS * 27], done[MAXWORDS]
OBJ
pst : "Parallax Serial Terminal"
PUB start | i, j, n
pst.start(115200)
repeat i from 0 to MAXWORDS - 1
bytefill(@sigs[i * 27], 1, 26)
i := n := 0
repeat
wordaddrs[n++] := @words[i]
repeat
sigs[n * 27 + words[i] - constant( 27 + "A")]++
until words[++i] == "," or words[i] == 0
if words[i]
words[i++]~
else
quit
repeat i from 0 to n - 1
if not done[i]
repeat j from i to n - 1
if strcomp(@sigs[i * 27], @sigs[j * 27]) and not done[j]
done[j]++
pst.str(wordaddrs[j])
pst.char(13)
pst.str(string("--------", 13))
DAT
words byte "SPOT,ACME,POT,STOP,MACE,REMISS,RESCAMS,SUER,NYMPH,"
byte "FOREST,POTS,FOSTER,TOP,REDO,OPTS,DOER,MISTER,MISER,"
byte "USER,STOMP,HYMN,SCREAMS,CAME,MISSER",0
and its output:
SPOT
STOP
POTS
OPTS
--------
ACME
MACE
CAME
--------
POT
TOP
--------
REMISS
MISSER
--------
RESCAMS
SCREAMS
--------
SUER
USER
--------
NYMPH
--------
FOREST
FOSTER
--------
REDO
DOER
--------
MISTER
--------
MISER
--------
STOMP
--------
HYMN
--------
How about a couple lines of command line SEd code in a script file, I am pretty sure there are string commands already established to do everything for you already. For that matter Tcl/Tk or "Tickle" might also have everything you need and most of it already has a C dialect <- back to the original question
My limited exposures to both sed and tcl have always driven me back to Perl. I'm sure there's a Perl one-liner out there somewhere, but I really am curious about the lack of response from our dedicated forthwrights. Peter? Prof_b? Oh, well; maybe forth isn't up to this challenge ...
At 51, you've got enough life experience to be a high-priced consultant. Shed the I-need-a-job mentality, and strike out on your own! Is it risky? Sure it is. But that's what makes the adventure worth pursuing!
Phil's comments seconded. Doing that can be very rewarding.
I did some digging on this "code it in the interview" scenario, and often they are looking for how people think and work with others more than they are perfection in the solution.
They want to see somebody talk through the problem, diagram the data representations and structures they use, devise or apply algorithms, etc...
It may be that FB does require the code be correct, and it may be they put you in an editor right there too. If the latter, they are looking for those 100x programmers. The few people who can just knock nearly anything out quick. Often, the former is all about knowing somebody can get there and they can work with them, or feel that adding them will improve everyone.
@Mike, Phil, currently, I'm the oldest guy where I work. Everyone there is one generation younger. It was very interesting to interview. I did have the advantage of knowing these people through some technical training I did for them all some years back. Now I regularly see how they want to work, and those differences from my own generation and older ones are great! In some ways, I've moved to younger modes, or newer modes.
For example, I've never used my office phone. As far as I'm concerned, it does not exist. Another is text vs email. Email gets used for a lot of things, but text has replaced a lot of simple, "where are they?", "how do I?", "when is it again?" type questions. They will group text, get the answer and move on, all while doing whatever it is that needs doing. Damn cool.
Some of my experiences have mapped over too. Shortcuts. They have been learning what, "there is always time to do it right" actually means! Adopting simple checks and process controls seemed overhead, unnecessary, etc... and now that they have seen the costs and risks play out, they now take more time to validate before executing on things. They were almost entirely opportunity cost blind.
These kinds of experiences and being able to map what is good over to improvements on what isn't so good is very high value. As a consultant, this service is worth a ton, and the basic pitch is, "pay me X to save Y, and I'm free" In fact, you can expand it to, "Not talking to me costs you money" and actually mean that in real terms. No joke.
But, you gotta make that case. It's the hardest part Mike. But if you do, it pays well and you will very likely seriously enjoy doing the work. I've done both, and currently am employed, not consulting, but that's only due to the scope of the goal is full time. Overall, the dynamics are the same.
Interestingly, I'm paying a business consultant right now to get the company structure / partners sorted out, and optimize costs, taxes and a few other things. It's literally spend $20K to save $100K. Doing the work myself carries the opportunity cost of diluting my own expertise and the time I can apply it, which is more than that $20K most likely in real terms, and a lot more in opportunity terms. (efficiency * number of people = lots of dollars) And that's the basic equation needed to justify outside consulting, and quantify results.
Not everyone will think that way, but a lot of them will, and if you deliver, it pays very, very well.
I was thinking that you might make lists based on the alphagrams of the input data (I've read that scrabble players typically study alphagrams http://en.wikipedia.org/wiki/Alphagram) and then output the lists. I don't have much time now, but in many of the scripting languages associative arrays would make it easy to create the lists, and then dump them. Maybe one of the solutions above is doing this?
But any order-independent, unique text signature will work.
-Phil
Yeah - it's been a while since I used Perl and couldn't see that, but was suspicious. Alphagrams just came to mind because scrabble players have to be able to run through anagrams in their head really quickly, and use them as a tool to practice. I'm not sure how widely used the term is, so you might run the risk of having to discuss scrabble AI if you mention them to a programmer during an interview ;-)
Edited to add:
I see this approach was suggested early in the thread - I skipped ahead to the Godel stuff ;-)
This gawk code may similar to the condensed perl above:
Had a great interview today, no silly curve ball questions. A solid chat with the (private) company's president, he's a no-nonsense product guy. He scored bonus points with me: he specifically recognized the Sharp IR distance sensor on a robot video I showed him. My kinda guy!
Another way to go about this might be to take the value ?c (in Spin) for each letter (ASCII value) c, and add that value to a sum for each letter in a word. The sum, then, is a 32-bit hash of the word. Since there are way fewer than 232 English words (i.e. slightly over a million), the question then becomes: can you find any collisions between two words that are not anagrams of each other?
Looking at it probabilistically (assuming this hash function behaves like a random number), you have an urn with 232 identifiable balls. If you pick a ball at random and replace it, what are the chances that after one million trials you will have picked a ball more than once?
Looking at it probabilistically (assuming this hash function behaves like a random number), you have an urn with 232 identifiable balls. If you pick a ball at random and replace it, what are the chances that after one million trials you will have picked a ball more than once?
Phil, that is the same as the Birthday Problem, or the chance of collision for a 32-bit hash. For 1M trials the chances of a collision is very close to certainty -- much > 99%.
Comments
And its output:
On the x86 and ARM side, LuaJIT (http://luajit.org) is a fantastic setup. Nice C FFI (foreign function interface) so you can interface with existing libraries without too much drama.
and its output:
-Phil
Nice and concise Phil. I like it!
It's kinda long. I'm sure it could be made much shorter.
-Phil
Only 51? You're a Thundercat and WAY too young for the "sour grapes" routine!
How about a couple lines of command line SEd code in a script file, I am pretty sure there are string commands already established to do everything for you already. For that matter Tcl/Tk or "Tickle" might also have everything you need and most of it already has a C dialect <- back to the original question
My limited exposures to both sed and tcl have always driven me back to Perl. I'm sure there's a Perl one-liner out there somewhere, but I really am curious about the lack of response from our dedicated forthwrights. Peter? Prof_b? Oh, well; maybe forth isn't up to this challenge ...
-Phil
I guess you are wrong there.
I had my share of being interviewed by people half my age. It is no fun.
Now I am working my resume backwards, working for companies I worked before.
As far as I can see I will work as a truck driver again at that rate at age of 60+
Sad
Mike
At 51, you've got enough life experience to be a high-priced consultant. Shed the I-need-a-job mentality, and strike out on your own! Is it risky? Sure it is. But that's what makes the adventure worth pursuing!
-Phil
I did some digging on this "code it in the interview" scenario, and often they are looking for how people think and work with others more than they are perfection in the solution.
They want to see somebody talk through the problem, diagram the data representations and structures they use, devise or apply algorithms, etc...
It may be that FB does require the code be correct, and it may be they put you in an editor right there too. If the latter, they are looking for those 100x programmers. The few people who can just knock nearly anything out quick. Often, the former is all about knowing somebody can get there and they can work with them, or feel that adding them will improve everyone.
@Mike, Phil, currently, I'm the oldest guy where I work. Everyone there is one generation younger. It was very interesting to interview. I did have the advantage of knowing these people through some technical training I did for them all some years back. Now I regularly see how they want to work, and those differences from my own generation and older ones are great! In some ways, I've moved to younger modes, or newer modes.
For example, I've never used my office phone. As far as I'm concerned, it does not exist. Another is text vs email. Email gets used for a lot of things, but text has replaced a lot of simple, "where are they?", "how do I?", "when is it again?" type questions. They will group text, get the answer and move on, all while doing whatever it is that needs doing. Damn cool.
Some of my experiences have mapped over too. Shortcuts. They have been learning what, "there is always time to do it right" actually means! Adopting simple checks and process controls seemed overhead, unnecessary, etc... and now that they have seen the costs and risks play out, they now take more time to validate before executing on things. They were almost entirely opportunity cost blind.
These kinds of experiences and being able to map what is good over to improvements on what isn't so good is very high value. As a consultant, this service is worth a ton, and the basic pitch is, "pay me X to save Y, and I'm free" In fact, you can expand it to, "Not talking to me costs you money" and actually mean that in real terms. No joke.
But, you gotta make that case. It's the hardest part Mike. But if you do, it pays well and you will very likely seriously enjoy doing the work. I've done both, and currently am employed, not consulting, but that's only due to the scope of the goal is full time. Overall, the dynamics are the same.
Interestingly, I'm paying a business consultant right now to get the company structure / partners sorted out, and optimize costs, taxes and a few other things. It's literally spend $20K to save $100K. Doing the work myself carries the opportunity cost of diluting my own expertise and the time I can apply it, which is more than that $20K most likely in real terms, and a lot more in opportunity terms. (efficiency * number of people = lots of dollars) And that's the basic equation needed to justify outside consulting, and quantify results.
Not everyone will think that way, but a lot of them will, and if you deliver, it pays very, very well.
-Phil
Yeah - it's been a while since I used Perl and couldn't see that, but was suspicious. Alphagrams just came to mind because scrabble players have to be able to run through anagrams in their head really quickly, and use them as a tool to practice. I'm not sure how widely used the term is, so you might run the risk of having to discuss scrabble AI if you mention them to a programmer during an interview ;-)
Edited to add:
I see this approach was suggested early in the thread - I skipped ahead to the Godel stuff ;-)
This gawk code may similar to the condensed perl above:
// {
s = alphagram($1)
count++
array[s, count] = $1
}
END {
for (i in array)
print array
}
function alphagram(input, a, len, n, str ) {
len =split(input,a,"");
asort(a);
for (n = 1 ; n <= len ; n++) {
str = str a[n];
}
return(str)
}
I would run this as follows:
gawk -f alphagram.awk < alphagram.txt
It worked in my tests, but I don't use multidimensional arrays often so that part may have an error.
I like gawk since it all fits onto an 18 page Gnu reference card. If only it had a few more features...
-Phil
Days after failing to get this hypothetical Facebook post, just for fun and completeness here is Phil's G
Another way to go about this might be to take the value ?c (in Spin) for each letter (ASCII value) c, and add that value to a sum for each letter in a word. The sum, then, is a 32-bit hash of the word. Since there are way fewer than 232 English words (i.e. slightly over a million), the question then becomes: can you find any collisions between two words that are not anagrams of each other?
Looking at it probabilistically (assuming this hash function behaves like a random number), you have an urn with 232 identifiable balls. If you pick a ball at random and replace it, what are the chances that after one million trials you will have picked a ball more than once?
-Phil
It looks like the same prime number "5" is assigned to "t" and "a".
- Dennis
I completely understand the slip up. Whenever I think about T & A, I can't decide which one has greater value.
Nice solution, Heater!
Phil, that is the same as the Birthday Problem, or the chance of collision for a 32-bit hash. For 1M trials the chances of a collision is very close to certainty -- much > 99%.
What is the value of "t" and "a"? In this case I want the most frequently used letters in English to have the lower prime numbers.
I find this whole prime number thing boggles my mind anyway. Primes are just weird