Are you cut out to work at facebook? group these anagrams in SPIN

Beau Schwabe · 2015-05-30 20:24

Good Luck erco !!! ..Don't let the hiring manager intimidate you, RELAX, and act like you have been there for years. After all what you are applying for is most likely a skill you have been already using. You own this !! :-) .... We need to go out for beers again and "talk shop" soon !! I might possibly be out your direction again towards the end of June. I will let you know.

mindrobots · 2015-05-30 21:59

Good luck, erco! At our point in life, be sure to ask about their age discrimination policies! :0)

localroger · 2015-05-31 06:04

I am not going to waste my time writing code for this ridiculous little puzzle (My answer: Call me back when you have a serious problem you want solved, but then they're probably not interested in hiring a 51 year old coder anyway). But given the limitations of Spin, the approach I would take would be to bubble-sort the letters of the words, since that can be coded quickly and simply without complicated memory allocation issues, then use strcomp to detect the anagrams. You'd need an unsorted mirror array to generate the final output.

base2design · 2015-05-31 08:16

Here's the same thing more or less in C (maybe I'll post Spin later, but I used lots of dynamic stuff here and would have to re-think a bit to minimize that):

#include <stdio.h>
#include <string.h>
#include <malloc.h>
char *words[] = {
	"tsar", "rat", "tar", "star",
	"tars", "cheese", "mow", "lox",
	NULL
};
#define LETTERS 26
// the following assumes the alphabet is no more than 256 symbols
static char hashcount[LETTERS];
// the following assumes a word's symbols never repeats more than 256
static char hashbuffer[LETTERS*3+1];
char *hash(char *word) {
	char *ptr = word;
	int i;
	for (i=0; i<LETTERS; i++) hashcount[i] = 0;
	while (*ptr) {
		char c = tolower(*ptr);
		if (c >= 'a' && c <= 'z') hashcount[c-'a']++;
		ptr++;
	}
	ptr = hashbuffer;
	for (i=0; i<LETTERS; i++) {
		if (hashcount[i]) {
			*ptr++ = i + 'a';
			if (hashcount[i] > 1) {
				sprintf(ptr, "%x", hashcount[i]);
				ptr += (hashcount[i]/16 + 1);
			}
		}
	}
	*ptr = '\0';
	return strdup(hashbuffer);
}
int main(int argc, char *argv[]) {
	char **id = NULL;
	char *printed = NULL;
	int i, j, len=0;
	while (1) {
		if (words[len] == NULL) break;
		len++;
	}
	id = (char **)malloc(len*sizeof(char *));
	printed = (char *)malloc(len*sizeof(char));
	for (i=0; i<len; i++) {
		id[i] = hash(words[i]);
		printed[i] = 0;
		printf("[%d] %s : %s\n", i, words[i], id[i]);
	}
	for (i=0; i<len; i++) {
		if (printed[i] == 0) {
			printed[i] = 1;
			printf("===\n");
			printf("%s\n", words[i]);
			for (j=i+1; j<len; j++) {
				if (printed[j] == 0 && (strcmp(id[i], id[j]) == 0)) {
					printed[j] = 1;
					printf("%s\n", words[j]);
				}
			}
		}
	}
	for (i=0; i<len; i++) free(id[i]);
	free(id);
	free(printed);
	return 0;
}

And its output:

[0] tsar : arst
[1] rat : art
[2] tar : art
[3] star : arst
[4] tars : arst
[5] cheese : ce3hs
[6] mow : mow
[7] lox : lox
===
tsar
star
tars
===
rat
tar
===
cheese
===
mow
===
lox

base2design · 2015-05-31 08:45

Phil Pilgrim (PhiPi) wrote: »

No, I appreciate the exposure to Lua. I need to grok it better, since the fastest and most comprensive CHDK scripts are written in it.

-Phil

On the x86 and ARM side, LuaJIT (http://luajit.org) is a fantastic setup. Nice C FFI (foreign function interface) so you can interface with existing libraries without too much drama.

Phil Pilgrim (PhiPi) · 2015-05-31 09:26

Here's a Spin program that works:

CON

  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

  MAXWORDS      = 50

VAR

  word wordaddrs[MAXWORDS]
  byte sigs[MAXWORDS * 27], done[MAXWORDS]

OBJ

  pst   : "Parallax Serial Terminal" 

PUB start | i, j, n

  pst.start(115200)
  repeat i from 0 to MAXWORDS - 1
    bytefill(@sigs[i * 27], 1, 26)
  i := n := 0
  repeat
    wordaddrs[n++] := @words[i]
    repeat
      sigs[n * 27 + words[i] - constant( 27 + "A")]++
    until words[++i] == "," or words[i] == 0
    if words[i]
      words[i++]~
    else
      quit
  repeat i from 0 to n - 1
    if not done[i]
      repeat j from i to n - 1
        if strcomp(@sigs[i * 27], @sigs[j * 27]) and not done[j]
          done[j]++
          pst.str(wordaddrs[j])
          pst.char(13)
      pst.str(string("--------", 13))  
       
DAT

words         byte      "SPOT,ACME,POT,STOP,MACE,REMISS,RESCAMS,SUER,NYMPH,"
              byte      "FOREST,POTS,FOSTER,TOP,REDO,OPTS,DOER,MISTER,MISER,"
              byte      "USER,STOMP,HYMN,SCREAMS,CAME,MISSER",0

and its output:

SPOT
STOP
POTS
OPTS
--------
ACME
MACE
CAME
--------
POT
TOP
--------
REMISS
MISSER
--------
RESCAMS
SCREAMS
--------
SUER
USER
--------
NYMPH
--------
FOREST
FOSTER
--------
REDO
DOER
--------
MISTER
--------
MISER
--------
STOMP
--------
HYMN
--------

-Phil

base2design · 2015-05-31 09:39

Phil Pilgrim (PhiPi) wrote: »

Here's a Spin program that works:

Nice and concise Phil. I like it!

Phil Pilgrim (PhiPi) · 2015-05-31 14:41

Here's the Perl version:

use strict;
my @words = qw/SPOT ACME POT STOP MACE REMISS RESCAMS SUER NYMPH FOREST POTS FOSTER
               TOP REDO OPTS DOER MISTER MISER USER STOMP HYMN SCREAMS CAME MISSER/;
my @sorted = map {join('', sort(split(//, $_)))} @words;
my @done;
foreach my $i (0 .. @words - 1) {
  unless ($done[$i]) {
    print map {"$words[$_]\n"} (grep {!$done[$_] && $sorted[$i] eq $sorted[$_] && ++$done[$_]} ($i .. @words - 1));
    print "---\n";
} }

It's kinda long. I'm sure it could be made much shorter.

-Phil

erco · 2015-05-31 15:33

localroger wrote: »

... but then they're probably not interested in hiring a 51 year old coder anyway.

Only 51? You're a Thundercat and WAY too young for the "sour grapes" routine!

Beau Schwabe · 2015-05-31 17:05

Hey Phil,

How about a couple lines of command line SEd code in a script file, I am pretty sure there are string commands already established to do everything for you already. For that matter Tcl/Tk or "Tickle" might also have everything you need and most of it already has a C dialect <- back to the original question

Phil Pilgrim (PhiPi) · 2015-05-31 17:54

Beau,

My limited exposures to both sed and tcl have always driven me back to Perl.

I'm sure there's a Perl one-liner out there somewhere, but I really am curious about the lack of response from our dedicated forthwrights. Peter? Prof_b? Oh, well; maybe forth isn't up to this challenge ...

-Phil

Heater. · 2015-05-31 21:40

I was fascinated by Phil's suggestion of G

Mark_T · 2015-06-01 17:32

In Common Lisp there's no limit to integers so you could easily use the prime factor trick, something like

(defconstant +primes-table+ #(2 3 5 7 11 13 ....))

(defun word-code (string)
  (let ((prime-product 1))
    (dotimes (n (length string))
      (setf prime-product (* prime-product (svref +primes-table+ (- (schar string n) #\A)))))
    prime-product))

(if I remember my common lisp!)

msrobots · 2015-06-01 19:44

erco wrote: »

Only 51? You're a Thundercat and WAY too young for the "sour grapes" routine!

I guess you are wrong there.

I had my share of being interviewed by people half my age. It is no fun.

Now I am working my resume backwards, working for companies I worked before.

As far as I can see I will work as a truck driver again at that rate at age of 60+

Sad

Mike

Phil Pilgrim (PhiPi) · 2015-06-01 19:56

Mike,

At 51, you've got enough life experience to be a high-priced consultant. Shed the I-need-a-job mentality, and strike out on your own! Is it risky? Sure it is. But that's what makes the adventure worth pursuing!

-Phil

potatohead · 2015-06-02 10:37

Phil's comments seconded. Doing that can be very rewarding.

I did some digging on this "code it in the interview" scenario, and often they are looking for how people think and work with others more than they are perfection in the solution.

They want to see somebody talk through the problem, diagram the data representations and structures they use, devise or apply algorithms, etc...

It may be that FB does require the code be correct, and it may be they put you in an editor right there too. If the latter, they are looking for those 100x programmers. The few people who can just knock nearly anything out quick. Often, the former is all about knowing somebody can get there and they can work with them, or feel that adding them will improve everyone.

@Mike, Phil, currently, I'm the oldest guy where I work. Everyone there is one generation younger. It was very interesting to interview. I did have the advantage of knowing these people through some technical training I did for them all some years back. Now I regularly see how they want to work, and those differences from my own generation and older ones are great! In some ways, I've moved to younger modes, or newer modes.

For example, I've never used my office phone. As far as I'm concerned, it does not exist. Another is text vs email. Email gets used for a lot of things, but text has replaced a lot of simple, "where are they?", "how do I?", "when is it again?" type questions. They will group text, get the answer and move on, all while doing whatever it is that needs doing. Damn cool.

Some of my experiences have mapped over too. Shortcuts. They have been learning what, "there is always time to do it right" actually means! Adopting simple checks and process controls seemed overhead, unnecessary, etc... and now that they have seen the costs and risks play out, they now take more time to validate before executing on things. They were almost entirely opportunity cost blind.

These kinds of experiences and being able to map what is good over to improvements on what isn't so good is very high value. As a consultant, this service is worth a ton, and the basic pitch is, "pay me X to save Y, and I'm free" In fact, you can expand it to, "Not talking to me costs you money" and actually mean that in real terms. No joke.

But, you gotta make that case. It's the hardest part Mike. But if you do, it pays well and you will very likely seriously enjoy doing the work. I've done both, and currently am employed, not consulting, but that's only due to the scope of the goal is full time. Overall, the dynamics are the same.

Interestingly, I'm paying a business consultant right now to get the company structure / partners sorted out, and optimize costs, taxes and a few other things. It's literally spend $20K to save $100K. Doing the work myself carries the opportunity cost of diluting my own expertise and the time I can apply it, which is more than that $20K most likely in real terms, and a lot more in opportunity terms. (efficiency * number of people = lots of dollars) And that's the basic equation needed to justify outside consulting, and quantify results.

Not everyone will think that way, but a lot of them will, and if you deliver, it pays very, very well.

KeithE · 2015-06-02 12:19

I was thinking that you might make lists based on the alphagrams of the input data (I've read that scrabble players typically study alphagrams http://en.wikipedia.org/wiki/Alphagram) and then output the lists. I don't have much time now, but in many of the scripting languages associative arrays would make it easy to create the lists, and then dump them. Maybe one of the solutions above is doing this?

Phil Pilgrim (PhiPi) · 2015-06-02 12:30

The Perl code I posted above uses alphagrams. But any order-independent, unique text signature will work.

-Phil

Heater. · 2015-06-02 13:01

Phil,

The Perl code I posted above uses alphagrams.

How would we ever know

KeithE · 2015-06-02 14:05

Phil Pilgrim (PhiPi) wrote: »

But any order-independent, unique text signature will work.

-Phil

Yeah - it's been a while since I used Perl and couldn't see that, but was suspicious. Alphagrams just came to mind because scrabble players have to be able to run through anagrams in their head really quickly, and use them as a tool to practice. I'm not sure how widely used the term is, so you might run the risk of having to discuss scrabble AI if you mention them to a programmer during an interview ;-)

Edited to add:

I see this approach was suggested early in the thread - I skipped ahead to the Godel stuff ;-)

This gawk code may similar to the condensed perl above:

// {
s = alphagram($1)
count++
array[s, count] = $1
}

END {
for (i in array)
print array
}

function alphagram(input, a, len, n, str ) {
len =split(input,a,"");
asort(a);
for (n = 1 ; n <= len ; n++) {
str = str a[n];
}
return(str)
}

I would run this as follows:

gawk -f alphagram.awk < alphagram.txt

It worked in my tests, but I don't use multidimensional arrays often so that part may have an error.

I like gawk since it all fits onto an 18 page Gnu reference card. If only it had a few more features...

Phil Pilgrim (PhiPi) · 2015-06-02 15:45

heater wrote:

How would we ever know

Whaaat?!! It's not even obfuscated!

-Phil

Alexander (Sandy) Hapgood · 2015-06-03 00:46

Where were you guys when the interview was being conducted?!

erco · 2015-06-03 17:16

Had a great interview today, no silly curve ball questions. A solid chat with the (private) company's president, he's a no-nonsense product guy. He scored bonus points with me: he specifically recognized the Sharp IR distance sensor on a robot video I showed him. My kinda guy!

Heater. · 2015-06-03 20:32

erco,

He scored bonus points with me:

That's the spirit, you interview them.

Days after failing to get this hypothetical Facebook post, just for fun and completeness here is Phil's G

Phil Pilgrim (PhiPi) · 2015-06-03 21:50

Wow, you did it!

Another way to go about this might be to take the value ?c (in Spin) for each letter (ASCII value) c, and add that value to a sum for each letter in a word. The sum, then, is a 32-bit hash of the word. Since there are way fewer than 2³² English words (i.e. slightly over a million), the question then becomes: can you find any collisions between two words that are not anagrams of each other?

Looking at it probabilistically (assuming this hash function behaves like a random number), you have an urn with 2³² identifiable balls. If you pick a ball at random and replace it, what are the chances that after one million trials you will have picked a ball more than once?

-Phil

dbpage · 2015-06-04 02:58

Heater,
It looks like the same prime number "5" is assigned to "t" and "a".
- Dennis

var primes = { e: 2, t: 5, a: 5, o: 7, i: 11, n: 13, s: 17, r: 19, h: 23,

Heater. · 2015-06-04 03:34

Yep, just noticed that myself. "t" should be 3 of course.

erco · 2015-06-04 03:37

dbpage wrote: »

Heater,
It looks like the same prime number "5" is assigned to "t" and "a".
- Dennis

I completely understand the slip up. Whenever I think about T & A, I can't decide which one has greater value.

Nice solution, Heater!

Ding-Batty · 2015-06-04 05:04

Phil Pilgrim (PhiPi) wrote: »

Looking at it probabilistically (assuming this hash function behaves like a random number), you have an urn with 2³² identifiable balls. If you pick a ball at random and replace it, what are the chances that after one million trials you will have picked a ball more than once?

Phil, that is the same as the Birthday Problem, or the chance of collision for a 32-bit hash. For 1M trials the chances of a collision is very close to certainty -- much > 99%.

Heater. · 2015-06-04 05:45

erco,

What is the value of "t" and "a"? In this case I want the most frequently used letters in English to have the lower prime numbers.

I find this whole prime number thing boggles my mind anyway. Primes are just weird

Are you cut out to work at facebook? group these anagrams in SPIN

Comments