Array.sort is in-place, returning unit.
# let sort word = let a = array_of_string word in Array.sort compare a; a;;
val sort : string -> char array = <fun>
# sort "asdf";;
- : char array = [|'a'; 'd'; 'f'; 's'|]
Now if you wanted a scrabble-cheat type program (all "partial-anagrams" of
given letters), you might use a different approach. I ended up scanning the
entire word list and counting occurrences of each character, returning only
words with the proper counts based on optional (lower-case) and required
(upper-case) letters and blanks.
(Ocaml programmers might do well to ignore the rest of this message, about
my "anagram.c" program.) My algorithm is probably overly-complex or poorly
coded and a case of premature optimization. Unfortunately, it's also a
linear search. If only I could form a tree of words containing each letter,
then linked to words containing any two letters...... It would grow huge,
I'm afraid. There are too many combinations of letters. Building the tree
would take immense resources, perhaps more time and RAM than in the
world...looks to me like only a linear search could work.
Anyway, my algorithm;
Searching for "caddy" finds words with any of C or A or D or Y in them... I
initialize my optional (lowercase) letter-counts array with
[|1;0;2;1;0;0;0;...;1;0|], required (uppercase)-letter counts with all 0s,
and blank count set to 0, then decrement my optional letter-counts as I
iterate over the letters in the dictionary word, or decrement the blank
count if I hit a letter in the dictionary word which is not found, and
consider a word to be matching if each letter is found and all required
letters are found, as can be seen in the last few lines of the function
below.
It was a bit inefficient in that I reinitialize the counts for the search
word in each call to this function, so I did a non-threadsafe caching of the
last word searched, since my calling function in the program passed the same
word repeatedly.
[ Ick yuck gag .. I know, horrible C code below... ]
But surely an Ocaml program for this could be written much more succinctly.
It should look more like a simple grep, with a sort at the finish. Maybe
I'll take it on as an exercise someday...
(Oh, I know, *no one* uses the "register" keyword anymore; the compiler is
probably smarter than me about optimization...)
This algorithm takes about 30ms to scan 170,000 words (Athlon 1.6ghz)
/* anagram.c
* Copyright (C) 2004 by aaron w west
* tallpeak%40hotmail.com">tallpeak
hotmail.com / awilliamwest%40yahoo.com">awilliamwest
yahoo.com
* 11/27/2004 to 11/30
...
int matchword(unsigned char *dictword, unsigned char *search, unsigned char
*pattern, int flags, int length)
{
int blanks = 0;
register unsigned c;
unsigned char *s;
register unsigned char *d;
static unsigned char prevsearch[257] = "";
static int capsfound = 0;
static int initblanks = 0;
static unsigned int lowercounts[26]; // lowercase letter counts in
search
static unsigned int uppercounts[26];
unsigned int searchletters[26];
unsigned int capletters[26];
if (countalpha((char*)dictword) != length)
{
return 0;
}
if (pattern[0])
{
s = (unsigned char *)strstr((char*)dictword, (char*)pattern);
if (!s || (flags & BEGIN_WORD_FLAG) && s != dictword
|| (flags & END_WORD_FLAG)
&& s != dictword + strlen((char*)dictword) -
strlen((char*)pattern))
return 0;
}
if (strcmp((char*)search, (char*)prevsearch) != 0)
{
memset(lowercounts, 0, sizeof(lowercounts));
memset(uppercounts, 0, sizeof(uppercounts));
initblanks = 0;
capsfound = 0;
strcpy((char*)prevsearch, (char*)search);
/* first count blanks, # or ?, and each letter */
for (s = search; (c = *s) != ' '; s++)
{
c -= 'A';
if (c < 26) {
capsfound++;
uppercounts[c]++;
lowercounts[c]++;
} else {
c -= 32;
if (c < 26)
lowercounts[c]++;
else if ((c = *s) == '?' || c == '#')
initblanks++;
}
}
}
memcpy(searchletters, lowercounts, sizeof(lowercounts));
memcpy(capletters, uppercounts, sizeof(uppercounts));
blanks = initblanks;
for (d = dictword; (c = *d) != ' '; d++)
{
c -= 'a';
if (c < 26)
{
if (searchletters[c] > 0)
{
searchletters[c]--;
} else {
if (blanks > 0)
{
*d -= 32; // *d = toupper(*d)
blanks--;
} else {
return 0;
}
}
if (capletters[c] > 0)
capletters[c]--;
}
}
//check for caps=required letters in search
if (capsfound)
{
for (c = 0; c < 26; c++)
if (capletters[c] > 0)
return 0; // a cap letter specified wasnt used
}
return 1;
}
________________________________
From: ocaml_beginners%40yahoogroups.com">ocaml_beginners
yahoogroups.com
[mailto: ocaml_beginners%40yahoogroups.com">ocaml_beginners
yahoogroups.com] On Behalf Of Jon Harrop
Sent: Wednesday, February 21, 2007 2:25 PM
To: ocaml_beginners%40yahoogroups.com">ocaml_beginners
yahoogroups.com
Subject: Re: "ocam