I'm moving toward allowing per-board-size counts, with faces and values
staying the same. So it makes more sense to have faces and values be
the first columns.
Add a basic regular expression engine to the dictiter, and to the UI add
the ability to filter for "starts with", "contains" and "ends with",
which translate into ANDed RE_*, _*RE_* and _*RE, respectively (with
_ standing for blank/wildcard). The engine's tightly integrated with the
next/prevWord() functions for greatest possible speed, but whenever a
pattern is set it does slow things down a bit (especially when "ENDS
WITH" is used). The full engine is not exposed (users can't provide raw
REs), and while the parser will accept nesting (e.g. ([AB]_*[CD]){2,5}
to mean words from 2-5 tiles long starting with A or B and ending with C
or D), the engine can't handle it. That is why filtering for word length
is handled separately from REs (but is also tightly integrated).
Users can enter strings that don't map to tiles. They now get an
error. It made sense for the error alert to have a "Show tiles"
button, so there's now a dialog listing all the tiles in a wordlist,
something the browser has needed all along.
Hungarian is unique (so far) in having two-letter tiles that can be
spelled with one-letter tiles AND in allowing words to be spelled both
ways. This crashed search based on strings because there were
duplicates. So now search is done by tile arrays. Strings are first
converted, and then IFF there is more than one tile array result AND the
wordlist has the new flag indicating that duplicates are possible, THEN
the user is asked to choose among the possible tile spellings of the
search string.
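
To illustrate why one string can map to more than one tile array, here
is a minimal sketch that enumerates every tile spelling of a string.
The function names and the sample tile set are assumptions for the
example, not the actual conversion code or a real wordlist's tiles.

```python
def tile_spellings(word, tiles):
    """Return every way to split word into a sequence of tiles."""
    if not word:
        return [[]]
    results = []
    for tile in tiles:
        # Skip empty tiles so the recursion always consumes input.
        if tile and word.startswith(tile):
            for rest in tile_spellings(word[len(tile):], tiles):
                results.append([tile] + rest)
    return results


def needs_user_choice(spellings, dups_possible):
    """Ask the user only if the spelling is ambiguous AND the list allows dups."""
    return len(spellings) > 1 and dups_possible


# Hypothetical tile set with one-letter tiles plus two digraph tiles.
SAMPLE_TILES = ["A", "C", "S", "Z", "CS", "SZ"]
spellings = tile_spellings("ACS", SAMPLE_TILES)
```

With that sample set, "ACS" can be spelled as A, C, S or as A, CS, so
when the wordlist's flag is set the user would be asked which one to
search for.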
If a Makefile defines a dirty word list then a new Python script is
invoked to filter for and remove those words as the dict is being
built. So far I have a list for English only, which makes sense because
only English wordlists are built-in on Android and Google's rating
system cares only about what's built in.
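
The filtering step might look roughly like the following. This is a
sketch only, not the script that was checked in: the function names,
the one-word-per-line input format and the case-insensitive comparison
are all assumptions.

```python
def load_dirty_words(lines):
    """Build a set of words to drop, one per line, ignoring blank lines."""
    return {line.strip().upper() for line in lines if line.strip()}


def filter_words(words, dirty):
    """Yield each non-blank word that is not in the dirty set."""
    for word in words:
        w = word.strip()
        if w and w.upper() not in dirty:
            yield w
```

In a build this would sit in the pipeline between the raw word list and
the dict compiler, with the dirty list read from the file named in the
Makefile.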
that make the wordlist unique: tile counts and values, and bitmaps,
and the data. This happens to be contiguous data on non-palm .xwd
files so it's easy to duplicate if the sum isn't there.
accept it if present. Android client will successfully ignore it and
will need to be modified to capture and display it if present. Idea's
to display information about copyright, source, etc. of wordlists.
number of words can be included. Changed to build dicts and linux to
open them. Android still needs to learn. Also, some of the tools in
dawg/ need to be fixed to read old-format (pre-utf8) .xwd files.
dawg/ directory against unicode_branch's. The two branches seem to
have no common ancestor -- probably didn't survive translation from
svn -- so this is the best I can do.
This checkin is all the files that were modified by the patch plus a
couple of simple additions. Next I'll be adding directories that the
patch created. It also reintroduced a bunch of .cvsignore files; I
won't check those in.