Commit graph

330 commits

Author SHA1 Message Date
Eric House
f1f6d5b642 change name of Spanish wordlist
"Spanish" is redundant
2020-05-01 08:59:58 -07:00
Eric House
dfbbf2d480 don't assert wordlist length wrong
For some reason the header and dawg data in Spanish wordlist don't
agree. Until I fix this, remove the assertion from the (dev-use-only)
script that dumps wordlist since it breaks it for other uses.
2020-05-01 08:59:58 -07:00
Eric House
f7374f54c5 fix Spanish support for lowercase
"special" casing is specified in two places, and I forgot to modify the
second one when I added allowing lowercase alternative spellings
2020-04-30 13:06:01 -07:00
Eric House
fb2fcf15cc tmp fix for Hungarian: remove duplicate words
Find-prefix feature in current code crashes on Hungarian because it
allows duplicates (words that occur spelled with the same letters but
different tile combinations). Modify Makefile to exclude those (as it
does for all other multi-letter-tile languages) and to pull the git
source of the wordlist on demand.
2020-04-29 12:29:26 -07:00
Eric House
1c0348dbf1 add option to print a delimiter between tiles
For Hungarian, there are "duplicate" words because e.g. the string CS
can be spelled with two tiles or one. If a delimiter is printed at tile
boundaries the duplication goes away.
2020-04-24 21:14:20 -07:00
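A minimal sketch of why a tile delimiter removes the apparent duplication. The tile lists and the `spell` helper are hypothetical, made up for illustration; this is not the dump script's actual code.

```python
def spell(tiles, delimiter=""):
    """Join tile faces into a printable word, optionally marking tile boundaries."""
    return delimiter.join(tiles)


# Hypothetical example: the same letters built two different ways.
one_tile = ["CS", "A"]       # uses the single two-letter CS tile
two_tiles = ["C", "S", "A"]  # uses separate C and S tiles

# Without a delimiter both print identically, so they look like duplicates.
print(spell(one_tile) == spell(two_tiles))
# With a delimiter at tile boundaries the two spellings differ.
print(spell(one_tile, ".") == spell(two_tiles, "."))
```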
Eric House
adadbd8647 make symlink relative
Useless if it specifies my machine :-)
2020-04-24 20:09:08 -07:00
Eric House
cc4776d29d Populate an actual wordlist for Hungarian
Add Makefile filters to create a wordlist with about 42K words derived
from a github project (thanks to pointers from an informant. :-) Per
him, and contrary to how Catalan does it, double-letter-tile words
also appear in single-letter variants if the tiles allow.
2020-04-24 13:44:55 -07:00
Eric House
cc28562061 files to create empty Hungarian wordlist 2020-04-24 06:34:30 -07:00
Eric House
2204d951a7 don't crash dumping empty wordlists 2020-04-23 22:10:25 -07:00
Eric House
3c1a748272 fix dictionary sum checking server-side 2020-03-18 22:28:58 -07:00
Eric House
390753ae3a fix to correctly wrap L·L in 2x2 cell case 2020-01-31 18:57:12 -08:00
Eric House
d4d4693def makefile for new French wordlist 2019-12-28 09:01:10 -08:00
Eric House
1e7ae2b2ec fix lower->upper translation: tr didn't like my strings
For whatever reason, though emacs thought the lower- and uppercase
strings I was passing to tr were the same but for case, two letters were
getting dropped. This lets tr figure things out itself.
2019-11-19 22:27:41 -08:00
Eric House
58c3ab4e4a first cut at python version of dawg2dict
Perl version doesn't work and I don't remember enough of the language to
fix it.
2019-11-13 13:22:30 -08:00
Eric House
c5a8319fa8 fix comment 2019-11-13 12:32:06 -08:00
Eric House
611e046987 check for undefined variable 2019-11-12 09:08:08 -08:00
Eric House
56fc359f42 fix counts/values for 'D' (thanks Peter) 2019-11-03 18:12:24 -08:00
Eric House
a1a71df7c6 update Makefile for latest Catalan wordlist 2019-10-22 13:24:40 +02:00
Eric House
2a54c9ed20 fix Slovak makefile for new wordlist 2019-08-19 07:49:23 +03:00
Eric House
2e4d3a1276 add source of current (dinky) Slovak wordlist 2019-08-08 09:57:40 -07:00
Eric House
df14108e4e add lowercase equivalents
where missing and seems possible
2019-07-07 13:00:06 -07:00
Eric House
4abefb025c add lower-case letters as alternatives 2019-07-07 13:00:06 -07:00
Eric House
411707a3a1 fix NPE with empty wordlist (and add note for Greek) 2019-06-29 16:44:38 -07:00
Eric House
896d63bc48 build from new wordlist 2019-06-07 21:16:56 -07:00
Eric House
838d0e5cc2 Makefile for CSW19 ... ish 2019-05-22 19:24:44 -07:00
Eric House
2f264e36ca add makefile for new wordlist 2019-03-23 18:53:48 -07:00
Eric House
3cf8d7571b fix md5sum calc for non-utf8 wordlists
And use apache logging
2019-01-05 18:46:58 -08:00
Eric House
309622b592 put back a couple of words -- not dirty! 2017-05-05 06:48:52 -07:00
Eric House
8752432de3 add ability to filter out "dirty" words
If a Makefile defines a dirty word list then a new python script is
invoked to filter for and remove those words as the dict is being
built. So far I have one for English only, which makes sense because only
English wordlists are built-in on Android and Google's rating system
cares only about what's built in.
2017-05-04 22:45:27 -07:00
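A rough sketch of the kind of filter described above, assuming one word per entry and case-insensitive matching. The function name and both assumptions are illustrative; the real script may work differently.

```python
def filter_dirty(words, dirty_words):
    """Return the words that do not appear in dirty_words (case-insensitive)."""
    # Normalize the dirty list once so each lookup is a cheap set membership test.
    dirty = {w.strip().upper() for w in dirty_words if w.strip()}
    return [w for w in words if w.strip().upper() not in dirty]


# Hypothetical data: "DARN" stands in for a word on the dirty list.
print(filter_dirty(["CAT", "DARN", "DOG"], ["darn"]))
```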
Eric House
c62c9899eb Hack: use sed to strip utf-8 marker from start of file. 2016-01-04 20:38:49 -08:00
Eric House
22dde029c8 Merge tag 'android_beta_100' into android_branch
ready for release
2016-01-03 11:36:37 -08:00
Eric House
8c26cf726a file for new French wordlist (not publicly available yet) 2016-01-01 19:32:32 -08:00
Eric House
9d795ff01d Makefile to build new "zinga" wordlist, and config changes to include
all letters, including those for which there are no tiles.
2015-10-12 21:42:22 -07:00
Eric House
32cd08b1dc oops: include new URL in note 2015-09-22 07:23:19 -07:00
Eric House
ccab839e66 change for new list, and also to use format better suited for keeping
wordlists in git.
2015-09-22 07:10:25 -07:00
Eric House
29d8a67c36 Makefile for new wordlist 2015-07-08 20:41:13 -07:00
Eric House
e9cab298d9 makefile for new Czech wordlist 2015-02-24 06:45:32 -08:00
Eric House
d902c35d33 add note 2014-06-24 08:19:20 -07:00
Eric House
b34f701b1e use comm instead of huge grep loop to filter wordlist 2014-06-23 07:29:05 -07:00
Eric House
ed41fdd924 Fix to work with newest format. (This should fix updating of wordlists.) 2014-03-12 19:26:08 -07:00
Eric House
b45fcf2aa6 force output in UTF-8 -- fixes German BYOD display and still works for English 2014-02-28 06:27:54 -08:00
Eric House
565a466164 capitalize note 2014-02-23 12:10:19 -08:00
Eric House
fe2a623a5a use sowpods list from freescrabbledictionary.com 2014-02-23 11:28:49 -08:00
Eric House
f50c1191b5 fix to compile and produce correct output on 64-bit system 2014-02-23 11:26:10 -08:00
Eric House
69e050c8de remove linefeeds; add synonyms 2013-11-27 17:14:38 -08:00
Eric House
85a49885fb sum percentages as sanity check 2013-11-22 18:51:17 -08:00
Eric House
14fa77cf2d remove blank lines 2013-11-22 18:51:04 -08:00
Eric House
9a5408f5db first cut at script to build new, larger wordlist from sourceforge
scrabble wordlist project
2013-11-21 19:20:56 -08:00
Eric House
bd902bfc12 ignore dawg exporting files 2013-10-21 20:39:21 -07:00
Eric House
becbf7aea6 include dict2dawg in exported files 2013-10-09 21:12:52 -07:00
Eric House
c0598aa06b fix to compile without DEBUG defined 2013-10-09 21:02:15 -07:00
Eric House
ad2916b186 add Makefile so byod export will pick it up 2013-10-09 20:54:31 -07:00
Eric House
303f919b5e fix pattern for synonyms 2013-10-09 07:36:04 -07:00
Eric House
0431f1a7ea fix capitalization 2013-10-07 07:01:03 -07:00
Eric House
5e92de6ff8 skip Hëx files when exporting for byod 2013-10-07 07:00:27 -07:00
Eric House
2a2157b504 add DICTNOTE 2013-05-21 06:58:53 -07:00
Eric House
5a026ffda3 merge android_wordlists (local branch) 2013-05-01 06:39:31 -07:00
Eric House
519ba71e87 add new bit indicating that wordlist has synonyms 2013-04-20 19:45:54 -07:00
Eric House
bc1c8d0769 add variants of multi-letter "specials" that mix case 2013-04-18 19:42:02 -07:00
Eric House
07cfdad699 fix to support synonyms within specials too -- for linux only so far.
Seems to work, though the dawg2dict.pl script is broken.
2013-04-09 07:43:04 -07:00
Eric House
03f175dd8f handle new format for tile face 'A|a', meaning "A" or "a", as far as
being able to compile a wordlist and take it apart using dawg2dict.
None of the compiled clients can handle this format yet.
2013-04-06 10:28:22 -07:00
Eric House
cc7220fa85 first cut at makefile for new Brazilian Portuguese wordlist 2013-04-04 07:18:36 -07:00
Eric House
7c937fd763 improve script a bit 2013-02-12 07:37:44 -08:00
Eric House
40470b491e update note for new version 2013-02-12 07:08:21 -08:00
Eric House
a61bb31aa5 update for newest wordlist 2012-10-31 07:12:10 -07:00
Eric House
8f4c0169b4 fix to print contents again 2012-10-22 19:19:38 -07:00
Eric House
5d45d8d35f remove param that doesn't work when called from mod_python 2012-09-15 09:03:52 -07:00
Eric House
c815d739cd generate md5sum for old .xwd files that don't have it internally. And
for those that do, verify that stored and generated values match.
2012-09-13 20:54:35 -07:00
Eric House
aaa12a291e use unpack to correctly pull wordcount, note and checksum 2012-09-13 19:07:37 -07:00
Eric House
20afa9fd56 rename: file's obsolete now 2012-09-13 19:06:46 -07:00
Eric House
a85ab865cb up note for new version of DISC 2012-09-13 19:05:09 -07:00
Eric House
1957ad3dbe add options, that sometimes work, to print desc and md5sum from .xwd
files.  Still need to figure out how to parse binary into UTF-8.
2012-09-08 13:20:12 -07:00
Eric House
047f68aafd version with checksum and note 2012-09-08 13:18:58 -07:00
Eric House
ca88c9850e add DICTNOTE 2012-09-08 13:17:49 -07:00
Eric House
977bee15d9 add DICTNOTEs 2012-09-08 10:10:17 -07:00
Eric House
8e58d8c1c0 print header elems, including md5sum, if present rather than just
skipping.
2012-09-07 20:34:55 -07:00
Eric House
0b81516682 add md5sum to dict header, summing not the whole file but the parts
that make the wordlist unique: tile counts and values, and bitmaps,
and the data.  This happens to be contiguous data on non-palm .xwd
files so it's easy to duplicate if the sum isn't there.
2012-09-07 20:32:10 -07:00
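A sketch of summing only the contiguous part of a file that makes a wordlist unique. The `start` and `end` offsets are hypothetical placeholders; the actual .xwd layout is not shown here, so a caller would have to work them out from the real header.

```python
import hashlib


def unique_part_md5(blob, start, end):
    """MD5 hex digest of blob[start:end], the assumed contiguous unique region."""
    return hashlib.md5(blob[start:end]).hexdigest()


# Two made-up files that differ only in a leading note share the same sum,
# because the note falls outside the summed region.
payload = b"TILES-AND-DATA"
file_a = b"note one\x00" + payload
file_b = b"a different note\x00" + payload
sum_a = unique_part_md5(file_a, len(file_a) - len(payload), len(file_a))
sum_b = unique_part_md5(file_b, len(file_b) - len(payload), len(file_b))
print(sum_a == sum_b)
```

Summing a fixed region rather than the whole file means the sum can also be regenerated for older files that do not store it.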
Eric House
568bef7ac3 add DICTNOTE 2012-08-30 07:08:55 -07:00
Eric House
a0e8b6c076 add description 2012-08-26 21:36:00 -07:00
Eric House
b29df8512a add null-terminated note to dawg header and modify linux client to
accept it if present.  Android client will successfully ignore it and
will need to be modified to capture and display it if present.  Idea's
to display information about copyright, source, etc. of wordlists.
2012-08-25 10:20:52 -07:00
Eric House
9185ec71ca use newest Catalan wordlist 2012-07-26 21:14:40 -07:00
Eric House
baa8c7472d include stylesheet in generated index.html 2012-05-30 22:16:43 -07:00
Eric House
fd7a25ba3c makefile for just-released DISC2 wordlist for Catalan 2012-05-23 20:04:11 -07:00
Eric House
07e93971d3 makefile for latest CSW 2012-01-17 18:19:57 -08:00
Eric House
cfa4c96d22 just for grins: japanese dict-building files. There are too many kana
for the current format so this can only be for demos, but I might as
well record it.
2011-08-29 20:42:27 -07:00
Andy2
332767105c express size in K (rounding up) 2011-05-15 07:37:29 -07:00
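One common way to express a byte count in K while rounding up is integer ceiling division. This is a sketch assuming 1K = 1024 bytes; the index generator's own arithmetic is not shown in this log.

```python
def size_in_k(num_bytes):
    """Byte count expressed in K (1024 bytes), rounded up to a whole K."""
    return (num_bytes + 1023) // 1024


for n in (0, 1, 1024, 1025):
    print(n, size_in_k(n))
```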
Andy2
7ccacdc26d switch size and wordcount columns 2011-05-15 07:28:10 -07:00
Andy2
deeb2f3cba fix compile-command 2011-04-29 06:24:41 -07:00
Eric House
1ab5aa02b9 Makefile for new dict containing 4288 words: good for the robot. 2011-04-14 22:09:44 -07:00
Andy2
4272686034 Makefile for new smaller Dutch wordlist 2011-04-08 22:13:31 -07:00
Andy2
ce61427bba generate md5 sum file optionally. Later I'll want to download these
to check that the file arrived safely.
2011-03-02 19:00:25 -08:00
Eric House
beaa7ba5a5 assume dict is utf8-encoded but check and fail if it isn't 2011-02-08 20:57:41 -08:00
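A minimal sketch of the assume-then-check approach: decode as UTF-8 and stop with an error if the bytes are not valid. The function name and error message are illustrative assumptions.

```python
def decode_utf8_or_fail(raw):
    """Decode raw bytes as UTF-8, exiting with a message if they are not valid."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise SystemExit(f"wordlist is not valid UTF-8: {err}")


# Valid UTF-8 decodes normally; a lone Latin-1 byte such as b"\xf1" would fail.
print(decode_utf8_or_fail("NIÑO".encode("utf-8")))
```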
Eric House
481a533e58 ignore uncompressed dicts too 2011-01-24 22:21:44 -08:00
Eric House
c7b6d799f0 switch to utf8 2011-01-07 18:05:57 -08:00
Andy2
5459631c76 No need for empty .dict when creating empty .dict.gz 2011-01-06 18:20:56 -08:00
Andy2
6f2cde1304 create an index at the top of page; indent dict lines; drop ".xwd" 2011-01-06 18:09:10 -08:00
Andy2
2cc46d8a69 get rid of unused but oft-included file 2010-12-17 19:02:01 -08:00
Andy2
0ee156c9f0 add empty: case for WINCE type too 2010-12-17 18:55:44 -08:00
Andy2
c0bec75fd8 fix crash when input wordlist is empty by not counting zero-length
word as a word.
2010-12-17 18:55:25 -08:00
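The idea of not counting a zero-length word can be sketched as follows. This is an illustrative Python version under the assumption of one word per line, not the compiler's actual code.

```python
def count_words(lines):
    """Count non-empty words, so an empty input yields zero rather than one."""
    return sum(1 for line in lines if line.strip())


print(count_words([]))                       # empty wordlist
print(count_words(["CAT\n", "\n", "DOG\n"]))  # blank line is skipped
```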
Andy2
c5e0955460 simplify build rule 2010-12-17 17:39:33 -08:00