Eric House
07cfdad699
fix to support synonyms within specials too -- for linux only so far.
Seems to work, though the dawg2dict.pl script is broken.
2013-04-09 07:43:04 -07:00
Eric House
03f175dd8f
handle new format for tile face 'A|a', meaning "A" or "a", as far as
being able to compile a wordlist and take it apart using dawg2dict.
None of the compiled clients can handle this format yet.
2013-04-06 10:28:22 -07:00
Eric House
cc7220fa85
first cut at makefile for new Brazilian Portuguese wordlist
2013-04-04 07:18:36 -07:00
Eric House
7c937fd763
improve script a bit
2013-02-12 07:37:44 -08:00
Eric House
40470b491e
update note for new version
2013-02-12 07:08:21 -08:00
Eric House
a61bb31aa5
update for newest wordlist
2012-10-31 07:12:10 -07:00
Eric House
8f4c0169b4
fix to print contents again
2012-10-22 19:19:38 -07:00
Eric House
5d45d8d35f
remove param that doesn't work when called from mod_python
2012-09-15 09:03:52 -07:00
Eric House
c815d739cd
generate md5sum for old .xwd files that don't have it internally. And
for those that do, verify that stored and generated values match.
2012-09-13 20:54:35 -07:00
Eric House
aaa12a291e
use unpack to correctly pull wordcount, note and checksum
2012-09-13 19:07:37 -07:00
Eric House
20afa9fd56
rename: file's obsolete now
2012-09-13 19:06:46 -07:00
Eric House
a85ab865cb
update note for new version of DISC
2012-09-13 19:05:09 -07:00
Eric House
1957ad3dbe
add options, that sometimes work, to print desc and md5sum from .xwd
files. Still need to figure out how to parse binary into UTF-8.
2012-09-08 13:20:12 -07:00
Eric House
047f68aafd
version with checksum and note
2012-09-08 13:18:58 -07:00
Eric House
ca88c9850e
add DICTNOTE
2012-09-08 13:17:49 -07:00
Eric House
977bee15d9
add DICTNOTEs
2012-09-08 10:10:17 -07:00
Eric House
8e58d8c1c0
print header elems, including md5sum, if present rather than just
skipping.
2012-09-07 20:34:55 -07:00
Eric House
0b81516682
add md5sum to dict header, summing not the whole file but the parts
that make the wordlist unique: tile counts and values, and bitmaps,
and the data. This happens to be contiguous data on non-palm .xwd
files so it's easy to duplicate if the sum isn't there.
2012-09-07 20:32:10 -07:00
Eric House
568bef7ac3
add DICTNOTE
2012-08-30 07:08:55 -07:00
Eric House
a0e8b6c076
add description
2012-08-26 21:36:00 -07:00
Eric House
b29df8512a
add null-terminated note to dawg header and modify linux client to
accept it if present. Android client will successfully ignore it and
will need to be modified to capture and display it if present. Idea's
to display information about copyright, source, etc. of wordlists.
2012-08-25 10:20:52 -07:00
Eric House
9185ec71ca
use newest Catalan wordlist
2012-07-26 21:14:40 -07:00
Eric House
baa8c7472d
include stylesheet in generated index.html
2012-05-30 22:16:43 -07:00
Eric House
fd7a25ba3c
makefile for just-released DISC2 wordlist for Catalan
2012-05-23 20:04:11 -07:00
Eric House
07e93971d3
makefile for latest CSW
2012-01-17 18:19:57 -08:00
Eric House
cfa4c96d22
just for grins: japanese dict-building files. There are too many kana
for the current format so this can only be for demos, but I might as
well record it.
2011-08-29 20:42:27 -07:00
Andy2
332767105c
express size in K (rounding up)
2011-05-15 07:37:29 -07:00
Andy2
7ccacdc26d
switch size and wordcount columns
2011-05-15 07:28:10 -07:00
Andy2
deeb2f3cba
fix compile-command
2011-04-29 06:24:41 -07:00
Eric House
1ab5aa02b9
Makefile for new dict containing 4288 words: good for the robot.
2011-04-14 22:09:44 -07:00
Andy2
4272686034
Makefile for new smaller Dutch wordlist
2011-04-08 22:13:31 -07:00
Andy2
ce61427bba
generate md5 sum file optionally. Later I'll want to download these
to check that the file arrived safely.
2011-03-02 19:00:25 -08:00
Eric House
beaa7ba5a5
assume dict is utf8-encoded but check and fail if it isn't
2011-02-08 20:57:41 -08:00
Eric House
481a533e58
ignore uncompressed dicts too
2011-01-24 22:21:44 -08:00
Eric House
c7b6d799f0
switch to utf8
2011-01-07 18:05:57 -08:00
Andy2
5459631c76
No need for empty .dict when creating empty .dict.gz
2011-01-06 18:20:56 -08:00
Andy2
6f2cde1304
create an index at the top of page; indent dict lines; drop ".xwd"
2011-01-06 18:09:10 -08:00
Andy2
2cc46d8a69
get rid of unused but oft-included file
2010-12-17 19:02:01 -08:00
Andy2
0ee156c9f0
add empty: case for WINCE type too
2010-12-17 18:55:44 -08:00
Andy2
c0bec75fd8
fix crash when input wordlist is empty by not counting zero-length
word as a word.
2010-12-17 18:55:25 -08:00
Andy2
c5e0955460
simplify build rule
2010-12-17 17:39:33 -08:00
Andy2
7e46163988
add counts and values -- from wikipedia article, as are Arabic and
Turkish files just checked in.
2010-12-17 17:38:47 -08:00
Andy2
18f8b0d4e4
switch to utf-8, adding an iconv call to translate the wordlists.
2010-12-17 17:37:57 -08:00
Andy2
32fccca995
Turkish. As with Arabic, untested.
2010-12-17 17:36:38 -08:00
Andy2
71559e27c6
add Arabic. I have no wordlist but this should still allow play
between humans, even over the net. Untested, though, as my phone
doesn't have any Arabic glyphs.
2010-12-17 17:36:03 -08:00
Andy2
d1605c4493
fix: convert to utf8 and replace grep that didn't work (presumably
because ranges have different meanings in utf-8) with one that does.
2010-12-13 20:39:04 -08:00
Andy2
d78584fddf
remove obsolete, pre-utf8 files
2010-12-13 20:09:26 -08:00
Andy2
bb0a79914b
add conversion from ISO88591 since the default dict's in that format.
2010-12-13 20:09:09 -08:00
Andy2
dc807c948a
use sed instead of tr since as with Slovak a letter was getting
dropped. Same one in fact.
2010-12-13 19:58:37 -08:00
Andy2
299c84bb2b
use sed rather than tr to uppercase letters. tr was dropping the Á
letter for some reason. The sed feature I'm using is a gnu extension
but has the advantage of working. Should probably do this for all
languages and in the info files.
2010-12-13 18:16:22 -08:00
Eric House
894afdc0cb
take words up to 15 letters long. This makes no difference with any
dict I've tried as there just aren't any words over 7 letters long
made up of only a-f.
2010-12-12 20:02:28 -08:00
Eric House
e8e0b25fad
go back to old dict -- correcting a change I didn't mean to check in.
2010-12-12 20:01:33 -08:00
Eric House
9c5b2c9f4f
add for current French list
2010-12-09 21:22:37 -08:00
Eric House
98456dd652
fix to build dicts, wince/android format by default
2010-12-09 21:22:14 -08:00
Eric House
6b58c9031f
script to build html page for downloading dicts
2010-12-09 21:21:41 -08:00
Andy2
39b40a9a3d
build with a header giving word count
2010-12-06 18:31:12 -08:00
Andy2
12508b7cd5
cleanup stderr output
2010-12-06 07:23:22 -08:00
Andy2
0072112b5a
fix syntax for including newheader so only one gets included. Fixes
bug building multiple dicts where headers would accumulate.
2010-12-06 07:23:05 -08:00
Eric House
c4cdc24b78
initial changes to add a header to xwd format so that stuff like
number of words can be included. Changed to build dicts and linux to
open them. Android still needs to learn. Also, some of the tools in
dawg/ need to be fixed to read old-format (pre-utf8) .xwd files.
2010-12-05 19:33:10 -08:00
Eric House
eff2324950
fix compile command
2010-12-05 19:30:00 -08:00
Eric House
bef1e125bf
ignore .pdb files
2010-12-05 19:29:15 -08:00
Andy2
e89feb62d8
second part of manual merge of unicode_branch's dawg/ directory into
this one. This adds the directories and their files created inside
dawg.
2010-11-30 18:38:05 -08:00
Andy2
79990bc7b1
first set of changes formed by applying diff of android_branch's
dawg/ directory against unicode_branch's. The two branches seem to
have no common ancestor -- probably didn't survive translation from
svn -- so this is the best I can do.
This checkin is all the files that were modified by the patch plus a
couple of simple additions. Next I'll be adding directories that the
patch created. It also reintroduced a bunch of .cvsignore files; I
won't check those in.
2010-11-30 18:35:11 -08:00
Eric House
2a2f4d4395
been a while since cvs...
2010-11-09 05:53:49 -08:00
Eric House
3716218a1d
ignore files in dawg/
2010-07-07 23:18:14 -07:00
Eric House
48946996b8
ignore file in dawg/
2010-07-07 23:17:13 -07:00
ehouse
8dca48b3ea
Useful ftell, commented out.
2009-03-29 18:13:09 +00:00
ehouse
9e5b3f8f29
Changes to fix BYOD (though still need native speaker confirmation)
2009-03-14 22:33:53 +00:00
ehouse
690bf80b7b
Fix so can build iso-8859-2 Polish dicts using make (won't work on
BYOD yet): add encoding to emacs mode line and fix the letters,
including hard-coding them as decimal numbers until I can figure out
how to get perl (in xloc.pm) to emit iso-8859-2 instead of utf8.
2009-03-14 19:27:29 +00:00
ehouse
0b0bf96cd5
accept ISO-8859-2; remove unused param; add assert that EOF/EOL aren't
part of a multibyte char
2009-03-14 19:22:15 +00:00
ehouse
b16a07d0ba
build dict2dawg with debug symbols
2009-03-14 19:21:09 +00:00
ehouse
b9dce19a93
if setlocale doesn't work, try again with en_US -- works around
problem on my ISP.
2009-01-28 03:32:21 +00:00
ehouse
b7fa674c28
Set locale based on params passed in, only on ENV if not specified.
2009-01-25 20:13:36 +00:00
ehouse
90f8a276e1
Cleanup to run on a machine that's utf8: specify iso-8859-1 when needed.
2009-01-25 18:57:05 +00:00
ehouse
f6d8924593
make tarball ready to be dropped into byod
2009-01-25 18:48:29 +00:00
ehouse
b2dd3f02b0
Need to escape period in grep pattern to get literal dot!
2009-01-22 04:30:35 +00:00
ehouse
24622876bb
change default dictionary
2009-01-21 05:36:43 +00:00
ehouse
c2f1ff3d06
smartphone-size small bitmaps
2009-01-21 05:25:43 +00:00
ehouse
f422305542
Make smaller bitmaps 8x8 since that's the smallest size that can be
required and StretchBlt to smaller can't work for letters.
2009-01-18 18:25:33 +00:00
ehouse
702940fe06
Tweaks to bitmaps; build for wince by default
2009-01-17 18:39:08 +00:00
ehouse
a56d84b64d
add emacs mode line
2009-01-14 13:41:25 +00:00
ehouse
41ae10f8b6
Allow language Makefile to specify encoding. Pass to perl and c++
dict builders, using it to open files and to determine whether to do
multi-to-wide conversion.
2009-01-13 13:32:07 +00:00
ehouse
7b8e4e0fd3
Add target to build all languages. Stops on Swedish at the moment.
2009-01-13 13:19:15 +00:00
ehouse
4e619601c2
To support Catalan, add Makefile and bitmaps for three special tiles.
The first of these, L-high-dot-L, requires Unicode to be properly
drawn, but the current dict format doesn't support it so it'll be L-L
for now. Bitmaps are still rough.
2009-01-13 13:17:58 +00:00
ehouse
eb1e667c17
Add type Letter to represent what are Tiles in Crosswords:
lang-independent indices into the set of letters in use. Should be no
change in functionality or code generated.
2009-01-07 05:13:45 +00:00
ehouse
948981434b
Fix compiler warnings. Should be no change in generated code.
2009-01-07 05:03:13 +00:00
ehouse
96d9baaac1
Compress user-visible name so more likely to fit on-device widgets
2008-10-29 08:47:12 +00:00
ehouse
a8fb37504d
Don't choke when words are longer than 15 letters.
2008-10-08 04:37:44 +00:00
ehouse
564b827f6d
Make new FAA 4.1 the default Spanish dictionary source; build three
dicts (8, 9 and 15) by default (all: target).
2008-09-18 03:55:04 +00:00
ehouse
ac03c4be61
Fix to compile with newer g++; increase size of buffer to handle largest Spanish wordlist.
2008-09-18 03:44:43 +00:00
ehouse
df52f6a47a
Accept words that contain no vowels.
2008-07-12 19:37:27 +00:00
ehouse
8e25e205fc
update in accordance with current Dutch practice (says an informant)
2008-07-10 03:13:33 +00:00
ehouse
5fd535d853
Break Czech into two "languages" as a way to support the two encodings in common use.
2008-03-19 04:47:03 +00:00
ehouse
dda5042690
Remove windows LF chars just in case; take SOURCEDICT via cmdline; add emacs modeline.
2008-03-15 15:00:46 +00:00
ehouse
a028b34a11
Compile dict2dawg by default since dict2dawg.pl has problems; fix warnings.
2008-03-15 14:52:23 +00:00
ehouse
9d0231a8b7
line column heads up correctly again
2008-02-23 22:00:40 +00:00
ehouse
907838591e
Fix to work with BYOD: pass -r rather than use grep to pull illegal words; fix language code; include charset.
2008-02-23 21:59:38 +00:00
ehouse
f0b53fd605
First cut at handling Czech. Correspondent says the Palm dict looks right. Still need to test on Windows and on BYOD.
2008-02-20 03:50:32 +00:00
ehouse
ab73fc4d38
cleanup; add lineno so number of letters is apparent
2008-02-20 03:44:31 +00:00
ehouse
22909ce6fb
add target for dict2dawg
2008-01-02 01:44:12 +00:00
ehouse
5457ea1b59
replace all __FUNCTION__ with __func__
2007-12-02 19:13:25 +00:00
ehouse
a2f60cb1f8
Makefile for Collins dict
2007-05-26 14:47:46 +00:00
ehouse
ca7c69bff1
include Makefile.langcommon
2007-04-14 16:03:31 +00:00
ehouse
5d867cd81c
Target to build tarball for uploading to byod.
2007-02-20 07:24:18 +00:00
ehouse
8faacfcede
Fix to work with new byod scheme.
2007-02-20 05:49:57 +00:00
ehouse
b2ed436b74
Add support for Russian. So that Russian text can be processed on systems without setting LANG=ru_RU.CP1251, modify dict2dawg to skip duplicates and words outside of specified lengths. Modify all info.txt files for the new scheme (which includes change to byod.cgi not kept on sourceforge.)
2007-02-17 17:06:05 +00:00
ehouse
3b7e680f2c
increment internal tile values by one so strings can be null-terminated
2007-02-14 15:17:00 +00:00
ehouse
599b43ab78
Y counts as a vowel when removing non-words.
2007-01-30 04:53:32 +00:00
ehouse
5b8e0e89d3
remove duplicates as part of sort process
2007-01-06 04:43:22 +00:00
ehouse
a2840b42ac
Change LANG to XWLANG to avoid conflict with ENV variable.
2006-08-11 01:44:08 +00:00
ehouse
b5164aa0c5
hide dict files -- playing with svn:ignore
2006-07-29 21:36:24 +00:00
ehouse
44a4dab13a
Cleanup prior to adding Swedish to BYOD.
2006-07-22 16:05:45 +00:00
ehouse
d43acd6b46
check for remaining memory being < 0, not just <=, since we allocate exactly as much as we need. Fixes failure due to being out of memory at same time as having finished parsing stdin.
2006-07-22 16:03:14 +00:00
ehouse
3fe3e05548
default dict now gzipped (no real change)
2006-07-01 14:13:29 +00:00
ehouse
5fb3705535
don't cast size to a char!
2006-06-28 14:11:46 +00:00
ehouse
de20e83bdb
A couple of tweaks so it works on byod with sample wordlist.
2006-06-28 03:38:42 +00:00
ehouse
5ebbf3f4d0
Support for Portuguese based on info from user in Brazil
2006-06-28 03:08:22 +00:00
ehouse
99ba48ce3e
add poolsize and fsize args to better warn users when dict is too big.
Later need to modify the build process to specify the size needed.
2006-05-02 13:28:07 +00:00
ehouse
653fdb6a7b
Improve out-of-memory message; don't double-count words.
2006-05-01 14:00:06 +00:00
ehouse
22e6ddde2a
Bring over from personal archive. I don't know if this works yet:
waiting for a wordlist.
2006-04-30 16:17:21 +00:00
ehouse
c0c5332098
add 'sort -u' to get rid of duplicates. All info files should have this....
2006-04-30 15:15:28 +00:00
ehouse
328c96c617
fix filter to eliminate words with unused letters; catch up count of
'G' tiles with gtoal's list.
2006-04-30 14:52:43 +00:00
ehouse
0295579e32
More cleanup for Spanish dict building. Seems to work now.
2006-04-30 04:44:10 +00:00
ehouse
8124a01010
Cleanup for Spanish dict building: die when can't build correctly, and
do same for WINCE as for FRANK re: specials
2006-04-30 04:27:33 +00:00
ehouse
834c43e131
sort to get rid of duplicates and so sort inside dict2dawg won't be needed
2006-04-30 02:35:26 +00:00
ehouse
3a37c11970
check that this version number stuff works
2006-04-29 16:47:01 +00:00
ehouse
4493ed8482
attempt to print subversion revision number with -v option
2006-04-29 16:40:48 +00:00
ehouse
1d40eddbb5
exit if can't open table file; include assert for compile on sarge
2006-04-14 08:23:28 +00:00
ehouse
3df1e461e4
For already-sorted case, read words from file on as-needed basis rather
than build a vector to hold them.
2006-04-14 05:23:30 +00:00
ehouse
8f909cd3a7
Use new compiled dict2dawg when present.
2006-04-13 15:30:15 +00:00
ehouse
b70bee3d53
A final bit of cleanup. All the perl is gone.
2006-04-13 04:04:03 +00:00
ehouse
d6dc4bf30c
Cleanup: remove dead code.
2006-04-13 03:58:54 +00:00
ehouse
131d4c9bd4
Use a single huge buffer for all strings rather than calling malloc
for each. Makes a measurable speed difference.
2006-04-13 03:52:48 +00:00
ehouse
08557184a5
debug: works now! Also ifdef out debug/verbose code.
2006-04-13 03:49:41 +00:00
ehouse
72532d72a8
print letter as well as tile in text dumps (same as cpp version)
2006-04-13 03:06:18 +00:00
ehouse
b89ed5b999
add -debug arg for parity with cpp version, and add -mn flag to usage().
2006-04-13 02:58:39 +00:00
ehouse
0c7081bf36
Tons of changes continuing port from perl. Doesn't quite work yet, but close.
2006-04-13 02:57:43 +00:00
ehouse
2863379b9b
Starting work on cpp version of dict2dawg.pl. This is nowhere near complete.
2006-04-12 04:39:49 +00:00
ehouse
6f9e7ed94c
add an underbar to separate numerals
2006-03-18 03:35:20 +00:00
ehouse
162cb99c53
ignore .stamp files
2006-03-04 15:36:06 +00:00
ehouse
772c262b5e
first checked in. works
2006-02-26 23:51:57 +00:00
ehouse
233479a959
get rid of null-termination and 'sort -z' since that option isn't on
new ISP's BSD sort.
2006-02-10 05:12:25 +00:00
ehouse
92485783af
update email address in header comments: no code change
2006-01-08 01:25:02 +00:00
ehouse
5e6eca025a
fix so hex dicts build again
2005-10-30 19:05:40 +00:00
ehouse
fb8d643ea2
replace sed with awk
2005-10-30 19:04:49 +00:00
ehouse
3b12c4df87
syntax error
2005-07-09 15:36:39 +00:00
ehouse
77374484f8
ditch words without vowels
2005-07-06 00:58:44 +00:00
ehouse
78aefbefea
fix description at user's suggestion
2005-06-27 05:23:14 +00:00
ehouse
5e02ca1c86
first checked in. Seems to work.
2005-06-22 06:40:53 +00:00
ehouse
e2cbee1210
path to local copy of wordlist
2005-06-16 05:12:49 +00:00