308 commits

Author SHA1 Message Date
Eric House
76dc78cd1c use sed instead of tr to uppercase -- everywhere
Required for some unicode chars, but might as well use it everywhere to
make copying easier.
2022-01-27 19:36:55 -08:00
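The commit above swaps one uppercasing tool for another because some Unicode characters were not handled. As a loose analogy only (the Makefile itself uses sed, and this is not its code), the Python sketch below contrasts a byte-oriented uppercase with a Unicode-aware one. The function names and the sample word are hypothetical.

```python
def upper_bytewise(word: str) -> str:
    """Uppercase by operating on UTF-8 bytes; only ASCII letters change."""
    return word.encode("utf-8").upper().decode("utf-8")


def upper_unicode(word: str) -> str:
    """Uppercase using Unicode case mappings; non-ASCII letters change too."""
    return word.upper()


if __name__ == "__main__":
    sample = "mără"  # hypothetical sample containing a non-ASCII letter
    print(upper_bytewise(sample))
    print(upper_unicode(sample))
```

The byte-wise version leaves the non-ASCII letter in lowercase, which is the kind of failure a wordlist build needs to avoid.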
Eric House
3912a60ee9 limit word lengths to 2-15
dict2dawg crashes when given a 1-letter word. Easier to fix in the
filtering that has to be there anyway.
2022-01-23 17:46:52 -08:00
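A minimal sketch of the length filter described in the commit above, assuming length is counted in characters (the real filtering lives in the build's shell/Makefile pipeline and may count tiles instead). The function name and defaults are illustrative, not taken from the repository.

```python
def filter_by_length(words, min_len=2, max_len=15):
    """Keep only words whose length falls within [min_len, max_len]."""
    return [w for w in words if min_len <= len(w) <= max_len]


if __name__ == "__main__":
    # The 1-letter word and the 16-letter word are dropped.
    print(filter_by_length(["A", "AB", "X" * 15, "X" * 16]))
```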
Eric House
4b55b0b873 fix typo in Makefile
Didn't seem to matter...
2022-01-20 22:14:13 -08:00
Eric House
7d869a6bda add sorting options 2022-01-17 16:31:19 -08:00
Eric House
ff767f6711 add Romanian metadata 2021-12-01 08:31:25 -08:00
Eric House
de739586c3 add makefile and info for Romanian 2021-11-27 12:41:21 -08:00
Eric House
d8a89abd53 need to specify *which* python now 2021-11-27 11:26:31 -08:00
Eric Prod
ebe1c5a22d just the message, ma'am 2021-01-01 16:15:42 -08:00
Eric House
e4a3d55876 include all possible case combinations (like Spanish) 2020-12-20 21:23:45 -08:00
Eric House
4936adab75 add option to output number of words 2020-12-14 08:57:11 -08:00
Eric House
ef7c0965ba fix to build wordlist from current sources
I'd lost the old source, so uncompressed a current list to recreate.
2020-12-14 08:57:11 -08:00
Eric House
57ab42223a fix to print older wordlists with tiny headers 2020-12-14 08:57:11 -08:00
Eric House
b8f359c3e5 add filtering to wordlist browser
Add a basic regular expression engine to the dictiter, and to the UI add
the ability to filter for "starts with", "contains" and "ends with",
which translate into ANDed RE_*, _*RE_* and _*RE, respectively (with
_ standing for blank/wildcard). The engine's tightly integrated with the
next/prevWord() functions for greatest possible speed, but whenever a
pattern is set it does slow things down a bit (especially when "ENDS
WITH" is used). The full engine is not exposed (users can't provide raw
REs), and while the parser will accept nesting (e.g. ([AB]_*[CD]){2,5}
to mean words from 2-5 tiles long starting with A or B and ending with C
or D) the engine can't handle it, which is why filtering for word length
is handled separately from REs (but also tightly integrated).

Users can enter strings that don't map to tiles. They now get an
error. It made sense for the error alert to have a "Show tiles"
button, so there's now a dialog listing all the tiles in a wordlist,
something the browser has needed all along.
2020-08-05 09:47:44 -07:00
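The "starts with" / "contains" / "ends with" translation described above can be sketched with Python's standard `re` module. This is not the dictiter's engine: it assumes one character per tile, maps the commit's `_*` wildcard to `.*`, and keeps word length as a separate check, as the commit says the real code does. `build_filters` and `matches` are hypothetical names.

```python
import re


def build_filters(starts_with="", contains="", ends_with=""):
    """Turn the three UI fields into patterns that are later ANDed."""
    patterns = []
    if starts_with:
        patterns.append(re.compile(re.escape(starts_with) + ".*"))      # RE_*
    if contains:
        patterns.append(re.compile(".*" + re.escape(contains) + ".*"))  # _*RE_*
    if ends_with:
        patterns.append(re.compile(".*" + re.escape(ends_with)))        # _*RE
    return patterns


def matches(word, patterns, min_len=0, max_len=None):
    """Apply the length limits first, then require every pattern to match."""
    if len(word) < min_len:
        return False
    if max_len is not None and len(word) > max_len:
        return False
    return all(p.fullmatch(word) for p in patterns)


if __name__ == "__main__":
    filters = build_filters(starts_with="CA", ends_with="S")
    for w in ["CATS", "CAT", "SCATS"]:
        print(w, matches(w, filters, min_len=2, max_len=15))
```

Keeping the length limits outside the patterns mirrors the commit's point that nested quantifiers such as `{2,5}` are not supported by the engine.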
Eric House
7daf3313e0 fix to include optional info.txt info 2020-07-25 13:58:29 -07:00
Eric House
d98430aa0d improving prep of byod files 2020-07-25 13:58:29 -07:00
Eric House
6f6e5516c9 add LANGFILTER so byod can build Hungarian 2020-07-25 13:58:29 -07:00
Eric House
d2a997d0ee don't exit badly when piped 2020-07-25 13:58:29 -07:00
Eric House
67d91111df more tweaks for byod 2020-07-25 13:58:29 -07:00
Eric House
a75264c8eb tweaks for byod 2020-07-25 13:58:29 -07:00
Eric House
666d2db62a add Makefile as symlink 2020-07-25 13:58:29 -07:00
Eric House
83b775a52c convert two more perl scripts to python 2020-07-25 13:58:29 -07:00
Eric House
042e5e6eab remove files I'll never need again 2020-07-25 13:58:29 -07:00
Eric House
f30bc77a5f rewrite some dawg perl scripts in python 2020-07-25 13:58:29 -07:00
Eric House
db30cea947 update to work with uncompressed Portuguese source 2020-05-18 20:24:36 -07:00
Eric House
4c28013439 fix per informant's instructions to build from git src 2020-05-15 08:33:23 -07:00
Eric House
0e9661aa19 fix search of wordlists containing duplicates
Hungarian is unique (so far) in having two-letter tiles that can be
spelled with one-letter tiles AND in allowing words to be spelled both
ways. This crashed search based on strings because there were
duplicates. So now search is done by tile arrays. Strings are first
converted, and then IFF there is more than one tile array result AND the
wordlist has the new flag indicating that duplicates are possible, THEN
the user is asked to choose among the possible tile spellings of the
search string.
2020-05-04 08:33:15 -07:00
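To illustrate why a string can map to more than one tile array, here is a small sketch that enumerates every way to spell a string with a given tile set. It is an assumption-laden illustration, not the search code from the commit; `tile_spellings` and the sample tile list are hypothetical.

```python
def tile_spellings(word, tiles):
    """Return every way to split `word` into faces from `tiles`.

    An empty result means the string cannot be spelled with these tiles.
    """
    if not word:
        return [[]]
    results = []
    for tile in tiles:
        # Skip empty faces so the recursion always consumes input.
        if tile and word.startswith(tile):
            for rest in tile_spellings(word[len(tile):], tiles):
                results.append([tile] + rest)
    return results


if __name__ == "__main__":
    tiles = ["A", "C", "K", "S", "CS"]  # hypothetical subset of a tile set
    for spelling in tile_spellings("CSAK", tiles):
        print(spelling)
```

When more than one spelling comes back, a caller could ask the user to choose, as the commit describes; when none comes back, the string does not map to tiles at all.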
Eric House
851fa1a76e let's not change the Spanish wordlist name rashly 2020-05-03 21:28:33 -07:00
Eric House
67f74b3808 cleanup hungarian Makefile 2020-05-01 09:26:08 -07:00
Eric House
f1f6d5b642 change name of Spanish wordlist
"Spanish" is redundant
2020-05-01 08:59:58 -07:00
Eric House
dfbbf2d480 don't assert wordlist length wrong
For some reason the header and dawg data in Spanish wordlist don't
agree. Until I fix this, remove the assertion from the (dev-use-only)
script that dumps wordlist since it breaks it for other uses.
2020-05-01 08:59:58 -07:00
Eric House
f7374f54c5 fix Spanish support for lowercase
"special" casing is specified in two places, and I forgot to modify the
second one when I added allowing lowercase alternative spellings
2020-04-30 13:06:01 -07:00
Eric House
fb2fcf15cc tmp fix for Hungarian: remove duplicate words
Find-prefix feature in current code crashes on Hungarian because it
allows duplicates (words that occur spelled with the same letters but
different tile combinations.) Modify Makefile to exclude those (as it
does for all other multi-letter-tile languages) and to pull the git
source of the wordlist on demand.
2020-04-29 12:29:26 -07:00
Eric House
1c0348dbf1 add option to print a delimiter between tiles
For Hungarian, there are "duplicate" words because e.g. the string CS
can be spelled with two tiles or one. If a delimiter is printed at tile
boundaries the duplication goes away.
2020-04-24 21:14:20 -07:00
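A short sketch of the delimiter idea from the commit above: two tile arrays that print identically as plain strings become distinct once a delimiter marks tile boundaries. The `format_word` name and the "." delimiter are illustrative choices, not the actual option in the dump script.

```python
def format_word(tile_array, delimiter=""):
    """Join tile faces into a printable word, optionally marking boundaries."""
    return delimiter.join(tile_array)


if __name__ == "__main__":
    two_tiles = ["C", "S", "A", "K"]  # C and S as separate tiles
    one_tile = ["CS", "A", "K"]       # CS as a single tile
    print(format_word(two_tiles), format_word(one_tile))
    print(format_word(two_tiles, "."), format_word(one_tile, "."))
```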
Eric House
adadbd8647 make symlink relative
Useless if it specifies my machine :-)
2020-04-24 20:09:08 -07:00
Eric House
cc4776d29d Populate an actual wordlist for Hungarian
Add Makefile filters to create a wordlist with about 42K words derived
from a github project (thanks to pointers from an informant. :-) Per
him, and contrary to how Catalan does it, double-letter-tile words
also appear in single-letter variants if the tiles allow.
2020-04-24 13:44:55 -07:00
Eric House
cc28562061 files to create empty Hungarian wordlist 2020-04-24 06:34:30 -07:00
Eric House
2204d951a7 don't crash dumping empty wordlists 2020-04-23 22:10:25 -07:00
Eric House
3c1a748272 fix dictionary sum checking server-side 2020-03-18 22:28:58 -07:00
Eric House
390753ae3a fix to correctly wrap L·L in 2x2 cell case 2020-01-31 18:57:12 -08:00
Eric House
d4d4693def makefile for new French wordlist 2019-12-28 09:01:10 -08:00
Eric House
1e7ae2b2ec fix lower->upper translation: tr didn't like my strings
For whatever reason, though emacs thought the lower- and uppercase
strings I was passing to tr were the same but for case, two letters were
getting dropped. This lets tr figure things out itself.
2019-11-19 22:27:41 -08:00
Eric House
58c3ab4e4a first cut at python version of dawg2dict
Perl version doesn't work and I don't remember enough of the language to
fix it.
2019-11-13 13:22:30 -08:00
Eric House
c5a8319fa8 fix comment 2019-11-13 12:32:06 -08:00
Eric House
611e046987 check for undefined variable 2019-11-12 09:08:08 -08:00
Eric House
56fc359f42 fix counts/values for 'D' (thanks Peter) 2019-11-03 18:12:24 -08:00
Eric House
a1a71df7c6 update Makefile for latest Catalan wordlist 2019-10-22 13:24:40 +02:00
Eric House
2a54c9ed20 fix Slovak makefile for new wordlist 2019-08-19 07:49:23 +03:00
Eric House
2e4d3a1276 add source of current (dinky) Slovak wordlist 2019-08-08 09:57:40 -07:00
Eric House
df14108e4e add lowercase equivalents
where missing and seems possible
2019-07-07 13:00:06 -07:00
Eric House
4abefb025c add lower-case letters as alternatives 2019-07-07 13:00:06 -07:00