mirror of
https://github.com/apprenticeharper/DeDRM_tools
synced 2025-01-09 17:24:52 +01:00
93c2ccd2c2
(With some additions) Lots of authors, brought together by Apprentice Alf.
Contributors:
    cmbtc - removal of drm which made all of this possible
    clarknova - for all of the svg and glyph generation and many other bug fixes and improvements
    skindle - for figuring out the general case for the mode loops
    some updates - for conversion to xml, basic html
    DiapDealer - for extensive testing and feedback, and standalone linux/macosx version of cmbtc_dump
    stewball - for extensive testing and feedback

and many others for posting, feedback and testing

This is experimental and it will probably not work for you but...

ALSO: Please do not use any of this to steal. Theft is wrong.
This is meant to allow conversion of Topaz books for other book readers you own

Here are the steps:

1. Unzip the topazscripts.zip file to get the full set of python scripts.
The files you should have after unzipping are:

    cmbtc_dump.py - (author: cmbtc) decrypts and dumps sections into separate files for Kindle for PC
    cmbtc_dump_nonK4PC.py - (author: DiapDealer) for use with standalone Kindle and ipod/iphone topaz books
    decode_meta.py - converts metadata0000.dat to make it available
    convert2xml.py - converts page*.dat, other*.dat, and glyphs*.dat files to pseudo xml descriptions
    flatxml2html.py - converts a "flattened" xml description to html using the ocrtext
    stylexml2css.py - converts stylesheet "flattened" xml into css (as best it can)
    getpagedim.py - reads page0000.dat to get the book height and width parameters
    genxml.py - main program to convert everything to xml
    genhtml.py - main program to generate "book.html"
    gensvg.py - (author: clarknova) main program to create an xhtml page with embedded svg graphics

Please note, these scripts all import code from each other, so please
keep all of these python scripts together in the same place.

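Because the scripts import code from each other, a missing file tends to show up only later as an import error. The following is a minimal sketch, not part of topazscripts.zip, that compares a directory listing against the file list from step 1. The function name missing_scripts and the choice of the current directory are assumptions made for illustration.

```python
import os

# Script names copied from the list in step 1.
EXPECTED_SCRIPTS = [
    "cmbtc_dump.py",
    "cmbtc_dump_nonK4PC.py",
    "decode_meta.py",
    "convert2xml.py",
    "flatxml2html.py",
    "stylexml2css.py",
    "getpagedim.py",
    "genxml.py",
    "genhtml.py",
    "gensvg.py",
]


def missing_scripts(present_names):
    """Return the expected script names absent from present_names, in list order."""
    present = set(present_names)
    return [name for name in EXPECTED_SCRIPTS if name not in present]


if __name__ == "__main__":
    # Assumption: this is run from the directory where the zip was unpacked.
    missing = missing_scripts(os.listdir("."))
    if missing:
        print("Missing scripts: " + ", ".join(missing))
    else:
        print("All scripts are present.")
```

The check is kept as a pure function over a list of names so it can be pointed at any directory listing.
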
2. Remove the DRM from the Topaz book and build a directory
of its contents as files

All thanks go to CMBTC, who broke the DRM for Topaz. Without that, nothing else
would be possible.

If you purchased the book for Kindle For PC, you must do the following:

    cmbtc_dump.py -d -o TARGETDIR [-p pid] YOURTOPAZBOOKNAMEHERE

However, if you purchased the book for a standalone Kindle or ipod/iphone
and you know your pid (at least the first 8 characters), then you should
instead do the following:

    cmbtc_dump_nonK4PC.py -d -o TARGETDIR -p 12345678 YOURTOPAZBOOKNAMEHERE

where 12345678 should be replaced by the first 8 characters of your PID

This should create a directory called "TARGETDIR" in your current directory.
It should have the following files and directories in it:

    metadata0000.dat - metadata info
    other0000.dat - information used to create a style sheet
    dict0000.dat - dictionary of words used to build page descriptions
    page - directory filled with page*.dat files
    glyphs - directory filled with glyphs*.dat files

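Before moving on to the later steps, it can help to confirm that TARGETDIR has the layout listed above. The following is a minimal sketch, not part of the original scripts, that reports which of the listed entries are absent. The function name missing_entries is hypothetical, and the sketch only checks that entries exist, not that their contents are valid.

```python
import os

# Entry names copied from the listing above.
EXPECTED_FILES = ["metadata0000.dat", "other0000.dat", "dict0000.dat"]
EXPECTED_DIRS = ["page", "glyphs"]


def missing_entries(target_dir):
    """Return the expected files and directories not found in target_dir.

    Directories are reported with a trailing "/" to tell them apart from files.
    """
    missing = [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(target_dir, name))
    ]
    missing += [
        name + "/" for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(target_dir, name))
    ]
    return missing


if __name__ == "__main__":
    # "TARGETDIR" is the placeholder directory name used in this README.
    problems = missing_entries("TARGETDIR")
    if problems:
        print("Missing from TARGETDIR: " + ", ".join(problems))
    else:
        print("TARGETDIR has the expected layout.")
```
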
3. REQUIRED: Create xhtml page descriptions with embedded svg
that show the exact representation of each page as an image
with proper glyphs and positioning.

This step must NOW be done BEFORE attempting conversion to html

    gensvg.py TARGETDIR

When complete, use a web-browser to open the page*.xhtml files
in TARGETDIR/svg/ to see what the book really looks like.

If you would prefer pure svg pages, then use the -r option
as follows:

    gensvg.py -r TARGETDIR

All thanks go to CLARKNOVA for this program. This program is
needed to actually see the true image of each page and so that
the next step can properly create images from glyphs for
monograms, dropcaps and tables.

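To find the page*.xhtml files that step 3 writes to TARGETDIR/svg/, a small helper can list them in name order. This is a sketch under assumptions: the function name svg_pages is hypothetical, and sorting by name gives page order only if the generated names sort that way.

```python
import fnmatch
import os


def svg_pages(names):
    """Return the names matching page*.xhtml, sorted by name."""
    return sorted(n for n in names if fnmatch.fnmatchcase(n, "page*.xhtml"))


if __name__ == "__main__":
    # "TARGETDIR" is the placeholder directory name used in this README.
    svg_dir = os.path.join("TARGETDIR", "svg")
    if os.path.isdir(svg_dir):
        for name in svg_pages(os.listdir(svg_dir)):
            print(os.path.join(svg_dir, name))
    else:
        print("No svg directory found; run gensvg.py first.")
```

Each printed path can then be opened in a web browser.
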
4. Create "book.html" which can be found in "TARGETDIR" after
completion.

    genhtml.py TARGETDIR

***IMPORTANT NOTE*** This html conversion cannot fully capture
all of the layouts and styles actually used in the book,
and the resulting html will need to be edited by hand to
properly set bold and/or italics, handle font size changes,
and to fix the sometimes horrific mistakes in the ocrText
used to create the html.

If there are critical pages that need fixed layout in your book,
you might want to consider forcing these fixed regions to
become svg images by using this command instead:

    genhtml.py --fixed-image TARGETDIR

This will convert all fixed regions into svg images at the
expense of increased book size, slower loading speed, and
a loss of the ability to search for words in those regions.

FYI: Sigil is a wonderful, free cross-platform program
that can be used to edit the html and
create an epub if you so desire.

5. Optional Step: Convert the files in "TARGETDIR" to their
xml descriptions which can be found in TARGETDIR/xml/
upon completion.

    genxml.py TARGETDIR

These conversions are important for allowing future (and better)
conversions to come later.

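Steps 3 to 5 have a required order: gensvg.py must run before genhtml.py, and genxml.py is optional. The following is a minimal sketch, not part of the original scripts, that builds the command lines in that order without running them. The function name conversion_commands and the interpreter name "python" are assumptions; use whichever interpreter runs the scripts on your system.

```python
def conversion_commands(target_dir, fixed_image=False, with_xml=False, python="python"):
    """Build the command lines for steps 3 to 5, in the order the steps require.

    python is an assumed interpreter name, not something the README specifies.
    """
    gensvg = [python, "gensvg.py", target_dir]

    genhtml = [python, "genhtml.py"]
    if fixed_image:
        # Step 4 option: turn fixed regions into svg images.
        genhtml.append("--fixed-image")
    genhtml.append(target_dir)

    # gensvg.py comes first because step 3 must be done before step 4.
    commands = [gensvg, genhtml]
    if with_xml:
        # Step 5 is optional.
        commands.append([python, "genxml.py", target_dir])
    return commands


if __name__ == "__main__":
    for cmd in conversion_commands("TARGETDIR", fixed_image=True, with_xml=True):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.check_call from the directory that holds the scripts, which stops at the first step that fails.
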