------------------------------------
CPT Dictionary 1.0

Linux (x86,glibc) version, Java JDK/JRE
1.1, 1.2 or 1.3 required.

Shareware, free for non-commercial use.
Freely distributable.

Updated: 20-June-2000
------------------------------------

DESCRIPTION
-----------
CPT Dictionary is a browser for dictionary files (CTrees)
created by the program CPT Word Lists 1.0.

Features:
- browsing/searching in any standard encoding including
  decomposition and bidi support;
- creates a display list of all words or clues (definitions);
- supports inverted indexes;
- many options for keyboard input, searching,
  and the information to be shown;
- can be localized by the user.

The distribution contains two sample dictionaries,
just to test the installation:
- "The Unofficial Smiley Dictionary" is a CTree with
clues (smile.dic).
- "2000K" is an artificial word list of two million words,
stored in a 9 KB file (2000K.wlz).

For now, this file is the only documentation. For
details, see the description of CPT Word Lists.

SYSTEM REQUIREMENTS
-------------------
- Supported OS: Linux (x86). Tested on
  Red Hat 5.x, 6.x, and compatible distributions.
  (On Corel Linux 1.0 we ran into a problem
   installing the JRE, but it works.)
  For Win 9X and Win NT/2K there is a separate distribution.
- Requires 600 KB of disk space and 32 MB of RAM.
- This is a Java program, so Sun's JDK/JRE 1.1.6 or
  later, or a compatible JVM, is needed;
  see LinuxJRE.txt and JavaFonts.txt.

INSTALL
-------
1. Extract this archive into a temporary directory.

2. Edit the "install" script to point to your Java VM.
   Run it as root/su from the temporary directory
   in an xterm (it runs under X Window).
   This starts the wizard, which installs the
   CPT Dictionary program according to your choices.

3. The installation program (install.class) is a
   self-extracting class file. Its contents are
   extracted during the installation, and two
   directories are created:
   - the target directory chosen for the installation;
   - the <home>/bin directory for the uninstall program
   (see UNINSTALL below).
   If your window manager is recognized
   (KDE, Gnome, Window Maker, Fvwm, ...),
   entries for CPT Dictionary are added
   to the desktop menu.

4. If the installation fails, you can still extract
   the program from the data.zip file.

5. If you have problems running CPT Dictionary,
   check/modify the generated cpt_dc10 script to
   reflect your JDK/JRE environment, especially
   if you installed a new version of the JRE after
   installing this program.

UNINSTALL
---------
To uninstall, do one of the following:

1. Click on the 'uninstall' item added to the
   desktop menu. This works only if the
   installation program recognized your
   window manager and added the menu folder
   'Crossword Power Tools'. If not, see 2.

2. Go to your <home>/bin and run
   ./juninst <inst-dir>/UnInst
   where <inst-dir> is the path where you
   installed the program.

If you have installed a new version of JRE after
the installation of this program, check/modify
'juninst' in your <home>/bin subdirectory.

After the uninstallation the '<home>/bin' directory
will not be removed because it serves all CPT
packages. If you don't have any other CPT
program, you can delete it.


DOCUMENTATION
-------------

A. Introduction

The program can do extremely fast and incredibly slow
searches, depending on the settings. The rules of thumb are:
- when using regular expressions, do not put excessive '*' or '?'
at the beginning of the search pattern - the search is optimized
if the pattern starts with a literal letter;
- do not set 'Unicode Normalization' unless you know what it means
in the specific case (it usually switches off most of the optimizations);
- when the main search list is clues, choose 'Browse Style';
- open the dictionaries in 'Low' memory/speed mode; the other
modes are for users who know what they are doing
(see the documentation of CPT Word Lists).

The rules above matter for big dictionaries
with thousands or millions of words. To be precise,
'extremely fast' means 'less than a second', e.g. searching for a
word in a CTree of 5 million words; 'incredibly slow' means
'more than 10 minutes', e.g. searching for a clue pattern in a packed
dictionary of 150K words with 150K clues in 'Search Style'.
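
The effect of a leading wildcard can be illustrated with a small
sketch. It is not the CTree search used by CPT Dictionary; it
assumes a plain sorted word array and glob-style patterns ('*' for
any run of characters, '?' for one character), and the class and
method names are hypothetical. A literal prefix lets the search jump
to a narrow range with a binary search, while a pattern starting
with '*' or '?' has an empty prefix and forces a scan of every word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the CTree search used by CPT Dictionary.
final class PrefixSearchSketch {

    // Returns the literal characters that precede the first '*' or '?'.
    static String literalPrefix(String pattern) {
        int i = 0;
        while (i < pattern.length()
                && pattern.charAt(i) != '*' && pattern.charAt(i) != '?') {
            i++;
        }
        return pattern.substring(0, i);
    }

    // Glob match over a whole word: '*' is any run, '?' is exactly one character.
    static boolean matches(String p, String w) {
        int pi = 0, wi = 0, star = -1, mark = 0;
        while (wi < w.length()) {
            if (pi < p.length() && p.charAt(pi) == '*') {
                star = pi++;      // remember the star; first try matching it to nothing
                mark = wi;
            } else if (pi < p.length()
                    && (p.charAt(pi) == '?' || p.charAt(pi) == w.charAt(wi))) {
                pi++;
                wi++;
            } else if (star >= 0) {
                pi = star + 1;    // backtrack: the last star absorbs one more character
                wi = ++mark;
            } else {
                return false;
            }
        }
        while (pi < p.length() && p.charAt(pi) == '*') pi++;
        return pi == p.length();
    }

    // sortedWords must be in natural String order and contain no duplicates.
    static List<String> search(String[] sortedWords, String pattern) {
        String prefix = literalPrefix(pattern);
        int start = 0;
        if (!prefix.isEmpty()) {
            int pos = Arrays.binarySearch(sortedWords, prefix);
            start = pos >= 0 ? pos : -pos - 1;   // first word >= prefix
        }
        List<String> hits = new ArrayList<>();
        for (int i = start; i < sortedWords.length; i++) {
            if (!sortedWords[i].startsWith(prefix)) break;  // left the prefix range
            if (matches(pattern, sortedWords[i])) hits.add(sortedWords[i]);
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] words = {"apple", "apply", "banana", "band", "candle"};
        System.out.println(search(words, "app*"));   // visits only the "app" range
        System.out.println(search(words, "*and*"));  // empty prefix: visits every word
    }
}
```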

With these 'special notes' out of the way, here is a short description
of the program.

B. Select Dictionary

After starting the program, click on the leftmost
button to open a dictionary and/or to add a new one to the list.
For now, you can search in only one open file at a time.

The radio button group 'Open selected on start up'
lets you choose one dictionary to be opened automatically,
so you can forget about this dialog. 'None' clears any
selection made, without browsing the whole list.

The radio button group 'RAM used and search speed'
is almost obsolete. In most cases you should select
'Low' (the packed CTrees now have reasonable speed,
and the inverted indexes will force 'Low').
If you select 'High' for a big CTree with clues, you will really gain
speed for multiple searches in 'Search Style', but opening
the dictionary will be very slow.

C. Display Options

The second button on the bar opens a dialog
with the following options:

C.1. Format Tab
- 'Right Alignment' should be set for right-to-left scripts.
- 'Shaping' should be set if you need Arabic shaping or if
  the dictionary is in Thai composed form.
- 'Search Style' creates no display list and shows all
matches found by the search.
- 'Browse Style' creates a display list and selects only the
first match. When you click on a word, you will see
the tags and clues linked to this word.
- 'Search/Browse in Clues' switches the main search list
to the clues, if available.
- 'Browse with Inverted Index' creates/uses a supporting
inverted index when searching in clues. If the file does not
contain an inverted index, its creation could be very slow.
In this mode, when you select a clue (click or search),
you will see all words that have links to this clue.
The main idea behind the inverted index is to use a
dictionary in both directions - e.g. if it is de-en, you can
browse it as en-de.
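
The idea of an inverted index can be sketched in a few lines. This
is only an illustration, assuming the dictionary is held as an
in-memory map from words to their clues; the actual CTree file
format is described in the CPT Word Lists documentation, and the
class name and sample data here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the real index lives inside the CTree file.
final class InvertedIndexSketch {

    // Turns word -> clues into clue -> words, so the dictionary
    // can be browsed in the opposite direction.
    static Map<String, List<String>> invert(Map<String, List<String>> wordToClues) {
        Map<String, List<String>> clueToWords = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : wordToClues.entrySet()) {
            for (String clue : entry.getValue()) {
                clueToWords.computeIfAbsent(clue, k -> new ArrayList<>())
                           .add(entry.getKey());
            }
        }
        return clueToWords;
    }

    public static void main(String[] args) {
        // A tiny de-en dictionary: German words with English clues.
        Map<String, List<String>> deEn = new TreeMap<>();
        deEn.put("Haus", Arrays.asList("house", "home"));
        deEn.put("Heim", Arrays.asList("home"));

        // Browsed as en-de: each English clue lists the German words linked to it.
        System.out.println(invert(deEn));
    }
}
```

Building the inverted map visits every word-clue link once, which
is consistent with the note that creating the index for a large
file can take a long time.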

C.2. Tags Text
Use this tab to select the text of the tags to be shown.
Note: 'Wrap Tags/Clues Lines' should not be selected if
the clues are in Thai or in RTL script stored in visual
order - the wrapping will not be correct.

C.3. Clues Data
Use this tab to select the text of the clues
linked to each word to be shown. The clue types are
represented by the codes and display text of the tags.
When filtering is not supported,
the selection is disabled.
This selection also acts as a filter when the clues
are the main browse list and an inverted index is used.

D. Search Text Field

Here you can enter a word to search for. Simple regular
expressions, bidi, and Unicode notation are supported.
Note that the searching is for words, not strings.
To find a clue entry containing "word", you have to enter
the regular expression "*word*".
The communication with the clipboard is always in Unicode,
and in logical order when the dictionary is stored
in logical order (RTL scripts).

You can use the 'Search' button instead of the <Return>
key to start the search. In 'Browse Style' this button
means 'Find Next' - the search starts from the list
item following the last selected one.

E. Search Options

The first button on the right of the text field will
start a dialog for the keyboard input and search options.

E.1. Input Tab
- 'Allow \uxxxx notation' will transparently convert
the \uxxxx encoded characters to Unicode.
- 'Regular expressions' will switch on this processing.
- 'Keyboard converter' is an option for Linux only. If set,
the encoding selected in the 'Select Font' dialog will
be used to convert the typed 8-bit characters to Unicode.
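
As an illustration of the backslash-u notation, here is a minimal
sketch of such a conversion. It is an assumption about the
behaviour, not the program's actual code: a backslash followed by
'u' and exactly four hexadecimal digits becomes one Unicode
character, and anything else is passed through unchanged. The class
and method names are hypothetical.

```java
// Hypothetical sketch of decoding backslash-u escapes typed in the search field.
final class UnicodeEscapeSketch {

    // True if the four characters starting at 'from' are all hex digits.
    static boolean isHex4(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) return false;
        }
        return true;
    }

    // Replaces each well-formed escape with its character; copies the rest as is.
    static String decode(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("\\u", i)
                    && i + 6 <= input.length()
                    && isHex4(input, i + 2)) {
                out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(input.charAt(i++));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The typed text contains a literal backslash, 'u', and four hex digits.
        String typed = "caf" + "\\" + "u00e9";
        System.out.println(typed + " -> " + decode(typed));
    }
}
```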

E.2. Search Tab
- 'Ignore case' will switch on caseless searching.
- 'Special casing' will switch on the special Unicode
casing when changing the letter case.
- 'Stop on first match' is valid for 'Search Style' mode.
- 'Unicode Normalization' means to apply the selected
normalization to the source text and to the search pattern.
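
The difference between ordinary and special casing can be shown
with standard Java calls. This sketch assumes that 'Special casing'
refers to the Unicode special case mappings (one character may map
to several, and some mappings depend on the language); how CPT
Dictionary applies them internally is not documented here. The
class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch contrasting simple and full (special) case mapping.
final class SpecialCasingSketch {

    // Simple casing: each character maps to exactly one character.
    static String simpleUpper(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            out.append(Character.toUpperCase(s.charAt(i)));
        }
        return out.toString();
    }

    // Full casing: String.toUpperCase applies one-to-many and
    // locale-sensitive mappings.
    static String fullUpper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        String word = "stra\u00dfe";  // German word with a sharp s
        System.out.println(simpleUpper(word));             // sharp s is left as is
        System.out.println(fullUpper(word, Locale.ROOT));  // sharp s becomes "SS"
    }
}
```

A caseless search that uses only simple casing would not match the
two spellings of such a word, which is the kind of case the option
appears to address.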

E.3. Unicode Tab
Use the radio buttons to select the desired Unicode
normalization. The processing for any of the normalizations
is described in the documentation of CPT Word Lists.
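
To show what normalization changes in a comparison, here is a small
sketch using java.text.Normalizer. Note the assumption: this class
exists only in Java 6 and later, so it is not what the program
itself can use on JDK/JRE 1.1-1.3; it merely demonstrates applying
the same normalization form to the source text and to the search
pattern before comparing them. The class and method names are
hypothetical.

```java
import java.text.Normalizer;

// Hypothetical sketch; requires Java 6+ for java.text.Normalizer.
final class NormalizationSketch {

    // Normalizes both sides with the same form, then compares.
    static boolean equalsNormalized(String source, String pattern, Normalizer.Form form) {
        return Normalizer.normalize(source, form)
                .equals(Normalizer.normalize(pattern, form));
    }

    public static void main(String[] args) {
        String composed = "\u00e9";     // e with acute as one code point
        String decomposed = "e\u0301";  // e followed by a combining acute accent

        System.out.println(composed.equals(decomposed));
        System.out.println(equalsNormalized(composed, decomposed, Normalizer.Form.NFC));
    }
}
```

Because every candidate word has to be normalized before it can be
compared, it is plausible that this setting disables optimizations
that rely on comparing stored bytes directly.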

F. Select Font

The 'a' button will start the dialog for selecting
the display font characteristics. For Sun's Java 1.1 the
font list is limited to several fonts. For any other
JVM the list will contain most of the installed
fonts on your OS. Some of the problems with Java
keyboard input could be solved if you set the 'Encoding'.
You can type or paste in the text field any
sample text to see how it will be shown.

G. Quit

Finally, to stop the program, click on the rightmost
button. The current settings for the dictionaries in
the list will be saved.

H. Localization

If you want the program to talk to you in your language,
you have to do the following:

H.1. In cpt_dc10.pr, change the line
ProgramLocale=<locale>
where <locale> is an ISO-639 language code,
optionally followed by "_" plus an ISO-3166 country code.
For example, el or el_GR is for Greek, en for English,
ru for Russian, etc. This is the easy part.
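
The locale value can be interpreted as in the following sketch. It
rests on two assumptions that this file does not confirm: that
cpt_dc10.pr can be read as a Java properties file, and that the
program splits the value at the underscore. It also uses
Locale.Builder, which needs Java 7 or later. The class and method
names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch of reading the ProgramLocale setting.
final class ProgramLocaleSketch {

    // Parses "<language>" or "<language>_<COUNTRY>" into a Locale.
    static Locale parse(String value) {
        String[] parts = value.trim().split("_", 2);
        Locale.Builder builder = new Locale.Builder().setLanguage(parts[0]);
        if (parts.length == 2) {
            builder.setRegion(parts[1]);
        }
        return builder.build();
    }

    // Reads ProgramLocale from properties-style text; falls back to "en".
    static Locale fromProperties(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parse(props.getProperty("ProgramLocale", "en"));
    }

    // Name of the message file that would be looked up for this locale.
    static String messageFileName(Locale locale) {
        return locale.toString() + ".msg";
    }

    public static void main(String[] args) {
        Locale greek = fromProperties("ProgramLocale=el_GR\n");
        System.out.println(greek.getLanguage() + " / " + greek.getCountry());
        System.out.println(messageFileName(greek));
    }
}
```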

H.2. Put a file named '<locale>.msg', containing the
messages in your language, in the 'locale' directory.
Use the 'default.msg' file as a template
for the translation.
There is another 'Readme.txt' file with
instructions in the same directory.

H.3. Ensure that in Java's 'font.properties' file,
the 'dialog.plain.' and 'dialog.bold.' fonts are
assigned to your locale font. This step should
be done for Java 2 (v1.2 and v1.3) as well.

CONTACT
-------
We are very interested in receiving your comments,
suggestions, and bug reports at our email:
cpt.software@usa.net