
    $Id: README.txt,v 1.6 2004/11/27 22:33:27 bigphil Exp $

    LLIAPHON : a compact phonetization system derived from LIA_PHON
    ---------------------------------------------------------------
    developed under the GPL license by the BigLux project
    contact : biglux@culte.org

    based on:
    LIA_PHON : a complete text phonetization system
    -----------------------------------------------

    Copyright (C) 2001 FREDERIC BECHET

    ..................................................................

    This file is part of LIA_PHON

    LIA_PHON is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
    ..................................................................

    For any publication related to scientific work using LIA_PHON,
    the following reference paper must be mentioned in the bibliography: 
        
    Bechet F., 2001, "LIA_PHON - Un systeme complet de phonetisation
    de textes", revue Traitement Automatique des Langues (T.A.L.)
    volume 42, numero 1/2001, edition Hermes
    ..................................................................
                              
    Contact :
              FREDERIC BECHET - LIA - UNIVERSITE D'AVIGNON
              AGROPARC BP1228 84911  AVIGNON  CEDEX 09  FRANCE
              frederic.bechet@lia.univ-avignon.fr
    ..................................................................


 ------
 README
 ------

  I   : Contents of the LLIAPHON package
  II  : Installation
  III : Useful commands
  IV  : Warning
  V   : Contact


I - Contents of the LLIAPHON package
------------------------------------

Once you have retrieved the LLiaPhon files, you should have
a directory 'lliaphon' containing the following
files and directories:

  - Makefile : to compile sources and data files of LLIAPHON
  - README   : this file
  - *.c, *.h : source files
  - /bin     : directory which will contain the executable files
  - /data    : directory containing the following files and sub-directories
    - format.cfg : configuration file for accessing data used by the format step
    - phone.cfg  : configuration file for accessing data used by the phonetizer
    - tagger.cfg : configuration file for accessing data used by the tagger
    - /noarch    : directory containing the target independent data files
        - desigle.pron             : rules deciding whether an acronym is spelled out or read as a word
        - epeler_sig.pron          : pronunciation rules for spelled acronyms
        - french01.pron            : pronunciation rules for standard French
        - initfile.lia             : mapping between the MBROLA and LIA_PHON phonetic alphabets
        - lire_sig.pron            : pronunciation rules for read acronyms
        - model_morpho.[un,bi,tri] : 3-letter models for out-of-vocabulary words in POS tagging
        - model_np.[un,bi,tri]     : 3-letter models for the proper-name identification
        - propername_[1-8].pron    : pronunciation rules for different proper-name sets
        - regles_l.pro3            : rules managing the liaison insertion between words
        - rule_phon.pro            : post-processing rules for the phonetic strings
    - /src       : directory containing the source data files to be compiled
        - h_aspi.sirlex            : exception list of words starting with an 'h'
        - lex10k                   : 10000-word lexicon used by the POS tagger
        - list_exep                : pronunciation exceptions to the 'french01.pron' rule database
        - lm3classe.arpa           : ARPA-standard LM for POS tagging produced by CMU-Cambridge SLMT
    - /tools     : directory containing source code for compiling data/src
  - /doc     : documentation about LIA_PHON as follows
        - gnu_gpl.txt              : GNU General Public License Version 2
  - /examples: directory containing the following files
        - dino.txt                 : test file
        - dino.dat*                : intermediate results you should obtain from
                                     the 'dino.txt' test file if everything is OK
        - dino.ola                 : final result you should obtain from the
                                     'dino.txt' test file if everything is OK
        - test.txt                 : test file
        - test.ola                 : result you should obtain from the test file
                                     if everything is OK
  - /man     : top-level directory of the manual page tree
  - /script  : scripts for using and testing LLIAPHON
        - lliaphon_test : to test LLiaPhon with MBROLA
        - play_ola      : to play through MBROLA a .ola file produced by LLiaPhon



II - Installation
-----------------

LLIAPHON is composed of a set of modules all written in
standard C, using only standard libraries. It has been
successfully compiled on the following UNIX environments:
- LINUX

0) Check that you have a "configure" script. If not, first read the file
README.dev, which contains the instructions to follow if you downloaded
LLiaPhon from its CVS server. When done, continue with the steps below.

1) Configure the package with the command './configure'
   See the 'INSTALL' file for instructions on how to configure
   target directories or compilation options.

2) Compile the package (binaries and resources) with the command 'make'

3) You may check the result of the build process by running 'make check'

4) Install the package with the command 'make install'
   You may need to become super-user for this step
   if the target directories do not belong to your current account.

5) You may play a file phonetized by lliaphon using the utility scripts/play_ola.
To do so you need mbrola and a diphone database, which you designate
by setting the environment variable MBROLA_VOICE.
A truly free audio player is under development.

That's all!

To remove the compiled files, just run:

make clean
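
Steps 0 to 3 above can be chained in a small shell function. This is only
a sketch: the function name build_lliaphon is hypothetical and not part of
the package, and 'make install' is left out on purpose because it may
require super-user rights.

```shell
# Hypothetical helper (not shipped with LLIAPHON): run the build steps in order
# and stop at the first one that fails.
build_lliaphon() {
    # Step 0: a CVS checkout has no configure script yet (see README.dev).
    if [ ! -x ./configure ]; then
        echo "no ./configure in $(pwd): follow README.dev first" >&2
        return 1
    fi
    # Steps 1 to 3: configure, compile, then run the package self-check.
    ./configure && make && make check
    # Step 4 ('make install') is deliberately not run here.
}
```

Call it from the top-level 'lliaphon' directory, then run 'make install'
yourself once the check has passed.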



III - Useful commands
---------------------

LIA_PHON contains a set of scripts that transform a raw
text into its phonetic form, which can be used as an input
for the speech synthesizer MBROLA. There are four main steps in
the original LIA_PHON process:
- Formatting the text
- POS tagging + accentuation
- Grapheme-to-Phoneme transcription
- MBROLA format

The main goal of LLIAPHON is to produce a fully phonetized output
that can easily be converted to an audio signal.
At present this output is the input format of the MBROLA diphone synthesizer.

LLIAPHON nevertheless aims to remain traceable to most of the main steps
of the original LIA_PHON.
It therefore produces intermediate files corresponding to the output of
some of the numerous scripts provided with the original LIA_PHON 1.1.


1) MBROLA format
   .............
In order to use the MBROLA speech synthesizer
(see http://tcts.fpms.ac.be/synthesis/ for more details),
the main output of 'lliaphon' is formatted into MBROLA
format, by adding some (minimal) prosodic information
to the phonetic output.
WARNING: this prosodic generation is only there to avoid
a 'flat' voice; it does not claim
to reflect a 'realistic' prosody!

To obtain this result you must use the command:

lliaphon file[.txt]
   - input  : file or file.txt
   - output : file.ola
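
The naming rule above (an optional trailing .txt is replaced by .ola) can
be written as a one-line shell helper, which is handy when scripting around
lliaphon. The helper name ola_path is hypothetical, and the sketch assumes
the result is written next to the input file, which this README does not
state explicitly.

```shell
# Hypothetical helper: print the .ola name matching an input given as
# 'file' or 'file.txt'. Assumes the output lands beside the input.
ola_path() {
    # ${1%.txt} strips one trailing ".txt" if present, otherwise changes nothing.
    printf '%s.ola\n' "${1%.txt}"
}
```

A script could then run 'lliaphon dino.txt' and test for the file named by
"$(ola_path dino.txt)" before going further.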


2) How to play a phonetized file ?
   ..............................
The MBROLA format can be played at adjustable speed and frequency with
the following script.
WARNING : The location of the MBROLA database you use must be customized
to match your actual installation.
This database must be downloaded from http://tcts.fpms.ac.be/synthesis.
WARNING : The ".ola" format produced by LLIAPHON does not use the default
MBROLA phoneme encoding, so the following script uses
the $LLIAPHON/data/noarch/initfile.lia transcoding table.
WARNING : This script requires access to the "play" command provided with the
"sox" free software package (SOund eXchange : a universal sound sample translator).

To play the contents of a phonetized file just use :

play_ola [-t time_ratio] [-f frequency] file.ola
  - input  : file.ola
  - output : audio output

example : play_ola -t 0.72 -f 1.4 dino.ola
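
Before calling play_ola it can be useful to check the prerequisites listed
in the warnings above. A minimal sketch, assuming the MBROLA binary is
installed under the name 'mbrola' and that the diphone database is
designated through MBROLA_VOICE as described in the installation section;
the function name check_play_prereqs is hypothetical:

```shell
# Hypothetical pre-flight check before running play_ola.
# Returns 0 when everything looks available, 1 otherwise.
check_play_prereqs() {
    missing=0
    # 'mbrola' is the synthesizer, 'play' comes with the sox package.
    for tool in mbrola play; do
        command -v "$tool" > /dev/null 2>&1 || {
            echo "missing command: $tool" >&2
            missing=1
        }
    done
    # The diphone database is designated through this variable.
    [ -n "${MBROLA_VOICE:-}" ] || {
        echo "MBROLA_VOICE is not set" >&2
        missing=1
    }
    return "$missing"
}
```

It only reports what is missing; it does not verify that MBROLA_VOICE
designates a usable database.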


3) How to test LLIAPHON
   ....................
To help us improve the quality of the output produced by LLIAPHON,
please tell us about the cases where this software fails.
To help you record the problems you encounter, you are encouraged to
use the following script :

lliaphon_test [-l] file[.txt]
  - input  : file[.txt]
  - output : file.line.err, file.line.cmt

This script synthesizes the input file one paragraph at a time,
or one line at a time with option -l, and asks you to type a comment
if the result is not satisfactory.
If you do so, your comment is saved in a file.line.cmt file, where line
is the line at which the unsatisfactory piece of text starts,
and that piece of text is saved in a file.line.err file.
Please send us these files (preferably in compressed form)
if the issue is not described in the BUGS file included in the LLIAPHON package.
For large data sets (more than 50 KBytes), first contact biglux@culte.org
to ask for a developer who will be able to receive and process your data.
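
The report files can be bundled into one compressed archive before sending.
The following is a sketch only: the function name pack_reports and the
archive name are hypothetical, 50 KBytes is taken to mean 51200 bytes, and
a tar with gzip support (option -z) is assumed.

```shell
# Hypothetical helper: bundle file.line.cmt and file.line.err reports for one
# input stem (e.g. "dino") into <stem>-reports.tar.gz and report its size.
pack_reports() {
    stem=$1
    archive="$stem-reports.tar.gz"
    # Unmatched patterns stay literal, so keep only names that really exist.
    set --
    for f in "$stem".*.cmt "$stem".*.err; do
        [ -e "$f" ] && set -- "$@" "$f"
    done
    [ "$#" -gt 0 ] || { echo "no report files for $stem" >&2; return 1; }
    tar -czf "$archive" "$@" || return 1
    size=$(( $(wc -c < "$archive") ))
    if [ "$size" -gt 51200 ]; then
        echo "$archive is $size bytes: contact biglux@culte.org before sending"
    else
        echo "$archive is $size bytes"
    fi
}
```

Run it in the directory where lliaphon_test wrote its files, for example
'pack_reports dino', and check the BUGS file before sending anything.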


4) Intermediate LIA_PHON-like files
   ................................
LLIAPHON produces intermediate files which are meant to be
(nearly ?) equivalent to some of the outputs produced by LIA_PHON 1.1
using the lex10k lexicon :

file.dat1 : similar to the output of the former lia_nett script
file.dat2 : similar to the output of the former lia_taggreac script
            without option -reacc
file.dat3 : similar to the output of the former lia_phon script

For more information on this topic please first refer to the LIA_PHON
documentation.
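
To see whether a run reproduces the reference results shipped in the
examples directory, the intermediate and final files can be compared byte
for byte. A sketch under stated assumptions: the function name
compare_with_examples is hypothetical, the reference files matching
'dino.dat*' are assumed to be named dino.dat1, dino.dat2 and dino.dat3, and
lliaphon is assumed to write its outputs next to the input file.

```shell
# Hypothetical helper: compare <stem>.dat1/.dat2/.dat3/.ola in a work directory
# against the reference copies. Returns 1 if any file differs or is missing.
compare_with_examples() {
    work_dir=$1
    ref_dir=$2
    stem=$3
    status=0
    for ext in dat1 dat2 dat3 ola; do
        # cmp -s is silent and fails on any difference or unreadable file.
        if cmp -s "$work_dir/$stem.$ext" "$ref_dir/$stem.$ext"; then
            echo "$stem.$ext: identical"
        else
            echo "$stem.$ext: differs or missing"
            status=1
        fi
    done
    return "$status"
}
```

A typical use would be to copy dino.txt to a scratch directory, run
lliaphon on it there, and call
'compare_with_examples scratch examples dino'.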


IV - Warning
------------

Be careful with the accent encoding of the files!
Check that the accented characters in the rule file 'french01.pron' are
correctly encoded (ISO 8859-1). If not, you have to use an
encoding converter to fix them. This conversion MUST
be applied to ALL the files in the directories /data
and /src.
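
One way to detect and repair a wrongly encoded file is the standard iconv
utility. The sketch below assumes the wrong encoding is UTF-8, which is
only the most common case; both function names are hypothetical.

```shell
# Succeeds when the file has bytes above 0x7F and still decodes as UTF-8,
# a strong hint that it is NOT stored as ISO 8859-1.
looks_like_utf8() {
    # Count the bytes that remain once all 7-bit characters are deleted.
    high_bytes=$(( $(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c) ))
    [ "$high_bytes" -gt 0 ] && iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Rewrite one file in place as ISO 8859-1; the original is kept on failure.
to_latin1() {
    if iconv -f UTF-8 -t ISO-8859-1 "$1" > "$1.latin1"; then
        mv "$1.latin1" "$1"
    else
        rm -f "$1.latin1"
        return 1
    fi
}
```

Applied to one file this reads: looks_like_utf8 french01.pron && to_latin1
french01.pron. Keep a backup before looping over a whole directory, since
the conversion is done in place.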


V - Contact
-----------

If you have any problem, question or suggestion,
or if you would like to join our development and testing effort,
please subscribe to the BigLux mailing list from the following Web page :
http://www.culte.org/listes/

or directly send an email to :
biglux-subscribe@culte.org

and then ask your question by email to :
biglux@culte.org

