ECHO ][ SPEECH SYNTHESIZER MINI-MANUAL
TABLE OF CONTENTS
SPEAKING FROM WITHIN AN APPLESOFT PROGRAM
SPEECH EDITOR COMMANDS
SPEECH EDITOR EXAMPLE
APPENDIX - SAMPLE VOCABULARY
Until recently, speech generation by a
micro-computer required a
fair amount of memory storage and hardware. With the advent of "Linear
Predictive Coding" (a mathematical method of simulating the human vocal
tract) the amount of memory needed to store speech was greatly reduced.
Instead of storing the actual speech signal, only those LPC parameters
needed to describe each particular speech sound are stored. This allows
programs to have a large resident vocabulary without having to access a
disk or tape every time an alternate response is needed. As an
illustration, the sample vocabulary supplied with the ECHO ][ contains
all of the letters of the alphabet, numbers, and over 100 other words in
less than 2K of memory.
The heart of the ECHO ][ is Texas Instruments' TMS 5200 speech
processor. This integrated circuit is an upgraded version of the one
used in the Speak & Spell (TM of Texas Instruments) that has been
modified for use with an eight bit processor. The ECHO ][ has been
designed so that all of the features of the TMS 5200 may be used with
the APPLE; however, only RAM based speech is used with the initial
operating system. Empty sockets have been provided for standard
vocabulary ROMs when they become available.
The initial operating system is a RAM based
phoneme system that was
designed to provide flexibility and a further increase in memory
efficiency over straight encoded words. By using the SPEECH EDITOR the
user may create any word or phrase that he desires to have spoken from a
program. This code is in a compact form and contains information on the
sound, pitch, and duration of each phoneme. A second program called
SPEECH GENERATOR is a binary program which interprets this code and
passes the correct parameters to the ECHO ][ to speak the word. Only the
SPEECH GENERATOR (1K bytes) and the actual vocabulary (10 to 20
bytes/word) are needed for a program to speak.
The address of the word to be spoken is
"poked" to the SPEECH
GENERATOR and a call is made to initiate the speech. The sections which
follow discuss in detail how to install the card and the different
components of the software system and how they are implemented.
Following that is a step by step example of how to use the speech editor
to create words and then a short program of how to access them from
APPLESOFT BASIC. It is suggested that you read over the next sections
first and then work through the example to become familiar with the
system. You may also want to list the sample programs (RECITE and
TALKING TYPEWRITER) or examine portions of the sample vocabulary with
the SPEECH EDITOR as further examples.
Before installing the ECHO ][ be sure all
power is disconnected
from the computer. The ECHO ][ card may be plugged into any of slots 2
thru 5 of the APPLE ][. The speaker cord should be attached to the
terminals on the back of the speaker and then plugged into the jack on
the back of the ECHO ][ card. Replace the cover and the installation is
complete. There is a short subroutine located within the SPEECH
GENERATOR which will determine which slot the ECHO ][ is located in.
This should be "called" at the start of a program before any speech is
attempted and will be discussed in the next section.
The SPEECH EDITOR disk is a 13 sector disk
copied using DOS 3.2.1
and will not run on a DOS 3.3 system without first using the BOOT13
utility. It is suggested that a backup disk be made as soon as
possible to protect its contents. If you have DOS 3.3 you may "muffin"
it at this time.
The SPEECH GENERATOR is a 1K binary module
that contains the actual
phoneme codes, routines for processing these codes along with their
variables (pitch, length, and volume), and the routine for locating the
ECHO ][ slot. If you "catalog" the supplied disk, you will see four
different versions of the SPEECH GENERATOR. Each version is resident in
a different portion of memory to accommodate the HIRES pages and
different size systems. The locations of these routines and their
associated entry points are listed in TABLE 1 at the end of the manual.
The Speak routine takes the compressed
speech data beginning at the
starting address (specified by the "calling" program), processes it,
and then outputs it to the ECHO ][ for speaking. It will keep
processing successive bytes of information until it comes across an
"end" command ( HEX "AC" ) which is tacked onto the end of each word by
the SPEECH EDITOR. At that point speech is terminated and control is
returned to the main program.
The SETSLT routine actually "looks" for the ECHO
][ card and then
modifies the Speak routine accordingly. This routine should be called
at the start of any speech program since different programs may be
using different locations for the SPEECH GENERATOR. If your card is
installed in slot 5 you don't really need to use the SETSLT routine,
however if you change the location of the ECHO ][ card the program will
not function properly.
The SETSLT routine is also useful for determining
whether there is
an ECHO ][ card installed in the system. That way a program where
speech is an enhancement but not a necessity may still be run without
the speech. To do this a "PEEK" needs to be made to the location
called "SLOT" (see Table 1). If the SETSLT routine cannot find an ECHO
][ card it will set this location to 16 (10 Hex). An example is listed
below:
10 LOBYTE = 16384:HIBYTE = 16385:SPEAK = 16386:NXTSPK = 16398:SLOT = 16413:SETSLT = 17313
20 ECHO = 1: CALL SETSLT:X = PEEK (SLOT): IF X = 16 THEN ECHO = 0
In the above listing a flag labeled "ECHO" was
set to one if a
speech card was present or zero if there wasn't. This may be used
later in the program to bypass speech routines which could cause the
program to "hang" if no card was being used.
SPEAKING FROM WITHIN AN APPLESOFT PROGRAM
In order for the SPEECH GENERATOR to say a word,
it has to know the
starting address of the word. Since BASIC deals with decimal numbers
and the SPEECH GENERATOR deals with binary numbers, the address will
have to be split into two portions and then poked to the SPEECH
GENERATOR with two separate pokes. For convenience the addresses for
these pokes have been labeled "HIBYTE" and "LOBYTE" and are listed in
TABLE 1. A short routine to accomplish this is shown below:
100 AH = INT (ADD / 256):AL = ADD - AH * 256
110 POKE HIBYTE,AH: POKE LOBYTE,AL
Once that has been accomplished, a call to the SPEAK routine will
cause the word to be spoken. From a binary program the same thing may
be accomplished with two STA instructions followed by a JSR.
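The arithmetic behind the two pokes can be checked away from the APPLE. The sketch below is written in Python purely as an illustration; the function name split_address is hypothetical and is not part of the ECHO ][ software. It mirrors the INT (ADD / 256) calculation described above.

```python
def split_address(address):
    """Split a 16-bit address into (high byte, low byte).

    Mirrors AH = INT (ADD / 256) : AL = ADD - AH * 256.
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    high, low = divmod(address, 256)
    return high, low


# Starting address of "AND" in the sample vocabulary (see the appendix).
high, low = split_address(17434)
print(high, low)
```

The high value is the one poked to HIBYTE and the low value is the one poked to LOBYTE.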
Words may be broken up into separate and distinct
phonemes. The ECHO ][ SPEECH EDITOR uses a set of forty-one possible
phonemes along with two different types of pauses and a stop command
(automatically appended at the end of words). In general, voiced
sounds (see Table 2) have variable pitch, duration, and volume.
Unvoiced sounds (see Table 3) have these variables preset. There
are sixteen different pitch levels available for voiced sounds. These
range from one (highest) to sixteen (lowest). Varying the pitch allows
the computer to ask questions or make exclamations. If the pitch is all
one level, the speech will have a monotonic or robotic sound.
The length of each voiced sound may be specified
as being from one
to eight 25 millisecond "frames" long. Unvoiced sounds are preset to be
anywhere from two to five frames long depending on the sound. The
"PA1" is the exception. This stops speech activity for 25 to 200
milliseconds, specified in 25 millisecond increments. The primary use of
the "PA1" is between words within a phrase or before stop plosives
("B","K","T",etc.). The "PA" pause gives a delay of 25 milliseconds;
however, there is still some sound occurring during this period, although
it is faint. There are eight available volume levels ranging from
one (softest) to eight (loudest). The usual range is from five to eight
for vowel sounds except when tapering off at the end of some words.
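Since each frame is 25 milliseconds, the spoken length of a group of voiced sounds or "PA1" pauses can be estimated from the LENGTH column of the editor. The following is a rough sketch in Python, given only to illustrate the arithmetic; the function names are hypothetical, and unvoiced sounds are left out because their preset lengths are not listed individually.

```python
FRAME_MS = 25  # one frame lasts 25 milliseconds


def sound_duration_ms(frames):
    """Duration of one voiced sound or "PA1" pause, given its length in frames."""
    if not 1 <= frames <= 8:
        raise ValueError("length must be from one to eight frames")
    return frames * FRAME_MS


def total_duration_ms(lengths):
    """Total duration of several sounds given their LENGTH values."""
    return sum(sound_duration_ms(frames) for frames in lengths)


# Three sounds with lengths of 3, 2 and 1 frames.
print(total_duration_ms([3, 2, 1]))
```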
Many commonly used sounds are made up of a combination of phonemes.
An example is the sound "oh". To produce this sound an "O1" sound must
be followed by an "O2" sound. Some other examples are "eye"
("AH","I","E") and "oooh" ("U1","U2").
The SPEECH EDITOR is an APPLESOFT program which
allows you to
construct custom words and phrases for the ECHO ][. Basically, it
arranges the sounds according to line numbers. These lines may be added
to, deleted, modified, and inserted as necessary during word
construction. When the word is finished it may be "saved" to RAM which
also will assemble it into the format the SPEECH GENERATOR requires.
From there it may also be "saved" to the disk for later use. The word or
phrase may be spoken at any time during the process to verify it for
the correct sounds. The EDITOR commands are described in detail below
and are also listed in TABLE 4. Only those letters enclosed in
parentheses actually need to be typed in for the command to be
recognized.
There are two modes which the EDITOR operates
in. In the command
mode, you will be prompted by a "#" and you may enter any of the
commands listed below. In the add mode you will be expected to provide
a sound or number specifying one of the variables. If you type a
letter when a number is expected you will be asked to "RETYPE?". To
exit the add mode and return to the command mode press the "RETURN" key
when the cursor is in the "SOUND" column.
SPEECH EDITOR COMMANDS
(A)DD - This command puts you in the add mode
and allows you to add
sounds to the end of the current word or phrase. You will be asked for
the sound for each line and also the variables if it is a voiced sound
or "PA1". To exit this mode press the "RETURN" key when the cursor is
in the sound column.
(AP)PEND - You may add a word or phrase from
memory to the end of
the current word or phrase. Keep in mind that there is a maximum of
forty lines for the current word or phrase. To construct a longer
phrase see the section on phrase construction.
(C)ATALOG - This causes a DOS catalog of the
current disk drive and
then returns you to the command mode.
(D)ELETE - When this command is entered you will
be asked which line
number you wish to delete. That line will be deleted and all
subsequent lines will be shifted down one line to fill its place.
(END) - This exits the SPEECH EDITOR, clears
the screen, and returns
you to APPLESOFT.
(I)NSERT - If you wish to add lines within a
word use this command.
You will be asked which line you wish to insert the new line(s) in
front of. This command puts you in the add mode however all new lines
are inserted within the word rather than at the end. To exit press
the "RETURN" key when the cursor is in the sound column.
(L)IST - Re-lists the current word or phrase.
If you wish to pause
during the listing (useful if there is more than one screen of text)
you may press the "SPACE BAR" and the listing will be halted. To resume
the listing press the "SPACE BAR" again. This is similar to pressing
CTRL-S when listing APPLESOFT programs.
(LO)AD - When this is entered you will be asked
whether you wish to
load code from the current disk drive or if you wish to load text from
the memory into the current word buffer. If you are accessing the disk,
you will be asked for the name of the file along with the address to
load it into. If you are loading text from memory you will have to
specify the starting address. It will then load up to forty lines until
it encounters a stop command within the text. If there are more than
forty lines you will get a beep and a "*BUFFER FULL" warning.
(M)ODIFY - This allows you to modify a line that
has previously been
entered. It is essentially the same as a "DELETE" command followed by
an "INSERT" command. You will be asked which line you wish to modify.
You will enter the add mode and all new lines will be inserted at that
line.
(N)EW - Clears the current word buffer so you
may start formation of
a new word. You will be asked if it is OK to clear. Any response other
than a "Y" or a "YES" will abort the command.
(PR)INT - If you wish to make a hard copy of
the current word or
phrase makeup use this command. You will be asked to type in the title
which will be printed at the top of the listing. All output is printed
to Slot #1.
(SA)VE - You may save the current text to memory
or code within
memory to the current disk drive. If you are saving text you will be
asked for the starting address to save it to. Keep a record of this and
how many bytes are saved (it tells you) for future reference. A stop
command is automatically added to the end of the word as it is saved.
This is included in the total number of bytes that it tells you have
been saved. If you are saving code to the disk, you must specify the
file name, the starting address, and the number of bytes to be saved.
(SP)EAK - By entering this command you may hear
whatever is in the
current buffer. This is useful for "debugging" words during
construction. Like all other commands, this command is only available
when in the command mode.
(SPM)EMORY - This will speak words or phrases
that have been
previously stored in memory. You will be asked for the starting address
at which time whatever is stored there will be spoken.
There are a few different ways in which words
may be strung together
to form phrases. For a short phrase you will want to load or enter the
first word into the SPEECH EDITOR buffer and then append each additional
word. You will then want to go back and insert a "PA1" in between each
word.
For longer phrases that include more than forty
lines, each new word
will have to be saved into memory directly following the previous one.
Keep in mind that previously saved words will have a stop command
tacked onto the end of them so save the new word one byte short of the
actual calculated address (starting address of the previous word plus
the number of bytes saved). You will also want to start each new word
with a "PA1" so that there will be a pause between the words. Do not
put the "PA1" at the end of the old word because it may cause the
system to "hang" when it is spoken from the SPEECH EDITOR.
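The "one byte short" rule above comes down to a single subtraction. Here is a small sketch in Python, given only to illustrate the arithmetic; the function name next_word_address is hypothetical.

```python
def next_word_address(previous_start, previous_bytes_saved):
    """Address at which to save the next word of a long phrase.

    The byte count reported by SAVE includes the stop command, so the
    new word is saved one byte short of (start + bytes saved), which
    overwrites the previous word's stop command.
    """
    return previous_start + previous_bytes_saved - 1


# A word saved at 19400 that occupied 56 bytes, stop command included.
print(next_word_address(19400, 56))
```

With these figures the previous word's stop command sits at 19455, and that is the address at which the next word would be saved.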
One other way of producing a longer phrase is
that used in the
sample program "RECITE" on the disk. Unlike the method above, the stop
commands are not eliminated and no "PA1" pauses are inserted. The
starting address of the first word is given to the SPEECH GENERATOR and
it is spoken in the normal fashion. Then for each successive word to be
spoken a call is made to the "NXTSPK" routine. The SPEECH GENERATOR
will already be pointing to the next byte in memory after speaking the
previous word so it will already have the address of the next word. To
use this type of phrase you must know how many total words are to be
spoken and then do the same number of calls to the "SPEAK" and "NXTSPK"
routines.
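The way the pointer moves from one word to the next can be pictured with a toy model. The Python sketch below is not the SPEECH GENERATOR itself: the function names are hypothetical, the data bytes are placeholders rather than real phoneme codes, and it assumes that the end command ($AC) never appears as an ordinary data byte inside a word.

```python
END = 0xAC  # the "end" command appended to each word by the SPEECH EDITOR


def speak_from(memory, start):
    """Consume bytes until the end command; return the address just after it."""
    address = start
    while memory[address] != END:
        address += 1
    return address + 1


def recite(memory, first_word, word_count):
    """Return the starting address used for each of word_count successive words."""
    starts = []
    address = first_word
    for _ in range(word_count):
        starts.append(address)
        address = speak_from(memory, address)
    return starts


# Three back-to-back "words" made of placeholder bytes.
memory = bytes([0x01, 0x02, END, 0x03, END, 0x04, 0x05, 0x06, END])
print(recite(memory, 0, 3))
```

In this model the first starting address plays the part of the address poked before the call to "SPEAK", and each later one is the address the pointer is left at for a call to "NXTSPK".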
The sample vocabulary on the enclosed disk contains words,
letters, and numbers for use from within your programs or as examples on
coding your own words. The file name of the code is "VOCABULARY" and
should be loaded into address 17408. A complete listing of the words
and their starting addresses is given in the appendix at the back of
this manual. "VOCABULARY" may be loaded into other parts of memory, but the
starting addresses will have to be modified accordingly when accessing
words from a program.
There are two sample programs provided on the disk.
The first one,
"TALKING TYPEWRITER" will say each letter and number as it is typed on
the keyboard. The second one, "RECITE" will say each word of the sample
vocabulary. Both of these programs are APPLESOFT programs and are run in
the usual manner.
If you try to speak a phrase that begins or ends
with a "PA1" or has
two "PA1's" embedded in it, the entire program may "hang". It may also
"hang" if you give it the starting address of some other data rather
than phoneme encoded data. When this occurs the only way to regain
control of the computer is to press reset. If you are using the speech
editor you may return to the program with variables intact by entering
"GOTO 1000". You will have to re-list the current word or phrase and
you will no longer have the headings at the top of the screen.
There is another problem that may occur anytime
after the above
situation occurs or if RESET is pressed when the ECHO ][ is talking.
The next time SETSLT is called to find which slot the ECHO ][ is in, it
probably won't find it. There are two ways to get around this without
having to turn off the computer and reboot from scratch. One is to
always install the ECHO ][ card in slot 5 and never use the SETSLT
routine. The other is to POKE "255" to one of the addresses which pulls
the DEVICE SELECT (PIN 41) low on the slot the ECHO ][ is in.
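This manual does not list the addresses themselves. On a standard APPLE ][ the DEVICE SELECT line of slot n is pulled low by accesses to the sixteen locations starting at $C080 + $n0, and the Python sketch below works out that starting address in decimal so it can be used in a POKE. The function name is hypothetical, and the mapping is taken from the general APPLE ][ slot layout rather than from the ECHO ][ documentation, so treat it as an assumption to be verified on your system.

```python
def device_select_base(slot):
    """First of the 16 addresses that pull DEVICE SELECT low for a slot.

    Assumes the standard APPLE ][ peripheral I/O layout ($C080 + $n0).
    """
    if not 0 <= slot <= 7:
        raise ValueError("slot must be from 0 to 7")
    return 0xC080 + 16 * slot


# An ECHO ][ card installed in slot 5.
print(device_select_base(5))
```

Under that assumption, a card in slot 5 would be addressed at 49360 ($C0D0), so the recovery step would be POKE 49360,255.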
SPEECH EDITOR EXAMPLE
In this section we will use the SPEECH EDITOR
to generate and save
the phrase "an Apple ][ computer" and then write a short APPLESOFT
program to say the larger phrase "This is an Apple ][ computer". Before
proceeding you should install the ECHO ][ card according to the
directions previously given.
To begin you should boot up the supplied disk
and run the SPEECH
EDITOR program. After it has finished loading from the disk your screen
should be blank except for the headings at the top and you should be
prompted with a "#". Whenever this prompt is displayed the program is
waiting for a command. For clarity, in this example we will always list
an entire command rather than just the first letter(s). All commands
are followed by a <CR>.
Since you will be using some of the words from the sample
vocabulary, you will have to first load it from the disk into memory.
To do this type in the command: LOAD. You are then given two options:
to load code from the disk or to load text from memory. You want to
load code from the disk so enter "1". Next you will be asked for the
file name. The sample vocabulary is saved under the name of "VOCABULARY"
so type this in. When it asks for what address to load it into type in
"17408". All addresses listed in the back of this manual assume that
the vocabulary has been loaded into this location.
After the file has been loaded you should be back in the command
mode of the editor and the "#" should reappear. The first word of our
phrase is "an". This is not one of the words in the sample vocabulary
but can easily be made by modifying the word "and". Once again you will
want to use the LOAD command, however this time you will want to use
option "2" instead of "1". When asked for the address to load the text
from you should enter the address listed in the appendix for "and".
This is "17434" so type it in now. After a brief pause your screen
should appear as follows:
LINE# SOUND PITCH LENGTH VOLUME
17 PA1 2
19 UH 7 3 6
20 M 8 2 6
21 PA1 1
23 Y 4 1 6
24 IU 4 1 6
25 U2 4 3 6
26 PA1 1
28 ER 6 2 6
29 ER 8 2 6
30 ER 10 2 5
After pressing <CR> to return to the
command mode the entire buffer
will be re-listed on the screen. Since the screen isn't long enough to
accommodate the entire buffer the first lines will no longer appear. To
re-examine the first lines, enter the LIST command and while it is
listing press the space bar. The listing will halt at that point and
will continue only when the space bar is pressed again. A listing may be
stopped and restarted in this manner as many times as desired.
Now that the phrase "an apple two computer"
has been finished it
needs to be saved to memory and then to the disk for future use. The
SPEECH EDITOR and SPEECH GENERATOR.CODE2 use memory locations below
17408. Likewise, the sample vocabulary resides in memory locations
17408 to 19399. Therefore when you save the phrase you just constructed
it should be put above these locations. To save the phrase enter the
SAVE command. As with the LOAD command you will be asked whether you are
saving text to memory or memory to disk. Enter a "2" for text to memory
and when you are asked for the address to save to enter "19400". Note
how many bytes were saved (56) because you will need to know that to
save it to the disk.
The compressed binary code for your phrase
is now in memory
starting at address 19400. To save the phrase to the disk once again
enter the SAVE command, but this time select the first option. For a
file name you can use "AN APPLE TWO COMPUTER" and for the address to
save from type in the address where it was previously saved, in this
case 19400. The length of the phrase is 56 bytes as noted above.
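Before choosing a save address it is worth confirming that the new phrase will not land on top of the sample vocabulary. The Python sketch below illustrates that check using the figures from this example; the helper names are hypothetical.

```python
def region(start, length):
    """Return (first address, last address) for a block of memory."""
    return (start, start + length - 1)


def overlaps(a, b):
    """True if two (first, last) address ranges share any byte."""
    return a[0] <= b[1] and b[0] <= a[1]


vocabulary = (17408, 19399)  # sample vocabulary, as loaded at 17408
phrase = region(19400, 56)   # the phrase saved in this example

print(phrase, overlaps(vocabulary, phrase))
```

The phrase occupies 19400 through 19455, which begins directly above the last byte of the vocabulary.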
At this point the entire vocabulary to
say the phrase "this is an
apple two computer" is stored either within the sample vocabulary or
within the file that you just created. The entire phrase could have been
constructed and placed within a single file but in order to more
effectively demonstrate how to access speech from within a program you
will be accessing a combination of single words and a phrase.
The program listed will say the sample
phrase every time a <CR> is
pressed. The "REM" statements pretty well explain its operation and what
portion of the program does what.
10 HIMEM: 7167
15 REM SETS HIMEM BELOW THE LOCATION OF THE SPEECH ROUTINES.
20 D$ = CHR$ (4)
25 REM SETS D$ UP AS A CONTROL-D FOR DOS COMMANDS.
30 PRINT D$;"BLOAD SPEECH GENERATOR.CODE0"
35 REM LOADS IN THE SPEECH GENERATOR.CODE0 INTO $1C00 TO $1FFF.
40 PRINT D$;"BLOAD VOCABULARY"
45 REM LOADS THE SAMPLE VOCABULARY INTO LOCATION 17408.
50 PRINT D$;"BLOAD AN APPLE TWO COMPUTER"
55 REM LOADS THE PHRASE INTO LOCATION 19400.
60 LOBYTE = 7168:HIBYTE = 7169:SPEAK = 7170:NXTSPK = 7182:SLOT = 7197:SETSLT = 8097
65 REM SETS UP THE VARIOUS ADDRESSES USED WITH SPEECH GENERATOR.CODE0.
70 CALL SETSLT:A = PEEK (SLOT): IF A = 16 THEN HOME : PRINT "PLEASE
   INSERT AN ECHO II CARD": END
75 REM DETERMINES WHICH SLOT THE ECHO II CARD IS IN. IF NO CARD IS
   INSTALLED IT WARNS THE USER AND ENDS THE PROGRAM.
80 HOME : INPUT "PRESS THE <CR> FOR A DEMONSTRATION ";X$
85 REM CLEARS THE SCREEN AND WAITS FOR A COMMAND TO START.
90 ADD = 19070: GOSUB 200
95 REM SETS UP THE ADDRESS FOR THE WORD "THIS" AND THEN JUMPS TO THE
   ROUTINE THAT WILL OUTPUT THE ADDRESS AND SPEAK IT.
100 FOR A = 1 TO 100: NEXT
105 REM CAUSES A PAUSE BETWEEN THE WORDS "THIS" AND "IS".
110 ADD = 18184: GOSUB 200
115 REM SETS UP THE ADDRESS FOR THE WORD "IS" AND THEN JUMPS TO THE
   ROUTINE THAT WILL OUTPUT THIS ADDRESS AND SPEAK IT.
120 FOR A = 1 TO 100: NEXT
125 REM CAUSES A PAUSE BETWEEN THE WORD "IS" AND THE FOLLOWING PHRASE.
130 ADD = 19400: GOSUB 200
135 REM SETS UP THE ADDRESS FOR THE PHRASE "AN APPLE TWO COMPUTER" AND
   THEN JUMPS TO THE ROUTINE THAT WILL OUTPUT THE ADDRESS AND THEN
   SPEAK IT.
140 GOTO 80
200 AH = INT (ADD / 256):AL = ADD - AH * 256
205 REM SPLITS THE ADDRESS UP INTO HIGH AND LOW BYTES, EACH LESS THAN
   256, THAT CAN BE POKED INTO A BINARY ROUTINE.
210 POKE HIBYTE,AH: POKE LOBYTE,AL
215 REM POKES THE ADDRESSES DETERMINED ABOVE INTO THE LOCATIONS USED
   BY THE SPEECH GENERATOR.CODE0.
220 CALL SPEAK
225 REM THIS CALLS THE ROUTINE THAT SPEAKS THE WORD OR PHRASE STARTING
   AT THE ADDRESS POKED ABOVE.
230 RETURN
235 REM RETURNS TO THE STATEMENT FOLLOWING THE GOSUB.
TABLE 1 - SPEECH GENERATOR ADDRESSES
SPEECH GENERATOR.CODE0 - $1C00 TO $1FFF
SPEECH GENERATOR.CODE1 - $3C00 TO $3FFF
SPEECH GENERATOR.CODE2 - $4000 TO $43FF
SPEECH GENERATOR.CODE3 - $6000 TO $63FF
VER  LOBYTE  HIBYTE  SPEAK   NXTSPK  SLOT    SETSLT
0    7168    7169    7170    7182    7197    8097
     $1C00   $1C01   $1C02   $1C0E   $1C1D   $1FA1
1    15360   15361   15362   15374   15389   16289
     $3C00   $3C01   $3C02   $3C0E   $3C1D   $3FA1
2    16384   16385   16386   16398   16413   17313
     $4000   $4001   $4002   $400E   $401D   $43A1
3    24576   24577   24578   24590   24605   25505
     $6000   $6001   $6002   $600E   $601D   $63A1
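Every version in TABLE 1 places its entry points at the same distances from the start of the module, so the whole table can be derived from the four starting addresses. The Python sketch below illustrates this; the offsets were worked out from the hexadecimal values in the table, and the names OFFSETS, BASES, and entry_points are hypothetical.

```python
# Distance of each entry point from the start of the module.
OFFSETS = {
    "LOBYTE": 0,
    "HIBYTE": 1,
    "SPEAK": 2,
    "NXTSPK": 14,   # $0E
    "SLOT": 29,     # $1D
    "SETSLT": 929,  # $3A1
}

# Starting address of each SPEECH GENERATOR.CODEn version.
BASES = {0: 0x1C00, 1: 0x3C00, 2: 0x4000, 3: 0x6000}


def entry_points(version):
    """Decimal entry-point addresses for one version of the module."""
    base = BASES[version]
    return {name: base + offset for name, offset in OFFSETS.items()}


for version in sorted(BASES):
    print(version, entry_points(version))
```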
TABLE 2 - SOUNDS WITH SELECTABLE VARIABLES
A1 - late      E  - speak     M   - many     OO2 - book
A2 - late      EH - letter    N   - nice     U1  - tune
AE - dad       ER - hurry     NG  - long     U2  - tune
AH - bother    I  - finger    O1  - oh       UH  - fun
AW - call      IU - you       O2  - oh       Y   - you
               L  - like      OO1 - book     PA1 - pause
TABLE 3 - SOUNDS WITH PRESET VARIABLES
B  - baby      G - get      R  - red      TH1 - then
CH - choose    H - hello    S  - see      V   - very
D  - dog       J - jet      SH - shoe     W   - will
DT - butter    K - kick     T  - too      Z   - zero
F  - if        P - print    TH - think    PA  - pause
TABLE 4 - SPEECH EDITOR COMMANDS
APPENDIX - SAMPLE VOCABULARY
ADD.........17417 AND.........17434 APPLE.......17457
CATALOG.....17532 COLOR.......17570 CORRECT.....17610
DATE........17632 DIVIDE......17680 DOLLARS.....17732
DECIMAL.....17641 DIVIDED.....17695 DON'T.......17747
EIGHT.......17765 END.........17815 ESCAPE......17862
EIGHTEEN....17775 ENTER.......17825 EXCLAMATION.17874
FALSE.......17905 FILE........17953 FOUR........17987
FIFTEEN.....17916 FIVE........17965 FOURTEEN....17997
IF..........18121 INPUT.......18153 IT..........18190
K...........18206 KEY.........18214 KEYBOARD....18222
MANY........18301 MINUS.......18341 MULTIPLIED..18376
N...........18398 NINE........18431 NO..........18480
NAME........18409 NINETEEN....18442 NOW.........18489
NEXT........18418 NINETY......18462 NUMBER......18497
OFF.........18519 OPEN........18542 OUT.........18564
PARENTHESIS.18583 PLUS........18627 PRINT.......18658
PERCENT.....18600 POUND.......18638 PROGRAM.....18670
RED.........18720 REPEAT......18743 RIGHT.......18774
SAVE........18792 SIX.........18878 START.......18942
SECOND......18801 SIXTEEN.....18888 STOP........18953
SEMICOLON...18814 SIXTY.......18903 SUBTRACT....18964
SEVEN.......18835 SORRY.......18915 SUBTRACTED..18979
TAPE........19007 THIRTY......19060 TRUE........19119
THAT........19016 THIS........19070 TWELVE......19128
TEN.........19025 THOUSAND....19077 TWENTY......19140
THE.........19033 THREE.......19089 TWO.........19151
THE1........19041 TIME........19098 TYPE........19159
U...........19170 UH OH.......19181 UNDERSTAND..19190
WAS.........19249 WHITE.......19285 WITH........19311
WHAT........19258 WHO.........19295 WRONG.......19318
Z...........19377 ZERO........19387 END OF FILE.19399