SAM

Note: SAM only understands a special phonetic language. You will have to learn it [see SAM 128 (general)] if you want "advanced speech effects" (like regional dialects or emphasis of specific words/syllables). For "simple" American English text, see the page for Using Reciter.

Using SAM 128 with the BASIC Wedge

This section explains how to use SAM 128 from BASIC with the "wedge" installed, which is the easiest method. (You may also use SAM from ML or from BASIC without the "wedge"; see below.) See the main SAM 128 page for details about installing SAM and the BASIC Wedge.

Note: If you installed Reciter instead of SAM, then you need to switch the C128 into "SAM-Mode" with the BASIC-Wedge command:

]SAM (if you want to use phonetic text described on this page),

or else see the Reciter page (if you want to use English text instead).

Speaking phonetic text is very easy: just enter the "SAY" command followed by a string expression that contains valid phonetic text. Some examples:

]SAY "/HEHLOW WERLD." :REM constant string

]SAY A$ :REM scalar variable

]SAY T$(2) :REM array variable

]SAY LEFT$(X$,4)+"S." :REM general string expression

I hope the examples show that the syntax is simple and flexible. The only difficult thing about using SAM is being sure the string argument evaluates to valid phonetic text. Phonetic text will take some time to master because it uses a large "alphabet" of about 40 phonemes (each consisting of 1 or 2 characters) -- see the Phonemes page, or the Alpha Long page, for a list of phonemes that can be used in the string. Phonetic text also offers advanced features, like custom stress and questions [see Using SAM 128 (general), below].

If the string does not contain valid phonetic text, then SAM will generate a "double-beep" sound instead of speaking. When this happens, you can find out where the (first) error in the string occurs with the BASIC-Wedge command ERROR. For example, you ask SAM to say a phrase ("stop it") like this:

]SAY "STOP IT."

This example, instead of rendering speech, will fail with a BEEP-BEEP sound. To find out where the error is in your phonetic text, enter ]ERROR, and you will get the following output:

STOP IT.

In this case, it means neither the character "O", nor the combination "OP" is recognized. This error display only shows the first error found... any following errors will not be highlighted. (As we shall see...)

If you do some research, you should find that character pair "AA" will correct the (original "O") error. So now try this command:

]SAY "STAAP IT."

Again you will hear "BEEP-BEEP" instead of speech. One last time, ask for details by entering ]ERROR, and you will see:
STAAP IT.

This means the character I, or the pair IT, is invalid. A little research reveals that "IH" is (probably) the character pair wanted for this phoneme. So finally, enter this command:

]SAY "STAAP IHT."

SAM will say this string because it can match all text to a series of phonemes.

Another variation you can try is:

]SAY "STAAP IXT."

What's the difference? Not much! The IH and IX phonemes sound almost the same -- the IX one is slightly shorter in duration. You should try both to decide which sounds best...
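The trial-and-error loop above can be rehearsed away from the C128. The following Python sketch is only an approximation: it collects the phoneme spellings listed in the tables on this page and assumes a greedy "two characters first, then one character" match, which may not be exactly how SAM parses. The names `PHONEMES`, `first_error` and `show_error` are made up for this illustration, and square brackets stand in for reverse video.

```python
# Phoneme spellings collected from the tables on this page (UL/UM/UN are the
# shortcut names used in the C64 SAM manual).
PHONEMES = {
    "AE", "EH", "AX", "ER", "IH", "IX", "AA", "AO", "UH", "AH",  # short vowels
    "EY", "IY", "AY", "OH", "UX", "OY", "AW", "OW", "UW",        # long vowels, diphthongs
    "R", "L", "W", "Y", "WH", "M", "N", "NX",                    # approximants, nasals
    "Z", "ZH", "V", "DH", "J", "B", "D", "G", "P", "T", "K",     # voiced fricatives, plosives
    "S", "SH", "F", "TH", "/H", "/X", "CH",                      # unvoiced
    "UL", "UM", "UN",                                            # shortcuts
    "RX", "LX", "WX", "YX", "DX", "Q", "*", "GX", "KX",          # rare
}
STRESS_DIGITS = set("12345678")
PUNCTUATION = set(" ,-.?")


def first_error(text):
    """Return the index of the first character that matches nothing, or None."""
    i = 0
    while i < len(text):
        if text[i:i + 2] in PHONEMES:  # assumption: try a 2-character phoneme first
            i += 2
        elif text[i] in PHONEMES or text[i] in STRESS_DIGITS or text[i] in PUNCTUATION:
            i += 1
        else:
            return i
    return None


def show_error(text):
    """Mimic the error display, using [brackets] in place of reverse video."""
    pos = first_error(text)
    if pos is None:
        return text
    return text[:pos] + "[" + text[pos] + "]" + text[pos + 1:]


for phrase in ("STOP IT.", "STAAP IT.", "STAAP IHT."):
    print(show_error(phrase))
```

Like the real error display, the sketch stops at the first problem, so a phrase with several mistakes needs several passes.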

Besides using stress marks in the phonetic text, there are several ways you can alter SAM's speech. See the page about Common Routines.

Using SAM 128 with ML

Note: This section may be useful to BASIC users who are not using the Wedge.

There are two ways to call SAM 128 from a machine language (ML) / assembly language program. The most common will probably be to call 'SayIt' at $EE09 (60937). Before doing that, you will need to copy the phonetic text you want spoken into SAM's buffer. Your text string should be terminated with a 'shifted ESCape' character: $9B (155). If not, SAM will say 'random' junk after your desired text is spoken (or halt with an error without saying anything). In the current (Epsilon) release, the buffer 'samStr' is located at $D800~$D8FF. Because the buffer may move in later versions, you should read the buffer address from memory locations $EE22/23 (a word).

Another method, which only applies if you are using BASIC variables, is to simply call 'SayItBASIC' at $EE06 (60934). Before calling this routine, you first need to have a text variable named SA$ defined. If you don't, an empty string named SA$ will be created. You don't need a special terminator byte in the string (because BASIC knows how long the string is).

Either way, your string should not have any embedded 'shifted ESC' ($9B) character(s). If it does, SAM will ignore everything after the first one.
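To make the terminator rules concrete, here is a small Python sketch of the bytes that would end up in 'samStr'. It is an offline illustration, not SAM code: the helper names `build_sam_buffer` and `spoken_part` are hypothetical, and the 256-byte limit is an assumption based on the $D800~$D8FF range quoted above.

```python
TERMINATOR = 0x9B    # 'shifted ESCape' end-of-string byte
BUFFER_SIZE = 0x100  # assumption: samStr spans $D800~$D8FF (256 bytes)


def build_sam_buffer(text):
    """Return the bytes to copy into samStr: the text plus the $9B terminator."""
    data = text.encode("ascii")
    if len(data) + 1 > BUFFER_SIZE:
        raise ValueError("text plus terminator does not fit in the buffer")
    return data + bytes([TERMINATOR])


def spoken_part(buffer):
    """Return what SAM would consider: everything before the first $9B."""
    end = buffer.find(bytes([TERMINATOR]))
    return buffer if end < 0 else buffer[:end]


print(build_sam_buffer("STAAP IHT.").hex(" ").upper())
print(spoken_part(b"AB\x9bCD\x9b"))
```

The second call shows why an embedded $9B is a problem: the text after it is silently dropped.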

The following examples show the non-wedge BASIC calls to SAM (i.e., SayItBASIC at $EE06). If you are using ML, just use a simple JSR if running in BANK 1 (unlikely), or else use JSR_FAR. ML users who don't want to use BASIC strings must call SayIt at $EE09 instead.

To say "Hello world!", for example,

SA$="/HEHLOW WERLD.":BANK 1:SYS 60934

Note: if you are using BASIC, it saves a lot of typing (and reduces memory used and improves RUN speed) if you use a variable for the value 60934. Also, you only need to use BANK 1 if you have used some other BANK command (or no BANK command) before using SYS. Thus a more practical (say "the easy way") example is,

SA$="DHIY IY5ZIY WEY5.":SYSAM

This assumes BANK 1 was the last BANK command issued, and variable AM = 60934. The following examples use this assumption.

Another common SYS command you may use (at least while debugging) is the Error Display at 60949 ($EE15). SAM has a very specific set of phonemes he will accept. If your input text fails to match any phoneme in his special "phonetic language" then you will hear "Beep Beep" (instead of speech). When this happens, just call the error display like this:

SYS 60949

(the example assumes BANK 1 is active).

SAM will 'print' to the current output device (usually the screen) the text you entered, except the character it could not match to a phoneme will be in 'reverse-video'. (The concept of reverse-video really only makes sense on the computer display, or some Commodore-specific printers.)

For example, you ask SAM to say a phrase ("stop it") like this:

SA$="STOP IT.":SYSAM

(Again BANK 1 is assumed active, and variable AM is assumed to equal 60934.) This example, instead of rendering speech, will fail with a BEEP-BEEP sound. To find out where the error is in your input text, enter SYS 60949, and you will get the following output:

STOP IT.

In this case, it means the character "O" (and the combination "OP") is not recognized. This routine only shows the first error found... any following errors will not be highlighted. So in this example, the single letter "I" will also generate an error after you correct the "O" error...

If you do some research, you should find that character pair "AA" will correct the (original "O") error. So now try this command:

SA$="STAAP IT.":SYSAM

Again you will hear "BEEP-BEEP" instead of speech. One last time, call the Error Display with SYS 60949, and you will see:
STAAP IT.

This means the letter I, or the pair IT, is invalid: the single character I is not a phoneme. A little research reveals that "IH" is (probably) the character pair wanted for this phoneme. So finally, enter this command:

SA$="STAAP IHT.":SYSAM

SAM will say this string because it can match all text to a series of phonemes.

Another variation you can try is:

SA$="STAAP IXT.":SYSAM

What's the difference? Not much! The IH and IX phonemes sound almost the same -- the IX one is slightly shorter in duration. You should try both to decide which sounds best...

If an error occurs (you hear 'Beep Beep' instead of speech), you can call $EE15 ('DspErr'). This will write the contents of 'samStr' to the active output device (usually the screen) with the "undefined phoneme" shown in reverse video. Like the original (C64) version, there is no CPU flag set to indicate an error upon return.... I guess I should fix this for my final release?

Unless you are running code in BANK 1, you will need to use JSR_FAR to call these routines, and IND_STORE to write your string data.

Note: even if you are not using BASIC at all, the MMU pre-configuration registers B (and possibly D) must be set to the same values used by BASIC ROM ($7F and $41, respectively). The 'D' pre-configuration register is only used for finding the BASIC string SA$ and for printing the error text. In both these cases, the BASIC ROM must also be installed in the C128. For the second case, the KERNAL ROM must also be installed.

Pure ML Example

The above section gave several examples of calling SAM routines directly from BASIC (without the Wedge), however it did not include any ML examples. This was to keep the text concise. Below is a real ML example of calling SAM from BANK 15 (you can modify it to run in BANK 0). This example runs at address $1300, and can be entered and tested with the built-in ML MONITOR.

.1300 LDA #$00 ;use BANK 15 (for KERNAL routine $FF77)
.1302 STA $FF00
.1305 LDA #$00 ;point to SAM buffer ($D800)
.1307 LDX #$D8
.1309 STA $FE ;pointer in $FE,$FF
.130B STX $FF
.130D LDA #$FE ;where our pointer is
.130F STA $02B9 ;set for IND_STORE
.1312 LDY #$16 ;length of string (= index of the $9B terminator)

;this loop copies string to SAM buffer
.1314 LDA $1330,Y ;read from current bank
.1317 LDX #$01 ;select BANK 1
.1319 JSR $FF77 ;write to another bank
.131C DEY ;count chars
.131D BPL $1314 ;loop until all copied

;now call SayIt at $EE09
.131F LDX #$01 ;run code in BANK 1
.1321 STX $02
.1323 LDA #$EE ;address $EE09: high byte in .A
.1325 LDX #$09 ;low byte in .X
.1327 STA $03 ;set for JSR_FAR (high byte)
.1329 STX $04 ;low byte
.132B JSR $02CD ;call JSR_FAR

;your code continues...
.132E NOP
.132F RTS

;the text "AESEHMBLIY LAENXGWIHJ." + $9B
>01330 41 45 53 45 48 4D 42 4C 49 59 20 4C 41 45 4E 58
>01340 47 57 49 48 4A 2E 9B

;test the example
J 1300

Note: if you want to run in BANK 0 (instead of BANK 15), then the first LDA should have an argument of #$3E (MMU value for BANK 0), and inside the copy loop you should change "LDX #1 / JSR $FF77" to "LDX #$7F / JSR $02AF" (load .X with the MMU value of BANK 1, and call "stash" code in Common RAM).
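Typing the data bytes by hand is error-prone, so a helper on a modern machine can generate them. The Python sketch below produces MONITOR-style ">" lines for any phonetic string, plus the index needed for the LDY instruction in the listing above. The function names `monitor_dump` and `terminator_index` are invented here, and the start address $1330 is simply the one used in the example.

```python
TERMINATOR = 0x9B  # end-of-string byte expected by SayIt


def monitor_dump(text, start=0x1330, per_line=16):
    """Return MONITOR '>' lines holding the text followed by the $9B terminator."""
    data = text.encode("ascii") + bytes([TERMINATOR])
    lines = []
    for offset in range(0, len(data), per_line):
        chunk = data[offset:offset + per_line]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f">{start + offset:05X} {hex_bytes}")
    return lines


def terminator_index(text):
    """Index of the terminator byte, i.e. the value to load into .Y before the copy loop."""
    return len(text)


for line in monitor_dump("AESEHMBLIY LAENXGWIHJ."):
    print(line)
print(f"LDY #${terminator_index('AESEHMBLIY LAENXGWIHJ.'):02X}")
```

For the example string this reproduces the two data lines and the LDY #$16 shown in the listing.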

Using SAM 128 (general)

You may use 'pure' SAM (without Reciter) to gain enhanced pronunciation (for example, to speak with a regional dialect), but then you must use SAM's obscure phonetic language (different from English, or any other language!). Many (most?) people prefer to use Reciter, because learning/using SAM's phonetic language is a moderate pain... but you cannot implement dialects with Reciter! Also, Reciter's dictionary misses some common words, which will thus be spoken incorrectly. Sometimes you can use "creative spelling" to force Reciter to say the word correctly, but other times your only choice is to use SAM's phonetic language instead. In summary, Reciter is easy to use, but always speaks with a Western American accent and will mispronounce some words.

You must use 'pure' SAM if you want to speak another language or speak English with a different (not Western American) accent, like:

Australian
British
Canadian
Jamaican
Midwest (US)
New England (US)
Scottish
Southern (US)

(The details needed to actually implement any of those non-standard dialects are beyond the scope of this web page... but you should find the needed info here or in the linked C64 documentation if you want to try.)

SAM requires its "phoneme-text" to be in upper-case ASCII (same as un-shifted PETSCII). Any lower-case text (or any character with value > 127 or value < 32) will cause SAM to generate a "Beep Beep" error sound and return without saying anything. The only exception is the end-of-string marker (character $9B [155]). SAM ignores this and all following characters.
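The character rule above is easy to check before sending text to SAM. This Python sketch applies exactly the stated rule (no lower case, no values below 32 or above 127, and nothing after the $9B marker is examined); the function name `first_invalid_char` is hypothetical, and real SAM may reject additional characters that are simply not phonemes.

```python
END_MARK = chr(0x9B)  # end-of-string marker


def first_invalid_char(text):
    """Return the index of the first character SAM would reject on sight, or None."""
    for i, ch in enumerate(text):
        if ch == END_MARK:
            return None  # SAM ignores the marker and everything after it
        code = ord(ch)
        if code < 32 or code > 127 or ch.islower():
            return i
    return None


print(first_invalid_char("STAAP IHT."))  # None: every character is acceptable
print(first_invalid_char("Staap iht."))  # 1: the lower-case 't'
```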

Most phonemes understood by SAM consist of two characters. (However, most consonants require only one character.) There are several possible ways to list them. First I will give an "alphabetic" list which should be helpful to anyone, but is aimed at the beginner. (If you like this format, there is an extended listing here.)

English Letter(s)	SAM Phoneme	SAM code (hex)	Example(s) pronunciation
ə (ambiguous)	AX	$0D	about, fallen
a	AE	$08	Sam
a	EY	$30	made
augh	AO	$0B	caught
ay	EY	$30	may
b	B	$36	bad
c	K	$48	come
c	S	$20	receive, cede
ch	CH	$2A	chew
ch	K/X	$48, $25	loch
d	D	$39	dog
e	EH	$07	beg
e	IY	$05	free
ei	IY	$05	receive
eigh	EY	$30	weigh
er	ER	$0F	herd, bird
ew	UW	$35	new, crew
f	F	$22	fish
g	G	$3C	go
g	GX	$3F	progress
h	/H	$24	head
i	IH	$06	fit
	IX	$0E	digit
	AY	$31	I, die, high, nitrogen
j	J	$2C	Jew, fudge, forge
k	K	$48	kitchen
k	KX	$4B	necklace
l	L	$18	long
l	LX	$13	call
m	M	$1B	may
n	N	$1C	not
ng	NX	$1D	song, running
o	AA	$09	pot
o	OH	$11	cone
oa	OH	$11	foam
oo	UH	$0C	book
	UX	$10	loot
	AH	$0A	blood
ow	AW	$33	how
ow	OW	$34	slow
oy	OY	$32	boy
p	P	$42	poke
q	KW	$48, $19	quit
r	R	$17	red
r	RX	$12	bar
s	S	$20	Sam
s	ZH	$27	measure
sh	SH	$21	she
t	T	$45	talk
t	DX	$1E	pity
th	DH	$29	then, the
th	TH	$23	thin, path
u	AH	$0A	bug
u	UW	$35	huge
v	V	$28	vote, seven
w	W	$19	win, weather
w	WX	$14	saw
wh	WH	$16	when, whether
wh	/X	$25	who
x	KS	$48, $20	exact, jinx
y	Y	$1A	you
y	YX	$15	say
z	Z	$26	zoo, bits
z	ZH	$27	azure

Phonemes by Classification

Below is a set of tables which group the phonemes "logically" (my opinion). If you want to see a single huge table (sorted by SAM code), check out my Phonemes page.

Optimistic Note: don't be scared by the size of the following tables! For normal use, all you need to worry about is the Input Text (what you type) and the Example Pronunciation (what it will sound like). Everything else is just extra info some people may be curious about (or need for advanced programming).

Short Vowels

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
AE	$08	Sam
EH	$07	beg
AX	$0D	about
ER	$0F	bird
IH	$06	fit
IX	$0E	digit
AA	$09	pot
AO	$0B	talk
UH	$0C	book
AH	$0A	bug

Long Vowels

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
EY	$30 ($30,$15)	made	20 (13+7)	22 (14+8)	~=15.8 (18,14)	~=69.9 (72,68)
IY	$05	free
AY	$31 ($31,$15)	high	19 (12+7)	23 (15+8)	~=18.2 (26,14)	~=47.1 (38,68)
OH	$11	cone
UX	$10	loot

Complex Vowels (Diphthongs)

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
OY	$32 ($32,$15)	boy	19 (12+7)	23 (15+8)	~=16.5 (20,14)	~=41.6 (30,68)
AW	$33 ($33,$14)	how	20 (12+8)	23 (15+8)	~=16.4 (26,12)	~=33.6 (42,28)
OW	$34 ($34,$14)	slow	22 (14+8)		~=14.4 (18,12)	~=29.0 (30,28)
UW	$35 ($35,$14)	crew	17 (9+8)	22 (14+8)	12 (12,12)	~=30.7 (34,28)

Approximants

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
R	$17	red
L	$18	allow
W	$19	win
Y	$1A	you

Obscure

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
WH	$16	when, whine

Consonants - Nasal

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
M	$1B	Sam
N	$1C	man
NX	$1D	song

Consonants - Voiced Fricatives

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
Z	$26	zoo	6**
ZH	$27	measure	6**
V	$28	seven	7**	8**
DH	$29	then	6**
J	$2C ($2C,$2D)	Jew	10** (8+3**)	12** (9+4**)	~=5.45 (6,5**)	~=71.9 (66,79**)

Consonants - Plosives

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
B	$36 ($36,$37,$38)	bad	9 (6+1+2)	12 (8+2+2)	6 (6,6,6)	26 (26,26,26)
D	$39 ($39,$3A,$3B)	bad	7 (5+1+1)	10 (7+2+1)	6 (6,6,6)	66 (66,66,66)
G	$3C ($3C,$3D,$3E)	again	8 (6+1+1)	11 (7+2+2)	6 (6,6,6)	110 (110,110,110)
P	$42 ($42,$43,$44)	poke	10* (8+2*+2)		~=6.0 (6,6*,6)	~=26.0 (26,26*,26)
T	$45 ($45,$46,$47)	talk	6* (4+2*+2)	8* (6+2*+2)	~=6.0 (6,6*,6)	~=66.0 (66,66*,66)
K	$48 ($48,$49,$4A)	cake	11 (6+1+4)	13 (7+2+4)	~=8.2 (6,10,10)	~=100 (109,86,109)

Consonants - Unvoiced / Rushing

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
S	$20	Sam			6 x0	73 x0
SH	$21	fish			6 x0	79 x0
F	$22	fish			6 x0	26 x0
TH	$23	path, thin			6 x0	66 x0
/H	$24	ahead			14 x0	73 x0
/X	$25	who			16 x0	37 x0
CH	$2A ($2A,$2B)	chew	6* (6+2*)		~=6.0 (6,6*)	~=79.0 (79,79*)

Shortcuts

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
UL	$4E ($0D,$18)	settle	11 (5+6)	15 (6+9)	~=16.5 (20,14)	~=35.7 (44,30)
UM	$4F ($0D,$1B)	astronomy	12 (5+7)	14 (6+8)	~=9.2 (20,6)	~=45.0 (44,46)
UN	$51 ($0D,$1C)	function	12 (5+7)	14 (6+8)	~=9.2 (20,6)	~=48.5 (44,54)

The above tables list "common" SAM phonemes with the internal SAM code byte(s), in hexadecimal. The internal "SAM code" is mainly for hackers and nerds (like myself)... but it also shows (for everyone) how some phonemes are actually expanded into multiple "phones". (I ain't qualified to explain the difference between phonemes and phones, so see the linked Wikipedia articles if you are curious!)

Note about phoneme "expansion" (for geeks): the affricates (CH and J) and the plosives (B,D,G,P,T,K) will be expanded into (respectively, 2 or 3) "sub-phonemes" (I believe the technical term is "phone", but I ain't no linguistic expert). There is no way for you to reference/generate the added phones manually. On the other hand, the diphthongs (dual-sound vowels) also expand into two phones. The second (added) phone is one of the "rare phonemes" (listed below), so you could manually generate those phonemes (err, phones) if you want.
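The expansion can be pictured as a simple lookup. The Python sketch below uses only the SAM codes listed in the tables above; the names `EXPANSION`, `SINGLE` and `expand` are invented for this illustration, `SINGLE` holds just a handful of one-phone phonemes, and nothing here claims to reproduce when or how SAM performs the expansion internally.

```python
# Phonemes that expand into 2 or 3 phones (codes from the tables above).
EXPANSION = {
    "EY": (0x30, 0x15), "AY": (0x31, 0x15), "OY": (0x32, 0x15),  # diphthongs ending in YX
    "AW": (0x33, 0x14), "OW": (0x34, 0x14), "UW": (0x35, 0x14),  # diphthongs ending in WX
    "CH": (0x2A, 0x2B), "J": (0x2C, 0x2D),                       # affricates
    "B": (0x36, 0x37, 0x38), "D": (0x39, 0x3A, 0x3B), "G": (0x3C, 0x3D, 0x3E),
    "P": (0x42, 0x43, 0x44), "T": (0x45, 0x46, 0x47), "K": (0x48, 0x49, 0x4A),
}

# A few single-phone phonemes, enough for the demonstration below.
SINGLE = {"S": 0x20, "AA": 0x09, "IH": 0x06, "AE": 0x08, "M": 0x1B}


def expand(phonemes):
    """Return the list of SAM codes (phones) for a list of phoneme names."""
    codes = []
    for name in phonemes:
        if name in EXPANSION:
            codes.extend(EXPANSION[name])
        else:
            codes.append(SINGLE[name])
    return codes


# "STAAP" is 4 phonemes but 8 phones
print(" ".join(f"${c:02X}" for c in expand(["S", "T", "AA", "P"])))
```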

The final table of "common phonemes" (the 3 "shortcuts") really just lists alpha pairs that get re-mapped into a pair of common phonemes -- these are strictly unnecessary! (As a lame "proof", there is no "phoneme data" associated with these three "alpha pairs" in SAM.) I prefer to imagine they were added as a form of compression for Reciter (but the world may never know)...

I hope we all agree: the most important column is the "Example Pronunciation"... it tells you what the output will sound like! The examples are based on (Western) American dialect! There are a lot of things I could say here... but a major issue (my opinion) is that SAM, like most English speakers (not just Americans), makes (virtually) no distinction between "W" and "WH". In other words, SAM suffers from the "wine/whine merger". As a default, I think this is acceptable, but (so far) I have not found any way to render a "real" WH phoneme (nothing is obviously distinct from the common W phoneme). The combination "/HW" is the best I can do, but it just sounds wrong! Also, there is no way to generate a "rolling R" common in French and Spanish.

Phoneme Trivia

This section talks about the "geek" data in the phoneme tables. If you are new to SAM, I suggest you skip it!!

Each phoneme lists two time periods:

Normal period (unstressed phoneme)
Stressed period (stressed phoneme)

You can ignore these unless for some reason you want to calculate a phoneme's duration (or you are just curious). These values determine how long it takes SAM to say a phoneme (see Poker / SPEED and the section below about Stress). Because some phonemes consist of two or three parts, a total (sum) is shown for them (along with the list of parts). Any period value showing * is using the special "pure noise" algorithm, and the normal rules of SPEED and phoneme period do not apply (these always show a period of 2). Any period value showing ** is using the special "75% then noise" algorithm. The normal rules apply to 75% of the listed value, but then a "special noise" algorithm begins (and like the "pure noise", this part ignores the SPEED setting). See Poker/NOISES for more info about these two types of phonemes.

Each phoneme lists two frequencies (these names were used in the original C64 documentation):

"Mouth" (Formant 1)
"Throat" (Formant 2)

Each phoneme also has an undocumented Formant 3 (I don't know if it represents Nasal, Lips, Tongue, or something else!). If you're wondering, "What is a formant?", the short answer is an important (harmonic) waveform of a phoneme/phone. SAM usually generates phonemes with 3 formants plus a fundamental frequency (base wavelength). Formants 1 and 2 have sinusoidal waveform while Formant 3 has a square waveform.

I included the first 2 frequencies in the tables because you can manipulate both frequencies with KNOBS and you need to know the first frequency ("Mouth") in order to calculate the "final" base wavelength (see Poker / PITCH). Interesting note: the 3 formant frequencies are not affected by either PITCH or Stress! Assuming you use the recommended TIMEBASE settings, the frequency values shown in the tables are multiples of 25.5 Hz. Taking the phoneme "Y" (as in you) as an example, the "mouth" frequency (formant 1) will be 25.5 * 8 = 204 Hz, and the "throat" frequency (formant 2) will be 25.5 * 82 = 2091 Hz (about 2.1 kHz). And in case you are wondering, PITCH and Stress affect the "preliminary" base wavelength -- and thus the "final" base wavelength.
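The conversion from table value to Hertz is a single multiplication. This Python sketch simply restates the 25.5 Hz rule from the paragraph above (valid only under the recommended TIMEBASE settings); the name `formant_hz` is made up for this illustration.

```python
HZ_PER_UNIT = 25.5  # from the text above, assuming the recommended TIMEBASE settings


def formant_hz(table_value):
    """Convert a Mouth/Throat table value into an approximate frequency in Hz."""
    return HZ_PER_UNIT * table_value


# Phoneme "Y": mouth value 8, throat value 82
print(formant_hz(8), formant_hz(82))
```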

When a phoneme consists of multiple (2 or 3) phones, the first value listed for frequency is the harmonic mean (indicated by ~= which means "approximate average"). Below that are the (exact) frequencies for the individual phones.
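For anyone who wants to check the ~= values, the harmonic mean is the number of values divided by the sum of their reciprocals. The Python sketch below is a plain, unweighted harmonic mean (the function name `harmonic_mean` is just illustrative) applied to a few rows from the tables.

```python
def harmonic_mean(values):
    """Unweighted harmonic mean: n divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / v for v in values)


# EY mouth (18,14), EY throat (72,68), K mouth (6,10,10)
for phones in ([18, 14], [72, 68], [6, 10, 10]):
    print(phones, f"~={harmonic_mean(phones):.2f}")
```

The printed results agree, after rounding, with the ~=15.8, ~=69.9 and ~=8.2 entries listed for those phonemes.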

Some "sub-phonemes" (i.e., some phones) partially or completely ignore the ("Mouth and Throat") formant frequencies (and also PITCH and SPEED). Values indicated with ** work like normal for the first 75% of the "final" base wavelength. In these cases, the frequency values are still relevant and included in the average frequency listed (although I suspect the calculation is inaccurate in these cases). Values indicated with * completely ignore formant frequencies. In these cases, they are not included in the average (shown for phonemes composed of multiple phones). A handful of phonemes (the "rushing" consonants) consist only of one phone of this type. For them, an "x0" is shown (think "times zero") by the frequency value to remind you that the listed frequency is completely ignored. (Technical note: the frequencies are completely ignored for rendering of that phone/phoneme, but due to blending, the value may affect adjacent phonemes.)

Sorry if that hurts your head from information overload! But just know that I spared you from the frequency of Formant 3, and the power of all three formants. (The main reason that info is absent is because SAM allows no way to modify them! And none of that info is vital to "nerdy" calculations of pitch and speed!)

The "Classification" lists some of the most important (my opinion) classes of phonemes. In reality, SAM associates each phoneme with 15 different classes! I hope you agree, this would result in "information overload" (and add little conceptual value) if all were shown here. You do not (strictly) need to know to which class(es) a phoneme is assigned, but I think the info is (potentially) helpful. The class called "Vowel_X" includes both "real" vowels and modified approximants (LX, RX, WX, and YX). The class called "Fricative" here (and in my source code) includes real fricatives (including sibilants) and affricates, but not the pseudo-fricative "H" sound (phonemes /H and /X).

Rare Phonemes

During an early phase of parsing your input text, SAM will substitute some phonemes with other 'internal' phonemes (I think "phone" is the technical term). Usually this happens due to a consonant+consonant or vowel+consonant pair being discovered. It seems using stress on phonemes may also affect their generation. Anyway, these extra phonemes are available to the user and listed below in the 'rare' phonemes table. Most of them sound so similar to 'common' phonemes that you won't hear a difference if you try them. You shouldn't need to learn these to master SAM's phonetic language.

Rare Phonemes

Input Text	SAM code (hex)	Example pronunciation	Normal period	Stressed period	Mouth frequency	Throat frequency	Classification (Voiced, Vowel_X, Consonant, Fricative, Occlusive, Plosive, Rushing)
RX	$12	bar
LX	$13	call
WX	$14	saw
YX	$15	say
DX	$1E	pity
Q	$1F	kitten
*	$2B	shtick			6 x0	79 x0
GX	$3F ($3F,$40,$41)	progress	9 (6+1+2)	11 (7+2+2)	6 (6,6,6)	84 (84,84,84)
KX	$4B ($4B,$4C,$4D)	necklace	11 (6+1+4)	12 (7+1+4)	6 (6,6,6)	84 (84,84,84)

Nearly half of the rare phonemes are modified approximants (RX, LX, WX, YX). These are created by SAM when a normal approximant (R,L,W,Y) follows a vowel. Notice they are classified as a vowel by SAM (most of the time) and thus belong to the class I call "Vowel_X". Note SAM also has a class for real/normal vowels (which doesn't include these modified approximants), but that (real vowel) classification is rarely considered by SAM's code.

Two of the "rare" phonemes are modified plosives (GX and KX). These are created when the normal plosives (G and K) are followed by a consonant.

The phoneme DX sounds half-way between T and D... some call it "a quick flap of the tongue". This replaces either T or D when preceded by a vowel_x and also followed by either a stressed vowel_x or a space and any vowel_x. (Where vowel_x is a real vowel or a modified approximant.)

The Q phoneme represents a glottal stop -- a halting of air flow. The example pronunciation ("kitten") highlights the pair of t's. In American English (most dialects), the T sound is not actually made (if you are shocked, so was I). The "t-sound" is actually an abrupt blockage (in the throat?) of the preceding "i-sound". Wiktionary lists "kitten" with both American (glottal stop) and Imperial (actual t) pronunciations and has audio clips of both. Contrary to what you might expect (based on the original documented example of "kitten"), Reciter will never generate any Q's! SAM will insert a Q in two cases: (1) between two stressed vowel(x)s which are separated by a space, and (2) if you write a really long string of text without any non-space punctuation. In the second case (described in the C64 documentation as SAM "running out of breath"), SAM will change the last space (before "he runs out of breath") into a glottal stop (Q), and pause before he continues speaking.

The * phoneme is undocumented; I would call it a bug, but some may call it a "feature". It sounds somewhere between SH and CH. I would call it a short-SH (based mostly on sound, but also the computer code). Surrounding phonemes can affect how it is perceived --- sometimes it sounds like CH. I think it works great in the word "shtick" because if you use the official "SH" phoneme, the SH-sound has too long a duration in my opinion. To hear the ambiguity between SH and CH, have SAM say "potato chip/ship wreck" with code like this:

SA$="PEHTEYTOW *IHP REHK":SYSAM

SAM Stress and Punctuation

SAM does not understand numbers at all -- at least not how humans use them. SAM uses the numeric digits 1~8 to indicate stress for the preceding phoneme. Stress affects two properties of a phoneme: wavelength (pitch) and rate (speed). If you do not include numeric digits in your input (phonetic) text, then SAM will generally say all phonemes with the "same" stress (default pitch, but some phonemes may have slightly modified rate). I said "generally" because some punctuation will also modify pitch and speed (see below). So without "stress digits", SAM's speech often sounds ambiguous/unnatural. (One of the cool things about Reciter is it will automatically insert "stress digits"... but the automatic method is not perfect... hence you may need to use SAM!)

A larger numeric digit results in less stress: increased wavelength (lower pitch) and increased rate (faster speed). Naturally then, smaller numbers create more stress: reduced wavelength (higher pitch) and reduced rate (slower speed).

Note: for stress digits, the concept of pitch is the same as the PITCH setting (greater values = lower pitch), but the concept of stress rate is the opposite of the SPEED setting (SPEED actually controls duration, the inverse of rate). Sorry if this is confusing (I didn't design SAM), but it is easy to use with just a little practice! Following is a table which will (hopefully) clarify things:

Stress Level	PITCH Change (delta wavelength)	Period (duration)	Description
0	0 (normal wavelength)	short	No stress digit (default pitch, normal duration)
1	-32 (extremely shorter wavelength)	long	Extreme stress (very high pitch)
2	-26	long	Very high stress
3	-20	long	High stress
4	-13 (moderately shorter wavelength)	long	Typical stress (higher pitch)
5	-7 (slightly shorter wavelength)	long	Mild stress (slightly high pitch)
6	0 (normal wavelength)	long	Neutral stress (default pitch but long duration)
7	+6 (slightly longer wavelength)	long	Mild negative stress (slightly lower pitch)
8	+12 (moderately longer wavelength)	long	Moderate negative stress (lower pitch)
9	+6 (bug?)	long	Bug? Same as level 7!

Note that digit 0 is not allowed: stress level 0 is simply what you get when no stress digit follows a phoneme. Also digit 9 is not allowed; however, internal SAM transformations may produce stress level 9.

Note that stress-level 0 (no stress digit) and stress-level 6 both use the default PITCH (normal wavelength). The difference (not clearly explained in the C64 documentation) is that any stress digit (even the "neutral" 6) will cause the phoneme to have a longer duration.
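The stress table can be written as a small lookup, which makes the "0 versus 6" distinction easy to see. This Python sketch only restates the table above; the names `PITCH_DELTA` and `stress_effect` are hypothetical, and the result describes SAM's documented behaviour rather than reproducing its code.

```python
# Wavelength change per stress level, copied from the table above
# (negative = shorter wavelength = higher pitch).
PITCH_DELTA = {0: 0, 1: -32, 2: -26, 3: -20, 4: -13, 5: -7, 6: 0, 7: 6, 8: 12, 9: 6}


def stress_effect(level):
    """Return (pitch delta, period kind) for a stress level 0~9."""
    period = "short" if level == 0 else "long"  # any stress digit selects the long period
    return PITCH_DELTA[level], period


print(stress_effect(0))  # no stress digit
print(stress_effect(6))  # same pitch as level 0, but the long period
print(stress_effect(4))  # typical stress
```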

So now you may be confused! Well, let me give you 3 realistic examples. To begin, I hope/expect you know that a simple English sentence can have multiple (subtle) meanings depending on how it is spoken (which words are stressed). Below is a table of values you can try with SAM (or just imagine in your mind)...

English	SAM (phonetic)	Meaning
I AM SAM	AY3 AEM SAEM.	I (not the other guy[s] around me) am called "Sam".
I AM SAM	AY AE4M SAEM.	I truly am Sam (when another doubts you really are Sam).
I AM SAM	AY AEM SAE4M.	My name is "Sam" (not "Jack" nor "Jill", for example).

Hopefully you will notice/appreciate the expressive power of SAM. If you use Reciter (instead of SAM), then the phrase "I AM SAM." will always be spoken with the stress/emphasis on the word "I"... if this is not what you intend, you should consider the more complex SAM phonetic language.

SAM only understands a few punctuation marks. These affect the way "text input" is spoken:

Mark	ASCII(hex)	SAM code (hex)	Name	Time period	Effect
	$20	$00	space	0	(ignore)
,	$2C	$03	comma	18	pause speech (long)
-	$2D	$04	dash	8	pause speech (short)
.	$2E	$01	period	18	reduce pitch of previous phonemes, then pause speech (long)
?	$3F	$02	question mark	18	increase pitch of previous phonemes, then pause speech (long)

Like other phonemes, the punctuation marks actually have a stressed and normal time period defined. But for all punctuation marks, both values are identical, so there is only one column shown in the table above.

Any non-space punctuation mark will also extend the duration of the preceding syllable by 50% (excluding the phonemes S,SH,F and TH). Here "syllable" means the first prior vowel_x and everything after (until the punctuation mark).

Note that SAM will always insert a short pause when he sees a dash (-). In contrast, Reciter will only insert a short pause when the dash is surrounded by spaces (or other SAM punctuation marks). Otherwise, Reciter will treat a dash like a space (i.e., ignore it)!

Finally...

Be sure to see the page about Common Routines. In particular, the section on PITCH describes how it, Stress, and the "mouth" frequency (formant 1) all combine to form the "final" base wavelength (fundamental frequency) of a phone/phoneme.

Links

Project64 documentation of (C64) SAM.
Find out more about SAM+Reciter (C64) on Wikipedia!