SuperCollider is an environment and programming language for real-time audio synthesis and algorithmic composition. It provides an interpreted object-oriented language which functions as a network client to a state-of-the-art, real-time sound synthesis server.
From the album liner notes written by D.H. VanLenten:
“This recording contains samples of synthesized speech – speech artificially constructed from the basic building blocks of the English language. A machine which produces synthesized speech is called, fittingly, a talking machine. There are many possible kinds of speech synthesizers or talking machines. Instead of building and testing a variety of them, scientists at Bell Telephone Laboratories simulate their behavior with a high-speed, general purpose computer. The computer is instructed (programmed) to accept in sequence on punched cards the names of the speech sounds which make up an English sentence. It then processes this information, in accordance with the linguistic rules governing the English language, and produces an output analogous to the output of the talking machine it is programmed to simulate. The talking machine simulated by the computer in this recording would normally be operated by continuously feeding it a set of nine control signals. The signals correspond to voice pitch, voice loudness, lip opening and other speech variables. When every instant of sound is specified, and every variable accounted for, such a machine produces human-sounding speech.
Setting up the computer to simulate this talking machine requires two sets of instructions or, more precisely, a two-part computer program. One part of the computer program performs the actual sound making function – it imitates the “talking” of a talking machine. The second part consists of rules for combining individual speech sounds into connected speech, and for producing the nine control signals that activate the talking machine. Scientists at Bell Telephone Laboratories have developed a computer program that permits them to feed the names of speech sounds into the computer on punched cards. They also have devised a phonetic code using the letters of the alphabet. At present, it is made up of 22 consonant and 12 vowel sounds:
CONSONANTS: P – B – T – D – K – G – M – N – NG (as in sing) – F – V – S – Z – SH (as in she) – ZH (as in azure) – H – W – R – L – Y – TH (as in thin) – DH (as in then)
VOWELS: EE (as in bee) – I (as in ill) – AY (as in rate) – E (as in end) – AE (as in add) – AH (as in ah) – AW (as in jaw) – (as in go) – OO (as in foot) – UU (as in food) – UH (as in up) – ER (as in her)
Each speech sound is specified on a separate punched card. When a sequence of cards is fed into the computer, it “operates” on the information – following the rules set up in the second part of its program – to produce the nine control signals that activate the talking machine program. For example, if the sequence of cards, H – EE – S – AW – DH – UH – K – AE – T, is fed into the computer, the machine will say “He saw the cat,” in flat monotones. Proper inflection and phrasing are achieved by specifying on each card the changes in pitch and timing natural to human speech.
By specifying the pitch of the sounds, it also is possible to make the computer sing. In two of the samples recorded, the computer first sings a familiar tune and then, singing the same song, is accompanied by music played by another computer. The “speech” of the simulated talking machine comes out of the computer as tiny magnetized spots on half-inch magnetic tape. The tape is fed to another machine which converts the spots to a tape suitable for playing on an ordinary tape recorder.
The first eight and very last samples of synthesized speech on this recording are part of a research program aimed, principally, at formulating a minimum set of rules for making plausible English speech. The ninth and tenth selections were produced by analyzing a person’s speech and re-constructing it synthetically on a computer. The objective of this program is to duplicate the sounds and transitions made by a human speaker, including his accent and dialect.
Knowledge developed through such research programs may be useful in devising new techniques for transmitting speech more efficiently over communications systems. In the near future, for example, a person may be able to type on a keyboard and cause a typing machine thousands of miles away to speak for him. There is also the possibility that talking machines may be built for people who are unable to speak.”
Ring Modulators have been around a long time and were very popular on the earliest synthesizers. They are still popular today, and their users now include guitar players and others looking for a unique sound. A Ring Modulator needs two inputs to produce any output, but most units have an internal oscillator that can serve as one of them. This internal oscillator is usually called the “carrier” and can often be voltage-controlled from an external source. The Ring Modulator outputs the sum and difference of the carrier frequency and the audio input frequency. So if the carrier frequency is 1000 Hz (cycles per second) and the audio input frequency is 800 Hz, the Ring Modulator’s output will contain 1800 Hz and 200 Hz. Ideally you should hear neither the carrier oscillator nor the input audio waveform in the output, although in real-world use, depending on the make and model, you may hear some leakage through the unit. Many models also have an internal Low Frequency Oscillator (LFO) tied into the carrier; this LFO modulates the frequency of the carrier to expand the range of Ring Modulator effects even more. The LFO is used to create slow effects like tremolo or vibrato and may offer a choice of several waveforms, such as sine, triangle, or square. The LFO should also have an “amount” or “drive” control that lets the user select exactly how much of the LFO effect is applied to the carrier. A typical LFO frequency range is 0.1 Hz to 30 Hz. The carrier oscillator may range from as low as 1 Hz up to 3 to 7 kHz.
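The 1000 Hz and 800 Hz example can be checked with a few lines of Python. This is only a sketch under simplifying assumptions: both inputs are pure sine waves, the modulator is an ideal multiplier with no leakage, and the function names are made up for illustration. It relies on the identity sin(a)·sin(b) = ½[cos(a − b) − cos(a + b)], which is why only the sum and difference frequencies remain.

```python
import math

def ring_mod_frequencies(carrier_hz, input_hz):
    """Return (difference, sum) frequencies for two sine-wave inputs."""
    return abs(carrier_hz - input_hz), carrier_hz + input_hz

def ring_mod_sample(carrier_hz, input_hz, t):
    """Ideal ring modulator: the product of the two sine inputs at time t (seconds)."""
    return (math.sin(2 * math.pi * carrier_hz * t)
            * math.sin(2 * math.pi * input_hz * t))

def sum_difference_sample(carrier_hz, input_hz, t):
    """The same signal rebuilt from only the difference and sum components."""
    diff_hz, sum_hz = ring_mod_frequencies(carrier_hz, input_hz)
    return 0.5 * (math.cos(2 * math.pi * diff_hz * t)
                  - math.cos(2 * math.pi * sum_hz * t))

diff_hz, sum_hz = ring_mod_frequencies(1000.0, 800.0)
print(diff_hz, sum_hz)  # 200.0 1800.0
```

Comparing `ring_mod_sample` with `sum_difference_sample` at any time `t` gives the same value, to within floating-point error. With inputs richer than a sine wave, every partial of the input produces its own sum and difference pair, which is where the dense, clangorous character of the effect comes from.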
Helmholtz has an entire chapter on the sum and difference frequencies in his landmark work, “On the Sensations of Tone”. Here is a small excerpt:
“It is the occurrence of Combinational Tones, which were first discovered in 1745 by Sorge, a German organist, and were afterwards generally known, although their pitch was often wrongly assigned, through the Italian violinist Tartini (1754), from whom they are often called Tartini’s tones.”
“These tones are heard whenever two musical tones of different pitches are sounded together, loudly and continuously. The pitch of a combinational tone is generally different from that of either of the generating tones, or of their harmonic upper partials. In experiments, the combinational are readily distinguished from the upper partial tones, by not being heard when only one generating tone is sounded, and by appearing simultaneously with the second tone. Combinational tones are of two kinds. The first class, discovered by Sorge and Tartini, I have termed differential tones, because their pitch number is the difference of the pitch numbers of the generating tones. The second class of summational tones, having their pitch number equal to the sum of the pitch numbers of the generating tones, were discovered by myself.”
So it was Helmholtz himself who discovered the sum component of the combinational tones.
Here is a chart from his book that describes combinational tones that are generated from various inputs.
Sum And Difference Chart
We can use this chart, constructed over 100 years ago, to calculate the output of a Ring Modulator when certain musical ratios are presented to the X and Y inputs of the modulator. The first interval listed is the octave, but that ratio may not give a very interesting output. So let’s try the next interval listed, the Fifth. With the Fifth’s natural frequency ratio of 2:3, the output will have a fundamental frequency that is one octave lower than the lower of the two inputs. This should not sound like the typical output of a Ring Modulator and may be more musically useful to some composers.
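The arithmetic behind the octave and the Fifth can be sketched in Python using exact fractions. The helper names here are hypothetical, and the sketch assumes sine-wave inputs tuned to the exact ratios 1:2 and 2:3; it does not reproduce Helmholtz’s chart, only the sum and difference rule applied to those two ratios.

```python
from fractions import Fraction

def ring_mod_ratio_terms(lower, upper):
    """Difference and sum terms for an interval written as lower:upper."""
    return upper - lower, upper + lower

def relative_to_lower(lower, upper):
    """Express the difference and sum tones as multiples of the lower input."""
    diff, total = ring_mod_ratio_terms(lower, upper)
    return Fraction(diff, lower), Fraction(total, lower)

# Octave, 1:2 -> difference tone equals the lower input, sum tone is 3x it.
print(*relative_to_lower(1, 2))  # 1 3

# Fifth, 2:3 -> difference tone is 1/2 the lower input (one octave below),
# and the sum tone is 5/2 of it.
print(*relative_to_lower(2, 3))  # 1/2 5/2
```

For the octave, the difference tone simply lands back on the lower input, which fits the remark that this ratio is not very interesting. For the Fifth, the two output tones sit at 1 and 5 on the same scale where the inputs are 2 and 3, so the sum tone is the fifth harmonic of the difference tone and the result is heard as a pitch an octave below the lower input.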
This is an example of a vocal sample and sine wave input. To try to make some valid comparisons of the various sounds, I played a simple C scale for all examples.