United States Patent Application 20020040317, April 4, 2002
A method and apparatus are described for asynchronously conducting interviews through a user interface executing on a client. The user interface prompts an interviewee for at least one audio response, which is digitally recorded. The user interfaces are generated by the client according to code defining them that is downloaded from a server via, for example, the Internet. The server may be remote from the client, thereby allowing interviewees to interact with the user interfaces on their own computers. The user interface queries an interviewee, and the interviewee responds either by entering text or by digitally recording a response using controls supplied by the user interface. The responses are uploaded via, for example, the Internet to a server. Evaluators may review an interviewee's responses through user interfaces. A user (e.g., an interviewer) may specify the format of asynchronous interviews by providing user input that specifies the queries to ask, the manner of asking the queries, and the manner in which an interviewee may respond. Based on the user input, data defining the format of an asynchronous interview is generated and may be stored, for example, on a server.
Inventors: Neumeyer; Leonardo (Palo Alto, CA); Rtischev; Dimitry (Los Altos, CA); Doval; Diego (Mountain View, CA); Gargiulo; Juan (Palo Alto, CA); Parker; Dylan (Palo Alto, CA)
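For illustration only, the following Python sketch shows one way the interview format an interviewer specifies (queries, how they are asked, how the interviewee may respond) and the responses later sent back to a server might be represented. The class and field names are assumptions, not details taken from the application above.

    # Illustrative sketch of an interviewer-defined interview format and of the
    # responses an interviewee's client would later send to the server. All
    # names here are assumptions, not taken from the application.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Query:
        prompt_text: str           # the question shown or read to the interviewee
        ask_mode: str              # e.g. "text" or "recorded_audio" prompt
        response_mode: str         # e.g. "text" or "recorded_audio" answer

    @dataclass
    class InterviewFormat:
        interview_id: str
        queries: List[Query] = field(default_factory=list)

    @dataclass
    class Response:
        interview_id: str
        query_index: int
        text: str = ""             # filled when the interviewee types an answer
        audio_bytes: bytes = b""   # filled when the interviewee records an answer

    # The interviewer defines the format; a server would store it and serve the
    # code that renders the corresponding user interface on the client.
    fmt = InterviewFormat("screening-001", [
        Query("Tell us about your background.", "text", "recorded_audio"),
        Query("Why are you interested in this role?", "text", "text"),
    ])

    # Later, the interviewee's client produces responses and sends them back.
    answer = Response("screening-001", 0, audio_bytes=b"...digitized speech...")
    print(fmt.interview_id, len(fmt.queries), answer.query_index)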
United States Patent 6,760,697, July 6, 2004
Described herein is a system that enables service providers to integrate speech functionality into their applications. A service provider maintains a set of application servers. To provide a particular speech service to a client of the application server, the application server causes the client to request the speech service from another set of servers. This set of servers is responsible for providing this speech service as well as others. Such speech services include recording digital speech data at the client and storing the recordings. Later, the application servers may retrieve the recordings, or even data derived from the recordings, such as data generated through speech recognition processes.
Inventors: Neumeyer; Leonardo (Palo Alto, CA); Rtischev; Dimitry (Menlo Park, CA); Doval; Diego (Mountain View, CA); Gargiulo; Juan (Mountain View, CA)
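The three-party interaction described in the patent above can be pictured with a small sketch: the application server points its client at a speech server for the recording service, and later pulls back the recording or data derived from it. Every class, method, and identifier below is an illustrative assumption, not the patented interface.

    # Rough sketch of the interaction: application server -> client -> speech
    # server, with the application server later retrieving derived data.
    class SpeechServer:
        def __init__(self):
            self._recordings = {}
            self._transcripts = {}

        def record(self, session_id, audio_bytes):
            self._recordings[session_id] = audio_bytes
            # A real deployment might run speech recognition here.
            self._transcripts[session_id] = "<recognized text placeholder>"

        def get_recording(self, session_id):
            return self._recordings.get(session_id)

        def get_derived_data(self, session_id):
            return self._transcripts.get(session_id)

    class ApplicationServer:
        def __init__(self, speech_server):
            self.speech_server = speech_server

        def start_session(self, client):
            # Instead of handling audio itself, the application server points
            # the client at the speech server that provides the recording service.
            client.use_speech_service(self.speech_server, session_id="sess-42")

        def review(self, session_id):
            return self.speech_server.get_derived_data(session_id)

    class Client:
        def use_speech_service(self, speech_server, session_id):
            speech_server.record(session_id, b"...captured audio...")

    speech = SpeechServer()
    app = ApplicationServer(speech)
    app.start_session(Client())
    print(app.review("sess-42"))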
United States Patent 6,302,695, October 16, 2001
A method for language fluency training on a computer system having an audio output device includes invoking a web browser program and receiving a pre-recorded file containing a message in a spoken language from a conversation partner. The message is played on the audio output device, from within the web browser program, to a user seeking fluency training in the spoken language. Asynchronously with playing the message, a user file containing the user's response in the spoken language is recorded from within the web browser program and output to the conversation partner and to a language instructor. An instruction file containing an instruction message in the spoken language is then received from the language instructor in response to the user's message, and the instruction message is played to the user on the audio output device from within the web browser program.
Inventors: Rtischev; Dimitry (Menlo Park, CA); Hubbard; Philip L. (Stanford, CA); Neumeyer; Leonardo (Palo Alto, CA); Shibatani; Kaori (San Francisco, CA)
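As a hedged sketch of the asynchronous exchange described above (play a partner's recorded message, record the learner's reply, send it to both the partner and an instructor, then play back the instructor's feedback), the following toy flow uses in-memory "mailboxes"; the function names and data structures are assumptions for illustration only.

    # Toy model of the asynchronous message flow in the abstract above.
    mailboxes = {"user": [], "partner": [], "instructor": []}

    def receive(who):
        return mailboxes[who].pop(0) if mailboxes[who] else None

    def send(who, recording):
        mailboxes[who].append(recording)

    def play(recording):
        print("playing:", recording["label"])

    def record(label):
        return {"label": label, "audio": b"..."}

    # Conversation partner leaves a message for the learner.
    send("user", record("partner message in target language"))

    # Learner listens, then records a reply asynchronously and sends it on.
    incoming = receive("user")
    play(incoming)
    reply = record("learner reply")
    send("partner", reply)
    send("instructor", reply)

    # Instructor reviews the reply and sends spoken feedback, which is played back.
    send("user", record("instructor feedback"))
    play(receive("user"))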
United States Patent 6,256,607, July 3, 2001
An automatic recognition system and method divides observation vectors into subvectors and determines a quantization index for each subvector. The subvector indices can then be transmitted or otherwise stored and used to perform recognition. In a further embodiment, recognition probabilities are determined separately for the subvectors, and these probabilities are combined to generate probabilities for the observation vectors. An automatic system for assigning bits to the subvector indices can be used to improve recognition.
Inventors: Digalakis; Vassilios (Hania, GR); Neumeyer; Leonardo (Palo Alto, CA); Tsakalidis; Stavros (Makrohori Verias, GR); Perakakis; Manolis (Ierapetra, GR)
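A minimal numerical sketch of the subvector-quantization idea in the patent above: split each observation vector into subvectors, quantize each subvector against its own codebook, and keep only the codebook indices. The k-means codebooks, the particular subvector split, and the toy data below are illustrative choices, not details taken from the patent.

    # Toy illustration of subvector quantization of observation vectors.
    import numpy as np

    rng = np.random.default_rng(0)
    dim, sub_dims = 12, [4, 4, 4]          # 12-dim observation split into three 4-dim subvectors
    vectors = rng.normal(size=(500, dim))  # stand-in for acoustic feature vectors

    def kmeans(data, k=8, iters=20):
        centers = data[rng.choice(len(data), k, replace=False)]
        for _ in range(iters):
            idx = np.argmin(((data[:, None, :] - centers) ** 2).sum(-1), axis=1)
            for j in range(k):
                if np.any(idx == j):
                    centers[j] = data[idx == j].mean(axis=0)
        return centers

    # One codebook per subvector; the number of codewords per subvector could be
    # chosen by an automatic bit-allocation procedure, as the abstract suggests.
    splits = np.split(vectors, np.cumsum(sub_dims)[:-1], axis=1)
    codebooks = [kmeans(s) for s in splits]

    def quantize(vec):
        """Return one codebook index per subvector for a single observation."""
        parts = np.split(vec, np.cumsum(sub_dims)[:-1])
        return [int(np.argmin(((cb - p) ** 2).sum(axis=1)))
                for p, cb in zip(parts, codebooks)]

    print(quantize(vectors[0]))   # e.g. [3, 5, 1] -- indices to store or transmit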
United States Patent 6,226,611, May 1, 2001
Pronunciation quality is automatically evaluated for an utterance of speech based on one or more pronunciation scores. One type of pronunciation score is based on the duration of acoustic units. Examples of acoustic units include phones and syllables. Another type of pronunciation score is based on a posterior probability that a piece of input speech corresponds to a certain model, such as an HMM, given the piece of input speech. Speech may be segmented into phones and syllables for evaluation with respect to the models. The utterance of speech may be an arbitrary utterance made up of a sequence of words which had not been encountered before. Pronunciation scores are converted into grades as would be assigned by human graders. Pronunciation quality may be evaluated in a client-server language instruction environment.
Inventors: Neumeyer; Leonardo (Palo Alto, CA); Franco; Horacio (Atherton, CA); Weintraub; Mitchel (Fremont, CA); Price; Patti (Menlo Park, CA); Digalakis; Vassilios (Chania, GR)
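To make the two score types named in the abstract above concrete, the following simplified sketch computes a duration-based score and a posterior-based score over hypothetical phone segments and combines them into a grade. The segmentation, the reference duration statistics, and the combination weights are all made-up illustrative values, not the patented scoring formulas.

    # Simplified sketch of duration-based and posterior-based pronunciation scores.
    import math

    # Hypothetical phone segments: (phone, duration in frames, mean log-posterior)
    segments = [("ae", 9, -0.4), ("p", 5, -1.2), ("l", 7, -0.6)]

    # Assumed duration statistics (mean, std dev) per phone.
    duration_stats = {"ae": (10.0, 2.0), "p": (6.0, 1.5), "l": (8.0, 2.5)}

    def duration_score(segs):
        """Average log-likelihood of observed phone durations under Gaussian stats."""
        logls = []
        for phone, dur, _ in segs:
            mu, sigma = duration_stats[phone]
            logls.append(-0.5 * ((dur - mu) / sigma) ** 2
                         - math.log(sigma * math.sqrt(2 * math.pi)))
        return sum(logls) / len(logls)

    def posterior_score(segs):
        """Average per-phone log-posterior of the speech given the phone models."""
        return sum(post for _, _, post in segs) / len(segs)

    def grade(d_score, p_score, w=(0.5, 0.5)):
        """Map machine scores to a 1-5 grade; the linear mapping is illustrative."""
        combined = w[0] * d_score + w[1] * p_score
        return max(1.0, min(5.0, 5.0 + combined))   # shift/clip to the grading scale

    d, p = duration_score(segments), posterior_score(segments)
    print(round(d, 2), round(p, 2), grade(d, p))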
United States Patent 6,055,498, April 25, 2000
Pronunciation quality is automatically evaluated for an utterance of speech based on one or more pronunciation scores. One type of pronunciation score is based on duration of acoustic units. Examples of acoustic units include phones and syllables. Another type of pronunciation score is based on a posterior probability that a piece of input speech corresponds to a certain model, such as a hidden Markov model, given the piece of input speech. Speech may be segmented into phones and syllables for evaluation with respect to the models. The utterance of speech may be an arbitrary utterance made up of a sequence of words which had not been encountered before. Pronunciation scores are converted into grades as would be assigned by human graders. Pronunciation quality may be evaluated in a client-server language instruction environment.
Inventors: Neumeyer; Leonardo (Palo Alto, CA); Franco; Horacio (Atherton, CA); Weintraub; Mitchel (Fremont, CA); Price; Patti (Menlo Park, CA); Digalakis; Vassilios (Chania, GR)
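The abstract above also notes that machine pronunciation scores are converted into grades as would be assigned by human graders. One common way to do that, offered here purely as an assumption and not as a claim about the patented method, is to fit a simple regression from machine scores to human grades on a development set and apply it to new utterances.

    # Illustrative score-to-grade mapping fitted on hypothetical development data.
    import numpy as np

    machine_scores = np.array([-2.8, -2.1, -1.6, -1.0, -0.5, -0.2])
    human_grades   = np.array([ 1.5,  2.0,  2.5,  3.5,  4.0,  4.5])

    # Least-squares fit of grade = a * score + b.
    a, b = np.polyfit(machine_scores, human_grades, deg=1)

    def to_grade(score):
        return float(np.clip(a * score + b, 1.0, 5.0))

    print(round(to_grade(-1.3), 2))   # machine score for a new utterance -> grade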
United States Patent 5,864,810, January 26, 1999
A method and apparatus for automatic recognition of speech adapts to a particular speaker by using adaptation data to develop a transformation through which speaker-independent models are transformed into speaker-adapted models. The speaker-adapted models are then used for recognition and achieve better recognition accuracy than non-adapted models. In a further embodiment, the transformation-based adaptation technique is combined with a known Bayesian adaptation technique.
Inventors: Digalakis; Vassilios (Crete, GR); Neumeyer; Leonardo (Menlo Park, CA); Rtischev; Dimitry (Fremont, CA)
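As a much-simplified sketch of transformation-based adaptation in the spirit of the patent above, the following code estimates a single affine transform of speaker-independent Gaussian means from adaptation data (MLLR-style) and then blends it with a simple Bayesian-style per-Gaussian update. One Gaussian per unit, hard frame assignments, and all constants below are illustrative assumptions, not the patented estimation procedure.

    # Simplified transformation-based adaptation of Gaussian means.
    import numpy as np

    rng = np.random.default_rng(1)
    dim, n_gauss = 3, 4
    si_means = rng.normal(size=(n_gauss, dim))          # speaker-independent means

    # Adaptation data: frames already assigned to Gaussians (hard alignment).
    assign = rng.integers(0, n_gauss, size=200)
    true_shift = np.array([0.8, -0.3, 0.5])             # simulated speaker offset
    frames = si_means[assign] + true_shift + 0.1 * rng.normal(size=(200, dim))

    # Estimate one shared affine transform mu_adapted = A @ mu_si + b by least
    # squares between speaker-independent means and per-Gaussian sample means.
    sample_means = np.vstack([frames[assign == g].mean(axis=0) for g in range(n_gauss)])
    X = np.hstack([si_means, np.ones((n_gauss, 1))])    # append 1 for the bias term
    W, *_ = np.linalg.lstsq(X, sample_means, rcond=None)
    transformed_means = X @ W                            # transformation-based estimate

    # Simple Bayesian-style (MAP-like) combination: weight the transformed mean
    # and the per-Gaussian sample mean by the amount of adaptation data observed.
    tau = 20.0                                           # assumed prior weight
    counts = np.array([(assign == g).sum() for g in range(n_gauss)])[:, None]
    adapted = (tau * transformed_means + counts * sample_means) / (tau + counts)

    print(np.round(adapted - si_means, 2))               # roughly recovers the shift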