Boundary conditions
Speech technology is essentially an enabling technology: it serves no independent function of its own, and can only be applied to improve existing services or to enable services that were previously impossible to provide, whether for economic, human-factors, or technological reasons. The economic factors are easiest to understand: human operators in an information or transaction service are extremely expensive, especially outside normal working hours, whereas computers keep working 24 hours a day, seven days a week, at virtually no additional cost. Human-factors issues that may call for speech technology include eyes-busy, hands-busy situations, for instance while driving a car or flying a fighter plane. Similar situations may occur when using advanced computer programmes, e.g., in Computer Aided Design. Speech technology will definitely prove to be an essential user interface modality if intelligent agents become a reality: when it is no longer necessary to instruct computers how to carry out what we want (by clicking large numbers of icons and other objects in the right order), and we can instead tell the machine what we want to accomplish. Especially in cases where we do not know exactly HOW the machine must do the job, but know approximately what we want to see as a result, speech is expected to be the medium of choice, if only because it is so easy to be vague in speech. It will then be left to the speech understanding capability of the agents, and to their (artificial) intelligence, to sort out what we want and how to accomplish it. Finally, a service like "make all the calls you can via voice over IP at a fixed flat monthly rate" would have been impossible without speaker verification technology, which prevents subscribers from passing their account data on to others so that they can use the service too.
In planning to use speech technology as an interface modality, one must be aware that speech communication is natural and efficient, but at the same time far from perfect. In human-human communication, errors and misunderstandings abound. Sometimes these are fatal, but usually the miscommunications are detected in time, and once detected they are often easy to repair. Most probably, the ease with which humans detect and repair speech understanding problems is due to the enormous amount of redundancy in the communication channels, as well as in the tasks being performed. Many acoustic confusions between words never occur to us, simply because the wrong recognition alternative does not make sense. Without noticing it, humans use an enormous amount of 'intelligence' to support speech recognition. For machines, which must make do without much knowledge of the world around us, there is little to make the interpretation "wreck a nice beach" less meaningful than "recognise speech". It is highly unlikely that we will be able to increase the speech recognition capabilities of machines far beyond human capabilities, to the point where they could use the subtle differences in the way the two phrases are pronounced to avoid the wrong interpretation. Consequently, we have no choice but to provide the artificial agents with the means to discard nonsensical interpretations of the speech on semantic and pragmatic grounds. This remains a major challenge.
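The idea of discarding nonsensical interpretations can be illustrated with a minimal sketch. It assumes a recogniser that returns competing hypotheses with acoustic log-scores, and a toy domain model that assigns each word a probability. All names and numbers below (ACOUSTIC_LOGPROB, DOMAIN_WORD_PROB, rescore) are hypothetical and chosen only for illustration; they do not come from any real system, and real semantic and pragmatic filtering is far harder than this word-level stand-in suggests.

```python
import math

# Hypothetical acoustic log-scores: the two phrases sound nearly alike,
# so the acoustic evidence alone slightly favours the wrong one.
ACOUSTIC_LOGPROB = {
    "wreck a nice beach": -10.0,
    "recognise speech": -10.2,
}

# Hypothetical word probabilities for a speech-technology domain (made-up numbers).
DOMAIN_WORD_PROB = {
    "recognise": 0.05,
    "speech": 0.08,
    "a": 0.05,
    "nice": 0.002,
    "wreck": 0.0001,
    "beach": 0.0002,
}
UNKNOWN_WORD_PROB = 1e-6  # assumed floor for words the toy model has never seen


def plausibility_logprob(phrase):
    """Average per-word log-probability of the phrase under the toy domain model."""
    words = phrase.split()
    total = sum(math.log(DOMAIN_WORD_PROB.get(w, UNKNOWN_WORD_PROB)) for w in words)
    return total / len(words)


def rescore(hypotheses, weight=1.0):
    """Rank hypotheses by acoustic score plus weighted domain plausibility, best first."""
    scored = [
        (acoustic + weight * plausibility_logprob(phrase), phrase)
        for phrase, acoustic in hypotheses.items()
    ]
    return sorted(scored, reverse=True)


# Best hypothesis on acoustic evidence alone, then after adding domain knowledge.
best_acoustic = max(ACOUSTIC_LOGPROB, key=ACOUSTIC_LOGPROB.get)
best_rescored = rescore(ACOUSTIC_LOGPROB)[0][1]
print("acoustic only:", best_acoustic)
print("with domain knowledge:", best_rescored)
```

With these made-up numbers the acoustic score alone prefers "wreck a nice beach", while the combined score prefers "recognise speech". The weight parameter controls how strongly domain knowledge may override the acoustic evidence; setting it to 0 reproduces the purely acoustic ranking.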