Usage of Voice Recognition inside an Elevator
This paper was presented at Paris 2014, the International Congress on Vertical Transportation Technologies, and first published in IAEE book Elevator Technology 20, edited by A. Lustig. It is a reprint with permission from the International Association of Elevator Engineers (website: www.elevcon.com). This paper is an exact reprint and has not been edited by ELEVATOR WORLD.
Key Words: Speech recognition technology, voice-controlled elevator, elevator industry, talk2lift®
In the past decades there have been several developments in the elevator industry regarding safety and speed, but also economy and convenience. The common aim has been to make the use of elevators more efficient, easier and more accessible to an ever-increasing number of people. In this direction, voice recognition has a significant impact on accessibility, while at the same time providing a faster way of finding and reaching a destination within a complex building. The present paper addresses the rationale for the development of an innovative product that uses speech technology, enabling the emergence of the first series of voice-controlled elevators in the history of the lift industry.
The present paper describes an invention related to an elevator in which the passenger's destination can be specified through voice recognition. It aims to present the rationale, the background context and the development process of the product called talk2lift®. Moreover, the paper addresses the main implications for elevator engineers, installers and lift-related industries. Finally, the paper briefly analyses the main business prospects of the product. This paper is organised in four main sections. The first section examines the background and the rationale of the innovation. The second section analyses the function of the innovation by emphasizing its innovative features and capabilities. The third section examines the practical implications and the use of the innovation, while the last one summarises the main elements of the paper by focusing on the main critical aspects of the product.
2. RATIONALE OF THE INNOVATION
In general, an elevator is provided with buttons for specifying a floor number. A user-passenger in the elevator can specify a desired floor number by pressing the button of the floor number.
In the case of a large-scale elevator, two sets of floor number buttons are provided within one cab. Furthermore, blind passengers suffer the inconvenience of having to feel around the buttons in order to find the Braille markings. Although there are already voice recognition systems that allow users to give commands using only their voice, they have a significant limitation: a limited ability to reject ambient noise, resulting in a relatively low rate of successful command recognition.
All of the above are addressed by the talk2lift® innovation. This invention relates to a voice recognition device that fits on the console of the elevator cab; via voice commands, the user-passenger of the elevator can be directed to the desired destination within the building. The voice recognition engine is designed and developed so that it does not need training on the voice of a particular user-passenger but can recognize and perform all voice commands from different users. Briefly, the main function of the invention is to convert voice commands to the corresponding keystrokes of the elevator. Beyond this basic function, the talk2lift® device, provided there is an interface to the elevator's main controller, can also collect data about the elevator's operation and send it to a central location for further processing. The system can be connected to a display in the cab of the elevator, which presents additional data to better inform the user-passenger, according to the voice commands that she/he has given.
Finally, the system can be continuously trained on recordings of ambient noise: the recordings are stored in a database and the results are distributed to all users, so that all installed systems are continuously trained in recognizing and rejecting noise, further enhancing the quality of voice recognition.
2.1 talk2lift® Function
The present invention can be fully understood from the following detailed description of its design, as illustrated in Figure 1. When a user-passenger enters the cabin of the lift, the presence sensor is activated (see Figure 1 – Point 8). The sensor then sends a signal to the central processor (see Figure 1 – Point 4), which in turn activates the microphone (see Figure 1 – Point 1) and the voice recognition unit (see Figure 1 – Point 10), and so the process of recognizing the voice command of the user-passenger begins. During the activation process, the system asks the user-passenger to state his/her destination through a particular phrase: “Please tell me the floor” or “Please tell me your destination”.
After the user-passenger of the lift answers, the microphone (see Figure 1 – Point 1) converts the voice into an electrical signal. The ADC stage (see Figure 1 – Point 2) then converts the analog signal to digital. After digitization, the DSP stage (see Figure 1 – Point 3) processes the digital signal so that it is ready for input to the central processing unit (see Figure 1 – Point 4) and then to the voice recognition unit (see Figure 1 – Point 10).
When the voice recognition unit receives the digitized signal, it compares it with an existing list of possible commands, which is stored in the storage unit (see Figure 1 – Point 5). Once the command has been identified, the appropriate signal is sent to the central processor (see Figure 1 – Point 4), which forwards the appropriate command to the controller of the lift (see Figure 1 – Point 7). As a confirmation to the user-passenger, a speech synthesizer (see Figure 1 – Point 9) repeats the recognized command. Moreover, the system can give visual confirmation to the user-passenger through the screen (see Figure 1 – Point 12).
In the event of an interruption of the power supply, the central processor detects the break and triggers the following procedure. The speech synthesizer (see Figure 1 – Point 9) asks the user-passenger whether there is a need to communicate with the appropriate emergency service (e.g., fire department, police, etc.). Then the voice recognition unit (see Figure 1 – Point 10) is activated. If the answer is positive, the central processor (see Figure 1 – Point 4) enables the unit that connects to the telephone network (see Figure 1 – Point 6), which in turn calls the appropriate service.
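The last steps of this flow (command lookup, forwarding to the lift controller and spoken confirmation) can be sketched in a few lines of Python. This is only an illustrative sketch, not the talk2lift® implementation: the command list, the `LiftController` class and the `handle_utterance` function are hypothetical names, and the recognizer is assumed to have already turned the audio into text.

```python
from dataclasses import dataclass, field

# Hypothetical command list, standing in for the storage unit (Point 5).
COMMANDS = {
    "ground floor": 0,
    "first floor": 1,
    "second floor": 2,
    "parking": -1,
}


@dataclass
class LiftController:
    """Stand-in for the lift controller (Point 7); records simulated button presses."""
    pressed: list = field(default_factory=list)

    def press(self, floor: int) -> None:
        self.pressed.append(floor)


def handle_utterance(text: str, controller: LiftController) -> str:
    """Match recognized text against the command list and forward the command.

    Returns the phrase a speech synthesizer (Point 9) could repeat to the user.
    """
    key = " ".join(text.lower().split())  # normalise case and whitespace
    if key not in COMMANDS:
        return "Sorry, I did not understand. Please tell me the floor."
    controller.press(COMMANDS[key])  # equivalent to pressing the floor button
    return f"Going to {key}."


controller = LiftController()
print(handle_utterance("First floor", controller))
```

An unrecognized phrase leaves the controller untouched and simply re-prompts the user, which mirrors the idea that the device only adds a button press when a command is positively identified.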
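The power-failure procedure can be read as a short decision sequence: prompt, listen, and dial only on a positive answer. The sketch below illustrates that sequence under stated assumptions: the `ask` and `dial` callbacks stand in for the speech synthesizer/recognition unit and the telephone unit, and the set of affirmative answers is invented for the example.

```python
from typing import Callable

# Assumed vocabulary of positive answers; the real command set is not given in the paper.
AFFIRMATIVE = {"yes", "yes please", "help"}


def on_power_failure(
    ask: Callable[[str], str],
    dial: Callable[[str], None],
    service: str = "emergency service",
) -> bool:
    """Run the power-failure dialogue once.

    `ask` plays a prompt and returns the recognized answer as text
    (synthesizer, Point 9, plus recognition unit, Point 10).
    `dial` places the call through the telephone unit (Point 6).
    Returns True if a call was placed.
    """
    prompt = f"Power has been interrupted. Do you need to contact the {service}?"
    answer = ask(prompt)
    if answer.strip().lower() in AFFIRMATIVE:
        dial(service)
        return True
    return False


# Example with stand-in callbacks: the passenger answers "yes".
placed_calls = []
on_power_failure(lambda prompt: "yes", placed_calls.append)
```

Passing the prompt and dial actions as callbacks keeps the decision logic testable without any audio or telephony hardware.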
2.2 Noise Cancellation & Voice Recognition Algorithm
The most important function of talk2lift® is its ability to interact with the user via speech recognition and synthesis. Obviously, the most difficult problem in this communication is to recognize the voice of the user without being confused by the variety of electromagnetic noise that emerges in an elevator system. Therefore, it is necessary to use a noise suppression algorithm. Unfortunately, the audible noise is not well captured by theories based on the normal (Gaussian) distribution.
As a result, most algorithms (developed on the Gaussian assumption) either fail to provide the expected results or collapse completely in the presence of shock (impulsive) noise.
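The sensitivity of Gaussian-based methods to shock noise can be illustrated with a toy example. The averaging filter, which is the natural estimator under a Gaussian noise assumption, is compared below with a median filter on a flat signal hit by a single spike. This is a generic illustration only; the paper does not state which filter talk2lift® uses, and the signal values are made up.

```python
from statistics import mean, median


def smooth(signal, width, reducer):
    """Apply a sliding-window reducer (mean or median); windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(reducer(window))
    return out


# A flat signal of value 1.0 hit by a single impulsive spike.
signal = [1.0] * 11
signal[5] = 100.0

averaged = smooth(signal, 5, mean)      # the spike is smeared over neighbouring samples
medianed = smooth(signal, 5, median)    # the spike is rejected entirely

print(max(averaged), max(medianed))
```

The averaged output is pulled far away from the true level around the spike, while the median output stays at the true level, which is the kind of robustness an impulsive-noise environment calls for.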
The particular innovation, though, includes the use of a central server, which collects samples of recognized user commands, recorded in a variety of ambient noise conditions, from all system users. The recordings received by the server are used to train the algorithm. The training results can then be shared back to the users' devices so that they contain the most up-to-date parameters of the algorithm. The algorithm is first designed and implemented in a high-level language. It is then optimized for execution time and memory requirements so that it can be implemented in the circuit of the final system. Finally, the algorithm is written in fixed-point arithmetic so that it can be executed on the available digital signal processors (DSPs).
In order to develop a noise suppression algorithm, the design, development and implementation of a specialized voice recognition board was required. This board includes the appropriate tools and interfaces for testing various techniques, circuits and algorithms. It also includes the appropriate digital signal processing circuits for testing various voice recognition and noise suppression algorithms. It is equipped with the appropriate interfaces for connecting to other circuits, measuring instruments, recorders, microphones, inputs from various sensors, etc., and to the host computer for loading algorithms. This board has been designed by a research group specialized and experienced in voice recognition technologies.
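To give a sense of what rewriting an algorithm in fixed-point arithmetic involves, the sketch below models multiplication in the 16-bit Q15 format that is common on DSPs. The choice of Q15 and the helper names are assumptions made for illustration; the paper does not specify the number format used in talk2lift®.

```python
Q = 15                                  # number of fractional bits (Q15 format)
ONE = 1 << Q                            # the integer 32768 represents 1.0
INT16_MIN, INT16_MAX = -32768, 32767    # range of a signed 16-bit word


def to_q15(x: float) -> int:
    """Quantize a float to a saturated 16-bit Q15 integer."""
    return max(INT16_MIN, min(INT16_MAX, round(x * ONE)))


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values with rounding and saturation."""
    product = a * b + (1 << (Q - 1))    # add half an LSB before shifting, for rounding
    return max(INT16_MIN, min(INT16_MAX, product >> Q))


def from_q15(a: int) -> float:
    """Convert a Q15 integer back to a float."""
    return a / ONE


# 0.5 * 0.25 computed entirely with integers.
result = q15_mul(to_q15(0.5), to_q15(0.25))
print(from_q15(result))
```

Every multiplication needs an explicit shift and a saturation check, which is why porting a floating-point prototype to a fixed-point DSP is treated as a separate optimization step.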
3. INNOVATIVE FEATURES & USE
talk2lift® brings the power of speech recognition to any elevator cabin, enabling passengers to voice-control the lift. talk2lift® allows users to pronounce the floor number or any other related information (house owner, profession, sector/area, etc.). It also supports multilingual operation.
talk2lift® is an electronic speech recognition device that can be fitted to the electronic board of an elevator’s cabin. Through the use of speech technology, passengers can talk to the elevator, pronouncing the floor they wish to go to within a building, or even the name of the person they want to visit or some attribute characterizing that person (e.g., dermatologist). The speech recognition software has been designed and developed to recognize the voice of any given user, independently of the tone or loudness of his/her voice. In short, the main function of talk2lift® is to transform the voice of the elevator’s users into an additional press of the button on the board of an elevator. talk2lift® can also be connected to a screen close to the board of an elevator, on which users can see additional information such as the selected floors along with any information publicly available to the system (e.g., occupation). Finally, beyond this basic function, the talk2lift® device, provided there is an interface to the elevator’s main controller, can also collect data about the elevator’s operation and send it to a central location for further processing.
The main capabilities of talk2lift® are the following:
- The recognition process of the system is speaker independent, meaning the user does not have to train the system to his/her voice.
- Capability of recognizing up to 10,000 predefined voice commands.
- The system has been developed and trained for best use in the environment of an elevator cabin.
- The system is designed so that it can be attached to any elevator system without the need for hardware modifications.
- The system is able to handle more than one language. For example, the system can be fitted with two buttons, one for English and one for German.
- The system is available in the following languages: German, English, French, Portuguese, Spanish, Dutch, Greek, Italian, Polish, Swedish, Turkish, Russian, Finnish, Danish, Mandarin Chinese.
3.1 Implications and Indicative Use
talk2lift® has many practical implications. First of all, the cabin holds an “extra button” which, when pressed, prompts the passengers to pronounce their destination. Users can thus still use the lift in the conventional manner, allowing for a transition phase. The extra button (see photo 2) is specially designed to be accessible to wheelchair users as well as visually impaired people. Passengers can be transferred to the desired floor just by pronouncing it (e.g., “first floor”, “ground floor” or “parking”).
It is possible to pronounce the name of an individual or a company (e.g., “Jim Brown”) or any other word which has been correlated with a floor (e.g., “tax office”, “lawyer”, “cardiology department”, “coffee”).
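The correlation of names and keywords with floors amounts to a many-to-one lookup table. A minimal sketch of such a directory follows; the `KeywordDirectory` class, its method names and the floor assignments are hypothetical and only illustrate the idea.

```python
from typing import Optional


class KeywordDirectory:
    """Maps spoken keywords (names, professions, departments) to floor numbers."""

    def __init__(self) -> None:
        self._floor_by_keyword = {}

    def associate(self, floor: int, *keywords: str) -> None:
        """Link any number of keywords to one floor; later entries overwrite earlier ones."""
        for keyword in keywords:
            self._floor_by_keyword[keyword.strip().casefold()] = floor

    def lookup(self, spoken: str) -> Optional[int]:
        """Return the floor for a recognized phrase, or None if it is unknown."""
        return self._floor_by_keyword.get(spoken.strip().casefold())


# Made-up building directory for illustration.
directory = KeywordDirectory()
directory.associate(3, "Jim Brown", "lawyer")
directory.associate(1, "tax office")
directory.associate(0, "coffee", "ground floor")

print(directory.lookup("Lawyer"))
```

Because several keywords can point to the same floor, a building manager could add a new tenant name or department without changing anything else in the table.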
In short, talk2lift® can be of great assistance to visually impaired people, supporting them in accessing and reaching their destinations. Also, inhabitants and visitors of buildings can have easy and quick access to apartments, offices, etc. Furthermore, it can save time for visitors in vast buildings (offices, hospitals, public services, etc.) by improving accuracy and reducing errors when moving inside the building. Finally, talk2lift® is an environmentally friendly system, since it saves energy by applying intelligent management system practices.
Indicative use of the system could target apartment buildings, supporting floor numbers and owners’ or renters’ names. It can also be installed in complex buildings hosting company offices, linked with keywords associating floors with company names, brands, products, employee names, sectors, etc.
Finally, talk2lift® is a useful tool for elevators in large organizations, public or private, with many floors and departments, such as hospitals, whose buildings typically host tens of departments and clinics.
The need for a solid and trustworthy voice recognition system has been well known among elevator professionals for years.
The need was apparent, and so were the various uses, as explained. The product was designed to cover all those needs, and it was tested thoroughly at KLEEMANN’s test tower, where it was installed a few months prior to its launch. During those months the product was used in real conditions by employees and visitors of KLEEMANN and showed no inaccuracy.
In sum, talk2lift® has the following innovative characteristics:
- It is speaker independent and works for new passengers without the need for prior training.
- It provides improved accuracy for lift cabins, with precision rates as high as 97%.
- It provides central and dynamic control of keyword associations to floors through an intuitive user interface.
- It adds value to both the elevator and the building by giving a sense of luxury and high-end technology.
- It does not need mechanical maintenance.
- It is customizable in order to fit specific needs.