Abstract: Vol. 39 No. 1 (2004.3)
Special Issue: Speech-Based Interfaces in Vehicles

Review

P.1
Speech-Based Interfaces in Vehicles

Toshihiro Wakita

Recently, speech-based interfaces in vehicles have become a popular means
of improving the accessibility of in-vehicle information equipment. In this
issue, we present the results of our research into the "noise-robustness
of speech recognition" and "driver distraction."

Research Report

P.4
Noise-Robust Speech Recognition in a Car Environment Based on the Acoustic Features of Car Interior Noise

Hiroyuki Hoshino

This paper describes an efficient method of improving the noise-robustness
of speech recognition in a noisy car environment by considering the acoustic
features of the car's interior noise. We analyzed the relationship between
Articulation Index values and recognition rates in car environments under
different driving conditions, and found that the recognition rate worsens
significantly when the engine noise (periodic sound) components in the
frequency range above 200 Hz are large. We developed a preprocessing method
that improves noise-robustness even under large amounts of engine noise.
With this method, the cutoff frequency of the front-end high-pass filter
is changed adaptively between 200 and 400 Hz according to the level of the
engine noise components. The method improved the average recognition rate
for all eight cars under the second range acceleration condition by 11.9%,
and improved the recognition rate for one of the cars by 38.6%.
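
The adaptive cutoff idea can be illustrated with a minimal sketch. The abstract gives only the 200 to 400 Hz range, so everything else here is an assumption made for illustration: the linear mapping from noise level to cutoff, the `low_db` and `high_db` thresholds, and the first-order RC filter that stands in for the paper's front-end filter.

```python
import math


def select_cutoff_hz(engine_noise_db, low_db=50.0, high_db=70.0,
                     min_cutoff_hz=200.0, max_cutoff_hz=400.0):
    """Map an engine-noise level to a cutoff between 200 and 400 Hz.

    low_db and high_db are hypothetical thresholds; the linear mapping
    is an assumption, not the rule used in the paper.
    """
    if high_db <= low_db:
        raise ValueError("high_db must be greater than low_db")
    t = (engine_noise_db - low_db) / (high_db - low_db)
    t = min(1.0, max(0.0, t))  # clamp so the cutoff stays in range
    return min_cutoff_hz + t * (max_cutoff_hz - min_cutoff_hz)


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """Apply a first-order RC high-pass filter to a list of samples."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    filtered = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        filtered.append(y)
        prev_in, prev_out = x, y
    return filtered


# Louder engine noise selects a higher cutoff before recognition.
cutoff = select_cutoff_hz(engine_noise_db=65.0)
cleaned = high_pass([0.0, 0.5, 1.0, 0.5, 0.0], cutoff, sample_rate_hz=16000)
```

Raising the cutoff only when engine noise is strong keeps as much low-frequency speech energy as possible under quieter driving conditions.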

P.10
Estimating Speech-Recognizer Performance Based on Log-Likelihood Difference Distribution of Word-Pairs

Ryuta Terashima

This paper describes an efficient method of estimating word recognition
rates without speech data. The method is based on the minimum value of the
word-pair recognition rate, which corresponds to the word recognition rate.
The estimated word-pair recognition rate is calculated from the measured
log-likelihood difference distribution, which can be obtained by phoneme
recognition and is assumed to be approximated by a normal distribution.
To illustrate the effectiveness of the method, we evaluated its performance
in actual recognition experiments using 3000 word-pairs. The correlation
coefficient between the estimated and the measured recognition rates was
0.87 when the phoneme lengths of the word-pairs were equal. Furthermore,
we evaluated a 95% confidence interval for the measured recognition rates.
The percentage of estimated words that fell within the confidence interval
was 94.8%.
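
One way to read the normal-distribution assumption is sketched below. It is an interpretation, not the paper's implementation. It assumes a word-pair is recognized correctly when the log-likelihood difference (correct word minus competing word) is positive, so the pair rate is the normal probability mass above zero. The function names and the sign convention are hypothetical.

```python
import math
from statistics import mean, stdev


def pair_recognition_rate(log_likelihood_diffs):
    """Estimate P(difference > 0) under a normal approximation.

    Each value is assumed to be (score of the correct word) minus
    (score of the competing word); this sign convention is an assumption.
    """
    mu = mean(log_likelihood_diffs)
    sigma = stdev(log_likelihood_diffs)  # needs at least two samples
    if sigma == 0.0:
        # Degenerate case: all samples are identical.
        return 1.0 if mu > 0 else 0.0
    # For X ~ N(mu, sigma^2), P(X > 0) equals Phi(mu / sigma).
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))


def word_recognition_rate(diffs_by_competitor):
    """Take the minimum pair rate over all competing words."""
    return min(pair_recognition_rate(d) for d in diffs_by_competitor.values())


# Hypothetical differences for one target word against two competitors.
rate = word_recognition_rate({
    "competitor_a": [1.0, 2.0, 3.0],
    "competitor_b": [-1.0, 1.0],
})
```

Taking the minimum reflects the abstract's statement that the lowest word-pair rate corresponds to the word recognition rate: the most confusable competitor dominates.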

P.16
Voice Information System that Adapts to Driver's Mental Workload

Yuji Uchiyama, Shinichi Kojima, Takero Hongo, Ryuta Terashima, Toshihiro Wakita

With in-vehicle information systems, there is a danger that voice messages
will distract the user while driving. To reduce this danger, the system
should ideally adapt to the driver's mental workload. Such an adaptive
system would deliver voice messages only when the driver's mental workload
is low, and suppress messages whenever the workload is high. Such a system
would therefore have to estimate the driver's current workload from the
outputs of the car's sensors, such as the speed, steering wheel angle, and
accelerator pedal position. To establish a relationship between the driver's
mental workload and the data output by the car's sensors, a dual-task
experiment was conducted on a public road. In this experiment, participants
performed a memory task while driving a test car, and the data from the
car's sensors was recorded at the same time. The correlation coefficients
between the memory-task performance and the sensor data showed that the
driver's release of the accelerator pedal was the most significant indicator
of workload. Based on these results, a workload estimation model was
developed and applied to a prototype voice information system in a test
car. The driving situations in which the system postpones the delivery of
voice messages were then confirmed.
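
The correlation step and the postponement rule can be sketched as follows. The abstract does not describe the workload model itself, so this is only an illustration. The sensor feature names, the sample values, and the `threshold` in `should_deliver` are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def strongest_indicator(task_scores, sensor_features):
    """Return the feature whose correlation with task score is largest in magnitude."""
    return max(sensor_features,
               key=lambda name: abs(pearson(sensor_features[name], task_scores)))


def should_deliver(estimated_workload, threshold=0.5):
    """Deliver a voice message only while estimated workload is below a threshold."""
    return estimated_workload < threshold


# Hypothetical per-segment data: memory-task score and two sensor features.
scores = [0.9, 0.8, 0.4, 0.3]
features = {
    "accelerator_released": [0.0, 0.0, 1.0, 1.0],
    "speed_kmh": [40.0, 60.0, 50.0, 45.0],
}
best = strongest_indicator(scores, features)
```

A lower memory-task score is taken here as a sign of higher workload, which is the usual reading of a dual-task experiment. A message that `should_deliver` rejects would be queued and retried later.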

P.23
Evaluating the Safety of Verbal Interface Use while Driving

Shinichi Kojima, Yuji Uchiyama, Hiroyuki Hoshino, Takero Hongo

This paper proposes a method of evaluating the degree of safety of a verbal
interface that is used while driving. Recently there have been concerns
about driver distraction when a person uses voice commands to operate
in-vehicle multimedia systems while driving, since such distraction has the
potential to cause or contribute to a crash. Our evaluation method measures
the reaction time from the instant that an in-vehicle LED (positioned in
the driver's peripheral vision) is turned on to the time that the subject
presses a button. A histogram of the many reaction-time measurements showed
that the number of delayed-reaction trials increased when the subjects used
a verbal interface, compared with the condition in which they were only
driving. This suggests that the rate of delayed-reaction trials can serve
as an evaluation index. Based on data obtained with an actual vehicle, we
found that our method produces more useful results than other methods that
use the average reaction time as an index. Additionally, we show that
processing the delayed-reaction trials reveals the point during a verbal
task at which the subjects' reactions are delayed.
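
The proposed index can be illustrated with a small sketch. The abstract does not say how a "delayed" trial is defined, so the 1.0 s threshold and the sample reaction times below are hypothetical.

```python
def delayed_rate(reaction_times_s, delay_threshold_s=1.0):
    """Return the fraction of trials slower than the threshold.

    delay_threshold_s is a hypothetical cutoff for a "delayed" trial.
    """
    if not reaction_times_s:
        raise ValueError("need at least one trial")
    delayed = sum(1 for t in reaction_times_s if t > delay_threshold_s)
    return delayed / len(reaction_times_s)


def mean_reaction_time(reaction_times_s):
    """Return the average reaction time, the index used by other methods."""
    if not reaction_times_s:
        raise ValueError("need at least one trial")
    return sum(reaction_times_s) / len(reaction_times_s)


# Hypothetical trials in seconds: driving only vs. using a verbal interface.
driving_only = [0.50, 0.60, 0.55, 0.60, 0.65]
with_verbal_task = [0.50, 0.55, 0.60, 0.60, 1.80]

rate_baseline = delayed_rate(driving_only)
rate_verbal = delayed_rate(with_verbal_task)
```

A single long delay moves the average only moderately, while the delayed-trial rate counts it directly. This is one plausible reason why a rate-based index could be more informative than the mean.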