Speech recognition – fact or fiction? - Veterinary Practice


Adam Bernstein suggests there are money and time savings to be made, and talks to a number of software providers who explain how the technology works.

KUBRICK’S MASTERPIECE, 2001: A SPACE ODYSSEY, features
a number of technologies that have
moved from film to workplace. One,
speech recognition, is now gaining
traction and, while we have some way
to go before we have sentient
computers, the latest developments in
speech recognition now make it
suitable for many environments,
including veterinary practices – for
both secretarial and vet use alike.

In simple terms, speech recognition
systems are computer-based applications that accept a source of
audio and turn that into text with
(in the practice context) the aid of a
medical-specific vocabulary.

Sarah Fisher, responsible for
healthcare regional marketing at
Nuance Communications, says the
technology allows practice staff to
dictate directly into patient records
and other clinical documentation “to
document animal care in their own
words and at the point of care…
[while] spending less time typing and
clicking”.

Indeed she reckons, with Nuance’s
software at least, that staff can navigate
systems using voice commands up to
three times faster than most people can
type or click with a mouse.

There are, says Dr Andrew Whiteley,
managing director of Lexacom, two
different ways of converting speech
to text: “The first system uses a
first person (foreground) method,
whereby words appear on the screen
as they are dictated, and the second
uses a deferred (background) method
whereby a document is dictated and
the audio is sent to a server. The server
turns that document into text and
returns it.”

Foreground is typically used for
data entry, while background is
often used for creating letters and
lengthier documents. Both are a
marked improvement on tape
transcription, which can take days to
turn around.

Clearly anything that can be used to
reduce the burden of data entry into
clinical systems is to be welcomed,
which means there is great potential
for speech recognition.

In particular, the technology can reduce administration time for vets
while improving accuracy of captured
information within clinical systems. For
secretaries, background systems can
be used to reduce the time and effort
involved in creating referral letters and
other correspondence.

John Bendall, director of UK
Operations at Crescendo Systems,
makes the point that savings follow
from either approach because of
reduced keyboard activity and this
improved accuracy. He adds that
systems aren’t dumb and that “any
corrections made by the user to
recognised text are returned to the
system so it can learn new words”.

“The benefits are obvious,”
says Whiteley. “It can reduce the
turnaround times of letters, improving
referrals, and can also reduce the cost
of secretarial support while smoothing
out fluctuations in typing demand.”
For Fisher the saving is clear: “Time
saved can equate to up to 30 to 60
minutes per day, per user.”

A few considerations

There are no drawbacks as such, but
it’s important to understand that there
is no “one size fits all” approach
to speech recognition and so the
technology needs to be used with
common sense.

Because it is sound-actuated, speech
recognition will not work well in noisy
or disruptive environments. Further,
systems need high-quality recording
devices, and the author needs to
dictate in a specific way, being explicit
about the punctuation for instance.
There may also be some accuracy
issues: busy vets may not check the
words as they appear to ensure the
system has transcribed them correctly.

It’s also worth noting that there may
be privacy and confidentiality issues,
because details that are spoken instead
of typed can be overheard by clients
and staff alike. This is less of an issue
for veterinary practices than for GP
surgeries, but privacy issues do remain.

One obvious question that follows
is how systems cope with regional
accents. Well, reckons Whiteley. He
says systems are configured to learn
and adapt to such things: “Speech
recognition is based on the probability
of a word being that word, rather than
another. In medicine, there are lots of
complicated words; unique sounding
words, for instance the probability of
‘feline lymphoma’ being anything [else]
is very small.”

He notes, however, that saying “tree”
could lead a system to think it’s
“three”, or “thee”, etc. The advice is
that time and patience are required
during the learning process, which
can be likened to teaching Siri on the
iPhone to recognise a given user’s
voice.
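
Whiteley’s point about probabilities can be pictured with a toy example. The sketch below is only an illustration, not how any of the products mentioned here work: the function name `pick_word` and every number in it are invented, and real systems use far richer acoustic and language models. It multiplies how closely each candidate word matches the sound by how likely that word is for this user, and picks the highest score.

```python
def pick_word(acoustic_scores, vocabulary_prior):
    """Return the candidate with the highest (acoustic score x prior) product.

    Both arguments are hypothetical dictionaries mapping a word to a number
    between 0 and 1; neither comes from a real speech recognition product.
    """
    return max(
        acoustic_scores,
        key=lambda word: acoustic_scores[word] * vocabulary_prior.get(word, 0.0),
    )


# Invented numbers: how well each candidate matches what was heard.
acoustic = {"tree": 0.40, "three": 0.35, "thee": 0.25}

# Invented numbers: how likely each word is before any user training.
general_prior = {"tree": 0.3, "three": 0.6, "thee": 0.1}

# Invented numbers: the same prior after the user has corrected
# "three" back to "tree" a number of times.
adapted_prior = {"tree": 0.6, "three": 0.3, "thee": 0.1}

print(pick_word(acoustic, general_prior))
print(pick_word(acoustic, adapted_prior))
```

With these made-up figures the untrained prior favours “three” even though “tree” was the closer sound, and the adapted prior tips the choice back to “tree”. That is the sense in which time and patience during the learning process pay off.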

Fisher says it’s important to
recognise that, in most situations,
success follows from understanding
the working practices of the clinic
and from creating standard templates
to speed up processes.

What to look for

The technology isn’t cheap but it can
offer good value and savings over time.
Here Fisher says that Nuance estimates
– for a single user – its “software will
pay for the investment in software, the
microphone and half a day of training
and set-up in less than six weeks” (the
company is assuming a vet “costs”
£70 per hour and works five days
per week). And on top of that come
savings from reduced administration,
improved quality of notes, and an
improved client experience.

Standalone speech recognition
systems can be bought for
£800-£1,000 per user plus the cost of
training. Nuance, for example,
charges £995 for the software and its
partners may charge between £350
and £500 for training and set-up.
Crescendo’s charges vary from £899 for
a one-off licence to as low as £700 for
larger volumes.
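
For practices that want to test the payback claim against their own numbers, the arithmetic is simple. The sketch below is a rough check rather than Nuance’s own calculation: the function name `payback_weeks` is made up for illustration, the microphone is left out because no price is quoted, and the time saved is taken from Fisher’s range of 30 to 60 minutes per day.

```python
def payback_weeks(upfront_cost_gbp, minutes_saved_per_day,
                  hourly_cost_gbp=70.0, days_per_week=5):
    """Weeks until the value of the time saved covers the up-front cost.

    Defaults follow the assumptions quoted in the article: a vet "costs"
    GBP 70 per hour and works five days per week.
    """
    weekly_saving = hourly_cost_gbp * (minutes_saved_per_day / 60.0) * days_per_week
    return upfront_cost_gbp / weekly_saving


# Quoted prices: GBP 995 for software, GBP 350 to GBP 500 for training and set-up.
# The microphone price is not given in the article, so it is not included.
for upfront in (995 + 350, 995 + 500):
    for minutes in (30, 60):
        weeks = payback_weeks(upfront, minutes)
        print(f"GBP {upfront} up front, {minutes} min/day saved: {weeks:.1f} weeks")
```

On these figures the payback period runs from roughly four weeks, if a full hour a day is saved, to around eight weeks at 30 minutes a day, so the result depends heavily on how much time a given vet really saves.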

For some, this large up-front cost,
combined with uncertainty about how
successful a system will be, may
be off-putting. There is an
alternative – practices can subscribe on
a monthly basis. Lexacom, for example,
charges £20 per user per month – for
an embedded application which, says
Whiteley, “fully integrates with all
primary care clinical systems”.

This option may well be helpful
to practices that don’t want a large
single bill before understanding
whether speech recognition is suitable
for them. However, they will also need
to subscribe to other products from
Lexacom for the service to work.
(Nuance also offers subscription
access, with prices being set by its
resellers and Crescendo is in the
process of launching a subscription
model too.)
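
A quick way to weigh a subscription against buying outright is to work out how many months of fees add up to the purchase price. The sketch below is a simplification that compares list prices only: `breakeven_months` is a made-up helper name, and it ignores training, the other Lexacom products a subscriber would need (no prices are quoted for those) and any differences between the products themselves.

```python
import math


def breakeven_months(purchase_price_gbp, monthly_fee_gbp):
    """Whole months of subscription fees needed to reach the purchase price."""
    return math.ceil(purchase_price_gbp / monthly_fee_gbp)


# Quoted prices: GBP 20 per user per month versus GBP 800 to GBP 1,000 outright.
for price in (800, 995, 1000):
    months = breakeven_months(price, 20)
    print(f"GBP {price} outright equals about {months} months at GBP 20 per month")
```

On list prices alone, it takes 40 to 50 months of subscription fees to match the outright cost, which leaves plenty of time to find out whether the technology suits the practice.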

There is another cost consideration,
one raised by Bendall: a system
must allow for both foreground
and background speech recognition
to eliminate the need to pay for two
licences.

Overall, it’s the outright cost or
subscription obligation that makes it
important that practices try before they
buy and also ensure they have buy-in
from potential users. Further, practices
need to look at what problems they are
trying to solve.

As Whiteley puts it: “If they are
buying the technology to reduce
the workload for a secretarial team,
transferring the burden to the vet is
unlikely to help.” In this scenario, he
recommends a background system.

For these reasons practices should
look for a system that allows the
use of digital dictation, outsourced
transcription and speech recognition,
so that there is a variety of solutions
for each user.

Fisher agrees on this: “Some may
prefer to dictate with their client
listening during a consultation;
some may wish to summarise the
consultation after the face-to-face.”
She thinks the software used should
support whatever work practice the
individual prefers.

It also makes sense to have a system
that, if required, integrates well with
other practice systems and which is
virtually training-free. Bendall suggests
a system must be capable of handling
“roaming profiles” so vets can move
between consultation rooms and use
their personal profile for the best
accuracy.

Also, systems should be capable of
creating voice shortcuts so individuals
can perform commands by voice such
as launching a specific template or
opening a client record.

Adoption of the technology is slow
but growing, as it overcomes the legacy
of earlier (failed) attempts to use it.
Says Fisher:
“Many medical professionals may
have tried earlier versions of speech
software years ago and may not have
been impressed with the results.
However, modern speech recognition
solutions take minutes to get set up
and get going.

“The latest speech recognition
solutions designed specifically for
medical situations – combined with
today’s more powerful PCs – boast
performance and capabilities that far
exceed their predecessors.”

Whatever practices do, they
shouldn’t necessarily choose the
cheapest option and certainly not one
without a medical dictionary built in.
Bendall illustrates this: he says Dragon
Medical in a surgery is over 30% more
accurate than Dragon Professional,
and that “choosing non-medical
versions of speech recognition means
the users have to add drug names
and terminology on an ongoing basis,
which takes time and can be very
frustrating”.
