In this section, you will learn how to use Jovo to craft a response to your users.
- Introduction to Output Types
- Basic Output
- Advanced Output
- Visual Output
- No Speech Output
What do users expect from a voice assistant? Usually, it is direct or indirect output in the form of speech, audio, or visual information. In this section, you will learn about basic output types like tell and ask, and how to use SSML or the Jovo speechBuilder to create more advanced output elements.
Jovo's basic output options offer simple methods for interacting with users through text-to-speech. If you're interested in more, take a look at Advanced Output.
The tell method is used to have Alexa or Google Assistant say something to your users. You can use plain text, SSML (Speech Synthesis Markup Language), or a speechBuilder object.
Important: The session ends after a tell call. This means the mic is off and there is no further interaction between the user and your app until the user invokes it again.
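The following is a minimal sketch of a handler that calls tell. It assumes the Jovo convention that this inside a handler refers to the Jovo object; tellStub and its sessionEnded flag are hypothetical stand-ins (not part of the framework) so the snippet can run on its own.

```javascript
// Handler map in the style of a Jovo app: inside a handler, `this` is
// expected to be the Jovo object for the current request.
const tellHandlers = {
    LAUNCH() {
        // Speak once; the session ends afterwards and the mic closes.
        this.tell('Hello World!');
    },
};

// Hypothetical stand-in for the Jovo object (not part of the framework).
// It only records the call so the sketch can run without Jovo installed.
const tellStub = {
    output: null,
    tell(speech) {
        this.output = { speech, sessionEnded: true };
    },
};

// Simulate an incoming launch request.
tellHandlers.LAUNCH.call(tellStub);
console.log(tellStub.output);
```

In a real app, only the handler map would be needed; the framework supplies the object that the stub imitates here.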
Whenever you want to make the experience more interactive and get some user input, the ask method is the way to go.
This method keeps the mic open, meaning the speech element is used initially to ask the user for some input. If there is no response, the reprompt is used to ask again.
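As a rough sketch of the speech and reprompt pair described above, the handler below passes both to ask. The example texts are made up, and askStub with its sessionEnded flag is a hypothetical stand-in for the Jovo object so the snippet runs without the framework.

```javascript
const askHandlers = {
    LAUNCH() {
        // First prompt, spoken when the handler runs.
        const speech = 'Which pizza would you like to order?';
        // Fallback prompt, used if the user does not respond.
        const reprompt = 'Sorry, I did not hear you. Which pizza would you like?';
        this.ask(speech, reprompt);
    },
};

// Hypothetical stand-in for the Jovo object (not part of the framework).
const askStub = {
    output: null,
    ask(speech, reprompt) {
        // Unlike tell, ask keeps the session (and the mic) open.
        this.output = { speech, reprompt, sessionEnded: false };
    },
};

askHandlers.LAUNCH.call(askStub);
console.log(askStub.output);
```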
Google Assistant offers the functionality to use multiple reprompts.
You can find more detail about this feature here: Platforms > Google Assistant > Multiple Reprompts.
It is recommended to use a RepeatIntent (e.g. the AMAZON.RepeatIntent) that allows users to ask your app to repeat the previous output if they missed it.
This feature makes use of the Jovo User Context. To be able to use it, please make sure that you have a database integration set up and the Jovo User Context enabled.
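The idea can be sketched as follows. This assumes the Jovo object exposes a repeat method that replays the previous output; treat that method name as an assumption and check it against your Jovo version. The repeatStub object is a hypothetical stand-in: in a real app, the previous output would come from the Jovo User Context stored in your database, not from a hard-coded field.

```javascript
const repeatHandlers = {
    RepeatIntent() {
        // Assumed Jovo method: replays the previous speech and reprompt.
        this.repeat();
    },
};

// Hypothetical stand-in for the Jovo object (not part of the framework).
const repeatStub = {
    // Stands in for what the User Context would have stored
    // from the previous response.
    previous: {
        speech: 'Your pizza is on its way. Anything else?',
        reprompt: 'Would you like anything else?',
    },
    output: null,
    ask(speech, reprompt) {
        this.output = { speech, reprompt };
    },
    repeat() {
        this.ask(this.previous.speech, this.previous.reprompt);
    },
};

repeatHandlers.RepeatIntent.call(repeatStub);
console.log(repeatStub.output);
```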
Voice platforms offer a lot more than just converting a sentence or paragraph to speech output. In the following sections, you will learn more about advanced output elements.
SSML is short for "Speech Synthesis Markup Language." You can use it to add elements such as pronunciations, breaks, or audio files. For more information, see the SSML references by Amazon and by Google. Here's another valuable resource for cross-platform SSML.
Here is an example of what SSML-enriched output could look like:
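As a minimal sketch, the string below combines text, a break, and an audio file inside a speak element. The wording and the audio URL are placeholders; a string like this could then be passed to tell or ask.

```javascript
// SSML is plain text wrapped in <speak>, with markup elements mixed in.
const ssmlSpeech = '<speak>'
    + 'Welcome to this Pizza Skill. '
    + '<break time="300ms"/>'
    + 'Don\'t you love the sound when a pizza is just coming out of the oven? '
    + '<audio src="https://www.example.com/audio/pizza.mp3"/>' // placeholder URL
    + '</speak>';

console.log(ssmlSpeech);
```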
But isn't that a little inconvenient? Let's take a look at the Jovo speechBuilder.
With the speechBuilder, you can assemble a speech element by adding different types of input:
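To illustrate the chaining style, here is a small sketch. MiniSpeechBuilder is a hypothetical stand-in written for this example and is not Jovo's implementation; its method names (addText, addBreak, addAudio) are chosen to mirror the kind of inputs a speech builder accepts, so check the actual method names and signatures in your Jovo version.

```javascript
// Hypothetical, simplified builder (not the Jovo speechBuilder itself).
class MiniSpeechBuilder {
    constructor() {
        this.parts = [];
    }

    // Each method appends one piece and returns `this` to allow chaining.
    addText(text) {
        this.parts.push(text);
        return this;
    }

    addBreak(time) {
        this.parts.push(`<break time="${time}"/>`);
        return this;
    }

    addAudio(url) {
        this.parts.push(`<audio src="${url}"/>`);
        return this;
    }

    // Join all pieces and wrap them in a speak element.
    toString() {
        return `<speak>${this.parts.join(' ')}</speak>`;
    }
}

const builtSpeech = new MiniSpeechBuilder()
    .addText('Welcome to this Pizza Skill.')
    .addBreak('300ms')
    .addAudio('https://www.example.com/audio/pizza.mp3') // placeholder URL
    .toString();

console.log(builtSpeech);
```

Compared with concatenating SSML strings by hand, the chained calls keep the tags out of your handler code.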
Jovo uses a package called i18next to support multilanguage voice apps.
If you prefer to return specific responses in raw JSON format, you can do this with the platform-specific functions.
Learn more about platform-specific features and responses here: Platforms.
Besides sound and voice output, the Jovo framework can also be used for visual output.
Sometimes, you might want to end a session without speech output. You can use the endSession method for this case:
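A minimal sketch, assuming endSession is called on the Jovo object without arguments. The StopIntent name is only an example, and endStub with its fields is a hypothetical stand-in so the snippet runs without the framework.

```javascript
const endHandlers = {
    StopIntent() {
        // End the session silently: no speech is sent to the user.
        this.endSession();
    },
};

// Hypothetical stand-in for the Jovo object (not part of the framework).
const endStub = {
    output: null,
    endSession() {
        this.output = { speech: null, sessionEnded: true };
    },
};

endHandlers.StopIntent.call(endStub);
console.log(endStub.output);
```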