Visual Output

Learn more about how to build Alexa Skills with visual output using the Jovo Framework.

Introduction to Visual Output

Visual output describes or enhances the voice interaction. It ranges from simple cards, through Display Templates on the Echo Show and Echo Spot, to video playback.


Cards

Cards cover the most basic cases of visual output. In addition to the speech output, they can display plain text and images, or ask the user for certain permissions (Account Linking, to-do/shopping lists, etc.).
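As a rough sketch of the permission use case, the snippet below builds the raw card objects that an Alexa response carries for account linking and for a permissions request. The helper names are hypothetical, and the permission scope string is an assumption that you should check against the scopes configured for your own skill.

```javascript
// Card that prompts the user to link their account in the Alexa app.
function buildLinkAccountCard() {
  return { type: 'LinkAccount' };
}

// Card that asks the user to grant the listed permissions.
// `permissions` is an array of scope strings (assumed values, see above).
function buildPermissionsCard(permissions) {
  return { type: 'AskForPermissionsConsent', permissions };
}

const linkAccountCard = buildLinkAccountCard();
const listPermissionsCard = buildPermissionsCard(['read::alexa:household:list']);

console.log(JSON.stringify(linkAccountCard));
console.log(JSON.stringify(listPermissionsCard));
```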

Simple Card

The simple card can only contain plain text, which is split up into a title and content.
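To make the title/content split concrete, here is a minimal sketch of the raw Alexa response that a simple card ends up in. The helper names (`buildSimpleCard`, `buildResponseWithCard`) are hypothetical and only illustrate the JSON shape. Inside a Jovo handler this is usually a single call (in Jovo v2, something like `this.$alexaSkill.showSimpleCard('Title', 'Content')`; treat the exact method name as an assumption for your framework version).

```javascript
// Builds the "card" object of an Alexa response for a Simple card.
function buildSimpleCard(title, content) {
  return { type: 'Simple', title, content };
}

// Minimal Alexa response envelope carrying speech output plus a card.
function buildResponseWithCard(speechText, card) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speechText },
      card,
      shouldEndSession: true,
    },
  };
}

const simpleCardResponse = buildResponseWithCard(
  'Hello World!',
  buildSimpleCard('Title', 'Content')
);

console.log(JSON.stringify(simpleCardResponse, null, 2));
```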

Official Amazon reference.

Standard Card

The standard card lets you add an image to the plain text. The image has to be provided in two different sizes, a small and a large version.
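The following sketch shows the raw card object for a standard card, including the two image URLs. The helper name and the example URLs are placeholders, not part of the original documentation. Note that a standard card stores its body under `text`, whereas a simple card uses `content`.

```javascript
// Builds the "card" object of an Alexa response for a Standard card.
// Both image URLs are required: one small and one large rendition.
function buildStandardCard(title, text, smallImageUrl, largeImageUrl) {
  return {
    type: 'Standard',
    title,
    text,
    image: { smallImageUrl, largeImageUrl },
  };
}

const standardCard = buildStandardCard(
  'Title',
  'Content',
  'https://example.com/small.png', // placeholder URL
  'https://example.com/large.png'  // placeholder URL
);

console.log(JSON.stringify(standardCard, null, 2));
```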

Official Amazon reference.

Display Templates

Display Templates can be used to show content on the screen of the Echo Show or Echo Spot. There is a variety of templates, each with a different composition and feature set. See the official Amazon reference for the full list.

To be able to use display templates for devices like Echo Show, you need to enable them in the Interfaces tab in the Amazon Developer Console:

Alexa Console: Enable Display Interface

Body Templates

Body templates can only display images and text. There are multiple body templates, each with a different composition.
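As an illustration, the sketch below builds the `Display.RenderTemplate` directive for the simplest body template, `BodyTemplate1`, which shows a title and text. The helper name and the token value are hypothetical. Jovo provides its own helpers for this, which are not reproduced here.

```javascript
// Builds a Display.RenderTemplate directive for BodyTemplate1 (title + text).
function buildBodyTemplate1(token, title, primaryText) {
  return {
    type: 'Display.RenderTemplate',
    template: {
      type: 'BodyTemplate1',
      token,                 // identifier you choose for this screen
      backButton: 'HIDDEN',  // 'VISIBLE' shows the back button instead
      title,
      textContent: {
        primaryText: { type: 'PlainText', text: primaryText },
      },
    },
  };
}

const bodyDirective = buildBodyTemplate1('welcome-screen', 'Hello World', 'This is the body text.');

// The directive goes into the "directives" array of the Alexa response.
const bodyTemplateResponse = {
  version: '1.0',
  response: { directives: [bodyDirective] },
};

console.log(JSON.stringify(bodyTemplateResponse, null, 2));
```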





List Templates

The list template is used to display a set of scrollable and selectable items (text and images).
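A minimal sketch of a list directive, using `ListTemplate1` with text-only items, could look like this. The helper name, tokens, and item texts are made up for the example. Each item carries its own `token`, which is how a skill can tell which entry the user selected.

```javascript
// Builds a Display.RenderTemplate directive for ListTemplate1.
// `items` is an array of { token, text } objects (hypothetical input shape).
function buildListTemplate1(token, title, items) {
  return {
    type: 'Display.RenderTemplate',
    template: {
      type: 'ListTemplate1',
      token,
      title,
      listItems: items.map((item) => ({
        token: item.token,
        textContent: {
          primaryText: { type: 'PlainText', text: item.text },
        },
      })),
    },
  };
}

const listDirective = buildListTemplate1('fruit-list', 'Fruits', [
  { token: 'item-apple', text: 'Apple' },
  { token: 'item-banana', text: 'Banana' },
]);

console.log(JSON.stringify(listDirective, null, 2));
```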




Video

To launch videos on an Echo Show, you can use the VideoApp interface.

Optionally, you can add a preamble message that Alexa reads before the video plays.
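The sketch below builds a raw Alexa response with a `VideoApp.Launch` directive and an optional preamble. The helper name and the video URL are placeholders. The preamble is simply the response's speech output, and the response deliberately omits `shouldEndSession`, since Alexa does not accept that field together with `VideoApp.Launch`.

```javascript
// Builds an Alexa response that launches a video, with an optional preamble.
function buildVideoResponse(url, title, subtitle, preambleText) {
  const response = {
    directives: [
      {
        type: 'VideoApp.Launch',
        videoItem: {
          source: url,
          metadata: { title, subtitle },
        },
      },
    ],
  };
  // The preamble is regular speech output, read before the video starts.
  if (preambleText) {
    response.outputSpeech = { type: 'PlainText', text: preambleText };
  }
  return { version: '1.0', response };
}

const videoWithPreamble = buildVideoResponse(
  'https://example.com/video.mp4', // placeholder URL
  'Video Title',
  'Video Subtitle',
  'Here is your video.'
);
const videoWithoutPreamble = buildVideoResponse(
  'https://example.com/video.mp4',
  'Video Title',
  'Video Subtitle'
);

console.log(JSON.stringify(videoWithPreamble, null, 2));
```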

See the official Amazon reference for the VideoApp interface.
