Add a Custom Platform to the Jovo Framework

by Kaan Kilic on Jun 29, 2020

Tutorial: Adding your own custom platform to Jovo

In this tutorial, we will go over the whole process of adding a new custom platform integration to the Jovo Framework.

Introduction

The Jovo Framework can already be used to build voice experiences that work across platforms like Alexa, Google Assistant, Facebook Messenger, Samsung Bixby, and more. Additionally, the framework is highly extensible and offers the ability to add more platforms. Reasons for this could be:

  • To build your own custom assistant that works with your own ASR and/or NLU
  • To add a platform that is not supported yet (for example, Slack)

Each Jovo platform integration consists of a handful of different parts needed for the core functionality. In this tutorial, we will go over the framework's architecture, understand how the framework handles the request & response cycle and which part of it is handled by the platform integration. After that, we will get into more detail and lay out the structure of a platform integration package.

While doing all of that, we will take the Twilio Autopilot integration (find it on the Jovo Marketplace or on GitHub) as an example. It fits our purpose perfectly since it is not very complex. We've also only integrated the key features needed to get the platform working at this point, meaning we won't have to go over much that is specific to the Autopilot platform. Where we cover more advanced topics that are not part of the Autopilot integration, we will use the Alexa integration (find it on the Jovo Marketplace or on GitHub) as an example.

The tutorial will be quite long and you won't be able to remember everything we've gone over here, so it would be best to revisit the tutorial multiple times throughout your endeavor to add a new platform.

Now let's start with the framework's architecture.

Jovo Framework Architecture

The basics: the framework is written in TypeScript, uses Jest for its tests, and is a monorepo managed with Lerna. It's separated into five parts:

  • jovo-core: contains the core functionality of the framework
  • jovo-clients: contains all the client integrations (like web)
  • jovo-platforms: contains all the platform integrations
  • jovo-integrations: contains all the other integrations (e.g. DB, analytics, CMS, etc.)
  • jovo-framework: extends the jovo-core package with logging, hosting, and integration capabilities

Each interaction goes through a request & response cycle, where the Jovo app receives the request from the platform in a JSON format, routes it through your logic, and then assembles a response that is sent back to the platform. If you don't have a set structure yet, you can take a look at the request and response formats of the Jovo Core Platform.

To receive the request and to send out the response takes a handful of steps. First, the request has to be processed. We have to determine from which platform it came, what kind of request it is (LAUNCH, INTENT, END, etc., learn more about intents here), and if any inputs were parsed. After that, we load the user's data from the DB and determine the route to the correct handler function. With every piece initialized and processed, the handler function can be executed. There are three pieces each handler function can modify: session data ($session), user's DB entry ($user), and the response ($output). After the handler functions are run, the framework saves the updated user data to the DB, and creates the response using both $session and $output.

Contrary to popular belief, calls like ask(), showImageCard(), and all the others that modify the response don't do so directly. They only modify the $output object, which is then used by the platform integrations to build the actual response.
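To make this concrete, here is a minimal, self-contained sketch of the idea. The `SketchOutput` shape, the `SketchJovo` class, and `buildPlatformResponse` are hypothetical stand-ins rather than the framework's real types, and the response format is invented for illustration. The point is only that `ask()` records what should be said, while a separate platform step turns that record into response JSON.

```typescript
// Hypothetical, simplified shape of the $output object.
interface SketchOutput {
  ask?: { speech: string; reprompt: string };
  card?: { title: string; imageUrl: string };
}

// Stand-in for the Jovo object that handler functions receive as `this`.
class SketchJovo {
  $output: SketchOutput = {};

  // Records the intent to ask; it does not build any platform JSON.
  ask(speech: string, reprompt: string = speech): this {
    this.$output.ask = { speech, reprompt };
    return this;
  }

  showImageCard(title: string, imageUrl: string): this {
    this.$output.card = { title, imageUrl };
    return this;
  }
}

// Stand-in for the platform integration's output step.
// The response format is made up for this sketch.
function buildPlatformResponse(output: SketchOutput): { say?: string; listen: boolean } {
  if (output.ask) {
    return { say: output.ask.speech, listen: true };
  }
  return { listen: false };
}

const sketchJovo = new SketchJovo();
sketchJovo.ask('What is your name?');
const sketchResponse = buildPlatformResponse(sketchJovo.$output);
```

Because the handler only touches `$output`, the same handler code can serve several platforms, each with its own version of the last step.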

The whole process explained above is handled by a bunch of middlewares that are executed in a particular order. Each middleware provides the possibility for plugins to hook up their functions which will also be executed.

Jovo Middleware Execution

By using the middleware approach (learn more here), we allow anybody to easily extend the framework. You can either write packages that hook up into the middlewares or make use of hooks to do it directly in your Jovo project. Each middleware has a distinct purpose:
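As a rough illustration of the pattern (not the framework's actual classes), the sketch below models a middleware as a named list of functions that run in registration order, and an app that runs its middlewares in a fixed sequence. `SketchMiddleware`, `sketchOrder`, and `runSketchCycle` are names made up for this example, and the order list is shortened.

```typescript
type HookFn = (log: string[]) => void | Promise<void>;

class SketchMiddleware {
  private fns: HookFn[] = [];

  constructor(public readonly middlewareName: string) {}

  // Plugins call use() to attach a function to this middleware.
  use(fn: HookFn): this {
    this.fns.push(fn);
    return this;
  }

  // Runs every attached function, one after another.
  async run(log: string[]): Promise<void> {
    for (const fn of this.fns) {
      await fn(log);
    }
  }
}

// A shortened, fixed execution order (the full list has more entries).
const sketchOrder = ['request', 'platform.init', 'platform.nlu', 'router', 'handler', 'platform.output', 'response'];

const sketchMiddlewares = new Map<string, SketchMiddleware>();
for (const middlewareName of sketchOrder) {
  sketchMiddlewares.set(middlewareName, new SketchMiddleware(middlewareName));
}

// Two "plugins" hook in, registered out of order on purpose.
sketchMiddlewares.get('platform.output')!.use((log) => { log.push('build response'); });
sketchMiddlewares.get('platform.init')!.use((log) => { log.push('init platform'); });

async function runSketchCycle(): Promise<string[]> {
  const log: string[] = [];
  for (const middlewareName of sketchOrder) {
    await sketchMiddlewares.get(middlewareName)!.run(log);
  }
  return log;
}
```

The order in which plugins register does not matter here. What decides the execution order is the position of the middleware they attach to.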

Middleware | Description
--- | ---
setup | First initialization of the app object with the first incoming request. Executed once as long as the app is alive.
request | Raw JSON request from the platform gets processed. Can be used for authentication middlewares.
platform.init | Determines which platform (e.g. Alexa, GoogleAssistant) sent the request. Initialization of the abstracted jovo (this) object.
asr | Request gets routed through external ASR. Only used by certain platforms.
platform.nlu | Natural language understanding (NLU) information gets extracted for built-in NLUs (e.g. Alexa). Intents and inputs are set.
nlu | Request gets routed through external NLU (e.g. Dialogflow standalone). Intents and inputs are set.
user.load | Initialization of the user object. User data is retrieved from the database.
router | Request and NLU data (intent, input, state) is passed to the router. intentMap and inputMap are executed. Handler path is generated.
handler | Handler logic is executed. Output object is created and finalized.
user.save | User gets finalized, DB operations.
tts | Output is routed through external TTS. Only used by certain platforms.
platform.output | Platform response JSON gets created from the output object.
response | Response gets sent back to the platform.
fail | Errors get handled if applicable.

Depending on the incoming request and the platform, not all of them are necessarily needed. For example, the asr middleware is not needed if the request is coming from Alexa, which already uses built-in speech recognition.

Now that we've covered the architecture of the framework, let's continue with the general architecture of a Jovo package.

Jovo Package Architecture

The easiest approach is to copy all of the above files (besides package-lock.json) from a different package. After that, we only have to change the README.md and package.json. For example, find the Autopilot package on GitHub to see the final result.

package.json

First, we have to change the name of the package. We use the following naming pattern: jovo-platform-[platform-name], e.g. jovo-platform-twilioautopilot. After that, we remove every dependency we don't need, keeping only the following:

  • dependencies:
    • jovo-core
  • devDependencies:
    • @types/jest
    • @types/node
    • jest
    • jovo-framework
    • prettier
    • rimraf
    • source-map-support
    • ts-jest
    • tslint
    • typedoc
    • typescript

I won't provide a JSON snippet for you to copy since the versions will most likely be outdated by the time you read this. Take a look at the Twilio Autopilot package.json here.

tsconfig.json

We use the following tsconfig.json for all of our packages:

README

The README will contain all the documentation for the package. You can find the one for Twilio Autopilot here. This README will later be parsed for the Jovo Marketplace if you decide to share your integration with the community.

Jovo Platform Package Architecture

Let's start with the src/ folder where all the logic is stored:

We will go over each file and its purpose. First, the Autopilot.ts file. It's the heart of the integration and implements the class the user later on imports into their Jovo project. It is also the only class that hooks up to the middlewares directly.
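The following is a simplified sketch of that role, not the real Autopilot.ts. `MiniApp` and `MiniAutopilot` are hypothetical stand-ins, and the three hooked steps do almost nothing. It only shows the shape: one class with an install step that attaches its own methods to the relevant middlewares.

```typescript
type Step = (jovo: Record<string, unknown>) => void;

// Minimal stand-in for the app: it only stores functions per middleware name.
class MiniApp {
  private steps = new Map<string, Step[]>();

  hook(middlewareName: string, step: Step): void {
    const list = this.steps.get(middlewareName) ?? [];
    list.push(step);
    this.steps.set(middlewareName, list);
  }

  run(middlewareName: string, jovo: Record<string, unknown>): void {
    for (const step of this.steps.get(middlewareName) ?? []) {
      step(jovo);
    }
  }
}

// Stand-in for the platform class: the one place that talks to the middlewares.
class MiniAutopilot {
  install(app: MiniApp): void {
    app.hook('platform.init', this.initialize.bind(this));
    app.hook('platform.nlu', this.nlu.bind(this));
    app.hook('platform.output', this.output.bind(this));
  }

  private initialize(jovo: Record<string, unknown>): void {
    jovo.platform = 'Autopilot';
  }

  private nlu(jovo: Record<string, unknown>): void {
    jovo.intent = 'HelloWorldIntent'; // hard-coded for the sketch
  }

  private output(jovo: Record<string, unknown>): void {
    jovo.response = { platform: jovo.platform, intent: jovo.intent };
  }
}

const miniApp = new MiniApp();
new MiniAutopilot().install(miniApp);

const miniJovo: Record<string, unknown> = {};
for (const middlewareName of ['platform.init', 'platform.nlu', 'platform.output']) {
  miniApp.run(middlewareName, miniJovo);
}
```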

Core

Next up, the core/ folder which contains the platform's modules that are user-facing, e.g. $autopilotBot, $request, $user, etc.

Let's start with the AutopilotBot.ts file which implements the platform's Jovo object and adds some of the helper functions. In your Jovo project you would reach the object using this.$autopilotBot:

Now, we continue with the AutopilotRequest.ts file. The class has the same properties as the platform's request JSON. Besides that, it implements a handful of getter and setter functions as well as the toJSON and fromJSON functions:
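A minimal sketch of such a request class could look like the one below. The field names (`UserIdentifier`, `CurrentTask`, `DialogueSid`) are assumptions chosen for illustration, and the class is a stand-in that does not implement the framework's request interface. Check the platform's request documentation for the real properties.

```typescript
// Field names are assumed for illustration only.
interface SketchRequestJSON {
  UserIdentifier?: string;
  CurrentTask?: string;
  DialogueSid?: string;
}

class SketchAutopilotRequest implements SketchRequestJSON {
  UserIdentifier?: string;
  CurrentTask?: string;
  DialogueSid?: string;

  getUserId(): string {
    return this.UserIdentifier ?? '';
  }

  setUserId(userId: string): this {
    this.UserIdentifier = userId;
    return this;
  }

  getIntentName(): string | undefined {
    return this.CurrentTask;
  }

  // Plain-object copy, e.g. for logging or for sending the request in a test.
  toJSON(): SketchRequestJSON {
    return Object.assign({}, this);
  }

  // Accepts either a parsed object or a JSON string.
  static fromJSON(json: SketchRequestJSON | string): SketchAutopilotRequest {
    const data: SketchRequestJSON = typeof json === 'string' ? JSON.parse(json) : json;
    return Object.assign(new SketchAutopilotRequest(), data);
  }
}

const parsedRequest = SketchAutopilotRequest.fromJSON(
  '{"UserIdentifier":"user-1","CurrentTask":"HelloWorldIntent"}',
);
parsedRequest.setUserId('user-2');
```

Keeping the class properties identical to the JSON keys means `fromJSON` can copy the data over without any mapping, and the getters give the rest of the code a platform-independent way to read it.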

The AutopilotResponse.ts file is no different from the request file, so we will skip it. Instead, we will have a look at the AutopilotUser.ts file, which implements the $user object. Since the Autopilot platform doesn't provide any user-specific functionality, the class just provides the basic implementation:

In the Alexa integration we use the class to implement stuff like shopping lists, profile data, etc.:

Next up, the AutopilotSpeechBuilder.ts file. The class is used for both $speech and $reprompt. Again, we only provide a basic implementation of the SpeechBuilder. Other integrations do extend it with platform-specific stuff (e.g. Amazon Polly):
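The difference between a basic and an extended speech builder can be sketched as follows. `SketchSpeechBuilder` is a tiny stand-in for the framework's base class (the real one has many more helpers), and the `<voice>` tag in the extended builder is only an example of platform-specific markup, not something the Autopilot integration provides.

```typescript
// Stand-in for the framework's base speech builder.
class SketchSpeechBuilder {
  protected parts: string[] = [];

  addText(text: string): this {
    this.parts.push(text);
    return this;
  }

  addBreak(time: string): this {
    this.parts.push(`<break time="${time}"/>`);
    return this;
  }

  toString(): string {
    return this.parts.join(' ');
  }
}

// Basic platform builder: inherits everything and adds nothing.
class SketchAutopilotSpeechBuilder extends SketchSpeechBuilder {}

// A platform with its own markup could add extra helpers like this one.
class SketchVoiceSpeechBuilder extends SketchSpeechBuilder {
  addTextWithVoice(text: string, voice: string): this {
    this.parts.push(`<voice name="${voice}">${text}</voice>`);
    return this;
  }
}

const sketchSpeech = new SketchAutopilotSpeechBuilder()
  .addText('Hello')
  .addBreak('300ms')
  .addText('World')
  .toString();
```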

The last two mandatory files in the core/ folder are AutopilotRequestBuilder.ts and AutopilotResponseBuilder.ts. Both are needed for the Jovo TestSuite. Let's start with the request builder.

The AutopilotRequestBuilder.ts allows you to create all types of requests in your unit tests:
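Here is a small sketch of what such a builder offers, using hypothetical stand-in types (`SketchTestRequest`, `SketchRequestBuilder`) rather than the real classes. The methods are async on the assumption that a real builder may need to load sample request files before returning.

```typescript
type SketchRequestType = 'LAUNCH' | 'INTENT' | 'END';

// Simplified request shape used only in this sketch.
interface SketchTestRequest {
  type: SketchRequestType;
  intentName?: string;
  inputs: Record<string, string>;
}

class SketchRequestBuilder {
  async launch(): Promise<SketchTestRequest> {
    return { type: 'LAUNCH', inputs: {} };
  }

  async intent(intentName: string, inputs: Record<string, string> = {}): Promise<SketchTestRequest> {
    return { type: 'INTENT', intentName, inputs };
  }

  async end(): Promise<SketchTestRequest> {
    return { type: 'END', inputs: {} };
  }
}

// Builds one request of each type, as a unit test might do.
async function buildSampleRequests(): Promise<SketchTestRequest[]> {
  const builder = new SketchRequestBuilder();
  return [
    await builder.launch(),
    await builder.intent('MyNameIsIntent', { name: 'Kaan' }),
    await builder.end(),
  ];
}
```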

The AutopilotResponseBuilder has only one job, which is to create an AutopilotResponse from the response that the Jovo app sends back after receiving a request created with the request builder:
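Sketched with stand-in classes, that single job is little more than a wrapper around the response class's own `fromJSON`. The response shape used here (an `actions` array with `say` and `listen` entries) is an assumption for illustration, and `getSpeech()` is a hypothetical helper of the kind that makes test assertions easier.

```typescript
// Assumed response shape: a list of actions such as { say: '...' }.
interface SketchResponseJSON {
  actions: Array<Record<string, unknown>>;
}

class SketchAutopilotResponse {
  actions: Array<Record<string, unknown>> = [];

  // Convenience helper for assertions in tests.
  getSpeech(): string | undefined {
    const sayAction = this.actions.find((action) => typeof action.say === 'string');
    return sayAction ? (sayAction.say as string) : undefined;
  }

  static fromJSON(json: SketchResponseJSON): SketchAutopilotResponse {
    return Object.assign(new SketchAutopilotResponse(), json);
  }
}

class SketchResponseBuilder {
  // The single job: wrap the raw response in the platform's response class.
  create(json: SketchResponseJSON): SketchAutopilotResponse {
    return SketchAutopilotResponse.fromJSON(json);
  }
}

const builtResponse = new SketchResponseBuilder().create({
  actions: [{ say: 'Hello World!' }, { listen: true }],
});
```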

That's it for the core/ folder.

Modules

The modules/ folder contains all the files that are responsible for processing requests and preparing the response. For example, all the methods hooked up to the action set are located here.

The AutopilotCore plugin takes care of all the core functionality: initializing the AutopilotBot and $request objects, determining the request type, initializing the session data, and creating the response from the $output object.

Next up, the AutopilotNLU.ts file which implements the plugin to parse the request's intent and input data:
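The core of such a plugin is a small parsing step. The sketch below assumes, purely for illustration, that the intent arrives in a `CurrentTask` field and that inputs arrive as flat keys of the form `Field_<name>_Value`. `parseNlu` and its types are hypothetical, and a real plugin would write the result to the Jovo object inside the platform.nlu middleware instead of returning it.

```typescript
// Flat key/value request, as sent by a form-encoded webhook.
type FlatRequest = Record<string, string>;

interface ParsedNlu {
  intent: string;
  inputs: Record<string, { name: string; value: string }>;
}

function parseNlu(request: FlatRequest): ParsedNlu {
  const inputs: ParsedNlu['inputs'] = {};
  const inputKeyPattern = /^Field_(.+)_Value$/;

  for (const key of Object.keys(request)) {
    const match = inputKeyPattern.exec(key);
    if (match) {
      const inputName = match[1];
      inputs[inputName] = { name: inputName, value: request[key] };
    }
  }

  // Fall back to a placeholder when the request carries no task name.
  return { intent: request.CurrentTask || 'Unknown', inputs };
}

const parsedNlu = parseNlu({
  CurrentTask: 'MyNameIsIntent',
  Field_name_Value: 'Kaan',
  Field_name_Type: 'FIRST_NAME',
});
```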

Besides that, the modules/ folder contains all the output capabilities of a platform. For example, there is an AudioPlayer module that handles all of the audio functionality. Most of the time, these modules have their own object, which can be accessed through the platform's jovo object, e.g. this.$autopilotBot.$audioPlayer.

There is a small difference between all the previous files and the upcoming modules. We differentiate between the plugin and the object that can be accessed later on. The plugin object is initialized on the start-up of your Jovo app instance, meaning it's the same for multiple users. The object you access using this.$autopilotBot.$audioPlayer is unique for every user since it's initialized with every incoming request.
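The distinction is easier to see in a small sketch. All names here (`SketchAudioPlayer`, `SketchAudioPlayerPlugin`, `SketchBot`) are hypothetical stand-ins: one plugin instance lives for the whole app, and it creates a fresh player object for every request it handles.

```typescript
// Per-request object: holds state that belongs to one interaction only.
class SketchAudioPlayer {
  url?: string;

  play(url: string): this {
    this.url = url;
    return this;
  }
}

// Stand-in for the platform's jovo object.
interface SketchBot {
  $audioPlayer?: SketchAudioPlayer;
}

// Plugin: created once at start-up and shared by all requests.
class SketchAudioPlayerPlugin {
  handledRequests = 0;

  // Called for every incoming request (in the framework, via a middleware).
  init(bot: SketchBot): void {
    this.handledRequests += 1;
    bot.$audioPlayer = new SketchAudioPlayer();
  }
}

const sharedPlugin = new SketchAudioPlayerPlugin();
const botForUserA: SketchBot = {};
const botForUserB: SketchBot = {};

sharedPlugin.init(botForUserA);
sharedPlugin.init(botForUserB);

// Only user A's player is affected by this call.
botForUserA.$audioPlayer?.play('https://example.com/a.mp3');
```

A practical consequence is that anything stored on the plugin itself is visible across users, so request-specific data belongs on the per-request object.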

Another example would be cards, which also have their own module but no separate object on the jovo object. Their methods can be called on the jovo object directly, e.g. this.$autopilotBot.showStandardCard():
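One way to get this behavior, shown here as a sketch under assumptions and not as the integration's actual code, is for the module to attach its methods to the prototype of the platform's jovo class when it is installed. `SketchBotWithCards` and `SketchCardsModule` are hypothetical, and the card fields are made up for the example.

```typescript
// Stand-in for the platform's jovo class.
class SketchBotWithCards {
  $output: { card?: { title: string; content: string; imageUrl: string } } = {};
}

// Tells the compiler about the method the module adds at start-up.
interface SketchBotWithCards {
  showStandardCard(title: string, content: string, imageUrl: string): this;
}

class SketchCardsModule {
  // Runs once when the module is installed, not once per request.
  install(): void {
    SketchBotWithCards.prototype.showStandardCard = function (
      this: SketchBotWithCards,
      title: string,
      content: string,
      imageUrl: string,
    ) {
      // Like every output helper, this only writes to $output.
      this.$output.card = { title, content, imageUrl };
      return this;
    };
  }
}

new SketchCardsModule().install();

const botWithCard = new SketchBotWithCards().showStandardCard(
  'Title',
  'Some content',
  'https://example.com/image.png',
);
```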

index.ts

The last piece of your integration's logic is the index.ts file. It has two purposes. First, we export all the types that might be useful to the user. Besides that, we extend some of the existing interfaces:
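Extending an existing interface relies on TypeScript's declaration merging. The self-contained sketch below merges two declarations of a local interface. In a real package, the second declaration would presumably sit inside a `declare module '...' { ... }` block that targets the framework's own interface. `SketchCoreJovo` and `SketchPlatformBot` are hypothetical names.

```typescript
// Stand-in for an interface that would normally live in the framework package.
interface SketchCoreJovo {
  $type: string;
}

// Stand-in for the platform's jovo class.
class SketchPlatformBot {
  readonly platform = 'Autopilot';
}

// Second declaration with the same name: TypeScript merges it with the first,
// so every SketchCoreJovo now has an optional $autopilotBot property.
interface SketchCoreJovo {
  $autopilotBot?: SketchPlatformBot;
}

const jovoInstance: SketchCoreJovo = {
  $type: 'LAUNCH',
  $autopilotBot: new SketchPlatformBot(),
};

const platformName = jovoInstance.$autopilotBot?.platform;
```

The merge only affects type checking. The property itself still has to be set at runtime, which is what the platform's initialization step is for.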

Tests

Last but not least, let's have a look at tests. Every platform integration has to fulfill the FrameworkBase tests. These test the base capabilities that are the same for every platform (routing, basic output, session, user, etc.).

They are e2e (in our case request to response) tests which make use of the Jovo TestSuite. Here's a small part of the test file:

If you're not familiar with the Jovo TestSuite, check out the docs here.

Besides that, it would be best to add additional tests for platform-specific stuff.

Development

The easiest way to develop your package is to use the jovo-framework repository. To get started, create your package and add all the necessary configuration files (package.json, tsconfig.json, .npmignore, tslint.json, etc.). Then run npm run bootstrap from the root directory of the repository to bootstrap the packages. Once that is done, you can start developing the integration.

To test your package, first, compile all the packages using npm run tsc (again from the root directory). Now, add your integration as a dependency to one of the example projects in the examples folder, e.g. the hello-world project.

After that, run npm run clean to first delete all the node_modules folders and then run npm run bootstrap again. This time, the example project will include your local package as well.

You can now go ahead and add the platform to the hello-world as you would with any other package and start testing.

Besides that, you have to configure the app on the platform you want to add. Most likely, you will only have to add your webhook URL to receive requests, but the exact steps are specific to the platform.

Conclusion

Well, we've finally reached the end. As I said at the beginning of the tutorial, it's not difficult to add a platform integration. It's just quite a good amount of work.

Technically there is still more to do. We could also add Jovo CLI support to build and deploy platform files but that is a whole post in itself.

If you get stuck along the way, feel free to reach out to us on Slack.


Icon by Eray Zesen under Creative Commons (Attribution 3.0 Unported)


Kaan Kilic

Technical Content Marketing Associate at Jovo
