[Tutorial] Build a Google Action in Node.js with Jovo

By Jan König (@einkoenig), published on August 2nd, 2017, last modified on February 28th, 2018 at 5:47 pm
Tags: Google Assistant, Tutorials

In this Google Action tutorial for beginners, you will learn how to build an Action for Google Assistant (the voice assistant living inside Google Home) from scratch. We will cover the essentials of building an app for Google Assistant, how to set everything up on Dialogflow and the Actions on Google Console, and how to use Jovo to build your Action’s logic.

Beginner Tutorial: Build a Google Action in Node.js

See also: Build an Alexa Skill in Node.js with Jovo

What you’ll learn

About Jovo: Jovo is an open source Node.js development framework for voice applications for both Amazon Alexa and Google Assistant. Check out the GitHub repository or the documentation, if you’re interested in learning more.

What We’re Building

To get you started as quickly as possible, we’re going to create a simple Action that responds with “Hello World!”

Please note: This is a tutorial for beginners and explains the essential steps of Google Action development in detail. If you already have experience with Google Home or Google Assistant and just want to learn more about how to use Jovo, either skip the first few sections and go right to section 4: Build Your Action’s Code, or take a look at the Jovo Documentation.


1) How do Google Actions Work?

In this section, you will learn more about the architecture of Google Assistant and how users interact with its Actions. First, let’s take a look at the terminology for the different kinds of software and hardware involved:

The difference between Google Home, Google Assistant, and Google Actions

While it’s the hardware device that most users see, Google Home is not the name of the assistant you can develop actions for (which sometimes causes confusion when people talk about “building an Action for Google Home“). The artificial intelligence you can hear speaking from inside the smart speaker is called Google Assistant (which is now also available on Android and iOS smartphones). Actions on Google are the applications that can be built on top of the Google Assistant platform.

The main difference between the architecture of building Google Actions and Alexa Skills is that for Google you need an additional layer to handle the natural language. Most Action developers use Dialogflow to configure their application’s language model:

Google Assistant and Dialogflow integration

We will take a deeper look into Dialogflow in section 2: Create an Agent on Dialogflow.

To understand how Google Actions work, let’s take a look at the two important elements:

a) User Input

There are a few steps that happen before a user’s speech input reaches your Action. The voice input process (from left to right) consists of three stages that happen at three different places (four, if you count Google and Dialogflow separately):

Google Assistant Speech Input Request

  1. A user talks to an Assistant device like Google Home (speech input), which is passed to…
  2. the Actions API which uses its Dialogflow integration to understand what the user wants (through natural language understanding), and creates a request, which is passed to…
  3. your Action code which knows what to do with the request.
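In code terms, what reaches your Action in step 3 is a JSON request. Stripped down to what your logic typically cares about, it looks something like this (the property names below are simplified labels for illustration, not the actual Dialogflow schema; a full real request appears later in this tutorial):

```javascript
// Simplified illustration, not the actual Dialogflow request schema.
// The real request carries much more metadata.
const incomingRequest = {
  intentName: 'MyNameIsIntent',  // resolved by Dialogflow's natural language understanding
  parameters: { name: 'Jan' },   // entity values extracted from the utterance
  rawQuery: 'my name is Jan',    // what the user actually said
};

console.log(incomingRequest.intentName);
```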

b) Assistant Output

The voice output process (from right to left) goes back and passes the stages again:

Google Assistant Output Response

  1. Your Action code now turns the input into a desired output and returns a response to…
  2. the Assistant API (through Dialogflow), which turns this response into speech via text-to-speech, sending sound output to…
  3. the Assistant device, where your user is happily waiting and listening.
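The response in step 1 is again a small JSON document, which Jovo assembles for you. Here is a minimal sketch in the Dialogflow v1 style used at the time of writing (the exact fields may differ from what your app emits):

```javascript
// Minimal Dialogflow-v1-style webhook response. Jovo builds this for you;
// it is shown here only to make the request/response flow concrete.
const response = {
  speech: '<speak>Hello World!</speak>', // SSML that text-to-speech will read out
  displayText: 'Hello World!',           // plain text for screen surfaces
  contextOut: [],                        // session attributes to carry over
};

console.log(JSON.stringify(response));
```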

In order to make the Action work, we need to configure it on both Dialogflow (for the natural language understanding) and the Actions on Google Console (for the Assistant integration). We’re going to create a new Dialogflow Agent in the next step.

2) Create an Agent on Dialogflow

A Dialogflow agent offers a set of modules and integrations to add natural language understanding (NLU) to your product. Although it’s owned by Google, it’s platform agnostic and works for other channels like Facebook Messenger, as well.

We’re going to add our own agent now. Let’s get started:

a) Log in with your Google Account

Go to dialogflow.com and click “Go to console” on the upper right:

Dialogflow Website

Now sign in with your Google account. To simplify things, use the same account that’s registered with your Actions on Google enabled device (like Google Home), if possible, for more seamless testing.

Sign into Dialogflow with your Google account

b) Create a New Agent

Great! Once you’re in the console, click “create agent”:

Create a new Dialogflow agent

We’re just going to name it “HelloWorldAgent” and leave the other information out for now:

Create a HelloWorldAgent on Dialogflow

After creating the agent, you can see the Intents screen:

List of intents of Dialogflow Agent: Default Fallback Intent and Default Welcome Intent

These intents are part of the Agent’s language model. Let’s take a deeper look into how it works in the next section.


3) Create a Language Model

Dialogflow offers an easy (but also highly customizable) way to create a language model for your Google Action.

Let’s take a look:

a) An Introduction to Dialogflow Interaction Models

Google Assistant and Dialogflow help you with several steps in processing input. First, they take a user’s speech and transform it into written text (speech to text). Afterward, they use a language model to make sense out of what the user means (natural language understanding).

A simple interaction model for Google Assistant (built with Dialogflow) consists of three elements: Intents, user expressions, and entities.

Intents: What the User Wants

An intent is something a user wants to achieve while talking to your product. It is the basic meaning that can be distilled from the sentence or phrase the user says. And there can be several ways to reach that specific intent.

FindRestaurantIntent and User Expressions

For example, the FindRestaurantIntent from the image above could be expressed in several different ways. In Dialogflow language models, these are called user expressions:

User Expressions: What the User Says

A user expression (sometimes called utterance) is the actual sentence a user says. There is often a large variety of expressions that fit the same intent. And sometimes the phrasing can be even more variable. This is where entities come into play:


No matter if I’m looking for a super cheap place, a pizza spot that serves Pabst Blue Ribbon, or a dinner restaurant to bring a date to, generally speaking these all serve one purpose (user intent): to find a restaurant. However, the user is also passing along more specific information that can be used for a better user experience. These values are called entities:

FindRestaurantIntent with User Expressions and Entities

These are just the very basic components for you to get introduced to some terms. We don’t need to know a lot about entities for this simple tutorial. However, it’s good to know for later steps.
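To make these terms concrete, here is the restaurant example expressed as a plain data structure. This is purely illustrative (the property names are made up) and not a Dialogflow file format:

```javascript
// Illustrative only: one intent, many user expressions, and entities
// marking the variable parts of those expressions.
const findRestaurantIntent = {
  name: 'FindRestaurantIntent',
  userExpressions: [
    'find a super cheap restaurant',
    'where can I get pizza and Pabst Blue Ribbon',
    'I need a dinner spot to bring a date',
  ],
  // Example entity values extracted from the expressions above:
  entities: {
    priceRange: 'super cheap',
    food: 'pizza',
    occasion: 'date',
  },
};

console.log(findRestaurantIntent.userExpressions.length); // many expressions, one intent
```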

Now that we know a little bit about how language models work, let’s create our first intent, which will be used to ask for our user’s name.

b) Create a New Intent: HelloWorldIntent

After creating the agent, you can see that there are two standard intents already in place. We’re going to keep them. The “Default Welcome Intent” will later be mapped to the Jovo “LAUNCH” intent.

HelloWorldAgent: List of Intents

Let’s create another intent and name it “HelloWorldIntent”:

Create HelloWorldIntent on Dialogflow

Also add the following example phrases to the “Training Phrases” tab:

Training Phrases

Save the intent and create another one named “MyNameIsIntent”. For this one, we are also going to add example phrases of what the user could say to “Training Phrases”, as well as an entity called “name” in the “Action and parameters“ section:

Create MyNameIsIntent on Dialogflow

Now we have to map the entity we created to the “Training Phrases” section by selecting the word “name” and choosing “@sys.given-name:name“:

Map Entity on Dialogflow

Now, let’s look at the code!


4) Build Your Action’s Code

Now let’s build the logic of our Google Action.

We’re going to use the Jovo Framework, which works for both Alexa Skills and Google Actions.

a) Install the Jovo CLI

The Jovo Command Line Tools (see the GitHub repository) offer a great starting point for your voice application, as they make it easy to create new projects from templates.

Install them globally via npm (see our documentation for more information like technical requirements):

$ npm install -g jovo-cli

After the installation, you can test if everything worked with the following command, which should print an overview of the available jovo commands:

$ jovo


b) Create a new Project

Let’s create a new project with the $ jovo new command (“helloworld” is the default template and will clone our Jovo Sample App into the specified directory):

$ jovo new HelloWorld

c) A First Look at a Jovo Project

For now, you only have to touch the app.js file in the /app folder. This is where all the configurations and app logic will happen. You can learn more about the Jovo Architecture here.

Let’s take a look at app.js:

d) Understanding the App Logic

The handlers variable is where you will spend most of your time when you’re building the logic behind your Google Action. The “helloworld” template already has three intents:
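The three intents can be sketched like this. Note that this is a sketch in the style of the Jovo docs of that time (toIntent, ask, tell, and getInput are Jovo methods); your generated app.js may look slightly different:

```javascript
// Sketch of the three intents in the "helloworld" template.
const handlers = {
    'LAUNCH': function() {
        // Entry point when the Action is opened; jump straight to HelloWorldIntent.
        this.toIntent('HelloWorldIntent');
    },

    'HelloWorldIntent': function() {
        // ask() speaks and keeps the session open, waiting for an answer.
        this.ask("Hello World! What's your name?", 'Please tell me your name.');
    },

    'MyNameIsIntent': function() {
        // 'name' is the entity we defined on Dialogflow; tell() ends the session.
        this.tell('Hey ' + this.getInput('name') + ', nice to meet you!');
    },
};
```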

What’s happening here? When your Action is opened, it triggers the LAUNCH intent, which contains a toIntent call to switch to the HelloWorldIntent. There, the ask method is called to ask for your user’s name. After they answer, the MyNameIsIntent gets triggered, which greets your user by name.

That’s it for now. Of course, feel free to modify this as you wish. To create more complex Google Actions, take a look at the framework’s capabilities here: Jovo Framework Docs: Building a Voice App.


5) App Configuration: Where to Run Your Code

So where do we send the response to? Let’s switch tabs once again and take a look at the Fulfillment section at Dialogflow:

Dialogflow Webhook Fulfillment

To make the connection between Dialogflow and your application, you need an HTTPS endpoint (a webhook).

Jovo currently supports an Express server and AWS Lambda. We recommend the first one for local prototyping, but you can also jump to the Lambda section.


a) App Configuration: Local Prototyping with Express

Jovo projects come with off-the-shelf server support so that you can start developing locally as easily as possible.

You can find that part in the index.js file:

Run Local Server

Let’s try that out with the following command (make sure to go into the project directory first):

$ jovo run

This will start the Express server and create a subdomain, which you can then submit to Dialogflow:

This should be enough for now to test and debug your Google Action locally. If you want to learn more about how to make it work on AWS Lambda, proceed to the next section.

Or, jump to the section Add Endpoint to Dialogflow.


b) App Configuration: Host your Code on AWS Lambda

AWS Lambda is a serverless hosting solution by Amazon. Many Alexa Skills are hosted on this platform, so it might make sense for you to host your cross-platform voice application (including your Google Action) there as well. This is what we’re going to do in this section. This usually takes a few steps, so be prepared. If you only want to get an output for the first time, go back up to Local Prototyping.

In the next steps, we are going to create a new Lambda function on the AWS Developer Console.

Create a Lambda Function

Go to aws.amazon.com and log into your account (or create a new one):

AWS Portal

Go to the AWS Management Console:

AWS Services

Search for “lambda” or go directly to console.aws.amazon.com/lambda:

AWS Lambda Functions

Click “Create a Lambda function” and choose “Blank Function” from the selection:

AWS Lambda Blueprints

As a trigger, choose “Alexa Skills Kit.” This way, your code will later also work with Alexa skills. We are going to create an API Gateway after creating the Lambda function.

AWS Lambda: Configure Triggers

Now, you can configure your Lambda function. We’re just going to name it “myGoogleAction”:

AWS Lambda: Create myGoogleAction

Upload Your Code

Now let’s get to the fun part. You can either enter the code inline, upload a zip, or upload a file from Amazon S3. As we’re using dependencies like the jovo-framework npm package, we can’t use the inline editor. Instead, we’re going to zip our project and upload it to the function.

Let’s take a look at the project directory:

Jovo Project files in Mac Finder

To upload the code to Lambda, please make sure to zip the actual files inside the directory, not the HelloWorld folder itself:

Select and zip all files in the folder

Let’s go back to the AWS Developer Console and upload the zip:

Lambda Function: Upload ZIP

Now scroll down to the next step:

Lambda Function Handler and Role

For the Lambda Function Handler, use index.handler.

Lambda function handler and role

You can either choose an existing role (if you have one already), or create a new one. We’re going to create one from a template and call it “mySkillRole” with no special policy templates:

AWS Lambda: Create myActionRole

Click “Next” to proceed to the next step. In the “Review” process, click “Create function” in the lower right corner:

AWS Lambda Function: Review

Test Your Lambda Function

Great! Your Lambda function is now created. Click “Test” to see if it works:

AWS Lambda Function: Test

The beautiful thing about the Jovo Framework is that it works for both Google Assistant and Amazon Alexa. So, for this input test event, you can just use the “Alexa Start Session” template and it works. Look for the green checkmark at the bottom of the page:

AWS Lambda Function: Test Success

If you want to test it with a “real” Google Assistant request, you can also copy-paste this one:

{
	"originalRequest": {
		"source": "google",
		"version": "2",
		"data": {
			"isInSandbox": true,
			"surface": {
				"capabilities": [
					{
						"name": "actions.capability.AUDIO_OUTPUT"
					},
					{
						"name": "actions.capability.SCREEN_OUTPUT"
					}
				]
			},
			"inputs": [
				{
					"rawInputs": [
						{
							"query": "talk to my test app",
							"inputType": "KEYBOARD"
						}
					],
					"intent": "actions.intent.MAIN"
				}
			],
			"user": {
				"locale": "en-US",
				"userId": "1501754379730"
			},
			"device": {},
			"conversation": {
				"conversationId": "1501754379730",
				"type": "NEW"
			}
		}
	},
	"id": "ce231a64-af08-4c33-bfa3-0724a80d5b2c",
	"timestamp": "2017-08-03T09:59:39.741Z",
	"lang": "en",
	"result": {
		"source": "agent",
		"resolvedQuery": "GOOGLE_ASSISTANT_WELCOME",
		"speech": "",
		"action": "input.welcome",
		"actionIncomplete": false,
		"parameters": {},
		"contexts": [
			{
				"name": "google_assistant_welcome",
				"parameters": {},
				"lifespan": 0
			},
			{
				"name": "actions_capability_screen_output",
				"parameters": {},
				"lifespan": 0
			},
			{
				"name": "actions_capability_audio_output",
				"parameters": {},
				"lifespan": 0
			},
			{
				"name": "google_assistant_input_type_keyboard",
				"parameters": {},
				"lifespan": 0
			}
		],
		"metadata": {
			"intentId": "b0b7962c-cae0-4437-bddf-e72f457959d6",
			"webhookUsed": "true",
			"webhookForSlotFillingUsed": "false",
			"nluResponseTime": 2,
			"intentName": "Default Welcome Intent"
		},
		"fulfillment": {
			"speech": "Greetings!",
			"messages": [
				{
					"type": 0,
					"speech": "Hi!"
				}
			]
		},
		"score": 1
	},
	"status": {
		"code": 200,
		"errorType": "success"
	},
	"sessionId": "1501754379730"
}

Create API Gateway

For Alexa Skills, you can just use the Lambda function’s ARN to proceed. For Dialogflow, however, we need to create an API Gateway.

Go to console.aws.amazon.com/apigateway to get started:

Amazon API Gateway Website

Let’s create a new API called “myGoogleActionAPIGateway” (you can name it whatever you like, though):

Create myGoogleActionAPIGateway

After successful creation, you will see the Resources screen. Click on the “Actions” dropdown and select “New Method”:

API Gateway: New Method

Dialogflow needs a webhook it can send POST requests to. So let’s create a POST method that is integrated with our existing Lambda function:

API Gateway: Create POST Method

Grant it permission:

API Gateway: Lambda Function Permission

And that’s almost it. You only need to deploy the API like this:

API Gateway: Deploy API

And create a new stage:

API Gateway: Deployment stage

Yes! Finally, you can get the URL for the API Gateway from here:

API Gateway: Invoke URL

There’s one more step we need to do before testing: we need to use this link and add it to Dialogflow.


6) Add Endpoint to Dialogflow

Now that we have either our local webhook or the API Gateway to AWS Lambda set up, it’s time to use the provided URL to connect our application with our agent on Dialogflow.

a) Agent Fulfillment Section

Go back to the Dialogflow console and choose the Fulfillment navigation item. Enable the webhook and paste either your Jovo webhook URL or the API Gateway:

Dialogflow Webhook Fulfillment with URL

b) Add Webhook to Intents

Dialogflow offers the ability to customize your language model so that you can choose, for every intent, how it’s going to be handled.

This means we need to enable webhook fulfillment for every intent we use in our model.

Go to the HelloWorldIntent first and check “Use webhook” at the bottom of the page:

Dialogflow add webhook fulfillment to HelloWorldIntent

Do the same for the “MyNameIsIntent”, and don’t forget to check the box for the “Default Welcome Intent” as well. That intent comes with default text responses, which would otherwise cause random output instead of your webhook’s response when the application is launched.

Dialogflow add webhook fulfillment to Default Welcome Intent

Great! Now let’s test your Action.


7) “Hello World!”

The work is done. It’s now time to see if Google Assistant returns the “Hello World!” we’ve been waiting for so long. There are several options to test our Google Action:

a) Test in Dialogflow

For quick testing of your language model and to see if your webhook works, you can use the internal testing tool of Dialogflow.

You can find it on the right. Just type in the expression you want to test (in our case “my name is jan”) and it returns your application’s response and some other information (like the intent):

Dialogflow Internal Testing Tool

Testing with Dialogflow will often be enough (and especially useful, as other tools can sometimes be a bit buggy). However, it doesn’t test the integration between Dialogflow and Google Assistant. For this, you need to use the Actions on Google Simulator (see next step).


b) Test in the Actions on Google Simulator

Now, let’s make our Dialogflow agent work with Google Assistant. Open the Integrations panel from the sidebar menu:

Dialogflow Integrations

Here, choose the “Actions on Google” integration:

Dialogflow Actions on Google Integration

Click “Test” and, on the success screen, “Continue”:

Dialogflow Assistant app successfully updated

In the Simulator, you can now test your Action:

Actions on Google Simulator

Yeah! Your application is now an Action on Google Assistant.


The Simulator can be unreliable at times. For example, there are a few things that can cause this error message to show up: “Sorry, this action is not available in simulation”

Sorry, this action is not available in simulation

There are several things that could be useful for troubleshooting:

  • Use the right sample phrase (Talk to my test app)
  • Make sure you’re using the same Google account for logging into Dialogflow and the Actions on Google console
  • If you have other Actions projects, disable them for testing
  • Turn on Voice & Audio Activity, Web & App Activity, and Device Information permissions for your Google Account here: Activity controls

It can also be helpful to go through the process one more time. Go to Integrations on Dialogflow, choose the Actions on Google integration, and click on “Test”:


Dialogflow Actions on Google integration Test

Let us know in the comments if it worked!

c) Test on your Assistant enabled device

If you want to test your Action on a Google Home (or other device that works with Google Assistant), make sure you’re connected to it with the same Google account you’re using for the Simulator (and that testing is enabled, see previous step).

Then, use the invocation that was provided by the Simulator:

OK Google, talk to my test app


Next Steps

Great job! You’ve gone through all the necessary steps to prototype your own Google Action. The next challenge is to build a real Action. For this, take a look at the Jovo Documentation to see what else you can do with our Framework:

Jovo Documentation for Alexa Skills and Google Actions

Stay up to date

Get news and free voice development resources in your inbox. No spam.

You can find previous editions here.


Any specific questions? Just drop them below. Alternatively, you can find other channels to reach us here. Thank you!