This tutorial shows you how to create a fun treasure hunt game for Android devices using the IBM Watson™ Visual Recognition and Text to Speech services, as well as the IBM Cloud™ App ID service.

Learning objectives

This tutorial creates a treasure hunt game in which the player is given hints to find something and takes a photo of it to pass a level. The Watson Text to Speech service speaks the hints. To determine whether the player passes a level, custom visual recognition models are built and trained in Watson Studio with data sets of images and then integrated into the application. The App ID service adds authentication to the game so that you can track its players.

Prerequisites

  1. Sign up for an IBM Cloud account
  2. Install Android Studio
  3. Get a data set for the visual recognition model; use these sample data sets or any data set of your choice

Estimated time

This tutorial takes approximately 20 minutes to complete if you already have an IBM Cloud account set up.

Steps

First, you set up the services on IBM Cloud; then you set up the client application.

Set up services on IBM Cloud

  1. Create an instance of the Watson Visual Recognition service and get your credentials:

    1. Go to the Watson Visual Recognition page in the IBM Cloud Catalog.
    2. Log in to your IBM Cloud account.
    3. Click Create.
    4. Click Service Credentials > New Credentials > Add.

    5. Copy the apikey value, or copy the username and password values if your service instance doesn’t provide an apikey.

    6. Copy the url value.

  2. Create an instance of the App ID service and get your credentials:

    1. Go to the App ID page in the IBM Cloud Catalog.
    2. Click Create.
    3. Click Service Credentials > New Credentials > Add.
    4. Copy the apikey value, or copy the username and password values if your service instance doesn’t provide an apikey.
    5. Copy the url value.
  3. Create an instance of the Watson Text to Speech service and get your credentials:

    1. Go to the Watson Text to Speech page in the IBM Cloud Catalog.
    2. Click Create.
    3. Click Service Credentials > New Credentials > Add.
    4. Copy the apikey value, or copy the username and password values if your service instance doesn’t provide an apikey.
    5. Copy the url value.
  4. Create an instance of Watson Studio.

    1. Go to the Watson Studio page in the IBM Cloud Catalog.
    2. Click Create.

    3. Click Get Started.

    4. Click Create a Project > Visual Recognition.

    5. Name the project. Under Storage, select Cloud Object Storage. Click Create.

  5. Create the Visual Recognition models.

    1. Click Create a Class and give it a name. Create at least two classes. This tutorial uses five classes: Trees, BurjKhalifa, BurjArab, MiracleGarden, and GlowGarden.
    2. Upload the ZIP files for the data set: Upload to Project > Browse > Choose folder.
    3. Open the class that you want to add the data set to. Click the uploaded ZIP file on the right and select Add to Model.

    4. After the data set is added, click Train Model. Training takes some time.

      To change the name of the model, edit the model name field.

  6. Go back to Projects.

    1. Click on the Project Name.

    2. Copy the Model ID provided. To verify the trained model before you build the app, see the sketch that follows.
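
To confirm that the trained model classifies images as expected before you build the app, you can run a quick classify call from any Java environment. The following is a minimal sketch using the Java SDK for IBM Watson (6.x); the API key, Model ID, and test.jpg file name are placeholders for your own values, and the constructor signature should be verified against your SDK version:

    import com.ibm.watson.developer_cloud.service.security.IamOptions;
    import com.ibm.watson.developer_cloud.visual_recognition.v3.VisualRecognition;
    import com.ibm.watson.developer_cloud.visual_recognition.v3.model.ClassifiedImages;
    import com.ibm.watson.developer_cloud.visual_recognition.v3.model.ClassifyOptions;

    import java.io.FileInputStream;
    import java.util.Arrays;

    public class ModelSmokeTest {
        public static void main(String[] args) throws Exception {
            // Placeholder credentials: copy the apikey from your service credentials
            IamOptions options = new IamOptions.Builder()
                    .apiKey("your api key")
                    .build();
            // Constructor follows the java-sdk 6.x pattern; verify for your version
            VisualRecognition service = new VisualRecognition("2018-03-19", options);

            // Classify a local sample image against the custom model
            ClassifyOptions classifyOptions = new ClassifyOptions.Builder()
                    .imagesFile(new FileInputStream("test.jpg")) // placeholder image
                    .imagesFilename("test.jpg")
                    .threshold(0.6f)
                    .classifierIds(Arrays.asList("Model ID")) // from Watson Studio
                    .build();
            ClassifiedImages result = service.classify(classifyOptions).execute();
            System.out.println(result);
        }
    }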

Set up the client application: Android App

  1. Clone this repo.
  2. Start Android Studio and open the project.
  3. In Android Studio under Gradle Scripts/build.gradle (Module:app), add the following to the dependencies block:

    For App ID:

    implementation 'com.github.ibm-cloud-security:appid-clientsdk-android:5.0.0'
    

    For Android SDK for IBM Watson:

    implementation 'com.ibm.watson.developer_cloud:android-sdk:0.5.0'
    

    For Java™ SDK for IBM Watson:

    implementation 'com.ibm.watson.developer_cloud:java-sdk:6.13.1'
    

App ID

  1. Add the following in the defaultConfig block of the same Gradle file:

    manifestPlaceholders = ['appIdRedirectScheme': android.defaultConfig.applicationId]
    
  2. Click Sync Now.

Manifest file

Add the following permissions to the manifest file:

    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.CAMERA" />

Credentials

  1. Add the tenant ID from App ID credentials in the following line of code in Activity_Login:

     AppID.getInstance().initialize(getApplicationContext(), "tenant id", AppID.REGION_US_SOUTH);
    
  2. Add the credentials for Visual Recognition. Edit the API key in the strings.xml file:

     <string name="api_key">api key</string>
    
  3. Add the credentials for the Watson Text to Speech service. Edit the API key in the strings.xml file:

     <string name="api_keyTTS">api key for text to speech</string>
    

    Add the URL from the service credentials in the following code (in the speakhint method of all Level Activities):

      public void speakhint() {
         IamOptions options = new IamOptions.Builder()
                 .apiKey(getString(R.string.api_keyTTS))
                 .build();
    
         textToSpeech = new TextToSpeech(options);
    
         //Add the url from service credentials
         textToSpeech.setEndPoint("add url here");
    
         new SynthesisTask().execute(hint);
      }
    
  4. Add the Model ID in the following line of code in all Level Activities (Level1, Level2, Level3, Level4, and Level5):

     ClassifyOptions classifyOptions = new ClassifyOptions.Builder()
                             .imagesFile(imagesStream)
                             .imagesFilename(photoFile.getName())
                             .threshold((float) 0.6)
                             .classifierIds(Arrays.asList("Model ID"))
                             .build();
    

After you’ve followed the instructions to add credentials (Model ID, API keys, and tenant ID), you’re done. Run the app and play the game!

Code

App ID Login

The following code was added in Activity_Login to implement App ID for user authentication; in this game, the service tracks the users of the game by adding an authentication step to the app:

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_login);
        AppID.getInstance().initialize(getApplicationContext(), "tenant id", AppID.REGION_US_SOUTH);
        handler.postDelayed(runnable, 2000); //2000 is the timeout for the splash

        btn_login = (Button) findViewById(R.id.btn_login);
        btn_login.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {

                btn_login.setVisibility(View.GONE);

                LoginWidget loginWidget = AppID.getInstance().getLoginWidget();
                loginWidget.launch(Activity_Login.this, new AuthorizationListener() {
                    @Override
                    public void onAuthorizationFailure (AuthorizationException exception) {
                        //Exception occurred
                    }

                    @Override
                    public void onAuthorizationCanceled () {
                        //Authentication canceled by the user
                    }

                    @Override
                    public void onAuthorizationSuccess (AccessToken accessToken, IdentityToken identityToken, RefreshToken refreshToken) {
                        //User authenticated

                        Intent intent = new Intent(Activity_Login.this, MainActivity.class);
                        startActivity(intent);
                    }
                });
            }
        });
    }
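
If you also want to greet or identify the player, the identityToken passed to onAuthorizationSuccess carries the user's profile. The following is a hedged sketch; the getName accessor is an assumption, so check it against the App ID SDK version you depend on:

    @Override
    public void onAuthorizationSuccess(AccessToken accessToken, IdentityToken identityToken, RefreshToken refreshToken) {
        // Assumed accessor: the identity token exposes profile claims such as the user's name
        String playerName = identityToken.getName();

        Intent intent = new Intent(Activity_Login.this, MainActivity.class);
        intent.putExtra("playerName", playerName); // hand the name to the game's main screen
        startActivity(intent);
    }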

Visual Recognition

In the Level Activities, the following code was added in the onActivityResult method to implement the Visual Recognition service:

    @Override
    protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
        super.onActivityResult(requestCode, resultCode, data);

        if (requestCode == CameraHelper.REQUEST_IMAGE_CAPTURE) {
            final Bitmap photo = mCameraHelper.getBitmap(resultCode);
            photoFile = mCameraHelper.getFile(resultCode);
            //  mImageView.setImageBitmap(photo);

            backgroundThread();

        }
    }
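
This snippet assumes that mCameraHelper and mVisualRecognition were initialized earlier in the activity, typically in onCreate. The following is a minimal sketch; the VisualRecognition constructor follows the Java SDK 6.x pattern and the layout name is a placeholder, so verify both against your project:

    // Minimal sketch of the fields assumed by onActivityResult
    private CameraHelper mCameraHelper;
    private VisualRecognition mVisualRecognition;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main2); // placeholder; the layout name varies per level

        // CameraHelper comes from the Android SDK for IBM Watson
        mCameraHelper = new CameraHelper(this);

        IamOptions options = new IamOptions.Builder()
                .apiKey(getString(R.string.api_key))
                .build();
        mVisualRecognition = new VisualRecognition("2018-03-19", options); // verify constructor for your SDK version

        // Launch the camera when the level starts or on a button click:
        // mCameraHelper.dispatchTakePictureIntent();
    }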

In the backgroundThread method, the following code was added to make the network call and parse the result from the Visual Recognition service, which determines whether to start the next activity (level passed) or show a retry message (level failed):

    private void backgroundThread() {

        AsyncTask.execute(new Runnable() {
            @Override
            public void run() {
                InputStream imagesStream = null;
                try {
                    imagesStream = new FileInputStream(photoFile);
                } catch (FileNotFoundException e) {
                    e.printStackTrace();
                }
                ClassifyOptions classifyOptions = new ClassifyOptions.Builder()
                        .imagesFile(imagesStream)
                        .imagesFilename(photoFile.getName())
                        .threshold((float) 0.6)
                        .classifierIds(Arrays.asList("Model ID"))
                        .build();
                ClassifiedImages result = mVisualRecognition.classify(classifyOptions).execute();
                Gson gson = new Gson();
                String json = gson.toJson(result);
                Log.d("json", json);
                String name = null;
                try {
                    JSONObject jsonObject = new JSONObject(json);
                    JSONArray jsonArray = jsonObject.getJSONArray("images");
                    JSONObject jsonObject1 = jsonArray.getJSONObject(0);
                    JSONArray jsonArray1 = jsonObject1.getJSONArray("classifiers");
                    JSONObject jsonObject2 = jsonArray1.getJSONObject(0);
                    JSONArray jsonArray2 = jsonObject2.getJSONArray("classes");
                    JSONObject jsonObject3 = jsonArray2.getJSONObject(0);
                    name = jsonObject3.getString("class");

                } catch (JSONException e) {
                    e.printStackTrace();
                }
                final String finalName = name;

                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        mTextView.setText("Detected Image: " + finalName);

                        Log.d(TAG, "Ans: " + finalName);

                        if ("Trees".equals(finalName)) {
                            Intent mass = new Intent(Main2Activity.this, Main3Activity.class);
                            startActivity(mass);
                        }
                        else {

                            Toast toast = Toast.makeText(getApplicationContext(), "Sorry. Try Again!", Toast.LENGTH_LONG);
                            toast.setGravity(Gravity.CENTER_VERTICAL, 0, 0);
                            toast.show();

                        }

                    }
                });

            }
        });

    }

In the previous code, replace Trees in the following condition with the name of a class that you added to your Visual Recognition model. (The comparison is written as "Trees".equals(finalName) so that a null classification result does not crash the app.)

    if ("Trees".equals(finalName)) {
        Intent mass = new Intent(Main2Activity.this, Main3Activity.class);
        startActivity(mass);
    }
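
Because classify() already returns a typed ClassifiedImages object, the Gson round trip and manual JSON parsing can be avoided. The following is a hedged alternative; the accessor names follow the Java SDK 6.x model classes (note getClassName, because class is a reserved word in Java), so verify them against your version:

    // Hedged alternative: read the top class straight from the typed result
    ClassifiedImages result = mVisualRecognition.classify(classifyOptions).execute();
    String name = result.getImages().get(0)
            .getClassifiers().get(0)
            .getClasses().get(0)
            .getClassName();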

Text to Speech service

In the Level Activities, the following code was added to implement the Watson Text to Speech service:

 public void speakhint() {
        IamOptions options = new IamOptions.Builder()
                .apiKey(getString(R.string.api_keyTTS))
                .build();

        textToSpeech = new TextToSpeech(options);

        //Add the url from service credentials
        textToSpeech.setEndPoint("add url here");

        new SynthesisTask().execute(hint);
    }

 private class SynthesisTask extends AsyncTask<String, Void, String> {
        @Override
        protected String doInBackground(String... params) {
            SynthesizeOptions synthesizeOptions = new SynthesizeOptions.Builder()
                    .text(params[0])
                    .voice(speakLanguage)
                    .accept(SynthesizeOptions.Accept.AUDIO_WAV)
                    .build();
            player.playStream(textToSpeech.synthesize(synthesizeOptions).execute());
            return "Did synthesize";
        }
    }
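
speakhint and SynthesisTask assume a few fields that are initialized elsewhere in the activity. The following is a minimal sketch; StreamPlayer comes from the Watson Android SDK audio library, and the voice string and hint text are example values:

    import com.ibm.watson.developer_cloud.android.library.audio.StreamPlayer;

    // Minimal sketch of the fields assumed by speakhint() and SynthesisTask
    private TextToSpeech textToSpeech;
    private final StreamPlayer player = new StreamPlayer();
    private final String speakLanguage = "en-US_AllisonVoice"; // example voice; use any voice your service supports
    private String hint = "Find something tall that touches the clouds"; // example hint text

Call speakhint() when the level starts (for example, from onCreate) so that the hint is read aloud to the player.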

Architecture

The following image shows the app’s architecture.

Application architecture

Demo

Take a look at how the game works.

Login

Authentication

Level 1: Visual Recognition

Level Passed

Summary

This tutorial introduced Visual Recognition models by showing how to build a fun treasure hunt game. You started by creating services on the IBM Cloud platform, then you built the application on the client side — and there you have it!

Contributors

Kunal Malhotra, Ayush Maan, and Anchal Bhalla contributed to this game and demo.