IBM Data Science Canvas
Data Science Experience Platform

As a part of IBM’s Analytics Platform, my team created a data modeling SaaS product for Data Scientists. We focused on the “canvas” portion of the software, which is a flexible, collaborative GUI-to-code interface for data modeling.

 

My main responsibilities were to clarify our target persona, engage with users for feedback, and create UX designs based on research insights. I also led the creation of two-week design/research sprint cycles, which were recognized and shared internally.
 

Below, I’ve included some of my research and design artifacts that hint at how we converged on our designs. However, I’ve had to omit a great deal of material due to IBM’s NDA. Please reach out and I would be happy to share more details on the experience and skills that I gained while creating the Canvas Modeling UX.

Role

User Research

UX Design

Collaboration with

Visual designer

Frontend designer

Design lead

Development

Product manager

Problem: shortage of data scientists

In the last 5-6 years, the demand from businesses for data-driven insights has grown rapidly. Yet because data science is still a new domain, there is a shortage of seasoned data scientists who can create advanced analytical models. This trend has encouraged people with math, business, and computer science backgrounds to transition into data science roles. However, their limited experience working in code (RStudio and/or Python) keeps them from creating and exploring a wide breadth of advanced analytical models.

 

6 out of 10 companies provide in-house training for employees to do data science

Source: MIT Sloan Management Review survey of 2,710 business executives, managers and analytics professionals worldwide.

Solution: enable emerging data scientists

The ultimate vision of the “canvas” is to provide data science teams with an intuitive GUI to work collaboratively and create advanced analytical models (import and transform data, apply algorithms, create models, and deploy the models) without having to spend any time working in code.

Early Concept

Legacy Product: SPSS Modeler

IBM currently offers a similar product called SPSS Modeler, a clunky but robust on-prem GUI for advanced data modeling. The canvas took the technical capabilities of SPSS Modeler and redesigned the UX, starting from a thorough understanding of data scientists' needs.

SPSS Modeler

Cross Platform Research

User Research

I worked with product designers and researchers across the Analytics platform to better understand the spectrum of users in the data science ecosystem. My responsibilities included engaging with user communities, recruiting users, and planning research activities for the team.

I engaged with data scientists at the University of Texas, Galvanize, Meetups, and Verizon. We conducted contextual observations, held user interviews in person and over the phone, and gathered feedback through surveys.

Defining Personas

Our team’s first challenge was to define our product's personas. Our persona lay somewhere between the data scientist and the business analyst. From our initial research, we found that the line distinguishing the role of a data scientist from that of a business analyst is ambiguous in the field of data analytics.

Data Scientist Development Stages

Early Career

Either currently in or recently graduated from a master's or immersive Data Science program.

Relatively lower skill level, with roughly equal ability across math, computing, and business.

Transitioning

Transitioning into data science from a business analytics background or from an engineering background.


Many companies support this transition because these employees already have domain knowledge.

Practicing

Stronger math or quantitative background.

Proficient in computing as well; some are stronger on the computing side.

Data Science Process

The data science process is iterative, like any scientific experiment. It requires data scientists to test their hypotheses, change variables, log outcomes, and iterate accordingly. Depending on the size of the organization, data science teams can range from 1 to 10 members. Some data scientists cover the end-to-end data science process alone, while others collaborate with data engineers and business analysts.

DSX Canvas's Focus

Data Science Pain Points

IBM Internal and Competitors' Offerings Research

We researched competitors' offerings and also met regularly with two of IBM's expert data scientists to understand and break down the capabilities of SPSS Modeler, an IBM legacy product that offers a GUI for advanced data modeling.

SPSS Modeler Heuristic Evaluation

SPSS Modeler

Competitor Offerings Evaluation

Design Principles Based on User Research

We used these design tenets as guiding principles for making design decisions, and further validated our ideas through user testing.

  • Collaborative for data science teamwork

  • Flexible for iterative experimentation

  • Clarity for navigating complex workflows

Biweekly Research Sprints

An essential part of our design process was managing agile internal collaboration among the research, design, and dev teams and the offering managers. We had to collaborate closely with our offering managers to understand the business goals, with developers to assess technical feasibility, and with research to make sure design decisions were validated by user feedback.

I led the creation of a two-week research sprint cycle. These research-centric sprints followed a reliable cadence that helped to align business goals with user needs and technical feasibility. My research sprint cycle was recognized internally and shared with all design researchers.

Below are some of the key components of DSX Canvas.

  • Node designs (key contributor)

  • Tools palette (key contributor)

  • Node settings editor designs (key contributor)

  • Commenting on canvas

  • Canvas search

  • Error states

Canvas Design Opportunities

Node Interactions

Based on technical research, we noted three types of nodes: input, intermediate, and output. We explored multiple variations of the nodes so that they would be easy to differentiate at a glance. Our users should be able to tell the data nodes apart from the algorithm nodes and the output model nodes.

Data scientists are generally skeptical of black boxes, so making information transparent was an important factor in all of our designs. When the nodes are connected and processing results, we let the user feel in control by making the progress status visible.

Tool Palette

The tool palette was an important experience to consider on the canvas because we needed to maximize the canvas's screen space for experimentation. We explored two ideas: a tool palette that slides out from the right-side navigation on top of the canvas, and a flexible, movable tool palette that users can drag around the canvas. The tool palette had to be quickly and easily accessible because it is one of the features that users interact with the most.

Through user testing, we found that flexibility is key to data scientists’ workflow because it allows our users to iterate creatively on experiments. We went with the flexible tool palette, which appears when users click on the palette icon. The tools were categorized by their capabilities, and we also added a category for recently used nodes.

Editor Settings

We went through the nodes (approximately 90 in total) in SPSS Modeler to look for common patterns in the nodes' editor settings. We designed a template for the editors so that developers could populate it with the settings for the individual nodes.

Through user testing, we found that SPSS users open multiple editor dialogs at the same time to compare settings. We also found that it is important to have context for how the nodes are connected when viewing the editor settings. We designed the editors to expand when the user clicks on a node while keeping the nodes connected to it in view.

Evaluative Research: User Feedback, Surveys, Usability Tests

We created multiple iterations of lo-fi to mid-fi wireframes. In our two-week design sprints, we looped in user feedback through recurring usability tests.

Open Beta Canvas on IBM Data Science Experience

The integration of the SPSS Modeler "canvas" into the Data Science Experience was announced at the 2016 World of Watson conference. The canvas is available as an open beta on the IBM Data Science Experience Platform.

Explorations

Whiteboard Sessions Laying Out Site Map and User Flow

Key Learnings

  • Launching agile design sprints with the entire team

  • Pros and cons of working in agile sprints

  • Communicating with dev and PM teams to understand their needs and business objectives

  • Learning to stay focused while working through ambiguity

  • A business-driven UX process is not linear

  • Building internal stakeholders' trust and confidence through user research

  • UX design isn’t just about designing interfaces, but so much more about figuring out relationships within your company and understanding the core business objective.

Challenges

  • Siloed and dispersed teams: research, design, dev, PM

  • Steep learning curve to acquire technical domain knowledge

  • Lack of clarity on business objectives

  • Disconnected research and design goals

  • Unanswered questions and assumptions about our users

Overcoming Challenges

  • Providing research-driven design guidance for the new DSX Canvas

  • Implementing agile sprints and evangelizing sprint process across DSX product teams

  • Positive feedback from data scientists for the new SPSS Modeler

  • Increased understanding of user-centric design processes across the PM and dev teams while working on the next project, Watson Machine Learning/Deep Learning.

NGWB Information Architecture

Collaborative session mapping out the NGWB information architecture. We wrote all the features that came to mind on post-its, then structured them into an IA flow.