IBM Data Science Canvas
Data Science Experience Platform
As a part of IBM’s Analytics Platform, my team created a data modeling SaaS product for Data Scientists. We focused on the “canvas” portion of the software, which is a flexible, collaborative GUI-to-code interface for data modeling.
My main responsibilities were to clarify our target persona, engage with users for feedback, and create UX designs based on research insights. I also led the creation of 2-week design/research sprint cycles, which was recognized and shared internally.
Below, I’ve included some of my research and design artifacts that hint at how we converged on our designs. However, I’ve had to omit a great deal of material due to IBM’s NDA. Please reach out and I would be happy to share more details on the experience and skills that I gained while creating the Canvas Modeling UX.
Problem: shortage of data scientists
In the last 5-6 years, the demand for businesses to leverage data-driven insights has increased exponentially. Yet because data science is still a new domain, there is a shortage of seasoned data scientists who can create advanced analytical models. This trend has encouraged people with math, business, and computer science backgrounds to transition into data science roles. However, their limited experience working in code (RStudio and/or Python) keeps them from creating and exploring a wide breadth of advanced analytical models.
6 out of 10 companies provide in-house training for employees to do data science
Source: MIT Sloan Management Review survey of 2,710 business executives, managers and analytics professionals worldwide.
Solution: enable emerging data scientists
The ultimate vision of the “canvas” is to provide data science teams with an intuitive GUI to work collaboratively and create advanced analytical models (import and transform data, apply algorithms, create models, and deploy the models) without having to spend any time working in code.
Legacy Product: SPSS Modeler
IBM currently offers a similar product called SPSS Modeler, a clunky but robust on-prem GUI for advanced data modeling. The canvas took the technical capabilities of SPSS Modeler and redesigned the UX, starting with a thorough understanding of data scientists' needs.
Cross Platform Research
I worked with product designers and researchers across the Analytics platform to better understand the spectrum of users in the data science ecosystem. My responsibilities included engaging with user communities, recruiting users, and planning research activities for the team.
I engaged with data scientists at the University of Texas, Galvanize, Meetups, and Verizon. We conducted contextual observations and user interviews, in person and over the phone, and gathered feedback through surveys.
Our team’s first challenge was to define our product's personas. Our persona lay somewhere between the data scientist and the business analyst. From our initial research, we found that the line distinguishing the role of a data scientist from that of a business analyst is ambiguous in the field of data analytics.
Data Scientist Development Stages
Either currently in or recently graduated from a masters or immersive Data Science program.
Relatively lower skill level, spread roughly evenly across math, computing, and business.
Transitioning into data science from a business analytics background or from an engineering background.
Many companies support this transition in their employees since they already have domain knowledge.
Stronger math or quantitative background.
Proficient in computing as well. There are some who are stronger on the computing side.
Data Science Process
The data science process is iterative, like any scientific experiment. It requires data scientists to test their hypotheses, change variables, log outcomes, and iterate accordingly. Depending on the size of the organization, data science teams can range from 1-10 members. Some data scientists cover the end-to-end data science process alone, while others collaborate with data engineers and business analysts.
DSX Canvas's Focus
Data Science Pain Points
IBM Internal and Competitors' Offerings Research
We researched competitors' offerings and also met regularly with two of IBM's expert data scientists to understand and break down the capabilities of SPSS Modeler, an IBM legacy product offering a GUI for advanced data modeling.
SPSS Modeler Heuristic Evaluation
Competitor Offerings Evaluation
Design Principles Based on User Research
We used these design tenets as guiding principles for making design decisions, and further validated our ideas through user testing.
Biweekly Research Sprints
An essential part of our design process was managing agile internal collaboration among the research, design, and dev teams and the offering managers. We had to collaborate closely with our offering managers to understand the business goals, with developers to assess technical feasibility, and with research to make sure design decisions were validated by user feedback.
I led the creation of a two-week research sprint cycle. These research-centric sprints followed a reliable cadence that helped to align business goals with user needs and technical feasibility. My research sprint cycle was recognized internally and shared with all design researchers.
Collaborative for data science teamwork
Flexible for iterative experimentation
Clarity for navigating complex workflows
Below are some of the key components of DSX Canvas.
Node designs (key contributor)
Tools palette (key contributor)
Node settings editor designs (key contributor)
Commenting on canvas
Canvas Design Opportunities
Based on technical research, we noted three types of nodes: input, intermediate, and output. We explored multiple variations of the nodes so that they would be easy to differentiate at a glance. Our users should be able to tell the data nodes apart from the algorithm nodes and the output model nodes.
Data scientists are generally skeptical of black-box tools, so making information transparent was an important factor to consider in all of our designs. When the nodes are connected and processing results, we let the user feel in control by making the progress status visible.
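As a rough illustration of the node model described above, here is a minimal Python sketch. It is not the product's actual implementation: the class names, the status values, and the label format are all assumptions made for this example. It only shows how the three node types and a visible progress status could be represented.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    """The three node types noted in our technical research."""
    INPUT = "input"                # data nodes
    INTERMEDIATE = "intermediate"  # algorithm / transform nodes
    OUTPUT = "output"              # output model nodes


class RunStatus(Enum):
    """Hypothetical run states, assumed for this sketch."""
    IDLE = "idle"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Node:
    name: str
    kind: NodeKind
    status: RunStatus = RunStatus.IDLE
    progress: float = 0.0  # fraction complete, from 0.0 to 1.0

    def label(self) -> str:
        """Text a canvas could display so the node's state is never hidden."""
        if self.status is RunStatus.RUNNING:
            return f"{self.name} [{self.kind.value}] running {self.progress:.0%}"
        return f"{self.name} [{self.kind.value}] {self.status.value}"


source = Node("Sales data", NodeKind.INPUT)
model = Node("Decision tree", NodeKind.INTERMEDIATE, RunStatus.RUNNING, 0.5)
print(source.label())
print(model.label())
```

Keeping the node type and run status as explicit fields means the visual treatment (shape, color, progress indicator) can be derived from data rather than hard-coded per node.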
The tool palette was an important experience to consider on the canvas because we needed to maximize the canvas's screen space for experimentation. We explored the idea of a tool palette sliding out from the right side navigation on top of the canvas and a flexible moving tool palette that users can drag around the canvas. The tool palette had to be easily and quickly accessible because it is one of the features which users will interact with the most.
Through user testing, we found that flexibility is key to data scientists’ workflow because it allows our users to iterate creatively on experiments. We went with the flexible tool palette, which appears when users click on the palette icon. The tools were categorized by their capabilities, and we also added a category for recently used nodes.
We went through the nodes (approx 90 total) in SPSS Modeler to look for common patterns in the nodes’ editor settings. We designed a template for the editors so that developers could populate it with the settings for each individual node.
Through user testing, we found that SPSS users open multiple editor dialogs at the same time to compare settings. We also found that it is important to keep the context of how the nodes are connected when viewing the editor settings. We designed the editors to expand when the user clicks on a node while keeping the nodes connected to it in view.
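To make the template idea concrete, here is a small Python sketch of one editor template that is populated with per-node settings. Everything in it is hypothetical: the class names, the control types, and the example settings are assumptions for illustration, not the actual SPSS Modeler or DSX schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Setting:
    key: str
    label: str
    control: str          # assumed control types: "text", "number", "dropdown"
    default: Any = None


@dataclass
class EditorTemplate:
    """One shared editor layout; each node type only supplies its settings."""
    node_type: str
    sections: Dict[str, List[Setting]] = field(default_factory=dict)

    def add(self, section: str, setting: Setting) -> "EditorTemplate":
        # Group settings under a named section of the editor.
        self.sections.setdefault(section, []).append(setting)
        return self

    def defaults(self) -> Dict[str, Any]:
        # Flatten all sections into the initial values shown in the editor.
        return {s.key: s.default for group in self.sections.values() for s in group}


# Hypothetical node type and settings, for illustration only.
editor = (
    EditorTemplate("decision_tree")
    .add("Basics", Setting("target", "Target field", "dropdown"))
    .add("Build options", Setting("max_depth", "Maximum depth", "number", 5))
)
print(editor.defaults())
```

With a shared structure like this, roughly 90 node editors can reuse one layout, and only the list of settings changes from node to node.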
Open Beta Canvas on IBM Data Science Experience
The integration of the SPSS Modeler "canvas" into the Data Science Experience was announced at the 2016 World of Watson conference. The canvas is available as an open beta on the IBM Data Science Experience Platform.
Whiteboard Sessions Laying Out Site Map and User Flow
Launching agile design sprints with the entire team
Pros and cons of working in agile sprints
Communicating with dev and PM teams to understand their needs and business objectives
Learning to stay focused while working through ambiguity
A business-driven UX process is not linear
Building internal stakeholders' trust and confidence through user research
UX design isn’t just about designing interfaces, but so much more about figuring out relationships within your company and understanding the core business objective.
Siloed and dispersed teams: research, design, dev, PM
High learning curve to overcome technical domain knowledge
Lack of clarity on business objectives
Disconnected research and design goals
Unanswered questions and assumptions about our users
Providing research-driven design guidance for the new DSX Canvas
Implementing agile sprints and evangelizing sprint process across DSX product teams
Positive feedback from data scientists for the new SPSS Modeler
Increased understanding of user-centric design processes across the PM and dev teams while working on the next project, Watson Machine Learning/Deep Learning.