Thursday, June 4, 2020

How to Select a Data Visualization

WHAT IS A DATA VISUALIZATION?

A data visualization (often referred to as a chart or graph) is a graphical representation of data. Its goal is to convey information in a clear and succinct format.
Data visualization is as much an art as it is a science. Selecting the right type of chart and adding features that will enhance the visualization while keeping it readable can be quite challenging.

HOW TO SELECT A DATA VISUALIZATION

There are many types of charts readily available in most data visualization products. For example, Chartio provides over 15 types of charts - and each chart can have multiple variations. If you’re an advanced data modeler, you have hundreds, if not thousands, of options using data programming languages. So how do you select which chart to use?
In general, you want to select the most straightforward visualization that will convey the point you want to make. When selecting a visualization, always think from the perspective of your audience, and make design choices that will allow them to quickly understand the insights you’re trying to convey.

COMMON CHART TYPES USED BY USE CASE

While many chart types can be used for the same set of data, here are some common use cases and the charts typically associated with them.
To show trends in your data over time:
  • Line chart
  • Column chart
Line charts and column charts are used to show trends over time.
To compare values of different categories:
  • Column chart
  • Bar chart
  • Pie chart 
Bar charts and pie charts are used to compare values between groups.
To show the composition of a total:
  • Pie chart
  • Stacked bar chart
Pie charts and stacked bar charts are used to show how parts combine into a whole.
To understand relationships between factors:
  • Scatter plot
Scatter plots show how two numeric factors relate to one another.
To understand the distribution of your data:
  • Scatter plot
  • Box plot
Scatter plots and box plots can show how your data is distributed.
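As a rough illustration, the pairings listed above can be written down as a small lookup table. This is only a sketch: the goal names ("trend", "comparison", and so on) and the function name suggest_charts are hypothetical labels chosen for this example, not part of any charting product.

```python
# Hypothetical lookup table that mirrors the use cases listed above.
# The keys are illustrative names, not terms from any specific tool.
CHARTS_BY_GOAL = {
    "trend": ["line chart", "column chart"],
    "comparison": ["column chart", "bar chart", "pie chart"],
    "composition": ["pie chart", "stacked bar chart"],
    "relationship": ["scatter plot"],
    "distribution": ["scatter plot", "box plot"],
}


def suggest_charts(goal):
    """Return the chart types commonly associated with an analysis goal."""
    key = goal.strip().lower()
    if key not in CHARTS_BY_GOAL:
        options = ", ".join(sorted(CHARTS_BY_GOAL))
        raise ValueError(f"unknown goal {goal!r}; expected one of: {options}")
    # Return a copy so callers cannot modify the table by accident.
    return list(CHARTS_BY_GOAL[key])


if __name__ == "__main__":
    print(suggest_charts("composition"))
```

A table like this is only a starting point. The final choice still depends on your audience and the point you want to make.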

BEST PRACTICES FOR DATA VISUALIZATIONS

Formatting Chart Axes

The axes of a chart convey the context and scale of the data presented in the chart.
Take, for example, this fictitious trend of monthly revenue.
This line chart of revenue by month includes a zero baseline.
Many data visualization tools start the y-axis at 0 by default, the way this chart does. However, that scale may make it difficult to see how the values actually change. A rule of thumb is to select the range of your y-axis so that the trend takes up about two-thirds of the chart space, as in this example:
By reducing the vertical range of the plot, the changes in revenue are more easily seen.
On the other hand, using this contracted scale might make it look as though revenue fluctuates more than it really does. Manipulating the ranges of your axes can mislead your reader.
Be mindful when selecting the scale of your chart axes. Make sure the scale is adequate to help tell your story, but ensure that it is not deceptive. You should always clearly label the axes on your chart to make them easier to read, and this is especially important when you adjust the ranges, so that the reader is not deceived.
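The two-thirds rule of thumb described above can be turned into a small calculation. The sketch below assumes the padding is split evenly above and below the data. The function name two_thirds_ylim, the fill parameter, and the revenue figures are all made up for illustration.

```python
def two_thirds_ylim(values, fill=2 / 3):
    """Return (y_min, y_max) so the data's range spans `fill` of the axis.

    Assumption: the leftover space is split evenly above and below the data.
    """
    if not 0 < fill <= 1:
        raise ValueError("fill must be in the interval (0, 1]")
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat series: there is no range to scale, so pad by an arbitrary unit.
        return lo - 1.0, hi + 1.0
    axis_height = span / fill          # total height the axis should cover
    pad = (axis_height - span) / 2     # equal padding on both sides
    return lo - pad, hi + pad


if __name__ == "__main__":
    # Fictitious monthly revenue values, invented for this example.
    revenue = [100, 110, 105, 120]
    y_min, y_max = two_thirds_ylim(revenue)
    print(f"y-axis from about {y_min:.1f} to {y_max:.1f}")
```

For the sample values, the data spans 20 units, so the axis covers about 30 units, from roughly 95 to 125. Most charting libraries let you set such limits explicitly. In matplotlib, for instance, they could be passed to Axes.set_ylim. Because the axis no longer starts at zero, clear axis labels matter even more.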

Use of Color

Colors in charts aren’t only for aesthetic purposes; they can also play a key role in your data visualization. Heat maps are great examples of using color as an additional dimension in a chart. They’re often used in cohort charts, where the shade of the color represents the magnitude of the cohort value. The shading in this Google Analytics cohort chart makes it clear that the value of a cohort decays quickly over time for all cohorts.
Color is used in a heatmap where darker shades indicate larger values.
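To make the idea of shading concrete, here is a minimal sketch of how a value could be mapped to a shade, with larger values getting darker colors. It assumes a simple linear interpolation between two RGB colors. The function name shade and the specific light and dark colors are arbitrary choices for this example, and real tools may use different color scales.

```python
def shade(value, vmin, vmax, light=(255, 255, 255), dark=(8, 48, 107)):
    """Map a value to a hex color; larger values give darker shades.

    Assumption: a linear blend from `light` (at vmin) to `dark` (at vmax).
    """
    if vmax == vmin:
        t = 0.0  # degenerate range: use the lightest shade
    else:
        t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp values that fall outside the range
    r, g, b = (round(l + (d - l) * t) for l, d in zip(light, dark))
    return f"#{r:02x}{g:02x}{b:02x}"


if __name__ == "__main__":
    # Hypothetical retention percentages for one cohort over four weeks.
    for pct in [100, 40, 15, 5]:
        print(pct, shade(pct, 0, 100))
```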
Color is also often used in scatter plots to add a categorical dimension to the chart. For example, this chart uses color to distinguish the points that correspond to male and female children.
Distinct colors in a scatterplot can encode different groups.
You can also use color to focus attention on the story you’re telling with your data. For example, you may be writing about how close the 2000 US presidential election was. You can include a chart like this one, where you provide election results for all candidates but use color to indicate that your point is about the results for Al Gore and George W. Bush, graying out the results for the other candidates.
Color in a bar chart can be used to call attention to specific groups.
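The graying-out technique described above amounts to giving an accent color to the categories in focus and one muted color to everything else. The following is a sketch under that assumption. The function name highlight_colors, the hex colors, and the placeholder labels for the other candidates are illustrative only.

```python
def highlight_colors(labels, focus_colors, muted="#bdbdbd"):
    """Return one color per label: an accent for focus labels, gray otherwise.

    `focus_colors` maps each label to emphasize to its accent color.
    """
    return [focus_colors.get(label, muted) for label in labels]


if __name__ == "__main__":
    # The last two labels are placeholders, not real candidate names.
    candidates = ["George W. Bush", "Al Gore", "Other candidate A", "Other candidate B"]
    focus = {"George W. Bush": "#d62728", "Al Gore": "#1f77b4"}
    for name, color in zip(candidates, highlight_colors(candidates, focus)):
        print(f"{name}: {color}")
```

The resulting list can be handed to whichever charting tool you use as the per-bar colors, so the reader’s eye goes to the two highlighted bars first.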
When using colors, remember to use them with discretion - you want the color to add to the visualization, not take away from it. If color isn’t playing a role in telling your data story, keep the colors few and as muted as possible.

Other Visualization Best Practices

There are lots of other features you can add to your charts: trend lines, annotations, additional dimensions, and so on. With common business tools, you can create incredibly complex and thorough visualizations. But as mentioned before, always keep in mind that the goal of a visualization is to be clear and concise.
A trap analysts often fall into is to add many features to a chart to show very detailed information. The person who has been thoroughly analyzing the data might be able to easily look at a very complex chart and understand the story it’s telling. However, those who haven’t been as close to the data may struggle. Make sure you’re designing a chart for your audience, not for yourself. Remember, a clear and simple chart tells a stronger story than a complex chart that no one understands.

CONCLUSION

Creating a data visualization that is clear and concise is as much an art as a science. Remember to keep your charts simple and readable, and to approach the design of your visualization from the perspective of its intended audience.
