How to Choose the Most Appropriate Chart?
Updated: Nov 21, 2020
In this information-rich age, data visualizations are designed to make knowledge transfer between deliverers and receivers easier. It is therefore crucial for dashboard creators to know which chart aligns with their key delivery objectives. A basic understanding of what each chart means also helps the audience interpret dashboards effectively. In this article, I introduce a way to better understand some common charts and graphs (e.g. scatter plot, map, pie graph and stacked bar chart) by categorising them into four main types: distribution, comparison, composition and correlation. This is not a clear-cut solution or a rigid boundary that limits a chart to one use. Rather, it is a conclusion drawn from my experience about the main objective each chart is able to communicate. Moreover, designing effective dashboards goes beyond choosing appropriate charts. Read more dashboard design principles in this article.
Distribution charts help to interpret the results of univariate analysis in the early analytical stage. Simply put, they show where data points are dense and where they are sparse in one dimension. They are also widely applied in market research, such as demographic analysis and customer segmentation. Some of the common charts under this category are the histogram, box plot and map. However, I lean towards categorizing the box plot as a "comparison" chart, which I will explain in a later section.
A histogram looks very similar to a bar chart because, oh well, it is also composed of bars. However, instead of comparing categorical data, it breaks a numeric variable down into interval groups and shows how many data points fall into each group. It is commonly used to gain insights about your customers, e.g. Pinterest uses histograms to show the age distribution of your audience. A histogram is good at revealing the pattern of a data distribution along a numeric spectrum. For example, it highlights the most probable value range and whether the data is skewed or centred.
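The binning step behind a histogram can be sketched in a few lines of Python. This is only an illustration under assumptions: the ages, the interval edges and the function name `bin_ages` are made up for the example and do not come from any real audience data.

```python
def bin_ages(ages, edges):
    """Count how many ages fall into each [lower, upper) interval."""
    intervals = list(zip(edges, edges[1:]))
    # Label each interval the way age groups are usually shown, e.g. "18-24".
    counts = {f"{lo}-{hi - 1}": 0 for lo, hi in intervals}
    for age in ages:
        for lo, hi in intervals:
            if lo <= age < hi:
                counts[f"{lo}-{hi - 1}"] += 1
                break  # each age belongs to at most one interval
    return counts


# Hypothetical audience ages and interval edges (assumptions for illustration).
ages = [19, 22, 23, 27, 31, 33, 34, 38, 41, 52]
edges = [18, 25, 35, 45, 55]

age_counts = bin_ages(ages, edges)

# A text rendering: one "#" per person in the interval.
for label, count in age_counts.items():
    print(f"{label:>6} | {'#' * count}")
```

With this sample, the tallest bar is the 25-34 group, which is the kind of "most probable value range" a histogram makes visible.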
A map is also frequently used to show demographic data. By linking to geospatial data, it indicates where your audience or customers are located. The logic behind map charts is that numeric values are aggregated by a geospatial attribute (e.g. region, city, country or state). Gradient colors then represent the variation in data density among locations. In the graph below, regions with higher values are shown in darker colors and regions with lower values in lighter ones.
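The two steps behind a map chart, aggregating by a geospatial attribute and then turning the totals into a color gradient, might look like the sketch below. The region names, the values and the helper names `aggregate_by_region` and `to_shade` are hypothetical; a real mapping tool would also need the region geometry, which is left out here.

```python
from collections import defaultdict


def aggregate_by_region(records):
    """Sum the numeric value of every record per region."""
    totals = defaultdict(float)
    for region, value in records:
        totals[region] += value
    return dict(totals)


def to_shade(totals):
    """Scale each total to a shade from 0.0 (lightest) to 1.0 (darkest)."""
    lowest, highest = min(totals.values()), max(totals.values())
    span = highest - lowest
    # If every region has the same total, there is no gradient to show.
    return {
        region: (value - lowest) / span if span else 0.0
        for region, value in totals.items()
    }


# Hypothetical (region, customer count) records, for illustration only.
records = [
    ("North", 120), ("South", 40), ("North", 80),
    ("East", 120), ("South", 20),
]

totals = aggregate_by_region(records)
shades = to_shade(totals)

for region, shade in shades.items():
    print(f"{region}: total={totals[region]:.0f}, shade={shade:.2f}")
```

Linear scaling is only one possible choice; a skewed dataset may read better with a different color scale.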
It is hard to compare which number is larger or smaller when the values are displayed in a table or spreadsheet. Adding a visual element to the comparison significantly reduces the time and mental energy required to interpret the data. These visual representations can be achieved with a bar chart, line chart or box plot.
A bar chart compares a measure across the values of a categorical dimension. As we can see, comparing the height of each bar gives a more intuitive impression than looking at the table alone. A bar chart is very similar to a histogram. The fundamental difference is that the x-axis of a bar chart is a categorical attribute, whereas in a histogram it is a numeric interval. For example, in this chart, we compare the profit of each market: "EMEA", "APAC" ... In a histogram, by contrast, we break the numeric attribute age down into intervals: "18-24", "25-34" ...
Furthermore, a bar chart is not limited to plotting one categorical attribute. An extension of the bar chart, the clustered bar chart (or grouped bar chart), compares two categorical attributes. For example, the comparison of market profit can be further broken down by year. This allows us to compare market against market and also one order period against another.
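The data behind a clustered bar chart is a measure aggregated over two categorical attributes. A minimal sketch of that aggregation follows; the profit figures are invented, and the function name `profit_by_market_and_year` is an assumption made for the example.

```python
from collections import defaultdict


def profit_by_market_and_year(rows):
    """Sum profit for each (market, year) pair as {market: {year: total}}."""
    table = defaultdict(lambda: defaultdict(int))
    for market, year, profit in rows:
        table[market][year] += profit
    return {market: dict(years) for market, years in table.items()}


# Hypothetical order rows: (market, order year, profit).
rows = [
    ("EMEA", 2019, 100), ("EMEA", 2020, 150),
    ("APAC", 2019, 80), ("APAC", 2020, 120),
    ("EMEA", 2020, 50),
]

clusters = profit_by_market_and_year(rows)

# Each market is one cluster; each year is one bar inside the cluster.
for market, years in clusters.items():
    print(market)
    for year in sorted(years):
        print(f"  {year} | {'#' * (years[year] // 10)} {years[year]}")
```

Reading across clusters compares markets, while reading inside a cluster compares years for the same market.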
A line chart indicates trends and developments in numeric data over time. It is commonly used in time series analysis, visualizing the fluctuation of a numeric variable against a date-type variable. Each line is itself a comparison between one point in time and another. Additionally, we can introduce a categorical attribute and use distinct colors to bring out the contrast between categories. For example, the chart below plots the number of orders over time, and each line represents one customer segment. Read horizontally, it illustrates the time series of order quantities. Read vertically, by comparing the lines with each other, it shows that the number of orders differs remarkably among segments.
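To make the two reading directions concrete, the sketch below builds one series per customer segment over a shared time axis. The months, the segment names and the function name `order_series` are assumptions for illustration; the resulting lists could be handed to any plotting library as the y-values of each line.

```python
from collections import defaultdict


def order_series(orders):
    """Return (sorted months, {segment: [order count per month]})."""
    months = sorted({month for month, _ in orders})
    counts = defaultdict(lambda: dict.fromkeys(months, 0))
    for month, segment in orders:
        counts[segment][month] += 1
    # Every series uses the same month order, so the lines share one x-axis.
    series = {
        segment: [by_month[month] for month in months]
        for segment, by_month in counts.items()
    }
    return months, series


# Hypothetical orders: (order month, customer segment).
orders = [
    ("2020-01", "Consumer"), ("2020-01", "Consumer"), ("2020-01", "Corporate"),
    ("2020-02", "Consumer"),
    ("2020-03", "Consumer"), ("2020-03", "Consumer"), ("2020-03", "Consumer"),
    ("2020-03", "Corporate"),
]

months, series = order_series(orders)

print("months:", months)
for segment, values in series.items():
    print(f"{segment}: {values}")
```

Each list read left to right is the horizontal, over-time comparison; comparing the lists with each other at the same index is the vertical, between-segment comparison.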