A box plot, also known as a box-and-whisker diagram or candlestick chart, is a good graphical method for assessing the characteristics of one or more data sets.
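It consists of 5 major components: the smallest observed value, the lower quartile, the median, the upper quartile and the largest observed value.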
Data points beyond the lowest and highest observed values are considered outliers. Some statistical analysis software also includes the mean in the box plot.
A box plot lets you assess the distribution of a population without making any statistical assumptions. The length of the box indicates the spread of the data, and a median line that sits off-center within the box indicates skew.
A box plot can be drawn either horizontally or vertically.
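The description above treats points beyond the lowest and highest observed values as outliers, but it does not say how those limits are chosen. One common convention, assumed here rather than taken from this article, is the 1.5 x IQR rule: anything more than 1.5 box lengths below the lower quartile or above the upper quartile is flagged. The sketch below uses invented numbers, and the function name split_outliers and the parameter k are hypothetical. Different software may compute quartiles slightly differently, so the limits can vary.

```python
from statistics import quantiles

def split_outliers(values, k=1.5):
    """Return the whisker ends and the outliers (assumed 1.5 x IQR rule)."""
    # Quartiles by the "inclusive" method; other tools may use another method
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # the length of the box
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    # Whiskers end at the most extreme points that are still inside the fences
    return (min(inside), max(inside)), outliers

# Invented data for illustration only
whiskers, outliers = split_outliers([5, 3, 6, 2, 7, 8, 2, 4, 6, 30])
print("whisker ends:", whiskers)
print("outliers:", outliers)
```

With this made-up sample, the value 30 sits far above the rest and is the only point flagged.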
Box plot step by step. Let's say this is your data: 5, 3, 6, 2, 7, 8, 2, 4, 6
Step 1: Sort the data in ascending order
Step 2: Figure out the median
Step 3: Figure out the lower quartile
Step 4: Figure out the upper quartile
Step 5: Figure out the smallest and largest observed values
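The five steps can be sketched in a few lines of Python. Two assumptions are made here: the "2 7" in the data is read as the two values 2 and 7, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Statistical packages use several quartile conventions, so their results may differ slightly. The function name five_number_summary is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Five components of a box plot (median-of-halves quartiles, assumed)."""
    data = sorted(values)                 # Step 1: sort in ascending order
    n = len(data)
    half = n // 2
    lower = data[:half]                   # values below the middle
    upper = data[half + (n % 2):]         # skip the middle value when n is odd
    return {
        "min": data[0],                   # Step 5: smallest observed value
        "q1": median(lower),              # Step 3: lower quartile
        "median": median(data),           # Step 2: median
        "q3": median(upper),              # Step 4: upper quartile
        "max": data[-1],                  # Step 5: largest observed value
    }

data = [5, 3, 6, 2, 7, 8, 2, 4, 6]
summary = five_number_summary(data)
print(summary)
```

For this data the sorted list is 2, 2, 3, 4, 5, 6, 6, 7, 8, so the median is 5, the lower quartile is 2.5, the upper quartile is 6.5, and the smallest and largest values are 2 and 8. The function needs at least two values.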
Most statistical software will also show the mean in the box plot. The mean and median will be close to each other when the data are normally distributed.
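A quick way to see why the mean is worth showing next to the median: in a symmetric sample the two agree, while a single large value pulls the mean away and leaves the median where it was. The numbers below are invented for illustration and are not drawn from a normal distribution. They only show the symmetric versus skewed contrast.

```python
from statistics import mean, median

# Invented samples for illustration only
symmetric = [2, 4, 6, 8, 10]
skewed = [2, 4, 6, 8, 50]   # same as above, but the largest value is far out

sym_gap = mean(symmetric) - median(symmetric)
skew_gap = mean(skewed) - median(skewed)

print("symmetric: mean", mean(symmetric), "median", median(symmetric))
print("skewed:    mean", mean(skewed), "median", median(skewed))
```

A large gap between mean and median on a box plot is a hint that the data are skewed or contain outliers.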
I like to use box plots to assess how the data behave before doing further analysis, especially for long-term data that involve person-to-person variation, machine-to-machine variation and material batch-to-batch variation. Understanding the data behavior is important so that we can draw meaningful conclusions from the data collected.
The box plot below shows data collected from the same process at four different time intervals. From the box plot, what can you tell about the process?
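When reading side-by-side box plots like this, the two things to compare are the position of the median (has the process shifted?) and the length of the box (has the variation changed?). The sketch below puts numbers on that comparison. The measurements and the interval names T1 to T4 are invented for illustration and are not the data behind the plot. The quartiles use Python's "inclusive" method, which is one of several conventions.

```python
from statistics import quantiles

# Hypothetical measurements for four time intervals (invented, not real data)
samples = {
    "T1": [10, 11, 12, 13, 14],
    "T2": [11, 12, 12, 13, 14],
    "T3": [13, 14, 15, 16, 17],
    "T4": [6, 9, 12, 15, 18],
}

summary = {}
for name, values in samples.items():
    # quantiles() with n=4 returns the lower quartile, median and upper quartile
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    summary[name] = {"median": med, "iqr": q3 - q1}
    print(f"{name}: median={med}, box length (IQR)={q3 - q1}")

# Interval whose center sits highest, and interval with the most variation
highest_median = max(summary, key=lambda k: summary[k]["median"])
widest = max(summary, key=lambda k: summary[k]["iqr"])
print("highest median:", highest_median)
print("widest box:", widest)
```

In this made-up example T3 shows a shift in the process center, while T4 shows the same center as T1 but much more variation.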



