What is a histogram?

A histogram is a graphical view of the variation in a set of data. Its pictorial nature makes patterns easy to see, whereas the same patterns are very difficult to spot in a simple table of numbers. It is commonly used to check whether the output of a process follows a normal distribution.

The French statistician A. M. Guerry first developed the histogram in 1833. He introduced this new kind of bar graph to describe his analysis of crime data.

Key concepts about histograms:

① Values in a set of data usually show variation: Variation is everywhere. It is inevitable in the output of any process, because it is impossible to keep all factors constant all the time.

② Variation always shows a pattern: Different factors produce different variations, but there is always some pattern to the variation. These patterns of variation in data are called distributions.

There are three important characteristics of a histogram:
ⓐ Its center
ⓑ Its width
ⓒ Its shape
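
The three characteristics can also be put into numbers. The following is a minimal sketch, assuming the mean stands for the center, the standard deviation for the width, and a moment-based skewness for the shape; these choices and the `summarize` helper are illustrative, not part of the original method.

```python
import statistics

def summarize(data):
    """Return rough numeric measures of center, width, and shape."""
    center = statistics.mean(data)
    width = statistics.pstdev(data)  # population standard deviation
    if width == 0:
        skewness = 0.0  # all values identical: no shape to describe
    else:
        # Third standardized moment: near 0 for symmetric data,
        # positive when the longer tail is on the right.
        skewness = sum((x - center) ** 3 for x in data) / (len(data) * width ** 3)
    return {"center": center, "width": width, "skewness": skewness}

# Hypothetical measurements, for illustration only.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 10.1, 9.9]
print(summarize(sample))
```

These numbers complement the picture rather than replace it: two data sets can share the same center and width and still have very different shapes.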

③ The pattern of variation is difficult to see in a simple table of numbers.

④ The pattern of variation is easier to see when the data are summarized pictorially in a histogram.

How to use and interpret a histogram?

Identifying and explaining the pattern of variation: The goal of analyzing a histogram is to:

  • Identify and classify the pattern of variation
  • Develop a reasonable and relevant explanation for the pattern

Classification of histograms by pattern of variation:

① Bell-shaped Distribution:

A symmetrical shape with a peak in the middle of the range of the data. This is the normal distribution of data from a process. Deviation from this bell shape may indicate the presence of an outside influence.


② Double-peaked Distribution:

A significant drop in the middle of the range of the data with a peak on either side. This pattern is usually a combination of two bell-shaped distributions and suggests that two distinct processes are at work.


③ Plateau Distribution:

A flat top with no distinct peak and a slight tail on either side. This pattern is likely to be the result of many different bell-shaped distributions with centers spread evenly throughout the range of the data.


④ Comb Distribution:

High and low values alternating in a regular fashion. This pattern typically indicates measurement error or errors in the way the data were grouped to construct the histogram. The presence of alternating highs and lows is a warning of possible errors in data collection.
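
A small experiment shows how grouping alone can produce a comb. This is a sketch under stated assumptions: the readings are hypothetical whole numbers, evenly spread, and `tally_by_width` is an illustrative helper. A cell width of 1.5 does not match the resolution of the readings, so cells alternately capture two values and one value.

```python
def tally_by_width(values, low, width, n_cells):
    """Count how many values fall into each cell of the given width."""
    counts = [0] * n_cells
    for v in values:
        index = int((v - low) // width)
        if 0 <= index < n_cells:
            counts[index] += 1
    return counts

# Hypothetical readings: the whole numbers 0 to 11, ten of each.
readings = [v for v in range(12) for _ in range(10)]

# Cell width matches the measurement resolution: a flat pattern.
print(tally_by_width(readings, 0, 1, 12))
# prints [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

# Cell width of 1.5: the same data now look like a comb.
print(tally_by_width(readings, 0, 1.5, 8))
# prints [20, 10, 20, 10, 20, 10, 20, 10]
```

When a comb appears, checking the cell width against the resolution of the measuring instrument is a reasonable first step.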


⑤ Skewed Distribution:

An asymmetrical shape in which the peak is off-center in the range of the data and the distribution tails off sharply on one side and gently on the other. It is positively skewed when the long tail extends to the right and negatively skewed when it extends to the left.


⑥ Truncated Distribution:

An asymmetrical shape in which the peak is at or near the edge of the range of the data, and the distribution ends abruptly on one side and tails off gently on the other.


⑦ Isolated-peaked Distribution:

A small, separate group of data in addition to the larger distribution. It resembles the double-peaked distribution, but the small size of the second peak indicates an abnormality.


⑧ Edge-peaked Distribution:

A large peak is attached to an otherwise smooth distribution. This shape occurs when the extended tail of the smooth distribution has been cut off and lumped into a single category at the edge of the range of the data.
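
The lumping effect is easy to reproduce. The sketch below assumes hypothetical whole-number readings with a long right tail and an illustrative `lump_tail` helper that records everything above a limit as the limit itself.

```python
from collections import Counter

def lump_tail(values, limit):
    """Record every value above `limit` as `limit` (tail lumped into one category)."""
    return [min(v, limit) for v in values]

# Hypothetical readings with a long right tail.
readings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 9]

counts = Counter(lump_tail(readings, 5))
for value in sorted(counts):
    print(value, "#" * counts[value])
```

The last category (5) ends up taller than the true peak (3), which is the edge peak described above.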

Precautions in interpreting a histogram

There are three main precautions when interpreting histograms:

  • The data should come from the current and typical condition of the process.
  • The sample size should be large.
  • The interpretation of the histogram must be confirmed through further analysis and observation of the process.

Application of Histograms

  • Identifying the root cause: the histogram is a simple but powerful analytical tool that helps us understand the process and develop reasonable, fact-based theories about the root cause of problems.
  • Checking process performance

Steps in constructing a histogram:

Step 1: From the table of raw data, determine the high value, the low value, and the range.

Step 2: Decide on the number of cells

No. of data points    Recommended number of cells
20 – 50               6
51 – 100              7
101 – 200             8
201 – 500             9
501 – 1000            10
Over 1000             11 – 20
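
The table above can be turned into a small lookup. This is only a sketch: `recommended_cells` is a hypothetical helper, and for more than 1000 points it returns 11, the low end of the 11 to 20 range, which is an arbitrary choice.

```python
def recommended_cells(n_points):
    """Look up the recommended number of cells for a given number of data points."""
    if n_points < 20:
        raise ValueError("the table starts at 20 data points")
    # (upper limit of the row, recommended number of cells)
    table = [(50, 6), (100, 7), (200, 8), (500, 9), (1000, 10)]
    for upper, cells in table:
        if n_points <= upper:
            return cells
    # "Over 1000": the table allows 11 to 20; 11 is an assumed default.
    return 11

print(recommended_cells(120))  # prints 8
```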

Step 3: Calculate the approximate cell width

Step 4: Round the cell width to a convenient number
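
Steps 3 and 4 can be sketched as follows. The approximate width is the range divided by the number of cells. What counts as a "convenient" number is not defined in the steps, so the rule used here (the closest of 1, 2, or 5 times a power of ten) is an assumption, as are the function names and the sample data.

```python
import math

def approximate_cell_width(data, n_cells):
    """Step 3: range of the data divided by the number of cells."""
    return (max(data) - min(data)) / n_cells

def convenient_width(approx):
    """Step 4: closest of 1, 2, or 5 times a power of ten (an assumed rule)."""
    base = 10 ** math.floor(math.log10(approx))
    candidates = [step * base for step in (1, 2, 5, 10)]
    return min(candidates, key=lambda c: abs(c - approx))

# Hypothetical measurements, for illustration only.
data = [9.2, 9.9, 10.1, 10.4, 10.7]
approx = approximate_cell_width(data, 6)
print(approx, convenient_width(approx))
```

Rounding changes the final number of cells slightly, which is acceptable because the recommended number of cells is only a guideline.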

Step 5: Construct the cells by listing the cell boundaries

Step 6: Tally the number of data points in each cell
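
Steps 5 and 6 can be sketched in a few lines. The boundary convention is an assumption: a point that falls exactly on a boundary is counted in the higher cell, except that the top edge of the last cell belongs to that cell. `build_cells` and `tally` are illustrative names.

```python
def build_cells(low, width, n_cells):
    """Step 5: list the (lower, upper) boundaries of each cell, starting at `low`."""
    return [(low + i * width, low + (i + 1) * width) for i in range(n_cells)]

def tally(data, cells):
    """Step 6: count the data points that fall in each cell."""
    counts = [0] * len(cells)
    for x in data:
        for i, (lower, upper) in enumerate(cells):
            is_last = i == len(cells) - 1
            # Boundary values go to the higher cell; the last cell keeps its top edge.
            if lower <= x < upper or (is_last and x == upper):
                counts[i] += 1
                break
    return counts

cells = build_cells(0, 2, 3)
print(cells)                              # prints [(0, 2), (2, 4), (4, 6)]
print(tally([0, 1, 2, 3, 5, 6], cells))   # prints [2, 2, 2]
```

Whatever convention is chosen, it should be applied consistently so that every data point lands in exactly one cell.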

Step 7: Draw and label the horizontal axis

Step 8: Draw and label the vertical axis

Step 9: Draw in the bars to represent the number of data points in each cell.

Step 10: Title the chart, indicate the total number of data points, and show nominal values and limits.
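
Steps 7 to 10 can be approximated in plain text before drawing the real chart. This is a rough sketch, not a plotting recipe: `render_histogram` is a hypothetical helper, the chart is turned on its side (cells run down the page, bars run across), and nominal values and limits are left out.

```python
def render_histogram(title, cells, counts):
    """Return a text histogram: one row per cell, one '#' per data point."""
    rows = [f"{title} (n = {sum(counts)})"]  # title with total number of points
    for (lower, upper), count in zip(cells, counts):
        rows.append(f"{lower:g} to {upper:g} | {'#' * count} {count}")
    return "\n".join(rows)

# Hypothetical cells and tallies, for illustration only.
print(render_histogram("Shaft diameter", [(0, 2), (2, 4), (4, 6)], [1, 4, 2]))
```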

Step 11: Identify and classify the pattern of variation

Step 12: Develop a reasonable and relevant explanation for the pattern

