In statistics, a histogram is a type of graph that shows the distribution of frequencies of data, usually in the form of vertical bars. This type of graph is also commonly called a frequency histogram, and sometimes a bar graph or bar chart. In a histogram graph, the height of each bar shows the number of items that fall into that range on the graph. Histogram distribution graphs are often considered the most important tool in studying the distribution of data.
The histogram plot is generally shown with the frequency of the data values on the vertical Y-axis of the graph and the different types or categories of data along the horizontal X-axis of the graph. In addition to showing how often a particular value occurs in a set of data, some other information can be gleaned using statistical analysis of histogram data. This includes the ‘shape’ of the data, such as a ‘flat’ distribution or a ‘bell-shaped’ distribution.
A Pareto graph is a special kind of histogram where the bars are ordered by size, with the tallest bar shown at the leftmost side of the graph and the smallest at the right. These Pareto graphs are often used in quality control projects to highlight the commonest product defects in manufacturing systems. They get their name from the ‘Pareto Principle’, which states that 20% of the inputs to the manufacturing process will cause 80% of the defects.
Some examples of patterns that may be detected in histograms include single peak and dual peak patterns. A single peak in the data is also the statistical mean average for the data. When the statistical mean is not in the center of the graph, this may indicate a special reason, which might be useful to investigate.
A dual peak pattern occurs where there are two very high bars on the graph. When this pattern is seen, it may indicate that there are two distinct sources of data. For example in a production line, the peaks might be attributable to two individual operators.
Histogram equalization is a method used in image manipulation, such as digital photography software. It uses histograms to work out which intensity levels in the image are most common, and then distributes these more efficiently. In this way, the overall contrast of the image can be improved for better viewing.