A histogram is an effective way to display univariate data so that the data's frequency distribution is visible. Several factors must be considered when creating one, ranging from the analysis of the raw data to the preferences of the intended audience. To create the optimal histogram, one must carefully consider the nature of the data, the analysis of the data, the preferences of the audience, and the software or materials available.
Before creating a histogram, one should consider the nature of the data to be analyzed. Histograms are generally used to show the distribution of univariate data sets. More specifically, a histogram is a visual representation of the data's frequency distribution and, when scaled so that the bar areas sum to one, an estimate of its probability density function. It is advisable to consider alternative graphs that could better represent the data before constructing a histogram.
If a histogram is indeed the best choice for representing the data, the next variable for consideration is the intended audience. College professors, high school math teachers, engineering managers and media consumers may all have different expectations and demands. For example, a mathematics professor may wish to see a histogram constructed on graph paper by hand for an assignment in statistics, whereas an engineering manager may wish to see a histogram in a specific format required by the company. In all cases, easily readable labels on the axes and neat, precise construction are desirable traits.
Creating a histogram by hand is the method most often encountered by students of statistics. To begin, the number and width of the bins are calculated, and the bin boundaries are labeled on a horizontal scale. A common rule of thumb is to set the number of evenly spaced bins to the square root of the number of observations in the data set, rounded to a whole number; the bin width is then the range of the data divided by the number of bins. A vertical scale is then marked with the bin frequencies or relative frequencies. Above each bin, a straight edge is used to draw a rectangle with a height equal to the bin's corresponding frequency, and the axes are clearly labeled.
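The by-hand procedure can also be sketched in a few lines of Python, which may help when checking a hand-drawn histogram. This is only an illustration under stated assumptions: the function name sqrt_rule_histogram is hypothetical, the square root is rounded up (the rule of thumb does not say which way to round), and the largest observation is placed in the last bin.

```python
import math


def sqrt_rule_histogram(data):
    """Return (edges, counts) for evenly spaced bins chosen by the square-root rule."""
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")

    # Number of bins: square root of the number of observations.
    # Rounding up is an assumption made for this sketch.
    k = max(1, math.ceil(math.sqrt(n)))

    lo, hi = min(data), max(data)
    # Bin width: range of the data divided by the number of bins.
    # If every value is identical, fall back to an arbitrary width of 1.
    width = (hi - lo) / k if hi > lo else 1.0

    edges = [lo + i * width for i in range(k + 1)]
    counts = [0] * k
    for x in data:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    return edges, counts


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
edges, counts = sqrt_rule_histogram(sample)
# Relative frequencies are the counts divided by the number of observations.
relative = [c / len(sample) for c in counts]
print(counts)  # [3, 3, 3]
```

With nine observations the rule gives three bins, so each rectangle in this small example would be drawn three units high, or one third high on a relative-frequency scale.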
Software packages may also be used to create a histogram. Modern statistics programs offer a variety of features that extend beyond the construction of the histogram itself. These programs can produce histograms in color, test the data for normality, overlay an estimate of the probability density function on the histogram, and calculate simple summary statistics. For professional work, software packages are often the best choice for creating a histogram because of the added sophistication in analysis and the enhanced presentation.
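As a rough illustration of the kind of analysis such software performs, the sketch below uses only Python's standard statistics module (Python 3.8 or later) to compute simple statistics and to compare the observed bin counts with the counts a fitted normal distribution would predict. The function name summarize_with_normal_overlay and the choice of bin edges are assumptions made for this example; a dedicated statistics package would offer formal normality tests and graphical output that this sketch does not attempt.

```python
import statistics


def summarize_with_normal_overlay(data, edges):
    """Return simple statistics plus observed and normal-expected counts per bin.

    `edges` is an increasing list of bin boundaries. The data need at least
    two distinct values so that the sample standard deviation is positive.
    """
    n = len(data)
    mean = statistics.fmean(data)
    stdev = statistics.stdev(data)  # sample standard deviation
    fitted = statistics.NormalDist(mean, stdev)

    observed, expected = [], []
    last = len(edges) - 2
    for i, (left, right) in enumerate(zip(edges, edges[1:])):
        if i == last:
            # The last bin is closed on the right so the maximum is counted.
            count = sum(1 for x in data if left <= x <= right)
        else:
            count = sum(1 for x in data if left <= x < right)
        observed.append(count)
        # Expected count under the fitted normal curve for this bin.
        expected.append(n * (fitted.cdf(right) - fitted.cdf(left)))

    return {"mean": mean, "stdev": stdev,
            "observed": observed, "expected": expected}


result = summarize_with_normal_overlay([1, 2, 2, 3, 3, 3, 4, 4, 5], [1, 2, 3, 4, 5])
for obs, exp in zip(result["observed"], result["expected"]):
    # A text stand-in for the bars and the overlaid curve.
    print("#" * obs, f"observed={obs} expected={exp:.1f}")
```

Large gaps between the observed and expected counts would suggest that the data are not well described by a normal curve, which is the informal version of what a normality test in a statistics package quantifies.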