What is Primary Data?

Mary McMahon

Primary data is original research data in its raw form, without any analysis or processing. This data provides a wealth of information for researchers. Depending on the nature of a study, the primary data may be provided along with reports and analysis so readers can look at it directly, or it may be kept confidential. Access to this data can be very valuable for people who want to learn more about study methodology, anomalies that occurred during studies, and other topics.

Brain images can provide primary data in many research studies.

This data can contain results from empirical testing, transcripts of interviews and surveys, and recorded observations. A person conducting a study on mice, for example, would have primary data like test results from blood and urine analysis, along with detailed observations of the mice on a day-to-day basis. The primary data could also include x-rays, brain imaging, and other diagnostic imaging, depending on the nature of the study.

Research data in its original, raw form is referred to as primary data.

People can distinguish primary data from other kinds of data by the fact that it is directly collected and presented without commentary. Secondary data consists of things like research papers based on the data. The major disadvantage of primary data is the sheer volume of information. People would need to read through pages and pages of information to extract usable data. In data processing, researchers use statistics and other tools to present the data in a more accessible format, turning raw results into meaningful statements like “20% of study participants reported feeling nauseous.”
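As a minimal sketch of that processing step, the snippet below turns a list of raw survey answers into a summary statement of the kind quoted above. The `raw_responses` list and the `percent_reporting` helper are hypothetical, invented for illustration; they are not taken from any real study.

```python
from collections import Counter

# Hypothetical primary data: one raw survey answer per participant.
raw_responses = [
    "nauseous", "fine", "fine", "fine", "fine",
    "fine", "nauseous", "fine", "fine", "fine",
]


def percent_reporting(responses, symptom):
    """Return the percentage of responses that equal `symptom`."""
    if not responses:
        raise ValueError("no responses recorded")
    counts = Counter(responses)
    return 100 * counts[symptom] / len(responses)


share = percent_reporting(raw_responses, "nauseous")
print(f"{share:.0f}% of study participants reported feeling nauseous.")
```

The raw list is the primary data; the printed sentence is the processed result a reader would see in a report.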

Primary data is information in its rawest form, before processing.

Primary data records may be digital or hard copy, depending on the nature of the study. Digitization is very common with many studies because it makes it easier to transmit and review the data. A digital copy is easier to work with during analysis and reduces the risk of analytical errors. As long as people enter data correctly the first time, it will be accurate in statistics programs and other tools people use to explore the raw data.

Primary data may include X-rays.

Data analysis can break down the data into useful components for people who may have an interest in the study. It will also discuss outliers and things in the data that did not make sense, such as a single person in a study who failed to respond to an otherwise effective treatment. In analysis, researchers have an opportunity to probe into the information to draw useful conclusions about the research. They can also offer theories and explanations about mysteries found in the data.
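One way an analyst might spot such an outlier is sketched below, assuming each participant has a single numeric improvement score. The `improvement` values, the `flag_outliers` helper, and the threshold of 3 are all illustrative assumptions; the rule used here (distance from the median, measured in median absolute deviations) is one common choice among several, not a method described in the article.

```python
from statistics import median

# Hypothetical raw measurements: symptom-score improvement per participant.
improvement = {"P01": 12, "P02": 11, "P03": 13, "P04": 0, "P05": 12, "P06": 10}


def flag_outliers(values, threshold=3.0):
    """Return IDs whose value lies more than `threshold` MADs from the median."""
    med = median(values.values())
    # Median absolute deviation: the typical distance from the median.
    mad = median(abs(v - med) for v in values.values())
    if mad == 0:
        return []  # all values (nearly) identical; nothing to flag
    return [pid for pid, v in values.items() if abs(v - med) / mad > threshold]


print(flag_outliers(improvement))
```

Here the participant who showed no improvement is flagged for a closer look, which is the point where the researchers' discussion and explanation would begin.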

Primary data may include test results from blood analysis.
Mary McMahon

Ever since she began contributing to the site several years ago, Mary has embraced the exciting challenge of being a wiseGEEK researcher and writer. Mary has a liberal arts degree from Goddard College and spends her free time reading, cooking, and exploring the great outdoors.


Discussion Comments


@hamje32 - Well, in general I agree; the academic stuff can be hard to decipher. However, I don’t think we should just throw in the towel and leave it up to the experts.

I think it only takes a little bit of study to be able to understand a peer-reviewed journal and its data on a particular topic. You don’t have to be an expert on everything; just brush up on that topic so you can read with comprehension.

Also, I do think that most reputable journals go to great lengths to make themselves readable even to the masses, so I don’t think the data or its findings are as complicated as you think. It just requires a bit of self-education.


@nony - You raise a good point about primary and secondary data. I think that for that reason alone, we generally leave it up to the experts to ferret out the data and provide the general public with a summary that the average reader can understand.

The summary is kind of like a dummies guide to the research findings and conclusions. Personally, I think that the summary is more than adequate. If I doubt the conclusions, I can also see if there is an expert somewhere who shares similar doubts and can present his views in an easily understood manner.

Unless we are trained scientists ourselves, I think we will to some extent be at the mercy of professionals who ferret out the information and present their findings to us.

Fortunately, there can be a wide spectrum of opinion about the research, so I think we can get a complete overview of the body of scientific opinion and draw our own conclusions.


@chivebasil - There is one problem when considering primary data vs. secondary data, however.

Primary data can be hard to understand, especially for the lay public. For example, a lot of primary research is published in peer-reviewed journals.

These publications target a very specialized market, and it would take a specialist to filter through the data collection models cited throughout the study and judge the merits of the conclusions.

I haven’t found peer-reviewed journals easy to understand, especially in the technical and medical specialties. I can go as far as understanding maybe the introduction and the gist of the conclusion, but all the meaty stuff in between that deals with mathematical formulas and such leaves me dazed.


@nextcorrea - I tend to agree; the only problem is that primary data collection can be very expensive and take a very long time. Scientists and labs are reluctant to make this data available to anyone because they have invested so much money into gathering it. They have a certain proprietary claim to it. In some cases this might slow down the goals of science, but in other cases it might create a competitive spirit that advances innovation.


For the sake of science and human progress, I think that most primary data should be made public. In a lot of cases this data can be reexamined and reworked to produce different secondary data. A lot of times this new secondary data can be as valuable as the original conclusions.

It's kind of like a novel. One person will read it and reach one conclusion. Another will read it and reach a drastically different conclusion. They are not necessarily competitive or mutually exclusive; they just draw on the same primary material. If there were this spirit of sharing in the sciences, I think we would see a lot more breakthroughs.


I think in almost all legitimate studies they will make the primary data available to any interested parties. To hold it back suggests something fishy about their methodology or their statistical analysis. The number one principle of science is repeatability. If you reach a result and you can't show others how it was reached, it can never be considered valid. There are so many statistics and surveys and studies that people reference these days; it's important to separate the good information from the bad.
