Aside from consulting the primary origin or source, data can also be collected through a third party, a process common with secondary data. It takes advantage of the data collected from previous research and uses it to carry out new research.
Secondary data is one of the two main types of data, the other being primary data. Both types are very useful in research and statistics, but in this article we restrict our scope to secondary data.
We will study secondary data, its examples, sources, and methods of analysis.
Secondary data is data that has already been collected through primary sources and made readily available for researchers to use in their own research.
A researcher may have collected the data for a particular project, then made it available to be used by another researcher. The data may also have been collected for general use with no specific research purpose like in the case of the national census.
Data classified as secondary for particular research may be said to be primary for another research. This is the case when data is being reused, making it primary data for the first research and secondary data for the second research it is being used for.
Sources of secondary data include books, personal sources, journals, newspapers, websites, government records, etc. Secondary data is known to be more readily available than primary data, and using these sources requires very little research and manpower.
With the advent of electronic media and the internet, secondary data sources have become more easily accessible. Some of these sources are highlighted below.
Books are one of the most traditional ways of collecting data. Today, there are books available on virtually every topic you can think of. When carrying out research, all you have to do is look for a book on the topic being researched, then select from the available repository of books in that area. Books, when carefully chosen, are an authentic source of data and can be useful in preparing a literature review.
There are a variety of published sources available for different research topics. The authenticity of the data generated from these sources depends largely on the writer and the publishing company.
Published sources may be printed or electronic as the case may be. They may be paid or free depending on the writer and publishing company’s decision.
Unpublished sources may not be as readily available or as easily accessible as published ones. They only become accessible if the researcher shares them with another researcher, who is usually not allowed to pass them on to a third party.
For example, the product management team of an organization may need data on customer feedback to assess what customers think about their product and improvement suggestions. They will need to collect the data from the customer service department, which primarily collected the data to improve customer service.
Journals are gradually becoming more important than books where data collection is concerned. This is because journals are updated regularly with new publications, and therefore provide up-to-date information.
Also, journals are usually more specific when it comes to research. For example, we can have a journal on, “Secondary data collection for quantitative data” while a book will simply be titled, “Secondary data collection”.
In most cases, the information passed through a newspaper is very reliable, making it one of the most authentic sources of secondary data.
The kind of data commonly shared in newspapers is usually more political, economic, and educational than scientific. Therefore, newspapers may not be the best source for scientific data collection.
The information shared on websites is mostly not regulated and as such may not be trusted compared to other sources. However, there are some regulated websites that only share authentic data and can be trusted by researchers.
Most of these websites belong to the government or to private organizations that are paid data collectors.
Blogs are one of the most common online sources for data and may even be less authentic than websites. These days, practically everyone owns a blog, and a lot of people use these blogs to drive traffic to their website or make money through paid ads.
Therefore, they cannot always be trusted. For example, a blogger may write good things about a product because he or she was paid to do so by the manufacturer even though these things are not true.
Diaries are personal records and as such are rarely used for data collection by researchers. They are usually private, although these days some people share public diaries covering specific events in their lives.
A well-known example is Anne Frank’s diary, which gives a firsthand account of life in hiding during the Nazi occupation of the Netherlands.
Government records are a very important and authentic source of secondary data. They contain information useful in marketing, management, humanities, and social science research.
Some of these records include census data, health records, education institute records, etc. They are usually collected to aid proper planning, allocation of funds, and prioritizing of projects.
Podcasts are becoming very common these days, and a lot of people listen to them as an alternative to radio. They are more or less like online radio stations and are steadily gaining popularity.
Information is usually shared during podcasts, and listeners can use it as a source of data collection.
Some other sources of data collection include:
Popular tools used to collect secondary data include bots, devices, libraries, etc. To ease the data collection process from the sources of secondary data highlighted above, researchers use these tools, which are explained below.
There is a lot of data online, and it may be difficult for researchers to browse through all of it and find what they are actually looking for. To ease this process, programmers have created bots that automatically scrape the web for relevant data.
These bots are “software robots” programmed to perform some task for the researcher. It is common for businesses to use bots to pull data from forums and social media for sentiment and competitive analysis.
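As a rough illustration of what such a bot does once it has downloaded a page, the sketch below pulls the paragraphs out of an HTML document and keeps only those that mention a keyword. It uses Python's standard `html.parser` module. The sample page, the keyword, and the names `ParagraphCollector` and `find_relevant_paragraphs` are made up for this example, and a real bot would also need to fetch pages and respect each site's terms of use.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects the text of every <p> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self._buffer = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags such as <b> is also delivered here.
        if self._in_paragraph:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            self.paragraphs.append("".join(self._buffer).strip())
            self._in_paragraph = False


def find_relevant_paragraphs(html, keywords):
    """Return paragraphs that mention at least one keyword (case-insensitive)."""
    collector = ParagraphCollector()
    collector.feed(html)
    lowered = [k.lower() for k in keywords]
    return [p for p in collector.paragraphs if any(k in p.lower() for k in lowered)]


# Hypothetical page content standing in for a downloaded forum thread.
SAMPLE_PAGE = """
<html><body>
<p>The new app update is great, checkout is much faster.</p>
<p>Unrelated post about the weather.</p>
<p>I find the app <b>confusing</b> to navigate.</p>
</body></html>
"""

matches = find_relevant_paragraphs(SAMPLE_PAGE, ["app"])
for paragraph in matches:
    print(paragraph)
```

Simple substring matching like this is crude (the keyword "app" would also match "happy"), so in practice the filtered text is usually passed on to a more careful analysis step.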
A device here could be a mobile phone, PC, or tablet that has access to an internet connection. Such devices are used to access journals, books, blogs, etc. to collect secondary data.
The library is a traditional secondary data collection tool for researchers. It contains relevant materials for virtually every research area you can think of, and it is accessible to everyone.
A researcher might decide to sit in the library for some time to collect secondary data or borrow the materials for some time and return when done collecting the required data.
Radio stations are one of the secondary sources of data collection, and one needs a radio to access them. Advances in technology have even made it possible to listen to the radio on mobile phones, making it unnecessary to own a radio.
Secondary data analysis is the process of analyzing data collected by another researcher who primarily collected it for another purpose. Researchers leverage secondary data to save the time and resources that would have been spent on primary data collection.
The secondary data analysis process can be carried out quantitatively or qualitatively depending on the kind of data the researcher is dealing with. The quantitative method is used on numerical data, which is analyzed mathematically, while the qualitative method uses words to provide in-depth information about the data.
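To make the distinction concrete, here is a minimal sketch using only Python's standard library. The ratings and comments are invented for illustration. Counting repeated words is only a very rough stand-in for qualitative analysis, which normally involves reading and coding the text by theme.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical secondary data: survey ratings (numerical) and comments (text).
ratings = [4, 5, 3, 4, 2]
comments = [
    "delivery was slow",
    "slow checkout but good support",
    "good prices",
]

# Quantitative: summarize the numbers mathematically.
average_rating = mean(ratings)
middle_rating = median(ratings)

# Qualitative (a rough first pass): count recurring words to spot themes.
word_counts = Counter(word for comment in comments for word in comment.split())
common_themes = [word for word, count in word_counts.items() if count > 1]

print(average_rating, middle_rating, common_themes)
```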
There are different stages of secondary data analysis, which involve events before, during, and after data collection. These stages include:
Before collecting secondary data for analysis, you need to know your statement of purpose. That is, a clear understanding of why you are collecting the data—the ultimate aim of the research work and how this data will help achieve it.
This will help direct your path towards collecting the right data, and choosing the best data source and method of analysis.
This is a written-down plan on how the research activities will be carried out. It describes the kind of data to be collected, the sources of data collection, method of data collection, tools, and even method of analysis.
A research design may also contain a timeline showing when each of these activities will be carried out, so that it serves as a guide for the secondary data analysis.
After identifying the purpose of the research, the researcher should design a research process that will guide the data analysis process.
It is not enough to just know the research purpose; you also need to develop research questions that will help in identifying suitable secondary data. This is because there is usually a pool of data to choose from, and asking the right questions will assist in collecting authentic data.
For example, a researcher trying to collect data about the best fish feeds for fast growth in fish will have to ask questions like: What kind of fish is being considered? Is the data meant to be quantitative or qualitative? What is the content of the fish feed? What is the growth rate of the fish after feeding on it?
After developing the research questions, researchers use them as a guide to identifying relevant data from the data repository. For example, if the kind of data to be collected is qualitative, a researcher can filter out qualitative data.
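The filtering step can be pictured as matching each candidate dataset against the answers to the research questions. The sketch below assumes a small, made-up catalog whose entries are described by fields such as `species` and `data_type`; the field names and the `select_datasets` helper are hypothetical.

```python
# Hypothetical catalog describing candidate secondary datasets.
catalog = [
    {"title": "Tilapia feed trial", "species": "tilapia", "data_type": "quantitative"},
    {"title": "Farmer interviews on feed", "species": "tilapia", "data_type": "qualitative"},
    {"title": "Catfish growth records", "species": "catfish", "data_type": "quantitative"},
]


def select_datasets(entries, **criteria):
    """Keep only the entries whose fields match every criterion."""
    return [
        entry
        for entry in entries
        if all(entry.get(field) == value for field, value in criteria.items())
    ]


# Research questions answered as: tilapia, quantitative data.
selected = select_datasets(catalog, species="tilapia", data_type="quantitative")
print([entry["title"] for entry in selected])
```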
The suitable secondary data will be the one that correctly answers the questions highlighted above. When looking for the solutions to a linear programming problem, for instance, the solutions will be numbers that satisfy both the objective and the constraints.
Any answer that doesn’t satisfy both is not a solution.
This stage is what many classify as the real data analysis stage because it is the point where analysis is actually performed. However, the stages highlighted above are a part of the data analysis process, because they influence how the analysis is performed.
Once a dataset that appears viable in addressing the initial requirements discussed above is located, the next step in the process is the evaluation of the dataset to ensure the appropriateness for the research topic. The data is evaluated to ensure that it really addresses the statement of the problem and answers the research questions.
It is then analyzed using either the quantitative or the qualitative method, depending on the type of data it is.
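One simple way to start the evaluation is to check how completely the dataset covers the fields the research questions depend on. The sketch below is only one possible check, with invented rows and a hypothetical `evaluate_dataset` helper; a full evaluation would also look at how, when, and by whom the data was collected.

```python
def evaluate_dataset(rows, required_fields):
    """Report, for each required field, the share of rows with a usable value."""
    total = len(rows)
    report = {}
    for field in required_fields:
        filled = sum(1 for row in rows if row.get(field) not in (None, ""))
        report[field] = filled / total if total else 0.0
    return report


# Hypothetical rows from a secondary dataset on fish feed.
rows = [
    {"feed": "A", "growth_g": 12.5},
    {"feed": "B", "growth_g": None},
    {"feed": "A", "growth_g": 10.0},
    {"feed": "", "growth_g": 11.0},
]

coverage = evaluate_dataset(rows, ["feed", "growth_g", "species"])
print(coverage)
```

A field with low or zero coverage (here, the species was never recorded) suggests the dataset may not be able to answer the corresponding research question.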
Most of the sources of secondary data are easily accessible to researchers. Most of these sources can be accessed online through a mobile device. People who do not have access to the internet can also access them through print.
They are usually available in libraries, book stores, and can even be borrowed from other people.
Secondary data mostly require little to no cost for people to acquire them. Many books, journals, and magazines can be downloaded for free online. Books can also be borrowed for free from public libraries by people who do not have access to the internet.
Researchers do not have to spend money on investigations, and very little is spent on acquiring books if any.
The time spent on collecting secondary data is usually very little compared to that of primary data. The only investigation necessary for secondary data collection is the process of finding suitable data sources.
This cuts the time that would normally be spent on the investigation and saves the researcher a significant amount of time.
Secondary data makes it easy to carry out longitudinal studies without having to wait for a couple of years to draw conclusions. For example, you may want to compare a country’s population in the census from 5 years ago with its population now.
Rather than waiting for 5 years, the comparison can easily be made by collecting the figures from the census 5 years ago and from the current one.
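The comparison itself is simple arithmetic once both figures are in hand. The population figures and years below are invented purely for illustration.

```python
def percent_change(old, new):
    """Percentage change from an earlier value to a later one."""
    return (new - old) * 100 / old


# Hypothetical census figures for two points in time.
census = {2015: 1_200_000, 2020: 1_260_000}

change = percent_change(census[2015], census[2020])
print(f"Population change: {change:.1f}%")
```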
When re-evaluating data, especially through another person’s lens or point of view, new things are uncovered. There might be a thing that wasn’t discovered in the past by the primary data collector, that secondary data collection may reveal.
For example, when customers complain to the customer service team about difficulty using an app, the team may decide to create a user guide teaching customers how to use it. However, when a product developer has access to this data, it may turn out that the issue comes from a UI/UX design that needs to be worked on.
The data collected through secondary sources may not be as authentic as when collected directly from the source. This is a very common disadvantage with online sources due to a lack of regulatory bodies to monitor the kind of content that is being shared.
Therefore, working with this kind of data may have negative effects on the research being carried out.
Researchers may spend a lot of time sifting through a pool of irrelevant data before finally getting what they need. This is because the data was not collected with that researcher in mind.
In some cases, a researcher may not even find the exact data he or she needs, and may have to settle for the next best alternative.
Some data sources are known to exaggerate the information that is being shared. This bias may be introduced to maintain a good public image or because of a paid advert.
This is very common with online blogs, some of which even go as far as sharing false information just to gain web traffic. For example, a FinTech startup may exaggerate the amount of money it has processed just to attract more customers.
A researcher gathering this data to investigate the total amount of money processed by FinTech startups in the US for the quarter may have to use this exaggerated data.
Some data sources are outdated, and there is no new data available to replace the old. For example, the national census is not usually updated yearly.
As a result, the country’s population will likely have changed since the last census. However, someone working with the country’s population will have to settle for the previously recorded figure even though it is outdated.
Secondary data has various uses in research, business, and statistics. Researchers choose secondary data for different reasons, including price, availability, or the needs of the research.
Although old, secondary data may be the only source of data in some cases. This may be due to the huge cost of performing research or due to its delegation to a particular body (e.g. national census).
In short, secondary data has shortcomings that may negatively affect the outcome of the research, but it also has some advantages over primary data. It all depends on the situation, the researcher in question, and the kind of research being carried out.