Thursday, January 10, 2013

Some Uncommon Sampling Methods Employed in Research


Inverse Sampling to Estimate Rare Events

A number of studies have used inverse sampling to estimate a rare population parameter. Its appeal is economy: the sample size required is smaller than in conventional sampling for the same precision. It can even be adopted for large-scale surveys at the national level.
In addition, when conventional sampling is used to detect rare events, there is a real chance of not finding a single event even after covering a large sample. In such situations, sampling techniques designed for rare events are more appropriate for estimation. Inverse sampling is one such technique: the number of rare events to be detected is fixed in advance, and sampling continues until that predetermined number of events has been observed in the population.

Under inverse sampling, the number of rare events is fixed in advance, say ‘m’ (for example, new cases of maternal death), and sampling continues until that many events have been observed. The required sample size ‘n’ is therefore a random variable. This is the reverse of conventional cluster sampling, where the sample size ‘n’ is fixed in advance, the rare events are counted once the sample of size ‘n’ is complete, and the proportion ‘P’ of the rare event is estimated as m/n, where m is the number of events found in the sample.
Under inverse sampling, if ‘n’ is the sample size (a random variable) at which the m-th rare event occurs, an unbiased estimate of P is given by p = (m-1)/(n-1). An unbiased estimator of the variance of p is given by (N-n+1)*p*(1-p)/[N*(n-2)], where N is the total population of the study area.
Hence, the estimated coefficient of variation (CV), expressed as a percentage, is given by √[(N-n+1)*(1-p)/{N*p*(n-2)}] * 100.
In practice, a target of m new cases of the rare event is fixed. The total sample size to be covered is unknown at this stage (it is a random variable), so sampling continues until m new cases have been found. It has been observed that inverse sampling can yield better precision with a very low sample size requirement.
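The estimator, its variance and the CV given above can be computed directly once sampling has stopped. The following is a minimal Python sketch of those three formulas; the function name and the input checks are illustrative assumptions, not part of any standard library.

```python
import math


def inverse_sampling_estimate(m, n, N):
    """Estimate a rare-event proportion from an inverse sample.

    m: predetermined number of rare events
    n: number of units covered when the m-th event was found
    N: total population of the study area
    Returns (p, variance of p, CV in percent).
    """
    # Assumed sanity checks: the formulas need n > 2 and a non-zero p.
    if m < 2 or n <= 2 or n < m or N < n:
        raise ValueError("need m >= 2, n > 2 and m <= n <= N")

    p = (m - 1) / (n - 1)                                # unbiased estimate of P
    variance = (N - n + 1) * p * (1 - p) / (N * (n - 2))  # variance estimator
    cv_percent = math.sqrt(variance) / p * 100           # coefficient of variation
    return p, variance, cv_percent


if __name__ == "__main__":
    # Hypothetical figures: the 10th event is found at the 1001st unit.
    p, variance, cv = inverse_sampling_estimate(m=10, n=1001, N=100000)
    print(p, variance, cv)
```

With these made-up figures the estimate is p = 9/1000. Note that sqrt(variance)/p simplifies algebraically to the CV expression in the text.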

Snowball Sampling
Some populations that we may be interested in studying are hard to frame. Examples include HIV-positive individuals and individuals or institutions involved in illegal activities such as theft, burglary, prostitution, use of banned drugs, or abortions following sex determination. Snowball sampling is a non-probability sampling technique that can be used to gain access to part of such a population, up to a certain number of units, and then to report findings about the group actually selected.
Drawing such a sample from a hard-to-frame population involves two steps: i) identify, or gain access to, one or more sample units in the desired population; and ii) use these individuals/units to find further similar individuals/units, and so on, until the required sample size is obtained.
Suppose the population we are interested in is the students of a university who take banned drugs. Each such student is a sample unit, and collectively all students of the university who use such drugs make up our population. We are interested only in examining a sample of these drug users who are students of the university.
First, we need to find one or more such students at the university concerned. Finding even a small number of individuals willing to identify themselves and take part in a research study on banned drug use may be quite difficult, so the aim is to start with just one or two students.
Given the sensitivity of the study, the researcher should ask the initial students who agree to take part to help identify other students who also use banned drugs. The process continues until enough students of the university have been identified to meet the desired sample size. Individuals who are not students of the university at that point in time need not be considered.
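The two-step procedure (start from a few seeds, then follow referrals until the sample size is met) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the referral map, the anonymised labels and the function name are hypothetical, and everyone who is referred is assumed to agree to take part.

```python
from collections import deque


def snowball_sample(referrals, seeds, target_size):
    """Collect units by following referrals, starting from the seeds.

    referrals: dict mapping each unit to the units it can refer (hypothetical data)
    seeds: the one or two initial units the researcher managed to find
    target_size: desired sample size
    """
    sampled = []
    seen = set()
    queue = deque()

    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    # Keep recruiting until the target is met or the referrals dry up.
    while queue and len(sampled) < target_size:
        unit = queue.popleft()
        sampled.append(unit)
        for contact in referrals.get(unit, []):
            if contact not in seen:   # never recruit the same unit twice
                seen.add(contact)
                queue.append(contact)
    return sampled


if __name__ == "__main__":
    # Hypothetical referral chains among anonymised students.
    referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["F"]}
    print(snowball_sample(referrals, seeds=["A"], target_size=4))
```

The sample that comes out depends entirely on who the seeds know, which is why no selection probabilities can be attached to it.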
Snowball sampling is a useful choice of sampling strategy when the population is hard-to-frame because:
• It is difficult to identify individuals/units to include in your sample, perhaps because there is no obvious list/frame of the population you are interested in.
• There may be no other way of accessing/getting your sample, making snowball sampling the only viable choice of sampling strategy.
• People are more reluctant to come forward and take part in a survey in such sensitive contexts. However, snowball sampling involves like individuals who know each other, and the common characteristics and social ties between them help to break down some of the barriers that would prevent them from taking part outside their association.
• The unknown nature of some groups may also make it difficult to identify the various parts of the population that warrant investigation. In the case of banned drug users, strata such as gender, type of banned drug used and frequency of use may be obvious. However, one needs to know at the start of the survey which characteristics of the population are to be examined, and these may not be known in their entirety. Snowball sampling may help uncover such unknown characteristics of interest before you settle on your sampling criteria.

Snowball sampling is not a very useful choice, even for a hard-to-frame population, if we need to determine the sampling error and generalize from the sample to the population, because snowball sampling does not select sample units randomly as probability sampling techniques do. As such, snowball samples should not be considered truly representative of the population being studied.
Quota Sampling

Quota sampling is a method for selecting survey respondents from a population. In quota sampling, the population is first segmented into two or more mutually exclusive sub-groups (strata). Judgment is then used to select the units from each segment according to a specified proportion. For example, a surveyor may be asked to sample x males and y females within certain age groups. In other words, those commissioning the survey can specify whom they want sampled.
This second step of selection is what makes the technique a non-probability sampling method. In quota sampling the selection of the sample is non-random and can therefore be unreliable for making inferences. Interviewers might be tempted to interview only those people who look most helpful, or may fall back on accidental sampling and question those closest to them to save time. The problem is that such samples may be biased because not everyone gets a chance of selection. This non-random element is a source of bias in the actual sample. Quota sampling is often confused with probability sampling and wrongly advocated as giving units/individuals some probability of selection in the sample.
Quota sampling is useful when time is limited, a sampling frame is not available, the research budget is very tight, or detailed accuracy is not important. Subsets are chosen, and then either convenience or judgment sampling is used to choose people from each subset. The researcher decides how many of each category are selected.
Quota sampling is clearly the non-probability version of stratified sampling. In stratified sampling, subsets of the population are formed so that each subset shares a common characteristic, such as gender or age. Random sampling then chooses a number of subjects from each subset, with each potential subject, unlike in a quota sample, having a known probability (normally an equal probability) of being selected. Fixing the quota or sample size of each stratum (or group) on the basis of the reliability of estimates one wishes to achieve is fine. But the second-stage selection should then be random, as otherwise we are assuming that everyone within a group is the same (homogeneous), and this cannot be true. One cannot use convenience sampling at the second stage of sampling.
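The difference lies only in how the second stage is carried out. A minimal Python sketch of the random second-stage selection, assuming a complete frame is available for every stratum (the frame contents and function name below are hypothetical), could look like this:

```python
import random


def stratified_random_sample(frame, quotas, seed=None):
    """Draw a simple random sample of the quota size from each stratum.

    frame: dict mapping stratum name to the full list of its units
    quotas: dict mapping stratum name to the number of units to select
    seed: optional seed so that the draw can be reproduced
    """
    rng = random.Random(seed)
    sample = {}
    for stratum, quota in quotas.items():
        units = frame[stratum]
        if quota > len(units):
            raise ValueError(f"quota for {stratum!r} exceeds the stratum size")
        # Every unit in the stratum has the same chance, quota / len(units),
        # of being selected; a quota sample gives no such guarantee.
        sample[stratum] = rng.sample(units, quota)
    return sample


if __name__ == "__main__":
    # Hypothetical frame: 50 male and 40 female respondents, labelled by number.
    frame = {
        "male": [f"M{i}" for i in range(1, 51)],
        "female": [f"F{i}" for i in range(1, 41)],
    }
    print(stratified_random_sample(frame, {"male": 5, "female": 4}, seed=1))
```

The quotas are fixed exactly as in quota sampling; only the choice of units within each stratum is handed over to a random draw, which requires the frame that quota sampling typically lacks.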
Consider an example. The Indian corporate sector is divided into two segments, namely Public Limited Companies and Private Limited Companies, and a quota of companies to be selected from each segment is fixed. However, a full sampling frame for the two segments is not available even with the Department of Company Affairs, Government of India, because a large number of companies do not file their annual reports with the Ministry of Corporate Affairs. It has often been noted that even top Indian companies do not bother to file their annual reports with the Ministry regularly. Many companies register with the Ministry at the time of their constitution, but quite a few Private Limited Companies close their operations and do not report this important event to the Ministry. In the absence of correct sampling frames, random selection cannot be done. The reliability of estimates generated using quota sampling is therefore open to question. Many such examples exist and are crippling the Indian Official System.

Accidental Sampling / Grab Sampling / Convenience Sampling / Opportunity Sampling
Suppose one is interested in knowing the incidence of pest infestation for a particular crop over a sizeable area. Getting into the interior of the crop is difficult, so the investigators may decide to move around the area by road and inspect only the roadside plots for pest infestation. Can they form a correct idea of the percentage of affected crop area? The answer should be ‘No’, because they have not inspected a random portion of the crop area. The sampling method adopted is “Accidental Sampling”. It is a type of non-probability sampling in which the sample is drawn from that part of the population which is close to hand. Such a sample is biased, because it consists of the part that was convenient to choose. A researcher using such a sample cannot scientifically generalize from it to the total population, as it would not be representative enough.
Accidental sampling can at best be adopted for pilot testing, or used to frame hypotheses that are then tested scientifically. If a questionnaire needs to be field tested, there may be little need to select a rigorous random sample from a population frame, so accidental sampling can be adopted for such work. Similarly, if one only wants a feel for public opinion on some happening in the community at large, well-framed questions can be posted even on a social networking site, and answers gathered from whoever chooses to respond. The caution is that the results should not be generalized to the entire community.
Accidental Sampling is also called Grab Sampling or Convenience Sampling or Opportunity Sampling.

Wednesday, January 9, 2013

MEASUREMENT OF TREND COMPONENT OF A TIME SERIES


A time series is a set of observations recorded at successive intervals of time. The simplest and most commonly employed methods to measure the Trend Component of a Time Series are:

1.      Graphic or Free hand Method:   The data are first plotted on graph paper, and the trend is then fitted as a straight line or a freehand curve by inspecting and following the graph of the series.  The curve needs to be smooth, with an almost equal number of points above and below it.  Judged by eye, the sum of the vertical deviations of the points above the trend line should approximately equal the sum of the vertical deviations of the points below it. In addition, the sum of the squares of the vertical deviations of the points from the trend line should be as small as possible.

2.      Moving Average Method:  This method consists of taking the arithmetic mean of the values over a certain time span and placing the value so calculated against the middle of that span.  The time span should equal the average period of fluctuation.  If the span is of period k, the values obtained by averaging k observations at a time are called Moving Averages of period or extent k.   If k is even, each moving average is still placed at the centre of its span, which then falls midway between two time points.
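The calculation can be sketched in Python as follows. The first function is the moving average of period k exactly as described above. The second function adds a step that is an assumption on top of the text: for even k it averages each pair of successive moving averages so that the result lines up with an actual time point, which is the usual centring practice. Function names are illustrative.

```python
def moving_averages(values, k):
    """Moving averages of period k: the mean of each run of k successive values."""
    if k < 1 or k > len(values):
        raise ValueError("k must be between 1 and the length of the series")
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


def centered_moving_averages(values, k):
    """Moving averages aligned with actual time points.

    For odd k the i-th average already sits against time index i + (k - 1) // 2.
    For even k it falls between two time points, so successive pairs are
    averaged (assumed centring step); the i-th result sits at index i + k // 2.
    """
    averages = moving_averages(values, k)
    if k % 2 == 1:
        return averages
    return [(a + b) / 2 for a, b in zip(averages, averages[1:])]


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]                 # hypothetical observations
    print(moving_averages(series, 3))           # 3-period moving averages
    print(centered_moving_averages(series, 4))  # centred 4-period moving averages
```

A series of length n yields n - k + 1 moving averages of period k, and one fewer after centring when k is even, so some trend values at both ends of the series are always lost.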