This is a compilation of questions from my master's program, Stat Trek, and another statistics website.
Define a p-value
In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. The null hypothesis is typically rejected if the p-value is less than a chosen significance level such as 0.05 or 0.01, corresponding respectively to a 5% or 1% chance of rejecting the null hypothesis when it is true (Type I error).
What is sampling? How many sampling methods?
What's the difference between the Bayesian and frequentist views of statistics?
Bayesians condition on the data actually observed and consider the probability distribution on the hypotheses;
Frequentists condition on a hypothesis of choice and consider the probability distribution on the data, whether observed or not.
What is the Central Limit Theorem?
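As the sample size increases, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the population distribution (provided the population has a finite variance).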
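A small simulation can make this concrete. The sketch below is only an illustration: the exponential population, the sample sizes, the number of trials, and the helper names `sample_means` and `skewness` are all assumptions chosen for the example. It draws repeated samples from a strongly skewed population and shows the skewness of the sample means shrinking toward 0 (the value for a normal distribution) as n grows.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, trials=2000):
    """Means of `trials` samples of size n from an exponential(1) population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]

def skewness(xs):
    """Standardized third moment; 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

for n in (1, 5, 30, 200):
    means = sample_means(n)
    # Columns: sample size, mean of the means, spread of the means, skewness
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          round(skewness(means), 3))
```

The mean of the sample means should stay near the population mean of 1, while their spread shrinks roughly like 1/sqrt(n).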
Are expected value and mean different?
They are not different, but the terms are used in different contexts. "Mean" is generally used when talking about a probability distribution or a sample, whereas "expected value" is generally used in the context of a random variable.
-- For Sampling Data
The sample mean is the value computed from the sampled data.
The expected value is the mean of all the sample means, i.e. the value built up from many samples. The expected value of the sample mean is the population mean.
-- For Distributions
The mean and the expected value are the same regardless of the distribution, as long as both refer to the same population.
What is the Normal Distribution?
Data can be distributed in many ways: skewed to the left, skewed to the right, or spread with no clear pattern. Often, however, data cluster around a central value with no bias to the left or right, and the distribution takes the form of a symmetric bell-shaped curve. This is the normal distribution, which is fully described by its mean and standard deviation.
What does the p-value signify about statistical data?
The p-value is used to judge the significance of the results of a hypothesis test. It helps readers draw conclusions and always lies between 0 and 1. Using the conventional 0.05 significance level:
- P-value > 0.05 denotes weak evidence against the null hypothesis, which means the null hypothesis cannot be rejected.
- P-value < 0.05 denotes strong evidence against the null hypothesis, which means the null hypothesis can be rejected.
- P-value = 0.05 is the marginal value, indicating it is possible to go either way.
How can you make data normal with a Box-Cox transformation?
Answer:
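One possible answer, sketched under assumptions: the Box-Cox transformation maps strictly positive data x to (x^lambda - 1) / lambda, or to log(x) when lambda = 0, and lambda is commonly chosen to maximize the normal log-likelihood of the transformed data. The function names `box_cox`, `log_likelihood`, and `best_lambda`, the grid of candidate lambdas, and the toy data below are all made up for illustration.

```python
import math

def box_cox(xs, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(x <= 0 for x in xs):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1.0) / lam for x in xs]

def log_likelihood(xs, lam):
    """Profile log-likelihood of lambda under a normal model (constants dropped)."""
    ys = box_cox(xs, lam)
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in xs)

def best_lambda(xs, grid=None):
    """Pick the lambda on a coarse grid that maximizes the log-likelihood."""
    grid = grid or [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: log_likelihood(xs, lam))

# Toy right-skewed data: exponentials of values that are symmetric around 0,
# so a log transform (lambda near 0) should make them symmetric again.
data = [math.exp(v) for v in (-2, -1, -0.5, 0, 0.5, 1, 2)]
lam = best_lambda(data)
transformed = box_cox(data, lam)
```

In practice a library routine would usually do this search; the grid search here only shows the idea.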
A car has a 0.9 probability of passing a store within any 30-minute interval. What is the probability of the car passing within a 15-minute interval?
Conversely: A car has a 0.4 probability of passing a bench within a 15-minute interval. What's the probability of a car passing within a 45-minute interval?
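A sketch of one way to work both versions, assuming that "passing" means at least one car passes and that equal-length sub-intervals behave independently and identically (neither assumption is stated in the question). Under that assumption the probability of no car in the long interval is the product of the probabilities of no car in each sub-interval. The function names are hypothetical.

```python
def prob_in_subinterval(p_total, k):
    """P(at least one pass) in one of k equal sub-intervals,
    given P(at least one pass) = p_total over the whole interval."""
    return 1.0 - (1.0 - p_total) ** (1.0 / k)

def prob_in_combined(p_sub, k):
    """P(at least one pass) over k consecutive sub-intervals,
    given P(at least one pass) = p_sub in each one."""
    return 1.0 - (1.0 - p_sub) ** k

p15 = prob_in_subinterval(0.9, 2)  # 30 minutes split into two 15-minute halves
p45 = prob_in_combined(0.4, 3)     # three 15-minute intervals make 45 minutes
print(round(p15, 4), round(p45, 4))
```

So 1 - sqrt(0.1) is about 0.68 for the first question, and 1 - 0.6^3 = 0.784 for the second.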
Three friends in Seattle told you it's rainy. Each has a probability of 1/3 of lying. What's the probability that Seattle is rainy?
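The answer depends on a prior probability of rain that the question does not give. The sketch below applies Bayes' rule, assuming the three friends answer independently and that all three said "rainy"; the function name `posterior_rain` and the example priors are illustrative only.

```python
from fractions import Fraction

def posterior_rain(prior, n_friends=3, p_lie=Fraction(1, 3)):
    """P(rain | all friends say 'rainy') for a given prior P(rain)."""
    p_truth = 1 - p_lie
    like_rain = p_truth ** n_friends  # it rains and every friend tells the truth
    like_dry = p_lie ** n_friends     # it is dry and every friend lies
    return like_rain * prior / (like_rain * prior + like_dry * (1 - prior))

print(posterior_rain(Fraction(1, 2)))  # assumed 50/50 prior
print(posterior_rain(Fraction(1, 4)))  # assumed 25% prior
```

With a 50/50 prior the posterior is 8/9; a different prior gives a different answer, which is usually the point of the question.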
What are the fundamentals of naive Bayes? How do you set the threshold?
Given n samples from a uniform distribution on [0, d], how do you estimate d?
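Three standard estimators are sketched below: the maximum likelihood estimate (the sample maximum, which is biased low), its bias-corrected version (n + 1) / n times the maximum, and the method-of-moments estimate (twice the sample mean). The function name `estimate_d` and the simulated data with true d = 10 are assumptions for the example.

```python
import random

def estimate_d(samples):
    """Return three common estimates of d for Uniform[0, d] data."""
    n = len(samples)
    largest = max(samples)
    return {
        "mle": largest,                      # maximum likelihood, biased low
        "unbiased": (n + 1) / n * largest,   # bias-corrected maximum
        "moments": 2 * sum(samples) / n,     # since E[X] = d / 2
    }

rng = random.Random(0)
data = [rng.uniform(0, 10) for _ in range(50)]  # true d = 10, assumed for the demo
print(estimate_d(data))
```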
Flip a coin three times. Let X be the number of heads observed. Write down a probability mass function p(x) that characterizes the way that X allocates probability to the integers 0, 1, 2, and 3.
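Assuming a fair coin and independent flips (the question does not say so explicitly), X follows a binomial distribution with n = 3 and p = 1/2, so p(x) = C(3, x) / 8. A short sketch with exact fractions, using a hypothetical helper `binomial_pmf`:

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Map each k in 0..n to P(X = k) for X ~ Binomial(n, p)."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

pmf = binomial_pmf(3, Fraction(1, 2))
for k, prob in pmf.items():
    print(k, prob)
```

This gives p(0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8, which sum to 1.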
A graduate class consists of six students. What is the probability that exactly three of them were born in either April or October?
Suppose that 2.5% of the population of a border town are illegal immigrants. Find the probability that in a theater of this town with 80 random viewers, there are at least two illegal immigrants.
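Both the birth-month question and the theater question can be treated as binomial probabilities. The sketch below assumes independence between people and, for the birth months, that all 12 months are equally likely (so p = 2/12); using day counts instead would give p = 61/365 and a slightly different answer. The helper name `binom_pmf` is made up for the example.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Six students, exactly three born in April or October.
# Assumption: each month is equally likely, so p = 2/12.
p_three_of_six = binom_pmf(3, 6, 2 / 12)

# 80 viewers, each counted with probability 0.025, independently.
# "At least two" is the complement of "zero or one".
p_at_least_two = 1 - binom_pmf(0, 80, 0.025) - binom_pmf(1, 80, 0.025)

print(round(p_three_of_six, 4), round(p_at_least_two, 4))
```

The first comes out to 625/11664, about 0.054, and the second to about 0.597.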
The probability is p that a randomly chosen light bulb is defective. We screw a bulb into a lamp and switch on the current. If the bulb works, we stop; otherwise, we try another and continue until a good bulb is found. What is the probability that at least n bulbs are required?
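At least n bulbs are required exactly when the first n - 1 bulbs are all defective, so, assuming bulbs are independent, the probability is p^(n-1). The sketch below states that formula and checks it against a simulation; the function names, the choice p = 0.3, and the trial count are illustrative assumptions.

```python
import random

def prob_at_least(n, p):
    """P(at least n bulbs needed) = P(first n - 1 bulbs are all defective)."""
    return p ** (n - 1)

def simulate(n, p, trials=100_000, seed=0):
    """Estimate the same probability by repeating the bulb experiment."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        bulbs = 1
        while rng.random() < p:  # current bulb is defective, try another
            bulbs += 1
        if bulbs >= n:
            hits += 1
    return hits / trials

print(prob_at_least(3, 0.3), simulate(3, 0.3))
```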
Consider a coin that is flipped 34 times and comes up heads 15 times. You do not know the true proportion of flips on which this coin comes up heads. Build a 95% confidence interval for the true proportion of times the coin will come up heads.
Using the confidence interval that you constructed in the previous problem, would you reject H_0: p = 0.75 in favor of its two-sided alternative at the 5% level of significance? Why or why not?
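A sketch of both coin questions using the normal-approximation (Wald) interval, p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n). The Wald interval is one common choice, not the only one (Wilson or exact intervals would give slightly different endpoints), and the function name `wald_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = wald_interval(15, 34)
# A two-sided 5% test rejects H_0 when the hypothesized value
# falls outside the 95% interval.
reject_h0 = not (low <= 0.75 <= high)
print(round(low, 3), round(high, 3), reject_h0)
```

The interval is roughly (0.274, 0.608). Since 0.75 lies outside it, H_0 would be rejected at the 5% level.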
For the scores on an achievement test given to a certain population of students, the expected value is 500 and the standard deviation is 100. Let X_bar be the mean of the scores of a random sample of 35 students from the population. What does the central limit theorem say the approximate distribution of X_bar is? Then compute the probability that X_bar is between 460 and 540. (Hint: if Z=(X-u))
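By the central limit theorem, X_bar is approximately normal with mean 500 and standard deviation 100 / sqrt(35), about 16.9. A minimal sketch of the calculation with the standard library's `NormalDist` (the variable names are chosen only for this example):

```python
import math
from statistics import NormalDist

mu, sigma, n = 500, 100, 35
standard_error = sigma / math.sqrt(n)    # standard deviation of X_bar
x_bar = NormalDist(mu, standard_error)   # approximate distribution of X_bar

prob = x_bar.cdf(540) - x_bar.cdf(460)
z = (540 - mu) / standard_error          # standardized upper limit
print(round(standard_error, 2), round(z, 2), round(prob, 3))
```

The limits sit about 2.37 standard errors either side of the mean, giving a probability of roughly 0.98.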