Tags

A tag is a keyword or label that categorizes your question with other, similar questions. Using the right tags makes it easier for others to find and answer your question.

Techniques for analyzing the relationship between one (or more) "dependent" variables and "independent" variables.
Use this tag for any *on-topic* question that (a) involves `R` either as a critical part of the question or expected answer, and (b) is not *just* about how to use `R`.
Machine learning algorithms build a model of the training data. The term "machine learning" is vaguely defined; it includes what is also called statistical learning, reinforcement learning, unsupervis…
Time series are data observed over time (either in continuous time or at discrete time periods).
A probability provides a quantitative description of the likely occurrence of a particular event.
Hypothesis testing assesses whether data are inconsistent with a given hypothesis rather than being an effect of random fluctuations.
A distribution is a mathematical description of probabilities or frequencies.
A routine exercise designed to test one's knowledge; often from a textbook, course, or test used for a class or self-study. This community's policy is to "provide helpful hints" for such questions rat…
Artificial neural networks (ANNs) are a broad class of computational models loosely based on biological neural networks. They encompass feedforward NNs (including "deep" NNs), convolutional NNs, recu…
Bayesian inference is a method of statistical inference that relies on treating the model parameters as random variables and applying Bayes' theorem to deduce subjective probability statements about t…
Refers generally to statistical procedures that utilize the logistic function, most commonly various forms of logistic regression.
Mathematical theory of statistics, concerned with formal definitions and general results.
Statistical classification is the problem of identifying the sub-population to which new observations belong, where the identity of the sub-population is unknown, on the basis of a training set of dat…
Mixed (aka multilevel or hierarchical) models are linear models that include both fixed effects and random effects. They are used to model longitudinal or nested data.
Statistical significance is a characteristic of a statistic viewed in light of a null hypothesis and a given significance level. It reflects whether the statistic belongs to the rejection region (is s…
A measure of the degree of association between a pair of variables.
The normal, or Gaussian, distribution has a density function that is a symmetrical bell-shaped curve. It is one of the most important distributions in statistics. Use the [normality] tag for asking ab…
Regression that includes two or more non-constant independent variables.
ANOVA stands for ANalysis Of VAriance, a statistical model and set of procedures for comparing multiple group means. The independent variables in an ANOVA model are categorical, but an ANOVA table can…
Python is a programming language commonly used for machine learning. Use this tag for any *on-topic* question that (a) involves `Python` either as a critical part of the question or expected answer, &…
A generalization of linear regression allowing for nonlinear relationships via a "link function" and for the variance of the response to depend on the predicted value. (Not to be confused with "genera…
A confidence interval is an interval that covers an unknown parameter with $100(1-\alpha)\%$ confidence. Confidence intervals are a frequentist concept. They are often confused with credible intervals…
The expected squared deviation of a random variable from its mean; or, the average squared deviation of data about their mean.
Cluster analysis is the task of partitioning data into subsets of objects according to their mutual "similarity," without using preexisting knowledge such as class labels. [Clustered-standard-errors a…
Prediction of future events. It is a special case of [prediction], in the context of [time-series].
A test for comparing the means of two samples, or the mean of one sample (or even parameter estimates) with a specified value; also known as the "Student t-test" after the pseudonym of its inventor.
Categorical (also called nominal) data can take on a limited number of possible values called categories. Categorical values "label"; they do not "measure". Please use the [ordinal-data] tag for discrete …
lme4 and nlme are R packages used for fitting linear, generalized linear, and nonlinear mixed effects models. For general questions about mixed models, use the [mixed-model] tag.
Repeatedly withholding subsets of the data during model fitting in order to quantify the model performance on the withheld data subsets.
Principal component analysis (PCA) is a linear dimensionality reduction technique. It reduces a multivariate dataset to a smaller set of constructed variables preserving as much information (as much v…
A method of estimating the parameters of a statistical model by choosing the parameter value that maximizes the probability of observing the given sample.
Survival analysis models time to event data, typically time to death or failure time. Censored data are a common problem for survival analyses.
Creating samples from a well-specified population using a probabilistic method and/or producing random numbers from a specified distribution. As this tag is ambiguous, please consider [survey-sampling…
This tag is too general; please provide a more specific tag. For questions about the properties of specific estimators, use the [estimators] tag instead.
Predictive models are statistical models whose primary purpose is to predict other observations of a system optimally, as opposed to models whose purpose is to test a particular hypothesis or explain …
Constructing and interpreting meaningful and useful graphical representations of data. (If your question is only about how to get particular software to produce a specific effect, then it is likely no…