## Data Collection, Management and Basic Statistical Concepts

Dr. Bruce L. Pihlstrom, DDS, MS

Learn about the basic principles of data collection, management and some basic statistical concepts in clinical research.

**1. Describe and understand what data should be collected.**

- Data collected depend on the question being asked and the hypothesis of the study
- Demographic data
- Independent and dependent variable data
- Possible confounders

**2. Describe and understand standardization of data collection.**

- Data forms
- Data code book
- Methods of data collection
- Testing data collection
- Training and calibration of personnel who collect data
- Quality control of data collection
- Data storage and transmission

**3. Describe and understand some basic statistical concepts in clinical research.**

- Convenience and probability/random sampling
- Statistical Power
- Type I error
- Type II error
- Sample size
- Random allocation of interventions in clinical trials
- Common randomization methods
- Clinical versus statistical significance
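
One common randomization method, permuted-block randomization, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `block_randomize`, the arm labels, the block size of 4, and the seed are all hypothetical choices, not details taken from the course material.

```python
import random

def block_randomize(n_participants, arms=("A", "B"), block_size=4, seed=2024):
    """Return an allocation list built from randomly permuted blocks.

    All names and defaults here are illustrative assumptions.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed so the schedule can be reproduced
    per_arm = block_size // len(arms)
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * per_arm  # equal numbers of each arm in a block
        rng.shuffle(block)            # random order within the block
        allocation.extend(block)
    return allocation[:n_participants]

schedule = block_randomize(12)
print(schedule)
```

Because every block contains each arm equally often, the group sizes stay balanced throughout enrollment, which simple coin-flip randomization does not guarantee in small trials.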

## Additional PDF Resources

## Glossary – Definitions of Key Terms

- Clinical Significance
- Change in a subject’s clinical condition regarded as important whether or not due to the test intervention. NOTE: Some statistically significant changes (in blood tests, for example) have no clinical significance. The criterion or criteria for clinical significance should be stated in the protocol. The term “clinical significance” is not advisable unless operationally defined.

- Convenience Sampling
- A sample of elements that are selected because it is convenient to use them, not because they are representative of the target population.
- http://www.ccsg.isr.umich.edu/index.php/resources/advanced-glossary/convenience-sample

- Data Management
- The process of collection, cleaning, and management of subject data in compliance with regulatory standards. The primary objective of data management processes is to provide high-quality data by keeping the number of errors and missing data as low as possible and gathering maximum data for analysis.
- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3326906/

- P-Value
- In confirmatory (evidential) studies, null hypotheses are formulated, which are then rejected or retained (not rejected) with the help of statistical tests. The p-value is a probability, which is the result of such a statistical test. This probability reflects the measure of evidence against the null hypothesis. Small p-values correspond to strong evidence. If the p-value is below a predefined limit, the results are designated as “statistically significant”.
- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2689604/

- Probability Sampling
- A sampling method where each element on the sampling frame has a known, non-zero chance of selection.
- http://www.ccsg.isr.umich.edu/index.php/resources/advanced-glossary/probability-sampling
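
The contrast between probability and convenience sampling can be shown with a small sketch. The sampling frame of 200 patient IDs, the sample size of 20, and the seed are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers for 200 clinic patients.
sampling_frame = list(range(1, 201))
sample_size = 20

# Probability sampling: simple random sample without replacement.
rng = random.Random(42)  # seed fixed only to make the example reproducible
random_sample = rng.sample(sampling_frame, sample_size)

# Each patient's chance of selection is known and non-zero: n / N.
selection_probability = sample_size / len(sampling_frame)

# Convenience sampling, for contrast: take whoever is first on the list.
convenience_sample = sampling_frame[:sample_size]

print(f"Selection probability per patient: {selection_probability:.2f}")
print("Random sample:", sorted(random_sample))
print("Convenience sample:", convenience_sample)
```

In the convenience sample, patients further down the list have no chance of selection, so the result may not represent the target population.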

- Standard Deviation
- A parameter (if referring to the population) or statistic (if referring to the sample) used as a measure of the dispersion or variation in a distribution, equal to the square root of the arithmetic mean of the squares of the deviations from the arithmetic mean.
- http://www.dictionary.com/browse/standard-deviation
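
The definition above divides the sum of squared deviations by the number of values, which is the population form. When the values are a sample, the sum is usually divided by n − 1 instead. The sketch below computes both by hand and checks them against Python's standard `statistics` module; the numbers are made up for illustration.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative measurements, not study data

mean = statistics.mean(values)
squared_deviations = [(x - mean) ** 2 for x in values]

# Population form: divide by n.
population_sd = math.sqrt(sum(squared_deviations) / len(values))
# Sample form: divide by n - 1.
sample_sd = math.sqrt(sum(squared_deviations) / (len(values) - 1))

print(population_sd, statistics.pstdev(values))  # the two should agree
print(sample_sd, statistics.stdev(values))       # the two should agree
```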

- Statistical Power
- The likelihood that a study will detect an effect when there is an effect there to be detected. If statistical power is high, the probability of making a Type II error, or concluding there is no effect when, in fact, there is one, goes down. Statistical power is affected chiefly by the effect size and the size of the sample used to detect it. Bigger effects are easier to detect than smaller effects, while large samples offer greater test sensitivity than small samples.
- https://effectsizefaq.com/2010/05/31/what-is-statistical-power/
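
The link between effect size, sample size, and power can be made concrete with a rough calculation. The sketch below assumes a two-sided comparison of two group means, equal group sizes, and a normal approximation; the function name `n_per_group` and the default alpha of 0.05 and power of 0.80 are illustrative assumptions, and a real study would normally rely on dedicated software and a statistician.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size is the difference in means divided by the standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Smaller effects require many more participants to reach the same power, which is the pattern the definition describes.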

- Statistical Significance
- Describes a mathematical measure of difference between groups. The difference is said to be statistically significant if it is greater than what might be expected to happen by chance alone.
- http://www.cancer.gov/publications/dictionaries/cancer-terms?cdrid=44167

- Type I Error
- A type I (also known as ‘α’) error occurs if an investigator rejects a null hypothesis that is actually true in the population (false-positive).
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2996198/

- Type II Error
- A type II (also known as ‘β’) error occurs if the investigator fails to reject a null hypothesis that is actually false in the population (false-negative).
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2996198/
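
Type I and Type II error rates can be estimated by simulating many hypothetical two-group trials. The sketch below is a simplified illustration: it assumes normally distributed outcomes with a known standard deviation of 1, uses a two-sided z-test, and the function name `rejection_rate`, the group size of 63, the 2000 simulated trials, and the seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_difference, n_per_group=63, alpha=0.05,
                   n_trials=2000, seed=1):
    """Fraction of simulated trials in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(2 / n_per_group)  # SD is assumed known and equal to 1
    rejections = 0
    for _ in range(n_trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / n_trials

# No true effect: every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_difference=0.0)

# True effect of 0.5 SD: every failure to reject is a false negative (Type II error).
power = rejection_rate(true_difference=0.5)
type_ii_rate = 1 - power

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated power:              {power:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

Under these assumptions, the estimated Type I error rate should land near the chosen alpha of 0.05, and the Type II error rate near 0.20, although simulated values vary with the seed and number of trials.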