Data analysis is a process of inspecting, cleaning, transforming, and modelling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.

Data mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes. Business intelligence covers data analysis that relies heavily on aggregation, focusing on business information. In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data and CDA on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical or structural models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a species of unstructured data. All are varieties of data analysis.

Data integration is a precursor to data analysis, and data analysis is closely linked to data visualization and data dissemination. The term data analysis is sometimes used as a synonym for data modeling, which is unrelated to the subject of this article.

Nuclear and particle physics

In nuclear and particle physics the data usually originate from the experimental apparatus via a data acquisition system. They are then processed, in a step usually called data reduction, to apply calibrations and to extract physically significant information. Data reduction is most often, especially in large particle physics experiments, an automatic, batch-mode operation carried out by software written ad hoc.
The resulting data n-tuples are then scrutinized by the physicists, using specialized software tools like ROOT or PAW, comparing the results of the experiment with theory. The theoretical models are often difficult to compare directly with the results of the experiments, so they are used instead as input for Monte Carlo simulation software like Geant4, which predicts the response of the detector to a given theoretical event, producing simulated events which are then compared to experimental data. See also: Computational physics.

Qualitative data analysis

Qualitative research uses qualitative data analysis (QDA) to analyze text, interview transcripts, photographs, art, field notes of (ethnographic) observations, et cetera.

The process of data analysis

Data analysis is a process, within which several phases can be distinguished: data cleaning, initial data analysis, main data analysis, and final data analysis.[1]

Data cleaning

Data cleaning is an important procedure during which the data are inspected, and erroneous data are corrected where this is necessary, preferable, and possible. Data cleaning can be done during the stage of data entry. If this is done, it is important that no subjective decisions are made. The guiding principle provided by Adèr is: during subsequent manipulations of the data, information should always be cumulatively retrievable. In other words, it should always be possible to undo any data set alterations. Therefore, it is important not to throw information away at any stage in the data cleaning phase. All information should be saved (i.e., when altering variables, both the original values and the new values should be kept, either in a duplicate dataset or under a different variable name), and all alterations to the data set should be carefully and clearly documented, for instance in a syntax or a log.[2]

Initial data analysis

The most important distinction between the initial data analysis phase and the main analysis phase is that during initial data analysis one refrains from any analyses that are aimed at answering the original research question. The initial data analysis phase is guided by the following four questions:[3]

1) What is the quality of the data?

The quality of the data should be checked as early as possible. Data quality can be assessed in several ways, using different types of analyses: frequency counts, descriptive statistics (mean, standard deviation, median), normality (skewness, kurtosis, frequency histograms, normal probability plots), and associations (correlations, scatter plots).
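
As a rough illustration of the numeric checks just listed, the following Python sketch computes frequency counts, descriptive statistics, moment-based skewness and kurtosis, and a Pearson correlation using only the standard library. The function names `describe` and `pearson_r` and all example values are hypothetical, invented here for illustration; the skewness and kurtosis use the simple population-moment definitions, which is one of several conventions.

```python
import math
import statistics
from collections import Counter


def describe(values):
    """Descriptive statistics and moment-based shape measures for a numeric variable."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments (divided by n), used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "n": n,
        "mean": mean,
        "sd": statistics.stdev(values),       # sample standard deviation (n - 1)
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,           # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }


def pearson_r(xs, ys):
    """Pearson correlation between two equally long numeric variables."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical example data, invented for illustration only.
ages = [23, 25, 31, 35, 41, 52]
incomes = [1800, 2100, 2600, 2900, 3300, 4100]
sex = ["f", "m", "f", "f", "m", "f"]

print(Counter(sex))              # frequency counts for a categorical variable
print(describe(ages))            # descriptive statistics and shape measures
print(pearson_r(ages, incomes))  # association between two variables
```

The sketch assumes each variable has at least two distinct values; a constant variable would make the shape measures and the correlation undefined. Histograms, normal probability plots, and scatter plots would need a plotting library and are left out.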
The choice of analyses to assess the data quality during the initial data analysis phase depends on the analyses that will be conducted in the main analysis phase.[4]

2) What is the quality of the measurements?

The quality of the measurement instruments should only be checked during the initial data analysis phase when this is not the focus or research question of the study. One should check whether the structure of the measurement instruments corresponds to the structure reported in the literature.

Initial transformations

After assessing the quality of the data and of the measurements, one might decide to impute missing data, or to perform initial transformations of one or more variables, although this can also be done during the main analysis phase.[6]
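
A minimal Python sketch of both steps follows. Mean imputation and a natural-log transformation are chosen here only because they are simple to show; the text does not prescribe any particular method, and the function names, variable names, and values are hypothetical.

```python
import math
import statistics


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


def log_transform(values):
    """Natural-log transform; only defined for strictly positive values."""
    return [math.log(v) for v in values]


# Hypothetical reaction times in seconds; None marks a missing measurement.
reaction_time = [0.8, 1.2, None, 0.9, 4.5]

# New variables are created; the original list is left untouched.
reaction_time_imputed = impute_mean(reaction_time)
reaction_time_log = log_transform(reaction_time_imputed)

print(reaction_time_imputed)
print(reaction_time_log)
```

Storing the results under new variable names keeps the original values retrievable, in line with the data cleaning principle described earlier.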

3) Did the implementation of the study fulfill the intentions of the research design?

One should check the success of the randomization procedure, for instance by checking whether background and substantive variables are equally distributed within and across groups.
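
One possible way to check balance across two groups is the standardized mean difference of a background variable, sketched below in Python. This statistic is an assumption chosen for illustration, not a method named in the text, and the group data and function name are hypothetical.

```python
import math
import statistics


def standardized_mean_difference(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    var_a = statistics.variance(group_a)   # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd


# Hypothetical ages (a background variable) in two randomized groups.
treatment_age = [34, 29, 41, 38, 30, 36]
control_age = [33, 31, 40, 37, 28, 35]

smd = standardized_mean_difference(treatment_age, control_age)
print(f"standardized mean difference in age: {smd:.2f}")
```

Values close to zero suggest that the variable is similarly distributed in both groups. What counts as close enough is a matter of convention and depends on the study.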

4) What are the characteristics of the data sample?

In any report or article, the structure of the sample must be accurately described. It is especially important to exactly determine the structure of the sample (and specifically the size of the subgroups) when subgroup analyses will be performed during the main analysis phase.
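
Subgroup sizes can be tabulated with a few lines of Python, as in this sketch. The records, field names, and the helper `subgroup_sizes` are hypothetical and serve only to show the idea.

```python
from collections import Counter

# Hypothetical participant records; the field names are invented for illustration.
participants = [
    {"group": "treatment", "sex": "f"},
    {"group": "treatment", "sex": "m"},
    {"group": "treatment", "sex": "f"},
    {"group": "control", "sex": "f"},
    {"group": "control", "sex": "m"},
    {"group": "control", "sex": "m"},
]


def subgroup_sizes(records, *keys):
    """Count records per combination of the given grouping variables."""
    return Counter(tuple(record[key] for key in keys) for record in records)


sizes = subgroup_sizes(participants, "group", "sex")
for subgroup, n in sorted(sizes.items()):
    print(subgroup, n)
```

Tabulating the combinations before the main analysis makes it visible early when a planned subgroup is too small to analyze.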

Final stage of the initial data analysis

During the final stage, the findings of the initial data analysis are documented, and necessary, preferable, and possible corrective actions are taken.
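
Corrective actions are easier to audit when they follow the retrievability principle from the data cleaning phase: keep the original value, store the corrected value under a different variable name, and log every alteration. The Python sketch below shows one way this might look; the `_clean` suffix, the log format, and the example record are assumptions made for illustration.

```python
def correct_value(records, log, row, variable, new_value, reason):
    """Store a corrected value next to the original and log the alteration."""
    record = records[row]
    cleaned_name = variable + "_clean"   # hypothetical naming convention
    # Previous value: an earlier correction if one exists, else the original.
    old_value = record.get(cleaned_name, record[variable])
    record[cleaned_name] = new_value     # the original variable is never overwritten
    log.append({
        "row": row,
        "variable": variable,
        "old": old_value,
        "new": new_value,
        "reason": reason,
    })


# Hypothetical data set with an implausible age.
records = [{"id": 1, "age": 230}]
alteration_log = []

correct_value(records, alteration_log, 0, "age", 23, "implausible value, data entry typo")

print(records[0])
print(alteration_log)
```

Because the original variable is untouched and each change is logged with its previous value, any alteration can be traced and undone later.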

Analyses

Several analyses can be used during the initial data analysis phase:[11]

Main data analysis

Final data analysis

Free software for data analysis