# Data Science and Analytics Course Chennai


**Data Science** is an interdisciplinary field concerned with the processes and systems used to extract knowledge or insights from data in various forms, whether structured or unstructured. It is a continuation of data analysis fields such as statistics, data mining, and predictive analytics, and is similar to knowledge discovery in databases.

**Data Analytics** is the science of examining raw data in order to draw conclusions about that information. It is used in many industries to help companies and organizations make better business decisions, and in the sciences to verify or disprove existing models or theories. Data analytics is distinguished from data mining by the scope, purpose, and focus of the analysis. Data miners sort through huge data sets using sophisticated software to identify undiscovered patterns and establish hidden relationships. Data analytics focuses on inference, the process of deriving a conclusion based solely on what is already known by the researcher.

Statistics is Phase I of Data science

- Introduction to Statistics.
- Five Number Summary.
- The Centre of the Data and the Effects of Extreme Values.
- The Spread of the Data.
- The Shape of the Data.
- Categorical Variables.
- Some Features of Data.
- Relationships Between Quantitative and Categorical Variables.
- Examining Relationships Between Two Categorical Variables.
- Relationships Between Two Quantitative Variables.
- Data Collection: Sampling.
- Data Collection: Observational Studies.
- Data Collection: Experiments.
- The Need for Probability.
- Some Probability Basics.
- Probability Distributions.
- Long-Run Averages.
- Sampling Distributions.
- Introduction to Confidence Intervals.
- Confidence Intervals for Proportions.
- Sample Size for Estimating a Proportion.
- Confidence Intervals for Means.
- Robustness of Confidence Intervals.
- Introduction to Statistical Tests.
- The Structure of Statistical Tests.
- Hypothesis Testing for Proportions.
- Hypothesis Testing for Means.
- Power and Type I and Type II Errors.
- Connection Between Confidence Intervals and Hypothesis Testing.
- Matched Pairs.
- Comparing Two Proportions.
- Comparing Two Means.
- The Linear Regression Formula.
- Regression Coefficients, Residuals, and Variances.
- Regression Inference and Limitations.
- Residual Analysis and Transformations.
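
The first topics in this list (the five-number summary, the centre of the data, and the effect of extreme values) can be illustrated with a short sketch. Python's standard library is used here only as an assumption for illustration; the function name `five_number_summary` and the sample values are hypothetical and not part of the course material. Quartile conventions differ between tools, so the `inclusive` method chosen below is one option among several.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" treats the data as the whole population of interest;
    # other quartile conventions give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical sample, invented for this illustration.
scores = [2, 4, 4, 5, 7, 9, 10]
print(five_number_summary(scores))

# Replace the largest value with an extreme one: the mean shifts a lot,
# while the median does not move.
with_outlier = [2, 4, 4, 5, 7, 9, 100]
print(statistics.mean(scores), statistics.median(scores))
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

Comparing the two printed pairs shows why the median is often preferred as a measure of centre when a data set contains extreme values.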

**Phase II of Data science**

**Overview.**

- History of R.
- Advantages and disadvantages.
- Downloading and installing.
- How to find documentation.

**Introduction.**

- Using the R console.
- Getting help.
- Learning about the environment.
- Writing and executing scripts.
- Saving your work.

**Data Structures and Variables.**

- Variables and assignment.
- Data types.
- Indexing, subsetting.
- Viewing data and summaries.
- Functions.
- Naming conventions.
- Objects.
- Models.
- Graphics.

**Control Flow.**

- Truth testing.
- Branching.
- Looping.
- Vectorized calculations.

**Functions.**

- Parameters.
- Return values.
- Variable scope.
- Exception Handling.

**Getting Data into the R environment.**

- Built-in data.
- Reading local data.
- Web data.

**Overview of Statistics in R.**

- Introduction to R Graphics.
- Model notation.

**Descriptive statistics.**

- Continuous data.
- Scatter plot, box plot.
- Categorical data.
- Mosaic plot.
- Correlation.

**Inferential statistics.**

- T-test and non-parametric equivalents.
- Chi-squared test, logistic regression.
- Distribution testing.
- Power testing.

**Linear Regression.**

- Linear models.
- Regression plots.
- ANOVA.

**Other Topics.**

- Classification.
- Clustering.
- Time series.
- Dimensionality reduction.
- Machine Learning.

**Object Oriented R.**

- Generic functions.
- S3/S4 classes.

**Installing Packages.**

- Finding resources.
- Installing resources.

**More about Graphics.**

- Labels.
- Exporting graphics.

**Sophisticated Graphics in R.**

- Lattice.
- ggplot2.
- Interactive graphics.
- Animated GIF.
- rggobi.

**R for Mapping and GIS.**

- Choropleth maps.
- Layers.

**Data Mining Training in Chennai**

- Introduction to Data mining
- Algorithms

Machine Learning is Phase IV of Data science

**Machine Learning**

- Introduction, Regression Analysis and Gradient Descent
- Linear Algebra – review
- Linear Regression with Multiple Variables
- Logistic Regression
- Regularization
- Representation
- Learning
- Machine learning techniques hands-on
- Machine Learning System Design
- Support Vector Machines
- Clustering
- Dimensionality Reduction
- Anomaly Detection
- Recommender Systems
- Large Scale Machine Learning
- Course Summary
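
As a taste of the first Machine Learning topic, the following is a minimal sketch of batch gradient descent fitting a straight line by minimising mean squared error. It is written in plain Python as an assumption for illustration; the function name `fit_line_gradient_descent`, the learning rate, the iteration count, and the toy data are all hypothetical choices, not the course's own material.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.05, n_iterations=5000):
    """Fit y ~ intercept + slope * x by batch gradient descent on mean squared error."""
    n = len(xs)
    intercept, slope = 0.0, 0.0
    for _ in range(n_iterations):
        # Prediction error for every point under the current parameters.
        errors = [intercept + slope * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_intercept = 2.0 / n * sum(errors)
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient.
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


# Toy data generated from y = 1 + 2x, invented for this illustration.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
intercept, slope = fit_line_gradient_descent(xs, ys)
print(round(intercept, 4), round(slope, 4))
```

The learning rate matters: a value that is too large makes the updates diverge, while a value that is too small needs many more iterations to converge.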