I am a great fan of Quora.com. I spend some time every day checking out what’s going on in my area and related areas. Over the past two days, I have been reading some questions about becoming a data scientist. I would like to share some answers that I loved and found very useful.

But before listing what is necessary to become a data scientist, I would like to define data science and the data scientist:

Data science is the art of helping people by analyzing data and building things based on those analyses. Data scientists are people with some mix of coding and statistical skills who work on making data useful in various ways. In general, there are two types of data scientists:

Type A data scientists work mostly on the analysis side. A Type A data scientist is very similar to a statistician (and may be one) but knows all the practical details of working with data that aren’t taught in the statistics curriculum: data cleaning, methods for dealing with very large data sets, visualization, deep knowledge of a particular domain, writing well about data, and so on. They can code well enough to get their work done but are not necessarily expert coders. They may, however, have significant skills in experimental design, forecasting, modeling, statistical inference, or other things typically taught in statistics departments.

Type B data scientists work mostly on the building side. They share some statistical background with Type A, but they are also very strong coders and may be trained software engineers. The Type B data scientist is mainly interested in using data “in production.” They build models that interact with users, often serving recommendations (products, people you may know, ads, movies, search results).

How can we become one of the types discussed above?

**Pre-requisites:**

- Math, Algorithms and Databases:
  - Calculus-3, Linear Algebra, Algorithms, Database Systems

- Statistics:
  - Probability and Statistics
  - Data Analysis

- Programming:
  - R programming
  - Scientific Python
  - pandas library

**Acquire and Scrub Data:**

- DFS and Databases
- Data Munging
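To make "data munging" concrete, here is a minimal sketch of a scrubbing step. It uses only Python's standard library as a stand-in; pandas would express the same steps more concisely. The sample data, the column names, and the `scrub` helper are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical raw export: stray whitespace, inconsistent case,
# a missing age, and one duplicated row.
RAW = """name,city,age
 Alice ,new york,34
Bob,Boston,
 Alice ,new york,34
carol,BOSTON,29
"""


def scrub(raw_text):
    """Return a list of cleaned, de-duplicated row dictionaries."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Normalize whitespace and capitalization.
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        # Keep missing ages as None instead of guessing a value.
        age_text = row["age"].strip()
        age = int(age_text) if age_text else None
        # Skip rows that are identical after cleaning.
        key = (name, city, age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "city": city, "age": age})
    return cleaned


rows = scrub(RAW)
for r in rows:
    print(r)
```

Real datasets usually need more than this (date parsing, unit conversion, outlier checks), but the pattern of normalizing, handling missing values, and de-duplicating is the core of the scrubbing stage.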

**Filter and Mine Data:**

- Data Analysis in R
- Data Analysis in Python (NumPy, SciPy, pandas, scikit-learn)
- Exploratory Data Analysis
- Data Mining, Machine Learning
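As a rough illustration of what exploratory data analysis looks like in practice, the sketch below computes summary statistics and a per-group average. It relies on the standard-library `statistics` module rather than pandas or NumPy, which would normally be used for this; the `records` sample is made up for the example.

```python
import statistics

# Hypothetical sample of (city, age) observations.
records = [
    ("Boston", 29),
    ("Boston", 41),
    ("New York", 34),
    ("New York", 52),
    ("Boston", 35),
]

ages = [age for _, age in records]

# Basic summary statistics for a single numeric column.
summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),
    "min": min(ages),
    "max": max(ages),
}

# Group ages by city, then average each group.
by_city = {}
for city, age in records:
    by_city.setdefault(city, []).append(age)
mean_by_city = {city: statistics.mean(values) for city, values in by_city.items()}

print(summary)
print(mean_by_city)
```

Looking at a summary like this before modeling helps catch skewed distributions, odd ranges, or groups with too few observations.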

**Represent and Refine Data:**

- Tableau-Training & Tutorials
- Data visualisation in R with ggplot2 and plyr
- Predictive Analytics: Overview and Data visualization
- Flowing Data-Tutorials
- UC Berkeley-Data Visualization
- D3.js Tutorial

**Domain Knowledge:**

This skill is developed through experience working in an industry. Each dataset is different and comes with its own assumptions and industry knowledge. For example, a data analyst specializing in stock market data would need time to develop expertise in analyzing transactional data for restaurants.

**Combining all the above:**

Data Literacy Course — IAP

UC Berkeley Introduction to Data Science

Coursera-Introduction to Data Science

Teach Data Science-Syracuse University

**Apply the knowledge:**

Harvard Data Science Course Homework

Kaggle: The Home of Data Science

Analyzing Big Data with Twitter

Analyzing Twitter Data with Apache Hadoop

Thanks to Pronojit Saha for this amazing answer on Quora.