Big Data
Statistics (academic discipline)
Book Recommendations
Books
How do I learn statistics for data science? What statistics book do you recommend to a wannabe data scientist who is familiar with basic statistics and mathematics?
Related Questions What are the best books on statistics for data science? What statistics should I know to do data science? Which is a better career option for someone interested in statistics, probability & linear algebra? Data Science or Machine Learning? Where can I find some good free resources to learn statistics for data science and machine learning? In order to learn statistics for data science class, which one is better Udacity: Introtostatistics or Khan Academy: Probability and Statist... Is a graduate optimization course good for statistics or for data science?
23 Answers
William Chen, studied Statistics at Harvard University (2014) Updated Jan 16, 2015 · Upvoted by Yassine Alouini, I hold a masters in statistics (formally part III) from Cambridge.
For any aspiring data scientist, I would highly recommend learning statistics with a heavy focus on coding up examples, preferably in Python or R.
My favorite series is the Statistical Learning series. It's a great primer on statistical modeling / machine learning with applications in R:
The Elements of Statistical Learning
An Introduction to Statistical Learning
There are official pdf versions generously available for FREE at:
data mining, inference, and prediction. 2nd Edition.
Page on usc.edu
If you want something with a Python focus, I would check out Think Stats:
http://greenteapress.com/thinkstats2/index.html
43.8k Views · 202 Upvotes · Answer requested by Fadli Hidayat and Minhaz Mishu
Greg Ryslik, Led data science teams at Bay Area companies Written Jan 27
I wouldn’t focus so much on learning statistics “for data science”, but more on just “learning statistics”. Data Science itself is a combination of two fields, statistics/mathematics and computer science. There were “data scientists” who sat at the intersection of those two fields long before the term was coined. Many of the answers above (which are great!) are targeted specifically to “machine learning”. In getting a broader perspective you gain the ability to not only implement the models but also understand how they connect and relate to the deeper mathematics behind them. As such, this post is aimed more at the general field.
In terms of statistics that are immediately useful to data science, the topics typically fall into one of two categories: 1) inference or 2) model fitting.
1) In regards to inference, that typically covers topics such as:
   1) Parameter Estimation
   2) Hypothesis Testing
   3) Bayesian Analysis
   4) Identifying the best estimator
   5) Other Statistical Theory
Some classic books on these topics include:
(more introductory): Statistical Inference: George Casella: 9788131503942: Amazon.com: Books
(more advanced): Theory of Point Estimation (2nd English Edition): E.L. Lehmann, George Casella: 9783698745156: Amazon.com: Books
2) In regards to model fitting there are a multitude of topics:
   1) Linear Regression
   2) Non-linear Regression
   3) Categorical Data Analysis
   4) Time Series & Longitudinal Analysis
   5) Machine Learning
Some famous intro books include:
Linear Models: Applied Linear Statistical Models w/Student CD-ROM: Michael H. Kutner, John Neter, Christopher J. Nachtsheim, William Li: 9780071122214: Amazon.com: Books
Categorical Data: Amazon.com: An Introduction to Categorical Data Analysis (9780471226185): Alan Agresti: Books
3) Finally, there are also a variety of topics that are very helpful with things like A/B testing, missing data, etc. These include things like:
   1) Design of Experiments (very helpful in A/B testing)
   2) Bootstrapping (helpful when the parameter of interest is hard to calculate)
   3) Sample Size calculations (useful when trying to understand how many samples you need)
   4) Multiple comparisons (what happens if you run many tests)
   5) A ton of others.
Many of the above you will encounter as you work through 1) and 2) above.
If you’re interested in a potential introductory syllabus, I’ll be teaching a bootcamp shortly. The course and syllabus can be found here: Statistical Foundations - Metis
Hope this helps!
1.3k Views · 8 Upvotes
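To make the bootstrapping item in the answer above a little more concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a median, using only the Python standard library. The function name `bootstrap_ci`, the simulated exponential data, the number of resamples, and the seeds are all assumptions chosen for illustration; they do not come from the answer itself.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.median, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Read the interval endpoints off the sorted bootstrap estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical data: 200 draws from a skewed (exponential) distribution.
data_rng = random.Random(42)
sample = [data_rng.expovariate(1.0) for _ in range(200)]

low, high = bootstrap_ci(sample)
print(f"sample median = {statistics.median(sample):.3f}")
print(f"95% bootstrap CI for the median: ({low:.3f}, {high:.3f})")
```

The median has no simple closed-form standard error, which is the kind of situation where resampling the data is a convenient substitute for a formula. The percentile method shown here is only one of several bootstrap interval constructions.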
Ferris Jumah, Data and Products Updated Jan 19, 2013 · Upvoted by Lili Jiang, Data Scientist at Quora
Working list, please suggest edits, need classifications
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition [1] - Hastie, Tibshirani, Friedman
Statistical Inference [2] - Casella, Berger (excellent starting text for moving on to more advanced material)
Bayesian Data Analysis [3] - Gelman, Carlin, Stern, Rubin
Mining of Massive Datasets [4] - Rajaraman, Ullman, Leskovec
All of Statistics [5] - Wasserman
Also, for a very comprehensive list, see What are some good resources for learning about statistical analysis?
[1] data mining, inference, and prediction. 2nd Edition.
[2] Statistical Inference: George Casella, Roger L. Berger: 9780534243128: Amazon.com: Books
[3] Home page for the book, "Bayesian Data Analysis"
[4] Mining of Massive Datasets - The Stanford University InfoLab
[5] All of Statistics: A Concise Course in Statistical Inference (Springer Texts in Statistics): Larry Wasserman: 9780387402727: Amazon.com: Books
26.3k Views · 33 Upvotes
Brian Feeny, Harvard Grad Student Written Dec 16, 2012 · Upvoted by William Chen, studied Statistics at Harvard University (2014) and Justin Rising, PhD in statistics
There are many books that focus on statistics as it applies to data science; however, I do believe you should approach statistics holistically, and not just within the frame of reference of Data Science. For that, I recommend the following book:
Statistics, 4th Edition (9780393929720): David Freedman, Robert Pisani, Roger Purves
This is the same book (loosely) followed by Andrew Conway in his Coursera course Statistics One. I would try to find the International version, as it is identical to the US version but can be had for around $30. The first chapter or two are rather confusing, but I find the rest of the book very well laid out. Andrew Conway is very knowledgeable in Statistics, and no doubt he has recommended this book for good reason.
That said, I recommend using no single resource. Statistics is far too important to Data Science. You must master it, and like most things, that is a constant work in progress. I am addicted to Statistics, and I think this book is partially to blame.
15.7k Views · 9 Upvotes
Carl Shan, reads a lot, has written a few Written Jul 29, 2015
To brush up on some basic statistics without dropping a load of cash on a textbook or degree, I'd suggest starting with a series of short primers (10-12 page PDFs per topic) meant for the novice statistician and social science researcher, written by MIT EECS PhD student Ramesh Sridharan. He taught a one-month course at MIT for researchers brushing up on basic or intermediate statistics, and shared all of his PDFs. (You can check out the website here: Statistics for Research Projects)
I stumbled across his notes while looking up some details regarding the Kolmogorov-Smirnov test, a non-parametric test for differences between two distributions (a non-parametric test is one that doesn't assume the data come from any particular parametric family of distributions, and is thus "parameter"-free), and found his notes to be incredibly lucid and clear.
If you have some mathematical or technical maturity, you may find his notes similarly helpful in getting up to speed. If not, I still think his notes are a great initial entry point into quickly getting a lay of the land.
The link to his 6-7 notes, totaling ~70 pages, is here: Statistics for Research Projects
Note that he doesn't have any notes on predictive modeling, which is a key part of machine learning. I emailed him asking why, and he told me that he didn't have the chance to write anything detailed for the topic. I'm considering drafting a short primer myself...
13.6k Views · 24 Upvotes
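For readers curious about the Kolmogorov-Smirnov test mentioned in the answer above, the following is a rough, hand-rolled sketch of the two-sample KS statistic: the largest vertical gap between the two empirical CDFs. The helper name `ks_statistic` and the simulated Gaussian samples are assumptions made for illustration. The sketch computes only the statistic, not a p-value; in practice a library routine such as `scipy.stats.ks_2samp` would normally be used for the full test.

```python
import bisect
import random


def ks_statistic(xs, ys):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)

    def ecdf(sorted_vals, t):
        # Fraction of observations less than or equal to t.
        return bisect.bisect_right(sorted_vals, t) / len(sorted_vals)

    # Both ECDFs are step functions that only jump at observed values,
    # so it is enough to check the gap at every observed data point.
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in xs + ys)


# Hypothetical samples: a and b share a distribution, c has its mean shifted by 1.
rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(500)]
b = [rng.gauss(0.0, 1.0) for _ in range(500)]
c = [rng.gauss(1.0, 1.0) for _ in range(500)]

d_same = ks_statistic(a, b)
d_shift = ks_statistic(a, c)
print(f"D(a, b) = {d_same:.3f}")
print(f"D(a, c) = {d_shift:.3f}")
```

Nothing in the computation refers to a mean, variance, or any other distributional parameter, which is the sense in which the test is non-parametric. A larger statistic indicates a bigger discrepancy between the two samples.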
Shailesh Upadhyay, former Associate at Indian School of Business (2010-2011) Updated Dec 23 · Upvoted by Ujala Shanker, Taught Statistics to undergrad students at UC Berkeley. Originally Answered: How do I learn statistics and probability for data science?
To become a good data scientist, you need to build a strong foundation in the following:
Fundamental statistics (topics like descriptive & inferential statistics, parametric & non-parametric tests, simple & multiple regression, etc.)
Proficiency with at least one statistical computing language like R, SAS, STATA etc. Python programmers who have done data analysis also have an edge.
Good knowledge/experience with advanced modeling techniques, such as time series analysis, matrix factorization, mixed-effect models, and machine learning techniques such as boosting and random forests.
Algorithmic thinking: the ability to think about and solve problems at a level of abstraction that is beyond any specific programming language goes a long way.
An understanding of how relational databases work. SQL experience helps.
Experience with large data sets & distributed computing using Hadoop/Hive is an added advantage if you want to continue excelling as a data scientist.
A few online resources and MOOCs that can help you get started are:
1. Data Analyst (a good place to get a feel for data and practice)
2. Managing Big Data with MySQL - Coursera (learn to use relational DBs in business analysis)
3. Practical Machine Learning - Coursera (a primer to start machine learning intuitively)
Hope this helps.
1.5k Views · 12 Upvotes