Introduction to Scikit-learn
Chapter 4: Understanding your features
Obtaining some statistics per category
Related to the previous question…
Question 4: What are the range and statistics of each attribute, per class?
It is also a good idea to compute the statistics from the previous page separately for each category. Try to find each attribute’s range, mean, median and standard deviation for category 0, category 1 and category 2 individually. You may discover patterns that suggest which features will be useful for telling certain classes apart.
Again, I will leave you to try this yourself! I’m sure you are all NumPy experts by now and do not need any solutions for these! 😊
Hint: x[y==2] will give you all feature vectors that belong to category 2.
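If you would like something to compare your own attempt against, here is a minimal sketch of how the hint can be turned into per-category statistics. It does not use the course dataset: the small arrays x and y below are made-up toy values, and the helper name per_class_stats is hypothetical. It also assumes the population standard deviation (NumPy's default, ddof=0); pass ddof=1 if you want the sample version.

```python
import numpy as np

# Hypothetical toy data (NOT the course dataset): 6 samples, 2 features,
# and one category label (0, 1 or 2) per sample.
x = np.array([[1.0, 10.0],
              [3.0, 14.0],
              [2.0, 20.0],
              [4.0, 24.0],
              [5.0, 30.0],
              [9.0, 38.0]])
y = np.array([0, 0, 1, 1, 2, 2])


def per_class_stats(x, y):
    """Return {category: {statistic name: per-feature array}}."""
    stats = {}
    for label in np.unique(y):
        # Boolean mask, as in the hint: keep only the rows of this category.
        rows = x[y == label]
        stats[int(label)] = {
            "min": rows.min(axis=0),
            "max": rows.max(axis=0),
            "mean": rows.mean(axis=0),
            "median": np.median(rows, axis=0),
            "std": rows.std(axis=0),  # population std (ddof=0)
        }
    return stats


stats = per_class_stats(x, y)
for label, s in stats.items():
    print(f"category {label}")
    for name, values in s.items():
        print(f"  {name}: {values}")
```

Using axis=0 computes each statistic down the rows, so every result has one entry per feature. The range of an attribute within a category is then simply its "min" and "max" entries.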