Suggestions for Titanic: Machine Learning from Disaster
I enjoyed going through your page. I think you have very clearly explained the hypothesis, problem description and how you proceed towards solving it. Some suggestions and queries I had are outlined below:
1. It would be great if you could proofread the page once, because there are typos and grammatical errors, some of which make it hard to understand what you are trying to convey.
2. You might want to explain the attributes: for instance, for the attribute Embarked, what do S, C and Q stand for? What does Parch stand for? In your section "Using probability method to fill missing age value", you might want to spell out PMM (predictive mean matching) and explain in a line or two how it works.
3. I think it was really clever how you used title and relevant attributes to guess the age of the person. By "rare title", do you mean titles like Dr.?
4. You could link to references for RMSE and the other comparison measures you use, because a lay reader might not know these standards of comparison.
5. If my understanding of ensemble methods is correct, they create and combine multiple models to, in effect, average out the errors. I am not sure how multiple models are created and combined in your ensemble NN. Also, why did you choose to ignore the attribute FamilyID in your second set of implementations? You could probably explain this more explicitly on your page.
6. So, if I understand correctly, your conclusion is that because of insufficient unique data you cannot train your neural network properly, and so you cannot use it to give a good prediction of the missing values of the age attribute. You could consider adding Conclusion, Discussion and Future Work sections to give readers a better idea of what you have established and what could be checked further.
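To make points 4 and 5 concrete, here is a minimal sketch of what I mean by averaging in an ensemble, with RMSE as the comparison measure. The ages, the three "models" and the function names are all made up for illustration; none of them come from your page, and simple averaging is only my assumption about how the combination could work.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def average_ensemble(predictions):
    """Combine several models by averaging their per-sample predictions."""
    return [sum(vals) / len(vals) for vals in zip(*predictions)]


# Hypothetical true ages and predictions from three hypothetical models.
true_ages = [22.0, 38.0, 26.0, 35.0]
model_preds = [
    [25.0, 35.0, 30.0, 33.0],
    [20.0, 40.0, 24.0, 38.0],
    [24.0, 36.0, 25.0, 31.0],
]

ensemble_pred = average_ensemble(model_preds)

for i, preds in enumerate(model_preds, start=1):
    print(f"model {i} RMSE: {rmse(preds, true_ages):.3f}")
print(f"ensemble RMSE: {rmse(ensemble_pred, true_ages):.3f}")
```

In this toy data the individual errors partly cancel, so the averaged prediction has a lower RMSE than any single model. That is not guaranteed in general, which is why it would help to know how your ensemble NN builds and combines its models.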
Thanks for a great page!
Thank you for your feedback! I will modify my page based on your suggestions.
For question 3: 'Dona', 'Lady', 'the Countess', 'Capt', 'Col', 'Don', 'Dr', 'Major', 'Rev', 'Sir' and 'Jonkheer' all belong to Rare title.
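Roughly, the grouping can be sketched as follows. This is Python written for illustration rather than the code from my page: the helper names and the sample names are assumptions, and the regular expression assumes the "Surname, Title. Given names" layout of the Name column.

```python
import re

# Titles that are grouped together as "Rare title" (the list given above).
RARE_TITLES = {
    "Dona", "Lady", "the Countess", "Capt", "Col", "Don",
    "Dr", "Major", "Rev", "Sir", "Jonkheer",
}


def extract_title(name):
    """Return the text between the first comma and the following period."""
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else None


def group_title(title):
    """Map a rare title to the shared 'Rare title' label; keep others as-is."""
    return "Rare title" if title in RARE_TITLES else title


# Sample names in the assumed "Surname, Title. Given names" format.
sample_names = [
    "Braund, Mr. Owen Harris",
    "Minahan, Dr. William Edward",
    "Rothes, the Countess. of (Lucy Noel Martha Dyer-Edwards)",
]

for name in sample_names:
    title = extract_title(name)
    print(f"{name!r}: {title} -> {group_title(title)}")
```

Common titles such as Mr pass through unchanged, so only the listed titles are merged into one group.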
For question 5, I did test a neural network that includes FamilyID, but the result is almost the same. I will update the page based on your suggestion and include FamilyID.