Prediction of crime categories in San Francisco area
With the continued growth of the Internet, a large amount of information and data is now available to us. Even with so much useful and valuable data, much remains to be done in working out how to make use of it. The first step is to identify data that is useful for the research at hand, because different data suit different research questions. What matters next is how to put the data to reasonable use in constructing a model that makes predictions. Machine learning, a subfield of computer science that includes many useful methods, plays a very important role in working with big data. My analysis uses three methods, including decision trees and random forests. All three methods are applied to two datasets from a data science competition website: one on survival in the sinking of the Titanic and one on crime categories in San Francisco. The goal for the Titanic dataset is to predict which passengers survived the tragedy, and the goal for the San Francisco dataset is to predict the category of crimes that occurred in the city. I will combine the results of all three models to see whether doing so improves prediction accuracy.
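The idea of combining the three models' results can be illustrated with a minimal sketch. The abstract does not say how the results are combined, so majority voting is an assumption here, and the function names (`majority_vote`, `accuracy`) and the small label lists are hypothetical illustrations rather than anything taken from the actual analysis or datasets.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label predictions by majority vote, row by row.

    predictions_per_model: list of equal-length lists, one list per model.
    Ties go to the label predicted by the earliest-listed model.
    (Majority voting is an assumed combination rule, not the author's.)
    """
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_rows = len(predictions_per_model[0])
    if any(len(p) != n_rows for p in predictions_per_model):
        raise ValueError("all models must predict the same number of rows")

    combined = []
    for row_votes in zip(*predictions_per_model):
        counts = Counter(row_votes)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        combined.append(next(v for v in row_votes if counts[v] == top))
    return combined


def accuracy(predicted, actual):
    """Fraction of rows where the predicted label equals the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical labels for four incidents; not real competition data.
actual = ["THEFT", "ASSAULT", "THEFT", "FRAUD"]
tree_pred = ["THEFT", "THEFT", "THEFT", "FRAUD"]
forest_pred = ["THEFT", "ASSAULT", "FRAUD", "FRAUD"]
third_pred = ["ASSAULT", "ASSAULT", "THEFT", "THEFT"]

combined_pred = majority_vote([tree_pred, forest_pred, third_pred])
for name, pred in [("tree", tree_pred), ("forest", forest_pred),
                   ("third", third_pred), ("combined", combined_pred)]:
    print(name, accuracy(pred, actual))
```

In this toy case each model errs on a different row, so the vote corrects the individual mistakes. Whether the same holds on the Titanic or San Francisco data depends on how correlated the models' errors are.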