Sklearn Assignment
Task
In the second ML assignment you have to compare the performance of three different classification algorithms, namely Naive Bayes, SVM, and Random Forest.
For this assignment you need to generate a random binary classification problem, and then train and test the three algorithms using 10-fold cross validation. Some algorithms need an inner 5-fold cross validation to choose their parameters. Then show the classification performance (per fold and averaged) in the report, and briefly discuss the results.
- Create a classification dataset (n_samples ≥ 1000, n_features ≥ 10)
- Split the dataset using 10-fold cross validation
- Train the algorithms
  - GaussianNB
  - SVC (possible C values [1e-02, 1e-01, 1e00, 1e01, 1e02], RBF kernel)
  - RandomForestClassifier (possible n_estimators values [10, 100, 1000])
- Evaluate the cross-validated performance
  - Accuracy
  - F1-score
  - AUC ROC
- Write a short report summarizing the methodology and the results
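The task list above can be sketched end to end as follows. This is one possible layout, not a reference solution. It assumes stratified, shuffled folds (the task only says "10-fold"), uses GridSearchCV with its default accuracy scoring for the inner 5-fold parameter search, and takes ROC AUC scores from decision_function or predict_proba, whichever the model provides. The helper names make_models, positive_scores and evaluate, and the report dictionary, are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC


def make_models(inner_folds=5, seed=0):
    """Return the three classifiers; SVC and the forest get an inner grid search."""
    # Assumption: stratified, shuffled inner folds
    inner = StratifiedKFold(n_splits=inner_folds, shuffle=True, random_state=seed)
    return {
        "GaussianNB": GaussianNB(),  # no parameters to tune here
        "SVC": GridSearchCV(
            SVC(kernel="rbf"), {"C": [1e-2, 1e-1, 1e0, 1e1, 1e2]}, cv=inner
        ),
        "RandomForest": GridSearchCV(
            RandomForestClassifier(random_state=seed),
            {"n_estimators": [10, 100, 1000]},
            cv=inner,
        ),
    }


def positive_scores(model, X):
    """Continuous scores for the positive class, as needed by ROC AUC."""
    if hasattr(model, "decision_function"):
        return model.decision_function(X)
    return model.predict_proba(X)[:, 1]


def evaluate(model, X, y, outer_folds=10, seed=0):
    """Run the outer cross validation and return per-fold metric lists."""
    outer = StratifiedKFold(n_splits=outer_folds, shuffle=True, random_state=seed)
    results = {"accuracy": [], "f1": [], "roc_auc": []}
    for train_idx, test_idx in outer.split(X, y):
        # For the grid-search models, fit() runs the inner search on the training part only
        model.fit(X[train_idx], y[train_idx])
        y_pred = model.predict(X[test_idx])
        y_score = positive_scores(model, X[test_idx])
        results["accuracy"].append(accuracy_score(y[test_idx], y_pred))
        results["f1"].append(f1_score(y[test_idx], y_pred))
        results["roc_auc"].append(roc_auc_score(y[test_idx], y_score))
    return results


X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
models = make_models()

# Demo on the two cheaper models; models["RandomForest"] can be passed
# to evaluate() in the same way, but takes much longer to run.
report = {}
for name in ("GaussianNB", "SVC"):
    report[name] = evaluate(models[name], X, y)
    for metric, values in report[name].items():
        print(f"{name:12s} {metric:9s} mean={np.mean(values):.3f}")
```

Each list in report holds the ten per-fold values, so both the per-fold table and the averages for the report can be read from it.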
Step 1
Create a classification dataset (n_samples ≥ 1000, n_features ≥ 10)
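A minimal sketch of this step, assuming scikit-learn's make_classification is used as the generator. The values for n_informative, n_redundant and random_state are illustrative choices, not requirements of the assignment.

```python
import numpy as np
from sklearn.datasets import make_classification

# Random binary classification problem that meets the size requirements.
# n_informative, n_redundant and random_state are assumed values for illustration.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    n_redundant=2,
    n_classes=2,
    random_state=0,
)

print("X shape:", X.shape)
print("y shape:", y.shape)
print("samples per class:", np.bincount(y))
```

Fixing random_state makes the dataset reproducible, so the per-fold results in the report can be regenerated.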