
Python Assignment | Machine Learning Assignment - machine learning
Date: 2020-11-12
1. (5+10+10=25 pts) PCA

For this problem, we will try to quantify the impact of dimensionality reduction on logistic regression.

(a) Normalize the features of the wine quality dataset (where applicable). Train an unregularized logistic regression model on the normalized dataset and predict the probabilities on the normalized test data.

(b) Run PCA on the normalized training dataset. How many components are needed to capture at least 95% of the variance in the original data? Discuss what characterizes the first 3 principal components (i.e., which original features are important).

(c) Train an unregularized logistic regression model using the PCA dataset and predict the probabilities on the appropriately transformed test data (i.e., for PCA, the test data should be transformed to reflect the loadings on the k principal components). Plot the ROC curves for both models (normalized dataset, PCA dataset) on the same graph. Discuss your findings from the ROC plot.

2. (30+10+5=45 pts) Almost Random Forest

For this problem, you will implement a variant of the random forest using the decision trees from scikit-learn. However, instead of subsetting the features at each node of each tree in your forest, you will choose a random subspace on which each tree will be created. In other words, each tree will be built using a bootstrap sample and a random subset of the features. The template code for the random forest is available in rf.py.

(a) Build the adaptation of the random forest. The forest will support the following parameters:

- nest: the number of trees in the forest
- maxFeat: the maximum number of features to consider in each tree
- criterion: the split criterion, either gini or entropy
- maxDepth: the maximum depth of each tree
- minSamplesLeaf: the minimum number of samples per leaf node

You will need to implement the train and predict functions in the template code:

- train: Given a feature matrix and the labels, learn the random forest using the data. The return value should be the OOB error associated with the trees up to that point. For example, at 5 trees, calculate the random forest predictor by averaging only those trees whose bootstrap sample does not contain the observation.
- predict: Given a feature matrix, predict the responses for each sample.

(b) Find the best parameters on the wine quality training dataset based on classification error. Justify your selection with a few plots or tables.

(c) Using your optimal parameters, how well does your version of the random forest perform on the test data? How does this compare to the estimated OOB error?
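
A minimal sketch of one way to approach Problem 1 with scikit-learn follows; it is not the official solution. The wine quality CSVs are not bundled here, so make_classification stands in as placeholder data, and penalty=None assumes scikit-learn >= 1.2 (older versions spell it penalty='none').

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data: replace with the course's wine quality train/test split.
X, y = make_classification(n_samples=2000, n_features=11, random_state=0)
xTrain, xTest, yTrain, yTest = train_test_split(X, y, random_state=0)

def plot_roc(xTr, xTe, label):
    """Fit an unregularized logistic regression and plot its ROC curve."""
    # penalty=None disables regularization (scikit-learn >= 1.2)
    model = LogisticRegression(penalty=None, max_iter=5000).fit(xTr, yTrain)
    fpr, tpr, _ = roc_curve(yTest, model.predict_proba(xTe)[:, 1])
    plt.plot(fpr, tpr, label=f"{label} (AUC={auc(fpr, tpr):.3f})")

# (a) normalize using statistics estimated on the training set only
scaler = StandardScaler().fit(xTrain)
xTrainN, xTestN = scaler.transform(xTrain), scaler.transform(xTest)

# (b) n_components in (0, 1) keeps enough PCs to explain that variance fraction
pca = PCA(n_components=0.95).fit(xTrainN)
print("components for 95% variance:", pca.n_components_)
print("loadings of the first 3 PCs:\n", pca.components_[:3])

# (c) both ROC curves on one plot; test data uses the training-set loadings
plot_roc(xTrainN, xTestN, "normalized")
plot_roc(pca.transform(xTrainN), pca.transform(xTestN), "PCA (95%)")
plt.plot([0, 1], [0, 1], "k--", lw=1)
plt.xlabel("false positive rate"); plt.ylabel("true positive rate")
plt.legend(); plt.show()
```

For part (b), the rows of pca.components_ are the loadings: the original features with the largest absolute loadings are the ones that characterize each principal component.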
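For Problem 2, the rf.py template is not reproduced here, so the class skeleton below is an assumption that follows only the parameter names given in the problem statement. It illustrates the per-tree random subspace (the "almost" part, versus per-node feature subsetting in a true random forest) and the running OOB error, assuming binary 0/1 labels in NumPy arrays.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class AlmostRandomForest:
    def __init__(self, nest, maxFeat, criterion, maxDepth, minSamplesLeaf):
        self.nest = nest
        self.maxFeat = maxFeat
        self.criterion = criterion
        self.maxDepth = maxDepth
        self.minSamplesLeaf = minSamplesLeaf
        self.trees = []  # list of (fitted tree, feature indices it was trained on)

    def train(self, xFeat, y):
        """Fit nest trees; return {k: OOB error using the first k trees}."""
        n = xFeat.shape[0]
        rng = np.random.default_rng(0)
        votes = np.zeros((n, 2))  # running OOB vote counts per class (0/1 labels)
        oob_errors = {}
        for k in range(1, self.nest + 1):
            rows = rng.integers(0, n, size=n)   # bootstrap sample (with replacement)
            feats = rng.choice(xFeat.shape[1], size=self.maxFeat, replace=False)
            tree = DecisionTreeClassifier(criterion=self.criterion,
                                          max_depth=self.maxDepth,
                                          min_samples_leaf=self.minSamplesLeaf)
            tree.fit(xFeat[np.ix_(rows, feats)], y[rows])
            self.trees.append((tree, feats))
            # only rows absent from this tree's bootstrap sample get its vote
            oob = np.setdiff1d(np.arange(n), rows)
            preds = tree.predict(xFeat[np.ix_(oob, feats)]).astype(int)
            votes[oob, preds] += 1
            seen = votes.sum(axis=1) > 0        # rows with at least one OOB vote
            yHat = votes[seen].argmax(axis=1)
            oob_errors[k] = np.mean(yHat != y[seen])
        return oob_errors

    def predict(self, xFeat):
        """Majority vote over all trees, each using its own feature subset."""
        all_preds = np.array([t.predict(xFeat[:, f]) for t, f in self.trees])
        return (all_preds.mean(axis=0) >= 0.5).astype(int)
```

Usage would look like forest = AlmostRandomForest(nest=50, maxFeat=6, criterion="gini", maxDepth=8, minSamplesLeaf=5), then oob = forest.train(xTrain, yTrain) and yHat = forest.predict(xTest); those hyperparameter values are arbitrary here. Sweeping maxFeat and maxDepth while plotting the returned OOB errors against the number of trees gives the plots parts (b) and (c) ask for, and the final OOB error is the estimate to compare against the test error.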
