GMGT 7530 - Assignment 3 (R)
Posted: 2021-07-14
GMGT 7530 (T23)
(Instructors: Dr. Xikui Wang, Dr. Wenxi Pu, Dr. Carson Leung)
Assignment 3 (25%) on textual analysis
Due: July 16th, 2021 (before 12 noon Winnipeg time)
Submit your work to the UM Learn Assignment 3 folder.


Use R to complete the following questions. Submit your R code in its original form so that I can
test-run it. Include your results and your answers to questions 2.a and 3.c in a separate PDF file.
You are given a set of news articles about entrepreneurs from a list of newspapers headquartered
in the United States (EntrepreneurNewsArticles.zip). Each text file is one news article. The file
names include a random ID, newspaper name, published year (YYYY) and date (MMDD).
News articles from the same newspaper are organized into one folder. Please perform the
following tasks on this dataset.
Remark: A lot of R code is available online for this data set, covering various statistical analyses
and machine learning methods. You are allowed to learn from this code. However, you must
understand it, write your own code, and attribute the original ideas properly. Install any necessary
packages. Since many resources are available online, feel free to go beyond what this assignment asks for.

1. (5%) Preparation of the data set
a. Load the data set into a data frame, such that you have columns for each news
article's source (i.e., the newspaper), publishing year and date, and the content
(similar to what you did in the lab, but you need to extract the year and date,
instead of using the filenames directly).
b. Clean the data (similar to what you did in the assignment; you might want to
modify the list of extra stop words that we created).
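
As a hint for step 1.a, the following is a minimal sketch of splitting file paths into source, year, and date columns using base R only. The file names, the newspaper folder names, and the assumption that each name ends in "<YYYY><separator><MMDD>.txt" are hypothetical; the actual naming pattern must be checked against the contents of EntrepreneurNewsArticles.zip.

```r
# Hypothetical paths for illustration only. With the real data, something like
# list.files("EntrepreneurNewsArticles", pattern = "\\.txt$",
#            recursive = TRUE, full.names = TRUE) would produce this vector.
paths <- c(
  "EntrepreneurNewsArticles/NewspaperA/a1b2c3_NewspaperA_2015_0312.txt",
  "EntrepreneurNewsArticles/NewspaperB/x9y8z7_NewspaperB_2018_1105.txt"
)

parse_article_path <- function(path) {
  stem <- sub("\\.txt$", "", basename(path))
  # Assumption: the name ends with YYYY, an optional "_" or "-", then MMDD.
  m <- regmatches(stem, regexec("([0-9]{4})[_-]?([0-9]{4})$", stem))[[1]]
  if (length(m) != 3) stop("Unexpected file name: ", path)
  data.frame(
    source = basename(dirname(path)),  # folder name is taken as the newspaper
    year   = as.integer(m[2]),
    date   = m[3],                     # MMDD kept as text to preserve leading zeros
    stringsAsFactors = FALSE
  )
}

articles <- do.call(rbind, lapply(paths, parse_article_path))
print(articles)
```

With real files, the content column could then be filled by reading each path with readLines() and collapsing the lines into a single string per article.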
2. (10%) Deciding the number of topics
a. Use the ldatuning package to derive the optimal number of topics. Justify your
choice. Because the dataset is fairly large, it is better to draw a manageable
random subset (say, 25%) of the whole dataset when deciding the number of
topics.
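
The 25% subset mentioned above could be drawn along these lines. This is only a sketch: the placeholder data frame, the seed, and the choice to sample within each newspaper (so that every source stays represented) are assumptions, not requirements of the assignment.

```r
set.seed(7530)  # arbitrary seed so that the subset is reproducible

# Placeholder standing in for the cleaned data frame from question 1.
articles <- data.frame(
  doc_id = seq_len(200),
  source = rep(c("NewspaperA", "NewspaperB"), each = 100),
  stringsAsFactors = FALSE
)

# Draw 25% of the row indices within each source.
rows_by_source <- split(seq_len(nrow(articles)), articles$source)
idx <- unlist(lapply(rows_by_source, function(i) {
  # sample.int() on the length avoids the sample() pitfall for length-one groups
  i[sample.int(length(i), size = ceiling(0.25 * length(i)))]
}))

subset_articles <- articles[idx, ]
table(subset_articles$source)
```

The document-term matrix built from this subset is what would then be passed to ldatuning::FindTopicsNumber(), with the candidate range of topic numbers chosen by you.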
3. (10%) Run an LDA model on the cleaned dataset with the derived number of topics
a. Develop a label for each topic.
b. Get the document-topic distribution.
c. Propose potential research questions and potential datasets that can be merged
with the document-topic distribution you got.
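
To illustrate what "merging with the document-topic distribution" in 3.c can look like, here is a minimal base R sketch. All values are simulated placeholders, not results: the metadata, the number of topics, and the column names are assumptions. With a model fitted by the topicmodels package, the matrix would typically come from posterior(fit)$topics instead.

```r
set.seed(1)  # reproducible placeholder values

# Placeholder metadata; in practice this is the data frame built in question 1.
meta <- data.frame(
  doc_id = 1:6,
  source = rep(c("NewspaperA", "NewspaperB"), each = 3),
  year   = c(2015L, 2015L, 2016L, 2015L, 2016L, 2016L),
  stringsAsFactors = FALSE
)

# Simulated document-topic matrix (rows = documents, columns = topics).
k <- 3
raw <- matrix(runif(nrow(meta) * k), nrow = nrow(meta))
doc_topics <- raw / rowSums(raw)  # normalise so that each row sums to 1
colnames(doc_topics) <- paste0("topic_", seq_len(k))

# Attach the topic shares to the article metadata.
doc_level <- cbind(meta, doc_topics)

# Average topic share per newspaper and year.
topic_by_source_year <- aggregate(
  cbind(topic_1, topic_2, topic_3) ~ source + year,
  data = doc_level, FUN = mean
)
print(topic_by_source_year)
```

A table keyed on source and year like this one can be joined with merge() to any external dataset that shares those keys, which is one way to frame the research questions asked for in 3.c.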
