Crawler Programme
In this part, I first build a crawler in Python using external libraries such as BeautifulSoup4 and DrissionPage, and use multiprocessing to crawl in several processes at once, which improves crawling efficiency.
Finally, the crawled data is written to a MongoDB database for storage. The form of the crawled data is shown in the following figure.
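The pipeline described above (split the pages into tasks, crawl them in parallel processes, parse each page, store the results in MongoDB) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual crawler: the URL pattern, the `div.comment` selector, the database name and all function names are hypothetical, and the standard-library `urlopen` stands in for the DrissionPage fetching step.

```python
from multiprocessing import Pool

# Hypothetical placeholders, not the values used in this project.
MONGO_URI = "mongodb://localhost:27017"
DB_NAME = "reviews"


def build_tasks(category, base_url, pages):
    """Return one (category, url) task per result page (assumed ?page=N scheme)."""
    return [(category, f"{base_url}?page={n}") for n in range(1, pages + 1)]


def parse_comments(category, url, html):
    """Turn the comment nodes of one page into MongoDB-ready dicts."""
    from bs4 import BeautifulSoup  # third-party package: beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    return [
        {"category": category, "url": url, "text": node.get_text(strip=True)}
        for node in soup.select("div.comment")  # assumed CSS selector
    ]


def crawl_page(task):
    """Worker: fetch one page, parse it, store the result, return the count."""
    from urllib.request import urlopen  # stand-in for the DrissionPage fetch
    from pymongo import MongoClient  # third-party package: pymongo

    category, url = task
    with urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    docs = parse_comments(category, url, html)
    if docs:  # insert_many rejects an empty list
        # Each worker opens its own client, since MongoClient is not fork-safe.
        with MongoClient(MONGO_URI) as client:
            client[DB_NAME][category].insert_many(docs)
    return len(docs)


def crawl(tasks, workers=4):
    """Spread the tasks over a pool of processes and return the total stored."""
    with Pool(processes=workers) as pool:
        return sum(pool.map(crawl_page, tasks))
```

A run would call something like `crawl(build_tasks("cellphone", "https://example.com/cellphone", 3))` under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that spawn rather than fork worker processes. Using one collection per product category is only one possible layout.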
Data Statistics
The statistics of the crawled data are shown below:
Deficiency
One shortcoming is that the lipstick and running shoes databases contain no replies, and the ricecooker database contains only a few.
Because of this deficiency, only the cellphone, laptop and camera databases were used for processing.
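A selection of this kind could be checked programmatically, for example by counting replies per category and keeping only the categories above a chosen threshold. The sketch below is illustrative only: the `replies` field name, the function names and the threshold are assumptions, not details taken from the project.

```python
def count_replies(documents):
    """Count reply entries over comment documents.

    Assumes each document may carry a 'replies' list; a missing or empty
    field counts as zero replies.
    """
    return sum(len(doc.get("replies") or []) for doc in documents)


def usable_categories(reply_counts, min_replies):
    """Keep the categories whose total reply count reaches the threshold."""
    return [name for name, total in reply_counts.items() if total >= min_replies]
```

With per-category totals collected into a dict, `usable_categories(totals, min_replies)` returns the names to keep, in the dict's insertion order.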