web data
This dataset supports research on main content extraction from web pages. It is distributed as an archived MongoDB file and a PostgreSQL dump file.
The dataset contains crawled MHTML files of web pages in nine languages, including Korean, Japanese, Indonesian, French, Russian, Arabic (Saudi Arabia), and Chinese.
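Because MHTML is a MIME multipart container, the HTML of a crawled page can usually be recovered with a standard MIME parser before any content extraction is attempted. The following is a minimal sketch using only the Python standard library. The function name `extract_html_from_mhtml` and the file name in the usage comment are hypothetical, and how the MHTML files are actually stored inside the MongoDB archive or PostgreSQL dump is not described here, so obtaining the raw bytes is left to the reader.

```python
import email
from email import policy


def extract_html_from_mhtml(raw: bytes) -> str:
    """Return the first text/html part of an MHTML document.

    MHTML is a MIME multipart/related message, so the standard
    email parser can split it into its parts.
    """
    message = email.message_from_bytes(raw, policy=policy.default)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the transfer encoding (for example
            # quoted-printable) and decodes using the declared charset.
            return part.get_content()
    raise ValueError("no text/html part found in the MHTML data")


# Hypothetical usage, assuming one page has been saved to disk:
# with open("page.mhtml", "rb") as handle:
#     html = extract_html_from_mhtml(handle.read())
```

Returning only the first `text/html` part is a simplification: pages with frames may carry several HTML parts, and images or stylesheets are stored as further parts of the same message.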