Zh-Probase

Citation Author(s):
Haijun Zhang (SJTU)
Submitted by:
Haijun Zhang
DOI:
10.21227/0wq5-6v48

Abstract

This is a large Chinese taxonomic knowledge base, machine-translated from Probase using a neural network.

It has 11,292,493 IsA pairs with an accuracy of 86.6%.

 

Instructions:

"""concept-zh entity-zh concept entity frequency popularity ConceptFrequency ConceptSize ConceptVagueness Zipf_Slope Zipf_Pearson_Coefficient EntityFrequency EntitySize

制品 樱桃番茄 product cherry tomato 1 1 199329 103249 0.787531308703 -0.990656645104571 -0.996902197791307 279 131

...

"""

Each line in the file is one IsA pair, given first in Chinese (concept-zh, entity-zh), then as the original English pair (concept, entity), followed by the metadata columns.
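A minimal sketch of how such a file might be read in Python follows. It assumes the fields are tab-separated (the sample entity "cherry tomato" contains a space, so whitespace splitting would not yield the 13 listed columns), that the file is UTF-8 encoded, and that the numeric types match those seen in the sample row. None of this is confirmed by the dataset page. The names `parse_row` and `read_pairs`, and the `has_header` switch, are hypothetical helpers introduced only for illustration.

```python
import csv
import io

# Column names copied from the header line in the instructions above.
COLUMNS = [
    "concept-zh", "entity-zh", "concept", "entity", "frequency", "popularity",
    "ConceptFrequency", "ConceptSize", "ConceptVagueness", "Zipf_Slope",
    "Zipf_Pearson_Coefficient", "EntityFrequency", "EntitySize",
]

# Assumed types, inferred only from the single sample row.
INT_COLUMNS = {"frequency", "popularity", "ConceptFrequency",
               "ConceptSize", "EntityFrequency", "EntitySize"}
FLOAT_COLUMNS = {"ConceptVagueness", "Zipf_Slope", "Zipf_Pearson_Coefficient"}


def parse_row(fields):
    """Turn one list of 13 string fields into a dict with typed values."""
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    row = dict(zip(COLUMNS, fields))
    for name in INT_COLUMNS:
        row[name] = int(row[name])
    for name in FLOAT_COLUMNS:
        row[name] = float(row[name])
    return row


def read_pairs(stream, has_header=True):
    """Yield one dict per IsA pair from a tab-separated text stream.

    has_header is an assumption: set it to False if the file turns out
    not to start with the column-name line.
    """
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # skip the column-name line
    for fields in reader:
        if fields:  # ignore blank lines
            yield parse_row(fields)
```

To read from disk, a stream opened with `open(path, encoding="utf-8", newline="")` can be passed to `read_pairs`, where `path` is wherever the downloaded file was saved. Because the function is a generator, the 11,292,493 pairs are processed one at a time and need not all fit in memory.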