Code Generation

Contains 80 questions from LeetCode weekly and bi-weekly contests released after March 2024.
Each question includes an average of 644 test cases, along with Python reference solutions collected from the official LeetCode website. The input fields of the dataset contain function headers and natural-language problem descriptions, and the benchmark is mainly used to evaluate the ability of large language models to solve programming problems from stated requirements.
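Since each question pairs a reference solution with executable test cases, evaluation typically means running a model's generated function against those cases. The sketch below is a minimal, hypothetical harness; the field names (`"input"`, `"expected"`) and the helper `run_test_cases` are assumptions for illustration, not the dataset's actual schema.

```python
def run_test_cases(solution_src: str, func_name: str, tests: list[dict]) -> float:
    """Execute generated code and return the fraction of passing test cases."""
    namespace: dict = {}
    exec(solution_src, namespace)          # load the generated function
    func = namespace[func_name]
    passed = 0
    for case in tests:
        try:
            if func(*case["input"]) == case["expected"]:
                passed += 1
        except Exception:
            pass                           # runtime errors count as failures
    return passed / len(tests)

# Toy usage with a made-up question:
src = "def add_two(a, b):\n    return a + b"
tests = [{"input": (1, 2), "expected": 3}, {"input": (0, 0), "expected": 0}]
print(run_test_cases(src, "add_two", tests))  # 1.0
```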
- Categories:

CodePromptEval is a dataset of 7072 prompts designed to evaluate five prompt techniques (few-shot, persona, chain-of-thought, function signature, list of packages) and their effect on the correctness, similarity, and quality of generated complete functions. Each data point in the dataset includes a function-generation task, a combination of prompt techniques to be applied, the natural-language prompt that applies those techniques, the ground-truth functions (human-written functions based on the CoderEval dataset by Yu et al.), and the tests used to evaluate the correctness of the generated functions.
- Categories:

M. Kacmajor and J.D. Kelleher, "ExTra: Evaluation of Automatically Generated Source Code Using Execution Traces" (submitted to IEEE TSE)
- Categories: