This dataset is intended for studying how programming style and IDE usage differ between students who plagiarise their homework and students who solve it honestly. The dataset includes homework submitted by students in two introductory programming courses (A and B), each delivered in two years (2016 and 2017). Course A is taught in the C programming language, while course B is taught in C++. In addition to the homework, the dataset includes full traces of all student activity and keystrokes during homework development.


The archive consists of three parts:

SOURCE CODES: The homework actually submitted by students (i.e. their source code) is stored in the folder "src". Its subfolders are named after the courses: A2016, A2017, B2016 and B2017. These in turn contain subfolders for individual assignments. In each course students were required to solve 16-22 assignments labeled "Z1/Z1", "Z1/Z2", "Z2/Z1" etc. Finally, each assignment folder contains the actual C or C++ files, named after the student (anonymized, so actual student names were replaced by strings of the form "student1393").

TRACES: IDE usage traces are stored in the folder "stats". Again, this folder is organized into subfolders named after the courses. These subfolders contain files in JSON format, named after the (anonymized) student, with the extension .stats. The format of the JSON files is described in the readme.txt file.

GROUND TRUTH: The ground truth lists students and groups of students who are considered to have been involved in plagiarism due to code similarity and failure to deliver an "oral defense". There are three ground truth files: ground-truth-anon.txt contains the full list of plagiarism cases, ground-truth-static-anon.txt only those based on source code similarity, and ground-truth-dynamic-anon.txt only those based on failure to pass an "oral defense". There is some overlap between the last two files. Each file has the following format: the homework assignment in the form "- A2016/Z1/Z1" (dash, space, course name, slash, assignment name), followed by lists of anonymized names of students (such as "student3241") or groups of students who plagiarised from one another, separated by commas.
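The folder layout described above can be navigated with a few lines of code. The following is a minimal sketch, not part of the archive: the function names are hypothetical, it assumes that an assignment label such as "Z1/Z2" corresponds to two nested folders, and the ".c" extension in the usage note is only an example.

```python
from pathlib import PurePosixPath


def parse_submission_path(path):
    """Split 'src/<course>/<Zx>/<Zy>/<student>.<ext>' into its parts.

    Assumption: the assignment label "Zx/Zy" maps to two nested folders.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 5 or parts[0] != "src":
        raise ValueError(f"unexpected submission path: {path}")
    _, course, group, task, filename = parts
    student = PurePosixPath(filename).stem  # drop the file extension
    return course, f"{group}/{task}", student


def trace_path(course, student):
    """Build the path of the IDE trace file for one student in one course."""
    return str(PurePosixPath("stats") / course / f"{student}.stats")
```

For example, parse_submission_path("src/A2016/Z1/Z2/student1393.c") yields ("A2016", "Z1/Z2", "student1393"), and trace_path("A2016", "student1393") yields "stats/A2016/student1393.stats", which links a submission to the trace of the same student.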


This paper presents a fast, open-source extension based on the NSGA-II code stored in the repository of the Kanpur Genetic Algorithms Laboratory (KanGAL) and on an adjustment of the selection operator. It slightly modifies three well-established genetic algorithms for many-objective optimization: NSGA-III, the adaptive NSGA-III (A-NSGA-III), and the efficient adaptive NSGA-III (A$^2$-NSGA-III).


All primes greater than 3 can be indexed by $k$, as they must be of the form $6k+1$ or $6k-1$. In this paper, we explore for which $k$ either $6k+1$ or $6k-1$ is not a prime. The results can be used to sieve primes, and especially twin primes.
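A sieve over the index $k$ can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of the paper: it relies only on the fact that a composite $6k-1$ factors as $(6i+1)(6j-1)$ and a composite $6k+1$ factors as $(6i-1)(6j-1)$ or $(6i+1)(6j+1)$ with $i, j \geq 1$, and the function names are hypothetical.

```python
def composite_index_sets(limit):
    """Return (s_l, s_r) for 1 <= k <= limit.

    s_l holds every k with 6k-1 composite, s_r every k with 6k+1 composite.
    """
    s_l, s_r = set(), set()
    for i in range(1, limit + 1):
        # 6k-1 = (6i+1)(6j-1)  =>  k = j*(6i+1) - i,  j >= 1
        s_l.update(range(5 * i + 1, limit + 1, 6 * i + 1))
        # 6k+1 = (6i-1)(6j-1)  =>  k = j*(6i-1) - i,  j >= 1
        s_r.update(range(5 * i - 1, limit + 1, 6 * i - 1))
        # 6k+1 = (6i+1)(6j+1)  =>  k = j*(6i+1) + i,  j >= 1
        s_r.update(range(7 * i + 1, limit + 1, 6 * i + 1))
    return s_l, s_r


def twin_prime_indices(limit):
    """Return every k <= limit for which 6k-1 and 6k+1 are both prime."""
    s_l, s_r = composite_index_sets(limit)
    return [k for k in range(1, limit + 1) if k not in s_l and k not in s_r]
```

For instance, twin_prime_indices(20) returns [1, 2, 3, 5, 7, 10, 12, 17, 18], which corresponds to the twin prime pairs (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73), (101, 103) and (107, 109).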


$k \in S_{l} \Rightarrow 6k-1 \not\in \mathbb{P}$, $k \in S_{r} \Rightarrow 6k+1 \not\in \mathbb{P}$, where $S_{l} = [-I]_{6I+1} = [I]_{6I-1} \backslash \min([I]_{6I-1})$, $I \in \mathbb{N}$, and $S_{r} = [-I]_{6I-1} \cup [I]_{6I+1} \backslash \min([I]_{6I+1})$, $I \in \mathbb{N}$. That is,


We study a reverse problem: given a reduced dynamics or a partial dynamics, can we compute a residue class that presents that dynamics?


We design a computer program that randomly generates extremely large integers and outputs their original dynamics. The source code is txpo10b.c. The bit length of the integers is defined by a macro (named MAXLEN) in the source code. The number of randomly generated integers is set by an input argument. The program outputs the original dynamics of a starting integer as a sequence of symbols, where “-” represents (3*x+1)/2 and “0” represents x/2. This data can be used to observe the relation between the count of “-” and the count of “0”.
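The encoding described above can be illustrated with a short sketch. It is not the code of txpo10b.c: the function names and the `bits` parameter (standing in for MAXLEN) are hypothetical, and it assumes that the original dynamics is recorded until the value 1 is reached, applying (3*x+1)/2 to odd x and x/2 to even x.

```python
import random


def original_dynamics(x):
    """Encode the trajectory of x: '-' for (3*x+1)/2, '0' for x/2.

    Assumption: the trajectory is followed until it reaches 1.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    steps = []
    while x != 1:
        if x % 2:                  # odd: apply (3*x+1)/2
            x = (3 * x + 1) // 2
            steps.append("-")
        else:                      # even: apply x/2
            x //= 2
            steps.append("0")
    return "".join(steps)


def random_start(bits):
    """Draw a random integer with exactly `bits` bits (top bit forced to 1)."""
    return random.getrandbits(bits) | (1 << (bits - 1))


def symbol_counts(dynamics):
    """Return (count of '-', count of '0') for one dynamics string."""
    return dynamics.count("-"), dynamics.count("0")
```

For example, original_dynamics(7) gives "---0-00-000", so symbol_counts reports 5 occurrences of “-” and 6 of “0”. Calling original_dynamics(random_start(1000)) produces the same kind of output for a random 1000-bit starting integer.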