AI-Driven Crop Recommendation Dataset for Advancing Precision Farming in Zotlang, Champhai District, Mizoram, North-East India

Citation Author(s):
Zaitinkhuma Thihlum, Mizoram University
V.D. Ambeth Kumar, Mizoram University
Ajoy Kumar Khan, Mizoram University
Vanlal hruaia, Mizoram University
Mayanglambam Sushilata Devi, Mizoram University
Saithan tluanga, Mizoram University
Rajdip Roy, Mizoram University
Submitted by:
Zaitinkhuma Thihlum
Last updated:
Sun, 03/02/2025 - 12:52
DOI:
10.21227/a6gs-xd55

Abstract:

Agriculture is the backbone of Mizoram’s state economy, as the majority of the people depend on agriculture and its allied sectors for their livelihood. According to the 2011 census, more than 50% of the people are still engaged in agriculture and related activities. Jhum cultivation, or shifting cultivation, is the primary farming pattern in the state. However, this traditional method is no longer effective or productive, for reasons such as resource limitations caused by population pressure and a jhum cycle shortened to 3-4 years (the ideal cycle is 14-18 years). Furthermore, it is considered unsustainable because of deforestation, biodiversity loss, and vulnerability to climate change (the crops grown on jhum land are rainfed). Recognizing this, policymakers in Mizoram have implemented various land use policies (which have changed with each change of government) in an attempt to move all farmers away from jhum farming by introducing more sustainable methods such as terrace farming, wet rice cultivation, and organic farming. Since then, plantation crops, wet rice cultivation, and forest areas have expanded, whereas shifting cultivation areas have decreased. However, the total prohibition of jhum cultivation, which had been the program's primary goal, was not achieved, and the method remains in use today.

The major challenge faced by Mizoram farmers, particularly Jhumias (those who practice jhum cultivation), is that they choose crops based on traditional farming knowledge without considering the factors that influence production and yield, which leads to low productivity, soil degradation, inefficient resource utilization, and vulnerability to climate variability. Artificial intelligence (AI) techniques such as Machine Learning (ML) offer a potential way to mitigate these challenges. However, because ML models are data-driven, relevant data must be available to train them. As no such real-time data exists for Mizoram, this dataset is intended to benefit future researchers and to support AI-based precision farming.

The dataset was collected in Zotlang, Champhai District, Mizoram, North-East India, from September to November 2024. The data were acquired using a 7-in-1 Integrated Soil Sensor (EC, pH, NPK, moisture, and temperature meter), a Soil Moisture Sensor, and a DHT11 Humidity & Temperature Sensor, all connected to an ESP32 WROOM for real-time monitoring. Furthermore, local farmers' perspectives were incorporated to reflect real-world agricultural realities. The dataset comprises 89 samples with seven independent variables: Nitrogen (mg/kg), Phosphorus (mg/kg), Potassium (mg/kg), Soil Moisture (in %), Temperature (in °C), Humidity (in %), and Farming Method (Jhum, Terrace, or WRC (wet rice cultivation)). The crop label is the dependent variable (Rice: 44 samples; Maize: 45 samples).

The file is in CSV format, well-structured, and tailored for machine learning classification tasks. Before training ML models, it is advisable to scale the numerical features (e.g., with MinMaxScaler or StandardScaler) and to one-hot encode the categorical feature (Farming_Method). This work promotes data-driven agricultural decision-making by providing the first real-time soil information from Mizoram, and it supports research into AI-based precision farming.

Instructions: 

Dataset Structure:
The dataset comprises 89 samples with seven independent variables:

  • Nitrogen (mg/kg)

  • Phosphorus (mg/kg)

  • Potassium (mg/kg)

  • Soil Moisture (in %)

  • Temperature (in °C)

  • Humidity (in %)

  • Farming Method (Jhum, Terrace, WRC - Wet Rice Cultivation)

The dependent variable is the crop label (Rice: 44 instances; Maize: 45 instances).
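
As a minimal sketch of how the structure described above could be checked after download, the snippet below parses CSV text with Python's standard library, validates the header and the farming-method values, and counts the labels. Apart from Farming_Method, which the abstract names, every column name here is an assumption, and the three sample rows are made up for illustration; both should be adjusted to the actual file.

```python
import csv
import io
from collections import Counter

# Hypothetical header: only "Farming_Method" is named in the dataset
# description; the other column names are assumptions and may differ.
NUMERIC_COLUMNS = ["Nitrogen", "Phosphorus", "Potassium",
                   "Soil_Moisture", "Temperature", "Humidity"]
EXPECTED_COLUMNS = NUMERIC_COLUMNS + ["Farming_Method", "Label"]
ALLOWED_METHODS = {"Jhum", "Terrace", "WRC"}


def load_rows(handle):
    """Read CSV rows, validate the header, and convert numeric fields to float."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        if row["Farming_Method"] not in ALLOWED_METHODS:
            raise ValueError(f"unexpected farming method: {row['Farming_Method']}")
        for name in NUMERIC_COLUMNS:
            row[name] = float(row[name])  # raises ValueError on empty or bad cells
        rows.append(row)
    return rows


# Made-up rows for illustration only (not values from the dataset).
# For the real file, pass an open file object instead of io.StringIO.
SAMPLE_CSV = """Nitrogen,Phosphorus,Potassium,Soil_Moisture,Temperature,Humidity,Farming_Method,Label
40,20,35,30.5,24.0,70,Jhum,Maize
55,25,40,62.0,23.5,78,WRC,Rice
48,22,38,45.0,22.0,75,Terrace,Rice
"""

rows = load_rows(io.StringIO(SAMPLE_CSV))
label_counts = Counter(row["Label"] for row in rows)
print(len(rows), dict(label_counts))
```

Run against the full file, the same check would be expected to report 89 rows with 44 Rice and 45 Maize labels, per the description above.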

The dataset is suitable for ML classification tasks. There are no missing values in any column. Scaling methods such as MinMaxScaler or StandardScaler are recommended for the numerical variables, and one-hot encoding for the categorical variable. Outlier detection and treatment, for example replacing outliers with the group mean or group median, may further improve classifier training. Because the dataset is small, do not remove outlier rows if any are found, to avoid information loss. Evaluation metrics such as accuracy, precision, recall, and F1-score may be used to evaluate the trained models.
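
The preprocessing advice above can be illustrated with a dependency-free sketch. It replaces outliers with the median of their label group (keeping every row), then applies the same [0, 1] mapping that MinMaxScaler uses by default, and one-hot encodes the farming method. The 1.5 × IQR fence used to flag outliers is an assumption made for this example, since the description does not specify a detection rule, and the nitrogen values are invented.

```python
from statistics import median, quantiles


def replace_outliers_with_group_median(values, labels, k=1.5):
    """Within each label group, replace values outside the IQR fences
    with that group's median. No rows are dropped."""
    cleaned = list(values)
    for label in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        group = [values[i] for i in idx]
        if len(group) < 4:
            continue  # too few points to estimate quartiles
        q1, _, q3 = quantiles(group, n=4, method="inclusive")
        spread = k * (q3 - q1)  # assumed rule: 1.5 x IQR fences
        low, high = q1 - spread, q3 + spread
        group_median = median(group)
        for i in idx:
            if not low <= values[i] <= high:
                cleaned[i] = group_median
    return cleaned


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(categories, levels=("Jhum", "Terrace", "WRC")):
    """Encode each category as a 0/1 vector ordered by `levels`."""
    return [[1 if c == level else 0 for level in levels] for c in categories]


# Invented values for illustration; 120 is a deliberate outlier in the Maize group.
nitrogen = [40, 42, 41, 43, 120, 55, 57, 56, 58, 54]
labels = ["Maize"] * 5 + ["Rice"] * 5

cleaned = replace_outliers_with_group_median(nitrogen, labels)
scaled = min_max_scale(cleaned)
encoded = one_hot(["Jhum", "WRC"])
print(cleaned)
print(encoded)
```

When a train/test split is used, the scaling bounds and group medians should be computed on the training portion only and then applied to the test portion, so that no information leaks from the held-out samples.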

     

Funding Agency: 
IIT Bhilai Innovation and Technology Foundation (IBITF)
Grant Number: 
IBITF/Note/TSP/SanctionLetter/2023-24/0252