Figure 7 shows that the proposed optimized algorithm reaches the highest cumulative reward, 41238.8, at the 300th time point, whereas the Greedy and Random algorithms reach only 38713 and 29328, respectively. The corresponding average rewards are 137.5 for the optimized algorithm, 129 for the Greedy algorithm, and 97.8 for the Random algorithm. IoT technology plays a crucial role in the algorithm's optimization by supporting information sharing and interaction among devices, which in turn improves the accuracy and efficiency of decision-making.
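The relationship between the cumulative and average rewards can be checked with a short calculation. The sketch below assumes that the average reward is the cumulative reward divided by the 300 time points, an interpretation that is consistent with the reported figures but is not stated explicitly in the text. The names `average_reward` and `relative_gain` are illustrative helpers, not part of the original method.

```python
# Cumulative rewards at the 300th time point, as reported for Figure 7.
TIME_POINTS = 300
cumulative_reward = {
    "Optimized": 41238.8,
    "Greedy": 38713.0,
    "Random": 29328.0,
}


def average_reward(total, steps=TIME_POINTS):
    """Average reward per time point (assumed: cumulative reward / steps)."""
    return total / steps


def relative_gain(candidate, baseline):
    """Percentage by which `candidate` exceeds `baseline`."""
    return (candidate - baseline) / baseline * 100.0


# Average reward per algorithm, rounded to one decimal place.
averages = {
    name: round(average_reward(total), 1)
    for name, total in cumulative_reward.items()
}

# Gain of the optimized algorithm over each baseline, in percent.
gains = {
    name: round(relative_gain(cumulative_reward["Optimized"], total), 1)
    for name, total in cumulative_reward.items()
    if name != "Optimized"
}

for name, value in averages.items():
    print(f"{name}: average reward {value}")
for name, value in gains.items():
    print(f"Optimized vs {name}: +{value}% cumulative reward")
```

Under this assumption the computed averages match the reported 137.5, 129, and 97.8, and the optimized algorithm's cumulative reward is roughly 6.5% higher than Greedy and 40.6% higher than Random.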