Abstract
Maintaining the silicon content of hot metal within a stipulated range is necessary to improve the productivity and reduce the energy consumption of a blast furnace. Significant research has therefore gone into developing data-driven models that predict hot metal silicon content in real time. However, these models use only a small subset of blast furnace variables, chosen using prior process knowledge. As each blast furnace is unique in its operation, using pre-selected variables would lead to sub-optimal models. To address this, a machine learning-based ensemble feature selection and modeling approach is proposed. In this approach, all available furnace variables are ranked by their impact on silicon content using multiple feature selection techniques. The individual ranks are combined to obtain an ensemble ranking of variables, and the top variables in the ranking are used to build data-driven silicon prediction models. The approach is applied to an industrial blast furnace, where 374 variables are used to obtain the ensemble ranking. While some of the top 100 variables in the ensemble ranking match those commonly used in silicon prediction models, several new variables are also identified. Silicon prediction models trained using the top 100 variables achieved a hit rate of ~90%, demonstrating the efficacy of the proposed approach. Real-time predictions from the models will enable blast furnace operators to control the silicon content without having to wait for laboratory results.
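The rank-then-combine idea described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the abstract does not name the feature selection techniques or the rule for combining ranks, so two simple scorers (absolute Pearson correlation and standardized least-squares coefficients) and a mean-rank aggregation stand in for them, and the data are synthetic rather than furnace measurements. All function names are hypothetical.

```python
import numpy as np


def rank_descending(scores):
    """Convert importance scores to ranks, where rank 1 is the most important."""
    order = np.argsort(-scores, kind="stable")
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks


def abs_correlation_scores(X, y):
    """Stand-in scorer 1: absolute Pearson correlation of each column with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom


def least_squares_scores(X, y):
    """Stand-in scorer 2: absolute coefficients of a fit on standardized columns."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    coef, *_ = np.linalg.lstsq(Xs, y - y.mean(), rcond=None)
    return np.abs(coef)


def ensemble_ranking(X, y, scorers):
    """Average the per-scorer ranks (assumed rule) and sort variables by that mean."""
    rank_matrix = np.vstack([rank_descending(s(X, y)) for s in scorers])
    mean_rank = rank_matrix.mean(axis=0)
    return np.argsort(mean_rank, kind="stable")  # variable indices, best first


# Synthetic data: 20 candidate variables, of which only columns 3 and 7 drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + 0.1 * rng.normal(size=500)

ranking = ensemble_ranking(X, y, [abs_correlation_scores, least_squares_scores])
top_k = ranking[:2]  # the top-ranked variables would feed the prediction model
```

In practice the scorers would be replaced by whichever feature selection techniques are actually used, and the cut-off (2 here, 100 in the abstract) would be chosen for the furnace at hand.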