Persons

Петров Евгений Николаевич
PhD student of the Institute of System and Software Engineering and Computer Science, National Research University of Electronic Technology (Russia, 124498, Moscow, Zelenograd, Shokin sq., 1)

Article author

Modern supervised machine learning algorithms classify objects using a feature description. Such a description may include a large number of features if the task demands it. This work describes genetic-algorithm-based feature selection as part of a software complex for bibliographic data processing. The problem situation in the subject area, related to the size of the feature description of bibliographic data objects, is analyzed. A method of solving this problem by means of genetic algorithm feature selection is proposed. The paper presents the general principles of the software model and the implementation details in the Python programming language. The problem of feature description and retraining in bibliographic data processing has been solved: training and retraining are shown to accelerate without loss of classification quality. The developed software for genetic algorithm feature selection can be applied within the software complex for bibliographic data processing. The computational experiment yielded the following results: the number of features used decreased from 26 to 15, and classification quality increased by 3% owing to the elimination of features that contribute to overfitting.
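As an illustration of the approach named in this abstract, the following is a minimal sketch of genetic algorithm feature selection in Python. It is not the author's implementation: the function `select_features`, its parameters, the tournament and single-point crossover operators, and the toy fitness function are all assumptions made for the example. In practice the fitness function would train and validate a classifier on the features kept by the mask.

```python
import random


def select_features(fitness, n_features, pop_size=30, generations=40,
                    mutation_rate=0.05, seed=0):
    """Return the best 0/1 feature mask found by a simple genetic algorithm.

    `fitness` is a caller-supplied function (hypothetical here) that maps a
    mask to a score; higher is better.
    """
    rng = random.Random(seed)

    def tournament(population):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    # Each individual is a bit mask: 1 keeps the feature, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        children = [best[:]]  # elitism: the current best always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Toy stand-in for classifier quality (an assumption for this sketch):
# even-indexed features are "informative", all others only add noise.
INFORMATIVE = set(range(0, 26, 2))


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    return hits - 0.3 * noise


best_mask = select_features(toy_fitness, n_features=26)
print(sum(best_mask), "of 26 features kept")
```

The mask length of 26 mirrors the initial feature count reported in the abstract; everything else in the sketch is illustrative.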

The complexity of bibliographic data processing lies in the variety of acceptable standards and the lack of multifunctional software that can be extended to new formats and can process data containing minor errors. This work describes dynamic control of software input/output modules as part of a software complex for bibliographic data processing. The problem situation in the subject area, related to multiformat bibliographic data processing, is analyzed. A method of solving this problem is proposed: the processing modules are placed outside the functional core, and the system is built as a decomposed, extensible one. The article presents the general principles of the software model and the implementation details in the Python programming language. The problem of multiple acceptable bibliographic data standards and numerous proprietary formats used by organizations engaged in bibliographic data processing has been solved. The developed software for dynamic input/output control can be applied within the software complex for bibliographic data processing.
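One way to keep format-specific modules outside the core, as this abstract describes, is a reader registry. The sketch below is a minimal Python illustration under stated assumptions, not the author's design: the names `register_reader` and `read_records`, the "tagged" line format, and the rule of skipping malformed lines are all hypothetical.

```python
import json

# Core registry: maps a format name to the function that parses it.
_READERS = {}


def register_reader(fmt):
    """Decorator that registers a reader for the given format name."""
    def decorator(func):
        _READERS[fmt] = func
        return func
    return decorator


def read_records(fmt, text):
    """Core entry point: dispatch to whichever reader was registered."""
    try:
        reader = _READERS[fmt]
    except KeyError:
        raise ValueError(f"no reader registered for format {fmt!r}") from None
    return reader(text)


# The readers below stand in for modules living outside the core.
@register_reader("json")
def read_json(text):
    return json.loads(text)


@register_reader("tagged")
def read_tagged(text):
    # Hypothetical "TAG: value" format; a blank line separates records.
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                records.append(current)
                current = {}
            continue
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # tolerate a malformed line instead of failing
        current[tag.strip()] = value.strip()
    if current:
        records.append(current)
    return records


sample = "AU: Petrov\nTI: Title\nbroken line\n\nAU: X\n"
print(read_records("tagged", sample))
```

Supporting a new format then means adding one decorated function, with no change to `read_records`.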

124498, Moscow, Zelenograd, Bld. 1, Shokin Square, MIET, editorial office of the Journal "Proceedings of Universities. Electronics", room 7231

+7 (499) 734-62-05
magazine@miee.ru