A self-organizing map (SOM) is used to classify software documents and their associated software components, with the aim of facilitating software reuse. Because a SOM learns from input stimuli rather than training data, the quality of the input data representation is crucial to its success. In this paper, we use an automatic indexing method to represent a document collection as the input data for training a SOM. The automatic indexing uses a phrase formation method to promote precision and a domain-dependent relational thesaurus to enhance recall. A retrieval experiment on a document collection of 97 Unix manual pages was conducted to evaluate the effectiveness of this input data representation scheme. Promising retrieval results were observed.
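The pipeline summarized above (index documents into vectors, then train a SOM on those vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy documents, the one-entry thesaurus, the 3x3 map size, and the learning-rate and radius schedules are all assumptions chosen for illustration, and the phrase formation step is omitted. Thesaurus expansion is modeled simply as adding related terms to a document's term list.

```python
import math
import random
import re


def index_documents(docs, thesaurus=None):
    """Turn raw texts into term-frequency vectors over a shared vocabulary.

    `thesaurus` is a hypothetical dict mapping a term to related terms; each
    related term is appended to the document's term list (a crude stand-in
    for recall-oriented expansion).
    """
    thesaurus = thesaurus or {}
    term_lists = []
    for text in docs:
        terms = re.findall(r"[a-z]+", text.lower())
        expanded = list(terms)
        for t in terms:
            expanded.extend(thesaurus.get(t, []))
        term_lists.append(expanded)
    vocab = sorted({t for terms in term_lists for t in terms})
    vectors = [[float(terms.count(t)) for t in vocab] for terms in term_lists]
    return vocab, vectors


def best_matching_unit(weights, vec):
    """Return the index of the map unit whose weight vector is closest to vec."""
    def dist2(w):
        return sum((a - b) ** 2 for a, b in zip(w, vec))
    return min(range(len(weights)), key=lambda i: dist2(weights[i]))


def train_som(vectors, rows=3, cols=3, epochs=50, seed=0):
    """Train a rows x cols SOM with a Gaussian neighborhood (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                      # decaying learning rate
        radius = max(1.0, (max(rows, cols) / 2.0) * (1.0 - frac))
        for vec in vectors:
            bmu = best_matching_unit(weights, vec)
            br, bc = divmod(bmu, cols)
            for i, w in enumerate(weights):
                r, c = divmod(i, cols)
                d2 = (r - br) ** 2 + (c - bc) ** 2   # grid distance to the BMU
                h = math.exp(-d2 / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (vec[k] - w[k])
    return weights


# Toy data, invented for illustration (not drawn from the paper's collection).
docs = [
    "copy files and directories",
    "remove files or directories",
    "print the current date and time",
]
thesaurus = {"remove": ["delete"]}  # hypothetical thesaurus entry

vocab, vectors = index_documents(docs, thesaurus)
weights = train_som(vectors)
units = [best_matching_unit(weights, v) for v in vectors]
```

After training, `units` holds the map unit assigned to each document. In a retrieval setting, a query would be indexed with the same vocabulary and matched to its nearest unit, with the documents mapped to that unit (or its neighbors) returned as candidates.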
Source title
Proceedings of the 2nd International Symposium on Knowledge Acquisition and Modeling 2009
Name of conference
2nd International Symposium on Knowledge Acquisition and Modeling, 2009 (KAM '09)
Location
Huazhong Normal University, China
Start date
2009-11-30
End date
2009-12-01
Pagination
350-353
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Place published
Piscataway, NJ
Language
en, English
College/Research Centre
Faculty of Engineering and Built Environment
School
School of Electrical Engineering and Computer Science