A self-organizing map (SOM) is used to classify software documents and their associated software components, with the aim of facilitating software reuse. Because a SOM learns from input stimuli rather than from labeled training data, the quality of the input data representation is crucial to its success. In this paper, we use an automatic indexing method to represent a document collection as the input data for training a SOM. The indexing method uses phrase formation to promote precision and a domain-dependent relational thesaurus to enhance recall. A retrieval experiment on a collection of 97 Unix manual pages was conducted to evaluate the effectiveness of this input data representation scheme. Promising retrieval results were observed.
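
To make the pipeline concrete, the following is a minimal sketch of how indexed documents might be fed to a SOM. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the four toy documents, the thesaurus entries, the 2 x 2 map size, the learning-rate and neighborhood schedules, and the names `DOCS`, `THESAURUS`, `index_vectors`, `best_matching_unit`, and `train_som` are all hypothetical. Phrase formation is assumed to have already happened, so a multiword phrase such as "printer queue" is treated as a single index term.

```python
import math
import random

# Hypothetical toy inputs: index terms per document (not the paper's collection).
DOCS = {
    "cp": ["copy", "file"],
    "mv": ["move", "file", "rename"],
    "lpr": ["print", "printer queue"],
    "lpq": ["printer queue", "status"],
}
# Hypothetical thesaurus: term -> related terms added to broaden recall.
THESAURUS = {"move": ["rename"], "print": ["printer queue"]}


def index_vectors(docs, thesaurus):
    """Return (vocabulary, {doc: binary term vector}) after thesaurus expansion."""
    expanded = {}
    for name, terms in docs.items():
        bag = set(terms)
        for term in terms:
            bag.update(thesaurus.get(term, []))
        expanded[name] = bag
    vocab = sorted(set().union(*expanded.values()))
    vectors = {
        name: [1.0 if term in bag else 0.0 for term in vocab]
        for name, bag in expanded.items()
    }
    return vocab, vectors


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(vectors, rows=2, cols=2, epochs=200, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns weight vectors in row-major order."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # linearly decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)    # shrinking neighborhood, floored
        for x in rng.sample(vectors, len(vectors)):  # shuffled pass over the inputs
            bmu = best_matching_unit(weights, x)
            br, bc = divmod(bmu, cols)
            for i, w in enumerate(weights):
                r, c = divmod(i, cols)
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


vocab, vecs = index_vectors(DOCS, THESAURUS)
som = train_som(list(vecs.values()))
# Map each document to its best matching node on the trained map.
placement = {name: best_matching_unit(som, v) for name, v in vecs.items()}
print(placement)
```

In this sketch, retrieval would amount to indexing a query in the same way, locating its best matching node, and returning the documents placed on or near that node. Binary term weights are used only for brevity; a weighted scheme could be substituted without changing the training loop.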