Because of the semantic gap, images with the same or similar low-level features may differ at the semantic level. Finding the underlying relationship between high-level semantics and low-level features is one of the difficult problems in image annotation. In this paper, after a detailed analysis of the advantages and disadvantages of existing image annotation methods, we propose a new image annotation method based on graph spectral clustering with semantic consistency. The proposed method first clusters images into several semantic classes using a semantic similarity measure in the semantic subspace. Within each semantic class, images are then re-clustered using the visual features of their regions. Next, the joint probability distribution of blobs and words is modeled using the Multiple-Bernoulli Relevance Model, and an unannotated image is annotated using this joint distribution. Experimental results show the effectiveness of the proposed approach in terms of annotation quality, and indicate that consistency between high-level semantics and low-level features is achieved efficiently.
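To make the final annotation step concrete, the following is a minimal sketch of how words might be ranked for an unannotated image with a Multiple-Bernoulli-style relevance model. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the isotropic Gaussian kernel, the uniform prior over training images, the smoothing parameter `mu`, the `bandwidth` value, and all function names and toy data are hypothetical choices. In the proposed method the training set passed to such a routine would be restricted to the images of one cluster; the clustering stages themselves are not shown.

```python
import math


def word_prob(word, image_words, word_counts, total_images, mu=1.0):
    """Smoothed Bernoulli probability that `word` annotates one training image.

    Assumed form: (mu * delta + N_w) / (mu + N), where delta is 1 if the word
    is in the image's annotation, N_w is the number of training images that
    carry the word, and N is the number of training images.
    """
    delta = 1.0 if word in image_words else 0.0
    return (mu * delta + word_counts.get(word, 0)) / (mu + total_images)


def region_prob(region, image_regions, bandwidth=1.0):
    """Kernel density of a query region under the regions of one training image.

    Uses an isotropic Gaussian kernel (an assumption made for this sketch).
    """
    dim = len(region)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for g in image_regions:
        sq_dist = sum((a - b) ** 2 for a, b in zip(region, g))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm
    return total / len(image_regions)


def annotate(query_regions, training, top_k=2, mu=1.0, bandwidth=1.0):
    """Rank vocabulary words by P(word | query regions).

    `training` is a list of (regions, words) pairs, where `regions` is a list
    of feature tuples and `words` is a set of annotation words.
    """
    vocab = sorted({w for _, words in training for w in words})
    word_counts = {w: sum(w in words for _, words in training) for w in vocab}
    n = len(training)

    scores = dict.fromkeys(vocab, 0.0)
    evidence = 0.0
    for regions, words in training:
        # Uniform prior over training images times the region likelihood.
        lik = 1.0 / n
        for r in query_regions:
            lik *= region_prob(r, regions, bandwidth)
        evidence += lik
        for w in vocab:
            scores[w] += lik * word_prob(w, words, word_counts, n, mu)

    ranked = sorted(vocab, key=lambda w: scores[w], reverse=True)
    if evidence == 0.0:
        return [(w, 0.0) for w in ranked[:top_k]]
    return [(w, scores[w] / evidence) for w in ranked[:top_k]]


# Toy training set (hypothetical data): two visually distinct groups of images.
TRAINING = [
    ([(0.0, 0.0), (0.1, 0.2)], {"sky", "sea"}),
    ([(0.2, 0.1)], {"sky"}),
    ([(5.0, 5.0), (5.1, 4.9)], {"tiger", "grass"}),
    ([(4.8, 5.2)], {"grass"}),
]
```

Calling `annotate([(0.05, 0.1)], TRAINING)` returns the highest-ranked words together with their estimated probabilities. Because each word is modeled as a separate Bernoulli variable, the returned probabilities need not sum to one across the vocabulary.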