A vision-based two-handed interaction approach for augmented reality is described. To achieve real-time performance, the hand model and gestures are simplified to three key points on each hand: the tip of the index finger, the tip of the thumb, and the concave point between them. Moreover, a user intention recognition method is applied for the interaction. This method allows operations to be added or changed easily by modifying the user intention recognition rules. Finally, a prototype system is developed to demonstrate the effectiveness and robustness of the proposed approach.
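As an illustration of how rule-based intention recognition from the three key points might work, the sketch below maps the geometric relations among an index fingertip, thumb tip, and the concave point between them to intention labels. The rule names, thresholds, and function signatures are hypothetical assumptions, not the paper's actual rules; the point is that adding an operation only means adding a rule.

```python
import math

def distance(p, q):
    """Euclidean distance between two 2-D image points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def recognize_intention(index_tip, thumb_tip, concave_pt,
                        pinch_threshold=20.0):
    """Hypothetical rules mapping the three key points to an intention.

    index_tip, thumb_tip, concave_pt: (x, y) points in pixels.
    Returns one of "grab", "release", "idle" (illustrative labels).
    """
    pinch = distance(index_tip, thumb_tip)
    spread = distance(index_tip, concave_pt) + distance(thumb_tip, concave_pt)
    if pinch < pinch_threshold:
        return "grab"      # fingertips pinched together: pick up a virtual object
    if pinch > 0.9 * spread:
        return "release"   # hand fully opened: drop the object
    return "idle"          # no actionable gesture

# A new operation can be supported by appending another rule here,
# without touching the hand-tracking stage that supplies the key points.
```

Because the rules only read the three key points, the tracking front end and the interaction logic stay decoupled, which is what makes the operations easy to modify.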