In this paper, a feature-based technique for the camera pose estimation in a sequence of wide-baseline images has been proposed. Camera pose estimation is an important issue in many computer vision and robotics applications, such as, augmented reality and visual SLAM. The proposed method can track captured images taken by hand-held camera in room-sized workspaces with maximum scene depth of 3-4 meters. The system can be used in unknown environments with no additional information available from the outside world except in the first two images that are used for initialization. Pose estimation is performed using only natural feature points extracted and matched in successive images. In wide-baseline images unlike consecutive frames of a video stream, displacement of the feature points in consecutive images is notable and hence cannot be traced easily using patch-based methods. To handle this problem, a hybrid strategy is employed to obtain accurate feature correspondences. In this strategy, first initial feature correspondences are found using similarity of their descriptors and then outlier matchings are removed by applying RANSAC algorithm. Further, to provide a set of required feature matchings a mechanism based on sidelong result of robust estimator was employed. The proposed method is applied on indoor real data with images in VGA quality (640×480 pixels) and on average the translation error of camera pose is less than 2 cm which indicates the effectiveness and accuracy of the proposed approach.