I am assuming each image contains a single object.
It is possible, but it is not as easy as you might think. First, extract as many features as possible: the original image, LBP, SIFT, moments, and contour descriptors, to name a few. Then concatenate these features into a single feature vector per image. After this step, cluster the feature vectors. You will need a lot of samples to compensate for the number of features. After clustering, use a correlation method to find which features are related to each cluster.
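To make the steps above concrete, here is a rough sketch in plain Python. It makes several assumptions of my own. The feature vectors are toy numbers standing in for real concatenated descriptors. The clustering is a simple k-means with farthest-first initialisation. The "correlation method" is Pearson correlation between each feature and a 0/1 cluster-membership flag. All function names and the 0.5 cut-off are hypothetical; in practice you would probably use a library and tune these choices.

```python
import math


def toy_feature_vectors(n_per_group=20):
    """Hypothetical stand-in for concatenated descriptors (LBP, moments, ...)."""
    rows = []
    for i in range(2 * n_per_group):
        base = 0.0 if i < n_per_group else 10.0   # two hidden object groups
        jitter = (i % 3) * 0.1
        noise = float((i * 7) % 10)               # unrelated to the groups
        rows.append([base + jitter, base - jitter, base + 2 * jitter, noise])
    return rows


def standardize(rows):
    """Scale every feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    # Farthest-first initialisation (an assumption that keeps the run deterministic).
    centers = [list(rows[0])]
    while len(centers) < k:
        centers.append(list(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        if new_labels == labels:
            break                                  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def cluster_feature_correlations(rows, labels, k):
    """For each cluster, correlate every feature with a 0/1 membership flag."""
    cols = list(zip(*rows))
    return {j: [pearson(c, [1.0 if lab == j else 0.0 for lab in labels]) for c in cols]
            for j in range(k)}


X = standardize(toy_feature_vectors())
labels = kmeans(X, k=2)
corr = cluster_feature_correlations(X, labels, k=2)
# Keep features whose absolute correlation exceeds an arbitrary cut-off of 0.5.
selected = {j: [f for f, r in enumerate(c) if abs(r) > 0.5] for j, c in corr.items()}
print(selected)
```

On this toy data the first three columns should come out as related to the clusters and the fourth should not. With real descriptors the picture will be much noisier, which is why the number of samples matters.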
If you need features to classify within a cluster, you could run a second clustering on the members of that cluster, using the full set of features, and apply the same correlation method. The features selected for a cluster will not be suitable for classifying within it, since they are roughly the same for all of its members.
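A small illustration of that last point, again only as a sketch. The rows are made-up numbers, and both sets of labels are assumed to be the output of the first and second clustering passes rather than computed here. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def membership_correlations(rows, labels, cluster):
    """Correlate each feature column with a 0/1 'belongs to cluster' flag."""
    flags = [1.0 if lab == cluster else 0.0 for lab in labels]
    return [pearson(col, flags) for col in zip(*rows)]


# Toy data: column 0 separates the two top-level clusters, while column 1
# only separates the two sub-groups inside top-level cluster 0.
rows = [
    [0.0, 1.0], [0.1, 1.2], [0.2, 0.9],        # cluster 0, sub-group 0
    [0.0, 5.0], [0.1, 5.1], [0.2, 4.8],        # cluster 0, sub-group 1
    [10.0, 3.0], [10.1, 2.9], [10.2, 3.1],     # cluster 1
    [10.0, 3.2], [10.1, 3.0], [10.2, 2.8],     # cluster 1
]
top_labels = [0] * 6 + [1] * 6       # assumed result of the first clustering
sub_labels = [0, 0, 0, 1, 1, 1]      # assumed result of re-clustering cluster 0

top = membership_correlations(rows, top_labels, cluster=0)
inside = [r for r, lab in zip(rows, top_labels) if lab == 0]
sub = membership_correlations(inside, sub_labels, cluster=0)

print("top level :", [round(abs(r), 3) for r in top])
print("in cluster:", [round(abs(r), 3) for r in sub])
```

Column 0 is strongly related to the top-level cluster but carries almost no information about the sub-groups, and column 1 behaves the other way round. That is why the second pass has to start again from the full feature set.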