BRISK is a good replacement for SIFT. ORB also works, but it did not perform well for this example.
brisk = cv2.BRISK_create(30)
for image_path in image_paths:
    im = cv2.imread(image_path)
    kpts, des = brisk.detectAndCompute(im, None)
    des_list.append((image_path, des))
Stack all the descriptors vertically in a numpy array
descriptors = des_list[0][1]
for image_path, descriptor in des_list[1:]:
    descriptors = np.vstack((descriptors, descriptor))
descriptors
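As a quick sanity check, np.vstack simply concatenates the per-image descriptor arrays row-wise. A minimal sketch with toy arrays (not real BRISK output; actual BRISK descriptors are uint8 arrays with 64 columns):

```python
import numpy as np

# Toy stand-ins for two images' descriptor arrays (one descriptor per row)
des_a = np.ones((3, 4), dtype=np.uint8)    # 3 descriptors of length 4
des_b = np.zeros((2, 4), dtype=np.uint8)   # 2 descriptors of length 4

# Row-wise concatenation: every descriptor from both images, one per row
stacked = np.vstack((des_a, des_b))
print(stacked.shape)  # (5, 4)
```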
kmeans works only on floats, so convert the integer descriptors to float
descriptors_float = descriptors.astype(float)
Perform k-means clustering and vector quantization
k-means is used here; an SVM or a random forest could also be used.
from scipy.cluster.vq import kmeans, vq
k = 200  # k-means with 100 clusters gives lower accuracy for the aeroplane example
voc, variance = kmeans(descriptors_float, k, 1)
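To see what kmeans and vq return, here is a minimal sketch on made-up descriptors: two well-separated blobs instead of real BRISK output, with the initial centroids passed explicitly so the run is deterministic (passing an integer k instead picks random initial centroids, as in the tutorial):

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

rng = np.random.default_rng(0)
# Two well-separated blobs of fake 8-dimensional "descriptors"
fake_descriptors = np.vstack((
    rng.normal(0.0, 0.1, size=(50, 8)),
    rng.normal(5.0, 0.1, size=(50, 8)),
))

# When k_or_guess is an array, it is used as the initial centroids
init = fake_descriptors[[0, 50]]
voc, distortion = kmeans(fake_descriptors, init)

# vq assigns each descriptor the index of its nearest centroid (visual word)
words, distance = vq(fake_descriptors, voc)
print(voc.shape)  # (2, 8): one centroid ("visual word") per cluster
```

Rows from the same blob map to the same visual word, which is exactly how each image's descriptors get turned into word indices below.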
Calculate the histogram of features and represent them as vector
vq assigns codes from a code book to observations.
im_features = np.zeros((len(image_paths), k), "float32")
for i in range(len(image_paths)):
    words, distance = vq(des_list[i][1], voc)
    for w in words:
        im_features[i][w] += 1
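The inner counting loop can also be written with np.bincount, which gives the same histogram in one call. A sketch with a hypothetical word assignment for one image (made-up values, not vq output):

```python
import numpy as np

k = 5  # vocabulary size (made-up for this sketch)
words = np.array([0, 2, 2, 4, 0, 0])  # hypothetical word indices for one image

# Loop version, as in the tutorial
hist_loop = np.zeros(k, "float32")
for w in words:
    hist_loop[w] += 1

# Vectorized equivalent: count occurrences of each word index
hist_vec = np.bincount(words, minlength=k).astype("float32")
print(hist_loop)  # [3. 0. 2. 0. 1.]
```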
# Get the path to all images and save them in a list
# image_paths, with the corresponding label in image_classes
image_paths = []
image_classes = []
class_id = 0
# To make it easy to list all file names in a directory let us define a function
def imglist(path):
    return [os.path.join(path, f) for f in os.listdir(path)]
# Fill the placeholder empty lists with image path, classes, and add class ID number
for testing_name in testing_names:
    dir = os.path.join(test_path, testing_name)
    class_path = imglist(dir)
    image_paths += class_path
    image_classes += [class_id] * len(class_path)
    class_id += 1
# Create feature extraction and keypoint detector objects
# SIFT is not available anymore in openCV
# Create a list where all the descriptors will be stored
des_list = []
# BRISK is a good replacement for SIFT. ORB also works but didn't work well for this example
brisk = cv2.BRISK_create(30)
for image_path in image_paths:
    im = cv2.imread(image_path)
    kpts, des = brisk.detectAndCompute(im, None)
    des_list.append((image_path, des))
# Stack all the descriptors vertically in a numpy array
descriptors = des_list[0][1]
for image_path, descriptor in des_list[1:]:  # skip index 0; it already seeded descriptors
    descriptors = np.vstack((descriptors, descriptor))
# Calculate the histogram of features
# vq assigns codes from a code book to observations.
from scipy.cluster.vq import vq
test_features = np.zeros((len(image_paths), k), "float32")
for i in range(len(image_paths)):
    words, distance = vq(des_list[i][1], voc)
    for w in words:
        test_features[i][w] += 1
# Scale the features
# Standardize features by removing the mean and scaling to unit variance
# Scaler (stdSlr comes from the pickled file we imported)
test_features = stdSlr.transform(test_features)
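A minimal sketch of where stdSlr could come from, assuming it is a scikit-learn StandardScaler fitted on the training histograms (the arrays below are illustrative, not the tutorial's data):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative training-time histograms (not real data)
train_features = np.array([[0., 2.], [0., 4.], [2., 0.]], dtype="float32")
stdSlr = StandardScaler().fit(train_features)  # learns per-column mean and variance

# At test time the *same* fitted scaler is reused: transform only, never
# refit, so test features are standardized with the training statistics.
test_features = np.array([[1., 2.]], dtype="float32")
scaled = stdSlr.transform(test_features)
print(stdSlr.mean_)  # per-column means learned from train_features
```

Refitting a scaler on the test set would leak test statistics into the features, which is why the training-time scaler is pickled and reloaded here.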
Up to this point, most of the code mirrors the training script, except for the k-means clustering step.
Report true class names so they can be compared with predicted classes
true_class = [classes_names[i] for i in image_classes]
Perform the predictions and report predicted class names.
predictions = [classes_names[i] for i in clf.predict(test_features)]
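With both lists of class names in hand, a simple accuracy figure is the fraction of matching entries. A sketch with hypothetical labels (not the tutorial's results):

```python
# Hypothetical stand-ins for true_class and predictions
true_class = ["aeroplane", "car", "car", "aeroplane"]
predictions = ["aeroplane", "car", "aeroplane", "aeroplane"]

# Fraction of positions where the predicted name matches the true name
accuracy = sum(t == p for t, p in zip(true_class, predictions)) / len(true_class)
print(accuracy)  # 3 of 4 agree -> 0.75
```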