Speed for different kernels in scikit-learn's SVM

I'm using scikit-learn in Python to create some models while trying different kernels. I was surprised to see that rbf fit in under a second, whereas linear took a minute and poly took hours. Can anyone explain why?

My code:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.svm import SVR

# datafile is the path to the space-separated data shown below
X = np.empty([0, 1], dtype=int)
y = np.array([])
with open(datafile) as f:
    for i in range(100):
        my_lines = f.readline().split(" ")
        t = np.array([[int(my_lines[0])]])  # feature: first column
        n = np.array([int(my_lines[1])])    # target: second column
        X = np.concatenate((X, t), axis=0)
        y = np.concatenate((y, n), axis=0)

# Same C for every model; only the kernel changes
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=3)
svr_rbf.fit(X, y)
svr_lin.fit(X, y)
svr_poly.fit(X, y)

The data:

0 22
1 23
2 23
3 23
4 25
5 24
6 25
7 26
8 26
9 30
10 29
11 29
12 30
13 30
14 30
15 31
16 29
17 28
18 29
19 31
20 30
21 30
22 31
23 31
24 30
25 31
26 31
27 30
28 29
29 31
30 33
31 31
32 32
33 30
34 30
35 30
36 29
37 30
38 29
39 29
40 27
41 28
42 28
43 27
44 27
45 28
46 28
47 29
48 29
49 28
50 28
51 29
52 27
53 27
54 28
55 28
56 28
57 30
58 32
59 31
60 30
61 28
62 28
63 30
64 27
65 27
66 28
67 27
68 29
69 33
70 39
71 35
72 34
73 29
74 30
75 28
76 28
77 29
78 27
79 26
80 25
81 17
82 0
83 0
84 29
85 21
86 20
87 18
88 19
89 19
90 19
91 18
92 17
93 18
94 20
95 19
96 20
97 19
98 18
99 18



One way to speed up your code is to standardize the features by removing the mean and scaling to unit variance.

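from sklearn.preprocessing import StandardScaler

# Use separate scalers so the feature and target statistics stay independent
x_scaler = StandardScaler()
y_scaler = StandardScaler()
X = x_scaler.fit_transform(X.reshape(-1, 1))
# SVR expects a 1-D target, so flatten it again after scaling
y = y_scaler.fit_transform(y.reshape(-1, 1)).ravel()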

After I standardized the data, each model fit in less than 500 ms.
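If you want to check this on your own machine, here is a minimal timing sketch. It uses the 100 target values from the question with the row index as the single feature. Wrapping each model in a pipeline with `TransformedTargetRegressor` is my own choice for keeping the scaling tied to the model, not something the original code does, and `max_iter` is only a safety cap added for this sketch. Timings will differ between machines and scikit-learn versions.

```python
import time

import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Target values copied from the question's data; the feature is the index 0..99.
y = np.array([
    22, 23, 23, 23, 25, 24, 25, 26, 26, 30,
    29, 29, 30, 30, 30, 31, 29, 28, 29, 31,
    30, 30, 31, 31, 30, 31, 31, 30, 29, 31,
    33, 31, 32, 30, 30, 30, 29, 30, 29, 29,
    27, 28, 28, 27, 27, 28, 28, 29, 29, 28,
    28, 29, 27, 27, 28, 28, 28, 30, 32, 31,
    30, 28, 28, 30, 27, 27, 28, 27, 29, 33,
    39, 35, 34, 29, 30, 28, 28, 29, 27, 26,
    25, 17, 0, 0, 29, 21, 20, 18, 19, 19,
    19, 18, 17, 18, 20, 19, 20, 19, 18, 18,
], dtype=float)
X = np.arange(len(y), dtype=float).reshape(-1, 1)

# Same hyperparameters as in the question.
# max_iter is an assumption of this sketch: a cap so a slow fit cannot run forever.
models = {
    "rbf": SVR(kernel="rbf", C=1e3, gamma=0.1, max_iter=1_000_000),
    "linear": SVR(kernel="linear", C=1e3, max_iter=1_000_000),
    "poly": SVR(kernel="poly", C=1e3, degree=3, max_iter=1_000_000),
}

fit_seconds = {}
predictions = {}
for name, svr in models.items():
    # Standardize X inside the pipeline and y through the target transformer.
    model = TransformedTargetRegressor(
        regressor=make_pipeline(StandardScaler(), svr),
        transformer=StandardScaler(),
    )
    start = time.perf_counter()
    model.fit(X, y)
    fit_seconds[name] = time.perf_counter() - start
    # Predictions come back on the original scale of y.
    predictions[name] = model.predict(X)
    print(f"{name}: fit in {fit_seconds[name]:.3f} s")
```

A side benefit of this layout is that `predict` returns values in the original units, so you do not have to invert the target scaling by hand.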

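Support vector machines (SVMs) need to calculate a kernel value, which acts as a similarity or distance measure, between every pair of data points. SVMs are not scale invariant: with raw features ranging from 0 to 99, the linear and polynomial kernels produce very large values, and the solver tends to need many more iterations to converge. Standardized numbers keep those values small, which makes the calculations faster.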
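To get a feel for how much the scale of the feature matters, the sketch below compares the largest degree-3 polynomial kernel value on the raw feature 0..99 with the largest value after standardizing. The settings `gamma=1.0` and `coef0=0.0` are assumptions picked for illustration; the defaults that `SVR` actually uses depend on your scikit-learn version, so treat the numbers as an order-of-magnitude indication only.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel
from sklearn.preprocessing import StandardScaler

# Feature from the question: the integers 0..99 as a single column.
X_raw = np.arange(100, dtype=float).reshape(-1, 1)
X_scaled = StandardScaler().fit_transform(X_raw)

# Kernel: (gamma * <x, x'> + coef0) ** degree
# gamma=1.0 and coef0=0.0 are illustrative assumptions, not necessarily SVR's defaults.
raw_max = polynomial_kernel(X_raw, degree=3, gamma=1.0, coef0=0.0).max()
scaled_max = polynomial_kernel(X_scaled, degree=3, gamma=1.0, coef0=0.0).max()

print(f"largest kernel value, raw feature:          {raw_max:.3g}")
print(f"largest kernel value, standardized feature: {scaled_max:.3g}")
```

Under these assumptions the raw values are many orders of magnitude larger than the standardized ones, which is consistent with the poly kernel being the slowest of the three on unscaled data.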
