Wednesday, May 25, 2016

[Machine Learning] Training/Cross-Validation/Test

from: http://stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set

The concept of 'Training/Cross-Validation/Test' Data Sets is as simple as this. When you have a large data set, it's recommended to split it into 3 parts:

++Training set (60% of the original data set): This is used to build our prediction algorithm. The algorithm tunes itself to the quirks of the training data set. In this phase we usually create multiple algorithms so that we can compare their performance during the Cross-Validation Phase.

++Cross-Validation set (20% of the original data set): This data set is used to compare the performance of the prediction algorithms that were built on the training set. We choose the algorithm that performs best.

++Test set (20% of the original data set): We have now chosen our preferred prediction algorithm, but we don't yet know how it will perform on completely unseen real-world data. So we apply it to the test set to get an estimate of its performance on unseen data.

Notes:

-It's very important not to skip the test phase: an algorithm that performed well during the cross-validation phase is not necessarily the best one, because the algorithms were compared on the cross-validation set, with its own quirks and noise.

-During the Test Phase, the purpose is to see how our final model will perform in the wild, so if its performance is very poor we should repeat the whole process starting from the Training Phase.
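
The 60/20/20 split described above can be sketched in Octave as follows. This is only an illustration: the data matrix `X`, the labels `y`, and every variable name are hypothetical, and shuffling the rows with `randperm` before splitting is an assumption that the quoted answer does not prescribe.

```octave
# Hypothetical data set: 10 examples, 2 features, one label each.
m = 10;
X = [(1:m)', (1:m)' .^ 2];
y = (1:m)';

idx = randperm(m);                 # shuffle so each part is a random sample

n_train = round(0.6 * m);          # 60% for training
n_cv    = round(0.2 * m);          # 20% for cross-validation

train_idx = idx(1 : n_train);
cv_idx    = idx(n_train + 1 : n_train + n_cv);
test_idx  = idx(n_train + n_cv + 1 : end);   # the rest (about 20%) for testing

X_train = X(train_idx, :);  y_train = y(train_idx);
X_cv    = X(cv_idx, :);     y_cv    = y(cv_idx);
X_test  = X(test_idx, :);   y_test  = y(test_idx);
```

Because the three index vectors are disjoint slices of one permutation, every example ends up in exactly one of the three sets.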

Tuesday, May 24, 2016

[Machine Learning] Training set 6 : cross-validation set 2 : testing set 2

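First we define the train, cv (cross-validation) and test errors.
Next we train the model by minimizing the train error. When comparing models, we train several different models.
Then, for model selection, we run the cross-validation set through each of these models and compute each model's error (the cv error). We pick the model with the smallest cv error as the best model.
Finally, to estimate the model's performance (that is, its generalization), we run the test set through the model just chosen by cv error and take the resulting test error as the performance.
Note that if there is no model selection, the validation set is not needed: a training set and a test set are enough, in a ratio of roughly 7:3.
Why must the validation set and the test set be kept separate for model selection? First, we cannot select a model with the training set, because that says nothing about generalization (the algorithm optimizes the model on the training set, so reusing that set for model selection is meaningless). Second, if we select the model with the test set, the chosen model is fit to the test set, and we again lose the generalization estimate. So use the validation set to select the model and the test set to measure generalization.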
=>
Use the cross-validation set to choose the polynomial model (with theta fitted on the training set), then use the testing set to judge the performance of the model you just chose.

****
If you do not need to select a model, just use training set 7 : testing set 3.
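
The selection procedure above can be sketched in Octave with `polyfit` and `polyval`. Everything in this sketch is an assumption made for illustration: the synthetic data, the interleaved 60/20/20 split, the candidate degrees 1 to 4, and the halved mean squared error used as the error measure.

```octave
# Synthetic, hypothetical data: a quadratic with a small deterministic wiggle.
x = linspace(-1, 1, 50)';
y = 1 + 2*x + 3*x.^2 + 0.1*sin(25*x);

# 60/20/20 split by interleaving indices so each set spans the whole range.
k = mod((1:50)', 5);                       # residues 0..4, ten of each
train = (k == 1 | k == 2 | k == 3);        # 30 examples
cv    = (k == 4);                          # 10 examples
test  = (k == 0);                          # 10 examples

degrees = 1:4;
cv_err = zeros(size(degrees));
for d = degrees
  p = polyfit(x(train), y(train), d);                      # fit on the training set only
  cv_err(d) = mean((polyval(p, x(cv)) - y(cv)) .^ 2) / 2;  # cv error of this model
end

[~, best_degree] = min(cv_err);            # model selection uses the cv error
p = polyfit(x(train), y(train), best_degree);
test_err = mean((polyval(p, x(test)) - y(test)) .^ 2) / 2; # generalization estimate
printf("chosen degree: %d, test error: %g\n", best_degree, test_err);
```

The test set is touched exactly once, after the degree has been fixed, so `test_err` is not biased by the selection step.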

Monday, May 23, 2016

[octave] element by element operation

      plus       .+
      minus     .-
      times     .*
      rdivide   ./
      ldivide   .\
      power     .^  .**

https://www.gnu.org/software/octave/doc/v4.0.0/Arithmetic-Ops.html#Arithmetic-Ops

x .+ y

    Element-by-element addition. This operator is equivalent to +.



x = [1 2 3;
     4 5 6;
     7 8 9]

y = [1 1 1;
     1 1 1;
     1 1 1]

x.+y # same as x+y, because + is already an element-by-element operator

ans =

    2    3    4
    5    6    7
    8    9   10



x .* y

    Element-by-element multiplication. If both operands are matrices, the number of rows and columns must both agree, or they must be broadcastable to the same shape.

x = [1 2 3;
     4 5 6;
     7 8 9]

y = [10 2 1;
     10 2 1;
     10 2 1]

x*y # ordinary matrix multiplication

ans =

    60    12     6
   150    30    15
   240    48    24

x.*y # element by element: 1*10, 2*2, 3*1, 4*10, ...
ans =

   10    4    3
   40   10    6
   70   16    9



x ./ y

    Element-by-element right division.




x = [1 2 3;
     4 5 6;
     7 8 9]

y = [10 2 1;
     10 2 1;
     10 2 1]

x./y
ans =

   0.10000   1.00000   3.00000
   0.40000   2.50000   6.00000
   0.70000   4.00000   9.00000




y = [10 2 1;
     10 2 1;
     10 2 1]
m=5
y./m # same as y/m

ans =

   2.00000   0.40000   0.20000
   2.00000   0.40000   0.20000
   2.00000   0.40000   0.20000
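
The two operators from the list that have no example above, `.\` and `.^`, can be sketched the same way. The small matrices here are made up for illustration.

```octave
x = [1 2; 4 5];
y = [10 2; 10 2];

l = x .\ y    # element-by-element left division: y(i,j) / x(i,j), same as y ./ x
p = x .^ 2    # each element of x raised to the power 2
```

Note that `x .^ 2` squares every element, while `x ^ 2` would compute the matrix product `x * x`.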










[octave] broadcasting

https://www.gnu.org/software/octave/doc/v4.0.0/Broadcasting.html#Broadcasting


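In Octave, broadcasting means that when two arrays of different dimensions are combined in an operation,
the smaller one is automatically expanded to the dimensions of the larger one before the operation is carried out.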


ex:

x = [1 2 3;
     4 5 6;
     7 8 9]

y = [10 20 30]

x+y  # normally this would not be allowed, because x and y differ in size, but Octave automatically turns y into

y = [10 20 30
     10 20 30
     10 20 30];

This is broadcasting: y now has the same dimensions as x, so the addition can be carried out.

>>output
warning: operator +: automatic broadcasting operation applied
ans =

   11   22   33
   14   25   36
   17   28   39


(Octave also prints a warning to remind you that this addition went through broadcasting.)



If the two matrices have the same dimensions, the addition is simply
element by element.



x = [1 2 3;
     4 5 6;
     7 8 9]

y = [1 2 3;
     4 5 6;
     7 8 9]

x+y
ans =

    2    4    6
    8   10   12
   14   16   18
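
To make the expansion step concrete, the sketch below compares a broadcast addition with the same expansion written out by hand using `repmat`, then shows the column-vector case. The vector `c` and the centering line at the end are added for illustration and are not from the original notes.

```octave
x = [1 2 3; 4 5 6; 7 8 9];
y = [10 20 30];                               # 1x3 row vector

s_broadcast = x + y;                          # y is expanded automatically
s_explicit  = x + repmat(y, size(x, 1), 1);   # the same expansion done by hand
isequal(s_broadcast, s_explicit)              # both give the same matrix

c = [100; 200; 300];                          # 3x1 column vector
s_col = x + c;                                # c is expanded across the columns instead

centered = x - mean(x);                       # mean(x) is a 1x3 row of column means
```

The last line is a common use of broadcasting: subtracting each column's mean from that column without writing a loop.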







[octave] zeros function

Built-in Function: zeros (n)
Built-in Function: zeros (m, n)
Built-in Function: zeros (m, n, k, …)
Built-in Function: zeros ([m n …])
Built-in Function: zeros (…, class)

    Return a matrix or N-dimensional array whose elements are all 0.

    If invoked with a single scalar integer argument, return a square NxN matrix.

    If invoked with two or more scalar integer arguments, or a vector of integer values, return an array with the given dimensions.

    The optional argument class specifies the class of the return array and defaults to double. For example:

    val = zeros (m,n, "uint8")


a=[1, 2, 33;4 ,5, 6; 7 ,8, 66;55 ,476, 22]
a =

     1     2    33
     4     5     6
     7     8    66
    55   476    22

[rows, columns] = size(a)   # no trailing semicolon, so both values are printed
rows =  4
columns =  3

zeros (rows,columns, "uint8")

ans =

  0  0  0
  0  0  0
  0  0  0
  0  0  0


zeros(4) # will give you 4x4 matrix
ans =

   0   0   0   0
   0   0   0   0
   0   0   0   0
   0   0   0   0


zeros(4,1) #want a 4x1 zero vector
ans =

   0
   0
   0
   0
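
The vector-of-dimensions form `zeros ([m n …])` from the signature list is not exercised above. A minimal sketch, reusing the 4x3 matrix `a` from this note (the variable names `z` and `g` are only illustrative):

```octave
a = [1, 2, 33; 4, 5, 6; 7, 8, 66; 55, 476, 22];

z = zeros(size(a));         # size(a) returns [4 3], so z is a 4x3 matrix of zeros
class(z)                    # default class is double

g = zeros(2, 3, "uint8");   # explicit class argument
class(g)
```

Passing `size(a)` directly is a convenient way to allocate a result that must match an existing matrix, without naming the row and column counts.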






[octave] size function


https://www.gnu.org/software/octave/doc/v4.0.1/Object-Sizes.html#XREFsize


Built-in Function: size (a)
Built-in Function: size (a, dim)

    Return the number of rows and columns of a.

    With one input argument and one output argument, the result is returned in a row vector. If there are multiple output arguments, the number of rows is assigned to the first, and the number of columns to the second, etc. For example:

    size ([1, 2; 3, 4; 5, 6])
       ⇒ [ 3, 2 ]

    [nr, nc] = size ([1, 2; 3, 4; 5, 6])
        ⇒ nr = 3
        ⇒ nc = 2

    If given a second argument, size will return the size of the corresponding dimension. For example,

    size ([1, 2; 3, 4; 5, 6], 2)
        ⇒ 2

    returns the number of columns in the given matrix.

    
Testing:
    
a=[1, 2, 33;4 ,5, 6; 7 ,8, 66;55 ,476, 22]
a =

     1     2    33
     4     5     6
     7     8    66
    55   476    22

[rows, columns] = size(a)   # no trailing semicolon, so both values are printed
rows =  4
columns =  3


size(a,1) # 1 means get rows ;  2 means get columns
ans =  4


size(a,2)
ans =  3
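
A typical use of `size (a, dim)` is to bound a loop or preallocate a result. The sketch below is only an illustration with made-up variable names; it sums each row of the same matrix `a` and checks the result against the built-in `sum (a, 2)`.

```octave
a = [1, 2, 33; 4, 5, 6; 7, 8, 66; 55, 476, 22];

m = size(a, 1);                 # number of rows
row_sums = zeros(m, 1);         # preallocate an m x 1 column vector
for i = 1:m
  row_sums(i) = sum(a(i, :));   # sum of row i
end

isequal(row_sums, sum(a, 2))    # the loop matches the vectorized form
```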



-------------------------------------------------------------------------------









 

[Octave] Octave matrix slice

CASE1:

a=[1, 2, 33;4 ,5, 6; 7 ,8, 66];

a =

    1    2   33
    4    5    6
    7    8   66

a(3)       # result is a scalar
 ans =  7


a(1:4)     # linear indices 1 through 4 (column-major order); the result is a row vector
ans =

   1   4   7   2

a([1; 9])  # linear indices 1 and 9; the result is a column vector
ans =

    1
   66

a(1, [1, 3])  # row 1, columns 1 and 3
ans =

    1   33


a(3, 1:3)     # row 3, columns 1 to 3
ans =

    7    8   66

a(1, :)       # row 1, all columns; used very often

ans =

    1    2   33
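
The single-index results above (`a(3)` is 7, `a(1:4)` is `1 4 7 2`) follow from Octave numbering elements down the columns first. A short sketch with `ind2sub` and `sub2ind` makes the mapping explicit; the variable names are illustrative.

```octave
a = [1, 2, 33; 4, 5, 6; 7, 8, 66];

[r, c] = ind2sub(size(a), 4)    # linear index 4 is row 1, column 2
a(4) == a(r, c)                 # both address the same element, the value 2

idx = sub2ind(size(a), 3, 3)    # row 3, column 3 is linear index 9
a(idx)                          # 66, the same element as a(9) above
```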


CASE2:

a = [1, 2, 3, 4]
a =

   1   2   3   4

a(1:end/2)        # first half of a => [1, 2]; end is the index of the last element
ans =

   1   2


a(end + 1) = 5   # append element
a =

   1   2   3   4   5

a(end) = []      # delete element
a =

   1   2   3   4

a(1:2:end)        # odd elements of a => [1, 3]
ans =

   1   3

a(2:2:end)        # even elements of a => [2, 4]
ans =

   2   4

a(end:-1:1)       # reversal of a => [4, 3, 2 , 1]
ans =

   4   3   2   1



CASE3: often used in machine learning cost functions

a=[1, 2, 33;4 ,5, 6; 7 ,8, 66];

a =

    1    2   33
    4    5    6
    7    8   66

[99 ; a(1,:)']

 ans =

   99
    1
    2
   33


num_labels=4;
zeros(num_labels,1)

ans =

   0
   0
   0
   0


a=[1, 2, 33;4 ,5, 6; 7 ,8, 66;55 ,476, 22]
a =

     1     2    33
     4     5     6
     7     8    66
    55   476    22

a(:,2:end) # from the second column to the last column

ans =

     2    33
     5     6
     8    66
   476    22

a(1,:) #get first row
ans =

    1    2   33

a(2:end) # start from element 2 to end

ans =

     4     7    55     2     5     8   476    33     6    66    22

a(:) # unroll every element


ans =

     1
     4
     7
    55
     2
     5
     8
   476
    33
     6
    66
    22
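
As a sketch of how these slices tend to show up in a cost function: `theta(2:end)` skips the first (bias) parameter in a regularization sum, and `(:)` with `reshape` unrolls and restores weight matrices. All values and names here (`theta`, `Theta1`, `Theta2`, `params`) are hypothetical and chosen only for illustration.

```octave
theta = [99; 1; 2; 33];            # hypothetical parameters; theta(1) is the bias term
reg = sum(theta(2:end) .^ 2);      # regularization sum that skips theta(1)

Theta1 = [1 2; 3 4];               # hypothetical 2x2 weight matrix
Theta2 = [5 6 7];                  # hypothetical 1x3 weight matrix
params = [Theta1(:); Theta2(:)];   # unroll both into one 7x1 column vector

# Recover the matrices from the unrolled vector.
Theta1_back = reshape(params(1:4), 2, 2);
Theta2_back = reshape(params(5:end), 1, 3);
```

Since `(:)` and `reshape` both walk the elements in column-major order, the round trip returns the original matrices unchanged.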




Tuesday, May 3, 2016

[octave] build octave under Sublime Text 3 (pause does not work)

Tools -> Build System -> New Build System

    {
        "path": "/usr/bin/", // path of your Octave bin folder
        // "-p" and its directory are passed as separate arguments
        "cmd": ["octave", "--no-site-file", "-p", "$file_path", "$file_name"]
    }