Neural Network Source Code
Ⅰ Writing an RBF neural network program in C
RBF networks can approximate arbitrary nonlinear functions, can handle regularities within a system that are hard to model analytically, generalize well, and converge quickly during learning. They have been successfully applied to nonlinear function approximation, time-series analysis, data classification, pattern recognition, information processing, image processing, system modeling, control, fault diagnosis, and more.
A brief note on why RBF networks converge relatively quickly. When one or more adjustable parameters (weights or thresholds) of a network affect every output, the network is called a global approximation network. Because every weight in the network must be adjusted for each input, global approximation networks learn slowly. The BP network is a typical example.
If only a few connection weights affect the output for a given local region of the input space, the network is called a local approximation network. Common local approximation networks include RBF networks, cerebellar model (CMAC) networks, and B-spline networks.
The attachment contains C++ source code for an RBF neural network.
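Since the attachment itself does not survive in this copy, here is a minimal stand-in sketch of the idea in Python: Gaussian basis functions with fixed centers and a linear output layer trained by least squares, which is exactly why RBF training converges so much faster than BP. The centers, width, and target function below are illustrative assumptions, not the attachment's code.
import numpy as np

# Minimal RBF network sketch: fixed Gaussian centers, trainable linear output layer.
centers = np.linspace(-3, 3, 10)   # fixed RBF centers (assumption)
width = 0.5                        # shared Gaussian width (assumption)

def rbf_features(x):
    # One Gaussian basis response per center for each input sample.
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

x_train = np.linspace(-3, 3, 50)
y_train = np.sin(x_train)          # toy nonlinear target to approximate

# Only the output weights are trained, by a single linear least-squares solve --
# no slow global weight adjustment as in BP.
Phi = rbf_features(x_train)
w, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)

print("max abs error:", np.abs(Phi @ w - y_train).max())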
Ⅱ What software should be used for complex neural network models?
BP neural networks can be implemented in MATLAB.
In principle any programming language will do, e.g. VB or C; the process is always the same: modeling, quantification, computation, and output of the results (plots, tables). But MATLAB has accumulated a great many toolboxes over the years, so it is the most widely used; with other languages you have to start from source-level development.
A BP neural network is an algorithm, and any algorithm can be implemented with any software tool whose compiler or interpreter supports it, whether C, C++, or Python; only the implementation effort differs.
Ⅲ How to write a simple neural network in 9 lines of Python code
While learning about artificial intelligence, I set myself a goal: write a simple neural network in Python. To make sure I really understood it, I required myself to write it from scratch without using any neural network library. Thanks to an excellent blog post by Andrew Trask, I did it! The nine lines of code are shown below. In this article I will explain how I did it, so that you can write your own. I will also provide a longer but more polished version of the source code.
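The nine lines themselves were dropped from this copy. The following is a reconstruction consistent with Andrew Trask's widely circulated version of this exercise; treat it as a faithful-in-spirit sketch rather than a verbatim quote:
from numpy import exp, array, random, dot
training_set_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
training_set_outputs = array([[0, 1, 1, 0]]).T
random.seed(1)
synaptic_weights = 2 * random.random((3, 1)) - 1
for iteration in range(10000):
    output = 1 / (1 + exp(-(dot(training_set_inputs, synaptic_weights))))
    synaptic_weights += dot(training_set_inputs.T, (training_set_outputs - output) * output * (1 - output))
print(1 / (1 + exp(-(dot(array([1, 0, 0]), synaptic_weights)))))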
First, what is a neural network? The human brain consists of hundreds of billions of cells (neurons) connected by synapses. When enough excitation arrives through its synapses, a neuron fires. This process is called "thinking". We can write a neural network on a computer to simulate this process; there is no need to model the brain at the biomolecular level, only at a higher level of abstraction. We use matrices (two-dimensional tables of numbers) as the mathematical tool, and for simplicity we model a single neuron with 3 inputs and one output.
We will train the neuron to solve the problem below. The first four examples are called the training set. Can you spot the pattern? Should the '?' be 0 or 1? You may have noticed that the output always equals the value of the leftmost input column, so the '?' should be 1.
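The table itself is missing from this copy; the rows below are reconstructed to match the pattern described (output equals the leftmost input) and the new situation [1, 0, 0] used later in the article, so treat the exact rows as an assumption:
Example 1: inputs 0 0 1, output 0
Example 2: inputs 1 1 1, output 1
Example 3: inputs 1 0 1, output 1
Example 4: inputs 0 1 1, output 0
New situation: inputs 1 0 0, output ?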
The training process
But how do we make our neuron answer correctly? We assign each input a weight, which can be a positive or a negative number. An input with a large positive (or large negative) weight has a strong effect on the neuron's output. We first set each weight to a random initial value, then start the training process:
Take the inputs from one training example, adjust them by the weights, and pass them through a special formula to compute the neuron's output.
Compute the error, i.e. the difference between the neuron's output and the expected output from the training example.
Adjust the weights slightly according to the error.
Repeat this process 10,000 times. The weights eventually converge to an optimal solution for the training set. If the neuron is then given a new situation that follows the same pattern, it will produce a good prediction.
This process is called back propagation.
The formula for computing the neuron's output
You may be wondering what the formula for the neuron's output is. First, compute the weighted sum of the neuron's inputs: Σ weight_i × input_i. Next, normalize this so the result lies between 0 and 1, using a mathematical function called the Sigmoid function: σ(x) = 1 / (1 + e^(−x)). The graph of the Sigmoid function is an "S"-shaped curve. Substituting the first equation into the second, the final formula for the neuron's output is: output = 1 / (1 + e^(−Σ weight_i × input_i)). You may have noticed that, for simplicity, we did not include a minimum firing threshold.
The formula for adjusting the weights
We keep adjusting the weights during training, but how? We can use the "Error Weighted Derivative" formula: adjustment = error × input × SigmoidCurveGradient(output). Why this formula? First, we want the adjustment to be proportional to the size of the error. Second, we multiply by the input, which is 0 or 1; when the input is 0, the weight is not adjusted. Finally, we multiply by the gradient (slope) of the Sigmoid curve (Figure 4). To understand this last factor, consider the following:
We compute the neuron's output using the Sigmoid curve.
If the output is a large positive (or negative) number, it means the neuron was quite confident one way (or the other).
As Figure 4 shows, the slope of the Sigmoid curve is small at large values.
If the neuron is confident that the current weights are correct, it does not need to adjust them much; multiplying by the slope of the Sigmoid curve achieves exactly that.
The slope of the Sigmoid curve can be obtained by taking its derivative: SigmoidCurveGradient(output) = output × (1 − output). Substituting this into the formula above gives the final weight-adjustment formula: adjustment = error × input × output × (1 − output). There are of course other formulas that would let the neuron learn faster, but this one has the advantage of being very simple.
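As a concrete illustration, one step of this update rule in numpy looks like the following (the variable names are mine, not the article's):
import numpy as np

inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
targets = np.array([[0, 1, 1, 0]]).T
weights = 2 * np.random.random((3, 1)) - 1

output = 1 / (1 + np.exp(-inputs.dot(weights)))         # sigmoid of the weighted sum
error = targets - output
weights += inputs.T.dot(error * output * (1 - output))  # error x input x sigmoid slope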
Constructing the Python code
Although we are not using a neural network library, we will import four methods from numpy, Python's mathematics library:
exp -- the natural exponential
array -- creates a matrix
dot -- matrix multiplication
random -- generates random numbers
For example, we can use the array() method to represent the training set shown earlier. The ".T" method transposes a matrix (turns rows into columns), which matches how the computer stores the numbers. I think we are now ready to build the more elegant version of the source code; after presenting it, I will close with some final thoughts.
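The snippet was dropped here; it would look like this, consistent with the nine-line version above:
from numpy import array

training_set_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
training_set_outputs = array([[0, 1, 1, 0]]).T   # .T turns the row into a column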
I have added a comment to every line of the source code to explain everything. Note that on each iteration we process the entire training set at once, so the variables are all matrices (two-dimensional tables of numbers). Below is a complete example written in Python.
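The complete listing is missing from this copy. The reconstruction below follows the structure the article describes (a single neuron, sigmoid activation, the update rule derived above); treat it as an equivalent sketch rather than the author's verbatim code:
from numpy import exp, array, random, dot

class NeuralNetwork():
    def __init__(self):
        # Seed the random number generator so every run gives the same numbers.
        random.seed(1)
        # One neuron with 3 inputs and 1 output: random weights in (-1, 1).
        self.synaptic_weights = 2 * random.random((3, 1)) - 1

    def __sigmoid(self, x):
        # Normalize the weighted sum to a value between 0 and 1.
        return 1 / (1 + exp(-x))

    def __sigmoid_derivative(self, x):
        # Slope of the sigmoid curve: how confident the neuron is.
        return x * (1 - x)

    def think(self, inputs):
        # Pass inputs through our single neuron.
        return self.__sigmoid(dot(inputs, self.synaptic_weights))

    def train(self, inputs, outputs, iterations):
        for iteration in range(iterations):
            output = self.think(inputs)
            error = outputs - output
            # Error Weighted Derivative rule: error x input x sigmoid slope.
            self.synaptic_weights += dot(inputs.T, error * self.__sigmoid_derivative(output))

if __name__ == "__main__":
    neural_network = NeuralNetwork()
    training_set_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
    training_set_outputs = array([[0, 1, 1, 0]]).T
    neural_network.train(training_set_inputs, training_set_outputs, 10000)
    print(neural_network.think(array([1, 0, 0])))   # approx 0.9999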
We did it! We built a simple neural network in Python!
First the neural network assigned itself random weights, then it trained itself on the training set. Then it considered a new situation [1, 0, 0] and predicted 0.99993704. The correct answer is 1. Very close!
Traditional computer programs usually cannot learn. What is amazing about neural networks is that they can learn, adapt, and respond to new situations on their own, just like humans.
Ⅳ The implementation of the training functions (traingdm, trainlm, trainbr) in MATLAB's BP neural network training algorithms, and the corresponding VC source code
VC source code? You must be joking.
Here is the m-code for trainlm:
function [out1,out2] = trainlm(varargin)
%TRAINLM Levenberg-Marquardt backpropagation.
%
% <a href="matlab:doc trainlm">trainlm</a> is a network training function that updates weight and
% bias states according to Levenberg-Marquardt optimization.
%
% <a href="matlab:doc trainlm">trainlm</a> is often the fastest backpropagation algorithm in the toolbox,
% and is highly recommended as a first choice supervised algorithm,
% although it does require more memory than other algorithms.
%
% [NET,TR] = <a href="matlab:doc trainlm">trainlm</a>(NET,X,T) takes a network NET, input data X
% and target data T and returns the network after training it, and a
% training record TR.
%
% [NET,TR] = <a href="matlab:doc trainlm">trainlm</a>(NET,X,T,Xi,Ai,EW) takes additional optional
% arguments suitable for training dynamic networks and training with
% error weights. Xi and Ai are the initial input and layer delays states
% respectively and EW defines error weights used to indicate
% the relative importance of each target value.
%
% Training occurs according to training parameters, with default values.
% Any or all of these can be overridden with parameter name/value argument
% pairs appended to the input argument list, or by appending a structure
% argument with fields having one or more of these names.
% show 25 Epochs between displays
% showCommandLine 0 generate command line output
% showWindow 1 show training GUI
% epochs 1000 Maximum number of epochs to train
% goal 0 Performance goal
% max_fail 6 Maximum validation failures
% min_grad 1e-5 Minimum performance gradient
% mu 0.001 Initial Mu
% mu_dec 0.1 Mu decrease factor
% mu_inc 10 Mu increase factor
% mu_max 1e10 Maximum Mu
% time inf Maximum time to train in seconds
%
% To make this the default training function for a network, and view
% and/or change parameter settings, use these two properties:
%
% net.<a href="matlab:doc nnproperty.net_trainFcn">trainFcn</a> = 'trainlm';
% net.<a href="matlab:doc nnproperty.net_trainParam">trainParam</a>
%
% See also trainscg, feedforwardnet, narxnet.
% Mark Beale, 11-31-97, ODJ 11/20/98
% Updated by Orlando De Jesús, Martin Hagan, Dynamic Training 7-20-05
% Copyright 1992-2010 The MathWorks, Inc.
% $Revision: 1.1.6.11.2.2 $ $Date: 2010/07/23 15:40:16 $
%% =======================================================
% BOILERPLATE_START
% This code is the same for all Training Functions.
persistent INFO;
if isempty(INFO), INFO = get_info; end
nnassert.minargs(nargin,1);
in1 = varargin{1};
if ischar(in1)
switch (in1)
case 'info'
out1 = INFO;
case 'check_param'
nnassert.minargs(nargin,2);
param = varargin{2};
err = nntest.param(INFO.parameters,param);
if isempty(err)
err = check_param(param);
end
if nargout > 0
out1 = err;
elseif ~isempty(err)
nnerr.throw('Type',err);
end
otherwise,
try
out1 = eval(['INFO.' in1]);
catch me, nnerr.throw(['Unrecognized first argument: ''' in1 ''''])
end
end
return
end
nnassert.minargs(nargin,2);
net = nn.hints(nntype.network('format',in1,'NET'));
oldTrainFcn = net.trainFcn;
oldTrainParam = net.trainParam;
if ~strcmp(net.trainFcn,mfilename)
net.trainFcn = mfilename;
net.trainParam = INFO.defaultParam;
end
[args,param] = nnparam.extract_param(varargin(2:end),net.trainParam);
err = nntest.param(INFO.parameters,param);
if ~isempty(err), nnerr.throw(nnerr.value(err,'NET.trainParam')); end
if INFO.isSupervised && isempty(net.performFcn) % TODO - fill in MSE
nnerr.throw('Training function is supervised but NET.performFcn is undefined.');
end
if INFO.usesGradient && isempty(net.derivFcn) % TODO - fill in
nnerr.throw('Training function uses derivatives but NET.derivFcn is undefined.');
end
if net.hint.zeroDelay, nnerr.throw('NET contains a zero-delay loop.'); end
[X,T,Xi,Ai,EW] = nnmisc.defaults(args,{},{},{},{},{1});
X = nntype.data('format',X,'Inputs X');
T = nntype.data('format',T,'Targets T');
Xi = nntype.data('format',Xi,'Input states Xi');
Ai = nntype.data('format',Ai,'Layer states Ai');
EW = nntype.nndata_pos('format',EW,'Error weights EW');
% Prepare Data
[net,data,tr,~,err] = nntraining.setup(net,mfilename,X,Xi,Ai,T,EW);
if ~isempty(err), nnerr.throw('Args',err), end
% Train
net = struct(net);
fcns = nn.subfcns(net);
[net,tr] = train_network(net,tr,data,fcns,param);
tr = nntraining.tr_clip(tr);
if isfield(tr,'perf')
tr.best_perf = tr.perf(tr.best_epoch+1);
end
if isfield(tr,'vperf')
tr.best_vperf = tr.vperf(tr.best_epoch+1);
end
if isfield(tr,'tperf')
tr.best_tperf = tr.tperf(tr.best_epoch+1);
end
net.trainFcn = oldTrainFcn;
net.trainParam = oldTrainParam;
out1 = network(net);
out2 = tr;
end
% BOILERPLATE_END
%% =======================================================
% TODO - MU => MU_START
% TODO - alternate parameter names (i.e. MU for MU_START)
function info = get_info()
info = nnfcnTraining(mfilename,'Levenberg-Marquardt',7.0,true,true,...
[ ...
nnetParamInfo('showWindow','Show Training Window Feedback','nntype.bool_scalar',true,...
'Display training window during training.'), ...
nnetParamInfo('showCommandLine','Show Command Line Feedback','nntype.bool_scalar',false,...
'Generate command line output during training.'), ...
nnetParamInfo('show','Command Line Frequency','nntype.strict_pos_int_inf_scalar',25,...
'Frequency to update command line.'), ...
...
nnetParamInfo('epochs','Maximum Epochs','nntype.pos_int_scalar',1000,...
'Maximum number of training iterations before training is stopped.'), ...
nnetParamInfo('time','Maximum Training Time','nntype.pos_inf_scalar',inf,...
'Maximum time in seconds before training is stopped.'), ...
...
nnetParamInfo('goal','Performance Goal','nntype.pos_scalar',0,...
'Performance goal.'), ...
nnetParamInfo('min_grad','Minimum Gradient','nntype.pos_scalar',1e-5,...
'Minimum performance gradient before training is stopped.'), ...
nnetParamInfo('max_fail','Maximum Validation Checks','nntype.strict_pos_int_scalar',6,...
'Maximum number of validation checks before training is stopped.'), ...
...
nnetParamInfo('mu','Mu','nntype.pos_scalar',0.001,...
'Mu.'), ...
nnetParamInfo('mu_dec','Mu Decrease Ratio','nntype.real_0_to_1',0.1,...
'Ratio to decrease mu.'), ...
nnetParamInfo('mu_inc','Mu Increase Ratio','nntype.over1',10,...
'Ratio to increase mu.'), ...
nnetParamInfo('mu_max','Maximum mu','nntype.strict_pos_scalar',1e10,...
'Maximum mu before training is stopped.'), ...
], ...
[ ...
nntraining.state_info('gradient','Gradient','continuous','log') ...
nntraining.state_info('mu','Mu','continuous','log') ...
nntraining.state_info('val_fail','Validation Checks','discrete','linear') ...
]);
end
function err = check_param(param)
err = '';
end
function [net,tr] = train_network(net,tr,data,fcns,param)
% Checks
if isempty(net.performFcn)
warning('nnet:trainlm:Performance',nnwarning.empty_performfcn_corrected);
net.performFcn = 'mse';
net.performParam = mse('defaultParam');
tr.performFcn = net.performFcn;
tr.performParam = net.performParam;
end
if isempty(strmatch(net.performFcn,{'sse','mse'},'exact'))
warning('nnet:trainlm:Performance',nnwarning.nonjacobian_performfcn_replaced);
net.performFcn = 'mse';
net.performParam = mse('defaultParam');
tr.performFcn = net.performFcn;
tr.performParam = net.performParam;
end
% Initialize
startTime = clock;
original_net = net;
[perf,vperf,tperf,je,jj,gradient] = nntraining.perfs_jejj(net,data,fcns);
[best,val_fail] = nntraining.validation_start(net,perf,vperf);
WB = getwb(net);
lengthWB = length(WB);
ii = sparse(1:lengthWB,1:lengthWB,ones(1,lengthWB));
mu = param.mu;
% Training Record
tr.best_epoch = 0;
tr.goal = param.goal;
tr.states = {'epoch','time','perf','vperf','tperf','mu','gradient','val_fail'};
% Status
status = ...
[ ...
nntraining.status('Epoch','iterations','linear','discrete',0,param.epochs,0), ...
nntraining.status('Time','seconds','linear','discrete',0,param.time,0), ...
nntraining.status('Performance','','log','continuous',perf,param.goal,perf) ...
nntraining.status('Gradient','','log','continuous',gradient,param.min_grad,gradient) ...
nntraining.status('Mu','','log','continuous',mu,param.mu_max,mu) ...
nntraining.status('Validation Checks','','linear','discrete',0,param.max_fail,0) ...
];
nn_train_feedback('start',net,status);
% Train
for epoch = 0:param.epochs
% Stopping Criteria
current_time = etime(clock,startTime);
[userStop,userCancel] = nntraintool('check');
if userStop, tr.stop = 'User stop.'; net = best.net;
elseif userCancel, tr.stop = 'User cancel.'; net = original_net;
elseif (perf <= param.goal), tr.stop = 'Performance goal met.'; net = best.net;
elseif (epoch == param.epochs), tr.stop = 'Maximum epoch reached.'; net = best.net;
elseif (current_time >= param.time), tr.stop = 'Maximum time elapsed.'; net = best.net;
elseif (gradient <= param.min_grad), tr.stop = 'Minimum gradient reached.'; net = best.net;
elseif (mu >= param.mu_max), tr.stop = 'Maximum MU reached.'; net = best.net;
elseif (val_fail >= param.max_fail), tr.stop = 'Validation stop.'; net = best.net;
end
% Feedback
tr = nntraining.tr_update(tr,[epoch current_time perf vperf tperf mu gradient val_fail]);
nn_train_feedback('update',net,status,tr,data, ...
[epoch,current_time,best.perf,gradient,mu,val_fail]);
% Stop
if ~isempty(tr.stop), break, end
% Levenberg Marquardt
while (mu <= param.mu_max)
% CHECK FOR SINGULAR MATRIX
[msgstr,msgid] = lastwarn;
lastwarn('MATLAB:nothing','MATLAB:nothing')
warnstate = warning('off','all');
dWB = -(jj+ii*mu) \ je;
[~,msgid1] = lastwarn;
flag_inv = isequal(msgid1,'MATLAB:nothing');
if flag_inv, lastwarn(msgstr,msgid); end;
warning(warnstate)
WB2 = WB + dWB;
net2 = setwb(net,WB2);
perf2 = nntraining.train_perf(net2,data,fcns);
% TODO - possible speed enhancement
% - retain intermediate variables for Memory Reduction = 1
if (perf2 < perf) && flag_inv
WB = WB2; net = net2;
mu = max(mu*param.mu_dec,1e-20);
break
end
mu = mu * param.mu_inc;
end
% Validation
[perf,vperf,tperf,je,jj,gradient] = nntraining.perfs_jejj(net,data,fcns);
[best,tr,val_fail] = nntraining.validation(best,tr,val_fail,net,perf,vperf,epoch);
end
end
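For readers without MATLAB at hand, the core of the loop above, the damped step dWB = -(jj + ii*mu) \ je plus the mu increase/decrease logic, can be sketched in Python on a toy least-squares problem. This is an illustration of the Levenberg-Marquardt idea, not a translation of trainlm:
import numpy as np

# Toy problem: fit y = a*x + b; wb holds the parameters [a, b].
x = np.linspace(0, 1, 20)
y = 3.0 * x + 1.0
wb = np.zeros(2)
mu, mu_inc, mu_dec, mu_max = 1e-3, 10.0, 0.1, 1e10

def residuals(wb):
    return (wb[0] * x + wb[1]) - y

for epoch in range(50):
    r = residuals(wb)
    J = np.stack([x, np.ones_like(x)], axis=1)   # Jacobian of the residuals
    jj, je = J.T @ J, J.T @ r
    perf = (r ** 2).mean()
    # Raise mu until a damped Gauss-Newton step actually reduces the error,
    # then lower it again -- the same logic as the inner while loop in trainlm.
    while mu <= mu_max:
        dwb = -np.linalg.solve(jj + mu * np.eye(2), je)
        if (residuals(wb + dwb) ** 2).mean() < perf:
            wb, mu = wb + dwb, max(mu * mu_dec, 1e-20)
            break
        mu *= mu_inc

print(wb)   # approx [3, 1]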
Ⅴ Does a BP neural network's training set need to be large? How many samples are typical?
What effect does the sample count have on a BP neural network?
While studying neural networks I ran into a question. In a BP network, the number of training passes refers to the network's iteration count: with a samples and n training passes per sample, the network iterates a·n times in total. When n >> a, the network keeps adjusting its weights to reduce the error, and this seems to have little to do with the sample count; moreover, a larger a necessarily means longer training time.
Put differently, treat your dataset as fixed; the training and test sets can then be split by some ratio, such as 7:3. So how should we think about the relationship between the size of the sample set and the training result? Or, how do you make your network more stable and better matched to what you need?
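For reference, a fixed 7:3 split can be done like this in Python (scikit-learn is my choice here; the question itself does not name a tool):
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(100, 3)           # toy features
y = (X[:, 0] > 0.5).astype(int)      # toy labels

# 70% for training, 30% held out for testing, as in the 7:3 ratio above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
print(len(X_train), len(X_test))     # 70 30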
I tried to see the difference using an earlier example:
How to implement a deep neural network algorithm in 70 lines of Java code
The author there actually implemented a BP neural network. Without further ado, look at the final example.
An example that applies the neural network
Finally, let's use a simple example to see the magic of a neural network. To make the data distribution easy to visualize, the data are two-dimensional coordinates. There are 4 data points below: squares are points of class 1 and triangles are points of class 0. The square-class points are (1,2) and (2,1); the triangle-class points are (1,1) and (2,2). The problem is to separate these 4 points into classes 1 and 0 in the plane, and then use the result to predict the class of new data.
[Figure: distribution of the four data points in the plane]
We could use the logistic regression algorithm to solve this classification problem, but logistic regression produces a straight line as the decision boundary. No matter how the red line above is placed, at least one sample always ends up on the wrong side, so a single straight line cannot classify these points correctly. If we use a neural network algorithm instead, we get the classification shown in the figure below, which is effectively the union of several lines partitioning the space, and therefore more accurate.
[Figure: the neural network's piecewise decision boundary]
Quick and dirty: run the author's code and train for 5000 iterations, then use the trained network to predict the class of a new point, (3,1).
Prediction: (3,1) falls into the same class as (1,2) and (2,1), i.e. the squares.
Now, if we remove 2 of the samples, the sample input becomes the following:
//set the sample data: the two remaining coordinate points
double[][] data = new double[][]{{1,2},{2,2}};
//set the target data: the classes of the two points
double[][] target = new double[][]{{1,0},{0,1}};
Then the result for (3,1) becomes a triangle.
If you pick just the first two points, you will find that a single line down the middle already separates them, and the result differs from the 4-point case. So you must add samples until they are classified the way you intend. The importance of the sample count, then, is that the samples must cover all of the feature values (i.e. the inputs); the number of samples and their features (here, the positions of the points) determine what your network learns. This is the conclusion we reached by working backwards, and it feels like one step closer to deep learning.
Also note that this 70-line neural network does not save the trained network, so every run retrains from scratch. Yet after training the weights are fixed, and the network is defined by its weights: if you save the trained weights and later feed data through the network with those weights, you get the output directly. Once the network is fixed, the weights are fixed, and a given input maps to one fixed output that will not change. My personal take.
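A minimal sketch of that idea in Python (the file name and network shape are illustrative, not taken from the Java code):
import numpy as np

weights = 2 * np.random.random((3, 1)) - 1    # stands in for trained weights

np.save("trained_weights.npy", weights)       # persist once after training
restored = np.load("trained_weights.npy")     # reload later, no retraining

x = np.array([3.0, 1.0, 1.0])
print(1 / (1 + np.exp(-x.dot(restored))))     # fixed weights -> fixed output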
Finally, here is the author's source code; see the link at the beginning for the original article.
The implementation below, BpDeep.java, can be used directly:
import java.util.Random;
public class BpDeep{
public double[][] layer;//node values of each network layer
public double[][] layerErr;//error at each network node
public double[][][] layer_weight;//weights between layer nodes
public double[][][] layer_weight_delta;//weight momentum between layer nodes
public double mobp;//momentum coefficient
public double rate;//learning rate
public BpDeep(int[] layernum, double rate, double mobp){
this.mobp = mobp;
this.rate = rate;
layer = new double[layernum.length][];
layerErr = new double[layernum.length][];
layer_weight = new double[layernum.length][][];
layer_weight_delta = new double[layernum.length][][];
Random random = new Random();
for(int l=0;l<layernum.length;l++){
layer[l]=new double[layernum[l]];
layerErr[l]=new double[layernum[l]];
if(l+1<layernum.length){
layer_weight[l]=new double[layernum[l]+1][layernum[l+1]];
layer_weight_delta[l]=new double[layernum[l]+1][layernum[l+1]];
for(int j=0;j<layernum[l]+1;j++)
for(int i=0;i<layernum[l+1];i++)
layer_weight[l][j][i]=random.nextDouble();//randomly initialize the weights
}
}
}
//compute the output layer by layer, front to back
public double[] computeOut(double[] in){
for(int l=1;l<layer.length;l++){
for(int j=0;j<layer[l].length;j++){
double z=layer_weight[l-1][layer[l-1].length][j];
for(int i=0;i<layer[l-1].length;i++){
layer[l-1][i]=l==1?in[i]:layer[l-1][i];
z+=layer_weight[l-1][i][j]*layer[l-1][i];
}
layer[l][j]=1/(1+Math.exp(-z));
}
}
return layer[layer.length-1];
}
//back-propagate the errors layer by layer and update the weights
public void updateWeight(double[] tar){
int l=layer.length-1;
for(int j=0;j<layerErr[l].length;j++)
layerErr[l][j]=layer[l][j]*(1-layer[l][j])*(tar[j]-layer[l][j]);
while(l-->0){
for(int j=0;j<layerErr[l].length;j++){
double z = 0.0;
for(int i=0;i<layerErr[l+1].length;i++){
z=z+l>0?layerErr[l+1][i]*layer_weight[l][j][i]:0;
layer_weight_delta[l][j][i]= mobp*layer_weight_delta[l][j][i]+rate*layerErr[l+1][i]*layer[l][j];//momentum-adjusted delta for hidden-layer weights
layer_weight[l][j][i]+=layer_weight_delta[l][j][i];//hidden-layer weight update
if(j==layerErr[l].length-1){
layer_weight_delta[l][j+1][i]= mobp*layer_weight_delta[l][j+1][i]+rate*layerErr[l+1][i];//momentum-adjusted delta for the intercept (bias)
layer_weight[l][j+1][i]+=layer_weight_delta[l][j+1][i];//intercept (bias) weight update
}
}
layerErr[l][j]=z*layer[l][j]*(1-layer[l][j]);//record the error
}
}
}
public void train(double[] in, double[] tar){
double[] out = computeOut(in);
updateWeight(tar);
}
}
Below is the source of the test program, BpDeepTest.java:
import java.util.Arrays;
public class BpDeepTest{
public static void main(String[] args){
//initialize the basic configuration of the neural network
//the first argument is an int array giving the number of layers and nodes per layer; e.g. {3,10,10,10,10,2} means 3 input nodes, 2 output nodes, and 4 hidden layers of 10 nodes each
//the second argument is the learning rate, the third the momentum coefficient
BpDeep bp = new BpDeep(new int[]{2,10,2}, 0.15, 0.8);
//set the sample data, the 4 two-dimensional points above
double[][] data = new double[][]{{1,2},{2,2},{1,1},{2,1}};
//set the target data, the classes of the 4 points
double[][] target = new double[][]{{1,0},{0,1},{0,1},{1,0}};
//train for 5000 iterations
for(int n=0;n<5000;n++)
for(int i=0;i<data.length;i++)
bp.train(data[i], target[i]);
//check the sample data against the training result
for(int j=0;j<data.length;j++){
double[] result = bp.computeOut(data[j]);
System.out.println(Arrays.toString(data[j])+":"+Arrays.toString(result));
}
//use the training result to predict the class of one new data point
double[] x = new double[]{3,1};
double[] result = bp.computeOut(x);
System.out.println(Arrays.toString(x)+":"+Arrays.toString(result));
}
}
Ⅵ For neural network research and applications, is Python or MATLAB better?
Arguably neither is simply better or worse: whichever you like using is good. In terms of the amount of code available to learn from, though, MATLAB has a very large body of source code. Most importantly, MATLAB ships a neural network toolbox, and its visual interface makes it easier to tune parameters. If you need a neural network for some data analysis and your data is not large, MATLAB is recommended; its ready-made toolboxes are very complete.
If you are already familiar with neural networks, intend to put them into production, and your data is large, then rewriting the algorithm you need in C or another language you consider performant will not take much effort; it saves you time later and you learn along the way. In short, whatever you study and whichever path you take, you will reach your goal; what matters is the learning process.
Ⅶ Could you please send the C++ source code of the BP neural network algorithm as soon as possible, [email protected]
#pragma hdrstop
#include <stdio.h>
#include <stdlib.h> // for the Borland randomize()/random() helpers used below
#include <math.h> // for exp() and pow()
#include <iostream.h>
const double A=30.0;
const double B=10.0;
const int MAX=500; //maximum number of training passes
const double COEF=0.0035; //learning rate of the network
const double BCOEF=0.001;//threshold (bias) adjustment rate
const double ERROR=0.002; //allowed per-sample error during training
const double ACCURACY=0.0005;//required overall network precision
double sample[41][4]={{0,0,0,0},{5,1,4,19.020},{5,3,3,14.150},
{5,5,2,14.360},{5,3,3,14.150},{5,3,2,15.390},
{5,3,2,15.390},{5,5,1,19.680},{5,1,2,21.060},
{5,3,3,14.150},{5,5,4,12.680},{5,5,2,14.360},
{5,1,3,19.610},{5,3,4,13.650},{5,5,5,12.430},
{5,1,4,19.020},{5,1,4,19.020},{5,3,5,13.390},
{5,5,4,12.680},{5,1,3,19.610},{5,3,2,15.390},
{1,3,1,11.110},{1,5,2,6.521},{1,1,3,10.190},
{1,3,4,6.043},{1,5,5,5.242},{1,5,3,5.724},
{1,1,4,9.766},{1,3,5,5.870},{1,5,4,5.406},
{1,1,3,10.190},{1,1,5,9.545},{1,3,4,6.043},
{1,5,3,5.724},{1,1,2,11.250},{1,3,1,11.110},
{1,3,3,6.380},{1,5,2,6.521},{1,1,1,16.000},
{1,3,2,7.219},{1,5,3,5.724}};
double w[4][10][10],wc[4][10][10],b[4][10],bc[4][10];
double o[4][10],netin[4][10],d[4][10],differ;//error of a single sample
double is; //mean squared error over all samples
int count,a;
void netout(int m, int n);//compute the outputs of the hidden and output layers
void calculd(int m,int n); //compute the back-propagated errors of the network
void calculwc(int m,int n);//compute the adjustments to the network weights
void calculbc(int m,int n); //compute the adjustments to the network thresholds
void changw(int m,int n); //update the network weights
void changb(int m,int n);//update the network thresholds
void clearwc(void);//clear the weight-change array wc
void clearbc(void);//clear the threshold-change array bc
void initialw(void);//initialize the network weights W
void initialb(void); //initialize the network thresholds
void calculdiffer(void);//compute the error of a single sample
void calculis(void);//compute the error over all samples
void trainNN(void);//train the network
/*compute the outputs of the hidden and output layers */
void netout(int m,int n)
{
int i,j,k;
//outputs of the hidden-layer nodes
for (j=1,i=2;j<=m;j++) //m is the number of hidden-layer nodes
{
netin[i][j]=0.0;
for(k=1;k<=3;k++)//each hidden node has three input variables
netin[i][j]=netin[i][j]+o[i-1][k]*w[i][k][j];
netin[i][j]=netin[i][j]-b[i][j];
o[i][j]=A/(1+exp(-netin[i][j]/B));
}
//outputs of the output-layer nodes
for (j=1,i=3;j<=n;j++)
{
netin[i][j]=0.0;
for (k=1;k<=m;k++)
netin[i][j]=netin[i][j]+o[i-1][k]*w[i][k][j];
netin[i][j]=netin[i][j]-b[i][j];
o[i][j]=A/(1+exp(-netin[i][j]/B)) ;
}
}
/*compute the back-propagated errors of the network*/
void calculd(int m,int n)
{
int i,j,k;
double t;
a=count-1;
d[3][1]=(o[3][1]-sample[a][3])*(A/B)*exp(-netin[3][1]/B)/pow(1+exp(-netin[3][1]/B),2);
//hidden-layer errors
for (j=1,i=2;j<=m;j++)
{
t=0.00;
for (k=1;k<=n;k++)
t=t+w[i+1][j][k]*d[i+1][k];
d[i][j]=t*(A/B)*exp(-netin[i][j]/B)/pow(1+exp(-netin[i][j]/B),2);
}
}
/*compute the adjustments to the network weights W*/
void calculwc(int m,int n)
{
int i,j,k;
// adjust the connection weights between the output layer (layer 3) and the hidden layer (layer 2)
for (i=1,k=3;i<=m;i++)
{
for (j=1;j<=n;j++)
{
wc[k][i][j]=-COEF*d[k][j]*o[k-1][i]+0.5*wc[k][i][j];
}
// printf("\n");
}
//adjust the connection weights between the hidden layer and the input layer
for (i=1,k=2;i<=m;i++)
{
for (j=1;j<=m;j++)
{
wc[k][i][j]=-COEF*d[k][j]*o[k-1][i]+0.5*wc[k][i][j];
}
// printf("\n");
}
}
/*compute the adjustments to the network thresholds*/
void calculbc(int m,int n)
{
int j;
for (j=1;j<=m;j++)
{
bc[2][j]=BCOEF*d[2][j];
}
for (j=1;j<=n;j++)
{
bc[3][j]=BCOEF*d[3][j];
}
}
/*update the network weights*/
void changw(int m,int n)
{
int i,j;
for (i=1;i<=3;i++)
for (j=1;j<=m;j++)
{
w[2][i][j]=0.9*w[2][i][j]+wc[2][i][j];
//multiply by an inertia coefficient of 0.9 so the system stays robust
printf("w[2][%d][%d]=%f\n",i,j,w[2][i][j]);
}
for (i=1;i<=m;i++)
for (j=1;j<=n;j++)
{
w[3][i][j]=0.9*w[3][i][j]+wc[3][i][j];
printf("w[3][%d][%d]=%f\n",i,j,w[3][i][j]);
}
}
/*update the network thresholds*/
void changb(int m,int n)
{
int j;
for (j=1;j<=m;j++)
b[2][j]=b[2][j]+bc[2][j];
for (j=1;j<=n;j++)
b[3][j]=b[3][j]+bc[3][j];
}
/*clear the weight-change array wc*/
void clearwc(void)
{
for (int i=0;i<4;i++)
for (int j=0;j<10;j++)
for (int k=0;k<10;k++)
wc[i][j][k]=0.00;
}
/*clear the threshold-change array bc*/
void clearbc(void)
{
for (int i=0;i<4;i++)
for (int j=0;j<10;j++)
bc[i][j]=0.00;
}
/*initialize the network weights W*/
void initialw(void)
{
int i,j,k,x;
double weight;
for (i=0;i<4;i++)
for (j=0;j<10;j++)
for (k=0;k<10;k++)
{
randomize();
x=100+random(400);
weight=(double)x/5000.00;
w[i][j][k]=weight;
}
}
/*initialize the network thresholds*/
void initialb(void)
{
int i,j,x;
double fa;
for (i=0;i<4;i++)
for (j=0;j<10;j++)
{
randomize();
for (int k=0;k<12;k++)
{
x=100+random(400);
}
fa=(double)x/50000.00;
b[i][j]=fa;
}
}
/*compute the network error for a single sample*/
void calculdiffer(void)
{
a=count-1;
differ=0.5*(o[3][1]-sample[a][3])*(o[3][1]-sample[a][3]);
}
void calculis(void)
{
int i;
is=0.0;
for (i=0;i<=19;i++)
{
o[1][1]=sample[i][0];
o[1][2]=sample[i][1];
o[1][3]=sample[i][2];
netout(8,1);
is=is+(o[3][1]-sample[i][3])*(o[3][1]-sample[i][3]);
}
is=is/20;
}
/*train the network*/
void trainNN(void)
{
long int time;
int i,x[4];
initialw();
initialb();
for (time=1;time<=MAX;time++)
{
count=0;
while(count<=40)
{
o[1][1]=sample[count][0];
o[1][2]=sample[count][1];
o[1][3]=sample[count][2];
count=count+1;
clearwc();
clearbc();
netout(8,1);
calculdiffer();
while(differ>ERROR)
{
calculd(8,1);
calculwc(8,1);
calculbc(8,1);
changw(8,1);
changb(8,1);
netout(8,1);
calculdiffer();
}
}
printf("This is %d times training NN...\n",time);
calculis();
printf("is==%f\n",is);
if (is<ACCURACY) break;
}
}
//---------------------------------------------------------------------------
#pragma argsused
int main(int argc, char* argv[])
{
double result;
int m,test[4];
char ch='y';
cout<<"Please wait for the train of NN:"<<endl;
trainNN();
cout<<"Now,this molar network can work for you."<<endl;
while(ch=='y' || ch=='Y')
{
cout<<"Please input data to be tested."<<endl;
for (m=1;m<=3;m++)
cin>>test[m];
ch=getchar();
o[1][1]=test[1];
o[1][2]=test[2];
o[1][3]=test[3];
netout(8,1);
result=o[3][1];
printf("Final result is %f.\n",result);
printf("Still test?[Yes] or [No]\n");
ch=getchar();
}
return 0;
}
Ⅷ Request for C++ source code of the BP neural network algorithm
// AnnBP.cpp: implementation of the CAnnBP class.
//
//////////////////////////////////////////////////////////////////////
#include "StdAfx.h"
#include "AnnBP.h"
#include "math.h"
//////////////////////////////////////////////////////////////////////
// Construction/Destruction
//////////////////////////////////////////////////////////////////////
CAnnBP::CAnnBP()
{
eta1=0.3;
momentum1=0.3;
}
CAnnBP::~CAnnBP()
{
}
double CAnnBP::drnd()
{
return ((double) rand() / (double) BIGRND);
}
/*** return a double-precision random number between -1.0 and 1.0 ***/
double CAnnBP::dpn1()
{
return (double) (rand())/(32767/2)-1;
}
/*** activation function, currently the sigmoid ***/
double CAnnBP::squash(double x)
{
return (1.0 / (1.0 + exp(-x)));
}
/*** allocate a 1-D array of doubles ***/
double* CAnnBP::alloc_1d_dbl(int n)
{
double *new1;
new1 = (double *) malloc ((unsigned) (n * sizeof (double)));
if (new1 == NULL) {
AfxMessageBox("ALLOC_1D_DBL: Couldn't allocate array of doubles\n");
return (NULL);
}
return (new1);
}
/*** allocate a 2-D array of doubles ***/
double** CAnnBP::alloc_2d_dbl(int m, int n)
{
int i;
double **new1;
new1 = (double **) malloc ((unsigned) (m * sizeof (double *)));
if (new1 == NULL) {
AfxMessageBox("ALLOC_2D_DBL: Couldn't allocate array of dbl ptrs\n");
return (NULL);
}
for (i = 0; i < m; i++) {
new1[i] = alloc_1d_dbl(n);
}
return (new1);
}
/*** randomly initialize the weights ***/
void CAnnBP::bpnn_randomize_weights(double **w, int m, int n)
{
int i, j;
for (i = 0; i <= m; i++) {
for (j = 0; j <= n; j++) {
w[i][j] = dpn1();
}
}
}
/*** zero-initialize the weights ***/
void CAnnBP::bpnn_zero_weights(double **w, int m, int n)
{
int i, j;
for (i = 0; i <= m; i++) {
for (j = 0; j <= n; j++) {
w[i][j] = 0.0;
}
}
}
/*** set the random number seed ***/
void CAnnBP::bpnn_initialize(int seed)
{
CString msg,s;
msg="Random number generator seed:";
s.Format("%d",seed);
AfxMessageBox(msg+s);
srand(seed);
}
/*** create a BP network ***/
BPNN* CAnnBP::bpnn_internal_create(int n_in, int n_hidden, int n_out)
{
BPNN *newnet;
newnet = (BPNN *) malloc (sizeof (BPNN));
if (newnet == NULL) {
printf("BPNN_CREATE: Couldn't allocate neural network\n");
return (NULL);
}
newnet->input_n = n_in;
newnet->hidden_n = n_hidden;
newnet->output_n = n_out;
newnet->input_units = alloc_1d_dbl(n_in + 1);
newnet->hidden_units = alloc_1d_dbl(n_hidden + 1);
newnet->output_units = alloc_1d_dbl(n_out + 1);
newnet->hidden_delta = alloc_1d_dbl(n_hidden + 1);
newnet->output_delta = alloc_1d_dbl(n_out + 1);
newnet->target = alloc_1d_dbl(n_out + 1);
newnet->input_weights = alloc_2d_dbl(n_in + 1, n_hidden + 1);
newnet->hidden_weights = alloc_2d_dbl(n_hidden + 1, n_out + 1);
newnet->input_prev_weights = alloc_2d_dbl(n_in + 1, n_hidden + 1);
newnet->hidden_prev_weights = alloc_2d_dbl(n_hidden + 1, n_out + 1);
return (newnet);
}
/* free the memory occupied by the BP network */
void CAnnBP::bpnn_free(BPNN *net)
{
int n1, n2, i;
n1 = net->input_n;
n2 = net->hidden_n;
free((char *) net->input_units);
free((char *) net->hidden_units);
free((char *) net->output_units);
free((char *) net->hidden_delta);
free((char *) net->output_delta);
free((char *) net->target);
for (i = 0; i <= n1; i++) {
free((char *) net->input_weights[i]);
free((char *) net->input_prev_weights[i]);
}
free((char *) net->input_weights);
free((char *) net->input_prev_weights);
for (i = 0; i <= n2; i++) {
free((char *) net->hidden_weights[i]);
free((char *) net->hidden_prev_weights[i]);
}
free((char *) net->hidden_weights);
free((char *) net->hidden_prev_weights);
free((char *) net);
}
/*** create a BP network and initialize its weights ***/
BPNN* CAnnBP::bpnn_create(int n_in, int n_hidden, int n_out)
{
BPNN *newnet;
newnet = bpnn_internal_create(n_in, n_hidden, n_out);
#ifdef INITZERO
bpnn_zero_weights(newnet->input_weights, n_in, n_hidden);
#else
bpnn_randomize_weights(newnet->input_weights, n_in, n_hidden);
#endif
bpnn_randomize_weights(newnet->hidden_weights, n_hidden, n_out);
bpnn_zero_weights(newnet->input_prev_weights, n_in, n_hidden);
bpnn_zero_weights(newnet->hidden_prev_weights, n_hidden, n_out);
return (newnet);
}
void CAnnBP::bpnn_layerforward(double *l1, double *l2, double **conn, int n1, int n2)
{
double sum;
int j, k;
/*** set the threshold unit ***/
l1[0] = 1.0;
/*** for each neuron in the second layer ***/
for (j = 1; j <= n2; j++) {
/*** compute the weighted sum of the inputs ***/
sum = 0.0;
for (k = 0; k <= n1; k++) {
sum += conn[k][j] * l1[k];
}
l2[j] = squash(sum);
}
}
/* output-layer error */
void CAnnBP::bpnn_output_error(double *delta, double *target, double *output, int nj, double *err)
{
int j;
double o, t, errsum;
errsum = 0.0;
for (j = 1; j <= nj; j++) {
o = output[j];
t = target[j];
delta[j] = o * (1.0 - o) * (t - o);
errsum += ABS(delta[j]);
}
*err = errsum;
}
/* hidden-layer error */
void CAnnBP::bpnn_hidden_error(double *delta_h, int nh, double *delta_o, int no, double **who, double *hidden, double *err)
{
int j, k;
double h, sum, errsum;
errsum = 0.0;
for (j = 1; j <= nh; j++) {
h = hidden[j];
sum = 0.0;
for (k = 1; k <= no; k++) {
sum += delta_o[k] * who[j][k];
}
delta_h[j] = h * (1.0 - h) * sum;
errsum += ABS(delta_h[j]);
}
*err = errsum;
}
/* adjust the weights */
void CAnnBP::bpnn_adjust_weights(double *delta, int ndelta, double *ly, int nly, double **w, double **oldw, double eta, double momentum)
{
double new_dw;
int k, j;
ly[0] = 1.0;
for (j = 1; j <= ndelta; j++) {
for (k = 0; k <= nly; k++) {
new_dw = ((eta * delta[j] * ly[k]) + (momentum * oldw[k][j]));
w[k][j] += new_dw;
oldw[k][j] = new_dw;
}
}
}
/* forward pass */
void CAnnBP::bpnn_feedforward(BPNN *net)
{
int in, hid, out;
in = net->input_n;
hid = net->hidden_n;
out = net->output_n;
/*** Feed forward input activations. ***/
bpnn_layerforward(net->input_units, net->hidden_units,
net->input_weights, in, hid);
bpnn_layerforward(net->hidden_units, net->output_units,
net->hidden_weights, hid, out);
}
/* train the BP network */
void CAnnBP::bpnn_train(BPNN *net, double eta, double momentum, double *eo, double *eh)
{
int in, hid, out;
double out_err, hid_err;
in = net->input_n;
hid = net->hidden_n;
out = net->output_n;
/*** feed forward the input activations ***/
bpnn_layerforward(net->input_units, net->hidden_units,
net->input_weights, in, hid);
bpnn_layerforward(net->hidden_units, net->output_units,
net->hidden_weights, hid, out);
/*** compute hidden- and output-layer errors ***/
bpnn_output_error(net->output_delta, net->target, net->output_units,
out, &out_err);
bpnn_hidden_error(net->hidden_delta, hid, net->output_delta, out,
net->hidden_weights, net->hidden_units, &hid_err);
*eo = out_err;
*eh = hid_err;
/*** adjust the input- and hidden-layer weights ***/
bpnn_adjust_weights(net->output_delta, out, net->hidden_units, hid,
net->hidden_weights, net->hidden_prev_weights, eta, momentum);
bpnn_adjust_weights(net->hidden_delta, hid, net->input_units, in,
net->input_weights, net->input_prev_weights, eta, momentum);
}
/* save the BP network */
void CAnnBP::bpnn_save(BPNN *net, char *filename)
{
CFile file;
char *mem;
int n1, n2, n3, i, j, memcnt;
double dvalue, **w;
n1 = net->input_n; n2 = net->hidden_n; n3 = net->output_n;
printf("Saving %dx%dx%d network to '%s'\n", n1, n2, n3, filename);
try
{
file.Open(filename,CFile::modeWrite|CFile::modeCreate|CFile::modeNoTruncate);
}
catch(CFileException* e)
{
e->ReportError();
e->Delete();
}
file.Write(&n1,sizeof(int));
file.Write(&n2,sizeof(int));
file.Write(&n3,sizeof(int));
memcnt = 0;
w = net->input_weights;
mem = (char *) malloc ((unsigned) ((n1+1) * (n2+1) * sizeof(double)));
// mem = (char *) malloc (((n1+1) * (n2+1) * sizeof(double)));
for (i = 0; i <= n1; i++) {
for (j = 0; j <= n2; j++) {
dvalue = w[i][j];
// the call name "fast" is garbled in this copy; memcpy matches the intent (likely a fastcopy/memcpy macro)
memcpy(&mem[memcnt], &dvalue, sizeof(double));
memcnt += sizeof(double);
}
}
file.Write(mem,sizeof(double)*(n1+1)*(n2+1));
free(mem);
memcnt = 0;
w = net->hidden_weights;
mem = (char *) malloc ((unsigned) ((n2+1) * (n3+1) * sizeof(double)));
// mem = (char *) malloc (((n2+1) * (n3+1) * sizeof(double)));
for (i = 0; i <= n2; i++) {
for (j = 0; j <= n3; j++) {
dvalue = w[i][j];
memcpy(&mem[memcnt], &dvalue, sizeof(double)); // "fast" garbled in this copy; memcpy matches the intent
memcnt += sizeof(double);
}
}
file.Write(mem, (n2+1) * (n3+1) * sizeof(double));
free(mem);
file.Close();
return;
}
/* read a BP network from a file */
BPNN* CAnnBP::bpnn_read(char *filename)
{
char *mem;
BPNN *new1;
int n1, n2, n3, i, j, memcnt;
CFile file;
try
{
file.Open(filename,CFile::modeRead|CFile::modeCreate|CFile::modeNoTruncate);
}
catch(CFileException* e)
{
e->ReportError();
e->Delete();
}
// printf("Reading '%s'\n", filename);// fflush(stdout);
file.Read(&n1, sizeof(int));
file.Read(&n2, sizeof(int));
file.Read(&n3, sizeof(int));
new1 = bpnn_internal_create(n1, n2, n3);
// printf("'%s' contains a %dx%dx%d network\n", filename, n1, n2, n3);
// printf("Reading input weights..."); // fflush(stdout);
memcnt = 0;
mem = (char *) malloc (((n1+1) * (n2+1) * sizeof(double)));
file.Read(mem, ((n1+1)*(n2+1))*sizeof(double));
for (i = 0; i <= n1; i++) {
for (j = 0; j <= n2; j++) {
// "fast" garbled in this copy; memcpy matches the intent
memcpy(&(new1->input_weights[i][j]), &mem[memcnt], sizeof(double));
memcnt += sizeof(double);
}
}
free(mem);
// printf("Done\nReading hidden weights..."); //fflush(stdout);
memcnt = 0;
mem = (char *) malloc (((n2+1) * (n3+1) * sizeof(double)));
file.Read(mem, (n2+1) * (n3+1) * sizeof(double));
for (i = 0; i <= n2; i++) {
for (j = 0; j <= n3; j++) {
// "fast" garbled in this copy; memcpy matches the intent
memcpy(&(new1->hidden_weights[i][j]), &mem[memcnt], sizeof(double));
memcnt += sizeof(double);
}
}
free(mem);
file.Close();
printf("Done\n"); //fflush(stdout);
bpnn_zero_weights(new1->input_prev_weights, n1, n2);
bpnn_zero_weights(new1->hidden_prev_weights, n2, n3);
return (new1);
}
void CAnnBP::CreateBP(int n_in, int n_hidden, int n_out)
{
net=bpnn_create(n_in,n_hidden,n_out);
}
void CAnnBP::FreeBP()
{
bpnn_free(net);
}
void CAnnBP::Train(double *input_unit,int input_num, double *target,int target_num, double *eo, double *eh)
{
for(int i=1;i<=input_num;i++)
{
net->input_units[i]=input_unit[i-1];
}
for(int j=1;j<=target_num;j++)
{
net->target[j]=target[j-1];
}
bpnn_train(net,eta1,momentum1,eo,eh);
}
void CAnnBP::Identify(double *input_unit,int input_num,double *target,int target_num)
{
for(int i=1;i<=input_num;i++)
{
net->input_units[i]=input_unit[i-1];
}
bpnn_feedforward(net);
for(int j=1;j<=target_num;j++)
{
target[j-1]=net->output_units[j];
}
}
void CAnnBP::Save(char *filename)
{
bpnn_save(net,filename);
}
void CAnnBP::Read(char *filename)
{
net=bpnn_read(filename);
}
void CAnnBP::SetBParm(double eta, double momentum)
{
eta1=eta;
momentum1=momentum;
}
void CAnnBP::Initialize(int seed)
{
bpnn_initialize(seed);
}
Ⅸ How to build a convolutional neural network model in Python
Last weekend I implemented a simple convolutional neural network in Python. It contains only one convolution layer and one max-pooling layer, and the multilayer network after the pooling layer uses a softmax output. The input is again MNIST images; with 10 feature maps, the convolution and pooling results are as shown below.
Part of the source code follows:
#coding=utf-8
'''
Created on 2014-11-30
@author: Wangliaofan
'''
import numpy
import struct
import matplotlib.pyplot as plt
import math
import random
import BMNN2  # module name truncated in the original; inferred from the BMNN2.MuiltilayerANN call below
#test
def sigmoid(inX):
    if 1.0 + numpy.exp(-inX) == 0.0:
        return 999999999.999999999
    return 1.0 / (1.0 + numpy.exp(-inX))

def difsigmoid(inX):
    return sigmoid(inX) * (1.0 - sigmoid(inX))

def tangenth(inX):
    return (1.0*math.exp(inX) - 1.0*math.exp(-inX)) / (1.0*math.exp(inX) + 1.0*math.exp(-inX))
def cnn_conv(in_image, filter_map, B, type_func='sigmoid'):
    #in_image[num,feature map,row,col] => in_image[Irow,Icol]
    #filter_map[k filter,row,col]
    #type_func in ['sigmoid','tangenth']
    #out_feature[k filter,Irow-row+1,Icol-col+1]
    shape_image = numpy.shape(in_image)  #[row,col]
    #print "shape_image", shape_image
    shape_filter = numpy.shape(filter_map)  #[k filter,row,col]
    if shape_filter[1] > shape_image[0] or shape_filter[2] > shape_image[1]:
        raise Exception
    shape_out = (shape_filter[0], shape_image[0]-shape_filter[1]+1, shape_image[1]-shape_filter[2]+1)
    out_feature = numpy.zeros(shape_out)
    k, m, n = numpy.shape(out_feature)
    for k_idx in range(0, k):
        #rotate 180 to calculate conv
        c_filter = numpy.rot90(filter_map[k_idx, :, :], 2)
        for r_idx in range(0, m):
            for c_idx in range(0, n):
                #conv_temp=numpy.zeros((shape_filter[1],shape_filter[2]))
                conv_temp = numpy.dot(in_image[r_idx:r_idx+shape_filter[1], c_idx:c_idx+shape_filter[2]], c_filter)
                sum_temp = numpy.sum(conv_temp)
                if type_func == 'sigmoid':
                    out_feature[k_idx, r_idx, c_idx] = sigmoid(sum_temp + B[k_idx])
                elif type_func == 'tangenth':
                    out_feature[k_idx, r_idx, c_idx] = tangenth(sum_temp + B[k_idx])
                else:
                    raise Exception
    return out_feature
def cnn_maxpooling(out_feature, pooling_size=2, type_pooling="max"):
    k, row, col = numpy.shape(out_feature)
    max_index_Matirx = numpy.zeros((k, row, col))
    out_row = int(numpy.floor(row / pooling_size))
    out_col = int(numpy.floor(col / pooling_size))
    out_pooling = numpy.zeros((k, out_row, out_col))
    for k_idx in range(0, k):
        for r_idx in range(0, out_row):
            for c_idx in range(0, out_col):
                temp_matrix = out_feature[k_idx, pooling_size*r_idx:pooling_size*r_idx+pooling_size, pooling_size*c_idx:pooling_size*c_idx+pooling_size]
                out_pooling[k_idx, r_idx, c_idx] = numpy.amax(temp_matrix)
                max_index = numpy.argmax(temp_matrix)
                #print max_index
                #print max_index/pooling_size, max_index%pooling_size
                max_index_Matirx[k_idx, pooling_size*r_idx+max_index/pooling_size, pooling_size*c_idx+max_index%pooling_size] = 1
    return out_pooling, max_index_Matirx

def poolwithfunc(in_pooling, W, B, type_func='sigmoid'):
    k, row, col = numpy.shape(in_pooling)
    out_pooling = numpy.zeros((k, row, col))
    for k_idx in range(0, k):
        for r_idx in range(0, row):
            for c_idx in range(0, col):
                out_pooling[k_idx, r_idx, c_idx] = sigmoid(W[k_idx]*in_pooling[k_idx, r_idx, c_idx] + B[k_idx])
    return out_pooling
#out_feature is the output of conv
def backErrorfromPoolToConv(theta, max_index_Matirx, out_feature, pooling_size=2):
    k1, row, col = numpy.shape(out_feature)
    error_conv = numpy.zeros((k1, row, col))
    k2, theta_row, theta_col = numpy.shape(theta)
    if k1 != k2:
        raise Exception
    for idx_k in range(0, k1):
        for idx_row in range(0, row):
            for idx_col in range(0, col):
                error_conv[idx_k, idx_row, idx_col] = \
                    max_index_Matirx[idx_k, idx_row, idx_col] * \
                    float(theta[idx_k, idx_row/pooling_size, idx_col/pooling_size]) * \
                    difsigmoid(out_feature[idx_k, idx_row, idx_col])
    return error_conv

def backErrorfromConvToInput(theta, inputImage):
    k1, row, col = numpy.shape(theta)
    #print "theta", k1, row, col
    i_row, i_col = numpy.shape(inputImage)
    if row > i_row or col > i_col:
        raise Exception
    filter_row = i_row - row + 1
    filter_col = i_col - col + 1
    detaW = numpy.zeros((k1, filter_row, filter_col))
    #the same as conv 'valid' in matlab
    for k_idx in range(0, k1):
        for idx_row in range(0, filter_row):
            for idx_col in range(0, filter_col):
                subInputMatrix = inputImage[idx_row:idx_row+row, idx_col:idx_col+col]
                #print "subInputMatrix", numpy.shape(subInputMatrix)
                #rotate theta 180
                #print numpy.shape(theta)
                theta_rotate = numpy.rot90(theta[k_idx, :, :], 2)
                #print "theta_rotate", theta_rotate
                dotMatrix = numpy.dot(subInputMatrix, theta_rotate)
                detaW[k_idx, idx_row, idx_col] = numpy.sum(dotMatrix)
    detaB = numpy.zeros((k1, 1))
    for k_idx in range(0, k1):
        detaB[k_idx] = numpy.sum(theta[k_idx, :, :])
    return detaW, detaB
def loadMNISTimage(absFilePathandName, datanum=60000):
    images = open(absFilePathandName, 'rb')
    buf = images.read()
    index = 0
    magic, numImages, numRows, numColumns = struct.unpack_from('>IIII', buf, index)
    print magic, numImages, numRows, numColumns
    index += struct.calcsize('>IIII')
    if magic != 2051:
        raise Exception
    datasize = int(784*datanum)
    datablock = ">" + str(datasize) + "B"
    #nextmatrix=struct.unpack_from('>47040000B',buf,index)
    nextmatrix = struct.unpack_from(datablock, buf, index)
    nextmatrix = numpy.array(nextmatrix)/255.0
    #nextmatrix=nextmatrix.reshape(numImages,numRows,numColumns)
    #nextmatrix=nextmatrix.reshape(datanum,1,numRows*numColumns)
    nextmatrix = nextmatrix.reshape(datanum, 1, numRows, numColumns)
    return nextmatrix, numImages

def loadMNISTlabels(absFilePathandName, datanum=60000):
    labels = open(absFilePathandName, 'rb')
    buf = labels.read()
    index = 0
    magic, numLabels = struct.unpack_from('>II', buf, index)
    print magic, numLabels
    index += struct.calcsize('>II')
    if magic != 2049:
        raise Exception
    datablock = ">" + str(datanum) + "B"
    #nextmatrix=struct.unpack_from('>60000B',buf,index)
    nextmatrix = struct.unpack_from(datablock, buf, index)
    nextmatrix = numpy.array(nextmatrix)
    return nextmatrix, numLabels
def simpleCNN(numofFilter, filter_size, pooling_size=2, maxIter=1000, imageNum=500):
    decayRate = 0.01
    MNISTimage, num1 = loadMNISTimage("F:\\train-images-idx3-ubyte", imageNum)  # backslash escaped: "\t" would be a tab
    print num1
    row, col = numpy.shape(MNISTimage[0, 0, :, :])
    out_Di = numofFilter*((row-filter_size+1)/pooling_size)*((col-filter_size+1)/pooling_size)
    MLP = BMNN2.MuiltilayerANN(1, [128], out_Di, 10, maxIter)
    MLP.setTrainDataNum(imageNum)
    MLP.loadtrainlabel("F:\\train-labels-idx1-ubyte")
    MLP.initialweights()
    #MLP.printWeightMatrix()
    rng = numpy.random.RandomState(23455)
    W_shp = (numofFilter, filter_size, filter_size)
    W_bound = numpy.sqrt(numofFilter*filter_size*filter_size)
    W_k = rng.uniform(low=-1.0/W_bound, high=1.0/W_bound, size=W_shp)
    B_shp = (numofFilter,)
    B = numpy.asarray(rng.uniform(low=-.5, high=.5, size=B_shp))
    cIter = 0
    while cIter < maxIter:
        cIter += 1
        ImageNum = random.randint(0, imageNum-1)
        conv_out_map = cnn_conv(MNISTimage[ImageNum, 0, :, :], W_k, B, "sigmoid")
        out_pooling, max_index_Matrix = cnn_maxpooling(conv_out_map, 2, "max")
        pool_shape = numpy.shape(out_pooling)
        MLP_input = out_pooling.reshape(1, 1, out_Di)
        #print numpy.shape(MLP_input)
        DetaW, DetaB, temperror = MLP.backwardPropogation(MLP_input, ImageNum)
        if cIter % 50 == 0:
            print cIter, "Temp error:", temperror
        #print numpy.shape(MLP.Theta[MLP.Nl-2])
        #print numpy.shape(MLP.Ztemp[0])
        #print numpy.shape(MLP.weightMatrix[0])
        theta_pool = MLP.Theta[MLP.Nl-2]*MLP.weightMatrix[0].transpose()
        #print numpy.shape(theta_pool)
        #print "theta_pool", theta_pool
        temp = numpy.zeros((1, 1, out_Di))
        temp[0, :, :] = theta_pool
        back_theta_pool = temp.reshape(pool_shape)
        #print "back_theta_pool", numpy.shape(back_theta_pool)
        #print "back_theta_pool", back_theta_pool
        error_conv = backErrorfromPoolToConv(back_theta_pool, max_index_Matrix, conv_out_map, 2)
        #print "error_conv", numpy.shape(error_conv)
        #print error_conv
        conv_DetaW, conv_DetaB = backErrorfromConvToInput(error_conv, MNISTimage[ImageNum, 0, :, :])
        #print "W_k", W_k
        #print "conv_DetaW", conv_DetaW