Weka Source Code
A. Looking for a Weka implementation of the k-means algorithm
If it doesn't seem hard, why not write it yourself?
If you're interested, take a look at the source of weka.clusterers.SimpleKMeans;
the API documentation is here:
http://weka.sourceforge.net/doc/weka/clusterers/SimpleKMeans.html
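If the goal is just to run k-means rather than reimplement it, SimpleKMeans is also easy to call through the API. Here is a minimal sketch; the file name glass.arff and the parameter values are only illustrative:

import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class KMeansDemo {
    public static void main(String[] args) throws Exception {
        // Load a dataset; DataSource handles ARFF and CSV files.
        Instances data = DataSource.read("glass.arff");
        SimpleKMeans km = new SimpleKMeans();
        km.setNumClusters(3);    // k, the number of clusters
        km.setSeed(42);          // fixes the random choice of initial centroids
        km.buildClusterer(data); // runs k-means
        System.out.println(km);  // prints the centroids and cluster sizes
    }
}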
B. Can someone translate this Weka output for me? Please don't use translation software; I want someone who really understands Weka. Good answers get extra points, I have 1000 wealth points.
=== Run information === (details of this run)
Scheme (the classifier and its options): weka.classifiers.functions.LibSVM, Weka's wrapper around the LibSVM support vector machine library, run with -S 0 -K 2 -D 3 -G 0.0 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.0010 -P 0.1 (-S 0 selects C-SVC, -K 2 the RBF kernel, -C 1.0 the cost; the rest are kernel and stopping parameters)
Relation (the dataset name): Glass (glass identification data)
Instances (the number of examples in the dataset): 214
Attributes (the features): 10
RI (refractive index)
Na (sodium)
Mg (magnesium)
Al (aluminium)
Si (silicon)
K (potassium)
Ca (calcium)
Ba (barium)
Fe (iron)
Type (the class label, i.e. the type of glass)
Test mode: 10-fold cross-validation (the data is split into 10 folds; each fold takes one turn as the test set while the other nine train the model)
=== Classifier model (full training set) === (the model built on the complete training set)
LibSVM wrapper, original code by Yasser EL-Manzalawy (= WLSVM, his wrapper library that lets Weka call the original LibSVM code)
Time taken to build model: 0.02 seconds
=== Stratified cross-validation === ("stratified" means each fold keeps roughly the same class proportions as the full dataset)
=== Summary === (overall results)
Correctly Classified Instances (examples classified correctly): 148, i.e. 69.1589 %
Incorrectly Classified Instances (examples classified wrongly): 66, i.e. 30.8411 %
Kappa statistic: 0.3579 (agreement between predictions and true classes beyond chance; 1 is perfect, 0 is no better than chance)
Mean absolute error: 0.0881
Root mean squared error: 0.2968
Relative absolute error: 60.7715 % (the absolute error relative to that of a trivial baseline predictor)
Root relative squared error: 111.5949 %
Total Number of Instances (total number of examples): 214
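These summary figures can be checked by hand: accuracy is simply correct over total, 148 / 214 ≈ 0.6916, which is the 69.1589 % reported, and 66 / 214 ≈ 0.3084 gives the error rate.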
=== Detailed Accuracy By Class === (accuracy broken down per class)
TP Rate (true positive rate), FP Rate (false positive rate), Precision, Recall (the fraction of the actual class members that were found), F-Measure (the harmonic mean of precision and recall), ROC Area (area under the ROC curve), Class
0.847 0.5 0.676 0.847 0.752 0.674 build wind float (building windows, float-processed glass)
0.5 0.153 0.727 0.5 0.593 0.674 build wind non-float (building windows, non-float glass)
0 0 0 0 0 ? vehic wind float (vehicle windows, float glass)
0 0 0 0 0 ? vehic wind non-float (vehicle windows, non-float glass)
0 0 0 0 0 ? containers
0 0 0 0 0 ? tableware
0 0 0 0 0 ? headlamps
Weighted Avg. (average weighted by class frequency) 0.692 0.344 0.699 0.692 0.68 0.674
=== Confusion Matrix === (each row is an actual class, each column a predicted class)
a b c d e f g <-- classified as
100 18 0 0 0 0 0 | a = build wind float (building windows, float glass)
48 48 0 0 0 0 0 | b = build wind non-float (building windows, non-float glass)
0 0 0 0 0 0 0 | c = vehic wind float (vehicle windows, float glass; "vehic" is short for vehicle)
0 0 0 0 0 0 0 | d = vehic wind non-float (vehicle windows, non-float glass)
0 0 0 0 0 0 0 | e = containers
0 0 0 0 0 0 0 | f = tableware
0 0 0 0 0 0 0 | g = headlamps
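The Kappa statistic in the summary can be recomputed from this matrix as a worked check. Observed agreement: po = (100 + 48) / 214 ≈ 0.6916. Agreement expected by chance, from the row totals (118, 96) and column totals (148, 66): pe = (118 × 148 + 96 × 66) / 214² = 23800 / 45796 ≈ 0.5197. Then Kappa = (po − pe) / (1 − pe) ≈ (0.6916 − 0.5197) / 0.4803 ≈ 0.3579, exactly the value reported above.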
C. Can Weka filter instances?
It definitely can. I don't know whether you're programming against Weka or only using the GUI for data mining.
If you're using the GUI, then as in the screenshot you pick a Filter to preprocess the raw data; that is instance filtering. You can see that filters come in supervised and unsupervised varieties; choose the one that fits your needs, then click the filter's text field to set its parameters and rules.
If you're developing with Weka, http://weka.sourceforge.net/doc.stable/ is the Weka API; look at the weka.filters package and work the concrete usage out from the API docs yourself.
If the GUI's filter list makes you dizzy and you can't tell which one to use, read the API too; the explanations are decent. Failing that, download the Weka source and read the comments, though they're all in English.
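For the development route, here is a minimal sketch of instance filtering through the API; RemovePercentage is just one unsupervised instance filter chosen for illustration, and glass.arff is a placeholder file name:

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.instance.RemovePercentage;

public class FilterDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("glass.arff");
        RemovePercentage rp = new RemovePercentage();
        rp.setPercentage(30.0);  // drop 30 % of the instances
        rp.setInputFormat(data); // must be called before useFilter
        Instances filtered = Filter.useFilter(data, rp);
        System.out.println(filtered.numInstances() + " instances remain");
    }
}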
D. While setting up Weka, importing the source code into an installed Eclipse keeps failing
First unzip weka-src. Inside it, find the folder named weka (there are several, so take care to pick the one that contains all the runtime files) and drag it straight into the src folder of the project you created. That's all there is to it.
E. Asking for the Java source of Weka's ID3 algorithm
/*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
/*
* Id3.java
* Copyright (C) 1999 University of Waikato, Hamilton, New Zealand
*
*/
package weka.classifiers.trees;
import weka.classifiers.Classifier;
import weka.classifiers.Sourcable;
import weka.core.Attribute;
import weka.core.Capabilities;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.NoSupportForMissingValuesException;
import weka.core.RevisionUtils;
import weka.core.TechnicalInformation;
import weka.core.TechnicalInformationHandler;
import weka.core.Utils;
import weka.core.Capabilities.Capability;
import weka.core.TechnicalInformation.Field;
import weka.core.TechnicalInformation.Type;
import java.util.Enumeration;
/**
<!-- globalinfo-start -->
* Class for constructing an unpruned decision tree based on the ID3 algorithm. Can only deal with nominal attributes. No missing values allowed. Empty leaves may result in unclassified instances. For more information see: <br/>
* <br/>
* R. Quinlan (1986). Induction of decision trees. Machine Learning. 1(1):81-106.
* <p/>
<!-- globalinfo-end -->
*
<!-- technical-bibtex-start -->
* BibTeX:
* <pre>
* @article{Quinlan1986,
* author = {R. Quinlan},
* journal = {Machine Learning},
* number = {1},
* pages = {81-106},
* title = {Induction of decision trees},
* volume = {1},
* year = {1986}
* }
* </pre>
* <p/>
<!-- technical-bibtex-end -->
*
<!-- options-start -->
* Valid options are: <p/>
*
* <pre> -D
* If set, classifier is run in debug mode and
* may output additional info to the console</pre>
*
<!-- options-end -->
*
* @author Eibe Frank ([email protected])
* @version $Revision: 6404 $
*/
public class Id3
extends Classifier
implements TechnicalInformationHandler, Sourcable {
/** for serialization */
static final long serialVersionUID = -2693678647096322561L;
/** The node's successors. */
private Id3[] m_Successors;
/** Attribute used for splitting. */
private Attribute m_Attribute;
/** Class value if node is leaf. */
private double m_ClassValue;
/** Class distribution if node is leaf. */
private double[] m_Distribution;
/** Class attribute of dataset. */
private Attribute m_ClassAttribute;
/**
* Returns a string describing the classifier.
* @return a description suitable for the GUI.
*/
public String globalInfo() {
return "Class for constructing an unpruned decision tree based on the ID3 "
+ "algorithm. Can only deal with nominal attributes. No missing values "
+ "allowed. Empty leaves may result in unclassified instances. For more "
+ "information see: "
+ getTechnicalInformation().toString();
}
/**
* Returns an instance of a TechnicalInformation object, containing
* detailed information about the technical background of this class,
* e.g., paper reference or book this class is based on.
*
* @return the technical information about this class
*/
public TechnicalInformation getTechnicalInformation() {
TechnicalInformation result;
result = new TechnicalInformation(Type.ARTICLE);
result.setValue(Field.AUTHOR, "R. Quinlan");
result.setValue(Field.YEAR, "1986");
result.setValue(Field.TITLE, "Induction of decision trees");
result.setValue(Field.JOURNAL, "Machine Learning");
result.setValue(Field.VOLUME, "1");
result.setValue(Field.NUMBER, "1");
result.setValue(Field.PAGES, "81-106");
return result;
}
/**
* Returns default capabilities of the classifier.
*
* @return the capabilities of this classifier
*/
public Capabilities getCapabilities() {
Capabilities result = super.getCapabilities();
result.disableAll();
// attributes
result.enable(Capability.NOMINAL_ATTRIBUTES);
// class
result.enable(Capability.NOMINAL_CLASS);
result.enable(Capability.MISSING_CLASS_VALUES);
// instances
result.setMinimumNumberInstances(0);
return result;
}
/**
* Builds Id3 decision tree classifier.
*
* @param data the training data
* @exception Exception if classifier can't be built successfully
*/
public void buildClassifier(Instances data) throws Exception {
// can classifier handle the data?
getCapabilities().testWithFail(data);
// remove instances with missing class
data = new Instances(data);
data.deleteWithMissingClass();
makeTree(data);
}
/**
* Method for building an Id3 tree.
*
* @param data the training data
* @exception Exception if decision tree can't be built successfully
*/
private void makeTree(Instances data) throws Exception {
// Check if no instances have reached this node.
if (data.numInstances() == 0) {
m_Attribute = null;
m_ClassValue = Instance.missingValue();
m_Distribution = new double[data.numClasses()];
return;
}
// Compute attribute with maximum information gain.
double[] infoGains = new double[data.numAttributes()];
Enumeration attEnum = data.enumerateAttributes();
while (attEnum.hasMoreElements()) {
Attribute att = (Attribute) attEnum.nextElement();
infoGains[att.index()] = computeInfoGain(data, att);
}
m_Attribute = data.attribute(Utils.maxIndex(infoGains));
// Make leaf if information gain is zero.
// Otherwise create successors.
if (Utils.eq(infoGains[m_Attribute.index()], 0)) {
m_Attribute = null;
m_Distribution = new double[data.numClasses()];
Enumeration instEnum = data.enumerateInstances();
while (instEnum.hasMoreElements()) {
Instance inst = (Instance) instEnum.nextElement();
m_Distribution[(int) inst.classValue()]++;
}
Utils.normalize(m_Distribution);
m_ClassValue = Utils.maxIndex(m_Distribution);
m_ClassAttribute = data.classAttribute();
} else {
Instances[] splitData = splitData(data, m_Attribute);
m_Successors = new Id3[m_Attribute.numValues()];
for (int j = 0; j < m_Attribute.numValues(); j++) {
m_Successors[j] = new Id3();
m_Successors[j].makeTree(splitData[j]);
}
}
}
/**
* Classifies a given test instance using the decision tree.
*
* @param instance the instance to be classified
* @return the classification
* @throws NoSupportForMissingValuesException if instance has missing values
*/
public double classifyInstance(Instance instance)
throws NoSupportForMissingValuesException {
if (instance.hasMissingValue()) {
throw new ("Id3: no missing values, "
+ "please.");
}
if (m_Attribute == null) {
return m_ClassValue;
} else {
return m_Successors[(int) instance.value(m_Attribute)].
classifyInstance(instance);
}
}
/**
* Computes class distribution for instance using decision tree.
*
* @param instance the instance for which distribution is to be computed
* @return the class distribution for the given instance
* @throws NoSupportForMissingValuesException if instance has missing values
*/
public double[] distributionForInstance(Instance instance)
throws NoSupportForMissingValuesException {
if (instance.hasMissingValue()) {
throw new ("Id3: no missing values, "
+ "please.");
}
if (m_Attribute == null) {
return m_Distribution;
} else {
return m_Successors[(int) instance.value(m_Attribute)].
distributionForInstance(instance);
}
}
/**
* Prints the decision tree using the private toString method from below.
*
* @return a textual description of the classifier
*/
public String toString() {
if ((m_Distribution == null) && (m_Successors == null)) {
return "Id3: No model built yet.";
}
return "Id3 " + toString(0);
}
/**
* Computes information gain for an attribute.
*
* @param data the data for which info gain is to be computed
* @param att the attribute
* @return the information gain for the given attribute and data
* @throws Exception if computation fails
*/
private double computeInfoGain(Instances data, Attribute att)
throws Exception {
double infoGain = computeEntropy(data);
Instances[] splitData = splitData(data, att);
for (int j = 0; j < att.numValues(); j++) {
if (splitData[j].numInstances() > 0) {
infoGain -= ((double) splitData[j].numInstances() /
(double) data.numInstances()) *
computeEntropy(splitData[j]);
}
}
return infoGain;
}
/**
* Computes the entropy of a dataset.
*
* @param data the data for which entropy is to be computed
* @return the entropy of the data's class distribution
* @throws Exception if computation fails
*/
private double computeEntropy(Instances data) throws Exception {
double [] classCounts = new double[data.numClasses()];
Enumeration instEnum = data.enumerateInstances();
while (instEnum.hasMoreElements()) {
Instance inst = (Instance) instEnum.nextElement();
classCounts[(int) inst.classValue()]++;
}
double entropy = 0;
for (int j = 0; j < data.numClasses(); j++) {
if (classCounts[j] > 0) {
entropy -= classCounts[j] * Utils.log2(classCounts[j]);
}
}
entropy /= (double) data.numInstances();
return entropy + Utils.log2(data.numInstances());
}
/**
* Splits a dataset according to the values of a nominal attribute.
*
* @param data the data which is to be split
* @param att the attribute to be used for splitting
* @return the sets of instances produced by the split
*/
private Instances[] splitData(Instances data, Attribute att) {
Instances[] splitData = new Instances[att.numValues()];
for (int j = 0; j < att.numValues(); j++) {
splitData[j] = new Instances(data, data.numInstances());
}
Enumeration instEnum = data.enumerateInstances();
while (instEnum.hasMoreElements()) {
Instance inst = (Instance) instEnum.nextElement();
splitData[(int) inst.value(att)].add(inst);
}
for (int i = 0; i < splitData.length; i++) {
splitData[i].compactify();
}
return splitData;
}
/**
* Outputs a tree at a certain level.
*
* @param level the level at which the tree is to be printed
* @return the tree as string at the given level
*/
private String toString(int level) {
StringBuffer text = new StringBuffer();
if (m_Attribute == null) {
if (Instance.isMissingValue(m_ClassValue)) {
text.append(": null");
} else {
text.append(": " + m_ClassAttribute.value((int) m_ClassValue));
}
} else {
for (int j = 0; j < m_Attribute.numValues(); j++) {
text.append(" ");
for (int i = 0; i < level; i++) {
text.append("| ");
}
text.append(m_Attribute.name() + " = " + m_Attribute.value(j));
text.append(m_Successors[j].toString(level + 1));
}
}
return text.toString();
}
/**
* Adds this tree recursively to the buffer.
*
* @param id the unique id for the method
* @param buffer the buffer to add the source code to
* @return the last ID being used
* @throws Exception if something goes wrong
*/
protected int toSource(int id, StringBuffer buffer) throws Exception {
int result;
int i;
int newID;
StringBuffer[] subBuffers;
buffer.append(" ");
buffer.append(" protected static double node" + id + "(Object[] i) { ");
// leaf?
if (m_Attribute == null) {
result = id;
if (Double.isNaN(m_ClassValue)) {
buffer.append(" return Double.NaN;");
} else {
buffer.append(" return " + m_ClassValue + ";");
}
if (m_ClassAttribute != null) {
buffer.append(" // " + m_ClassAttribute.value((int) m_ClassValue));
}
buffer.append(" ");
buffer.append(" } ");
} else {
buffer.append(" checkMissing(i, " + m_Attribute.index() + "); ");
buffer.append(" // " + m_Attribute.name() + " ");
// subtree calls
subBuffers = new StringBuffer[m_Attribute.numValues()];
newID = id;
for (i = 0; i < m_Attribute.numValues(); i++) {
newID++;
buffer.append(" ");
if (i > 0) {
buffer.append("else ");
}
buffer.append("if (((String) i[" + m_Attribute.index()
+ "]).equals("" + m_Attribute.value(i) + "")) ");
buffer.append(" return node" + newID + "(i); ");
subBuffers[i] = new StringBuffer();
newID = m_Successors[i].toSource(newID, subBuffers[i]);
}
buffer.append(" else ");
buffer.append(" throw new IllegalArgumentException("Value '" + i["
+ m_Attribute.index() + "] + "' is not allowed!"); ");
buffer.append(" } ");
// output subtree code
for (i = 0; i < m_Attribute.numValues(); i++) {
buffer.append(subBuffers[i].toString());
}
subBuffers = null;
result = newID;
}
return result;
}
/**
* Returns a string that describes the classifier as source. The
* classifier will be contained in a class with the given name (there may
* be auxiliary classes),
* and will contain a method with the signature:
* <pre><code>
* public static double classify(Object[] i);
* </code></pre>
* where the array <code>i</code> contains elements that are either
* Double, String, with missing values represented as null. The generated
* code is public domain and comes with no warranty. <br/>
* Note: works only if class attribute is the last attribute in the dataset.
*
* @param className the name that should be given to the source class.
* @return the object source described by a string
* @throws Exception if the source can't be computed
*/
public String toSource(String className) throws Exception {
StringBuffer result;
int id;
result = new StringBuffer();
result.append("class " + className + " { ");
result.append(" private static void checkMissing(Object[] i, int index) { ");
result.append(" if (i[index] == null) ");
result.append(" throw new IllegalArgumentException("Null values "
+ "are not allowed!"); ");
result.append(" } ");
result.append(" public static double classify(Object[] i) { ");
id = 0;
result.append(" return node" + id + "(i); ");
result.append(" } ");
toSource(id, result);
result.append("} ");
return result.toString();
}
/**
* Returns the revision string.
*
* @return the revision
*/
public String getRevision() {
return RevisionUtils.extract("$Revision: 6404 $");
}
/**
* Main method.
*
* @param args the options for the classifier
*/
public static void main(String[] args) {
runClassifier(new Id3(), args);
}
}
F. Does anyone have source code for data mining algorithms?
So you study algorithms too; I've only just started, let's be friends: 2674457337. You can download code from programmer sites; I've downloaded some MATLAB implementations there, so if that's what you're after I have those as well. Just the basics.
G. How do I compile a Java file that calls the Weka package?
It's simple: in Eclipse, just add weka.jar to the project's build path and it will compile.
If you compile by hand with javac, you have to include weka.jar via -classpath for compilation to succeed.
For example: javac -classpath ".;D:\weka.jar" xxx.java
javac usage: javac <options> <source files>
where the options include:
-classpath <path>   Specify where to find user class files and annotation processors
-cp <path>          Specify where to find user class files and annotation processors
If you want to do deeper development, unpack weka-src.jar and import it into an Eclipse project as source.
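To make that concrete, here is a minimal stand-in for the xxx.java above (the class WekaCheck and its contents are only an illustration; any class that imports from weka.* needs the jar on the classpath both to compile and to run):

import weka.classifiers.trees.J48;

public class WekaCheck {
    public static void main(String[] args) {
        // Instantiating a classifier from weka.jar proves the jar was found.
        J48 tree = new J48();
        System.out.println("Loaded " + tree.getClass().getName());
    }
}

Compile and run it with the same classpath (on Linux or macOS the separator is : rather than ;):
javac -classpath ".;D:\weka.jar" WekaCheck.java
java -classpath ".;D:\weka.jar" WekaCheck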
H. Does anyone have Java code for the Apriori data mining algorithm? It's urgent!
For a solid implementation, dig through the WEKA source, or http://www.helsinki.fi/~holler/datamining/algorithms.html has some too~
That said, working through someone else's code is a chore. Apriori is very basic, and Java has plenty of convenient collection classes; with a bit of effort you can write a usable version in a day~
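If you mainly need the mined rules rather than the exercise of writing Apriori yourself, Weka's own implementation can be called in a few lines. A sketch, assuming a dataset with nominal attributes such as the supermarket.arff that ships with Weka (the file name is illustrative):

import weka.associations.Apriori;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class AprioriDemo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("supermarket.arff");
        Apriori apriori = new Apriori();
        apriori.setNumRules(10);         // keep the 10 best rules
        apriori.buildAssociations(data); // mine frequent itemsets and rules
        System.out.println(apriori);     // prints rules with support and confidence
    }
}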
I. The Weka command window flashes and disappears on startup, and then nothing happens! What's going on? (I already installed the Java JRE.)
It's been a while, so I don't know whether you've solved this, but: go to the installation root directory and find weka.jar. It is an executable jar; open it with the Java runtime (choose Java as the "open with" program) and the application will start.
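If opening it that way is fiddly, starting it from a command window with java -jar weka.jar (run in the directory that contains weka.jar) does the same thing, with the advantage that the window stays open, so you can read whatever error message made the original window flash and vanish.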