Open Source Bayesian Network Structure Learning API, Free


I introduce a new open source Bayesian network structure learning API called Free-BN (FBN). FBN is licensed under the Apache 2.0 license. In the following, I’ll scratch the surface of FBN and walk you through an example of using it.

Why another Bayesian network structure learning API?

While working on my dissertation, I had a tough time looking for open source APIs for constraint-based structural learning of Bayesian networks. I found only a few open source APIs written in Java that dealt with Bayesian networks at all.

This page here provides a long list of Bayesian network related software/APIs. One of the fruits of my dissertation work (though not reported or included in the dissertation itself) was the development of FBN, a Bayesian network structural learning API written in Java.

Some features of FBN

So, what can FBN currently do (related to Bayesian networks)? Here’s a non-exhaustive list.

  • Structural learning
    • constraint-based (PC, TPDA, PDFS)
    • search-and-scoring (K2)
    • mixed-type (SC*, CrUMB+-GA)
  • Exact inference (using PPTC algorithm)
  • Logic sampling

Working with FBN should be relatively easy. It’s meant to be an API (not an application). Currently, FBN can only learn from database sources, although you could extend the API to learn from flat files. FBN is designed around inversion of control (IoC), or dependency injection (DI), and uses the Spring Framework to achieve that design. Using DI and working primarily with interfaces means the API can easily be extended to include other structure learning algorithms.
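
For example, plugging in a new algorithm amounts to providing another implementation of the StructureLearner interface and injecting it wherever a learner is needed. The skeleton below is only a sketch: the learn(Variable[]) signature is inferred from how the demo later in this post calls the learner, so check the actual interface in the FBN source before relying on it.

import net.fbn.graph.intf.Graph;
import net.fbn.learner.struct.intf.StructureLearner;
import net.fdm.data.intf.Variable;

// Hypothetical skeleton of a custom structure learning algorithm. Collaborators
// (a VariableDao, correlation metrics, etc.) would be injected through setters,
// exactly as the TPDA wiring shown later in this post does.
public class MyCustomLearner implements StructureLearner {
    public Graph learn(Variable[] variables) {
        // 1. query the data for the given variables
        // 2. test or score candidate structures
        // 3. build and return the learned Graph
        throw new UnsupportedOperationException("sketch only");
    }
}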

Walkthrough preliminaries

Before I walk through how to use FBN, let me provide some background information. The dataset is generated using logic sampling from the Bayesian network reported by Cooper and Herskovits (1992). This Bayesian network has three variables: X1, X2, and X3. Its structure is a serial connection: X1 -> X2 -> X3. The reported local probability models are shown below.

P(X1=present) = 0.6    P(X1=absent) = 0.4
P(X2=present|X1=present) = 0.8    P(X2=absent|X1=present) = 0.2
P(X2=present|X1=absent) = 0.3    P(X2=absent|X1=absent) = 0.7
P(X3=present|X2=present) = 0.9    P(X3=absent|X2=present) = 0.1
P(X3=present|X2=absent) = 0.15    P(X3=absent|X2=absent) = 0.85
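
To make the data generation concrete, here is a minimal plain-Java sketch of logic (forward) sampling from this network. It is not FBN’s own DataGenerator (that lives in demo/com/vang/jee/fbn/demo/DataGenerator.java); it only illustrates how rows consistent with the tables above can be produced: sample X1 from its prior, then X2 given X1, then X3 given X2.

import java.util.Random;

// Plain-Java illustration of logic sampling from the network X1 -> X2 -> X3
// using the probability tables above. Each sample is printed as an insert
// statement like the ones used later in this walkthrough.
public class ThreeNodeSampler {
    public static void main(String[] args) {
        Random rnd = new Random();
        for (int i = 0; i < 1000; i++) {
            boolean x1 = rnd.nextDouble() < 0.6;               // P(X1=present) = 0.6
            boolean x2 = rnd.nextDouble() < (x1 ? 0.8 : 0.3);  // P(X2=present | X1)
            boolean x3 = rnd.nextDouble() < (x2 ? 0.9 : 0.15); // P(X3=present | X2)
            System.out.printf("insert into dtable(x1,x2,x3) values('%s','%s','%s');%n",
                x1 ? "present" : "absent",
                x2 ? "present" : "absent",
                x3 ? "present" : "absent");
        }
    }
}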

The algorithm used to learn the Bayesian network from the data will be Three-Phase Dependency Analysis (TPDA) (Cheng et al. 2002). TPDA is a constraint-based Bayesian network structure learning algorithm with three phases: drafting, thickening, and thinning. TPDA is implemented in FBN and will be used to learn the Bayesian network structure from the data generated using logic sampling.
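
Constraint-based learners like TPDA decide whether to add, keep, or remove an edge by testing (conditional) independence; in FBN this is done with mutual information (the MutualInformation and MiCondIndepTestImpl classes used in the demo further below). As a generic, FBN-independent illustration of the idea, the following sketch computes empirical mutual information between two discrete variables from a contingency table and compares it against a small threshold, which plays the same role as TPDA’s epsilon parameter. The numbers in main are made up.

// Generic illustration (not FBN code): empirical mutual information between two
// discrete variables, used to decide whether an edge is worth keeping.
// counts[i][j] = number of rows where X takes state i and Y takes state j.
public final class MiTest {
    public static double mutualInformation(int[][] counts) {
        double n = 0;
        for (int[] row : counts) for (int c : row) n += c;
        double[] px = new double[counts.length];
        double[] py = new double[counts[0].length];
        for (int i = 0; i < counts.length; i++)
            for (int j = 0; j < counts[0].length; j++) {
                px[i] += counts[i][j] / n;
                py[j] += counts[i][j] / n;
            }
        double mi = 0;
        for (int i = 0; i < counts.length; i++)
            for (int j = 0; j < counts[0].length; j++) {
                double pxy = counts[i][j] / n;
                if (pxy > 0) mi += pxy * Math.log(pxy / (px[i] * py[j]));
            }
        return mi;
    }

    public static void main(String[] args) {
        int[][] counts = { {480, 120}, {120, 280} }; // made-up x1/x2 contingency table
        double epsilon = 0.01;                       // dependence threshold
        double mi = mutualInformation(counts);
        System.out.println("MI = " + mi + ", dependent? " + (mi > epsilon));
    }
}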

Set up your data source

FBN takes as input data stored in a database accessible through JDBC drivers. Some examples of such databases are Oracle, MS SQL Server, and MySQL. In this walkthrough, I’ll show examples using MySQL.
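
Connecting FBN to the database is just a matter of supplying a javax.sql.DataSource. Here is a minimal sketch using Commons DBCP’s BasicDataSource, the same class the full demo below uses; the URL, user, and password mirror the demo’s values and should be adjusted to your environment.

import javax.sql.DataSource;

import org.apache.commons.dbcp.BasicDataSource;

// Minimal sketch: a pooled JDBC DataSource for a local MySQL database named "bn".
// Adjust driver class, URL, user, and password to match your setup.
public class DataSourceFactory {
    public static DataSource create() {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("com.mysql.jdbc.Driver");
        ds.setUrl("jdbc:mysql://localhost/bn");
        ds.setUsername("jee");
        ds.setPassword("jee");
        return ds;
    }
}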

The data must be stored in two separate tables: one table to specify the variables (denote this as vtable), and one table to hold the actual data (denote this as dtable). The vtable should have the following fields: name, type, and domain. An example of a DDL for a vtable using MySQL is:

create table vtable (
 name varchar(10),
 domain varchar(20),
 type varchar(10)
);

Since we have three binary variables (x1, x2, and x3), we have to insert values into the vtable to describe these variables.

insert into vtable(name, domain, type) values('x1','absent,present', '1');
insert into vtable(name, domain, type) values('x2','absent,present', '1');
insert into vtable(name, domain, type) values('x3','absent,present', '1');

The type is set to 1 for categorical variables. For the full list of supported types, see net.fdm.data.intf.Variable.

Now, we have to create a table to hold the data. The following is a sample MySQL DDL to create such a table.

create table dtable (
 x1 varchar(10),
 x2 varchar(10),
 x3 varchar(10)
);

Now that we have created the dtable, insert data into it.

insert into dtable(x1,x2,x3) values('present','present','present');
insert into dtable(x1,x2,x3) values('present','present','present');
insert into dtable(x1,x2,x3) values('present','present','present');
...
insert into dtable(x1,x2,x3) values('present','absent','absent');
insert into dtable(x1,x2,x3) values('absent','absent','absent');
insert into dtable(x1,x2,x3) values('absent','absent','absent');

If you download the source code for FBN, the MySQL scripts are located in demo/mysql.sql. The source code to create the Bayesian network and perform logic sampling is located in demo/com/vang/jee/fbn/demo/DataGenerator.java.

Set up your structure learning algorithm

Now it’s time to set up our structure learning algorithm of choice. We can do so in code (using Java) or, the better alternative, “wire up” the algorithm using Spring XML files. The following code shows how to wire up the TPDA structure learning algorithm using Java.

/**
 * Copyright 2009 Jee Vang
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package com.vang.jee.fbn.demo;
import java.util.Iterator;
import javax.sql.DataSource;
import net.fbn.data.condcorr.impl.MiCondCorr;
import net.fbn.data.condcorr.impl.MiCondIndepTestImpl;
import net.fbn.data.corr.impl.MutualInformation;
import net.fbn.graph.algo.impl.MWSTImpl;
import net.fbn.graph.factory.impl.UnGraphFactoryImpl;
import net.fbn.graph.intf.Graph;
import net.fbn.learner.struct.cb.tpda.impl.DSeparateA;
import net.fbn.learner.struct.cb.tpda.impl.DSeparateB;
import net.fbn.learner.struct.cb.tpda.impl.SimpleOrientArcsImpl;
import net.fbn.learner.struct.cb.tpda.impl.StDraftImpl;
import net.fbn.learner.struct.cb.tpda.impl.StTPDALearnerImpl;
import net.fbn.learner.struct.cb.tpda.impl.StThickenImpl;
import net.fbn.learner.struct.cb.tpda.impl.StThinImpl;
import net.fbn.learner.struct.intf.StructureLearner;
import net.fdm.data.dao.impl.VariableDaoImpl;
import net.fdm.data.dao.intf.VariableDao;
import net.fdm.data.intf.Variable;
import org.apache.commons.dbcp.BasicDataSource;
/**
 * Demo for structure learning using TPDA.
 * @author Jee Vang
 *
 */
public class TestLearning {
    private DataSource _dataSource;
    private VariableDao _variableDao;
    private StructureLearner _structureLearner;
     
    /**
     * Gets a structure learner.
     * @return StructureLearner.
     */
    public StructureLearner getStructureLearner() {
        if(null == _structureLearner) {
            //set the algorithm to perform TPDA drafting phase
            StDraftImpl draft = new StDraftImpl();
            draft.setMwstAlgo(new MWSTImpl());
            draft.setUnGraphFactory(new UnGraphFactoryImpl());
             
            //these are some classes used to help TPDA proceed
            double delta = 0.01d;
            double theta = 0.001d;
            double epsilon = 0.001d;
             
            MutualInformation mi = new MutualInformation();
            mi.setVariableDao(getVariableDao());
             
            MiCondCorr miCondCorr = new MiCondCorr();
            miCondCorr.setVariableDao(getVariableDao());
            miCondCorr.setCorrMetric(mi);
             
            MiCondIndepTestImpl condIndepTest = new MiCondIndepTestImpl();
            condIndepTest.setVariableDao(getVariableDao());
            condIndepTest.setCondCorrMetric(miCondCorr);
            condIndepTest.setDelta(delta);
             
            DSeparateA dSeparateA = new DSeparateA();
            dSeparateA.setVariableDao(getVariableDao());
            dSeparateA.setCondIndepTest(condIndepTest);
             
            DSeparateB dSeparateB = new DSeparateB();
            dSeparateB.setVariableDao(getVariableDao());
            dSeparateB.setEpsilon(epsilon);
            dSeparateB.setCondIndepTest(condIndepTest);
             
            SimpleOrientArcsImpl orientArcs = new SimpleOrientArcsImpl();
            orientArcs.setCondIndepTest(condIndepTest);
            orientArcs.setEpsilon(epsilon);
             
            //set the algorithm to perform the TPDA thickening phase
            StThickenImpl thicken = new StThickenImpl();
            thicken.setDSeparate(dSeparateA);
             
            //set the algorithm to perform the TPDA thinning phase
            StThinImpl thin = new StThinImpl();
            thin.setDSeparateA(dSeparateA);
            thin.setDSeparateB(dSeparateB);
             
            //now wire up tpda
            StTPDALearnerImpl tpda = new StTPDALearnerImpl();
            tpda.setDelta(delta);
            tpda.setTheta(theta);
            tpda.setCorrMetric(mi);
            tpda.setRemoveInsignificantCorrelations(true);
            tpda.setDraft(draft);
            tpda.setThin(thin);
            tpda.setThicken(thicken);
            tpda.setOrientArcs(orientArcs);
             
            _structureLearner = tpda;
        }
        return _structureLearner;
    }
     
    /**
     * Gets variable data access object.
     * @return VariableDao.
     */
    public VariableDao getVariableDao() {
        if(null == _variableDao) {
            VariableDaoImpl variableDao = new VariableDaoImpl();
            variableDao.setDataSource(getDataSource());
            variableDao.setDataTable("dtable");
            variableDao.setDomainColumnName("domain");
            variableDao.setDomainDelimiter(",");
            variableDao.setTypeColumnName("type");
            variableDao.setVarTable("vtable");
             
            _variableDao = variableDao;
        }
         
        return _variableDao;
    }
     
    /**
     * Gets a data source.
     * @return DataSource.
     */
    public DataSource getDataSource() {
        if(null == _dataSource) {
            String driverClassName = "com.mysql.jdbc.Driver";
            String url = "jdbc:mysql://localhost/bn?user=jee&password=jee";
             
            BasicDataSource dataSource = new BasicDataSource();
            dataSource.setDriverClassName(driverClassName);
            dataSource.setUrl(url);
             
            _dataSource = dataSource;
        }
         
        return _dataSource;
    }
     
    /**
     * Gets an array of variables.
     * @return Array of Variable.
     * @throws Exception
     */
    public Variable[] getVariables() throws Exception {
        VariableDao variableDao = getVariableDao();
        Variable[] variables = variableDao.getVariables();
        return variables;
    }
    /**
     * Main method.
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        TestLearning testLearning = new TestLearning();
        Variable[] variables = testLearning.getVariables();
        StructureLearner learner = testLearning.getStructureLearner();
        Graph graph = learner.learn(variables);
        System.out.println("NODES");
        for(Iterator it = graph.getNodes().iterator(); it.hasNext(); ) {
            System.out.println(it.next());
        }
         
        System.out.println("ARCS");
        for(Iterator it = graph.getArcs().iterator(); it.hasNext(); ) {
            System.out.println(it.next());
        }
    }
}

The getDataSource method returns a DataSource pointing to your database (in this case a MySQL instance). The getVariableDao method provides a reference to the VariableDao object that has access to the variables and data. The getStructureLearner method wires up the TPDA implementation. In the main method, you get a reference to all the variables over which you want to perform Bayesian network structure learning, and an instance of the structure learner. You then pass the array of variables into the learner to produce a Graph.

The nodes in the graph should be x1, x2, and x3, and the arcs should be x1–x2 and x2–x3, giving the structure x1–x2–x3. Note that these arcs are undirected: a serial connection such as x1 -> x2 -> x3 contains no v-structure, so the arc directions cannot be determined from conditional independence tests alone, and only the undirected skeleton is recovered here. The result therefore does not by itself satisfy the directed acyclic graph (DAG) requirement of a Bayesian network. The source code for this learning example is located in the source distribution under demo/src/com/vang/jee/fbn/demo/TestLearning.java.
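
As mentioned above, the cleaner alternative to wiring everything up in Java is to declare the same beans (data source, VariableDao, TPDA learner) in a Spring XML file and load them from an ApplicationContext. Below is a rough sketch of what that client code could look like, assuming a hypothetical fbn-context.xml that defines beans with ids variableDao and structureLearner; the actual file and bean names in the FBN distribution may differ.

import net.fbn.graph.intf.Graph;
import net.fbn.learner.struct.intf.StructureLearner;
import net.fdm.data.dao.intf.VariableDao;
import net.fdm.data.intf.Variable;

import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class TestLearningFromXml {
    public static void main(String[] args) throws Exception {
        // Load the data source, DAO, and TPDA learner declared in a Spring XML file.
        // The file name and bean ids are hypothetical; check the FBN distribution.
        ApplicationContext ctx = new ClassPathXmlApplicationContext("fbn-context.xml");
        VariableDao variableDao = (VariableDao) ctx.getBean("variableDao");
        StructureLearner learner = (StructureLearner) ctx.getBean("structureLearner");

        Variable[] variables = variableDao.getVariables();
        Graph graph = learner.learn(variables);
        System.out.println("NODES: " + graph.getNodes());
        System.out.println("ARCS:  " + graph.getArcs());
    }
}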

How to get the source and dependencies?

The FBN API depends on two other minor projects called Free-Display and Free-GA (FGA). The Free-Display API is used to visualize Bayesian networks, while the FGA API is used for search-and-scoring methods of Bayesian network structure learning. You may download all of these APIs below; they are all licensed under the Apache 2.0 license.

  • Free-BN source
  • Free-BN binary
  • Free-Disp source
  • Free-Disp binary
  • Free-GA source
  • Free-GA binary

I hope this API helps you in your research. Happy research, data mining, and programming! Cheers! Sib ntsib dua mog! (See you again!)

References

  • G. F. Cooper and E. Herskovits. “A Bayesian method for the induction of probabilistic networks from data,” Machine Learning, vol. 9, 1992, pp. 309–347.
  • J. Cheng, R. Greiner, J. Kelly, D. A. Bell, and W. Liu. “Learning Bayesian Networks from Data: an Information-Theory Based Approach,” Artificial Intelligence, vol. 137, 2002, pp. 43–90.
