- · IF a linguistic subnation’s economy is large AND growing THEN ethnic political mobilisation will be present [14 of 19 positive cases]
- · IF a linguistic subnation’s economy is large, NOT growing AND is relatively wealthy THEN ethnic political mobilisation will be present [2 of 19 positive cases]
- · IF a linguistic subnation’s economy is NOT large AND is relatively wealthy AND speaks and writes in its own language THEN ethnic political mobilisation will be present [3 of 19 positive cases]
Both QCA and classification trees have procedures for simplifying the association rules that are found. With classification trees there is an automated “pruning” option that removes redundant parts. My impression is that there are no redundant parts in the above tree, but I may be wrong.
PS: I have just found and tested another package, called XLSTAT, that also generates classification trees. Here is a graphic showing the same result as found above, but in more detail. (Click on the image to enlarge it)
PS 4 May 2012: I have just discovered there is what looks like a very good open source data mining package called RapidMiner, which comes with a whole stack of training videos, and a big support and development community
PS 29 May 2012: Pertinent comment from Dilbert
PS 3 June 2012: Prediction versus explanation: I have recently found a number of web pages on the issue of prediction versus explanation. Data mining methods can deliver good predictions. However information relevant to good predictions does not always provide good explanations e.g. smoking may be predictive of teenage pregnancy but it is not a cause of it (see interesting exercise here). So is data mining a waste of time for evaluators? On reflection it occured to me that it depends on the circumstances and how the results of any analysis are to be used. In some circumstances the next steps may be to choose between existing alternatives. For example, which organisation or project to fund. Here good predictive knowledge, based on data about past performance, would be valuable. In other circumstances a new intervention may need to be designed from the ground up. Here access to some explanatory knowledge about possible causal relationships would be especially relevant.On further reflection, even where a new intervention has to be designed it is likely that it will involve choices of various modules (e.g. kinds of staff, kinds of activities) where knowledge of their past performance record is very relevant. But so would be a theory about their likely interactions.
At the risk of being too abstract,it would seem that a two way relationship is needed: proposed explanations need to be followed by tested predictions and successful predictions need to be followed by verified explanations.