faq [2018/03/08 18:19] (current)
yzan
‘Non-Fiction’ = 2, ‘Fiction’ = 3 and calculate the product of the primes (ACORA, 2005). But do we actually want to preserve all the information?
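The prime-product idea can be sketched as follows. This is a toy illustration, not the actual ACORA implementation; the category-to-prime mapping and the extra `Poetry` entry are invented for the example:

```python
from math import prod

# Hypothetical encoding: assign each categorical value a distinct prime
# and represent a set of values by the product of those primes.
# Because prime factorization is unique, the product can be decoded
# back into the original categories: no information is lost.
primes = {'Non-Fiction': 2, 'Fiction': 3, 'Poetry': 5}

def encode(values):
    """Encode a collection of categories as a product of primes."""
    return prod(primes[v] for v in values)

def decode(code):
    """Recover the categories from the product by trial division."""
    result = []
    for name, p in primes.items():
        while code % p == 0:
            result.append(name)
            code //= p
    return result

code = encode(['Non-Fiction', 'Fiction'])  # 2 * 3 = 6
# decode(code) recovers both categories from the single number
```

The point of the example is that such an encoding is lossless, which motivates the question above: do we actually want that?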
  
Nevertheless, whenever we perform classification, we generally want to lose information. For example, consider a single table with //n// binary attributes (//n// > 1) and one binary target. Assume also that all the attributes are independent of each other and that each attribute has the maximal possible entropy. Then to perform classification we have to lose information, because the input carries more information than we want to have at the output.
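The entropy argument can be checked numerically. A minimal sketch, where the choice of n = 8 attributes is an arbitrary illustration:

```python
from math import log2

def binary_entropy(p):
    """Entropy in bits of a binary variable with P(X = 1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

n = 8  # number of independent binary attributes (illustrative value)

# Independent attributes contribute additively, so the input carries n bits
# when each attribute has maximal entropy (p = 0.5).
input_bits = n * binary_entropy(0.5)

# The binary target carries at most one bit.
output_bits = binary_entropy(0.5)

# input_bits (8.0) > output_bits (1.0): a classifier must discard the difference.
```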
  
Are there scenarios in machine learning where we do not want to lose any information? Indeed, whenever we perform an unsupervised exploratory analysis, we (may) want to find all patterns in the data. Association rules and ILP algorithms are then good candidates for the task.
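As a minimal illustration of such exhaustive pattern finding, here is a frequent-itemset count, the first step of association-rule mining. The transactions and the support threshold are invented toy data:

```python
from itertools import combinations
from collections import Counter

# Toy transaction database (invented for the example).
transactions = [
    {'bread', 'milk'},
    {'bread', 'butter'},
    {'bread', 'milk', 'butter'},
    {'milk'},
]

min_support = 2  # an itemset must occur in at least 2 transactions

# Count every itemset of size 1 and 2 across all transactions.
counts = Counter()
for t in transactions:
    for size in (1, 2):
        for itemset in combinations(sorted(t), size):
            counts[itemset] += 1

# Keep every itemset that meets the support threshold: unlike a classifier,
# this enumeration deliberately preserves all the frequent patterns.
frequent = {s: c for s, c in counts.items() if c >= min_support}
```

Full-scale algorithms such as Apriori prune the search space, but the goal is the same: enumerate all patterns above a threshold rather than compress the data toward a single target.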
  
=== What is relational data mining? ===
