(Due time and date: 2:00pm on Apr. 30 (postponed to May.
- All answer texts must be computer generated. You may draw the
diagrams manually, but they should be clear and neat.
- The hand-in version must be ordered correctly and stapled in the top left corner.
- The hand-in version must include a header page indicating: student name,
student number, user id, course number and assignment number.
1. Finding Association Rules on Transaction Databases (20 marks)
Take a
look at the following table, where T1, T2, T3, T4,
T5, and T6 are the transaction ID's, and A, B,
C, D, and E are the item ID's.
|Transaction ID|List of Item ID's|
|T1|A, B, E|
|T2|B, C, D|
|T3|B, D, E|
|T4|C, D, E|
|T5|B, C, D, E|
|T6|B, C, E|
Let min_support = 20% and min_conf = 60%. In this question, we
consider the Apriori algorithm and two of its variations. They are:
- General Apriori algorithm.
- Hash-based Apriori algorithm (Suppose order(A)=1,
order(B)=2, order(C)=3, order(D)=4, order(E)=5.
The hashing function used is hash(x,y) = (order(x) * 10 + order(y)) mod 7,
e.g. hash(A, B) = 5).
- Partitioning-based Apriori algorithm (Suppose the above transaction
database is divided into two partitions. Transactions T1, T2,
and T3 are in one partition while transactions T4, T5,
and T6 are in the other).
Use each of the above three algorithms (5 marks for each algorithm) to mine
all the rules which match the following meta-rule template:
buys(X, Y) => buys(X, "E") -- [s, c]
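As a way to check hand calculations, the quantities used in this question (support s, confidence c, and the hash bucket of a 2-itemset) can be sketched in a few lines of Python. This is only an illustrative helper, not a full Apriori implementation: the names `TRANSACTIONS`, `ORDER`, `support`, `confidence`, and `hash_bucket` are made up for this sketch, and the rows are assumed to be T1 to T6 in the order they are listed in the table.

```python
from fractions import Fraction

# Rows of the table in listed order (assumed to correspond to T1..T6).
TRANSACTIONS = [
    {"A", "B", "E"},
    {"B", "C", "D"},
    {"B", "D", "E"},
    {"C", "D", "E"},
    {"B", "C", "D", "E"},
    {"B", "C", "E"},
]

# Item ordering given in the question for the hash-based variant.
ORDER = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(antecedent, consequent, transactions=TRANSACTIONS):
    """support(antecedent and consequent) / support(antecedent).

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def hash_bucket(x, y):
    """hash(x, y) = (order(x) * 10 + order(y)) mod 7, as defined in the question."""
    return (ORDER[x] * 10 + ORDER[y]) % 7


print(hash_bucket("A", "B"))       # 5, matching the example in the question
print(support({"B", "E"}))         # 2/3
print(confidence({"B"}, {"E"}))    # 4/5
```

Exact fractions are used instead of floats so that comparisons against min_support = 20% and min_conf = 60% are not affected by rounding.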
2. Calculation Question (20 marks)
Suppose a data relation about a large
set of students in a university database has been generalized to a relation
R. You are required to derive a characteristic rule and a discriminant
rule from this relation.
You can print out the data sheet for
the concept hierarchies and the relation R which are used in this question.
Let the attribute thresholds (denoted as T(attribute)) be: T(major)
= 3, T(status) = 2, T(age) = 2, T(nationality) = 2, and
T(gpa) = 3.
- Derive a characteristic rule for R.
- Let the attribute thresholds be the same as above. Derive a discriminant
rule which contrasts applied_science vs. arts students.
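The mechanics behind both parts can be sketched in Python, assuming the usual attribute-oriented induction approach: an attribute is generalized up its concept hierarchy until it has no more distinct values than its threshold, a characteristic rule reports a t-weight per generalized tuple, and a discriminant rule reports a d-weight. Everything concrete below is hypothetical: the `STATUS_PARENT` hierarchy and the counts are invented for illustration and are not taken from the data sheet, and lifting all values one level at a time is a simplification.

```python
from fractions import Fraction

# HYPOTHETICAL concept hierarchy for "status" (child -> parent).
# The real hierarchies are on the course data sheet.
STATUS_PARENT = {
    "freshman": "undergraduate",
    "sophomore": "undergraduate",
    "junior": "undergraduate",
    "senior": "undergraduate",
    "M.Sc.": "graduate",
    "Ph.D.": "graduate",
}


def generalize(values, parent, threshold):
    """Lift values one hierarchy level at a time until at most
    `threshold` distinct values remain (or no value can be lifted)."""
    current = list(values)
    while len(set(current)) > threshold:
        lifted = [parent.get(v, v) for v in current]
        if lifted == current:  # top of the hierarchy reached
            break
        current = lifted
    return current


def t_weights(counts):
    """t-weight of each generalized tuple: its count divided by the
    total count of the class being characterized."""
    total = sum(counts.values())
    return {key: Fraction(c, total) for key, c in counts.items()}


def d_weight(target_count, contrast_count):
    """d-weight of a generalized tuple: its count in the target class
    divided by its count in the target and contrasting classes together."""
    return Fraction(target_count, target_count + contrast_count)


# Toy inputs, invented for illustration only.
print(generalize(["freshman", "senior", "M.Sc."], STATUS_PARENT, 2))
print(t_weights({"tuple_1": 30, "tuple_2": 70}))
print(d_weight(90, 210))
```

With T(status) = 2, the toy status values collapse to undergraduate and graduate; the same loop would be applied to each attribute with its own threshold before the weights are computed.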
3. Chapter 5, Exercise 2, 3 (40 marks)
4. Chapter 6, Exercise 7 (20 marks)