DEV Community

ANIRUDDHA ADAK


MAKAUT B.Tech Data Warehousing and Data Mining MCQs

Multiple Choice Questions (Ordered by Frequency)


1. Question: A data warehouse is said to contain a 'time-varying' collection of data because

  • Options:
    • a) Its contents vary automatically with time
    • b) Its life-span is very limited
    • c) Every key structure of the data warehouse contains, either implicitly or explicitly, an element of time
    • d) Its contents have an explicit time-stamp
  • WBUT Years: 2010, 2013, 2014, 2015, 2016, 2017
  • Answer: (c)
  • Explanation: > "Time-varying implies that data in the warehouse includes historical context, often through explicit or implicit time elements in its structure."

2. Question: Which of the following techniques are appropriate for data warehousing?

  • Options:
    • a) Hashing on primary keys
    • b) Indexing on foreign keys of the fact table
    • c) Bit-map indexing
    • d) Join indexing
  • WBUT Years: 2009, 2013, 2018
  • Answer: (c)
  • Explanation: > "Bit-map indexing is efficient for queries on low-cardinality columns, which are common in dimensional data warehouses."
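
To make the bit-map idea concrete, here is a minimal sketch in plain Python. It assumes a tiny, made-up fact table with two low-cardinality columns; the names `build_bitmap_index`, `rows_of`, `region`, and `channel` are illustrative only and are not taken from any particular warehouse product.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Map each distinct value to an integer whose bit i is set when row i holds that value."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)

def rows_of(bitmap):
    """Decode a bitmap back into the list of matching row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical low-cardinality columns of a fact table (illustrative data only).
region = ["East", "West", "East", "North", "West", "East"]
channel = ["Web", "Web", "Store", "Web", "Store", "Web"]

region_idx = build_bitmap_index(region)
channel_idx = build_bitmap_index(channel)

# The predicate region = 'East' AND channel = 'Web' becomes one bitwise AND.
east_web = rows_of(region_idx["East"] & channel_idx["Web"])
print(east_web)
```

Because each distinct value needs only one bit per row, the index stays small when a column has few distinct values, and combined predicates reduce to cheap bitwise operations.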

3. Question: A drill-down operation is concerned with

  • Options:
    • a) merging cells of two dimensions
    • b) merging cells of any one dimension based on the characteristics of the dimension
    • c) splitting cells of two dimensions
    • d) splitting cells of any one dimension based on the characteristics of the dimension
  • WBUT Years: 2009, 2016, 2018
  • Answer: (d)
  • Explanation: > "Drill-down allows users to navigate from summary data to more detailed data by adding new dimensions or stepping down a hierarchy."
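
As a rough illustration of stepping down a hierarchy, the sketch below aggregates a small, invented set of sales facts at two levels of a time hierarchy (year, then quarter). The data and the helper `aggregate` are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, month, amount).
sales = [
    (2023, "Q1", "Jan", 100),
    (2023, "Q1", "Feb", 150),
    (2023, "Q2", "Apr", 200),
    (2024, "Q1", "Jan", 120),
]

def aggregate(facts, depth):
    """Sum amounts grouped by the first `depth` levels of the time hierarchy."""
    totals = defaultdict(int)
    for *levels, amount in facts:
        totals[tuple(levels[:depth])] += amount
    return dict(totals)

by_year = aggregate(sales, 1)     # summary view: one cell per year
by_quarter = aggregate(sales, 2)  # drill-down: each year cell splits into quarters
print(by_year)
print(by_quarter)
```

Going from `by_year` to `by_quarter` splits the cells of one dimension (time) along its own hierarchy, which is what option (d) describes; going the other way is a roll-up.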

4. Question: K-means is based on

  • Options:
    • a) Euclidean distance
    • b) Hamming distance
    • c) RMS
    • d) None of these
  • WBUT Years: 2011, 2014, 2015
  • Answer: (a)
  • Explanation: > "K-means clustering uses Euclidean distance to measure the similarity between data points and cluster centroids."
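
The role of Euclidean distance can be seen in a single k-means iteration. The sketch below is a simplified illustration, not a full implementation: the points, the starting centroids, and the helper `kmeans_step` are assumptions chosen for the example.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans_step(points, centroids):
    """One k-means iteration: assign each point to its nearest centroid, then recompute means."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
        clusters[nearest].append(p)
    new_centroids = [
        # Keep the old centroid if a cluster ends up empty.
        tuple(sum(coord) / len(members) for coord in zip(*members)) if members else old
        for members, old in zip(clusters, centroids)
    ]
    return clusters, new_centroids

points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
clusters, centroids = kmeans_step(points, [(0.0, 0.0), (10.0, 10.0)])
print(clusters)
print(centroids)
```

In practice the step is repeated until the assignments stop changing.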

5. Question: Data Warehousing is used for

  • Options:
    • a) Decision Support System
    • b) OLTP applications
    • c) Database applications
    • d) Data Manipulation applications
  • WBUT Years: 2010, 2012, 2016
  • Answer: (a)
  • Explanation: > "Data Warehousing is primarily designed to support decision-making and analytical processing."

6. Question: A data warehouse is said to contain an 'integrated' collection of data because

  • Options:
    • a) It is a collection of data of different types
    • b) It is a collection of data derived from multiple sources
    • c) It is a relational database
    • d) It contains summarized data
  • WBUT Years: 2009, 2015
  • Answer: (b)
  • Explanation: > "Data warehouses integrate data from various, often disparate, operational sources into a unified system."

7. Question: A data warehouse is said to contain a 'subject oriented' collection of data because

  • Options:
    • a) Its contents have a common theme
    • b) It is built for a specific application
    • c) It cannot support multiple subjects
    • d) It is a generalization of 'object-oriented'
  • WBUT Years: 2009, 2013
  • Answer: (a)
  • Explanation: > "Subject-oriented means data is organized around core business subjects rather than specific applications."

8. Question: Which of the following is TRUE?

  • Options:
    • a) Data warehouse can be used for analytical processing only
    • b) Data warehouse can be used for information processing (query, report) and analytical processing
    • c) Data warehouse can be used for data mining only
    • d) Data warehouse can be used for information processing (query, report), analytical processing and data mining
  • WBUT Years: 2010, 2012
  • Answer: (d)
  • Explanation: > "A data warehouse supports a wide range of analytical activities including querying, reporting, OLAP, and data mining."

9. Question: A data warehouse is built as a separate repository of data, different from the operational data of an enterprise because

  • Options:
    • a) It is necessary to keep the operational data free of any warehouse operations
    • b) A data warehouse cannot afford to allow corrupted data within it
    • c) A data warehouse contains summarized data whereas the operational database contains transactional data
    • d) None of these
  • WBUT Years: 2012, 2013
  • Answer: (c)
  • Explanation: > "Data warehouses store summarized and historical data for analysis, unlike operational databases which focus on transactional processing."

10. Question: Dimension data within a warehouse exhibits which one of the following properties?

  • Options:
    • a) Dimension data consists of the minor part of the warehouse
    • b) The aggregated information is actually dimension data
    • c) It contains historical data
    • d) Dimension data is the information that is used to analyze the elemental transaction
  • WBUT Years: 2012, 2015
  • Answer: (d)
  • Explanation: > "Dimension data supplies the descriptive context and hierarchies used to analyze the elemental transactions held as facts; aggregated information is derived from fact data, not dimension data."

11. Question: The important aspect of the data warehouse environment is that data found within the data warehouse is

  • Options:
    • a) subject-oriented
    • b) time-variant
    • c) integrated
    • d) all of these
  • WBUT Years: 2016, 2018
  • Answer: (d)
  • Explanation: > "Data in a data warehouse is subject-oriented, integrated, and time-variant (and also non-volatile), so all of the listed properties apply."

12. Question: ...... is an example of predictive type of data mining whereas ...... is an example of descriptive type of data mining.

  • Options:
    • a) Association Rule, Clustering
    • b) Association Rule, Classification
    • c) Classification, Clustering
    • d) Clustering, Classification
  • WBUT Years: 2010, 2012
  • Answer: (c)
  • Explanation: > "Classification predicts a target variable, making it predictive, while clustering discovers patterns without a target, making it descriptive."

13. Question: The 'Dice' operation is concerned with

  • Options:
    • a) Multiple runs of slice
    • b) slice on more than one dimension
    • c) selecting certain cells of more than one dimension
    • d) two consecutive slice operations in two different dimensions
  • WBUT Years: 2009, 2014
  • Answer: (d)
  • Explanation: > "The dice operation filters data on multiple dimensions, effectively creating a subcube."
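
To show how a dice relates to slices, here is a small sketch over an invented three-dimensional cube stored as a dictionary. The cube contents, the `DIMS` tuple, and the helper `dice` are all assumptions for illustration.

```python
# Hypothetical cube cells keyed by (product, region, year) -> units sold.
cube = {
    ("Pen", "East", 2023): 10,
    ("Pen", "West", 2023): 7,
    ("Ink", "East", 2023): 4,
    ("Pen", "East", 2024): 12,
    ("Ink", "West", 2024): 9,
}
DIMS = ("product", "region", "year")

def dice(cells, **allowed):
    """Keep cells whose coordinate on each named dimension is in the allowed set."""
    def keep(key):
        return all(key[DIMS.index(dim)] in values for dim, values in allowed.items())
    return {k: v for k, v in cells.items() if keep(k)}

slice_2023 = dice(cube, year={2023})                     # slice: selection on one dimension
sub_cube = dice(cube, product={"Pen"}, region={"East"})  # dice: selection on two dimensions
two_slices = dice(dice(cube, product={"Pen"}), region={"East"})
print(sub_cube == two_slices)
```

Applying two slices one after another on different dimensions yields the same subcube as the single dice call in this example.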

14. Question: The major drawback of the CLARANS algorithm is

  • Options:
    • a) it cannot handle very large volumes of data
    • b) it assumes that all objects fit into the main memory, and the result is very sensitive to input order
    • c) it cannot find the best clustering if any sampled medoid is not among the best k medoids
    • d) None of these
  • WBUT Years: 2009, 2011
  • Answer: (b)
  • Explanation: > "A limitation of CLARANS is its sensitivity to the data input order and its memory-intensive nature for very large datasets."

15. Question: Parameters used for association Rule Mining are

  • Options:
    • a) Confidence and Support
    • b) Confidence and Itemcount
    • c) Support and Itemcount
    • d) Support, Confidence and Itemcount
  • WBUT Years: 2010, 2018
  • Answer: (a)
  • Explanation: > "Support and Confidence are the primary metrics used to evaluate the strength and interestingness of association rules."
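
Both measures are simple ratios. The sketch below computes them for the rule bread → milk over a tiny, made-up transaction list; the data and function names are assumptions for illustration.

```python
# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, txns):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in txns) / len(txns)

def confidence(antecedent, consequent, txns):
    """support(antecedent and consequent together) divided by support(antecedent)."""
    return support(antecedent | consequent, txns) / support(antecedent, txns)

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Here bread and milk appear together in 2 of 4 transactions (support 0.5), and 2 of the 3 transactions containing bread also contain milk (confidence about 0.67).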

16. Question: Two main types of clustering techniques in data mining are

  • Options:
    • a) Serial clustering and parallel clustering
    • b) Hierarchical clustering and partitioning clustering
    • c) Homogeneous clustering and heterogeneous clustering
    • d) k-medoids clustering and K-means clustering
  • WBUT Years: 2010, 2018
  • Answer: (b)
  • Explanation: > "Clustering methods are broadly categorized into hierarchical (building a tree of clusters) and partitioning (dividing into non-overlapping groups) approaches."

17. Question: Which one is not a data mining task?

  • Options:
    • a) indexing
    • b) classification
    • c) clustering
    • d) regression
  • WBUT Years: 2014, 2015
  • Answer: (a)
  • Explanation: > "Indexing is a database optimization technique, not a fundamental data mining task like classification, clustering, or regression."

18. Question: An example of hierarchical clustering algorithm is

  • Options:
    • a) clarans
    • b) C4.5
    • c) average linkage
    • d) rock
  • WBUT Years: 2014, 2018
  • Answer: (d)
  • Explanation: > "ROCK (Robust Clustering using links) is a hierarchical clustering algorithm for categorical data."

19. Question: The mining activity which mines web log records to discover user access patterns of web pages is

  • Options:
    • a) web content mining
    • b) web usage mining
    • c) web structure mining
    • d) web search mining
  • WBUT Years: 2011, 2014
  • Answer: (b)
  • Explanation: > "Web usage mining analyzes user behavior patterns from web server logs and clickstreams."

20. Question: Data warehouse architecture is just an overall guideline. It is not a blueprint for the data warehouse

  • Options:
    • a) True
    • b) False
  • WBUT Years: 2011
  • Answer: (b)
  • Explanation: > "A data warehouse architecture provides a structured blueprint and a clear framework for its design and implementation."

21. Question: The most distinguishing characteristic of DSS data is

  • Options:
    • a) Granularity
    • b) Timespan
    • c) Dimensionality
    • d) Data currency
  • WBUT Years: 2011
  • Answer: (c)
  • Explanation: > "Dimensionality is crucial for DSS data, enabling multi-perspective analysis of business performance."

22. Question: ......... is a subject-oriented, integrated, time-variant, non-volatile collection of data

  • Options:
    • a) Data Mining
    • b) Data Warehousing
    • c) Document Mining
    • d) Text Mining
  • WBUT Years: 2017
  • Answer: (b)
  • Explanation: > "This is the standard definition of a data warehouse, highlighting its four key characteristics."

23. Question: What is Metadata?

  • Options:
    • a) Summarized data
    • b) Operational data
    • c) Data about data
    • d) None of these
  • WBUT Years: 2017
  • Answer: (c)
  • Explanation: > "Metadata provides descriptive information about other data, defining its structure, meaning, and context."

24. Question: The full form of OLAP is

  • Options:
    • a) Online Analytical Processing
    • b) Online Advanced Processing
    • c) Online Advanced preparation
    • d) Online Analytical Performance
  • WBUT Years: 2017
  • Answer: (a)
  • Explanation: > "OLAP stands for Online Analytical Processing, which enables fast, interactive analysis of multidimensional data."

25. Question: The apriori algorithm is a

  • Options:
    • a) top - down search
    • b) breadth first search
    • c) depth first search
    • d) bottom-up search
  • WBUT Years: 2017
  • Answer: (d)
  • Explanation: > "The Apriori algorithm uses a bottom-up approach, building frequent itemsets from smaller ones."
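
The bottom-up, level-wise idea can be sketched in a few lines. This is a simplified illustration rather than an optimized implementation: it assumes a minimum support given as an absolute count, treats an itemset as frequent when its count is at least that threshold, and uses invented transactions.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset whose count is >= min_support."""
    def count(candidates):
        return {c: sum(c <= t for t in transactions) for c in candidates}

    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {c: n for c, n in count(items).items() if n >= min_support}
    result, k = dict(frequent), 2
    while frequent:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        frequent = {c: n for c, n in count(candidates).items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result

transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
frequent_sets = apriori(transactions, min_support=2)
print(sorted(sorted(s) for s in frequent_sets))
```

The search starts from single items and only builds a larger candidate when all of its smaller subsets are already frequent, which is why {bread, milk, butter} is pruned without being counted.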

26. Question: Classification rules are extracted from

  • Options:
    • a) Root node
    • b) Decision tree
    • c) Siblings
    • d) Branches
  • WBUT Years: 2017
  • Answer: (b)
  • Explanation: > "Decision trees provide clear, interpretable rules for classification by mapping decision paths."

27. Question: Which of the following is a predictive model?

  • Options:
    • a) Clustering
    • b) Regression
    • c) Summarization
    • d) Association rules
  • WBUT Years: 2017
  • Answer: (b)
  • Explanation: > "Regression is a predictive modeling technique used to forecast continuous numerical values."

28. Question: All sets of items whose support is greater than the user-specified minimum support are called

  • Options:
    • a) Border set
    • b) Frequent set
    • c) Maximal frequent set
    • d) Lattice
  • WBUT Years: 2017
  • Answer: (b)
  • Explanation: > "A frequent set (or frequent itemset) is a collection of items that appears together in transactions above a specified support threshold."

29. Question: The algorithm which uses the concept of a train running over data to find associations of items in data mining is known as

  • Options:
    • a) Apriori Algorithm
    • b) Partition Algorithm
    • c) Dynamic Item-set Counting Algorithm
    • d) FP-Tree growth Algorithm
  • WBUT Years: 2011
  • Answer: (c)
  • Explanation: > "Dynamic Item-set Counting uses a 'train' metaphor to incrementally count itemsets as data transactions are processed."

30. Question: If we know exactly what information we need then ...... would suffice, but if we vaguely know the possible patterns then ...... are useful.

  • Options:
    • a) Data Warehouse, Data Mining techniques
    • b) DBMS Query, Data Mining techniques
    • c) DBMS Query, Data Warehouse applications
    • d) Data Warehouse applications, Data Mining techniques
  • WBUT Years: 2012
  • Answer: (b)
  • Explanation: > "DBMS queries are for retrieving specific, known information, while data mining techniques are used to discover hidden or vague patterns."

31. Question: Association analysis is used for

  • Options:
    • a) transaction data analysis
    • b) olap
    • c) molap
    • d) none of these
  • WBUT Years: 2014
  • Answer: (a)
  • Explanation: > "Association analysis is predominantly applied to transactional databases to find relationships between items."

32. Question: Which frequent pattern mining technique mines without candidate generation?

  • Options:
    • a) Partitioning
    • b) Apriori
    • c) FP-growth
    • d) Dynamic itemset counting
  • WBUT Years: 2018
  • Answer: (c)
  • Explanation: > "FP-Growth avoids the costly candidate generation step by using a compact FP-tree structure for frequent pattern discovery."

33. Question: Choose the correct alternative from the following options, given statements (i) to (iv):

  • Statements:
    • (i) The attribute with the highest information gain is chosen as the splitting attribute
    • (ii) The attribute with the lowest information gain is chosen as the splitting attribute
    • (iii) The attribute with the highest Gini index is chosen as the splitting attribute
    • (iv) The attribute with the lowest Gini index is chosen as the splitting attribute
  • Options:
    • a) Both (i) and (ii) are true
    • b) Both (ii) and (iii) are true
    • c) (i) is true and (iv) is false
    • d) (i) is true and (iii) is false
  • WBUT Years: 2018
  • Answer: (d)
  • Explanation: > "In decision tree construction, the goal is to reduce impurity, which means selecting attributes with the highest information gain or the lowest Gini index for splitting."
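
The two criteria can be compared on a toy dataset. The sketch below is illustrative only: the rows, the attribute names `outlook` and `windy`, and the helper functions are assumptions, and real decision-tree learners add many refinements.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted(measure, groups):
    """Size-weighted average impurity across the groups produced by a split."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * measure(g) for g in groups)

def partition(rows, attr):
    """Group class labels by the value each row has for `attr`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row["label"])
    return list(groups.values())

# Hypothetical training rows (illustrative data only).
rows = [
    {"outlook": "sunny", "windy": "no", "label": "yes"},
    {"outlook": "sunny", "windy": "yes", "label": "yes"},
    {"outlook": "rain", "windy": "no", "label": "no"},
    {"outlook": "rain", "windy": "yes", "label": "no"},
]
labels = [r["label"] for r in rows]
attrs = ("outlook", "windy")

info_gain = {a: entropy(labels) - weighted(entropy, partition(rows, a)) for a in attrs}
gini_split = {a: weighted(gini, partition(rows, a)) for a in attrs}

best_by_gain = max(info_gain, key=info_gain.get)    # highest information gain
best_by_gini = min(gini_split, key=gini_split.get)  # lowest Gini index after the split
print(best_by_gain, best_by_gini)
```

In this example both criteria pick `outlook`, since splitting on it leaves perfectly pure groups while splitting on `windy` leaves the classes evenly mixed.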
