Search Results
- Data mining process models: a roadmap for knowledge discovery. Publication. Mendes, Armando B.; Cavique, Luís; Santos, Jorge M. A. Extracting knowledge from data is the major objective of any data analysis process, including those developed in sciences such as statistics and quantitative methods, databases and data warehousing, and data mining. Of these disciplines, data mining is the most ambitious because it aims to analyse and extract knowledge from massive, often badly structured data with many specific objectives. It is also used for relational database data, network data, text data, log file data, and data in many other forms. It is thus no surprise that a myriad of applications and methodologies have been and are being developed and applied for data analysis functions, of which CRISP-DM (cross-industry standard process for data mining) and SEMMA (sample, explore, modify, model, assess) are two examples. The need for a roadmap is therefore widely recognised in the field, and almost every software company has established its own process model.
- A feature selection approach in the study of Azorean proverbs. Publication. Cavique, Luís; Mendes, Armando B.; Funk, Matthias; Santos, Jorge M. A. A paremiological case (paremiology is the study of proverbs) is presented as part of a wider project based on data collected among the Azorean population. Given the considerable distance between the Azores islands, we present the hypothesis that there are significant differences in the proverbs from each island, thus permitting the identification of the native island of the interviewee based on his or her knowledge of proverbs. In this chapter, a feature selection algorithm that combines Rough Sets and the Logical Analysis of Data (LAD) is presented. The algorithm, named LAID (Logical Analysis of Inconsistent Data), deals with noisy data, and we believe that an important link was established between the two different schools with similar approaches. The algorithm was applied to a real-world dataset based on data collected using thousands of interviews of Azoreans, involving an initial set of twenty-two thousand Portuguese proverbs.
- An algorithm to discover the k-clique cover in networks. Publication. Cavique, Luís; Mendes, Armando B.; Santos, Jorge M. A. In social network analysis, a k-clique is a relaxed clique, i.e., a quasi-complete sub-graph. A k-clique in a graph is a sub-graph where the distance between any two vertices is no greater than k. A small number of vertices can easily be visualized in a graph. However, when the number of vertices and edges increases, the visualization becomes incomprehensible. In this paper, we propose a new graph mining approach based on k-cliques. The concept of relaxed clique is extended to the whole graph, to achieve a general view, by covering the network with k-cliques. The sequence of k-clique covers is presented, combining small world concepts with community structure components. Computational results and examples are presented.
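
The k-clique definition in the last abstract can be illustrated with a minimal Python sketch. It only checks whether a given vertex set satisfies the distance condition; it is not the authors' cover algorithm, which the abstract does not describe. The names `bfs_distances` and `is_k_clique`, the adjacency-list representation, and the choice to measure distances in the whole graph (rather than inside the sub-graph) are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_k_clique(graph, vertices, k):
    """Check that every pair in `vertices` is at distance <= k.

    Assumption: distances are shortest-path lengths in the whole graph,
    and unreachable pairs count as infinitely far apart.
    """
    dist = {v: bfs_distances(graph, v) for v in vertices}
    return all(
        dist[u].get(v, float("inf")) <= k
        for u, v in combinations(vertices, 2)
    )


# Hypothetical example: the path graph a - b - c - d (undirected).
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
```

On this path graph, `is_k_clique(path, ["a", "b", "c"], 2)` holds because the farthest pair, a and c, is two hops apart, while the same set fails for k = 1. With k = 1 the check reduces to the ordinary clique condition, which is the sense in which a k-clique is a relaxed clique.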