Search Results
Now showing 1 - 10 of 12
- Data mining process models: a roadmap for knowledge discovery (Publication). Authors: Mendes, Armando B.; Cavique, Luís; Santos, Jorge M. A. Extracting knowledge from data is the major objective of any data analysis process, including those developed in disciplines such as statistics and quantitative methods, databases and data warehousing, and data mining. Of these disciplines, data mining is the most ambitious, since it aims to analyse and extract knowledge from massive, often badly structured data with many specific objectives. It is also applied to relational database data, network data, text data, log file data, and data in many other forms. It is therefore no surprise that a myriad of applications and methodologies have been and are being developed for data analysis, of which CRISP-DM (cross-industry standard process for data mining) and SEMMA (sample, explore, modify, model, assess) are two examples. The need for a roadmap is thus widely recognised in the field, and almost every software company has established its own process model.
- Bringing underused learning objects to the light: a multi-agent based approach (Publication). Authors: Behr, André; Cascalho, José; Mendes, Armando B.; Guerra, Hélia; Cavique, Luís; Trigo, Paulo; Coelho, Helder; Vicari, Rosa. The digital learning transformation extends traditional libraries to online repositories. Learning object repositories are employed to deliver several functionalities related to the learning object's lifecycle. However, these educational resources are usually not described effectively, lacking, for example, educational metadata and learning goals. This metadata incompleteness limits the quality of services such as search and recommendation, resulting in educational objects that do not play a proper role in teaching and learning environments. This work proposes to give all educational resources an active role, acting on the analysis generated from usage statistics. To achieve this goal, we created a multi-agent architecture that complements the common repository functionalities to improve learning and teaching experiences. We intend to use this architecture on a repository focused on ocean literacy learning objects. This paper presents some steps toward this goal by enhancing the repository, when needed, so that it can adapt itself. A short illustrative sketch of a usage-analysis agent follows the result list.
- A feature selection approach in the study of Azorean proverbs (Publication). Authors: Cavique, Luís; Mendes, Armando B.; Funk, Matthias; Santos, Jorge M. A. A paremiologic (study of proverbs) case is presented as part of a wider project based on data collected among the Azorean population. Given the considerable distance between the Azores islands, we put forward the hypothesis that there are significant differences between the proverbs of each island, thus permitting the identification of the native island of an interviewee based on his or her knowledge of proverbs. In this chapter, a feature selection algorithm that combines Rough Sets and the Logical Analysis of Data (LAD) is presented. The algorithm, named LAID (Logical Analysis of Inconsistent Data), deals with noisy data, and we believe that an important link was established between two different schools with similar approaches. The algorithm was applied to a real-world dataset based on thousands of interviews of Azoreans, involving an initial set of twenty-two thousand Portuguese proverbs.
- Clique communities in social networks (Publication). Authors: Cavique, Luís; Mendes, Armando B.; Santos, Jorge M. A. Given the large amount of data provided by the Web 2.0, there is a pressing need for new metrics to better understand network structure: how communities are organized and how they evolve over time. Complex network and graph mining metrics are essentially based on low-complexity computational procedures such as the diameter of the graph, the clustering coefficient, and the degree distribution of the nodes. Connected communities in social networks have essentially been studied in two contexts: global metrics such as the clustering coefficient, and node groups such as graph partitions and clique communities. A minimal sketch of both kinds of measures follows the result list.
- Logical Analysis of Inconsistent Data (LAID) for a paremiologic study (Publication). Authors: Cavique, Luís; Mendes, Armando B.; Funk, Matthias. A paremiologic (study of proverbs) case is presented as part of a wider project, based on data collected through thousands of interviews with people from the Azores and involving a set of twenty-two thousand Portuguese proverbs, in which we searched for the minimum information needed to identify the birthplace island of an interviewee. The concept of birthplace was extended to all respondents who have lived in any location for more than 5 years, unintentionally introducing inconsistencies into the data classification task. Rough sets differ from classical sets in their ability to deal with inconsistent data. A parallel approach to data reduction is given by the Logical Analysis of Data (LAD). LAD handicaps, such as the inability to cope with contradiction and the limited number of classification classes, are overcome in this version, Logical Analysis of Inconsistent Data (LAID). A schematic sketch of this consistency-based reduction follows the result list.
- Big data in SATA Airline: finding new solutions for old problems (Publication). Authors: Mendes, Armando B.; Guerra, Hélia; Gomes, Luís; Oliveira, Ângelo; Cavique, Luís. With the rapid growth of the operational data needed in airlines and the value that can be attributed to the knowledge extracted from these data, airlines have already realized the importance of the technologies and methodologies associated with the concept of big data. In this article we present the case study of SATA Airlines. The operational and decision support systems are described, as well as the prospects of using these new technologies to support knowledge creation and aid the solution of problems in this specific company. The proposed system provides a new operational environment.
- Proverbs knowledge discovery in the virtual social network due to common knowledge of proverbs (Publication). Authors: Mendes, Armando B.; Funk, Matthias; Cavique, Luís. In a series of interviews, a heterogeneous set of several million relations of positive and negative knowledge was collected, describing what a group of thousands of people knows about a set of circa twenty-two thousand Portuguese proverbs. This is a unique source for the socio-cultural analysis of the mechanisms of transmission of oral culture across geographically discontinuous spaces. In this article we present some results on the problem of finding a homomorphism between proverbial knowledge and geographical location. To find this relation, we chose an approach based on social network analysis, in which the spread of oral culture, at least historically, can be interpreted as a trace of direct social contact between some of its users. We compute the Hamming distance between two people by comparing their proverbial knowledge and then keep, for every person, only the relations to peers for which this distance is minimal. The resulting graph is analysed by a new clique analysis procedure, proposed in this work, designed to work on very dense networks. The procedure was tested on a subset of the data, and we found clusters in which the neighbourhood relation induced by the minimum Hamming distance could be a reflection of the geographical distribution and of some migration flows of the Azorean population. When comparing the cliques with high geographic proximity, we found some proverbs that are good discriminators between the different clusters. A minimal sketch of the Hamming-distance graph construction follows the result list.
- A bi-objective feature selection algorithm for large omics datasets (Publication). Authors: Cavique, Luís; Mendes, Armando B.; Martiniano, Hugo F. M. C.; Correia, Luís. Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency-based methods are a significant category of feature selection research that substantially improves the comprehensibility of the result through the parsimony principle. In this work, the bi-objective version of the Logical Analysis of Inconsistent Data algorithm is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, a heuristic decomposition uses parallel processing to solve a set covering problem, combined with a cross-validation technique. The bi-objective solutions contain the number of reduced features and the accuracy. The algorithm is applied to omics datasets with genome-like characteristics from patients with rare diseases. A sketch of the bi-objective evaluation follows the result list.
- Integration of UML diagrams from the perspective of enterprise architecture (Publication). Authors: Cavique, Luís; Cavique, Mariana; Mendes, Armando B. An integrated view of the information system has long been an objective in dealing with complexity. However, the literature proposes many solutions with many synonyms, depending on the layer, methodology, framework or tool used, which does not allow a broad view of the system. In this work we choose three basic elements of information systems and demonstrate that they are sufficient to integrate a set of essential UML diagrams. The proposed model first defines a set of UML diagrams for each layer of the Enterprise Architecture, and then heuristic rules are detailed in order to ensure vertical and horizontal alignment.
- Enhancing learning object repositories with ontologies (Publication). Authors: Behr, André; Mendes, Armando B.; Cascalho, José; Rossi, Luiz; Vicari, Rosa; Trigo, Paulo; Novo, Paulo; Cavique, Luís; Guerra, Hélia. In this paper, we present a review of the use of ontologies in learning object repository systems for search and suggestion purposes, considering its adoption for the seaThings project, which aims to promote ocean literacy. We also describe the use case of the Cognix system and the Agent-based Learning Objects (OBAA) metadata standard for learning objects, which is being implemented on a new learning object repository. This system includes concepts from artificial intelligence, such as agents and ontologies, that aim to improve search and thus make the system more responsive. This paper also suggests how an ontology can be implemented, using metadata in learning object repositories to provide relevant aspects such as interoperability, reuse, and searching. A minimal metadata-and-query sketch follows the result list.
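The multi-agent approach in "Bringing underused learning objects to the light" acts on usage statistics to surface resources that receive little attention or lack educational metadata. The sketch below is only a minimal illustration of that idea, assuming a hypothetical `UsageAnalysisAgent` and a toy `LearningObject` record; it is not the architecture described in the paper.

```python
# Hypothetical sketch: a single "usage analysis" agent that flags underused
# or under-described learning objects from a usage table. All names and the
# threshold are illustrative, not the paper's API.
from dataclasses import dataclass, field


@dataclass
class LearningObject:
    identifier: str
    title: str
    views: int = 0
    metadata: dict = field(default_factory=dict)

    def is_incomplete(self) -> bool:
        # Considered poorly described if it lacks learning goals.
        return "learning_goals" not in self.metadata


class UsageAnalysisAgent:
    """Flags objects so other agents (e.g. enrichment or recommendation
    agents) can act on them."""

    def __init__(self, min_views: int = 10):
        self.min_views = min_views

    def analyse(self, repository: list[LearningObject]) -> list[str]:
        return [lo.identifier for lo in repository
                if lo.views < self.min_views or lo.is_incomplete()]


if __name__ == "__main__":
    repo = [
        LearningObject("lo-1", "Ocean currents", views=120,
                       metadata={"learning_goals": ["explain tides"]}),
        LearningObject("lo-2", "Plankton basics", views=2),
    ]
    print(UsageAnalysisAgent(min_views=10).analyse(repo))  # ['lo-2']
```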
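"Clique communities in social networks" distinguishes global metrics (diameter, clustering coefficient, degree distribution) from node-group structures such as clique communities. A minimal sketch of both kinds of measures, using networkx on a toy graph rather than the authors' implementation:

```python
# Global metrics vs. node groups on a small social network, for illustration.
import networkx as nx
from networkx.algorithms.community import k_clique_communities

G = nx.karate_club_graph()  # toy social network

# Global, low-complexity metrics.
print("diameter:", nx.diameter(G))
print("average clustering coefficient:", round(nx.average_clustering(G), 3))
print("degree distribution:", nx.degree_histogram(G))

# Clique communities: unions of adjacent k-cliques (clique percolation).
for i, community in enumerate(k_clique_communities(G, k=4)):
    print(f"4-clique community {i}: {sorted(community)}")
```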
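The two LAID abstracts describe a consistency-based reduction in the spirit of the Logical Analysis of Data. The sketch below is a schematic reading of that idea, not the published algorithm: every pair of observations with different labels must be separated by at least one selected attribute, and a greedy set cover picks a small attribute subset. Genuinely inconsistent pairs (identical attributes, different labels) are simply skipped here, whereas LAID handles them explicitly.

```python
# Schematic consistency-based feature reduction via greedy set cover.
import itertools
import numpy as np


def greedy_consistency_selection(X: np.ndarray, y: np.ndarray) -> list[int]:
    n, m = X.shape
    # Pairs of rows with different classes that still need to be separated.
    pairs = [(i, j) for i, j in itertools.combinations(range(n), 2)
             if y[i] != y[j] and np.any(X[i] != X[j])]
    selected: list[int] = []
    while pairs:
        # How many uncovered pairs does each attribute separate?
        coverage = [sum(X[i, a] != X[j, a] for i, j in pairs) for a in range(m)]
        best = int(np.argmax(coverage))
        if coverage[best] == 0:
            break
        selected.append(best)
        pairs = [(i, j) for i, j in pairs if X[i, best] == X[j, best]]
    return selected


# Tiny binary example: attribute 1 alone separates the two classes.
X = np.array([[0, 0, 1],
              [1, 0, 1],
              [0, 1, 0],
              [1, 1, 0]])
y = np.array([0, 0, 1, 1])
print(greedy_consistency_selection(X, y))  # [1]
```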
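"Proverbs knowledge discovery in the virtual social network" builds a graph by linking each person only to the peers at minimal Hamming distance over their proverb-knowledge vectors. A toy reconstruction of that construction with synthetic data; the authors' clique analysis procedure is replaced here by networkx's generic maximal-clique enumeration:

```python
# Each person is a binary vector of known/unknown proverbs; keep only edges
# to the peers at minimal Hamming distance, then inspect the cliques.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
knowledge = rng.integers(0, 2, size=(8, 20))    # 8 people x 20 proverbs (toy)

G = nx.Graph()
n = knowledge.shape[0]
for i in range(n):
    # Hamming distance from person i to everyone else.
    dist = np.count_nonzero(knowledge != knowledge[i], axis=1)
    dist[i] = dist.max() + 1                    # ignore self
    d_min = dist.min()
    for j in np.flatnonzero(dist == d_min):     # minimal-distance peers only
        G.add_edge(i, int(j), weight=int(d_min))

print(G.number_of_nodes(), "people,", G.number_of_edges(), "minimal-distance edges")
print("maximal cliques:", list(nx.find_cliques(G)))
```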
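For "A bi-objective feature selection algorithm for large omics datasets", each candidate subset is judged by the pair (number of reduced features, accuracy). The sketch below only illustrates how such pairs can be produced with cross-validation; the variance-based ranking used to form the subsets is a stand-in, not the LAID set-covering decomposition.

```python
# Produce (number of features, cross-validated accuracy) pairs for nested
# feature subsets; the subsets themselves are illustrative stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=40, n_informative=6,
                           random_state=1)
ranked = np.argsort(-np.var(X, axis=0))          # stand-in feature ranking

solutions = []
for k in (2, 5, 10, 20, 40):
    cols = ranked[:k]
    acc = cross_val_score(DecisionTreeClassifier(random_state=1),
                          X[:, cols], y, cv=5).mean()
    solutions.append((k, round(acc, 3)))

print(solutions)   # list of (reduced features, accuracy) pairs
```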
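"Enhancing learning object repositories with ontologies" discusses metadata-driven search and suggestion. As a purely illustrative sketch, the snippet below describes one learning object as a small RDF graph and queries it with SPARQL using rdflib; the namespace and property names are invented and do not reproduce the OBAA vocabulary.

```python
# Toy RDF description of a learning object plus a SPARQL search.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/lo#")   # invented namespace for the sketch
g = Graph()

lo = URIRef("http://example.org/lo/ocean-currents")
g.add((lo, RDF.type, EX.LearningObject))
g.add((lo, EX.title, Literal("Ocean currents")))
g.add((lo, EX.learningGoal, Literal("Explain how tides are formed")))
g.add((lo, EX.topic, EX.OceanLiteracy))

# Search: which learning objects address ocean literacy?
query = """
SELECT ?lo ?title WHERE {
  ?lo a ex:LearningObject ;
      ex:topic ex:OceanLiteracy ;
      ex:title ?title .
}
"""
for row in g.query(query, initNs={"ex": EX}):
    print(row.lo, row.title)
```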