Correia, Luís

Search results

Showing 1 - 8 of 8

Data pre-processing and data generation in the student flow case study
Publication . Cavique, Luís; Pombinho, Paulo; Tallón Ballesteros, Antonio J.; Correia, Luís
Education covers a range of sectors from kindergarten to higher education. In the education system, each grade has three possible outcomes: dropout, retention and pass to the next grade. In this work, we study the data from the Department of Statistics of Education and Science (DGEEC) of the Education Ministry. DGEEC maintains those outcomes for each school year; therefore, this study seeks a longitudinal view based on student flow. The document reports the data pre-processing, a stochastic model based on the pre-processed data, and a data generation process that uses that model.
2020 · Conference paper · Open access
A bi-objective feature selection algorithm for large omics datasets
Publication . Cavique, Luís; Mendes, Armando B.; Martiniano, Hugo F. M. C.; Correia, Luís
Feature selection is one of the most important concepts in data mining when dimensionality reduction is needed. The performance measures of feature selection encompass predictive accuracy and result comprehensibility. Consistency based methods are a significant category of feature selection research that substantially improves the comprehensibility of the result using the parsimony principle. In this work, the bi-objective version of the algorithm Logical Analysis of Inconsistent Data is applied to large volumes of data. In order to deal with hundreds of thousands of attributes, heuristic decomposition uses parallel processing to solve a set covering problem and a cross-validation technique. The bi-objective solutions contain the number of reduced features and the accuracy. The algorithm is applied to omics datasets with genome-like characteristics of patients with rare diseases.
2018 · Scientific article · Open access
Qualidade de dados em bases de dados anonimizadas: uma abordagem de avaliação mista [Data quality in anonymized databases: a mixed evaluation approach]
Publication . Pombinho, Paulo; Cavique, Luís; Correia, Luís
Data quality is essential for a correct understanding of the concepts the data represent. In data mining projects it is especially important to avoid low-quality data, since the algorithms used depend on correct data to build accurate models and predictions. In this article, we propose a quality evaluation approach that considers metrics dealing with individual attributes and, additionally, a longitudinal flow analysis, which allows a quality evaluation that takes contextual information into account. Metrics for Data Quality per Entry and Data Quality per Attribute are proposed and, finally, a Global Data Quality measure based on those metrics is proposed.
2021 · Scientific article · Open access
Influência de fatores socioeconómicos no sistema de ensino português [Influence of socioeconomic factors on the Portuguese education system]
Publication . Pombinho, Paulo; Cavique, Luís; Correia, Luís
This article studies the influence of the socioeconomic factors of different municipalities on students' educational success. To check for factors relevant to students' academic paths, datasets were obtained with socioeconomic descriptors per municipality, average national exam grades, and student success rates. These datasets were submitted to a K-nearest neighbours technique to estimate attribute values for municipalities with missing values. Classification algorithms, using decision and regression trees, were then applied to analyse which socioeconomic attributes were potentially most related to school success. The work identifies some factors as targets for potential future studies, although no strong correlations were found with any socioeconomic attribute.
2022-12 · Scientific article · Open access
Self-organised systems: fundamental properties
Publication . Correia, Luís
A set of fundamental properties of self-organised systems is identified. Asynchronism is here proposed as one of these properties. It is shown that, by overlooking it, the concept of self-organisation is not fulfilled. Implications of this property to the study of self-organisation are discussed. Further, two other salient aspects are identified: minimisation of local conflicts produces optimal evolutionarily stable self-organisation; and the hypothesis that complexity variations may distinguish living from non-living self-organised systems. Conclusions and further research bring the document to an end.
2006 · Scientific article · Open access
Previsão eleitoral para a Assembleia da República Portuguesa [Electoral forecasting for the Portuguese Assembly of the Republic]
Publication . Azevedo, Diamantino; Correia, Luís; Gaspar, Graça
In previous work, Data Mining techniques were used to predict election results without using polls, relying on publicly available socioeconomic variables about Portugal for the period covered by the thirteen elections to the Assembly of the Republic between 1974 and 2009. However, the political spectrum considered in that work does not cover 100% of the votes cast, but only the four parties with regular parliamentary seats since 1975, whose vote share reaches about 84%. In the approach previously adopted, each of the four traditional parties was treated separately, resulting in independent predictions. This article analyses the extension of that work to the prediction of the remaining share of the election results and its use to enforce the constraint that the total percentage of votes cast sums to 100%. The results showed that the previously applied methods yield predictions of similar quality for the set of complementary political forces.
2012 · Scientific article · Open access
Errors of identifiers in anonymous databases: impact on data quality
Publication . Pombinho, Paulo; Cavique, Luís; Correia, Luís
Data quality is essential for a correct understanding of the concepts the data represent. It is especially relevant in data mining, where data of inferior quality may be used in algorithms that depend on correct data to create accurate models and predictions. In this work, we introduce the issue of errors of identifiers in an anonymous database. The work proposes a quality evaluation approach that considers individual attributes and a contextual analysis that allows additional quality evaluations. The proposed quality analysis model is a robust means of minimizing anonymization costs.
2023 · Conference paper · Open access
A data science maturity model applied to students' modeling
Publication . Cavique, Luís; Pombinho, Paulo; Correia, Luís
Maturity models define a series of levels, each representing an increased complexity in information systems. Data Science appears in the Business Intelligence (BI) and Business Analytics (BA) literature. This work applies the _IABE maturity model, which includes two additional levels: Data Engineering (DE) at the bottom and Business Experimentation (BE) at the top. This study uses the _IABE model for students' modeling in the ModEst project. For this purpose, the Public Administration body is the Directorate-General for Statistics of Education and Science (DGEEC) of the Portuguese Education Ministry. DGEEC provided vast data on two million students per year in the Portuguese school system, from pre-school to doctoral programs. This work presents the comprehensible _IABE maturity model to extract new knowledge from the DGEEC dataset. The method applied is _IABE, where after the DE level, wh-questions are formulated and answered with the most appropriate techniques at each maturity level. This work's novelty is applying the maturity model _IABE to a unique dataset for the first time. Wh-questions are stated at the BI level using data summarization; at the BA level, predictive models are built; and counterfactual approaches are presented at the BE level.
2023-12-06 · Scientific article · Open access
