Publication

Addressing low dimensionality feature subset selection: reliefF(-k) or extended correlation-based feature selection(eCFS)?

dc.contributor.author: Tallón-Ballesteros, Antonio J.
dc.contributor.author: Cavique, Luís
dc.contributor.author: Fong, Simon
dc.date.accessioned: 2019-11-04T17:24:32Z
dc.date.available: 2019-11-04T17:24:32Z
dc.date.issued: 2019-05
dc.description.abstract: This paper tackles problems where attribute selection not only chooses just a few features but also yields lower classification accuracy than the full attribute set. Correlation-based feature selection (CFS) is used as the baseline attribute subset selection method because of its popularity and high performance. Around a hundred data sets were collected and submitted to CFS; the problems simultaneously fulfilling two conditions, a) fewer than six selected attributes and b) a percentage of selected attributes below forty per cent, were then tested in two directions. First, within data selection at the feature level, some options proposed in prior work as well as an advanced contemporary approach were applied. Second, the pre-processed data and the initial problems were tested with some sturdy classifiers. Moreover, this work introduces a new taxonomy of feature selection according to the solution type and the way it is computed. The test bed comprises seven problems: three of them report a single selected attribute, another one two extracted features, and the three remaining data sets four or five retained attributes, all of them by CFS. Additionally, the number of features ranges between six and twenty-nine and the complexity of the problems, in terms of classes, fluctuates between two and twenty-one, giving averages of sixteen and around five, respectively. The work concludes that the advanced procedure is suitable for problems where only one or two attributes are selected by CFS; for data sets with more than two selected features the baseline method is preferable to the advanced one, although the considered feature ranking method achieved intermediate results. (pt_PT)
dc.description.version: info:eu-repo/semantics/publishedVersion (pt_PT)
dc.identifier.doi: 10.1007/978-3-030-20055-8_24
dc.identifier.uri: http://hdl.handle.net/10400.2/8705
dc.language.iso: eng (pt_PT)
dc.peerreviewed: yes (pt_PT)
dc.subject: Machine learning (pt_PT)
dc.subject: Feature subset selection (pt_PT)
dc.subject: Feature ranking (pt_PT)
dc.subject: Extended feature subset selection (pt_PT)
dc.title: Addressing low dimensionality feature subset selection: reliefF(-k) or extended correlation-based feature selection(eCFS)? (pt_PT)
dc.type: conference object
dspace.entity.type: Publication
oaire.citation.conferencePlace: Sevilha (pt_PT)
oaire.citation.endPage: 260 (pt_PT)
oaire.citation.startPage: 251 (pt_PT)
oaire.citation.title: SOCO 2019. 14th International Conference on Soft Computing Models in Industrial and Environmental Applications (pt_PT)
oaire.citation.volume: 950 (pt_PT)
person.familyName: Cavique
person.givenName: Luís
person.identifier.ciencia-id: 911E-84AC-3956
person.identifier.orcid: 0000-0002-5590-1493
rcaap.rights: openAccess (pt_PT)
rcaap.type: conferenceObject (pt_PT)
relation.isAuthorOfPublication: 40906a16-46a2-42f1-b26d-7db7012294ee
relation.isAuthorOfPublication.latestForDiscovery: 40906a16-46a2-42f1-b26d-7db7012294ee

Files

Original bundle
Name: soco2019_paper_26.pdf
Size: 216.46 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.97 KB
Format: Item-specific license agreed upon to submission