Publication
Managing missing data and predictions in short time series
| datacite.subject.fos | Engenharia e Tecnologia | |
| dc.contributor.author | António, Francisco de Araújo | |
| dc.contributor.author | Cavique, Luís | |
| dc.date.accessioned | 2026-05-08T11:14:31Z | |
| dc.date.available | 2026-05-08T11:14:31Z | |
| dc.date.issued | 2026-01-15 | |
| dc.description.abstract | Sales forecasting in the presence of missing data poses significant challenges, particularly for short time series where limited observations amplify the impact of incomplete records. This study analyzes a real-world transactional dataset (2021–2024) to predict quantities and prices for 2025. We classify missingness patterns and mechanisms (MCAR, MAR, MNAR) to inform the selection of imputation strategies. We evaluate techniques including MICE, Mean, KNN, and Linear Regression under simulated missingness rates, with KNN emerging as the most robust for the MAR mechanism. Regarding very short-term series predictions, the naive forecast Max2 (maximum of the last two observed values) outperformed moving averages. The results highlight the importance of mechanism-aware imputation and domain-tailored forecasting in sparse datasets. This work presents a practical framework for businesses to effectively utilize incomplete sales data. | eng |
| dc.description.sponsorship | Acknowledgments. This work was supported by the LASIGE Research Unit, reference UID/00408/2025 – LASIGE. | |
| dc.identifier.doi | 10.1007/978-3-032-05176-9_22 | |
| dc.identifier.issn | 1611-3349 | |
| dc.identifier.uri | http://hdl.handle.net/10400.2/22008 | |
| dc.language.iso | eng | |
| dc.peerreviewed | yes | |
| dc.publisher | Springer Nature [academic journals on nature.com] | |
| dc.relation | LARGE-SCALE INFORMATICS SYSTEMS LABORATORY | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Missing data | |
| dc.subject | Time series forecasting | |
| dc.subject | Imputation techniques | |
| dc.subject | Sales prediction | |
| dc.subject | Short time series | |
| dc.title | Managing missing data and predictions in short time series | por |
| dc.type | conference proceedings | |
| dspace.entity.type | Publication | |
| oaire.awardNumber | UID/CEC/00408/2019 | |
| oaire.awardTitle | LARGE-SCALE INFORMATICS SYSTEMS LABORATORY | |
| oaire.awardURI | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UID%2FCEC%2F00408%2F2019/PT | |
| oaire.citation.conferenceDate | 2025-09 | |
| oaire.citation.title | LNAI 16121. Progress in Artificial Intelligence (EPIA 2025) | |
| oaire.fundingStream | 6817 - DCRRNI ID | |
| oaire.version | http://purl.org/coar/version/c_970fb48d4fbd8a85 | |
| person.familyName | António | |
| person.familyName | Cavique | |
| person.givenName | Francisco de Araújo | |
| person.givenName | Luís | |
| person.identifier.ciencia-id | 911E-84AC-3956 | |
| person.identifier.orcid | 0009-0006-1154-9690 | |
| person.identifier.orcid | 0000-0002-5590-1493 | |
| project.funder.identifier | http://doi.org/10.13039/501100001871 | |
| project.funder.name | Fundação para a Ciência e a Tecnologia | |
| relation.isAuthorOfPublication | b6e0f781-97fb-4fec-ac50-a07abd18bcd4 | |
| relation.isAuthorOfPublication | 40906a16-46a2-42f1-b26d-7db7012294ee | |
| relation.isAuthorOfPublication.latestForDiscovery | 40906a16-46a2-42f1-b26d-7db7012294ee | |
| relation.isProjectOfPublication | e888b4af-8eff-4efb-826c-fd87c5facd97 | |
| relation.isProjectOfPublication.latestForDiscovery | e888b4af-8eff-4efb-826c-fd87c5facd97 |
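
The abstract defines the Max2 naive forecast as the maximum of the last two observed values and compares it with moving averages. The following is a minimal Python sketch of that comparison, not the authors' implementation. The function names, the use of `None` to mark a missing observation, the window size, and the sample figures are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def observed(series: Sequence[Optional[float]]) -> List[float]:
    """Drop missing entries (assumed here to be encoded as None)."""
    return [x for x in series if x is not None]


def max2_forecast(series: Sequence[Optional[float]]) -> float:
    """Naive Max2 forecast: maximum of the last two observed values."""
    obs = observed(series)
    if not obs:
        raise ValueError("series has no observed values")
    # With a single observation, the slice holds just that value.
    return max(obs[-2:])


def moving_average_forecast(
    series: Sequence[Optional[float]], window: int = 3
) -> float:
    """Baseline: mean of the last `window` observed values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    obs = observed(series)
    if not obs:
        raise ValueError("series has no observed values")
    tail = obs[-window:]
    return sum(tail) / len(tail)


# Hypothetical yearly quantities for 2021 to 2024, with one missing year.
yearly_quantity = [120.0, None, 95.0, 140.0]

print("Max2 forecast:", max2_forecast(yearly_quantity))
print("Moving average forecast:", moving_average_forecast(yearly_quantity))
```

In this sketch both forecasts skip missing entries rather than imputing them; the record's abstract treats imputation (for example with KNN) as a separate, earlier step.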
