Search Results

Now showing 1 - 6 of 6
  • Regular sports services: dataset of demographic, frequency and service level agreement
    Publication . Pinheiro, Paulo; Cavique, Luís
    This article describes a dataset of the different services acquired by users during the period in which they are active in a sports facility, as well as their behavior in terms of how often they attend the facility and the types of classes they prefer. Each observation in the dataset corresponds to one user, including subscription and frequency features. Data were collected between June 1st 2014 and October 31st 2019 from the database of an ERP solution operating in a sports facility in Lisbon, Portugal. From this database, it was possible to perform extraction, transformation and loading operations into the dataset. The dataset, which contains real data, can be useful for research in areas such as customer retention, machine learning, marketing, actionable knowledge and others. Although we present real data from users of a sports facility, in order to comply with the GDPR legislation, the attributes that could identify the users were removed, making the data anonymized.
  • Uplift modeling using the transformed outcome approach
    Publication . Pinheiro, Paulo; Cavique, Luís
    Churn and how to deal with it is an essential issue in the telecommunications sector. Within the scope of actionable knowledge, we argue that it is crucial to find effective personalized interventions that can lead to a reduction in dropouts and that, at the same time, make it possible to determine the causal effect of these interventions. Considering an intervention that encourages clients to opt for a longer-term contract in exchange for benefits, we used uplift modeling and the transformed outcome approach as a machine learning-based technique for individual-level prediction. The result is a set of actionable profiles of persuadable customers that increase retention while striking the right balance with the campaign budget.
  • Telco customer churn analysis: measuring the effect of different contracts
    Publication . Pinheiro, Paulo; Cavique, Luís
    Customer retention is nowadays a challenge that requires concrete and personalized actions. Traditional data mining studies focused on predictive analytics, neglecting the business domain. This work aims to present an actionable knowledge discovery approach based on specific, actionable attributes and on measuring their effects. Matching and propensity score approaches are commonly used in healthcare to evaluate causality. After performing matching using the actionable attributes in this analysis, the causal effect is quantified. This work concludes that having a yearly contract rather than a monthly contract affects churn by around 34%.
  • A bi‐objective procedure to deliver actionable knowledge in sport services
    Publication . Pinheiro, Paulo; Cavique, Luís
    Increasing customer retention in gyms and health clubs is nowadays a challenge that requires concrete and personalized actions. Traditional data mining studies focused essentially on predictive analytics, neglecting the business domain. This work presents an actionable knowledge discovery system that uses a three-step pipeline (data collection, predictive model, retention interventions). In the first step, it extracts and transforms existing real data from the databases of the sports facilities. In the second step, predictive models are applied to identify the user profiles most susceptible to dropout, where actionable withdrawal rules are based on actionable attributes. Finally, in the third step, based on the previous actionable knowledge, some of the values of the actionable attributes should be changed in order to increase retention. Scenarios are simulated with test and control groups, and business utility and associated cost are measured. This document presents a bi-objective study in order to choose the most efficient scenarios.
  • Large language model for querying databases in Portuguese
    Publication . Figueiredo, Lourenço; Pinheiro, Paulo; Cavique, Luís; Marques, Nuno
    This study introduces a system that helps non-expert users find information easily without knowing database languages or asking technicians for help. A specific domain is explored, focusing on a subscription-based sports facility, which serves as an open-source version of a real case study. Utilizing the star schema, the available data in the database is structured to provide accessibility through Portuguese Natural Language queries. Using a Large Language Model (LLM), SQL queries are generated based on the question and the provided star schema. We created a dataset with 115 highly challenging questions drawn from real-world usage scenarios to validate the correctness of the system. Challenges found during testing, like attribute value interpretation, out-of-scope questions, and temporal interval adequacy issues, highlight the insufficiency of the star schema alone in providing the needed context for generating accurate SQL queries by the LLM. Addressing these challenges through enhanced contextual information shows significant improvement in query correctness, with validation results increasing from 57.76% to 88.79%. This study shows the potential and limitations of LLMs in generating SQL queries from Portuguese Natural Language queries.
  • Data science maturity model: from raw data to Pearl’s causality hierarchy
    Publication . Cavique, Luís; Pinheiro, Paulo; Mendes, Armando B.
    Data maturity models are an important and current topic since they allow organizations to plan their medium and long-term goals. However, most maturity models do not follow what is done in digital technologies regarding experimentation. Data Science appears in the literature related to Business Intelligence (BI) and Business Analytics (BA). This work presents a new data science maturity model that combines previous ones with the emerging Business Experimentation (BE) and causality concepts. In this work, each level is identified with a specific function. For each level, the techniques are introduced and associated with meaningful wh-questions. We demonstrate the maturity model by presenting two case studies.
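
For readers unfamiliar with the transformed outcome approach named in the uplift modeling result above, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes a randomized campaign with a known treatment probability `p`; the segment names, the sample rows, and the helper names `transformed_outcome` and `uplift_by_segment` are hypothetical. The outcome `y` of each customer is rescaled to `y * (w - p) / (p * (1 - p))`, whose conditional expectation equals the treatment effect, so averaging it within a segment gives a rough uplift estimate for that segment.

```python
from collections import defaultdict


def transformed_outcome(y, w, p):
    """Transformed outcome for one customer.

    y: observed outcome (1 = retained, 0 = churned)
    w: 1 if the customer received the intervention, 0 otherwise
    p: probability of receiving the intervention (assumed known, 0 < p < 1)
    """
    return y * (w - p) / (p * (1 - p))


def uplift_by_segment(rows, p=0.5):
    """Average the transformed outcome per segment as a crude uplift estimate."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for segment, w, y in rows:
        sums[segment] += transformed_outcome(y, w, p)
        counts[segment] += 1
    return {segment: sums[segment] / counts[segment] for segment in sums}


# Hypothetical toy data: (segment, treated, retained)
rows = [
    ("monthly", 1, 1), ("monthly", 1, 1), ("monthly", 0, 0), ("monthly", 0, 1),
    ("yearly", 1, 1), ("yearly", 1, 1), ("yearly", 0, 1), ("yearly", 0, 1),
]

uplift = uplift_by_segment(rows)
for segment, value in uplift.items():
    print(segment, value)
```

In practice the per-segment average would usually be replaced by a regression model fitted on the transformed outcome, so that uplift can be predicted at the individual level from customer features.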