I am a data scientist with a software engineering background, passionate about learning and technology. My main interests are Machine Learning and Data Science, which have been the core of my early career.
- Courses: Calculus, Linear Algebra, Algorithms and Complexity, Operating Systems, Computer Architecture, Functional Programming, Imperative Programming, Object-Oriented Programming, Numerical Optimization.
- Graduated with a 17/20 GPA. The main topics I specialized in were:
- Artificial Intelligence, covering intelligent agents, robotics, machine learning, and data science.
- Parallel and Distributed Computing, covering advanced computer architectures (manycore processors, GPUs), parallel algorithms, and performance engineering.
- Dissertation: Human-Computer Interaction Data Analysis using Deep Learning.
- Publication: Pinto, J.P., Pimenta, A. & Novais, P. Deep learning and multivariate time series for cheat detection in video games. Machine Learning 110, 3037–3057 (2021).
- Conducted in-depth data analyses and built dashboards used by leadership and by the Product, Marketing, Business Development, and Finance teams, helping them improve targeting, increase revenue, and allocate resources more efficiently based on key business metrics.
- Implemented a recommender system to help millions of users diversify their cryptocurrency investment portfolios.
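The portfolio-diversification recommender above could take many forms; a minimal sketch of one common approach, item-based collaborative filtering over a user-asset holdings matrix, is shown below. The `recommend` function, the toy `holdings` matrix, and all names are illustrative assumptions, not the production system.

```python
import numpy as np

def recommend(interactions: np.ndarray, user: int, top_k: int = 3) -> list[int]:
    """Score assets the user does not yet hold by item-item cosine
    similarity against the user's existing holdings."""
    # Column-normalize so that dot products between columns become cosines.
    norms = np.linalg.norm(interactions, axis=0, keepdims=True)
    normalized = interactions / np.where(norms == 0, 1, norms)
    similarity = normalized.T @ normalized      # asset x asset similarity
    scores = similarity @ interactions[user]    # aggregate score per asset
    scores[interactions[user] > 0] = -np.inf    # exclude assets already held
    return list(np.argsort(scores)[::-1][:top_k])

# Toy user-asset matrix: rows are users, columns are assets (1 = held).
holdings = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)

print(recommend(holdings, user=0, top_k=2))
```

Recommending assets a user has *not* yet held, ranked by similarity to what they do hold, is what nudges portfolios toward diversification rather than concentration.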
- Designed A/B tests and other controlled experiments to continuously improve features and analyze the impact of new releases. Advocated for the systematic use of controlled experimentation as the standard for decision-making.
- Played a key role in the collaboration between the Data and Product teams, driving a data-driven product development process.
- Designed a system for cheat detection and continuous authentication in video games using deep learning and multivariate time series.
- Developed an end-to-end data science & machine learning pipeline:
- ETLs processing user input data (MongoDB, Python, pandas, NumPy);
- Data cleaning and feature engineering to produce a dataset of multivariate time series;
- Training, hyperparameter optimization, and evaluation of CNNs in fraud detection and continuous authentication (TensorFlow + Keras, sklearn, Optuna);
- Serving the trained models through a RESTful API (Java, TensorFlow Java API, Spring Boot, Docker).
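The data-preparation step of the pipeline above can be sketched as follows: slicing a cleaned multivariate input stream into fixed-length overlapping windows, the tensor shape a 1D CNN consumes. This is a minimal NumPy illustration; the channel names, window length, and stride are assumptions, and the actual training and hyperparameter-optimization code (TensorFlow/Keras, Optuna) is omitted.

```python
import numpy as np

def make_windows(series: np.ndarray, window: int, stride: int) -> np.ndarray:
    """Slice a (timesteps, channels) multivariate series into overlapping
    fixed-length windows shaped (n_windows, window, channels)."""
    starts = range(0, len(series) - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])

# Toy stream: 100 timesteps of 3 hypothetical input channels
# (e.g. mouse dx, mouse dy, keypress indicator).
rng = np.random.default_rng(0)
stream = rng.normal(size=(100, 3))

# Per-channel standardization before windowing.
stream = (stream - stream.mean(axis=0)) / stream.std(axis=0)

windows = make_windows(stream, window=32, stride=16)
print(windows.shape)  # (5, 32, 3)
```

Each window then becomes one training example, labeled per player, so the CNN can learn the temporal signatures that distinguish legitimate input from cheat-assisted input.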
- Helped extend the company’s internal framework by developing a dashboard to monitor ETL execution logs. The ETL processes were designed, managed, and executed using Pentaho Data Integration.
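An ETL-monitoring dashboard like the one described above typically starts from an aggregation of execution-log records. The sketch below shows one way to summarize such logs per job with pandas; the field names and sample rows are invented for illustration and do not reflect Pentaho's actual log schema.

```python
import pandas as pd

# Illustrative log records, loosely mirroring fields an ETL job log exposes.
logs = pd.DataFrame({
    "job":        ["load_users", "load_users", "load_trades", "load_trades"],
    "status":     ["success", "failure", "success", "success"],
    "duration_s": [41.0, 38.5, 120.3, 118.9],
})

# Per-job run counts, failure counts, and mean duration for the dashboard.
summary = (
    logs.groupby("job")
        .agg(runs=("status", "size"),
             failures=("status", lambda s: int((s == "failure").sum())),
             avg_duration_s=("duration_s", "mean"))
        .reset_index()
)
print(summary)
```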