Seminari CS
The Seminari CS is the monthly seminar of the Computer Science Department at UPC. The seminar is announced internally and on the department’s webpage. You can also subscribe to the Google Calendar.
Where/When? Room C6 003, the first Wednesday of each month, 12:00–13:00 (unless announced otherwise)
In what language? English/Spanish/Catalan (depending on the speaker/audience)
If you (or a visitor of yours) would like to give a talk, please send me an email.
All talks (past and upcoming)
-
Jul 1, 2026 (Room C6 003, h. 12:00–13:00)
TBD
TBD ()
Abstract
TBD
-
Jun 3, 2026 (Room C6 003, h. 12:00–13:00)
TBD
TBD ()
Abstract
TBD
-
May 6, 2026 (Room C6 003, h. 12:00–13:00)
TBD
Albert Oliveras (UPC)
Abstract
TBD
-
Apr 8, 2026 (Room C6 003, h. 12:00–13:00)
From Black to White Boxes: Interpretable Regression with the trust-free Python package
Albert Dorador (University of Wisconsin - Madison)
Abstract
Machine Learning practitioners often face a trade-off: high accuracy with complex, black-box models (like XGBoost or Random Forests) or lower accuracy with transparent models (like decision trees or linear models). What if you didn’t have to choose?
This tutorial introduces TRUST (Transparent, Robust, and Ultra-Sparse Trees), a new interpretable regression framework that combines decision trees with sparse linear models to deliver Random Forest-level accuracy. The algorithm is implemented in the Python package trust-free (available via pip install). We will demonstrate how TRUST autonomously recovers the WHO obesity threshold (BMI = 30) from raw data to inform medical risk pricing.
By the end, you will be able to train high-performing, interpretable regression models and generate automated, natural-language explanation reports for individual predictions and deterministic feature importance.
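The core idea — a shallow tree whose leaves carry their own linear models — can be sketched in a few lines. The following is a minimal, hypothetical illustration (a depth-1 "model tree" fit by exhaustive split search), not the trust-free API; the data and constants are made up, with a regime change planted at x = 30 to echo the BMI example:

```python
import numpy as np

def best_split(x, y, candidates):
    """Depth-1 'model tree': choose the split point that minimizes the
    total squared error of a separate linear fit on each side."""
    best_t, best_err = None, np.inf
    for t in candidates:
        left = x < t
        if left.sum() < 2 or (~left).sum() < 2:
            continue
        err = 0.0
        for mask in (left, ~left):
            coef = np.polyfit(x[mask], y[mask], 1)       # slope, intercept
            err += np.sum((np.polyval(coef, x[mask]) - y[mask]) ** 2)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Synthetic risk data with a planted regime change at x = 30.
rng = np.random.default_rng(0)
x = rng.uniform(15, 45, 500)
y = np.where(x < 30, 0.1 * x, 2.0 * x - 57.0) + rng.normal(0.0, 0.5, 500)
split = best_split(x, y, candidates=np.arange(20.0, 40.0, 0.5))
print(split)   # expected to land near the planted threshold of 30
```

The resulting model is fully inspectable: one threshold plus two linear equations, rather than an ensemble of hundreds of trees.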
-
Mar 11, 2026 (Room C6 003, h. 12:00–13:00)
Research in microgravity: Computational space medicine (slides)
Antoni Perez-Poch (UPC)
Abstract
In this seminar we will look into the interesting area of aerospace medicine and its relation to computer science applications. How can we better understand the behavior of human physiology in microgravity when experimental opportunities are scarce, in particular for long-term scenarios? We will first introduce this multidisciplinary research area and the experimental platforms available. A summary of the known and yet-to-be-known effects on human health in crewed low-orbit and space missions will follow. Recent experiments conducted in parabolic flight by our group on immunology and human reproduction are also discussed.
We will focus on results from a Computational Model (NELME: Numerical Estimation of Long-term Modified-gravity Effects) of cardiovascular deconditioning under modified-gravity exposure. Orthostatic hypotension is a well-known risk that may put a human mission in jeopardy due to the effects caused by a transition from long-term hypogravity to sudden gravity exposure. This factor is mitigated when astronauts return to Earth because external medical help is available, but it may be a problem when landing on Mars or in other deep-space scenarios. We simulate a number of different mission scenarios at different gravity levels. Parameters of the model have been calculated, and results have been validated against available experimental data from previous parabolic flights and experiments with diverse exposure to different gravity loads. Results show that the deconditioning of vascular resistance is not mitigated by Moon or Mars gravity over exposures from two weeks to nine months. The intensive numerical simulations show a flat response of the vascular resistance when returning to Earth gravity up to g = 0.45 g(Earth). The response is then nearly linear until g = 0.78 g(Earth), when normal g(Earth) values are recovered. Aerobic exercise is not enough to fully compensate for this deconditioning, with women benefitting more than men. A total risk of putting a human mission in jeopardy is then estimated based on NASA standards, showing that a Mars mission returning to Earth can still be conducted safely, in accordance with previous studies. Other Moon-based scenarios are estimated as well. The model has been successful in estimating the risks associated with cardiovascular deconditioning in long-term human missions in space. Other existing models are compared.
Future implications of later developments such as Artificial Intelligence applied to astronautics, as well as educational and outreach parabolic flight campaigns conducted in Sabadell Airport by our laboratory will be finally discussed.
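The dose-response shape described above can be captured in a toy piecewise function. This is only an illustration of the numbers quoted in the abstract (flat up to 0.45 g, roughly linear recovery until 0.78 g); the DECONDITIONED baseline is a made-up constant, not a NELME output:

```python
def vascular_response(g):
    """Toy piecewise response of vascular resistance (as a fraction of
    the normal Earth value) versus gravity level g, in units of g(Earth):
    flat up to g = 0.45, then roughly linear recovery until normal
    values are reached at g = 0.78."""
    FLAT_END, RECOVERED = 0.45, 0.78
    DECONDITIONED = 0.6   # hypothetical deconditioned level, NOT a NELME output
    if g <= FLAT_END:
        return DECONDITIONED
    if g >= RECOVERED:
        return 1.0
    frac = (g - FLAT_END) / (RECOVERED - FLAT_END)   # linear interpolation
    return DECONDITIONED + frac * (1.0 - DECONDITIONED)

for g in (0.17, 0.38, 0.60, 0.90):   # Moon, Mars, intermediate, near-Earth
    print(f"g = {g:.2f} g(Earth) -> resistance {vascular_response(g):.2f}")
```

Note how Moon (0.17 g) and Mars (0.38 g) both fall on the flat segment, consistent with the finding that neither mitigates the deconditioning.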
-
Feb 11, 2026 (Room C6 003, h. 13:00–14:00)
Simulation and modeling of realistic 3D forest scenes
Óscar Argudo Medrano (UPC)
Abstract
The generation of truly believable simulated natural environments remains an unsolved problem in computer graphics. Realistic virtual forests are a central component of computer-generated environments in games, film, and simulation, yet current approaches often fail to capture the visual richness of real ecosystems. A key limitation is that achieving realism requires more than static modeling focused on placing trees of the right species and size; it emerges from a long sequence of growth, disturbance, and decay processes. In real forests, events such as fire, wind, pests, competition, and senescence leave lasting and visually salient traces that convey the history and structure of the ecosystem.
In this talk, I will present a framework for the simulation and modeling of realistic 3D forest scenes that explicitly incorporates life history, disturbance events, and death and decay. The approach decouples ecosystem simulation from geometric realization, allowing forests with millions of trees to be generated efficiently while preserving visual diversity and plausibility. The pipeline proceeds in three stages: (1) simulation of individual tree growth under varying environmental conditions to sample feasible plant forms and derive scaling relationships; (2) an individual-based ecosystem simulation that models competition, disturbance, and decay over time; and (3) a model quantization and instantiation step that selects and places representative geometric models consistent with each tree’s simulated history. Beyond computer graphics, this work illustrates how combining long-term simulation with deferred model realization enables scalable, history-aware synthesis of complex ecosystems spanning millions of plants and several square kilometers.
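Stage (2), the individual-based ecosystem simulation, can be caricatured in a few lines. Everything below — the 1D layout, the competition radius, the growth and disturbance rates — is a hypothetical toy, not the authors' model:

```python
import random

# Toy version of stage (2): trees grow, compete for light with close
# neighbours, and suffer random disturbance. All rules and constants
# here are hypothetical.
random.seed(42)
trees = [{"x": random.random(),              # position along a 1D transect
          "size": random.uniform(0.5, 1.5),  # initial size
          "alive": True} for _ in range(200)]

for year in range(100):
    for t in trees:
        if not t["alive"]:
            continue
        t["size"] *= 1.02                    # growth
        if random.random() < 0.01:           # disturbance: fire, wind, pests
            t["alive"] = False
        elif any(o["alive"] and o is not t           # competition: shaded
                 and abs(o["x"] - t["x"]) < 0.002    # out by a larger,
                 and o["size"] > t["size"]           # close neighbour
                 for o in trees):
            t["alive"] = False

survivors = sum(t["alive"] for t in trees)
print(survivors, "of", len(trees), "trees alive after 100 simulated years")
```

Even this caricature produces history-dependent structure (gaps, size spread) that static placement cannot; the framework in the talk then realizes such histories geometrically in a separate, deferred step.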
-
Jan 9, 2026 (Room C6 003, h. 12:00–13:00)
Sublinear Random Sampling and Applications (slides)
Conrado Martinez (UPC)
Abstract
Consider the following scenario: we have a large database of gene sequences, hundreds of thousands, even millions (e.g. in GenBank). In order to enable fast similarity search, clustering, phylogenetic reconstruction, etc., it is usual to consider the set of distinct k-mers of each sequence in the database (typical values of k are around 20). Instead of using the full set of k-mers of each sequence, however, we use a sketch (a random sample): computing the similarity (Jaccard, cosine, …) of the sketches yields accurate estimates of the true similarities.
One problem that we face then is that of producing random samples for each gene sequence in an efficient way. Practical constraints require, among other things, that the algorithms can be parallelized and that the size n of the set from which we draw the samples is not known in advance.
Algorithms such as MinHash (mash) or bottom-k are simple, efficient and produce random samples of a given fixed size. Because the size of the sketches is fixed in advance, these schemes have limited accuracy and adapt poorly when the database contains very different types of gene sequences (small/medium/large sequences, low/high variability, …). Recent proposals, such as FracMinHash (sourmash), are also fast, simple and efficient, but produce random samples of linear size w.r.t. the size n of the “population”. For FracMinHash, similarity estimates are very accurate, but this comes at the expense of bigger sketches and, hence, larger processing times.
In the talk I will describe my recent work on two sublinear random sampling algorithms. The first is Affirmative Sampling (Lumbroso and M., 2022), which is the first of its kind, as far as we know. Affirmative Sampling is simple and efficient, and its standard variant produces samples of (expected) size k \ln(n/k) + O(k), for a fixed parameter k. Another variant generates samples of expected size \Theta(n^\alpha), for a given value \alpha, 0 < \alpha < 1. Unfortunately, parallelization of Affirmative Sampling will render different samples depending on how the database is “split” among parallel threads, which is not desirable in practice.
In more recent work, we introduce MaxGeomHash (Hera, Koslicki and M., submitted), which admits “deterministic” parallelization and has better concentration around the expected sample size. The algorithm is also very simple, and efficient in practice. The standard variant (MGH) produces samples of expected size k \log_2(n/k) + k + O(1), with variance 1.215\ldots k + o(1); for the variant \alpha-MGH, 0 < \alpha < 1, we have samples of expected size f(\alpha) n^\alpha + o(n^\alpha), for an explicitly computable f(\alpha), and variance n^\alpha + o(n^\alpha). MGH achieves accuracy close to that of FracMinHash, while needing only a fraction of the computational resources required by FracMinHash.
We will end by briefly describing some of the applications and implications of these algorithms, beyond the initial motivating example from computational biology.
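As a concrete illustration of the sketching idea behind this line of work, here is a minimal bottom-k sketch with the standard Jaccard estimator, applied to the 20-mer sets of two overlapping random sequences (all sizes and sequences below are made up):

```python
import hashlib
import random

def h(item):
    """Deterministic 64-bit hash of a string item."""
    return int.from_bytes(
        hashlib.blake2b(item.encode(), digest_size=8).digest(), "big")

def bottom_k(items, k):
    """Bottom-k sketch: the k smallest hash values of the distinct items."""
    return sorted({h(x) for x in items})[:k]

def jaccard_estimate(sk_a, sk_b, k):
    """Among the k smallest hashes of the union of the two sketches,
    count those that appear in both sketches."""
    union_k = set(sorted(set(sk_a) | set(sk_b))[:k])
    return len(union_k & set(sk_a) & set(sk_b)) / k

def kmers(s, k):
    return {s[i:i + k] for i in range(len(s) - k + 1)}

random.seed(1)
seq = "".join(random.choice("ACGT") for _ in range(5000))
a = kmers(seq[:4000], 20)     # two sequences sharing a 3000-letter region
b = kmers(seq[1000:], 20)
est = jaccard_estimate(bottom_k(a, 200), bottom_k(b, 200), 200)
print(round(est, 2))          # close to the true Jaccard similarity (~0.6)
```

A sketch of 200 hashes stands in for roughly 4000 distinct 20-mers per sequence; the talk's algorithms refine exactly this trade-off between sketch size and estimation accuracy.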
-
Dec 10, 2025 (Room Omega S215, h. 13:00–14:00)
Randomization for individual fairness
David Garcia Soriano (UPC)
Abstract
As algorithms are increasingly used to make decisions affecting individuals (from matching donors to recipients, to ranking candidates, to recommending routes…), ensuring these decisions are fair has become a critical concern. Traditional optimization often yields many equally optimal solutions, but selecting among them fairly is paramount when the outcomes impact people’s lives. This talk explores a unified, principled approach to algorithmic fairness centered on the individual, using the power of randomization to provide the strongest possible guarantees.
To this end, we introduce the distributional maxmin fairness framework, grounded in a Rawlsian principle of justice. It ensures a fair selection by randomizing over valid solutions so as to maximize the minimum probability that any individual gets a desirable outcome. Equivalently, a probability distribution over feasible solutions is maxmin-fair if it is not possible to improve the satisfaction probability of any individual without decreasing it for some other individual who is no better off. We present efficient algorithms for three core problems: fair matching, fair ranking under group constraints, and fair route recommendation, showing that our maxmin-fair-by-design methodology provides a rigorous and practical foundation for building equitable algorithms.
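A tiny hypothetical instance shows what a maxmin-fair distribution does: three individuals, three feasible solutions (think matchings), each satisfying two of the three. The grid search below is a crude stand-in for the efficient algorithms of the talk:

```python
from itertools import product

# Hypothetical instance: sat[i] is the set of individuals that feasible
# solution i satisfies.
sat = [{"A", "B"}, {"B", "C"}, {"A", "C"}]
people = {"A", "B", "C"}

def min_satisfaction(weights):
    """Minimum, over individuals, of the probability of a desirable
    outcome under the given distribution over solutions."""
    return min(sum(w for w, s in zip(weights, sat) if p in s)
               for p in people)

# Grid search over the probability simplex (step 0.01).
grid = [i / 100 for i in range(101)]
best = max(((a, b, 1 - a - b)
            for a, b in product(grid, repeat=2) if a + b <= 1),
           key=min_satisfaction)
print(best, round(min_satisfaction(best), 2))
# the optimum mixes all three solutions, giving everyone probability ~ 2/3
```

Any deterministic choice would leave one individual with probability 0; randomizing near-uniformly raises the worst-off individual's probability to about 2/3, which is exactly the maxmin-fair guarantee.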
-
Nov 5, 2025 (Room Omega S215, h. 13:00–14:00)
Determining the Credit Transition Matrix from Market Data
Henryk Gzyl (IESA, Caracas)
Abstract
Credit transition matrices are a way of understanding how corporations change credit rating. The proposal consists of assuming that these changes are modeled by a Markov chain with a finite number of states; the problem is then to determine the transition probabilities. I will present a way of doing so using the data provided by the rating agencies, which consists of the cumulative default probability per risk class. Finding the credit transition matrix is formulated as an inverse problem that uses this data, with convex constraints on the unknowns, and is solved by an entropy-minimization method.
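A stripped-down version of the inverse problem: with a single risk class plus an absorbing default state, the whole matrix reduces to one unknown p. The grid-search fit below is a crude stand-in for the constrained entropy-minimization method of the talk, and the "agency data" is synthetic:

```python
# Toy inverse problem: transition matrix [[1 - p, p], [0, 1]] with an
# absorbing default state, so the cumulative default probability after
# t periods is PD(t) = 1 - (1 - p)**t.
def cumulative_pd(p, t):
    return 1 - (1 - p) ** t

# Synthetic "agency data": cumulative default probabilities per horizon,
# generated here with p = 0.02.
data = {1: 0.02, 2: 0.0396, 3: 0.058808}

def invert(data):
    """Recover p by least squares over a grid -- a stand-in for the
    constrained entropy-minimization method described in the talk."""
    grid = (i / 10000 for i in range(1, 1000))
    return min(grid, key=lambda p: sum((cumulative_pd(p, t) - pd) ** 2
                                       for t, pd in data.items()))

print(invert(data))   # recovers p = 0.02
```

With many risk classes the unknown becomes a full stochastic matrix and the data underdetermines it, which is why a principled selection criterion such as entropy minimization is needed.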
-
Oct 1, 2025 (Room Omega S210, h. 13:00–14:00)
Online General Knapsack with Reservation Costs
Elisabet Burjons (UPC)
Abstract
In the online general knapsack problem, an algorithm is presented with an item x = (s, v) of size s and value v and must irrevocably decide to pack it into the knapsack or reject it before the next item appears. The goal is to maximize the total value of the packed items without exceeding the knapsack’s capacity. As this classical setting is too harsh for many real-life applications, we analyze the online general knapsack problem under the reservation model. Here, instead of accepting or rejecting an item immediately, an algorithm can delay the decision of whether to pack the item by paying a fraction \alpha of the size or the value of the item. This models many practical applications where decisions can be delayed at some cost, e.g. cancellation fees. We present results for both variants: first for costs depending on the value of the items, and then for costs depending on the size of the items. If the reservation costs depend on the value of the items, we find that no algorithm is competitive for reservation costs larger than 1/2 of the item value, and we give upper and lower bounds for the rest of the reservation-factor range 0 \leq \alpha < 1/2. If the reservation costs depend on the size of the items, we find a matching upper and lower bound of 2 for every reservation factor \alpha.
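To make the reservation model concrete, here is a toy strategy for the size-cost variant: reserve every arriving item at cost \alpha times its size, then pack a subset offline once the sequence ends. It only illustrates the model's accounting; it is not one of the algorithms analyzed in the talk, and the instance is made up:

```python
def reserve_all_then_pack(items, capacity=1.0, alpha=0.25):
    """Toy strategy for the size-cost reservation model: reserve every
    arriving item (paying alpha * size), then pack a subset offline by
    value density once the sequence ends."""
    reservation_cost = alpha * sum(s for s, _ in items)
    used, value = 0.0, 0.0
    for s, v in sorted(items, key=lambda sv: sv[1] / sv[0], reverse=True):
        if used + s <= capacity:   # greedy offline packing by density
            used += s
            value += v
    return value - reservation_cost

items = [(0.5, 0.4), (0.3, 0.5), (0.6, 0.9)]   # made-up (size, value) pairs
print(round(reserve_all_then_pack(items), 2))  # value packed minus reservation costs
```

The tension the talk studies is visible even here: reserving buys the right to decide with hindsight, but every reserved item, packed or not, eats into the profit.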