Second Dutch Stan meetup

Slides.

This event has already taken place. Slides for the different talks are available below.

Duco Veen: On shinystan 3.0. 

Christof Seiler: Uncertainty quantification in multivariate mixed models for mass cytometry data. 

Date.

Thursday, 31 October 2019.

Location.

Dutch Healthcare Authority (“Nederlandse Zorgautoriteit”)

Newtonlaan 1-41
3584 BX Utrecht

IMPORTANT: All attendees must identify themselves with a passport, ID card, or driver’s license.

Program.

18:00 – 18:30. Pizzas (vegetarian and non-vegetarian) made available by the Dutch Healthcare Authority
18:30 – 18:50. Welcome and presentation by the Dutch Healthcare Authority
18:50 – 19:35. First speaker: Duco Veen
19:35 – 19:50. Break
19:50 – 20:35. Second speaker: Christof Seiler
20:35 – 21:30. Drinks

Registration.

You can register here.

Sponsor.

Dutch Healthcare Authority (“Nederlandse Zorgautoriteit”; NZa)
The Dutch healthcare system is a highly complex collaboration between healthcare providers, health insurers, consumers, politics and a diverse circle of stakeholders and authorities. Together we organize and deliver healthcare. The Dutch Healthcare Authority is at the center of that sphere, with a specific role in the healthcare landscape. In the work of the NZa, the interest of the 17 million people in the Netherlands is the first priority. They should be able to rely on good and affordable healthcare if and when they need it. From that perspective, we make rules and monitor healthcare. Good data are essential for this: they enable us to monitor developments in the market, conduct research and create policies to prevent or solve social problems. We make extensive use of data analysis and models.

Abstracts.

On shinystan 3.0
Duco Veen

In this talk I will discuss, from several angles, the new version of shinystan that is being developed. From a user’s perspective, I will explain some new features, including report generation, and discuss how some of them might be useful for teaching Bayesian statistics, and Stan in particular. We will discuss diagnosing your model, finding relevant issues, and how shinystan can help make those tasks easier. From a development perspective, I will explain the reason for restructuring the backend of shinystan and how the restructuring makes it easier for the community to contribute to the project. Finally, I will ask for feedback and a wishlist: what do you want (shiny)stan to help you with?

Uncertainty quantification in multivariate mixed models for mass cytometry data
Christof Seiler

Mass cytometry technology enables the simultaneous measurement of over 40 proteins on single cells. This has helped immunologists to increase their understanding of heterogeneity, complexity, and lineage relationships of white blood cells. Current statistical methods often collapse the rich single-cell data into summary statistics before proceeding with downstream analysis, discarding the information in these multivariate datasets. In this talk, we aim to demonstrate statistical analyses on the raw, uncompressed data, thereby improving replicability and exposing multivariate patterns and their associated uncertainty profiles. We show that multivariate generative models are a valid alternative to univariate hypothesis testing. Including correlations in the model can have positive effects: it makes the statistical procedure more efficient, it exposes additional structure with which to interpret results, and it provides information about potential confounders that need to be attended to.

We propose a multivariate Poisson log-normal mixed model and use Hamiltonian Monte Carlo to provide Bayesian uncertainty quantification. We implemented our approach in a new R package, cytoeffect (https://christofseiler.github.io/cytoeffect). Our model provides key advantages over existing approaches. We work on the uncompressed full data and model marker correlations explicitly. Our model is more robust to donor-to-donor variability because we explicitly model this variability with random effects. We can also incorporate complicated experimental designs.
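
As a rough illustration of the kind of model described above, the following Python sketch simulates marker counts from a multivariate Poisson log-normal model with a donor-level random effect. It is not the cytoeffect implementation, which is an R package fitted with Hamiltonian Monte Carlo. The function name, the parameter values, and the additive structure of the log rate (marker intercept plus donor effect plus a correlated cell-level term) are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_poisson_lognormal(n_donors, cells_per_donor, beta,
                               cov_cell, cov_donor, seed=0):
    """Simulate counts from a (hypothetical) Poisson log-normal mixed model.

    beta      : per-marker intercepts on the log scale, shape (n_markers,)
    cov_cell  : covariance of the cell-level latent term (marker correlations)
    cov_donor : covariance of the donor-level random effect
    """
    rng = np.random.default_rng(seed)
    n_markers = len(beta)
    zero = np.zeros(n_markers)
    counts, donor_ids = [], []
    for donor in range(n_donors):
        # Donor-level random effect, shared by all cells of this donor.
        u = rng.multivariate_normal(zero, cov_donor)
        # Cell-level latent term that carries the marker correlations.
        b = rng.multivariate_normal(zero, cov_cell, size=cells_per_donor)
        # Log rate per cell and marker; broadcasts to (cells, markers).
        log_rate = beta + u + b
        counts.append(rng.poisson(np.exp(log_rate)))
        donor_ids.extend([donor] * cells_per_donor)
    return np.vstack(counts), np.array(donor_ids)


# Illustrative values only: three markers, the first two positively correlated.
beta = np.array([1.0, 2.0, 0.5])
cov_cell = np.array([[0.3, 0.2, 0.0],
                     [0.2, 0.3, 0.0],
                     [0.0, 0.0, 0.3]])
cov_donor = 0.1 * np.eye(3)

counts, donor_ids = simulate_poisson_lognormal(4, 500, beta, cov_cell, cov_donor)
```

Each row of `counts` is one simulated cell and each column one marker. Because the correlation lives in the latent log rates, it carries over to the counts themselves, which is the structure a per-marker summary statistic would discard.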

We reanalyzed NK cell populations in different subjects during pregnancy. We were able to corroborate an increase of pSTAT1 during the third trimester when samples were stimulated with interferon alpha, as previously reported. We also present a new way to visualize multivariate donor-to-donor uncertainty.

Our reanalysis shows that this family of models avoids loss of information through data compression. In other words, adapting Don Knuth’s quote about premature optimization in programming: “premature summarization is the root of all evil (or at least most of it) in statistics.”