Multivariable model-building

Issues in building multivariable regression models and the importance of transparent reporting


Lecturers: Prof. Dr. Willi Sauerbrei, Edwin Kipruto
Dates: Monday and Tuesday, 13 January – 28 January 2020
(13, 14, 20, 21, 27 and 28 January 2020)
Time: Daily 14.15 – 17.30
Place: Med. Biometrie und Med. Informatik, Stefan-Meier-Straße 26, Hörsaal
Registration required: deadline 08 January 2020.
Please send an email to giving:
Name, surname, Department and Institution, student (y/n)
Language: German or English (depends on participants), slides in English

Structure: 5 days of lectures, 1 day of demonstrations with R and Stata


For many years the quality of research in the health sciences has been criticized, and it is clear that ‘waste in research’ has to be reduced (Ioannidis et al., 2014). Problems in the design, analysis and reporting of studies are among the most important reasons for this disappointing situation. Concerns about deficiencies in statistical methods and their application have been raised consistently over many years (Altman et al 1994, Sauerbrei 2005). Statistical methodology has developed substantially, but many methods are ignored in practice, and insufficient statistical knowledge in the research community is often emphasized. Fishing for significant p-values produces many false positive results (Kyzas et al 2007). The untapped potential of observational research to inform clinical decision making is well known (Visvanathan et al., 2017). The key requirements are rigorous methodology, with suitable methods for the design and analysis of a study, and transparent reporting of results.

During the last two decades several initiatives have been started that aim to improve the research process. Transparent and complete reporting is a prerequisite for judging the usefulness of data and for interpreting study results in the appropriate context. Reporting guidelines have been developed for many different types of studies, and the EQUATOR network acts as an “umbrella” for developers of such guidelines (Simera et al 2010, Moher et al 2014, Altman et al 2012, Moons et al 2015).

The development of guidance for the statistical analysis of observational studies is one of the difficult areas that warrant more effort. The STRATOS (STRengthening Analytical Thinking for Observational Studies) initiative was recently founded to address these issues (Sauerbrei et al 2014). Currently there are nine topic groups (TG), each working on specific tasks such as study design, missing data, measurement error and misclassification, causal inference or high-dimensional data.

With an emphasis on topic groups TG2 ‘Selection of variables and their functional forms in multivariable analysis’ and TG6 ‘Evaluating diagnostic tests and prediction models’, we will illustrate the concepts, structure and general approach of the STRATOS initiative. It will become apparent that considerable research is required to gain more insight into the advantages and disadvantages of competing strategies. We will concentrate on strategies for variable selection, the role of shrinkage, and the multivariable fractional polynomial (MFP) approach, which combines variable selection with selection of the functional form for continuous variables.
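To make the idea of selecting a functional form concrete, the following minimal Python sketch shows only the simplest building block of the fractional polynomial approach: choosing a single power for one continuous variable from the conventional power set (-2, -1, -0.5, 0, 0.5, 1, 2, 3), where power 0 denotes log(x). It is not the full MFP procedure, which additionally involves backward elimination and a function selection procedure. The simulated data, the function names and the use of the residual sum of squares as the selection criterion are assumptions made for illustration.

```python
import math

# Conventional FP power set; power 0 denotes log(x) by convention.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return x**p, or log(x) when p == 0 (x must be positive)."""
    return math.log(x) if p == 0 else x ** p


def rss_simple_ols(z, y):
    """Residual sum of squares of the least-squares line y = b0 + b1*z."""
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    b1 = szy / szz
    b0 = y_mean - b1 * z_mean
    return sum((yi - b0 - b1 * zi) ** 2 for zi, yi in zip(z, y))


def best_fp1_power(x, y):
    """Fit one model per candidate power and return the best power and all RSS values."""
    rss = {p: rss_simple_ols([fp_transform(xi, p) for xi in x], y)
           for p in FP_POWERS}
    return min(rss, key=rss.get), rss


# Hypothetical data: the outcome depends on log(x) exactly (no noise).
x = [0.5 + 0.25 * i for i in range(40)]
y = [2.0 + 3.0 * math.log(xi) for xi in x]

best_power, rss_by_power = best_fp1_power(x, y)
print("selected power:", best_power)
```

Because the simulated outcome is an exact function of log(x), the sketch selects power 0. With real data the candidate models would be compared by deviance, and a linear function would be kept unless a more complex one fits significantly better.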

In the context of prognostic marker research we will illustrate common weaknesses in the design, analysis and reporting of studies, with an emphasis on continuous variables. We will explain the problems caused by categorization and show that modelling continuous variables has numerous advantages (Sauerbrei and Royston 2010). The main aims of the PROGnosis RESearch Strategy (PROGRESS) partnership will be outlined (Hemingway et al 2013, Riley et al 2013, Riley et al 2019).
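One of the problems caused by categorization, loss of information, can be illustrated with a small simulation. The Python sketch below is an illustration under stated assumptions, not material from the course: it simulates a normally distributed predictor with a linear effect on the outcome, dichotomizes it at the median, and compares the squared correlation with the outcome before and after dichotomization. All variable names, the sample size and the simulated effect are hypothetical.

```python
import math
import random


def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 2000

# Assumed data-generating model: linear effect of x plus standard normal noise.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

# Dichotomize the predictor at its (upper) sample median.
median_x = sorted(x)[n // 2]
x_high = [1.0 if xi > median_x else 0.0 for xi in x]

r2_continuous = pearson_r(x, y) ** 2
r2_dichotomized = pearson_r(x_high, y) ** 2
print("R^2 continuous:  ", round(r2_continuous, 3))
print("R^2 dichotomized:", round(r2_dichotomized, 3))
```

For a normally distributed predictor with a linear effect, a median split is expected to retain roughly 2/π (about 64%) of the explained variation, so the dichotomized version should explain clearly less than the continuous one.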

In many statistical analyses, models are fitted to the data and results are presented without any indication that small perturbations of the data might lead to major changes in the model (Royston and Sauerbrei, 2008). We will therefore present the assessment of model stability using resampling methods, illustrated with a practical example.
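A common way to examine model stability is to repeat the variable selection in many bootstrap samples and record how often each candidate variable is selected (bootstrap inclusion frequencies). The Python sketch below illustrates this idea on simulated data. It is a sketch under assumptions: best-subset selection by AIC stands in for whatever selection procedure is actually used, and the data, effect sizes and function names are hypothetical.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(rows, y, cols):
    """Residual sum of squares of the OLS fit of y on an intercept plus columns `cols`."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, yi in zip(design, y))


def select_by_aic(rows, y, n_vars):
    """Return the subset of variable indices with the lowest Gaussian AIC."""
    n = len(y)
    best, best_aic = (), float("inf")
    for size in range(n_vars + 1):
        for cols in itertools.combinations(range(n_vars), size):
            aic = n * math.log(rss(rows, y, cols) / n) + 2 * (len(cols) + 1)
            if aic < best_aic:
                best, best_aic = cols, aic
    return best


def bootstrap_inclusion_frequencies(rows, y, n_vars, n_boot, rng):
    """Repeat the selection in bootstrap samples; return selection proportions."""
    counts = [0] * n_vars
    n = len(y)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        chosen = select_by_aic([rows[i] for i in idx], [y[i] for i in idx], n_vars)
        for c in chosen:
            counts[c] += 1
    return [c / n_boot for c in counts]


rng = random.Random(2020)  # fixed seed so the run is reproducible
n = 100
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
# Assumed truth: x0 has a strong effect, x1 a weak effect, x2 no effect.
y = [1.0 * r[0] + 0.2 * r[1] + rng.gauss(0, 1) for r in rows]

freqs = bootstrap_inclusion_frequencies(rows, y, n_vars=3, n_boot=100, rng=rng)
print("bootstrap inclusion frequencies:", freqs)
```

A variable with a strong effect should be selected in nearly every bootstrap sample, whereas variables with weak or no effect are selected only in some replications. Such intermediate inclusion frequencies indicate that the selected model depends on small perturbations of the data.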

References and links


  • Altman DG, Lausen B, Sauerbrei W, Schumacher M (1994): Dangers of using “Optimal” cutpoints in the evaluation of prognostic factors. Journal of the National Cancer Institute 86: 829-835.
  • Ioannidis JP (2005): Why most published research findings are false. PLoS Med 2(8): e124.
  • Ioannidis JP, Greenland S, Hlatky MA, Khoury MJ, Macleod MR, Moher D, Schulz KF, Tibshirani R (2014): Increasing value and reducing waste in research design, conduct, and analysis. Lancet 383(9912): 166-175.
  • Kyzas PA, Denaxa-Kyza D, Ioannidis JP (2007): Almost all articles on cancer prognostic markers report statistically significant results. European Journal of Cancer 43: 2559-2579.
  • Royston P, Sauerbrei W (2008): Multivariable Model-Building – A pragmatic approach to regression analysis based on fractional polynomials for modelling continuous variables. Wiley.
  • Sauerbrei W (2005): Prognostic Factors – Confusion caused by bad quality of design, analysis and reporting of many studies. Bier H. (ed). Current Research in Head and Neck Cancer. Advances in Oto-Rhino-Laryngology. Basel, Karger, 62:184-200.
  • Sauerbrei W, Royston P (2010): Continuous Variables: To Categorize or to Model? In: Reading, C. (Ed.): The 8th International Conference on Teaching Statistics- Data and Context in statistics education: Towards an evidence based society. International Statistical Institute, Voorburg.
  • The Lancet Series: “Research: increasing value, reducing waste” (2014):


EQUATOR (Enhancing the QUAlity and Transparency Of health Research)

  • Altman DG, McShane L, Sauerbrei W, Taube SE (2012): Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration. PLoS Med 9(5): e1001216.
  • Moher D, Altman DG, Schulz K, Simera I, Wager E (Eds.) (2014): Guidelines for reporting health research: a user's manual. John Wiley & Sons.
  • Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, et al. (2015): Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and Elaboration. Annals of Internal Medicine 162: W1-W73.
    • TRIPOD website:
  • Simera I, Moher D, Hirst A, Hoey J, Schulz KF, Altman DG (2010): Transparent and accurate reporting increases reliability, utility, and impact of your research: reporting guidelines and the EQUATOR Network. BMC Med 8: 24.


PROGnosis RESearch Strategy (PROGRESS)

PROGRESS Partnership is a UK Medical Research Council (MRC) funded, international, interdisciplinary collaboration developing understanding in research into quality of care outcomes, prognostic factors, risk prediction models, and predictors of differential treatment response.

  • Hemingway H, Croft P, Perel P, Hayden JA, Abrams K, Timmis A et al. (2013): Prognosis research strategy (PROGRESS) 1: A framework for researching clinical outcomes. BMJ 346: e5595.
  • Recommendations table from PROGRESS papers 1-4 (.docx):
  • Riley RD, Hayden JA, Steyerberg EW, Moons KGM, Abrams K, et al. (2013): Prognosis Research Strategy (PROGRESS) 2: Prognostic Factor Research. PLOS Medicine 10(2): e1001380.
  • Riley RD, van der Windt D, Croft P, Moons KG (Eds.) (2019): Prognosis Research in Health Care: Concepts, Methods, and Impact. Oxford University Press.


STRengthening Analytical Thinking for Observational Studies (STRATOS)

  • Sauerbrei W, Abrahamowicz M, Altman DG, le Cessie S, Carpenter J on behalf of the STRATOS initiative. (2014): STRengthening Analytical Thinking for Observational Studies: the STRATOS initiative. Statistics in Medicine, 33: 5413-5432.