Project description


Serious adverse effects resulting from treatment with thalidomide prompted modern drug legislation more than 40 years ago. Since then, the mainstay of drug safety surveillance has been the collection of spontaneous reports of Adverse Drug Reactions (ADRs). The current and future challenges of drug development and drug utilization, and a number of recent high-impact drug safety issues (e.g. rofecoxib (Vioxx) and SSRIs), require re-thinking of the way safety monitoring is conducted. It has become evident that adverse effects of drugs may be detected too late, when millions of persons have already been exposed. The need to change drug safety monitoring is underlined in the current public consultation about the future of pharmacovigilance in the EU.
Pharmacovigilance is the study of the safety of marketed drugs under the practical conditions of clinical usage in large communities. The timely discovery of unknown or unexpected ADRs is one of its major challenges, because most drugs enter the market with fewer than 3000 exposed subjects, implying that reactions occurring with rates lower than 1/1000 could easily remain undetected for long periods of time. Post-marketing spontaneous reporting systems for suspected ADRs have been a cornerstone of signal detection in pharmacovigilance. Although many ADRs were detected by spontaneous reporting systems, these systems have inherent limitations that hamper signal detection. The major weakness is that these systems depend entirely on the ability of a physician to, first, recognize an adverse event as being related to the drug and, second, actually report the case to the local spontaneous reporting database. The greatest limitations, therefore, are under-reporting and biases due to selective reporting. Investigations have shown that the percentage of ADRs being reported varies between 1 and 10%. These problems may lead to underestimation of the significance of a particular reaction and delay in signal detection, as well as spurious detections.
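The sample-size argument above can be made concrete with a short calculation. The sketch below is illustrative only and assumes independent subjects with a common per-subject reaction probability; the function names are hypothetical and not part of the project. It computes the chance of seeing at least one case among a given number of exposed subjects, together with the familiar "rule of three" approximation for the highest rate still compatible with observing no cases.

```python
def prob_at_least_one_event(rate: float, exposed: int) -> float:
    """Probability of observing one or more events among `exposed` subjects,
    assuming independent subjects with the same per-subject event probability."""
    return 1.0 - (1.0 - rate) ** exposed


def rule_of_three_upper_bound(exposed: int) -> float:
    """Approximate upper 95% bound on the event rate when zero events
    were observed among `exposed` subjects (the 'rule of three')."""
    return 3.0 / exposed


if __name__ == "__main__":
    exposed = 3000  # pre-marketing exposure figure quoted in the text
    for rate in (1 / 100, 1 / 1000, 1 / 10000):
        p = prob_at_least_one_event(rate, exposed)
        print(f"rate {rate:.4f}: P(at least one case) = {p:.3f}")
    print(f"rule-of-three bound: {rule_of_three_upper_bound(exposed):.4f}")
```

Under these assumptions, a reaction with a rate of about 1/1000 sits right at the rule-of-three bound for 3000 subjects, and rarer reactions become progressively less likely to appear even once before marketing.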

Project Description

In this project, an alternative approach towards the detection of ADR signals will be developed with the objective of overcoming the shortcomings of spontaneous reporting databases and providing a solid basis for large-scale monitoring of drug safety. Rather than relying on the physician’s capability and willingness to recognize and report suspected ADRs, in ALERT a systematic calculation of the occurrence of disease (potentially ADRs) during specific drug use will be based on data (time-stamped exposure and morbidity data) available in electronic patient records. 
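As an illustration of what such a systematic calculation could look like, the sketch below derives an incidence rate during drug exposure from time-stamped records. It is a minimal example, not the ALERT implementation: the record layouts (`Exposure`, `Event`), the field names and the function name are assumptions made for this example, and it assumes that exposure windows for the same patient and drug do not overlap.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Exposure:
    patient_id: str
    drug: str
    start: date
    end: date  # last exposed day, inclusive


@dataclass(frozen=True)
class Event:
    patient_id: str
    diagnosis: str
    on: date


def incidence_rate_per_1000_py(exposures, events, drug, diagnosis):
    """Events per 1000 person-years of exposure to `drug`.

    Counts every recorded `diagnosis` that falls inside an exposure
    window of the same patient; returns 0.0 if there is no exposure time.
    """
    windows = [w for w in exposures if w.drug == drug]
    person_days = sum((w.end - w.start).days + 1 for w in windows)
    if person_days == 0:
        return 0.0
    n_events = sum(
        1
        for ev in events
        if ev.diagnosis == diagnosis
        and any(
            w.patient_id == ev.patient_id and w.start <= ev.on <= w.end
            for w in windows
        )
    )
    person_years = person_days / 365.25
    return 1000.0 * n_events / person_years


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    exposures = [
        Exposure("p1", "drug_A", date(2007, 1, 1), date(2007, 12, 31)),
        Exposure("p2", "drug_A", date(2007, 1, 1), date(2007, 12, 31)),
    ]
    events = [
        Event("p1", "event_X", date(2007, 6, 1)),  # during exposure
        Event("p2", "event_X", date(2008, 2, 1)),  # after exposure ended
    ]
    print(incidence_rate_per_1000_py(exposures, events, "drug_A", "event_X"))
```

A real system would also need to handle repeated events, latency periods after exposure ends, and comparison with unexposed time; these are left out here to keep the idea visible.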
Europe plays a leading role in the development and use of electronic patient records. As a result, a number of European Electronic Healthcare Record (EHR) databases are available. Appropriate monitoring and use of these databases has an enormous potential for earlier detection of ADR signals. In this project, electronic healthcare records comprising demographics, drug use and clinical data of over 30 million patients from several European countries will be available. These EHR databases form the foundation of the project. Special attention will be given to patient groups that are not routinely involved in clinical trials, for ethical or practical reasons (e.g. pregnant women, elderly people, people using many drugs simultaneously, and children). In children in particular, there is an increased need for post-marketing surveillance. We have therefore included in the project a database exclusively devoted to pediatric data (PEDIANET, Italy). PEDIANET, together with data from general practice in other databases, will provide representation of children in our data set.
Another objective of this project is to study and compare a number of different techniques that, in essence, all aim to detect unexpected or disproportional rates of events. The algorithms that we will study originate not only from the field of (pharmaco)epidemiology, but also from fields such as bio-terrorism, machine learning, and “classical” signal detection. 
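One simple way to express a "disproportional rate of events" is an incidence rate ratio comparing exposed with unexposed person-time. The sketch below is only one illustrative choice among the many techniques the project intends to compare, not the method selected by ALERT. It uses the standard large-sample confidence interval on the log scale, and the signalling rule (lower bound above 1) and all names are assumptions made for this example.

```python
import math


def incidence_rate_ratio(events_exposed, py_exposed,
                         events_unexposed, py_unexposed, z=1.96):
    """Return (rate ratio, lower bound, upper bound).

    Uses the large-sample approximation
    SE(log IRR) = sqrt(1/events_exposed + 1/events_unexposed),
    so both event counts must be positive.
    """
    if events_exposed <= 0 or events_unexposed <= 0:
        raise ValueError("both event counts must be positive")
    if py_exposed <= 0 or py_unexposed <= 0:
        raise ValueError("person-years must be positive")
    irr = (events_exposed / py_exposed) / (events_unexposed / py_unexposed)
    se = math.sqrt(1.0 / events_exposed + 1.0 / events_unexposed)
    lower = math.exp(math.log(irr) - z * se)
    upper = math.exp(math.log(irr) + z * se)
    return irr, lower, upper


def is_signal(events_exposed, py_exposed, events_unexposed, py_unexposed):
    """Flag a drug-event pair when the lower confidence bound exceeds 1."""
    _, lower, _ = incidence_rate_ratio(
        events_exposed, py_exposed, events_unexposed, py_unexposed
    )
    return lower > 1.0


if __name__ == "__main__":
    # Hypothetical counts: 30 events in 10,000 exposed person-years
    # versus 100 events in 100,000 unexposed person-years.
    print(incidence_rate_ratio(30, 10_000, 100, 100_000))
    print(is_signal(30, 10_000, 100, 100_000))
```

A crude ratio like this ignores confounding and multiple testing across many drug-event pairs, which is part of why comparing different algorithms on the same data is of interest.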
Once generated, the signals will be substantiated by applying causality criteria (biological plausibility, known reactions). The purpose of this substantiation process is to place the signals in the context of the current biomedical knowledge that might explain the signal. Essentially, we are searching for evidence that supports causal inference of the signal. The list of signals will be assessed by automatically investigating feasible paths connecting the drug and the adverse reaction involved in the signal. The general strategy is the automatic linkage of biomedical entities (drugs, proteins and their genetic variants, biological pathways, and clinical events) by means of data mining approaches and in silico predictions based on biomolecular structures. The biological annotations of the drug involved in the signal will be expanded by automatically detecting its metabolites and other molecules showing similar pharmacophoric patterns. To detect associations between these biomedical entities, data and text mining techniques will be used on pharmacological repositories and biomedical literature. Proteins interacting with the drug or related molecules will be mapped into biological pathways that could be involved in the clinical event that is part of the signal. Information about human genome variations that affect the proteins of the considered pathways will also be used.
Both the underlying patient data (e.g. the number of people using a given drug increases, or the indication domain of a drug changes) and our biological understanding evolve over time. Consequently, both signal generation and assessment have to be viewed as a continuous process. As a result, any monitoring system should be able to re-assess previous conclusions in the light of new data or evidence. With optimal use of ICT both in generating and assessing signals, a largely automated procedure for detection, substantiation and re-assessment should be feasible. 
As mentioned above, the ultimate aim of this proposal is to develop an innovative approach to the early detection of adverse drug reactions. In order to assess whether that aim is met, validation is an integral part of ALERT. The system will be tested retrospectively using test sets that are based on recent literature, including both known side effects and spurious signals. Rediscovery of drug-event combinations from the test set with known side effects will provide an indication of the sensitivity of the approach. The ability not to signal drug-event combinations from the test set with spurious signals will provide an indication of the specificity of the approach.
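The validation logic described above can be sketched in a few lines. The example below assumes that each test set is represented as a set of (drug, event) pairs and that the system's output is a set of flagged pairs; the function name and the placeholder pairs are hypothetical.

```python
def sensitivity_specificity(flagged, known_true, known_spurious):
    """Sensitivity and specificity of a set of flagged drug-event pairs.

    sensitivity = rediscovered known side effects / all known side effects
    specificity = spurious pairs NOT flagged / all known spurious pairs
    """
    if not known_true or not known_spurious:
        raise ValueError("both test sets must be non-empty")
    true_positives = len(flagged & known_true)
    true_negatives = len(known_spurious - flagged)
    sensitivity = true_positives / len(known_true)
    specificity = true_negatives / len(known_spurious)
    return sensitivity, specificity


if __name__ == "__main__":
    # Placeholder pairs, for illustration only.
    known_true = {("drug_A", "event_1"), ("drug_B", "event_2"),
                  ("drug_C", "event_3"), ("drug_D", "event_4")}
    known_spurious = {("drug_E", "event_5"), ("drug_F", "event_6")}
    flagged = {("drug_A", "event_1"), ("drug_B", "event_2"),
               ("drug_C", "event_3"), ("drug_E", "event_5")}
    print(sensitivity_specificity(flagged, known_true, known_spurious))
```

Pairs flagged by the system that appear in neither test set are ignored by this calculation, since their true status is unknown.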


As mentioned in the introduction, the collection of post-marketing, spontaneous reports of suspected adverse drug reactions has so far been the main pillar of drug safety surveillance. Although several initiatives of the Council for International Organizations of Medical Sciences (CIOMS) and the International Conference on Harmonization (ICH) have added guidance on the collection, evaluation and reporting of safety data, progress still has to be made in the development of more robust methodologies for monitoring drug safety.5 This view is shared by the European Medicines Evaluation Agency, which has asked for a public consultation on the future of pharmacovigilance in Europe.

Spontaneous reporting systems have inherent limitations that hamper signal detection, both by traditional and automated methods. In this project, a new approach towards the detection of ADR signals will be developed. It will help to overcome the ‘reporting bias’ and underreporting by physicians, and it will make more efficient use of clinical data that are already available in electronic format. The solution is based on automatically exploiting the data stored in large EHR systems. So far, electronic health care databases have been used only for hypothesis testing and not for systematic monitoring of drug exposure and event rates, an approach that could lead to efficient and unbiased signal generation. A good example of what databases may add to evidence development in the field of drug safety is the case of Vioxx (rofecoxib). Soon after the first signal was generated, more than 15 studies were conducted, together including more than 60,000 cases of myocardial infarction and 1500 exposed cases.

In Europe, the introduction of electronic healthcare records has, albeit with significant differences between Member States, seen steady growth over the past years. In some countries, whole segments of the healthcare delivery system rely on electronic records (e.g. primary healthcare in the UK or The Netherlands).13 Compared to other developed areas, Europe is playing a leading role in the use of electronic healthcare records. For monitoring of adverse events, very large populations need to be followed up to achieve early detection of disproportional event rates with specific drugs. New drugs, for example, may penetrate the market slowly, so that a large amount of patient data is required to capture a user population of sufficient size. Recently, a number of calculations of the required population size have been performed based on newly discovered side effects. It took five years for rofecoxib to be withdrawn from the market. Using the actual market penetration of rofecoxib, it has been calculated that if the medical records of 100 million patients had been available for safety monitoring, the adverse cardiovascular effect would have been discovered in just three months.15,16
In Europe, however, the application of ICT in healthcare is fragmented. There is no obvious method to combine different electronic medical records from different locations into a uniform repository. Considering the size of the populations required for early detection of adverse drug events, however, pooling of data is mandatory. The first challenge ALERT will face is the federation of different databases of electronic medical records, creating for the first time a resource of unprecedented size for monitoring of adverse events. In this project, eight different databases containing medical records of, in total, more than 30 million European citizens, will join forces. The databases stem from different European countries: IPCI (Netherlands), PHARMO (Netherlands), QRESEARCH (UK), the AUHD database (Denmark), the Regional health databases of Lombardy and Tuscany (Italy), Health Search (Italy), and PEDIANET (Italy). IPCI, Health Search, and QRESEARCH are primary care databases, containing routinely collected data on both adults and children. PEDIANET contains data exclusively on children. PHARMO is a comprehensive database of drug prescription data, which has been linked with primary care data as well as hospital data. AUHD and the regional databases in Lombardy and Tuscany are population-based databases of dispensed drugs that can be linked to hospitalizations, death records and laboratory data. 
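One possible way to combine evidence from several databases without merging patient-level records is for each database to contribute only aggregated counts, which are then pooled with stratification by database. The sketch below is an assumption about how this could be done, not a description of the ALERT architecture. It uses the Mantel-Haenszel rate ratio for person-time data, and the `StratumCounts` layout and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StratumCounts:
    """Aggregated counts contributed by one database (one stratum)."""
    events_exposed: int
    py_exposed: float
    events_unexposed: int
    py_unexposed: float


def crude_rate_ratio(strata):
    """Rate ratio after simply adding up all counts (ignores the database)."""
    a = sum(s.events_exposed for s in strata)
    t1 = sum(s.py_exposed for s in strata)
    b = sum(s.events_unexposed for s in strata)
    t0 = sum(s.py_unexposed for s in strata)
    return (a / t1) / (b / t0)


def mantel_haenszel_rate_ratio(strata):
    """Mantel-Haenszel rate ratio, stratified by database."""
    numerator = sum(
        s.events_exposed * s.py_unexposed / (s.py_exposed + s.py_unexposed)
        for s in strata
    )
    denominator = sum(
        s.events_unexposed * s.py_exposed / (s.py_exposed + s.py_unexposed)
        for s in strata
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: within each database the exposed rate is
    # twice the unexposed rate, but baseline rates differ tenfold.
    strata = [
        StratumCounts(20, 1_000.0, 10, 1_000.0),
        StratumCounts(4, 2_000.0, 10, 10_000.0),
    ]
    print("crude:", crude_rate_ratio(strata))
    print("Mantel-Haenszel:", mantel_haenszel_rate_ratio(strata))
```

In this made-up example the crude ratio is inflated because the database with the higher baseline rate also contributes a larger share of exposed person-time, while the stratified estimate recovers the within-database ratio.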

It is also important to note that all of these databases are currently used for pharmacovigilance (albeit for signal verification rather than signal detection). As a result, all of these databases have a rich publication history and a well-developed mechanism to ensure that European and local regulations dealing with ethical use of the data and adequate privacy protection are adhered to. From the project’s perspective, this is a major advantage: the ethical and legal procedures that are required when patient data are used to investigate side effects are already in place.
Fragmentation can also be seen as diversity; and diversity constitutes an opportunity. Researchers in, for example, statistical pattern recognition have long recognized that variety in environments can be translated into variety of learning and testing sets and may result in better understanding of underlying patterns. The second challenge, therefore, will be the exploitation of this European diversity for routine drug monitoring. 

Mining large datasets in order to discover patterns has a long history. A number of methods have been specifically developed for monitoring side effects of drugs based on spontaneous reporting data, but the number of fields that could contribute methodology to mine EHR data is much larger and includes methods developed to monitor for epidemic diseases (e.g. flu or malaria), bio-terrorism (e.g. an attack with an infectious agent), “classical signal analysis” (e.g. the detection of abnormal events in a neonatal intensive care unit), and the general domain of “machine learning”. Having access to data of more than 30 million individuals will provide an opportunity to test and compare these different algorithms and methods on a scale hitherto not possible. The third challenge, therefore, will be the evaluation of a number of data mining techniques on a realistic scale (that is, involving a population of millions of patients across different databases). We believe that further development of these techniques constitutes a significant scientific contribution to the methodology of data mining.

To contain the number of spurious (false-positive) detections, several approaches will be followed, including causal reasoning based on information in the EHR, semantic mining of the literature, and the automatic use of information in biological (targets, anti-targets, pathways) and chemical/drug databases. The issue of reducing spurious signals has, from our perspective, been undervalued so far. Spurious signals constitute a significant risk. From the public health perspective, spurious signals may result in withdrawal of effective drugs. The literature documents the impact that such a false alarm can have on public health. In principle, the negative impact of spurious signals may well outweigh the benefit of earlier detection of a true adverse event. Therefore, the benefit of early detection must be balanced against the unnecessary concern caused by spurious signals. From a regulatory perspective, the risk of spurious signals is considerable: they may overwhelm our ability to review and regulate the consequences of these signals. Finally, from a commercial perspective, it is hard to overestimate the consequences of a false alarm. History shows that, even when a drug is cleared from suspicion, the impact on the drug’s reputation often cannot be undone. The fourth challenge, therefore, will be the automated exploitation of heterogeneous sources of information to reduce the number of spurious signals.

Spurious signals have significant consequences. Sertindole, a new atypical neuroleptic known to prolong the QT interval, was suspended in November 1998 because the proportion of reports of fatal reactions suggesting arrhythmia among all reports with sertindole was almost ten (!) times higher than that for other atypical neuroleptics in the UK. This excess risk was not predicted by preclinical data and had not been found in pre-marketing trials. Further studies showed that there was no indication of an actual increase in the risk of all-cause or cardiac death during sertindole treatment, but only an increased likelihood of such deaths being reported. Three years later, in October 2001, the suspension of sertindole was rescinded by the Committee on Proprietary Medicinal Products (CPMP).


ALERT aims to develop and use advanced ICT technologies for demonstrating new ways to exploit the existing wealth of clinical and biomedical data sources for better and faster detection of ADRs.

General Information

·ICT Theme – Grant Agreement no. 215847 
·Duration: 42 months 
·Start date: 1-Feb-2008 
·Funding: 4.500.000 € 
·Participating institutions: 
- Aarhus University Hospital, Denmark 
- Agenzia regionale di Sanità, Italy 
- AstraZeneca AB, Sweden 
- Erasmus University Medical Center, Netherlands 
- Fundació IMIM, Spain 
- Health Search - Italian College of General Practitioners, Italy 
- Imperial College London, UK 
- IRCCS Centro Neurolesi “Bonino-Pulejo”, Italy 
- London School of Hygiene & Tropical Medicine, UK 
- Pedianet – Società Servizi Telematici SRL, Italy 
- PHARMO Coöperation UA, Netherlands 
- Tel-Aviv University, Israel 
- Università di Milano-Bicocca, Italy 
- Université Victor-Segalen Bordeaux II, France 
- University of Aveiro – IEETA, Portugal 
- University of Nottingham, UK 
- University of Santiago de Compostela, Spain 
- University Pompeu Fabra, Spain

Contact Information

- Prof. Johan van der Lei (Project Co-ordinator) - email: [email protected] Tel: +31 (10) 4087050 
- Mr. Carlos Díaz (Project Manager) - email: [email protected] Tel: +34 93 3160518 
- Ms. Nathalie Villahoz (Press and Media) - email: [email protected] Tel: +34 933160525