Building on our experience in statistical bioinformatics and genetics, we aim to continue developing tools for the analysis of genomic and genetic data in conjunction with cell biological and physiological high-throughput data. We propose to focus on gene regulation and biopathways as derived from genetic, phylogenetic, gene expression, and molecular biomarker data. Biological systems of particular interest are stem and cancer cells (human, mouse, and Drosophila), blood cells (platelets involved in atherothrombosis), bacteria (Mycobacterium tuberculosis), and plants (Arabidopsis thaliana). We will extend methods that have been successful in the inference of models for cellular regulation to intercellular regulation, in particular to the immune response to parasite infection (Schistosoma mansoni). Statistical modelling of such systems is challenging. Experimental data, as well as other sources of information such as bioinformatics databases, are extensive and have special requirements for storage, normalisation, and preliminary analysis. Statistical and mathematical models that can represent the key features of a biological system, those important for its understanding, prediction, and manipulation, are necessarily complex. We will explore how to combine statistical inference methods, machine learning algorithms, and mathematical modelling to derive useful representations of the biological systems of interest. Genetics research comprises a distinct sub-programme within this proposal, with strong links to epidemiological studies and hence to primary clinical research. Recent technologies allow genetic association studies on very large scales, for which new analysis methods are urgently needed. We will develop methods suitable for whole-genome association scans that are sensitive to the presence of multiple interacting genes while accounting for the multiple-testing problems that arise.
We will exploit the near-linear arrangement of genes on chromosomes to develop multipoint mapping methods that improve the localisation of genes in association scans. We will improve methodology for genetic epidemiology by applying recent ideas from likelihood theory to existing regression models. We will anticipate new technologies for high-throughput whole-genome sequencing by developing methods for the direct analysis of DNA sequences, as opposed to genetic markers. Integrating evidence from multiple data sources at different levels of organisation of a biological system will enable the discovery of important functional links and the assessment of the predictive value of molecular biomarkers with respect to phenotypes of direct clinical interest. The development of application-specific methods will result in generic computational tools and software for use by a wider community of bioinformaticians and biologists, who are not necessarily experts in the detailed statistical background.