Sometimes I have to put text on a path

Saturday, September 19, 2009

neurosciences data mining

PubMed selection


1: Neuroinformatics. 2008 Sep;6(3):241-52. Epub 2008 Oct 24.

NeuroMorpho.Org implementation of digital neuroscience: dense coverage and integration with the NIF.

Center for Neural Informatics, Structure, & Plasticity, and Molecular Neuroscience Department, Krasnow Institute for Advanced Study, George Mason University, Fairfax, VA, USA. Neuronal morphology affects network connectivity, plasticity, and information processing. Uncovering the design principles and functional consequences of dendritic and axonal shape necessitates quantitative analysis and computational modeling of detailed experimental data. Digital reconstructions provide the required neuromorphological descriptions in a parsimonious, comprehensive, and reliable numerical format. NeuroMorpho.Org is the largest web-accessible repository service for digitally reconstructed neurons and one of the integrated resources in the Neuroscience Information Framework (NIF). Here we describe the NeuroMorpho.Org approach as an exemplary experience in designing, creating, populating, and curating a neuroscience digital resource. The simple three-tier architecture of NeuroMorpho.Org (web client, web server, and relational database) encompasses all necessary elements to support a large-scale, integrate-able repository. The data content, while heterogeneous in scientific scope and experimental origin, is unified in format and presentation by an in-house standardization protocol. The server application (MRALD) is secure, customizable, and developer-friendly. Centralized processing and expert annotation yield a comprehensive set of metadata that enriches and complements the raw data. The thoroughly tested interface design allows for optimal and effective data search and retrieval. Availability of data in both original and standardized formats ensures compatibility with existing resources and fosters further tool development. Other key functions enable extensive exploration and discovery, including 3D and interactive visualization of branching, frequently measured morphometrics, and reciprocal links to the original PubMed publications.
The integration of NeuroMorpho.Org with version-1 of the NIF (NIFv1) provides the opportunity to access morphological data in the context of other relevant resources and diverse subdomains of neuroscience, opening exciting new possibilities in data mining and knowledge discovery. The outcome of such coordination is the rapid and powerful advancement of neuroscience research at both the conceptual and technological level. PMID: 18949582 [PubMed - indexed for MEDLINE] PMCID: PMC2655120


2: J Biomed Inform. 2008 Apr;41(2):251-63. Epub 2007 Nov 22.

A prototype symbolic model of canonical functional neuroanatomy of the motor system.

Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St., Boston, MA 02115, USA. talos@bwh.harvard.edu Recent advances in bioinformatics have opened entire new avenues for organizing, integrating and retrieving neuroscientific data, in a digital, machine-processable format, which can be at the same time understood by humans, using ontological, symbolic data representations. Declarative information stored in ontological format can be perused and maintained by domain experts, interpreted by machines, and serve as basis for a multitude of decision support, computerized simulation, data mining, and teaching applications. We have developed a prototype symbolic model of canonical neuroanatomy of the motor system. Our symbolic model is intended to support symbolic look up, logical inference and mathematical modeling by integrating descriptive, qualitative and quantitative functional neuroanatomical knowledge. Furthermore, we show how our approach can be extended to modeling impaired brain connectivity in disease states, such as common movement disorders. In developing our ontology, we adopted a disciplined modeling approach, relying on a set of declared principles, a high-level schema, Aristotelian definitions, and a frame-based authoring system. These features, along with the use of the Unified Medical Language System (UMLS) vocabulary, enable the alignment of our functional ontology with an existing comprehensive ontology of human anatomy, and thus allow for combining the structural and functional views of neuroanatomy for clinical decision support and neuroanatomy teaching applications. Although the scope of our current prototype ontology is limited to a particular functional system in the brain, it may be possible to adapt this approach for modeling other brain functional systems as well. PMID: 18164666 [PubMed - indexed for MEDLINE] PMCID: PMC2376098


3: J Struct Biol. 2008 Mar;161(3):220-31. Epub 2007 Oct 16.

The cell centered database project: an update on building community resources for managing and sharing 3D imaging data.

Department of Neurosciences, University of California at San Diego, San Diego, CA 92093-0608, USA. mmartone@ucsd.edu Databases have become integral parts of data management, dissemination, and mining in biology. At the Second Annual Conference on Electron Tomography, held in Amsterdam in 2001, we proposed that electron tomography data should be shared in a manner analogous to structural data at the protein and sequence scales. At that time, we outlined our progress in creating a database to bring together cell level imaging data across scales, The Cell Centered Database (CCDB). The CCDB was formally launched in 2002 as an on-line repository of high-resolution 3D light and electron microscopic reconstructions of cells and subcellular structures. It contains 2D, 3D, and 4D structural and protein distribution information from confocal, multiphoton, and electron microscopy, including correlated light and electron microscopy. Many of the data sets are derived from electron tomography of cells and tissues. In the 5 years since its debut, we have moved the CCDB from a prototype to a stable resource and expanded the scope of the project to include data management and knowledge engineering. Here, we provide an update on the CCDB and how it is used by the scientific community. We also describe our work in developing additional knowledge tools, e.g., ontologies, for annotation and query of electron microscopic data. PMID: 18054501 [PubMed - indexed for MEDLINE] PMCID: PMC2367257


4: Prog Brain Res. 2006;158:83-108.

Functional genomics and proteomics in the clinical neurosciences: data mining and bioinformatics.

The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30322, USA. The goal of this chapter is to introduce some of the available computational methods for expression analysis. Genomic and proteomic experimental techniques are briefly discussed to help the reader understand these methods and results better in context with the biological significance. Furthermore, a case study is presented that will illustrate the use of these analytical methods to extract significant biomarkers from high-throughput microarray data. Genomic and proteomic data analysis is essential for understanding the underlying factors that are involved in human disease. Currently, such experimental data are generally obtained by high-throughput microarray or mass spectrometry technologies among others. The sheer amount of raw data obtained using these methods warrants specialized computational methods for data analysis. Biomarker discovery for neurological diagnosis and prognosis is one such example. By extracting significant genomic and proteomic biomarkers in controlled experiments, we come closer to understanding how biological mechanisms contribute to neural degenerative diseases such as Alzheimer's and how drug treatments interact with the nervous system. In the biomarker discovery process, there are several computational methods that must be carefully considered to accurately analyze genomic or proteomic data. These methods include quality control, clustering, classification, feature ranking, and validation. Data quality control and normalization methods reduce technical variability and ensure that discovered biomarkers are statistically significant. Preprocessing steps must be carefully selected since they may adversely affect the results of the following expression analysis steps, which generally fall into two categories: unsupervised and supervised.
Unsupervised or clustering methods can be used to group similar genomic or proteomic profiles and therefore can elucidate relationships within sample groups. These methods can also assign biomarkers to sub-groups based on their expression profiles across patient samples. Although clustering is useful for exploratory analysis, it is limited due to its inability to incorporate expert knowledge. On the other hand, classification and feature ranking are supervised, knowledge-based machine learning methods that estimate the distribution of biological expression data and, in doing so, can extract important information about these experiments. Classification is closely coupled with feature ranking, which is essentially a data reduction method that uses classification error estimation or other statistical tests to score features. Biomarkers can subsequently be extracted by eliminating insignificantly ranked features. These analytical methods may be equally applied to genetic and proteomic data. However, because of both biological differences between the data sources and technical differences between the experimental methods used to obtain these data, it is important to have a firm understanding of the data sources and experimental methods. At the same time, regardless of the data quality, it is inevitable that some discovered biomarkers are false positives. Thus, it is important to validate discovered biomarkers. The validation process may be slow; yet, the overall biomarker discovery process is significantly accelerated due to initial feature ranking and data reduction steps. Information obtained from the validation process may also be used to refine data analysis procedures for future iteration. Biomarker validation may be performed in a number of ways - bench-side in traditional labs, web-based electronic resources such as gene ontology and literature databases, and clinical trials. PMID: 17027692 [PubMed - indexed for MEDLINE]
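The feature-ranking step described in this abstract can be illustrated with a small sketch. It is not the chapter's own procedure: it assumes a Welch t-statistic as the "statistical test" used to score features, and the function names (`welch_t`, `rank_features`) and the toy expression values are made up for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic comparing the means of two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(control, disease):
    """Score each feature by |t| between groups and sort, highest first.

    `control` and `disease` map a feature name to its list of
    expression values in that group (hypothetical data layout).
    """
    scores = {
        name: abs(welch_t(control[name], disease[name]))
        for name in control
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, invented expression values: gene_A differs between groups, gene_B does not.
control = {"gene_A": [1.0, 1.1, 0.9, 1.0], "gene_B": [1.0, 1.4, 0.7, 1.1]}
disease = {"gene_A": [2.0, 2.1, 1.9, 2.2], "gene_B": [1.1, 0.9, 1.3, 1.0]}

ranked = rank_features(control, disease)
for name, score in ranked:
    print(f"{name}: |t| = {score:.2f}")
```

Low-scoring features would then be dropped before validation; where to put the cutoff is a separate decision that this sketch does not address.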


5: Neuroinformatics. 2004;2(4):369-80.

Mining for associations between text and brain activation in a functional neuroimaging database.

Neurobiology Research Unit, Rigshospitalet, Copenhagen University Hospital, Denmark. fnielsen@nru.dk We describe a method for mining a neuroimaging database for associations between text and brain locations. The objective is to discover association rules between words indicative of cognitive function as described in abstracts of neuroscience papers and sets of reported stereotactic Talairach coordinates. We invoke a simple probabilistic framework in which kernel density estimates are used to model distributions of brain activation foci conditioned on words in a given abstract. The principal associations are found in the joint probability density between words and voxels. We show that the statistically motivated associations are well aligned with general neuroscientific knowledge. PMID: 15800369 [PubMed - indexed for MEDLINE]
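To make the idea of a kernel density estimate conditioned on a word concrete, here is a minimal sketch. It is not the authors' implementation: it assumes an isotropic Gaussian kernel with a fixed bandwidth, and the function names, the bandwidth value, and the coordinates are all invented for illustration.

```python
import math


def gaussian_kernel_3d(x, center, bandwidth):
    """Isotropic 3D Gaussian kernel centred on `center`, evaluated at `x`."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    norm = (2.0 * math.pi * bandwidth ** 2) ** 1.5
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm


def density_given_word(x, word, papers, bandwidth=10.0):
    """Kernel density estimate of p(x | word).

    Pools the activation foci of every paper whose abstract contains
    `word` and averages one Gaussian kernel per focus.
    """
    foci = [f for words, coords in papers if word in words for f in coords]
    if not foci:
        return 0.0
    return sum(gaussian_kernel_3d(x, f, bandwidth) for f in foci) / len(foci)


# Toy, invented data: (set of abstract words, list of reported foci in mm).
papers = [
    ({"visual", "motion"}, [(-40.0, -70.0, 0.0), (42.0, -68.0, 2.0)]),
    ({"memory", "retrieval"}, [(-24.0, -20.0, -12.0)]),
    ({"visual", "attention"}, [(-38.0, -72.0, 4.0)]),
]

query = (-40.0, -70.0, 2.0)
p_visual = density_given_word(query, "visual", papers)
p_memory = density_given_word(query, "memory", papers)
print(p_visual > p_memory)
```

Comparing the conditional density for different words at the same location is one simple way to see which words are associated with a brain region.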


6: Neuroinformatics. 2003;1(4):379-95.

The cell-centered database: a database for multiscale structural and protein localization data from light and electron microscopy.

Department of Neurosciences, University of California at San Diego, San Diego, CA, USA. mmartone@ucsd.edu The creation of structured shared data repositories for molecular data in the form of web-accessible databases like GenBank has been a driving force behind the genomic revolution. These resources serve not only to organize and manage molecular data being created by researchers around the globe, but also provide the starting point for data mining operations to uncover interesting information present in the large amount of sequence and structural data. To realize the full impact of the genomic and proteomic efforts of the last decade, similar resources are needed for structural and biochemical complexity in biological systems beyond the molecular level, where proteins and macromolecular complexes are situated within their cellular and tissue environments. In this review, we discuss our efforts in the development of neuroinformatics resources for managing and mining cell level imaging data derived from light and electron microscopy. We describe the main features of our web-accessible database, the Cell Centered Database (CCDB; http://ncmir.ucsd.edu/CCDB/), designed for structural and protein localization information at scales ranging from large expanses of tissue to cellular microdomains with their associated macromolecular constituents. The CCDB was created to make 3D microscopic imaging data available to the scientific community and to serve as a resource for investigating structural and macromolecular complexity of cells and tissues, particularly in the rodent nervous system. PMID: 15043222 [PubMed - indexed for MEDLINE]


7: Methods Inf Med. 2003;42(2):126-33.

Informatics united: exemplary studies combining medical informatics, neuroinformatics and bioinformatics.

Intelligent Bioinformatics Systems, German Cancer Research Center, Heidelberg, Germany. OBJECTIVES: Medical informatics, neuroinformatics and bioinformatics provide a wide spectrum of research. Here, we show the great potential of synergies between these research areas on the basis of four exemplary studies where techniques are transferred from one of the disciplines to the other. METHODS: Reviewing and analyzing exemplary and specific projects at the intersection of medical informatics, neuroinformatics, and bioinformatics from our experience in an interdisciplinary research group. RESULTS: Synergy emerges when techniques and solutions from medical informatics, bioinformatics, or neuroinformatics are successfully applied in one of the other disciplines. Synergy was found in 1. the modeling of neurophysiological systems for medical therapy development, 2. the use of image processing techniques from medical computer vision for the analysis of the dynamics of cell nuclei, and 3. the application of neuroinformatics tools for data mining in bioinformatics and as classifiers in clinical oncology. CONCLUSIONS: Each of the three different disciplines has delivered technologies that are readily applicable in the other disciplines. The mutual transfer of knowledge and techniques proved to increase efficiency and accuracy in a manifold of applications. In particular, we expect that clinical decision support systems based on techniques derived from neuro- and bioinformatics have the potential to improve medical diagnostics and will finally lead to a personalized delivery of healthcare. PMID: 12743648 [PubMed - indexed for MEDLINE]


8: Philos Trans R Soc Lond B Biol Sci. 2001 Aug 29;356(1412):1159-86.

Advanced database methodology for the Collation of Connectivity data on the Macaque brain (CoCoMac).

Computational Systems Neuroscience Group, C. and O. Vogt Brain Research Institute, Heinrich Heine University Düsseldorf, Moorenstrasse 5, 40225, Düsseldorf, Germany. The need to integrate massively increasing amounts of data on the mammalian brain has driven several ambitious neuroscientific database projects that were started during the last decade. Databasing the brain's anatomical connectivity as delivered by tracing studies is of particular importance as these data characterize fundamental structural constraints of the complex and poorly understood functional interactions between the components of real neural systems. Previous connectivity databases have been crucial for analysing anatomical brain circuitry in various species and have opened exciting new ways to interpret functional data, both from electrophysiological and from functional imaging studies. The eventual impact and success of connectivity databases, however, will require the resolution of several methodological problems that currently limit their use. These problems comprise four main points: (i) objective representation of coordinate-free, parcellation-based data, (ii) assessment of the reliability and precision of individual data, especially in the presence of contradictory reports, (iii) data mining and integration of large sets of partially redundant and contradictory data, and (iv) automatic and reproducible transformation of data between incongruent brain maps. Here, we present the specific implementation of the 'collation of connectivity data on the macaque brain' (CoCoMac) database (http://www.cocomac.org). The design of this database addresses the methodological challenges listed above, and focuses on experimental and computational neuroscientists' needs to flexibly analyse and process the large amount of published experimental data from tracing studies. 
In this article, we explain step-by-step the conceptual rationale and methodology of CoCoMac and demonstrate its practical use by an analysis of connectivity in the prefrontal cortex. PMID: 11545697 [PubMed - indexed for MEDLINE]
