The paper presents an image-oriented modality to functionally describe artificially and biologically nanostructured
surfaces, which can be used for the characterization of the atom neighborhoods on the surface of proteins.
Two properties, hydrophobicity and charge distribution on the protein surface, are analyzed in this paper. The
actual discrete hydrophobicity and charge distributions attached to the atoms that form a surface atom's vicinity
are replaced by approximately equivalent density distributions, computed in a standardized octagonal pattern
around each atom. These representations of hydrophobicities and charges are used to compute the resemblance of surface atom neighborhoods belonging to a protein, defined as the sum of the products of hydrophobicity densities of the corresponding patches (the pattern's central circles or angular sectors having the same position). The similitude and the interaction of a pair of atom neighborhoods are defined as their resemblance for parallel and anti-parallel orientations, respectively, of the normals to the molecular surface at the points where the central atoms are located. Surface atom neighborhoods have been classified in terms of both resemblance and vector description.
The paper presents an image-oriented modality to functionally describe artificially and biologically nanostructured
surfaces, which can be used for the characterization of the atom neighborhoods on the surface of proteins. The property
which is mainly analyzed in this paper is the hydrophobicity distribution on the protein surface, but the distributions of
charges and mutual electrical potentials can also be considered. The actual discrete hydrophobicity distribution attached
to the atoms that form a surface atom's vicinity is replaced by an approximately equivalent hydrophobicity density
distribution, computed in a standardized octagonal pattern around each atom. This representation of hydrophobicities is
used to compute the resemblance of surface atom neighborhoods belonging to a protein, defined as the sum of the
products of hydrophobicity densities of the corresponding patches (the pattern's central circles or angular sectors having
the same position). The similitude and the interaction of a pair of atom neighborhoods are defined as their resemblance
for parallel and anti-parallel orientations, respectively, of the normals to the molecular surface at the points where the
central atoms are located. The purpose of this work is to create a database of selected protein surfaces that will be used
for nanotechnology research and applications.
The paper presents a methodology using atom or amino acid hydrophobicities to describe the surface properties of
proteins in order to predict their interactions with other proteins and with artificial nanostructured surfaces. A
standardized pattern is built around each surface atom of the protein for a radius depending on the molecule type and
size. The atom neighborhood is characterized in terms of the hydrophobicity surface density. A clustering algorithm is
used to classify the resulting patterns and to identify the possible interactions. The methodology has been implemented in
a software package based on Java technology deployed in a Linux environment.
Nucleotide genomic signals satisfy regularities that reveal restrictions in the distribution of nucleotides and pairs of
nucleotides along DNA sequences. Structurally, a chromosome appears to be more than a plain text, by satisfying
symmetry constraints that evoke the rhythm and rhyme in poems. These regularities make it easy to identify exogenous
inserts in the genomes of prokaryotes, because such inserts obey different regularities than the background sequence. The
paper presents instances of inserts found in the genomes of Bacillus subtilis, Mycobacterium tuberculosis and other
prokaryotes. Inserts of exogenous material are frequently accompanied by complementary inserts tending to restore the
original constraints.
KEYWORDS: Resistance, Signal analysis, Pathogens, Data conversion, Digital signal processing, Signal processing, Image segmentation, Imaging spectroscopy, Current controlled current source, Polymers
As previously shown, the conversion of nucleotide sequences into digital signals offers the possibility to apply signal
processing methods for the analysis of genomic data. Genomic Signal Analysis (GSA) has been used to analyze large
scale features of DNA sequences, at the scale of whole chromosomes, including both coding and non-coding regions.
The striking regularities of genomic signals reveal restrictions in the way nucleotides and pairs of nucleotides are
distributed along nucleotide sequences. Structurally, a chromosome appears to be less of a "plain text", corresponding to
certain semantic and grammar rules, but more of a "poem", satisfying additional symmetry restrictions that evoke the
"rhythm" and "rhyme". Recurrent patterns in nucleotide sequences are reflected in simple mathematical regularities
observed in genomic signals. GSA has also been used to track pathogen variability, especially concerning their resistance
to drugs. Previous work has been dedicated to the study of HIV-1, Clade F and Avian Flu. The present paper applies
GSA methodology to study Mycobacterium tuberculosis (MT) rpoB gene variability, relevant to its resistance to
antibiotics. Isolates from 50 Romanian patients have been studied both by rapid LightCycler PCR and by sequencing of a
segment of 190-250 nucleotides covering the region of interest. The variability is caused by SNPs occurring at specific
sites along the gene strand, as well as by inclusions. Because of the mentioned symmetry restrictions, the GS variations
tend to compensate. An important result is that MT can act as a vector for HIV, which is able to retrotranscribe its
specific genes both into human and MT genomes.
Sequences of 2D images, taken by a single moving video receptor, can be fused to generate a 3D representation. This
dynamic stereopsis exists in birds and reptiles, whereas the static binocular stereopsis is common in mammals, including
humans. Most multimedia computer vision systems for stereo image capture, transmission, processing, storage and
retrieval are based on the concept of binocularity. As a consequence, their main goal is to acquire, conserve and enhance
pairs of 2D images able to generate a 3D visual perception in a human observer. Stereo vision in birds is based on the
fusion of images captured by each eye, with previously acquired and memorized images from the same eye. The process
goes on simultaneously and conjointly for both eyes and generates an almost complete all-around visual field. As a
consequence, the baseline distance is no longer fixed, as in the case of binocular 3D view, but adjustable in accordance
with the distance to the object of main interest, allowing a controllable depth effect. Moreover, the synthesized 3D scene
can have a better resolution than each individual 2D image in the sequence. Compression of 3D scenes can be achieved,
and stereo transmissions with lower bandwidth requirements can be developed.
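The effect of an adjustable baseline on depth perception can be illustrated with the textbook rectified pinhole stereo relation, disparity = focal length × baseline / depth. The sketch below is only an illustration under that assumed model; the function names and the numeric values in the usage note are hypothetical and are not taken from the paper.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Disparity (in pixels) of a point at a given depth, for a rectified pair."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def depth_from_disparity(focal_px: float, baseline_m: float, disparity: float) -> float:
    """Depth (in meters) recovered from a measured disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity
```

Under this model, halving the baseline halves the disparity of the same point: with an assumed focal length of 800 pixels and a point 20 m away, a 0.5 m baseline gives twice the disparity of a 0.25 m baseline. Choosing the baseline according to the distance of the object of interest is one way to read the "controllable depth effect" mentioned in the abstract.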
The paper presents results in the study of pathogen variability by using genomic signals. The conversion of symbolic nucleotide sequences into digital signals offers the possibility to apply signal processing methods to the analysis of genomic data. The method is particularly well suited to characterize small size genomic sequences, such as those found in viruses and bacteria, being a promising tool in tracking the variability of pathogens, especially in the context of developing drug resistance. The paper is based on data downloaded from GenBank [32], and comprises results on the variability of the eight segments of the influenza type A, subtype H5N1, virus genome, and of the Hemagglutinin (HA) gene, for the H1, H2, H3, H4, H5 and H16 types. Data from human and avian virus isolates are used.
The conversion of genomic sequences into digital genomic signals offers the possibility to use powerful signal processing methods for the analysis of genomic information. The study of genomic signals reveals local and global features of chromosomes that would be difficult to identify by using only the symbolic representation used in genomic data bases. The paper presents a study of HIV variability using standard 'wet' methods of nucleotide sequence analysis, corroborated with IT techniques based on the genomic signal approach. Specifically, Independent Component Analysis is used to characterize the variability defining the F subtype HIV strains isolated in Romania.
KEYWORDS: Switching, Signal processing, Data storage, Proteins, Molecules, Data conversion, Genetics, Digital signal processing, Statistical analysis, Hydrogen
For large scale analysis purposes, the conversion of genomic sequences into digital signals opens the possibility to use powerful signal processing methods for handling genomic information. The study of complex genomic signals reveals large scale features, maintained over the scale of whole chromosomes, that would be difficult to find by using only the symbolic representation. Based on genomic signal methods and on statistical techniques, the paper defines parameters of DNA sequences which are invariant to transformations induced by SNPs, splicing or crossover. Re-orienting concatenated coding regions in the same direction, regularities shared by the genomic material in all exons are revealed, pointing towards the hypothesis of a regular ancestral structure from which the current chromosome structures have evolved. This property is not found in non-nuclear genomic material, e.g., plasmids.
KEYWORDS: Scattering, Statistical analysis, Image segmentation, Visualization, Molecules, Signal processing, Data conversion, Switching, Acquisition tracking and pointing, Biological research
Symbolic nucleotide sequences are converted into digital genomic signals by using a complex representation derived from a tetrahedral vector representation of nucleotides. The study of complex genomic signals using signal processing methods reveals large scale features of chromosomes that would be difficult to grasp by using the statistical and pattern matching methods for the analysis of symbolic genomic sequences. On the other hand, in the context of operating with a large volume of data at various resolutions and visualizing the results to make them available to humans, the problem of data representability becomes critical. A novel mathematical description of data representability, based on the data scattering ratio on a pixel is defined and is applied for several typical cases of standard signals and for genomic signals. It is shown that the variation of genomic data along nucleotide sequences, specifically the cumulated and unwrapped phase, can be visualized adequately as simple graphic lines for low and large scales, while for medium scales (thousands to tens of thousands of base pairs) the statistical descriptions have to be used.
The paper briefly reviews the methodology of converting symbolic nucleic sequences into genomic signals and presents large scale and global features of the resulting genomic signals. Whole chromosomes or whole genomes are converted into complex signals and phase analysis is performed. The phase, cumulated phase and unwrapped phase of genomic signals are studied as tools for revealing important features related to the first and second order statistics of nucleotide distribution along DNA strands. It is shown that the unwrapped phase displays an almost linear variation along whole chromosomes. The property holds for all the investigated genomes, being shared by both prokaryotes and eukaryotes, while the magnitude and sign of the unwrapped phase slope are specific for each taxon and chromosome. The comparison between the behavior of the cumulated phase and of the unwrapped phase across the putative origins and termini of the replichores suggests a model of the 'patchy' structure of the chromosomes.
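The three phase quantities named in this abstract can be sketched as follows. The nucleotide-to-complex mapping used here (A = 1+j, C = -1-j, G = -1+j, T = 1-j) is an assumed illustrative choice, since the abstracts do not spell out the mapping; the function names are hypothetical as well. The cumulated phase is taken as the running sum of the phases, and the unwrapped phase as the phase corrected by multiples of 2π so that successive steps stay within [-π, π). Steps of exactly ±π (between opposite points of the mapping) are inherently ambiguous, and the wrapping convention used for them below is an assumption.

```python
import cmath
import math
from typing import List

# Assumed illustrative mapping of nucleotides onto the complex plane.
NUCLEOTIDE_TO_COMPLEX = {"A": 1 + 1j, "C": -1 - 1j, "G": -1 + 1j, "T": 1 - 1j}


def genomic_signal(sequence: str) -> List[complex]:
    """Convert a symbolic nucleotide sequence into a complex signal."""
    return [NUCLEOTIDE_TO_COMPLEX[n] for n in sequence.upper()]


def cumulated_phase(signal: List[complex]) -> List[float]:
    """Running sum of the phases of the signal samples."""
    total, out = 0.0, []
    for z in signal:
        total += cmath.phase(z)
        out.append(total)
    return out


def unwrapped_phase(signal: List[complex]) -> List[float]:
    """Phase with 2*pi jumps removed: each step is wrapped into [-pi, pi)."""
    out: List[float] = []
    previous_raw = 0.0
    for i, z in enumerate(signal):
        raw = cmath.phase(z)
        if i == 0:
            out.append(raw)
        else:
            step = (raw - previous_raw + math.pi) % (2 * math.pi) - math.pi
            out.append(out[-1] + step)
        previous_raw = raw
    return out
```

For the short sequence "AGCT" under this mapping, every transition is a quarter turn counter-clockwise, so the unwrapped phase grows steadily while the cumulated phase returns to zero. A slope estimate over a whole chromosome would then be a linear fit to the output of `unwrapped_phase`.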
KEYWORDS: Statistical analysis, Signal processing, Switching, Image segmentation, Molecules, Proteins, Data conversion, Genetics, Databases, Digital signal processing
The paper presents some new results in the analysis of genomic information at the scale of whole chromosomes or whole genomes based on the conversion into genomic signals. Mainly, the phase analysis (phase, cumulated phase and unwrapped phase) and the sequence path analysis are presented. The unwrapped phase displays an almost linear variation along whole chromosomes. The property holds for all the investigated genomes, being shared by both prokaryotes and eukaryotes, while the magnitude and sign of the unwrapped phase slope are specific for each taxon and chromosome. Such a behavior proves a rule similar to Chargaff's rule, but reveals a statistical regularity of the succession of the nucleotides (a second order statistic), not only of the distribution of nucleotides (a first order statistic). The cumulated phase of the genomic signal of certain prokaryotes also shows interesting specific behavior. The comparison between the behavior of the cumulated phase and of the unwrapped phase across the putative origins and termini of the replichores suggests an interesting model for the structure of chromosomes.
On the capillary level of intact organs of humans and mammals, fields of gradients of all measurable parameters are found exclusively. Therefore, precise monitoring of, e.g., hemoglobin concentration (Hb) and oxygenation (HbO2) in the microcosm of blood capillaries is only possible when the heterogeneity of tissue data is recorded at a representative number of measuring points. To collect the required data, either stochastic or imaging techniques can be applied.
The feasibility to obtain visualized information of myocardium by imaging is a new dimension. However, during heart surgery the surgeon does not need all data of images continuously. Therefore, development of strategies able to reduce flux of information transiently in between images might become important. Arrangements of images in 3-dimensional structures can produce better outlines. Images often contain information of several parameters. Therefore, a selection of important parts of the pictures might be helpful. Optical sensors will have the ability to detect dangerous situations in tissues which can release optical or acoustic signals.
The paper presents a new way to study the results obtained by back-scattering of light in tissue through artificial intelligence. The ability of artificial neural networks (ANN) to extract significant information from an initial set of data allows both an interpolation, in the a priori defined points, and an extrapolation outside of the range bordered by the extreme points from the initial training set. The data obtained from the EMPHO spectrophotometer were used for training the neural networks. Specific aspects related to the training procedure and parameter fitting are presented. The evaluation of the computing effort suggests some directions for future optimizations.
In 2000 by 2D-imaging we were able for the first time to visualize in subcellular space functional structures of myocardium. For these experiments we used hemoglobin-free perfused pig hearts in our lab. Step by step we learned to understand the meaning of subcellular structures. Principally, the experiment revealed that in subcellular space very fast changes of light scattering can occur. Furthermore, coefficients of different parameters were determined on the basis of multicomponent system theory.
KEYWORDS: Genetics, Proteins, Molecules, Signal analyzers, Digital signal processing, Signal processing, Independent component analysis, Statistical analysis, Control systems, Image segmentation
An original tetrahedral representation of the Genetic Code (GC), that better catches its structure, degeneracy and evolution trends, is defined. The possibility to reduce the dimensionality of the description by the projection of the GC tetrahedron on an adequately oriented plane is also considered, leading to complex representations of the GC. On these bases, optimal symbolic-to-digital mappings of the linear, one-dimensional and one-directional strands of nucleic acids into real or complex genetic signals are derived at nucleotide, codon and amino acid levels. By converting the sequences of nucleotides and polypeptides into digital genetic signals, this approach opens the possibility to use a large variety of signal processing methods for their processing and analysis. It is also shown that some essential features of nucleotide sequences can be better extracted using this representation. Some preliminary results in the comparative analysis of the statistical properties of intragenic vs. intergenic genetic signals are also presented. The use of Independent Component Analysis (ICA) to search for control sequences in the intergenic DNA, i.e., the part of the genome that does not encode proteins, is suggested.
Oxyscan proved to be a very reliable optical sensor system for application in physiology and pathophysiology. Furthermore, it can be applied during liver transplantation in order to diminish or prevent reperfusion injury. Oxyscan could be used for monitoring of subcellular structures with great success.
The EMPHO SSK is a scanning micro lightguide spectrophotometer constructed for 3D-imaging in all tissues of humans and mammals. By monitoring of intracapillary hemoglobin oxygenation and concentration very precise information of the microcosm of the oxygen supply level of tissues can be gained. Measurements in skin and liver revealed that on the basis of numeric data obtained by optical remission techniques 3D-images can be constructed which provide information on functional structures of intact organs.
Two-dimensional functional images of local O2 supply parameters in intact forehead skin of human volunteers were constructed on the basis of optical monitoring by the micro-lightguide spectrometer (EMPHO II SSK). The different parameters obtained by optical measurements, such as hemoglobin concentration and oxygenation, O2 content, as well as local O2 gradients in areas of 10 by 10 mm, are processed by means of new imaging techniques. The results reveal that pronounced differences exist between the oxygenation of healthy skin of young and older volunteers. There is evidence that regulation of capillary flow in the skin of older volunteers might be altered.
SC316: Independent Component Analysis for Genetic Signals
The course enables attendees to use Independent Component Analysis (ICA) to extract significant information from complex biomedical signals. The conceptual basis and the implementation of ICA algorithms are studied through useful practical examples, focusing on genetic sequences. This provides you with a powerful tool for the analysis, representation and interpretation of signals, important in both research and clinical practice.