Ryan's Blog

Getting Sizes Right for CUDA

Posted in programming, research by ryanlayer on January 12, 2010

Block Elements:  This depends on the size of the block’s local memory (LOCAL_MEM; CUDA calls this shared memory) and the size of each element in the data structure (sizeof(int)).  For the Tesla card there is 16K of space, so LOCAL_MEM = 16384 bytes.  With 4-byte ints, block_elements = 16384/4 = 4096.

Block Size (number of threads per block):  Each thread is responsible for one window.  The number of threads per block depends on the block elements, the window size, and the slide size:  block_size = (block_elements – window_length) / slide_length, using integer division.  With window_length = 200 and slide_length = 50, (4096 – 200)/50 = 77.

Usable Threads:  Each thread is responsible for loading the first slide_length elements of its range from global memory to local memory.  This means that some threads at the end of the block will not have all of their data loaded.  There will be window_length/slide_length unusable threads, and block_size – (window_length/slide_length) usable threads.  With the numbers above, 200/50 = 4 threads are unusable and 77 – 4 = 73 are usable.
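The three sizes above can be checked with a few lines of host-side C.  This is only a sketch of the arithmetic in this post: the struct and function names (block_sizes, compute_block_sizes) are made up for illustration, and it assumes the window length is a whole multiple of the slide length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three sizes discussed in the post. */
typedef struct {
    int block_elements;  /* ints that fit in one block's local memory */
    int block_size;      /* threads (windows) per block */
    int usable_threads;  /* threads whose whole window gets loaded */
} block_sizes;

/* Apply the post's formulas; every division is integer division. */
block_sizes compute_block_sizes(size_t local_mem_bytes, size_t element_bytes,
                                int window_length, int slide_length)
{
    assert(element_bytes > 0 && slide_length > 0);
    /* Assumption: the window is a whole number of slides. */
    assert(window_length % slide_length == 0);

    block_sizes s;
    s.block_elements = (int)(local_mem_bytes / element_bytes);
    s.block_size = (s.block_elements - window_length) / slide_length;
    s.usable_threads = s.block_size - window_length / slide_length;
    return s;
}
```

Calling compute_block_sizes(16384, 4, 200, 50) reproduces the numbers used in this post: 4096 block elements, 77 threads per block, and 73 usable threads.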

NOTE:  blockDim.x and window_length/slide_length are both counts of windows (how many windows are in a block, and how many unusable windows/threads there are per block).  To convert a window count to a position we must multiply it by slide_length.  For example, window 5 starts at position 5*slide_length.

Each block starts at position blockIdx.x*(blockDim.x – window_length/slide_length)*slide_length

blockIdx.x is the block ID.  If blocks did not need to overlap, we would just multiply this by blockDim.x*slide_length (blockDim.x is a number of windows and we need a position, so we multiply by slide_length).  Since blocks overlap, each block after the first must start a few windows back.  The number of unusable windows at the end of each block is window_length/slide_length, and the next block needs to cover these windows.  Block sizes are fixed, so a block that is moved back to cover unusable windows leaves the same number of unprocessed windows at its own end, which the next block must cover in addition to the unusable windows.  Block 1 needs to be moved back window_length/slide_length windows to cover the unusable windows in block 0; block 2 needs to be moved back 2*(window_length/slide_length) to cover both the unprocessed space and the unusable windows; block 3 needs to be moved back 3*(window_length/slide_length); and so forth.  The amount a block must be moved back is therefore blockIdx.x*(window_length/slide_length) windows, so each block starts at window blockIdx.x*blockDim.x – blockIdx.x*(window_length/slide_length), which is position blockIdx.x*(blockDim.x – window_length/slide_length)*slide_length.

Each thread, which corresponds to a window, starts at an offset from where its block starts.  That offset is based on the slide size, so the thread's start position is threadIdx.x*slide_length + blockIdx.x*(blockDim.x – window_length/slide_length)*slide_length
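The start-position formulas can be checked on the host before they go into a kernel.  In the sketch below the parameters block_idx, block_dim and thread_idx stand in for CUDA's blockIdx.x, blockDim.x and threadIdx.x, and the function names are hypothetical; it is plain C, not device code.

```c
#include <assert.h>

/* Position where a block starts, following the formula in the post.
   block_idx and block_dim stand in for blockIdx.x and blockDim.x. */
int block_start(int block_idx, int block_dim,
                int window_length, int slide_length)
{
    assert(slide_length > 0);
    int unusable = window_length / slide_length;  /* unusable windows per block */
    return block_idx * (block_dim - unusable) * slide_length;
}

/* Position where one thread's window starts: its offset inside the block
   plus the block's start.  thread_idx stands in for threadIdx.x. */
int window_start(int thread_idx, int block_idx, int block_dim,
                 int window_length, int slide_length)
{
    return thread_idx * slide_length
         + block_start(block_idx, block_dim, window_length, slide_length);
}
```

With blockDim.x = 77, window_length = 200 and slide_length = 50, block 1 starts at position 3650.  That is the same position as thread 73 of block 0, the first unusable window in that block, so the usable windows of consecutive blocks line up without gaps.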

Number of Blocks:  chrom_size/block_elements would be correct if there were no overlapping, but blocks must overlap to account for the unusable threads.
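Because consecutive blocks start (blockDim.x – window_length/slide_length)*slide_length positions apart, one way to estimate the block count is to divide the chromosome length by that stride and round up.  This post does not state that formula, so treat the sketch below as an assumption derived from the block start position; num_blocks is a hypothetical name.

```c
#include <assert.h>

/* Estimate how many overlapping blocks are needed to cover chrom_size
   positions.  Assumption (not from the post): each block advances by
   usable_threads * slide_length positions, and we round up so that every
   position below chrom_size falls in some block's usable range. */
long num_blocks(long chrom_size, int block_dim,
                int window_length, int slide_length)
{
    assert(slide_length > 0);
    int usable = block_dim - window_length / slide_length;
    assert(usable > 0);  /* otherwise no block makes progress */

    long stride = (long)usable * slide_length;
    return (chrom_size + stride - 1) / stride;  /* ceiling division */
}
```

For a chromosome of 1,000,000 positions with 77 threads per block, a 200-position window and a 50-position slide, the stride is 3650 and the estimate is 274 blocks.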

Reading mapped files and fastq files

Posted in programming, research by ryanlayer on December 2, 2009

When reading the files, we deal with two file types and three files in total.  Two of the files (.out files) are the result of mapping the fastq files to the reference.  Each of these represents all tags on one side of the pair.  The fastq file can represent either side of the pair.

We assume that all files are ordered.

For each file, we want to extract the entries.

In the mapped file, we ignore any line that begins with ‘#’ or has a status other than ‘U’, which indicates a unique mapping.  While we are reading the mapped files, each line read has one of three results, and the read returns one of three integer values:

  1. valid entry (1)
  2. invalid entry (status != ‘U’, line begins with ‘#’, etc.) (0)
  3. end of file (-1)

These different return values allow us to loop for the next valid entry and stop looping when we reach the end:

while(read_line(mapped.file) == 0) {;}

This loops until we have a valid entry or have reached the end of the file.
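A fuller sketch of this pattern in C is shown below.  The post does not give the layout of the .out files, so the whitespace-separated "name status position" line layout, the mapped_entry struct, and the helper next_valid_entry are all assumptions made for illustration; only the three return values and the skip loop come from the text.  The wrapper keeps the final return value so the caller can tell a valid entry from the end of the file.

```c
#include <assert.h>
#include <stdio.h>

enum { READ_EOF = -1, READ_INVALID = 0, READ_VALID = 1 };

/* Hypothetical entry layout; the real .out format is not given in the post. */
typedef struct {
    char name[64];
    char status;    /* 'U' marks a unique mapping */
    long position;
} mapped_entry;

/* Read one line and classify it.  Assumes lines are shorter than 256 bytes. */
int read_line(FILE *fp, mapped_entry *entry)
{
    char line[256];
    if (fgets(line, sizeof line, fp) == NULL)
        return READ_EOF;
    if (line[0] == '#')
        return READ_INVALID;  /* comment line */
    if (sscanf(line, "%63s %c %ld",
               entry->name, &entry->status, &entry->position) != 3)
        return READ_INVALID;  /* malformed line */
    if (entry->status != 'U')
        return READ_INVALID;  /* not a unique mapping */
    return READ_VALID;
}

/* Skip invalid lines.  Returns READ_VALID or READ_EOF, never READ_INVALID. */
int next_valid_entry(FILE *fp, mapped_entry *entry)
{
    int status;
    while ((status = read_line(fp, entry)) == READ_INVALID) { ; }
    return status;
}
```

A caller would loop on next_valid_entry(fp, &entry) == READ_VALID and stop as soon as it returns READ_EOF.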

In the fastq file, there are 4 lines of data per tag.  We are only concerned with the first, so we simply skip three lines between entry reads.
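The skip-three-lines approach might look like the following in C.  This is a minimal sketch: read_fastq_entry is a hypothetical name, the return values mirror the mapped-file reader (1 for an entry, -1 for end of file), and it assumes every line fits in the buffers used.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of the next 4-line fastq record into id and skip the
   other three lines.  Returns 1 on success and -1 at end of file (or when a
   record is cut short).  Assumes the first line fits in id and the skipped
   lines are shorter than 1024 bytes. */
int read_fastq_entry(FILE *fp, char *id, size_t id_size)
{
    char skipped[1024];

    if (fgets(id, (int)id_size, fp) == NULL)
        return -1;
    id[strcspn(id, "\r\n")] = '\0';  /* strip the line ending */

    for (int i = 0; i < 3; i++) {
        if (fgets(skipped, sizeof skipped, fp) == NULL)
            return -1;  /* truncated record */
    }
    return 1;
}
```

Since all files are assumed to be ordered, each call simply returns the next tag in file order.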


Finding Structural Variations with Pair-End Sequences and a Sliding Window

Posted in research by ryanlayer on October 19, 2009

Yeast genome analysis identifies chromosomal translocation, gene conversion events and several sites of Ty element insertion.

Posted in research by ryanlayer on October 8, 2009

Nucleic Acids Res. 2009 Aug 26.

Paper

Abstract

Paired end mapping of chromosomal fragments has been used in human cells to identify numerous structural variations in chromosomes of individuals and of cancer cell lines; however, the molecular, biological and bioinformatics methods for this technology are still in development. Here, we present a parallel bioinformatics approach to analyze chromosomal paired-end tag (ChromPET) sequence data and demonstrate its application in identifying gene rearrangements in the model organism Saccharomyces cerevisiae. We detected several expected events, including a chromosomal rearrangement of the nonessential arm of chromosome V induced by selective pressure, rearrangements introduced during strain construction and gene conversion at the MAT locus. In addition, we discovered several unannotated Ty element insertions that are present in the reference yeast strain, but not in the reference genome sequence, suggesting a few revisions are necessary in the latter. These data demonstrate that application of the chromPET technique to a genetically tractable organism like yeast provides an easy screen for studying the mechanisms of chromosomal rearrangements during the propagation of a species.

Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses

Posted in research by ryanlayer on October 8, 2009

Genome Res. 2009 19: 521-532

Paper

Abstract

Comprehensive understanding of functional elements in the human genome will require thorough interrogation and comparison of individual human genomes and genomic structures. Such an endeavor will require improvements in the throughputs and costs of DNA sequencing. Next-generation sequencing platforms have impressively low costs and high throughputs but are limited by short read lengths. An immediate and widely recognized solution to this critical limitation is the paired-end tag (PET) sequencing for various applications, collectively called the PET sequencing strategy, in which short and paired tags are extracted from the ends of long DNA fragments for ultra-high-throughput sequencing. The PET sequences can be accurately mapped to the reference genome, thus demarcating the genomic boundaries of PET-represented DNA fragments and revealing the identities of the target DNA elements. PET protocols have been developed for the analyses of transcriptomes, transcription factor binding sites, epigenetic sites such as histone modification sites, and genome structures. The exclusive advantage of the PET technology is its ability to uncover linkages between the two ends of DNA fragments. Using this unique feature, unconventional fusion transcripts, genome structural variations, and even molecular interactions between distant genomic elements can be unraveled by PET analysis. Extensive use of PET data could lead to efficient assembly of individual human genomes, transcriptomes, and interactomes, enabling new biological and clinical insights. With its versatile and powerful nature for DNA analysis, the PET sequencing strategy has a bright future ahead.
