Ryan's Blog

Running commands in parallel with xargs

Posted in programming by ryanlayer on February 18, 2016

Because GNU Parallel is too complex for me, I use xargs. The -P N option runs up to N commands at a time, each in its own process, and since each command is likely to be some chain of pipes and file writes, I wrap it in sh -c 'foo | bar > baz' so that a shell handles the pipes and redirection. You will want to set the -d option (a GNU xargs extension) to whatever delimiter separates your input items. Here I use a space (-d ' '), but if your input is coming from a file you may need "\t" or "\n".

For example, if you have a file f.bed:

#chr start end
10   100   200
2    200   300
1    300   400
X    400   500
Y    500   600
15   600   700
7    100   200
1    200   300
3    300   400
5    400   500

And you want to split the file by chromosome, sort each piece by start, keep the header, and run 10 jobs at a time. Then you could:

echo -n $(seq 1 22) X Y \
| xargs -d ' ' -I{} -P 10 \
sh -c '(head -n1 f.bed; grep -w "^{}" f.bed | sort -n -k 2) > {}.f.bed'

This sends a list of chromosomes (echo -n $(seq 1 22) X Y) to xargs. The -n matters: without it, the trailing newline would become part of the last item. That list is then split by space (-d ' '). Each element in the split list is used to create a new command in which every "{}" is replaced by the element's value (-I{}). xargs then manages the execution of these commands, in this case running 10 of them at a time (-P 10).


Structural Variation Graph (sv graph) Thoughts

Posted in programming, research by ryanlayer on January 5, 2010

The data structure consists of:

– a set of chromosomes

– each chromosome is represented by a name and an ordered (doubly linked) list of nodes

– each node represents one tag of a pair

– each node has a two-way link to the next node in chromosome order (the node with the next-largest offset) and a list of the nodes it is paired with. The links to the pairs are one-way.

The chromosome name is not part of the node struct. Each node does have a pointer back to its chromosome structure, and that chromosome struct contains the name. This prevents a potentially long chromosome name from being stored over and over (once in each node). When we read the nodes from a file, we must pass in a char pointer so that the reader can set the name of the chromosome. We then use this char pointer to put the node into the proper chromosome structure.
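Here is a rough sketch of how that layout could look in C. All of the names (sv_graph, chromosome, node, get_chromosome, add_node, add_pair) are made up for illustration, and keeping the set of chromosomes in a singly linked list is just an assumption to keep the example short. Here add_node takes the chromosome name as an input char pointer and uses it to find (or create) the right chromosome struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct chromosome;

/* One tag of a pair. The chromosome name is NOT stored here. */
struct node {
    long offset;               /* position on the chromosome */
    struct chromosome *chr;    /* back-pointer; the name lives there */
    struct node *prev, *next;  /* two-way links in chromosome order */
    struct node **pairs;       /* one-way links to paired nodes */
    size_t n_pairs;
};

/* A name plus an ordered, doubly linked list of nodes. */
struct chromosome {
    char *name;
    struct node *head, *tail;
    struct chromosome *next_chr;  /* assumption: chromosomes kept in a simple list */
};

struct sv_graph {
    struct chromosome *chrs;
};

/* Find the chromosome with this name, creating it if needed. */
struct chromosome *get_chromosome(struct sv_graph *g, const char *name)
{
    struct chromosome *c;
    for (c = g->chrs; c != NULL; c = c->next_chr)
        if (strcmp(c->name, name) == 0)
            return c;
    c = calloc(1, sizeof *c);
    assert(c != NULL);
    c->name = malloc(strlen(name) + 1);  /* the only copy of the name */
    assert(c->name != NULL);
    strcpy(c->name, name);
    c->next_chr = g->chrs;
    g->chrs = c;
    return c;
}

/* Create a node and insert it into its chromosome in offset order. */
struct node *add_node(struct sv_graph *g, const char *chr_name, long offset)
{
    struct chromosome *c = get_chromosome(g, chr_name);
    struct node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->offset = offset;
    n->chr = c;

    /* walk to the first node with a larger offset */
    struct node *after = c->head;
    while (after != NULL && after->offset <= offset)
        after = after->next;

    n->next = after;
    n->prev = after ? after->prev : c->tail;
    if (n->prev) n->prev->next = n; else c->head = n;
    if (after) after->prev = n; else c->tail = n;
    return n;
}

/* Record a one-way pair link from a to b. */
void add_pair(struct node *a, struct node *b)
{
    struct node **p = realloc(a->pairs, (a->n_pairs + 1) * sizeof *p);
    assert(p != NULL);
    a->pairs = p;
    a->pairs[a->n_pairs++] = b;
}
```

To get a node's chromosome name you follow the back-pointer (n->chr->name). Since the pair links are one-way, calling add_pair(a, b) leaves b's own pair list untouched.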


Simple CUDA Program

Posted in research by ryanlayer on October 8, 2009

Getting this code to work may require some environment variable changes:

  • export LD_LIBRARY_PATH=/usr/local/cuda/lib/:$LD_LIBRARY_PATH
  • export PATH=/usr/local/cuda/bin/:$PATH

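#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <sys/time.h>

// Device kernel: each thread doubles one element of A into B.
__global__ void vecMult_d(int *A, int *B, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) { B[i] = A[i] * 2; }
}

// Host version of the same operation, for comparison.
void vecMult_h(int *A, int *B, int N)
{
    for (int i = 0; i < N; i++) { B[i] = A[i] * 2; }
}

int main()
{
    int *a_h, *b_h; // pointers to host memory; a.k.a. CPU
    int *a_d, *b_d; // pointers to device memory; a.k.a. GPU
    //int blocksize=512, grid_size, n=32000;
    int blocksize = 512, n = 1000000;
    struct timeval t1_start, t1_end, t2_start, t2_end;
    double time_d, time_h;

    // allocate arrays on host
    a_h = (int *)malloc(sizeof(int) * n);
    b_h = (int *)malloc(sizeof(int) * n);

    // allocate arrays on device
    cudaMalloc((void **)&a_d, n * sizeof(int));
    cudaMalloc((void **)&b_d, n * sizeof(int));
    dim3 dimBlock(blocksize);
    dim3 dimGrid((n + blocksize - 1) / blocksize); // round up so every element gets a thread
    for (int j = 0; j < n; j++) a_h[j] = j;

    // GPU
    // NOTE: the body of the GPU and CPU sections was missing from this post;
    // what follows is a reconstruction from the declarations above.
    cudaMemcpy(a_d, a_h, n * sizeof(int), cudaMemcpyHostToDevice);
    gettimeofday(&t1_start, 0);
    vecMult_d<<<dimGrid, dimBlock>>>(a_d, b_d, n);
    cudaDeviceSynchronize(); // kernel launches are asynchronous; wait before stopping the timer
    gettimeofday(&t1_end, 0);
    cudaMemcpy(b_h, b_d, n * sizeof(int), cudaMemcpyDeviceToHost);

    // CPU
    gettimeofday(&t2_start, 0);
    vecMult_h(a_h, b_h, n);
    gettimeofday(&t2_end, 0);

    // elapsed times in microseconds
    time_d = (t1_end.tv_sec - t1_start.tv_sec) * 1000000.0 + (t1_end.tv_usec - t1_start.tv_usec);
    time_h = (t2_end.tv_sec - t2_start.tv_sec) * 1000000.0 + (t2_end.tv_usec - t2_start.tv_usec);
    printf("%d %lf %lf\n", n, time_d, time_h);

    free(a_h);
    free(b_h);
    cudaFree(a_d);
    cudaFree(b_d);
    return 0;
}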

SOURCE: https://visualization.hpc.mil/wiki/Simple_CUDA_Program
