
A Python example of speeding up GATK with multithreading

Editor: Python

Contents

GATK variant analysis

The following is from the web and has not been verified

GATK variant analysis

For large samples, GATK variant analysis can be slow. One way to speed it up is to split the work by chromosome and process the chromosomes in parallel on multiple threads.
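The split-by-chromosome idea can be sketched with a bounded thread pool, so that only a few GATK processes run at once. This is a minimal sketch, not a definitive implementation: the helper names `build_commands`, `run_command` and `run_per_chromosome`, the `runner` parameter and the default of 4 workers are assumptions made for illustration. The GATK options are the ones used in the script in this article.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def build_commands(chromosomes, bam, reference, prefix):
    """Build one HaplotypeCaller command (as an argument list) per chromosome."""
    return {
        chrom: [
            "gatk", "HaplotypeCaller",
            "--intervals", chrom,
            "-R", reference,
            "-I", bam,
            "-ERC", "GVCF",
            "-O", f"{prefix}-{chrom}.erc.g.vcf",
        ]
        for chrom in chromosomes
    }


def run_command(command):
    """Run one command and return its exit code."""
    return subprocess.run(command).returncode


def run_per_chromosome(commands, max_workers=4, runner=run_command):
    """Run the commands on a bounded thread pool; return {chromosome: exit code}.

    `runner` is a hypothetical hook so the scheduling can be tried out
    without GATK installed; by default it launches the real command.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            chrom: pool.submit(runner, command)
            for chrom, command in commands.items()
        }
        return {chrom: future.result() for chrom, future in futures.items()}
```

Calling `run_per_chromosome(build_commands(["CHR01", "CHR02"], "a.mkdup.bam", "ref.fasta", "flower"))` would launch one `gatk` process per chromosome (`ref.fasta` is a placeholder path). Threads are sufficient here because each one only waits for an external process, and a suitable `max_workers` depends on the memory and cores available.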

Below is a multithreaded Python script I wrote. It is for reference only, and corrections are welcome.

```python
#!/usr/bin/python3
import subprocess
import threading

bam_file = "a.mkdup.bam"
out_file_prefix = "flower"
reference = "/mnt/j/BSA/02-read-align/Tifrunner2.fasta"
chr_list = ["CHR01", "CHR02", "CHR03", "CHR04", "CHR05", "CHR06", "CHR07",
            "CHR08", "CHR09", "CHR10", "CHR11", "CHR12", "CHR13"]

# Build one HaplotypeCaller command per chromosome.
commands = []
for chrom in chr_list:
    commands.append([
        "gatk", "HaplotypeCaller",
        "--intervals", chrom,
        "-R", reference,
        "-I", bam_file,
        "-ERC", "GVCF",
        "-O", out_file_prefix + "-" + chrom + ".erc.g.vcf",
    ])


class GatkThread(threading.Thread):
    """Thread that runs one GATK command and records its exit code."""

    def __init__(self, name, command):
        super().__init__(name=name)
        self.command = command
        self.returncode = None

    def run(self):
        print("Start thread: " + self.name)
        print(" ".join(self.command))
        # Call the operating system command line to process the data.
        self.returncode = subprocess.run(self.command).returncode
        print("Exit thread: " + self.name)


# Create one thread per chromosome. Every entry is used: slicing the list
# (for example commands[0:11]) would silently skip the remaining chromosomes.
thread_list = []
for i, command in enumerate(commands):
    thread_list.append(GatkThread("Thread-" + str(i), command))

# Start all threads, then wait for all of them to finish.
for thread in thread_list:
    thread.start()
for thread in thread_list:
    thread.join()

# Report any chromosome whose command failed.
for thread in thread_list:
    if thread.returncode != 0:
        print(thread.name + " failed with exit code " + str(thread.returncode))
print("Exit the main thread after running")
```

The following is from the web and has not been verified

Merging the per-chromosome VCF files of one sample

```shell
# for i in {1..22} X Y ;do echo "-I final_chr$i.vcf" '\';done
# for i in {10..19} {1..9} M X Y ;do echo "-I final_chr$i.vcf" '\';done
module load java/1.8.0_91
GATK=/home/jianmingzeng/biosoft/GATK/gatk-4.0.3.0/gatk
$GATK GatherVcfs \
-I final_chr1.vcf \
-I final_chr2.vcf \
-I final_chr3.vcf \
-I final_chr4.vcf \
-I final_chr5.vcf \
-I final_chr6.vcf \
-I final_chr7.vcf \
-I final_chr8.vcf \
-I final_chr9.vcf \
-I final_chr10.vcf \
-I final_chr11.vcf \
-I final_chr12.vcf \
-I final_chr13.vcf \
-I final_chr14.vcf \
-I final_chr15.vcf \
-I final_chr16.vcf \
-I final_chr17.vcf \
-I final_chr18.vcf \
-I final_chr19.vcf \
-I final_chr20.vcf \
-I final_chr21.vcf \
-I final_chr22.vcf \
-I final_chrX.vcf \
-I final_chrY.vcf \
-O merge.vcf
```

When merging, note that the VCF files must be listed in the same order as the contigs appear in the header of each VCF file.
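The ordering requirement can be checked mechanically rather than by eye. The following is a minimal sketch, assuming each per-chromosome file carries the usual `##contig=<ID=...>` header lines; the function names `contig_order` and `gather_arguments` and the `files_by_contig` mapping are hypothetical helpers introduced for illustration, not part of GATK.

```python
import re

# Matches the ID field of a "##contig=<ID=...,length=...>" header line.
CONTIG_ID = re.compile(r"[<,]ID=([^,>]+)")


def contig_order(header_lines):
    """Return contig IDs in the order of the ##contig lines of a VCF header."""
    order = []
    for line in header_lines:
        if not line.startswith("#"):
            break  # first data record: the header is over
        if line.startswith("##contig="):
            match = CONTIG_ID.search(line)
            if match:
                order.append(match.group(1))
    return order


def gather_arguments(files_by_contig, header_lines, output="merge.vcf"):
    """Build GatherVcfs arguments with the inputs sorted into header order.

    `files_by_contig` maps a contig ID to its per-chromosome VCF path
    (a hypothetical structure chosen for this sketch).
    """
    args = ["GatherVcfs"]
    for contig in contig_order(header_lines):
        if contig in files_by_contig:
            args += ["-I", files_by_contig[contig]]
    args += ["-O", output]
    return args
```

The header lines could be taken from any one of the input files, for example with `open("final_chr1.vcf")`, and the resulting list appended to the `gatk` executable path before it is passed to `subprocess.run`.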



