Advanced Python Programming Techniques for Bioinformatics Analysis

生信与基因组学 (Bioinformatics and Genomics)

Published 2024-09-27 19:50:33

Article link: https://blog.csdn.net/LittleComputerRobot/article/details/142600996

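1. Statistical calculations with the statistics module

The statistics module is part of the Python standard library and provides functions for basic statistical calculations such as the mean, median, mode, variance, and standard deviation.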

import statistics
import random

random.seed(123)
list_num = [random.randint(1,10) for _ in range(5)]
print(list_num)
# [1, 5, 2, 7, 5]

# Mean
mean = statistics.mean(list_num)
# Median
median = statistics.median(list_num)
# Sample standard deviation (statistics.variance would return the variance instead)
std = statistics.stdev(list_num)
# Mode
mode = statistics.mode(list_num)

print(f"mean: {mean}")
print(f"median: {median}")
print(f"std: {std:.3f}")
print(f"mode: {mode}")

# [1, 5, 2, 7, 5]
# mean: 4
# median: 5
# std: 2.449
# mode: 5
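
The statistics module distinguishes sample statistics (variance, stdev, which divide by n - 1) from population statistics (pvariance, pstdev, which divide by n). A minimal sketch of the difference, using the same five values; the variable names are illustrative only:

```python
import statistics

values = [1, 5, 2, 7, 5]

# Sample statistics divide the sum of squared deviations by n - 1
sample_var = statistics.variance(values)   # 24 / 4
sample_std = statistics.stdev(values)      # square root of the sample variance

# Population statistics divide by n
pop_var = statistics.pvariance(values)     # 24 / 5
pop_std = statistics.pstdev(values)

print(f"sample variance: {sample_var}, sample std: {sample_std:.3f}")
print(f"population variance: {pop_var}, population std: {pop_std:.3f}")
# sample variance: 6, sample std: 2.449
# population variance: 4.8, population std: 2.191
```

Which pair is appropriate depends on whether the values are a sample drawn from a larger population or the whole population itself.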

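2. Using the strategy pattern

The strategy pattern separates the implementation of an algorithm from the code that uses it, so an algorithm can be changed or swapped without affecting the rest of the code.

It has three parts:

Strategy interface: defines the common interface shared by all supported algorithms.
Concrete strategies: implement the strategy interface with a specific algorithm or method.
Context: holds a strategy object and uses it to invoke the concrete algorithm.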

from abc import ABC, abstractmethod


class MappingMethod(ABC):
    """Strategy interface."""

    @abstractmethod
    def mapping(self) -> str:
        """Run the alignment and return a status message."""


class BwaMem(MappingMethod):
    """Concrete strategy: alignment with bwa mem."""

    def mapping(self) -> str:
        return "Using bwa mem..."


class Bowtie2(MappingMethod):
    """Concrete strategy: alignment with bowtie2."""

    def mapping(self) -> str:
        return "Using bowtie2..."


class Alignment:
    """Context class that delegates to the selected strategy."""

    def __init__(self, method: MappingMethod):
        self._method = method

    def set_method(self, method: MappingMethod) -> None:
        self._method = method

    def run_alignment(self) -> str:
        return self._method.mapping()


if __name__ == '__main__':
    # Create an alignment that uses bwa mem
    alignment = Alignment(BwaMem())
    print(alignment.run_alignment())

    # Switch to bowtie2
    alignment.set_method(Bowtie2())
    print(alignment.run_alignment())

    # Using bwa mem...
    # Using bowtie2...
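
Because Python functions are first-class objects, the same idea can also be sketched without classes: each strategy is a plain function, and a dictionary plays the role of the strategy lookup. This is one possible variant, not the only way to structure it; the function names, the ALIGNERS registry, and the file name are hypothetical, and the functions only return a message rather than running a real aligner.

```python
from typing import Callable, Dict


# Hypothetical strategy functions: each returns a message instead of running a tool
def bwa_mem(fastq_path: str) -> str:
    return f"Using bwa mem on {fastq_path}..."


def bowtie2(fastq_path: str) -> str:
    return f"Using bowtie2 on {fastq_path}..."


# Registry mapping a strategy name to its implementation
ALIGNERS: Dict[str, Callable[[str], str]] = {
    "bwa-mem": bwa_mem,
    "bowtie2": bowtie2,
}


def run_alignment(aligner: str, fastq_path: str) -> str:
    """Look up the strategy by name and call it."""
    try:
        strategy = ALIGNERS[aligner]
    except KeyError:
        raise ValueError(f"Unknown aligner: {aligner!r}") from None
    return strategy(fastq_path)


print(run_alignment("bwa-mem", "sample.fastq"))
print(run_alignment("bowtie2", "sample.fastq"))
# Using bwa mem on sample.fastq...
# Using bowtie2 on sample.fastq...
```

The class-based version is easier to extend when strategies carry state or several methods; the function-based version is shorter when each strategy is a single call.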

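3. Accessing tuple elements by field name

Use namedtuple to access tuple elements by field name rather than by positional index, which makes the code easier to read.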

from collections import namedtuple

# Create a namedtuple type with two fields: sample_id and sample_name
SampleInfo = namedtuple('SampleInfo', ['sample_id', 'sample_name'])

# Instantiate a SampleInfo object
sample = SampleInfo("sample-01", 'test')

# Access the fields by name
print(sample.sample_id)
print(sample.sample_name)
# sample-01
# test
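
When type hints or default values are wanted, typing.NamedTuple offers a class-based way to declare the same kind of record. A minimal sketch, assuming Python 3.8 or later; the SampleRecord class and its read_count field are hypothetical additions for illustration:

```python
from typing import NamedTuple


class SampleRecord(NamedTuple):
    """Hypothetical record type with typed fields."""
    sample_id: str
    sample_name: str
    read_count: int = 0   # assumed extra field with a default value


record = SampleRecord("sample-01", "test")
print(record.sample_id)    # sample-01
print(record.read_count)   # 0

# Named tuples are immutable; _replace returns a modified copy
updated = record._replace(read_count=1000)
print(updated._asdict())
# {'sample_id': 'sample-01', 'sample_name': 'test', 'read_count': 1000}

# Tuple unpacking still works as with an ordinary tuple
sample_id, sample_name, read_count = updated
```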

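4. Working with queues using deque

A deque is a double-ended queue that supports adding and removing elements at both ends. Appending and popping at either end of a deque takes O(1) time, whereas list.insert(0, x) and list.pop(0) take O(n) time because the remaining elements must be shifted, so a deque is more efficient than a list for queue operations.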

from collections import deque

# Create a double-ended queue
queue = deque(['sample1', 'sample2', 'sample3'])
print(queue)

# Add one element on the left and one on the right
queue.appendleft('sample0')
queue.append('sample4')
print(queue)

# Remove one element from the left and one from the right
queue.popleft()
queue.pop()
print(queue)

# deque(['sample1', 'sample2', 'sample3'])
# deque(['sample0', 'sample1', 'sample2', 'sample3', 'sample4'])
# deque(['sample1', 'sample2', 'sample3'])
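
A deque created with maxlen discards the oldest element automatically once it is full, which makes it a convenient sliding window. As an illustrative sketch (the sliding_gc function and the toy sequence are hypothetical), the GC fraction of each window along a sequence could be computed like this:

```python
from collections import deque


def sliding_gc(sequence: str, window: int):
    """Yield the GC fraction of each full window along the sequence."""
    bases = deque(maxlen=window)   # the oldest base is dropped automatically when full
    for base in sequence.upper():
        bases.append(base)
        if len(bases) == window:
            gc = sum(1 for b in bases if b in "GC")
            yield gc / window


print(list(sliding_gc("ATGCGC", 4)))
# [0.5, 0.75, 1.0]
```

For long sequences, keeping a running GC count instead of recounting each window would be faster; the version here favors readability.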

5. Setting calculation precision with the decimal module

from decimal import Decimal, getcontext

# Set the precision to 3 significant digits (not 3 decimal places)
getcontext().prec = 3

a = Decimal('1.21212')
b = Decimal('1.323')

print(a+b)
# 2.54
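
Note that prec counts significant digits and that getcontext().prec changes the setting for the whole thread. A minimal sketch of two alternatives, shown under the assumption that they fit the use case: localcontext() limits the precision change to a block, and quantize() rounds to a fixed number of decimal places. The choice of ROUND_HALF_UP is an assumption for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext

a = Decimal('1.21212')
b = Decimal('1.323')

# localcontext() restores the previous context when the block exits
with localcontext() as ctx:
    ctx.prec = 3                                    # 3 significant digits
    sig3 = a + b
    large = Decimal('123.456') + Decimal('0')

print(sig3)    # 2.54
print(large)   # 123

# quantize() rounds to a fixed number of decimal places instead
total = (a + b).quantize(Decimal('0.001'), rounding=ROUND_HALF_UP)
print(total)   # 2.535
```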

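6. Running tasks concurrently with coroutines

Coroutines run concurrently on a single thread: while one task is waiting, the event loop switches to another. This is concurrency rather than true parallelism.

import asyncio


async def run_fastq_qc(fastq_path: str):
    print(f"Run fastqc, input fastq path: {fastq_path}")
    # asyncio.sleep stands in for the real awaitable work
    await asyncio.sleep(1)
    print("Finish fastqc!")


async def run_mapping(fastq_path: str):
    print(f"Run bwa mem mapping, input fastq path: {fastq_path}")
    await asyncio.sleep(3)
    print("Finish bwa mem mapping!")


async def run_async(fastq_path: str):
    # gather runs both coroutines concurrently, so the total time is about 3 seconds rather than 4
    await asyncio.gather(run_fastq_qc(fastq_path), run_mapping(fastq_path))


asyncio.run(run_async(fastq_path="/path/sample.fastq"))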
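
To check that the two steps really overlap, the elapsed time can be measured and the results collected from asyncio.gather, which returns them in the order the awaitables were passed. A minimal sketch with shortened sleeps; fake_step and main are hypothetical names, and asyncio.sleep is only a placeholder for real work:

```python
import asyncio
import time


async def fake_step(name: str, seconds: float) -> str:
    """Placeholder for a pipeline step; asyncio.sleep stands in for real work."""
    await asyncio.sleep(seconds)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # Both steps wait at the same time on one event loop
    results = await asyncio.gather(
        fake_step("fastqc", 0.2),
        fake_step("mapping", 0.5),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results)
# ['fastqc done', 'mapping done']
print(f"elapsed: {elapsed:.1f}s")   # close to the longest step, not the sum of both
```

A blocking call such as time.sleep or subprocess.run inside a coroutine would stall the event loop; for real command-line tools, asyncio.create_subprocess_exec is one option that keeps the loop responsive.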