BDS - Part II - Prerequisites - Chapter 2 : Setting Up and Managing - Reading Notes(2)

Wildcards and "Argument list too long"

OS X and Linux systems limit the number of arguments that can be supplied to a command (more technically, the limit is on the total length of the arguments). Exceeding it, for example with a wildcard that matches a very large number of files, produces the "Argument list too long" error. In general, it’s best to be as restrictive as possible with wildcards, which also protects against accidental matches. Shell wildcards allow you to match specific characters or ranges of characters. For example, we could match the characters U, V, W, X, and Y with either [UVWXY] or [U-Y] (the two are equivalent). Back to our example, we could exclude the C sample using either:

$ ls zmays[AB]_R1.fastq
zmaysA_R1.fastq zmaysB_R1.fastq
$ ls zmays[A-B]_R1.fastq
zmaysA_R1.fastq zmaysB_R1.fastq
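
As for the argument-length limit mentioned at the start of this section, the following is a minimal sketch of one common way to work around it, not the only one. It assumes a throwaway directory created with mktemp and reuses the zmays file names purely for illustration. The pattern is quoted so that find does the matching instead of the shell, and xargs hands the names to the command in batches that fit under the limit.

```shell
# The system's limit on the combined length of command arguments, in bytes.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"

# Hypothetical scratch directory with a few empty files (illustration only).
workdir=$(mktemp -d)
touch "$workdir/zmaysA_R1.fastq" "$workdir/zmaysB_R1.fastq" "$workdir/zmaysC_R1.fastq"

# The quoted pattern is matched by find, not expanded by the shell.
# xargs then passes the names to ls in batches small enough for the limit.
n_files=$(find "$workdir" -name 'zmays*_R1.fastq' -print0 | xargs -0 ls | wc -l | tr -d ' ')
echo "files listed: $n_files"

rm -r "$workdir"
```

With only three files the limit is never reached, but the same pipeline keeps working when a wildcard would match hundreds of thousands of files.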

There’s one very important caveat: ranges operate on character ranges, not numeric ranges like 13 through 30. This means that wildcards like snps_[10-13].txt will not match files snps_10.txt, snps_11.txt, snps_12.txt, and snps_13.txt. The bracket expression [10-13] is read as a set of single characters (1, the range 0 through 1, and 3), so it matches exactly one character: 0, 1, or 3. For a numeric range, use brace expansion instead, such as snps_{10..13}.txt.
While wildcard matching and brace expansion may seem to behave similarly, they are slightly different. Wildcards only expand to existing files that match them, whereas brace expansions always expand, regardless of whether the corresponding files or directories exist.
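
The difference can be seen in a small experiment. This is a sketch that assumes Bash (for the {10..13} sequence syntax) and a scratch directory made with mktemp. The snps_ file names follow the example above, and which files exist is an arbitrary choice for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch snps_1.txt snps_3.txt snps_10.txt snps_11.txt

# [10-13] is a set of single characters (1, 0 through 1, and 3),
# so only names with one character in that position match.
glob_matches=$(echo snps_[10-13].txt)
echo "wildcard: $glob_matches"

# Brace expansion generates every name in the numeric range,
# including snps_12.txt and snps_13.txt, which do not exist here.
brace_names=$(echo snps_{10..13}.txt)
echo "braces:   $brace_names"

cd - > /dev/null
rm -r "$workdir"
```

The wildcard line lists only snps_1.txt and snps_3.txt, while the brace line lists all four names from 10 through 13, whether or not the files exist.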

Common Unix filename wildcards

Wildcard    What it matches
*           Zero or more characters (but ignores hidden files starting with a period)
?           One character (also ignores hidden files)
[A-Z]       Any character in the supplied alphanumeric range

Leading Zeros and Sorting
Use leading zeros (e.g., file-0021.txt rather than file-21.txt) when naming files. This is useful because sorting the names lexicographically (as ls does) then gives the correct numeric ordering. Leading zeros aren’t just useful for filenames; they are also the best way to name genes, transcripts, and so on. Projects like Ensembl use this scheme in naming their genes (e.g., ENSG00000164256).
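
The effect on sorting can be checked with a short sketch. It assumes a scratch directory made with mktemp, uses hypothetical file numbers 2, 10, and 21, and sets LC_ALL=C so that ls sorts strictly by byte value regardless of the local language settings.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Without padding, lexicographic order puts file-10 before file-2.
touch file-2.txt file-10.txt file-21.txt
unpadded=$(LC_ALL=C ls | paste -sd' ' -)
echo "unpadded: $unpadded"

# printf with %04d pads each number with zeros to a width of four digits.
rm file-*.txt
for i in 2 10 21; do
    touch "$(printf 'file-%04d.txt' "$i")"
done
padded=$(LC_ALL=C ls | paste -sd' ' -)
echo "padded:   $padded"

cd - > /dev/null
rm -r "$workdir"
```

The padding width should be chosen for the largest number you expect, since a file numbered beyond that width would sort out of order again.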

Markdown for Project Notebooks
It’s very important to keep a project notebook containing detailed information about the chronology of your computational work, steps you’ve taken, information about why you’ve made decisions, and of course all pertinent information to reproduce your work. Plain text is a future-proof format. Additionally, plain-text project notebooks can also be put under version control.
A lightweight markup language called Markdown is a plain-text format that is easy to read and painlessly incorporated into typed notes, and can also be rendered to HTML or PDF.

Markdown Formatting Basics
Features: text can be broken down into hierarchical sections, there’s syntax for both code blocks and inline code, and it’s easy to embed links and images.
John Gruber’s full Markdown syntax specification is available on his website: https://daringfireball.net/projects/markdown/syntax
Here is a basic Markdown document illustrating the format:
[Figure: Markdown format]
[Figure: HTML rendering of the Markdown notebook]

Using Pandoc to Render Markdown to HTML
We’ll use Pandoc (http://johnmacfarlane.net/pandoc/), a popular document converter, to render our Markdown documents to valid HTML. These HTML files can then be shared with collaborators or hosted on a website. See the Pandoc installation page (http://bit.ly/pan-install) for instructions on how to install Pandoc on your system.
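
As a rough sketch of the whole workflow, the commands below write a small Markdown notebook and render it with Pandoc. The notebook content, the file names notebook.md and notebook.html, and the scratch directory are all hypothetical. The rendering step is skipped if pandoc is not installed.

```shell
# Hypothetical scratch directory and notebook file (illustration only).
workdir=$(mktemp -d)
notebook="$workdir/notebook.md"

# A tiny Markdown document: headings, inline code, an indented code block, and a link.
cat > "$notebook" <<'EOF'
# Example Project Notebook

## Downloading data

Files were saved under `data/`. Command used:

    ls zmays[AB]_R1.fastq

See the [Markdown syntax](https://daringfireball.net/projects/markdown/syntax) for more.
EOF

# Render to HTML only when pandoc is available on this system.
if command -v pandoc > /dev/null 2>&1; then
    pandoc --from markdown --to html "$notebook" > "$workdir/notebook.html"
    echo "Rendered $workdir/notebook.html"
else
    echo "pandoc not found; skipping the HTML rendering step"
fi
```

The resulting notebook.html can be opened in a browser. Because the notebook itself stays plain text, it can still be put under version control alongside the project.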
