When data is uploaded to PipeBio, we parse it and display the results to users. Sequence and related data can then be exported in a wide range of common bioinformatics formats, including .ab1, GenBank, FASTA and FASTQ.
But what if you want the original, unparsed file back? Many PipeBio customers need an audit trail of their sequences when taking drug candidates to clinical trials.
Exporting the originally uploaded files from PipeBio is easy via the online web application. Simply select the file to export, choose File > export from the menu and select “Source files” from the dropdown.
The source files might be Sanger trace files (.ab1), Illumina NGS reads (FASTA or FASTQ) or PacBio files.
You can also export these source files using the PipeBio REST API. A full working example with daily integration tests is available in the public PipeBio GitHub repo.
Here's how to download the original files from the PipeBio Bioinformatics cloud in three steps:
1. Authenticate with PipeBio
[gist id=2ae466fa041559eca56ebe5ff6d8c80e file=authenticate.py]
2. Request a pre-signed URL for the document you want to export
[gist id=2ae466fa041559eca56ebe5ff6d8c80e file=get_signed_url.py]
3. Download data from the pre-signed URL
[gist id=2ae466fa041559eca56ebe5ff6d8c80e file=download_data.py]
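
As a rough illustration of step 3, a pre-signed URL can generally be fetched with an ordinary HTTP GET. The sketch below uses only the Python standard library. The function name `download_from_signed_url`, its parameters and the output path are hypothetical and are not taken from the PipeBio examples. How the URL is obtained (steps 1 and 2) is left out because it depends on the PipeBio API.

```python
import shutil
import urllib.request
from pathlib import Path


def download_from_signed_url(signed_url: str, destination: str,
                             chunk_size: int = 1024 * 1024) -> int:
    """Stream the content behind `signed_url` to `destination`.

    Hypothetical helper for illustration only. Returns the number of
    bytes written to disk.
    """
    target = Path(destination)
    # Create the output folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)

    # Assumption: the pre-signed URL already embeds its authorisation,
    # so no extra headers are sent with the request.
    with urllib.request.urlopen(signed_url) as response, open(target, "wb") as out_file:
        # Copy in chunks so large NGS files are never held fully in memory.
        shutil.copyfileobj(response, out_file, chunk_size)

    return target.stat().st_size
```

Calling `download_from_signed_url(url, "exports/original.ab1")` with the URL returned in step 2 would write the file to disk and return its size in bytes. Pre-signed URLs usually expire after a short time, so it is safest to start the download soon after requesting the URL.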