Metagenome-Assembled Genomes

This tutorial demonstrates how to run the nf-core/mag pipeline on the Nextflow Engine. You can find this application in the demos folder of your Jupyter notebook environment:

    • samplesheet.csv
    • mag_workflow.ipynb

    The first step is to import the nextflow package:

    from camber import nextflow

    Here’s an example of how to set up the configuration and execute a job:

    • pipeline="nf-core/mag": specifies the pipeline to run.
    • engine_size="XXSMALL": sets the engine size used to perform the job.
    • num_engines=8: sets the number of engines that run workflow tasks in parallel.

    Pipeline parameters must be defined in the params argument. To ensure the pipeline works as expected, note that:

    • "--input": "samplesheet.csv": the path of the samplesheet.csv file, relative to the current notebook. If you use local FastQ files, their paths in the samplesheet.csv content must also be relative.
    • "--outdir": "/camber_outputs": the location where the job's output data is stored.
    # Declare URLs to download necessary files
    kraken2_db = "https://raw.githubusercontent.com/nf-core/test-datasets/mag/test_data/minigut_kraken.tgz"
    centrifuge_db = "https://raw.githubusercontent.com/nf-core/test-datasets/mag/test_data/minigut_cf.tar.gz"
    busco_db = "https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2024-01-08.tar.gz"
    nf_mag_job = nextflow.create_job(
        pipeline="nf-core/mag",
        engine_size="XXSMALL",
        num_engines=8,
        params={
            "--input": "samplesheet.csv",
            "--outdir": "/camber_outputs",
            "-r": "3.4.0",
            "--kraken2_db": kraken2_db,
            "--centrifuge_db": centrifuge_db,
            "--busco_db": busco_db,
            "--skip_krona": "true",
            "--skip_gtdbtk": "true",
            "--skip_maxbin2": "true",
        },
    )
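The samplesheet.csv referenced by "--input" can also be generated programmatically. Below is a minimal sketch, assuming the nf-core/mag column schema (sample, group, short_reads_1, short_reads_2, long_reads) and placeholder read file names; verify the exact columns against the nf-core/mag documentation for the release you run.

```python
import csv

# Hypothetical sketch: generate a minimal samplesheet.csv for nf-core/mag.
# Column names and values here are assumptions; check the input schema in
# the nf-core/mag documentation for the pipeline release you run.
rows = [
    {
        "sample": "test_sample",
        "group": "0",
        "short_reads_1": "reads_R1.fastq.gz",  # placeholder file names
        "short_reads_2": "reads_R2.fastq.gz",
        "long_reads": "",  # leave empty when only short reads are available
    },
]

with open("samplesheet.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
```

Because the notebook passes a relative "--input" path, writing the file into the notebook's working directory like this keeps it resolvable by the job.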

    The next step is to check the job status:

    nf_mag_job.status

    To monitor job execution, you can stream the job logs in real time with the read_logs method:

    nf_mag_job.read_logs()
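In a script, the status check is often wrapped in a polling loop that waits for the job to finish. The sketch below uses a stub job object so it runs stand-alone; the status strings and the polling pattern are assumptions, not the Camber API, so substitute nf_mag_job and its actual status values in practice.

```python
import time

class StubJob:
    """Stand-in for a job object; only the .status attribute is modelled."""
    def __init__(self):
        self._polls = 0

    @property
    def status(self):
        # Pretend the job finishes after three status checks.
        self._polls += 1
        return "COMPLETED" if self._polls >= 3 else "RUNNING"

def wait_for_job(job, poll_seconds=0.01):
    """Poll job.status until it leaves the RUNNING state, then return it."""
    while job.status == "RUNNING":
        time.sleep(poll_seconds)
    return job.status

final_status = wait_for_job(StubJob())  # "COMPLETED" for this stub
```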

    When the job is done, you can browse and download its results and logs in two ways:

    1. Browse the data directly in the notebook environment:

    image

    2. Go to the Stash UI:

    image