𝗗𝗲𝗽𝗹𝗼𝘆𝗶𝗻𝗴 𝗚𝗿𝗮𝗱𝗶𝗼 𝗼𝗻 𝗖𝗹𝗼𝘂𝗱 𝗥𝘂𝗻

📅2 days ago⏱1 min read

I am moving my bioinformatics workflows from Mac M4 to an RTX 3090. This shift allows me to test how hardware changes performance for genomic analysis.

My current testing plan focuses on comparing traditional tools with AI-driven solutions.

𝗧𝗲𝘀𝘁𝗶𝗻𝗴 𝗣𝗿𝗼𝗴𝗿𝗮𝗺

Scanpy: Completed on RTX 3090. It finished PBMC 3k analysis in 60 seconds and identified 9 cell clusters.
DeepVariant vs GATK: This is my top priority. I will compare GATK HaplotypeCaller (CPU) against Google DeepVariant (GPU) to measure speed and accuracy.
PrimateAI-3D: Moving from Mac M4 to RTX 3090 using Docker.
ProkBERT: Testing promoter prediction and phage detection using BERT architecture.

𝗚𝗼𝗼𝗴𝗹𝗲 𝗖𝗹𝗼𝘂𝗱 𝘃𝘀 𝗟𝗼𝗰𝗮𝗹 𝗛𝗮𝗿𝗱𝘄𝗮𝗿𝗲

I am also comparing Google Cloud infrastructure with local GPU setups.

Google Cloud uses tools like BigQuery Genomics for SQL queries and Dataflow for parallel processing.
Local RTX 3090 setups provide high speed for specific tasks like DeepVariant.
Google Cloud scales better for massive datasets.
Local hardware is better for controlled, repetitive testing.

𝗡𝗲𝘅𝘁 𝗦𝘁𝗲𝗽𝘀

I will spend the next few days running performance benchmarks. I want to see the exact difference in resource usage between CPU and GPU when calling variants.

I will also build a VCF Intelligent Interpreter. This tool will use Large Language Models to explain clinical significance in plain language.

Source: https://dev.to/jh5_pulse/zai-cloud-run-shang-bu-shu-gradio-ying-yong-3f35

Optional learning community: https://t.me/GyaanSetuAi

𝗗𝗲𝗽𝗹𝗼𝘆𝗶𝗻𝗴 𝗚𝗿𝗮𝗱𝗶𝗼 𝗼𝗻 𝗖𝗹𝗼𝘂𝗱 𝗥𝘂𝗻

Continue reading

𝗦𝗽𝗲𝗰𝘂𝗹𝗮𝘁𝗶𝘃𝗲 𝗗𝗲𝗰𝗼𝗱𝗶𝗻𝗴: 𝗙𝗮𝘀𝘁𝗲𝗿 𝗟𝗟𝗠 𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲

𝗪𝗵𝘆 𝗬𝗼𝘂𝗿 𝗡𝗲𝘅𝘁 𝗔𝗜 𝗧𝗼𝗼𝗹 𝗠𝗶𝗴𝗵𝘁 𝗕𝗲 𝗕𝗼𝘁𝘁𝗹𝗲𝗻𝗲𝗰𝗸𝗲𝗱 𝗕𝘆 𝗧𝗵𝗲 𝗪𝗿𝗼𝗻𝗴 𝗖𝗵𝗶𝗽

𝗗𝗲𝗽𝗹𝗼𝘆𝗶𝗻𝗴 𝗕𝗶𝗼𝗶𝗻𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗰𝘀 𝗧𝗼𝗼𝗹𝘀 𝗼𝗻 𝗮𝗻 𝗜𝘀𝗼𝗹𝗮𝘁𝗲𝗱 𝗦𝗲𝗿𝘃𝗲𝗿

𝗗𝗲𝗽𝗹𝗼𝘆𝗶𝗻𝗴 𝗚𝗿𝗮𝗱𝗶𝗼 𝗼𝗻 𝗖𝗹𝗼𝘂𝗱 𝗥𝘂𝗻

𝗚𝗼𝗼𝗴𝗹𝗲 𝗗𝗲𝗲𝗽𝗦𝗼𝗺𝗮𝘁𝗶𝗰 𝗮𝗻𝗱 𝘁𝗵𝗲 𝗙𝘂𝘁𝘂𝗿𝗲 𝗼𝗳 𝗖𝗮𝗻𝗰𝗲𝗿 𝗚𝗲𝗻𝗼𝗺𝗶𝗰𝘀