17  CCR5 Scavenger Hunt

Author: Dr Randy Johnson

Affiliation: Hood College

Published: August 20, 2025

Today we will explore some of the resources we discussed and see what information we can uncover regarding human CCR5.

Instructions

You may work in small groups to complete the following tasks, but everyone should compile their own report. For each task, record the information you find. Include the database used, your specific search terms, the accession numbers/IDs you find, and the key information extracted. Provide the direct URL to the relevant entry or search results if possible.
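If you keep your notes in a Quarto document, a small Python chunk can help you record each finding in a consistent shape. The sketch below is only one possible layout: the `Finding` class, its field names, and the `to_markdown` helper are illustrative assumptions that mirror the items listed above, and the example values are placeholders rather than real search results.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One row of the report. Field names are illustrative, not required."""
    task: str
    database: str
    search_terms: str
    accessions: list = field(default_factory=list)
    key_info: str = ""
    url: str = ""


def to_markdown(findings):
    """Render a list of Finding objects as a Markdown table."""
    header = "| Task | Database | Search terms | Accessions/IDs | Key information | URL |"
    rule = "|---|---|---|---|---|---|"
    rows = [
        f"| {f.task} | {f.database} | {f.search_terms} | "
        f"{', '.join(f.accessions)} | {f.key_info} | {f.url} |"
        for f in findings
    ]
    return "\n".join([header, rule, *rows])


# Placeholder entry: replace every value with what you actually find.
notes = [
    Finding(
        task="Gene record",
        database="(database name)",
        search_terms="(exact search string)",
        accessions=["(accession 1)", "(accession 2)"],
        key_info="(what you learned)",
        url="(direct link)",
    )
]
print(to_markdown(notes))
```

Pasting the printed table into your report keeps the database, search terms, accessions, and URL together for every task.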

I recommend you use Quarto in RStudio if you are familiar with it, but a Word document will be sufficient.

At the end of class, please upload a Word document (rendered by Quarto if possible) with your notes to the corresponding assignment on Blackboard.

CCR5 nucleotide sequence (the gene)

In this first part, we’ll start by looking for CCR5 at NCBI, in the Gene database and the GenBank sequence records it links to.

  • Go to the Gene database at NCBI
  • Search for the CCR5 gene and identify:
    • The official gene symbol
    • Full gene name
    • Number of sequences for human CCR5?
    • Accession numbers (including links)
    • Chromosome location
    • Look at rs333 in the graphical display. This 32 bp deletion (CCR5Δ32) confers protection against HIV infection. What stands out to you about the number of base pairs deleted in this variant?
    • Any other features that look interesting?
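The same gene search can also be expressed as a URL, which makes it easy to record your exact search terms in the report. The sketch below uses NCBI's E-utilities `esearch` endpoint. The helper names `esearch_url` and `fetch_ids` are made up for this example, and the query string is just one reasonable way to phrase the search, not the only one.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term, retmax=20):
    """Build an E-utilities ESearch URL for the given database and query."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def fetch_ids(url):
    """Run the search and return the list of matching record IDs.

    Needs network access, so it is defined here but not called.
    """
    with urlopen(url) as response:
        return json.load(response)["esearchresult"]["idlist"]


# Assumed query: gene name plus organism, using Entrez field tags.
gene_query = "CCR5[Gene Name] AND Homo sapiens[Organism]"
gene_url = esearch_url("gene", gene_query)
print(gene_url)
```

Calling `fetch_ids(gene_url)` from a machine with internet access should return the matching Gene IDs, which you can then look up in the browser to answer the questions above.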

CCR5 function and structure (the protein)

Next, we’ll explore the functional and structural information found in UniProt, PDB and AlphaFold. Remember to include direct URLs to the information you find. Are there any images or movies you could include in your document?

  • Go to UniProt and search for CCR5 and identify:
    • The primary UniProt accession ID
    • The organism this protein belongs to
    • A concise summary of its main function
    • The protein family or classification
    • Any known domains or features mentioned (e.g., transmembrane regions, binding sites)
    • Predicted subcellular location
  • Search the Protein Data Bank (PDB) for CCR5
    • Are there any experimental 3D structures of the full human CCR5 protein available? If so, note the PDB ID(s) and the experimental method used (e.g., X-ray crystallography, NMR spectroscopy, Cryo-EM). If not, describe what kind of related structures you found (e.g., partial structures, complexes, or related receptors).
  • Search AlphaFold for CCR5
    • Is there a predicted 3D structure for the human CCR5 protein? If yes, note and link its AlphaFold ID (e.g., AF-PXXXXX-F1).
    • How does its availability compare to what you found in PDB (e.g. is the full protein predicted here when no full experimental structure was found in PDB)?
    • Is there a structure for CCR5Δ32? If so, how does it compare with the reference version?
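Once you have a UniProt accession, the matching AlphaFold identifier and entry link follow a predictable pattern, and the UniProt search itself can be written as a URL. The sketch below assumes the UniProt REST search endpoint and the AlphaFold entry URL layout as they exist at the time of writing. The function names are hypothetical, and `P12345` is a stand-in accession, not the answer to the task.

```python
from urllib.parse import urlencode

UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def uniprot_search_url(gene, taxon_id=9606, size=5):
    """Build a UniProtKB search URL for a gene symbol in one organism.

    taxon_id 9606 is the NCBI Taxonomy ID for Homo sapiens.
    """
    params = {
        "query": f"gene_exact:{gene} AND organism_id:{taxon_id}",
        "fields": "accession,id,protein_name,organism_name",
        "format": "json",
        "size": size,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


def alphafold_id(accession):
    """Return the AlphaFold model ID in the AF-PXXXXX-F1 form."""
    return f"AF-{accession}-F1"


def alphafold_entry_url(accession):
    """Return the AlphaFold database entry page for a UniProt accession."""
    return f"https://alphafold.ebi.ac.uk/entry/{accession}"


placeholder = "P12345"  # stand-in only; use the accession you found
print(uniprot_search_url("CCR5"))
print(alphafold_id(placeholder))
print(alphafold_entry_url(placeholder))
```

Printing these links alongside the accession gives you the direct URLs the report asks for.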

CCR5 in biological systems

Lastly, we’ll look at the KEGG pathway database to learn more about the role CCR5 plays in the cell and more generally in human biology.

  • Search for CCR5 in the KEGG pathway database (you can start at the main KEGG site and navigate to “Pathways”) and identify:
    • At least one biological pathway involving CCR5
    • Is it linked to any disease pathways (e.g., HIV-1 infection)?
    • Briefly describe CCR5’s role within one of these pathways
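KEGG also offers a plain-text REST interface, where a `link` request returns one tab-separated pair per line (for example, a gene entry and a pathway it belongs to). The sketch below builds such a URL and parses that two-column format. The function names are made up for illustration, the `hsa:<NCBI Gene ID>` entry form is an assumption about how human genes are keyed, and the sample text uses placeholder IDs rather than real KEGG output.

```python
def kegg_link_url(target_db, entry):
    """Build a KEGG REST 'link' URL, e.g. pathways linked to a gene entry."""
    return f"https://rest.kegg.jp/link/{target_db}/{entry}"


def parse_kegg_link(text):
    """Parse tab-separated 'source<TAB>target' lines into a list of pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, such as a trailing newline
        source, target = line.split("\t")
        pairs.append((source, target))
    return pairs


# Placeholder response: the IDs below are invented for this example.
sample = "hsa:0000\tpath:hsa00000\nhsa:0000\tpath:hsa99999\n"
pathways = [target for _, target in parse_kegg_link(sample)]

print(kegg_link_url("pathway", "hsa:0000"))
print(pathways)
```

Each pathway ID returned this way can be opened on the KEGG site to view the map and locate CCR5 within it.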

More data

Now that we are starting to get a clear picture of CCR5, we may want to find some additional raw data to further our investigation. We’ll look for these data at SRA.

  • Search NCBI’s Sequence Read Archive (SRA) for CCR5. Given that CCR5 is a key receptor for HIV entry, search for studies related to “HIV infection” or “CCR5 knockout” in Homo sapiens. You may not find raw reads of the CCR5 gene itself, but rather studies where its expression or variants are relevant in a broader experimental context.
    • Find the accession number (SRP/SRR/DRP/DRR) of at least one relevant study (e.g. related to human HIV or immune response).
    • Briefly describe the type of experiment (e.g., RNA-seq, Whole Genome Sequencing) and organism for this SRA entry.
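The accession prefixes above encode two things: which archive issued the accession and what level of record it names. The sketch below decodes that convention. The `classify` function and its lookup tables are written for this example, and the accessions in the demo loop are placeholders, not real studies.

```python
import re

# First letter: which archive issued the accession.
ARCHIVES = {"S": "NCBI SRA", "E": "EMBL-EBI ENA", "D": "DDBJ DRA"}
# Third letter: what kind of record it is.
LEVELS = {"P": "study", "X": "experiment", "S": "sample", "R": "run"}

ACCESSION = re.compile(r"^([SED])R([PXSR])(\d{6,})$")


def classify(accession):
    """Return (archive, record level) for an SRA-style accession, else None."""
    match = ACCESSION.match(accession.strip().upper())
    if match is None:
        return None
    return ARCHIVES[match.group(1)], LEVELS[match.group(2)]


# Placeholder accessions, for illustration only.
for acc in ["SRP000000", "SRR000000", "DRP000000", "DRR000000"]:
    print(acc, classify(acc))
```

A study (SRP/DRP) usually groups many runs (SRR/DRR), so noting both levels in your report makes the entry easier to find again.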

Discussion

  • Which database did you find most helpful for understanding the function of CCR5? Why?

  • Which database was most useful for understanding the molecular sequence and genomic location of CCR5?

  • How do the experimentally determined 3D structures (from PDB) compare to predicted ones (from AlphaFold) for CCR5 or related proteins? Why might both types of structural information be important in research?

  • Considering the information you’ve gathered on CCR5’s sequence and function, how might you use a sequence alignment tool (like BLAST, which we’ll discuss in more detail next week) to find similar genes or proteins in other organisms or to investigate different versions (variants) of CCR5 in humans?

  • Why might a researcher deposit raw data related to CCR5 (or the disease it’s involved in) in SRA, considering that SRA stores raw sequencing reads?

Choose your own adventure

If there is time remaining, repeat the steps above for a new gene of your choosing.