Hi developers,
I am trying to use CFEIntact to QC a known, intact lab strain (NL4-3). However, because the tool always aligns the input against its default subtype references, it flags my sequence as "Scrambled" due to minor coordinate differences between NL4-3 and the default reference. Currently, there is no way to tell the tool that the input sequence itself is the "correct" reference.
I would like to request an option to provide a custom reference genome (e.g., --reference NL4-3.fasta). This would allow the tool to align the input sequence against a matching reference, avoiding spurious "scrambled" or "inversion" flags for known intact clones.
What I have tried:
I have already tried --ignore-scramble. It works, but it disables the scramble check entirely rather than resolving the underlying alignment issue.
Additional context:
The specific defect I am receiving is: "Sequence is plus-scrambled". When I run a separate BLAST of my NL4-3 sequence against the default HXB2 reference that CFEIntact uses for subtype B, I see high identity (~96%), but the coordinate differences and gaps appear to be fragmenting the global alignment, which triggers the false positive.
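In case it helps with reproducing this, here is a minimal sketch of how the BLAST hits from such a comparison could be inspected for ordering. It assumes default BLAST tabular output (-outfmt 6). The names Hit, parse_outfmt6 and describe_hit_order, the labels they return, and the example rows are all hypothetical; this is not CFEIntact's actual scramble logic.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """Query and subject coordinates of one BLAST hit (1-based)."""
    qstart: int
    qend: int
    sstart: int
    send: int


def parse_outfmt6(text: str) -> List[Hit]:
    """Parse default BLAST tabular output (-outfmt 6) into Hit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # In the default layout, columns 7-10 are qstart, qend, sstart, send.
        hits.append(Hit(*(int(c) for c in cols[6:10])))
    return hits


def describe_hit_order(hits: List[Hit]) -> str:
    """Hypothetical classifier: are the hits collinear on the plus strand?"""
    if not hits:
        return "no-hits"
    # For nucleotide BLAST, a minus-strand hit is reported with sstart > send.
    if any(h.sstart > h.send for h in hits):
        return "contains-minus-strand-hit"
    # Walk the hits in query order; subject starts should never decrease.
    ordered = sorted(hits, key=lambda h: h.qstart)
    sstarts = [h.sstart for h in ordered]
    return "collinear" if sstarts == sorted(sstarts) else "out-of-order"


# Made-up rows for illustration only; not real NL4-3 vs HXB2 results.
example = "\n".join([
    "query\tref\t96.0\t4000\t100\t10\t1\t4000\t1\t4010\t0.0\t7000",
    "query\tref\t96.0\t5000\t150\t12\t4100\t9100\t4200\t9200\t0.0\t8500",
])
print(describe_hit_order(parse_outfmt6(example)))
```

If the hits come back collinear on the plus strand for a given sequence, that would suggest the "plus-scrambled" call comes from how the fragments are interpreted rather than from a real rearrangement.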
Why this matters:
An option to provide a custom reference genome would be very useful to the many HIV researchers who work with engineered lab strains and need to check the intactness of their sequences. I would greatly appreciate it if this could be implemented on your end, or if you could point me to any settings I can change manually to accommodate this use case.
Thank you for your consideration.