Troubleshooting
This guide helps resolve common issues when using U-Probe.
Installation Issues
Command not found: uprobe
Problem: After installation, the uprobe command is not recognized.
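Before working through the solutions below, it can help to confirm from Python whether the command is on PATH at all. The following is a minimal sketch using only the standard library; `locate_command` is an illustrative helper name, not part of U-Probe, and the `bin` directory layout assumes a POSIX system.

```python
import os
import shutil
import site


def locate_command(name):
    """Report where `name` resolves on PATH and whether the user script dir is on PATH."""
    # Directory pip uses for scripts on --user installs (POSIX layout assumed)
    user_bin = os.path.join(site.getuserbase(), "bin")
    path_dirs = os.environ.get("PATH", "").split(os.pathsep)
    return {
        "resolved": shutil.which(name),  # None if the command is not on PATH
        "user_bin": user_bin,
        "user_bin_on_path": user_bin in path_dirs,
    }


if __name__ == "__main__":
    print(locate_command("uprobe"))
```

If `resolved` is `None` while `user_bin_on_path` is `False`, the PATH fix listed in the solutions is the likely remedy.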
Solutions:
- Check if installed correctly:
    pip list | grep uprobe
    python -c "import uprobe; print(uprobe.__version__)"
- Try using Python module syntax:
    python -m uprobe --help
- Check PATH (for --user installs):
    # Add to ~/.bashrc or ~/.zshrc
    export PATH="$HOME/.local/bin:$PATH"
- Reinstall in a virtual environment:
    python -m venv uprobe_env
    source uprobe_env/bin/activate
    pip install uprobe
ImportError: No module named 'uprobe'
Problem: Python cannot find the uprobe module.
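This error usually means the interpreter that is running differs from the one pip installed into. The sketch below shows one way to check that from inside Python; `module_status` is a hypothetical helper and relies only on the standard library.

```python
import importlib.util
import sys


def module_status(module_name):
    """Report which interpreter is running and whether it can see the module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,          # the Python actually in use
        "found": spec is not None,              # False if the import would fail
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(module_status("uprobe"))
```

Running this with the same `python` that raises the ImportError shows whether the module is missing from that specific environment.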
Solutions:
- Verify installation:
    pip show uprobe
- Check Python environment:
    which python
    which pip
    # Ensure both point to the same environment
- Reinstall:
    pip uninstall uprobe
    pip install uprobe
Missing dependencies errors
Problem: Errors about missing packages like pandas, click, etc.
Solutions:
- Install all requirements:
    pip install -r requirements.txt
- Update pip and try again:
    pip install --upgrade pip
    pip install uprobe
- For development installs:
    pip install -e ".[dev]"
Configuration Issues
FileNotFoundError: [Errno 2] No such file or directory
Problem: U-Probe cannot find specified files.
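A quick way to test every input path at once is to resolve each one to an absolute path and check that it exists and is readable. This is a sketch under the assumption that you list the paths from your own configuration by hand; `check_input_files` is not a U-Probe function.

```python
import os


def check_input_files(paths):
    """Map each path to 'ok', 'missing: ...', or 'not readable: ...'."""
    report = {}
    for path in paths:
        # Resolve relative paths and ~ against the current working directory
        full = os.path.abspath(os.path.expanduser(path))
        if not os.path.isfile(full):
            report[path] = f"missing: {full}"
        elif not os.access(full, os.R_OK):
            report[path] = f"not readable: {full}"
        else:
            report[path] = "ok"
    return report


if __name__ == "__main__":
    for path, status in check_input_files(["genome.fa"]).items():
        print(path, "->", status)
```

The resolved path in a `missing` entry shows where a relative path actually points, which is often the cause of the error.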
Solutions:
- Use absolute paths:
    # Instead of relative paths
    fasta: "genome.fa"
    # Use absolute paths
    fasta: "/full/path/to/genome.fa"
- Check file permissions:
    ls -la /path/to/genome.fa
    # Ensure files are readable
- Verify file existence:
    file /path/to/genome.fa
    head -n 5 /path/to/genome.fa
Target validation failed
Problem: Error message "Invalid targets found" or no targets pass validation.
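To see which targets are absent from the annotation, the GTF can be scanned directly. The sketch below collects `gene_id` and `gene_name` values from rows whose feature column is `gene` and reports targets that match neither. It assumes exact, case-sensitive matching, which may differ from how U-Probe resolves targets; the helper names are illustrative.

```python
import re

# Matches attribute pairs such as: gene_name "GAPDH";
ATTR_PATTERN = re.compile(r'(gene_id|gene_name) "([^"]*)"')


def gene_identifiers(gtf_lines):
    """Collect gene_id and gene_name values from 'gene' feature rows."""
    found = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # a GTF row has 9 tab-separated columns
        for _key, value in ATTR_PATTERN.findall(fields[8]):
            found.add(value)
    return found


def missing_targets(targets, gtf_lines):
    """Return the targets that match no gene_id or gene_name."""
    known = gene_identifiers(gtf_lines)
    return [target for target in targets if target not in known]
```

A file handle can be passed as `gtf_lines`, for example `missing_targets(["GAPDH"], open("/path/to/annotation.gtf"))`. Versioned identifiers such as `ENSG00000111640.15` will not match the unversioned form under this exact comparison.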
Solutions:
- Check gene names in GTF:
    # Search for your gene in GTF
    grep -i "GAPDH" /path/to/annotation.gtf
    # Check available gene names
    awk '$3=="gene"' /path/to/annotation.gtf | \
      grep -o 'gene_name "[^"]*"' | sort | uniq | head -20
- Try different gene identifiers:
    targets:
      - "GAPDH" # Gene symbol
      - "ENSG00000111640" # Ensembl ID
      - "2597" # Entrez ID
- Use continue-invalid flag for testing:
    uprobe validate-targets -p protocol.yaml -g genomes.yaml --continue-invalid
- Check GTF format:
    # GTF should have these columns:
    # seqname source feature start end score strand frame attribute
    head -n 5 /path/to/annotation.gtf
Invalid YAML syntax
Problem: YAML parsing errors.
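Tab characters in indentation are a common cause and are hard to spot by eye. The sketch below finds them without needing PyYAML; `find_tab_indentation` is an illustrative helper, and it checks only for tabs, not for other syntax problems.

```python
def find_tab_indentation(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


if __name__ == "__main__":
    sample = "probes:\n\tmain_probe:\n    template: '{seq}'\n"
    print(find_tab_indentation(sample))
```

To check a real file, pass its contents: `find_tab_indentation(open("protocol.yaml").read())`.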
Solutions:
- Check indentation (use spaces, not tabs):
    # Correct
    probes:
      main_probe: # 2 spaces
        template: "{seq}" # 4 spaces
    # Wrong (tabs or inconsistent spacing)
    probes:
      main_probe: # tab character (shown here as spaces)
       template: "{seq}" # 3 spaces
- Validate YAML syntax:
    python -c "import yaml; yaml.safe_load(open('protocol.yaml'))"
- Quote strings with special characters:
    # Quote expressions and conditions
    expr: "rc(target_region[0:20])"
    condition: "gc_content >= 0.4 & gc_content <= 0.6"
Runtime Issues
No target sequences generated
Problem: The generate-targets step produces an empty result.
Solutions:
- Check extraction parameters:
    extracts:
      target_region:
        source: "exon" # Try "gene" if exons are too short
        length: 50 # Reduce if regions are smaller
        overlap: 10 # Reduce overlap
- Verify targets exist:
    uprobe validate-targets -p protocol.yaml -g genomes.yaml -v
- Check for gene annotation issues:
    # Look for your gene in GTF
    grep "GAPDH" /path/to/annotation.gtf | head -5
No probes constructed
Problem: The construct-probes step fails or produces no output.
Solutions:
- Check probe expressions:
    probes:
      test_probe:
        template: "{simple_part}"
        parts:
          simple_part:
            length: 20
            expr: "target_region[0:20]" # Simple expression
- Verify encoding mappings:
    # Ensure all target genes have encoding entries
    encoding:
      GAPDH: # Must match target name exactly
        BC1: "ACGTACGTACGT"
- Test with minimal probe:
    probes:
      minimal:
        expr: "target_region[0:25]"
All probes filtered out
Problem: Post-processing removes all probes.
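To judge how strict a GC filter is, it helps to count how many raw probes fall inside the range. The sketch below does this with the standard `csv` module instead of pandas. It assumes the raw output has a `gc_content` column, as in the pandas snippet in the solutions; `gc_pass_rate` is a hypothetical helper.

```python
import csv
import io


def gc_pass_rate(rows, low, high, column="gc_content"):
    """Return (passing, total, unparsable) counts for low <= value <= high."""
    passing = total = unparsable = 0
    for row in rows:
        total += 1
        try:
            value = float(row[column])
        except (KeyError, TypeError, ValueError):
            unparsable += 1  # missing column or empty / non-numeric cell
            continue
        if value != value:  # NaN is the only value not equal to itself
            unparsable += 1
        elif low <= value <= high:
            passing += 1
    return passing, total, unparsable


if __name__ == "__main__":
    # Inline sample data; replace with open("results/experiment_raw.csv", newline="")
    sample = io.StringIO("gc_content,probe\n0.45,a\n0.10,b\nnan,c\n")
    print(gc_pass_rate(csv.DictReader(sample), 0.4, 0.6))
```

A high `unparsable` count points at failed attribute calculations rather than at filters that are too strict.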
Solutions:
- Use --raw flag to see unfiltered probes:
    uprobe run -p protocol.yaml -g genomes.yaml --raw
- Relax filtering conditions:
    post_process:
      filters:
        gc_content:
          condition: "gc_content >= 0.2 & gc_content <= 0.8" # Very relaxed
- Check attribute calculations:
    # Remove problematic attributes temporarily
    attributes:
      basic_gc:
        target: main_probe
        type: gc_content
      # Comment out complex attributes:
      # off_targets: ...
- Examine raw results:
    import pandas as pd
    df = pd.read_csv('results/experiment_raw.csv')
    print(df.describe()) # Check attribute distributions
    print(df[df['gc_content'].isna()]) # Find failed calculations
Performance Issues
Slow execution
Problem: U-Probe runs very slowly.
Solutions:
- Increase thread count:
    uprobe run -p protocol.yaml -g genomes.yaml -t 16
- Use faster extraction:
    extracts:
      target_region:
        source: "exon" # Faster than "gene"
        length: 100 # Shorter regions
- Reduce expensive attributes:
    attributes:
      # Keep fast attributes
      gc_content:
        target: main_probe
        type: gc_content
      # Remove slow ones temporarily:
      # fold_score: ...
      # kmer_count: ...
- Process in batches:
    # Split large target lists
    uprobe run -p small_batch.yaml -g genomes.yaml
Memory issues
Problem: Out of memory errors or system becomes unresponsive.
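One way to keep memory use bounded is to split a long target list into small batches and run each batch as its own protocol. The following is a minimal sketch of the splitting step only; `batch_targets` is a hypothetical helper, and writing each batch into a protocol file's `targets` list is left to you.

```python
def batch_targets(targets, batch_size):
    """Split a target list into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        targets[start:start + batch_size]
        for start in range(0, len(targets), batch_size)
    ]


if __name__ == "__main__":
    genes = ["GAPDH", "ACTB", "TP53", "MYC", "EGFR"]
    for index, batch in enumerate(batch_targets(genes, 2), start=1):
        print(f"batch {index}: {batch}")
```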
Solutions:
- Process smaller batches:
    targets:
      - "GAPDH"
      - "ACTB"
    # Process 5-10 genes at a time for large genomes
- Reduce sequence length:
    extracts:
      target_region:
        length: 80 # Shorter sequences use less memory
        overlap: 15
- Skip memory-intensive attributes:
    # Avoid these for large datasets:
    # - n_mapped_genes with blast
    # - kmer_count
    # - complex fold_score calculations
Index building fails
Problem: Genome index building fails or crashes.
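Two frequent causes are a full disk and a damaged or non-FASTA input file. The sketch below checks both with the standard library. It inspects only the first lines of an uncompressed FASTA file and treats any non-alphabetic sequence line as suspicious; both helper names are illustrative.

```python
import os
import shutil


def free_gib(directory):
    """Free space on the filesystem holding `directory`, in GiB."""
    return shutil.disk_usage(directory).free / 1024 ** 3


def fasta_problems(path, max_lines=1000):
    """Return problems found in the first max_lines lines of a FASTA file."""
    if os.path.getsize(path) == 0:
        return ["file is empty"]
    problems = []
    with open(path, "r", errors="replace") as handle:
        if not handle.readline().startswith(">"):
            problems.append("first line is not a '>' header")
        for number, line in enumerate(handle, start=2):
            if number > max_lines:
                break
            stripped = line.strip()
            # Sequence lines should contain letters only (A, C, G, T, N, ...)
            if stripped and not stripped.startswith(">") and not stripped.isalpha():
                problems.append(f"line {number}: unexpected characters")
    return problems
```

For example, `free_gib("/path/to/genome/directory")` and `fasta_problems("/path/to/genome.fa")` cover the first two solutions in one pass. A gzip-compressed genome will be reported as a problem and needs decompressing first.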
Solutions:
- Check available disk space:
    df -h /path/to/genome/directory
- Verify genome file integrity:
    file /path/to/genome.fa
    head -n 10 /path/to/genome.fa
    tail -n 10 /path/to/genome.fa
- Build indices manually:
    # Bowtie2
    bowtie2-build /path/to/genome.fa /path/to/indices/genome
    # BLAST
    makeblastdb -in /path/to/genome.fa -dbtype nucl -out /path/to/indices/genome
- Use pre-built indices:
    # Point to existing indices
    human_hg38:
      fasta: "/data/hg38.fa"
      gtf: "/data/hg38.gtf"
      out: "/data/existing_indices" # Pre-built indices location
Attribute Calculation Issues
Melting temperature calculation fails
Problem: Tm calculation produces NaN values or errors.
Solutions:
- Check sequence validity:
    # Sequences should only contain A, T, C, and G
    import re
    def check_sequence(seq):
        # True only for a non-empty string made up of uppercase A, T, C, G
        return bool(re.fullmatch(r'[ATCG]+', seq))
- Handle short sequences:
    # Ensure minimum sequence length
    probes:
      main_probe:
        parts:
          binding:
            length: 15 # Minimum for reliable Tm calculation
Off-target calculation fails
Problem: Alignment-based attributes fail.
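A partially built Bowtie2 index is a common cause. A complete index consists of six files sharing one prefix, with the suffixes `.1.bt2`, `.2.bt2`, `.3.bt2`, `.4.bt2`, `.rev.1.bt2`, and `.rev.2.bt2` (or the same names ending in `.bt2l` for large genomes). The sketch below lists whichever are missing; `missing_bowtie2_files` is an illustrative helper, and the prefix U-Probe expects is an assumption you should check against your configuration.

```python
import os

SMALL_SUFFIXES = [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2"]


def missing_bowtie2_files(prefix):
    """List index files missing for a Bowtie2 index prefix.

    Accepts either the small (.bt2) or the large (.bt2l) index layout.
    """
    small = [prefix + suffix for suffix in SMALL_SUFFIXES]
    large = [path + "l" for path in small]
    missing_small = [path for path in small if not os.path.isfile(path)]
    missing_large = [path for path in large if not os.path.isfile(path)]
    if not missing_small or not missing_large:
        return []  # one layout is complete
    # Report against whichever layout is closer to complete
    return min(missing_small, missing_large, key=len)
```

An empty list from `missing_bowtie2_files("/path/to/indices/genome")` means the index files are all present, so the failure lies elsewhere.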
Solutions:
- Verify indices exist:
    ls -la /path/to/indices/
    # Should contain .bt2 files for bowtie2
- Test aligner manually:
    # Test bowtie2 (-r: raw input, one sequence per line; -U -: read from stdin)
    echo "ATCGATCGATCGATCG" | bowtie2 -x /path/to/indices/genome -r -U -
- Use alternative aligner:
    attributes:
      off_targets:
        target: main_probe
        type: n_mapped_genes
        aligner: blast # Try blast if bowtie2 fails
K-mer counting fails
Problem: kmer_count attributes produce errors.
Solutions:
- Check Jellyfish database:
    jellyfish info genome.jf
- Build Jellyfish database:
    jellyfish count -m 15 -s 1000000000 -t 8 -o genome.jf genome.fa
- Use alternative complexity measures:
    # Instead of kmer_count, use:
    attributes:
      sequence_complexity:
        target: main_probe
        type: complexity_score
Data Format Issues
Unexpected output format
Problem: Output CSV has unexpected columns or values.
Solutions:
- Check probe names match:
    # Probe names become column names
    probes:
      my_probe: # Creates column 'my_probe'
        template: "{seq}"
- Verify attribute names:
    attributes:
      probe_gc: # Creates column 'probe_gc'
        target: my_probe
        type: gc_content
- Examine raw output:
    uprobe run -p protocol.yaml -g genomes.yaml --raw
    # Check _raw.csv file for all calculated values
Missing sequences in output
Problem: Some expected probes are missing from results.
Solutions:
- Check filtering criteria:
    # Very permissive filters for debugging
    post_process:
      filters:
        anything_goes:
          condition: "True" # Passes everything
- Look for errors in logs:
    uprobe --verbose run -p protocol.yaml -g genomes.yaml 2>&1 | tee log.txt
- Check intermediate files:
    ls -la results/
    wc -l results/*.csv # Count lines in each file
Getting Help
Check Logs
Always run with verbose output for troubleshooting:
    uprobe --verbose run -p protocol.yaml -g genomes.yaml 2>&1 | tee uprobe.log
Minimal Test Case
Create a minimal test to isolate issues:
    # minimal_test.yaml
    name: "minimal_test"
    genome: "human_hg38"
    targets: ["GAPDH"] # Just one target
    extracts:
      target_region:
        source: "exon"
        length: 50
        overlap: 10
    probes:
      simple:
        expr: "target_region[0:20]"
    # No attributes or filters initially
Report Issues
When reporting issues, include:
- U-Probe version:
    uprobe version
- Full error message and traceback
- Configuration files (anonymized)
- System information: OS, Python version
- Steps to reproduce
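The version and system details in the list above can be gathered in one step. The sketch below assumes the installed distribution is named `uprobe`, as in `pip install uprobe`; `environment_report` is a hypothetical helper, not a U-Probe command.

```python
import platform
import sys
from importlib import metadata


def environment_report(package="uprobe"):
    """Return a short text block suitable for pasting into a bug report."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"
    return "\n".join([
        f"{package} version: {version}",
        f"Python: {sys.version.split()[0]}",
        f"OS: {platform.platform()}",
    ])


if __name__ == "__main__":
    print(environment_report())
```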
Where to Get Help
- Documentation: Check this documentation first
- GitHub Issues: Report bugs
- GitHub Discussions: Ask questions
- Examples: Review working examples in the repository
Common Error Messages
.. list-table::
   :header-rows: 1
   :widths: 40 60

   * - Error Message
     - Solution
   * - "Genome 'X' not found"
     - Check genome name matches genomes.yaml key
   * - "No targets specified"
     - Add targets list to protocol.yaml
   * - "Invalid expression: X"
     - Check probe expression syntax
   * - "Attribute calculation failed"
     - Verify required files and indices exist
   * - "No data to concatenate"
     - Check that previous steps generated output
   * - "YAML parsing error"
     - Check indentation and syntax
   * - "Permission denied"
     - Check file permissions and disk space
   * - "Index not found"
     - Run build-index command first
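The table above can also be applied mechanically to a saved log such as `uprobe.log`. The following sketch searches log text for each message and returns the matching suggestions. It assumes the log contains the messages worded as in the table, which may not hold for every U-Probe version; `suggest_fixes` is an illustrative helper.

```python
import re

# (pattern, suggestion) pairs taken from the error table
ERROR_HINTS = [
    (r"Genome '.*' not found", "Check genome name matches genomes.yaml key"),
    (r"No targets specified", "Add targets list to protocol.yaml"),
    (r"Invalid expression", "Check probe expression syntax"),
    (r"Attribute calculation failed", "Verify required files and indices exist"),
    (r"No data to concatenate", "Check that previous steps generated output"),
    (r"YAML parsing error", "Check indentation and syntax"),
    (r"Permission denied", "Check file permissions and disk space"),
    (r"Index not found", "Run build-index command first"),
]


def suggest_fixes(log_text):
    """Return suggestions for every known error message found in log_text."""
    return [hint for pattern, hint in ERROR_HINTS if re.search(pattern, log_text)]


if __name__ == "__main__":
    print(suggest_fixes("ERROR: Genome 'hg19' not found"))
```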
Prevention Tips
- Start simple: Begin with basic configurations and add complexity gradually
- Validate early: Use validate-targets before full runs
- Test with subsets: Use small target lists for initial testing
- Use version control: Track configuration changes
- Document decisions: Comment your configuration files
- Regular backups: Keep backups of working configurations
Next Steps
If you're still having issues:
- Review the examples for working configurations
- Check the configuration guide for detailed option descriptions
- Ask for help on GitHub Discussions